
USENIX Association

Proceedings of the 4th Annual Linux Showcase & Conference,

Atlanta

Atlanta, Georgia, USA, October 10-14, 2000

THE ADVANCED COMPUTING SYSTEMS ASSOCIATION

© 2000 by The USENIX Association. All Rights Reserved. For more information about the USENIX Association: Phone: 1 510 528 8649, FAX: 1 510 548 5738, Email: [email protected], WWW: http://www.usenix.org

Rights to individual papers remain with the author or the author's employer.

Permission is granted for noncommercial reproduction of the work for educational or research purposes.

This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.


Dynamic Buffer Cache Management Scheme based on Simple and Aggressive Prefetching*

H. Seok Jeon Sam H. Noh

Department of Computer Engineering, Hong-Ik University

Mapo-Gu Sangsoo-Dong 72-1, Seoul, Korea 121-791; tel: +82-2-320-1470, fax: +82-2-320-1105

[email protected]

[email protected], http://www.cs.hongik.ac.kr/~noh

Abstract

Many replacement and prefetching policies have recently been proposed for buffer cache management. However, many real operating systems, including GNU/Linux, generally use the simple Least Recently Used (LRU) replacement policy, with prefetching being employed only in special situations such as when sequentiality is detected. In this paper, we propose the SA-W²R scheme that integrates buffer management and prefetching, where prefetching is done constantly in an aggressive fashion. The scheme is simple to implement, making it a feasible solution in real systems. In its basic form, it uses the LRU policy for buffer replacement. However, its modular design allows any replacement policy to be incorporated into the scheme. For prefetching, it uses the LRU-One Block Lookahead (LRU-OBL) approach, eliminating any extra burden that is generally necessary in other prefetching approaches. Implementation studies based on the GNU/Linux kernel version 2.2.14 show that SA-W²R performs better than the current version of GNU/Linux, with a maximum increase of 23% for the workloads considered.

1 Introduction

Considerable research on optimizing the use of buffer caches, in both the replacement and prefetching aspects, has been undertaken. This paper proposes yet another scheme, but one with salient features such as simple prefetching and modular integration of replacement and prefetching. These features lead to a scheme that is easily implementable and that yields performance improvements compared to previously known schemes.

*This work was supported by the Korea Science and Engineering Foundation grant 98-0102-09-01-3.

In the following, we first discuss the state of the art in buffer cache replacement and prefetching, and then point out their limitations, which motivate the development of the proposed scheme.

1.1 Previous Research in Buffer Cache Management

Many replacement policies for improving the performance of buffer cache management have been proposed. Policies such as Least Recently Used (LRU), Least Frequently Used (LFU), LRU-K [11], 2Q [6], Frequency-Based Replacement (FBR) [14], and Least Recently/Frequently Used (LRFU) [8] are examples.

All these replacement policies were developed independently of prefetching. Recent developments in buffer management policies have led to the investigation of incorporating prefetching into the replacement policies. It has been shown that, for some workloads, incorporating prefetching can result in considerable improvement in the performance of buffer management [15, 16].

Research on incorporating prefetching can be categorized into three groups. The first group of policies maintains a history of the past behavior of the applications [5, 7, 9]. This speculative approach, however, may result in performance degradation due to inaccurate prefetching and history maintenance overhead.

The second category of policies obtains hints from the applications themselves, prior to or during execution [2, 3, 4, 10, 13, 17]. Currently, these approaches appear promising, as it has been shown that hints may be obtainable for specific applications with minor overhead, though their feasibility in real systems still needs to be tested.

The final approach requires no information, either from the application or from observations of past behavior. The LRU-OBL (One Block Lookahead) scheme [15, 16] (and its variant that prefetches multiple blocks at once) is the only scheme known to date that uses this approach. This scheme simply prefetches the logical next block of the currently referenced block if it is not resident in the cache. It has been shown that, through this simple scheme, improvements of up to 80% in the hit rate were possible for some workloads [16]. To the best of our knowledge, this approach is the only prefetch technique used in real systems, owing to its simplicity and effectiveness.

1.2 Buffer Cache Management in GNU/Linux

GNU/Linux adopts the LRU scheme for buffer cache management. In GNU/Linux, the bread() function handles block requests. If the requested block exists in the hash table, that is, the buffer cache, it returns the block pointer. Otherwise, it issues a request for a disk I/O.

The breada() function is provided as a primitive for prefetching in GNU/Linux. This function is simply a variant of the LRU-OBL scheme in that it reads blocks adjacent to the requested block. However, in the GNU/Linux kernel version 2.2.14, the breada() function is seldom used. To the best of our knowledge, prefetching is issued in only two places: one is for reading directories in the ISO 9660 file system, and the other is for the kernel thread that synchronizes the spare disk with the active disk array during RAID reconstruction.
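For readers unfamiliar with this read path, the following toy, user-space sketch mirrors the control flow described above: a cache lookup, a disk read on a miss, and a breada()-style one-block lookahead. It is an illustration under our own simplifying assumptions, not the kernel code; the real bread()/breada() operate on buffer_head structures through routines such as getblk() and ll_rw_block(), which are abstracted away here.

    /* sketch_bread.c -- a toy, user-space illustration of the bread()/breada()
     * control flow described above.  This is NOT the 2.2.14 kernel code; the
     * "cache" here is just a fixed-size table of block numbers. */
    #include <stdio.h>
    #include <stdbool.h>

    #define CACHE_SLOTS 8

    static long cache[CACHE_SLOTS];      /* cached block numbers (-1 = empty) */
    static int  next_victim;             /* trivial FIFO replacement          */

    static bool cache_lookup(long block)
    {
        for (int i = 0; i < CACHE_SLOTS; i++)
            if (cache[i] == block)
                return true;
        return false;
    }

    static void disk_read(long block)    /* stands in for a real disk I/O     */
    {
        printf("  disk read: block %ld\n", block);
        cache[next_victim] = block;
        next_victim = (next_victim + 1) % CACHE_SLOTS;
    }

    /* bread(): return the block, reading it from disk on a miss. */
    static void toy_bread(long block)
    {
        if (!cache_lookup(block))        /* miss: fetch from disk             */
            disk_read(block);
    }

    /* breada(): bread() plus a one-block-lookahead prefetch of block + 1. */
    static void toy_breada(long block)
    {
        toy_bread(block);
        if (!cache_lookup(block + 1))
            disk_read(block + 1);        /* OBL-style prefetch                */
    }

    int main(void)
    {
        for (int i = 0; i < CACHE_SLOTS; i++)
            cache[i] = -1;
        long trace[] = { 10, 11, 12, 40, 41 };
        for (int i = 0; i < 5; i++)
            toy_breada(trace[i]);
        return 0;
    }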

1.3 The Remainder of the Paper

The rest of the paper is organized as follows. In the next section, we provide the motivation behind this work. In Section 3, the SA-W²R scheme, which is the buffer cache management scheme proposed in this paper, is presented. Simulation and implementation experimental studies are presented in Sections 4 and 5, respectively. Finally, Section 6 concludes with a summary and directions for further research.

2 Motivation

In this section, we discuss the motivation behind the development of the SA-W²R scheme that is presented in the next section. To this end, we first describe the Weighing-Waiting Room (W²R) scheme. This scheme provides a framework for an efficient integration of buffer replacement and prefetching that is simple and effective. However, a limitation restricts it from being deployed in real systems, providing the basis for the development of the SA-W²R scheme.

2.1 The W²R Scheme

The Weighing-Waiting Room (W²R) scheme partitions the buffer cache into two rooms, that is, the Weighing Room and the Waiting Room, as shown in Figure 1. The name is derived from the fact that we use a weight analogy in describing the management of the buffer cache. In general, the block to be replaced by the incoming block is the block that is considered least likely to be re-referenced. This likelihood can be represented as a weight. Each block is given a weight, and a heavier block is considered more likely to be re-referenced. Then, in general, the lightest block is replaced by the incoming block, as it is considered the least likely to be referenced again.

The Weighing Room, in the W²R scheme, is where the weights of the blocks are contested and a rank is formed among the blocks. Only blocks that have been referenced have weights associated with them. In buffer replacement policies such as LRU or 2Q, the whole buffer is simply the Weighing Room, as only blocks that have been referenced are brought into the buffer.

The Waiting Room is where the prefetched blocks reside. Prefetching is done exactly as in the LRU-OBL scheme: the logical next block of the currently accessed block is prefetched when it is not resident in the cache. Prefetched blocks remain in this part of the buffer and wait until they obtain permission to be weighed with the other blocks. This permission is obtained when, and only when, a block has actually been referenced. Until then, the prefetched blocks are weightless. Again, blocks obtain weight only when referenced, as their weight cannot be determined until they have been referenced.
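As a concrete, deliberately simplified illustration of these rules, the sketch below keeps an LRU-ordered Weighing Room and a small FIFO Waiting Room as plain arrays: a miss fetches the referenced block into the Weighing Room, every reference prefetches the logical next block into the Waiting Room if it is absent, and a prefetched block is promoted only when it is actually referenced. The fixed sizes, array layout, and function names are our own assumptions for exposition; they are not the paper's kernel implementation.

    /* Toy user-space sketch of the W2R idea: an LRU Weighing Room plus a FIFO
     * Waiting Room for prefetched blocks.  Sizes and helpers are illustrative. */
    #include <stdio.h>
    #include <string.h>

    #define WEIGH_SLOTS 6         /* referenced blocks, kept in LRU order           */
    #define WAIT_SLOTS  2         /* prefetched-but-unreferenced blocks, FIFO order */

    static long weigh[WEIGH_SLOTS];   /* index 0 = MRU, last index = LRU victim */
    static long waitq[WAIT_SLOTS];    /* index 0 = newest prefetch              */

    static int find(const long *a, int n, long b)
    {
        for (int i = 0; i < n; i++)
            if (a[i] == b)
                return i;
        return -1;
    }

    /* Move entries a[0..from-1] down one slot and put b at the front;
     * from == (room size - 1) evicts the entry that was at the tail. */
    static void push_front(long *a, int from, long b)
    {
        memmove(a + 1, a, (size_t)from * sizeof *a);
        a[0] = b;
    }

    static void reference(long b)
    {
        int w = find(weigh, WEIGH_SLOTS, b);
        int p = find(waitq, WAIT_SLOTS, b);

        if (w >= 0) {                                 /* hit in Weighing Room: LRU update   */
            push_front(weigh, w, b);
        } else if (p >= 0) {                          /* hit on a prefetched block:         */
            waitq[p] = -1;                            /* promote it to the Weighing Room    */
            push_front(weigh, WEIGH_SLOTS - 1, b);
        } else {                                      /* miss: fetch into the Weighing Room */
            printf("miss on block %ld\n", b);
            push_front(weigh, WEIGH_SLOTS - 1, b);
        }

        /* aggressive OBL prefetch of b+1 into the Waiting Room, if absent */
        if (find(weigh, WEIGH_SLOTS, b + 1) < 0 && find(waitq, WAIT_SLOTS, b + 1) < 0)
            push_front(waitq, WAIT_SLOTS - 1, b + 1);
    }

    int main(void)
    {
        memset(weigh, -1, sizeof weigh);              /* -1 marks an empty slot */
        memset(waitq, -1, sizeof waitq);
        long trace[] = { 1, 2, 3, 10, 4, 11 };
        for (int i = 0; i < 6; i++)
            reference(trace[i]);
        return 0;
    }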

Note some of the features of this scheme. First, replacement (through the Weighing Room) and prefetching (through the Waiting Room) are integrated into the whole scheme, yet modularity is ensured.

Figure 1: Structure of the W²R scheme. [Diagram omitted: the Weighing Room and the Waiting Room, with Prefetch, Miss, and Hit paths between the disk and the two rooms.]

New replacement policies are constantly being developed. As new replacement policies that have practical significance are developed, they can be incorporated directly into the W²R scheme, simply by replacing the implementation of the Weighing Room. This is unlike the LRU-OBL scheme, which provides no means of doing this.

Second, by partitioning the buffer into a Weighing Room and a Waiting Room, prefetched blocks, which have not yet proven their worth, cannot replace a block in the Weighing Room, which has proven its value by being referenced. Figure 2 quantifies a deficiency in the LRU-OBL approach. The figure shows that in some situations over 60% of prefetched blocks may never be used when the LRU-OBL scheme is employed. Hence, in LRU-OBL, a prefetched block can hold on to valuable real estate without ever being referenced. This problem is alleviated in the W²R scheme, as blocks that are not referenced after being prefetched are not promoted to the Weighing Room. Hence, a prefetched block that is never referenced cannot replace a block that has some weight (that is, a block in the Weighing Room). That is to say, the wrong block may be prefetched into the buffer, but it will not replace a block that has proven its worth by having been referenced.

Figure 2: Percentage of prefetched blocks, using the LRU-OBL scheme, that are never referenced for the Sprite, DB2, and OLTP traces.

Third, note that a prefetched block enters the Weighing Room only after it is referenced. Hence, the prefetched block does not directly replace a block that was referenced prior to it. The block being promoted to the Weighing Room is promoted right when it is needed, and never before.

2.2 Motivation for SA-W²R

Although the benefits of the W²R scheme seem attractive, as it stands there is a serious problem that must be resolved in order for it to be deployable in real systems. The obvious and difficult problem is how to partition a fixed-size buffer cache into the two rooms. By introducing the Waiting Room, we are, in effect, reducing the size of the Weighing Room compared to conventional buffer management policies. Hence, we want to keep the Waiting Room small. However, once a block is prefetched, we would like to hold it in the Waiting Room long enough that it is eventually referenced. That is, if the Waiting Room is too small, a prefetched block may be evicted too early to be of any help to the system. A judicious selection of the room sizes is necessary for efficient management of the buffer. This problem is addressed in the SA-W²R scheme.

3 Self-Adjusting W²R

The optimal partition ratio between the Weighing Room and the Waiting Room will vary according to the system environment and workload. To reiterate the point mentioned above, we want to keep the Waiting Room small so that the deterioration in performance due to the reduced Weighing Room size is limited, but we want to keep it big enough that the prefetched blocks are used instead of being evicted from the Waiting Room. Hence, the partitioning should accommodate the workload characteristics such that the performance benefits obtained by increasing the Waiting Room outweigh the loss incurred by the reduced Weighing Room. For all practical purposes, this should be determined on-line and must be done with minimal overhead. The Self-Adjusting W²R (SA-W²R) scheme attempts to do both.

The SA-W²R scheme adjusts the partitioning between the Weighing and Waiting Rooms via a two-step process, namely, interval-based and fault-based adjustment steps, as shown in Figure 3. We discuss the two steps in the following subsections.

Figure 3: The SA-W²R scheme.

3.1 Interval-based Adjustment

The Waiting Room, as the name implies, is where the prefetched blocks wait to be referenced. In other words, the role of the Waiting Room is to maintain the prefetched blocks until they are actually referenced. Recall that prefetched blocks are weightless; hence they are managed in a FIFO queue. The newly prefetched block is put at the rear (position 1) of the queue, which pushes the block at the head (position n) of the queue out of the FIFO queue, where n is the size of the Waiting Room. Let us define the position in the FIFO queue at which a block is actually referenced to be the reference interval of that block. If the block is evicted from the Waiting Room without being referenced, its reference interval is taken to be infinite. This reference interval will vary from workload to workload, and in adjusting the room sizes the adjustment should be such that the majority of the blocks have reference intervals less than or equal to n for as small an n as possible. Hence, given a certain value of n, if the reference intervals of blocks are getting smaller and smaller, then n should also be adjusted to become smaller, so that the loss in the Weighing Room can be minimized. On the other hand, if the reference intervals of blocks are getting larger and are getting close to or surpass n, then n should be increased, so as to increase the benefits of the Waiting Room.

The SA-W²R scheme tunes the Waiting Room size by maintaining the reference interval values of the last k blocks that are referenced in the Waiting Room. (To minimize bookkeeping overhead, we set k to 3.) Using this information, we observe the trend of the reference intervals, as shown in Figure 4. If the reference intervals of the last k blocks show an increasing trend, the Waiting Room is enlarged to accommodate the increase. Similarly, if the reference intervals show a decreasing trend, the Waiting Room is shrunk. If the reference intervals of the last k blocks do not show any regularity, the Waiting Room size remains unchanged.

Figure 4: Interval-based Adjustment.
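A minimal sketch of this interval-based step, assuming k = 3 as in the paper and a step size of one block per adjustment (the paper does not state the step size), is shown below; record_reference_interval() and the variable names are our own, and it would be called whenever a block is referenced while still in the Waiting Room.

    /* Interval-based adjustment sketch: track the last K reference intervals
     * observed in the Waiting Room and grow/shrink the room on a clear trend.
     * K follows the paper; the +/-1 step size is an illustrative assumption. */
    #include <stdio.h>

    #define K 3

    static int last_intervals[K];        /* intervals of the last K Waiting Room hits      */
    static int waiting_room_size = 4;    /* n: current Waiting Room size (arbitrary start) */

    static void record_reference_interval(int interval)
    {
        for (int i = 0; i < K - 1; i++)              /* shift in the newest interval */
            last_intervals[i] = last_intervals[i + 1];
        last_intervals[K - 1] = interval;

        int increasing = 1, decreasing = 1;
        for (int i = 1; i < K; i++) {
            if (last_intervals[i] <= last_intervals[i - 1]) increasing = 0;
            if (last_intervals[i] >= last_intervals[i - 1]) decreasing = 0;
        }

        if (increasing)
            waiting_room_size++;        /* intervals growing: enlarge the Waiting Room */
        else if (decreasing && waiting_room_size > 1)
            waiting_room_size--;        /* intervals shrinking: shrink it again        */
        /* no clear trend: leave the partition unchanged */
    }

    int main(void)
    {
        int demo[] = { 2, 3, 4, 4, 3, 2 };
        for (int i = 0; i < 6; i++) {
            record_reference_interval(demo[i]);
            printf("interval %d -> Waiting Room size %d\n", demo[i], waiting_room_size);
        }
        return 0;
    }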

3.2 Fault-based Adjustment

Interval-based adjustment cannot be ideal, as it adjusts the Waiting Room size based only on observations made in the Waiting Room. Adjustments of the two rooms can also be made based on hints provided by the misses that occur upon block references. Consider the following situation. Block i is referenced, but it is not in the buffer, hence a miss occurs. Since we are prefetching the logical next block, we can deduce considerable information based on the availability of blocks i-1 and i+1 in the buffer cache. For example, if we find that block i-1 is in the Weighing Room and that block i is on disk, then we know that block i was evicted from the buffer cache, as block i would have been prefetched along with block i-1.

The SA-W²R scheme exploits this fault-based information for adjusting the room sizes. It considers the situation in which a request for block i is a miss. When a miss occurs on block i, nine possible situations can arise depending on the locations of blocks i, i-1, and i+1, as shown in Table 1. Since a miss occurred on block i, that block is on disk. At this point, blocks i-1 and i+1 can each be in the Weighing Room, in the Waiting Room, or on disk. Table 1 also shows the adjustment that is made in each of the nine cases.

Table 1: Nine situations in relation to the locations of blocks i, i-1, and i+1 when a miss for block i occurs, and the adjustment made in each case.

Case   | Weighing Room | Waiting Room | Disk        | Adjustment Made
case 1 | i-1, i+1      | (none)       | i           | Increase Waiting Room
case 2 | i-1           | i+1          | i           | Increase Weighing Room
case 3 | i-1           | (none)       | i, i+1      | Increase Waiting Room
case 4 | i+1           | i-1          | i           | No Adjustment
case 5 | (none)        | i-1, i+1     | i           | Increase Weighing Room
case 6 | (none)        | i-1          | i, i+1      | No Adjustment
case 7 | i+1           | (none)       | i-1, i      | No Adjustment
case 8 | (none)        | i+1          | i-1, i      | Increase Weighing Room
case 9 | (none)        | (none)       | i-1, i, i+1 | No Adjustment

Let us now consider how these adjustment recommendations came about. Let us start with cases 1 and 3, which have in common that block i-1 is in the Weighing Room. (Case 2 also falls into this category, but we will consider it separately later.) The fact that block i-1 is in the Weighing Room tells us that block i had once been in the Waiting Room before being evicted. It may also have been in the Weighing Room before being evicted, but the fact that block i+1 is not in the Waiting Room makes it likely that block i was in the Waiting Room before being evicted. (Note that we are referring to likely scenarios that could have occurred and are not guaranteeing such scenarios.) Hence, we conjecture that block i was evicted before being referenced because the Waiting Room was too small. So, we increase the Waiting Room size.

Now consider cases 5 and 8. This is the opposite of the previous situation. The fact that block i+1 is in the Waiting Room tells us that block i was in the Weighing Room. This means that the Weighing Room had to evict a block that was to be referenced soon in the future, meaning that the Weighing Room was too small. Hence, an adjustment to increase the Weighing Room size is made.

Let us now consider case 2. Case 2 satisfies both of the two previous scenarios; that is, block i-1 is found in the Weighing Room and block i+1 is found in the Waiting Room. Note, however, that the implications are different. While having block i-1 in the Weighing Room only suggests that block i could have been evicted from the Waiting Room, having block i+1 in the Waiting Room tells us that block i must have been in the Weighing Room when it was evicted. Hence, the Weighing Room is increased in this situation.

For cases 4, 6, 7, and 9, no solid relation can be deduced. For example, take case 4. The fact that block i+1 is in the Weighing Room suggests that block i+2 could be in the Waiting Room, but nothing in relation to block i can be deduced. Likewise, the fact that block i-1 is in the Waiting Room suggests that block i-2 may still be in the Weighing Room, but again, nothing in relation to block i can be deduced. Hence, for these cases no adjustments are made.
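Viewed as code, the nine cases of Table 1 collapse to a short rule: if block i+1 is in the Waiting Room, grow the Weighing Room (cases 2, 5, and 8); otherwise, if block i-1 is in the Weighing Room, grow the Waiting Room (cases 1 and 3); in all remaining cases, make no adjustment. The sketch below encodes exactly that rule; the enum, the helper, and the tiny test harness are illustrative stand-ins of our own, since the paper gives no code.

    /* Fault-based adjustment rule from Table 1, called when a reference to
     * block i misses in both rooms.  Room contents and helpers are stand-ins. */
    #include <stdio.h>
    #include <stdbool.h>

    enum adjustment { NO_ADJUSTMENT, GROW_WAITING_ROOM, GROW_WEIGHING_ROOM };

    static long weighing[4] = { -1, -1, -1, -1 };    /* demo room contents */
    static long waiting[4]  = { -1, -1, -1, -1 };

    static bool in_room(const long *room, long block)
    {
        for (int i = 0; i < 4; i++)
            if (room[i] == block)
                return true;
        return false;
    }

    static enum adjustment fault_based_adjustment(long i)
    {
        /* Cases 2, 5, 8: i+1 is in the Waiting Room, so i must have been
         * referenced (its reference prefetched i+1) and was later evicted
         * from the Weighing Room -- the Weighing Room is too small. */
        if (in_room(waiting, i + 1))
            return GROW_WEIGHING_ROOM;

        /* Cases 1, 3: i-1 is in the Weighing Room, so i was most likely
         * prefetched and then evicted from the Waiting Room before being
         * referenced -- the Waiting Room is too small. */
        if (in_room(weighing, i - 1))
            return GROW_WAITING_ROOM;

        /* Cases 4, 6, 7, 9: nothing solid can be deduced about block i. */
        return NO_ADJUSTMENT;
    }

    int main(void)
    {
        long i = 100;                  /* case 2: i-1 in Weighing Room, i+1 in Waiting Room */
        weighing[0] = i - 1;
        waiting[0]  = i + 1;
        printf("adjustment = %d (2 means grow the Weighing Room)\n",
               fault_based_adjustment(i));
        return 0;
    }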

Based on these situations and their adjustments, the SA-W²R scheme adjusts the partitioning of the buffer cache between the Weighing and Waiting Rooms.

3.3 Adaptability of the SA-W²R Scheme

Figure 5 shows how SA-W²R adapts to the changing workload of the system when the cache size is 3000, for the DB2 and Sprite 3C53 traces that will be explained in the next section. The dark line that looks like the upper boundary of the figure shows the Waiting Room size at each time point. Each dot within the "boundary" represents the access point of a block, that is, the reference interval within the Waiting Room at each time point. For the DB2 trace, the streaky lines going up within the boundary show that the reference interval is increasing, while the streaky lines going down show that it is decreasing. The figure shows that SA-W²R adjusts the Waiting Room as needed.

Figure 5(b) is actually more interesting. Note that the size of the Waiting Room is much smaller compared to the DB2 trace. This is because there is much more sequentiality in this trace. When references are sequential, there is no need to increase the Waiting Room. In fact, if the workload is totally sequential, a Waiting Room size of one is sufficient. Hence, for this trace, the SA-W²R scheme keeps the Waiting Room size small. The seemingly horizontal lines in the figure show that a majority of the blocks are being referenced at a particular position in the Waiting Room, that is, the reference interval is constant. The lowest horizontal line represents total sequentiality, while horizontal lines above this line show that the reference interval grows but remains constant over time. The Waiting Room size is adjusted to reflect this change.

Figure 5: Adaptability of the SA-W²R scheme for the DB2 and Sprite 3C53 traces when the cache size is 3000: (a) DB2, (b) Sprite 3C53. [Plots omitted.]

4 Simulation Experiments

In this section, we discuss the trace-driven simulation experiments conducted to evaluate the SA-W²R scheme. A description of the traces that were used is given in the next subsection. In the subsequent subsection, we report and discuss the results of these experiments.

4.1 The Simulator and Traces

The simulator developed to evaluate the schemes is programmed in C++. The basic component of the simulator is the buffer cache module, which takes the traces as input. The buffer cache module checks whether the block number is in the buffer. If it is a hit, the appropriate action, which depends on the policy used, is taken. Otherwise, a block fetch request to the disk is emulated. The block size, the size of the buffer, and the policy used for managing the buffer are controllable parameters.

A wide range of traces is used to drive the simulator to show the robustness of the SA-W²R scheme. Specifically, database traces, the Sprite traces, traces of real application programs, and synthetic traces that follow a Zipfian distribution are used. Detailed descriptions of these traces are given below.

Database traces: Two traces, namely DB2 and OLTP, obtained from database systems, were used. These traces are identical to the traces used in the papers by Johnson and Shasha [6] and by O'Neil and others [11]. The DB2 trace was obtained by running a commercial DB2 application and contains 500,000 block requests to 75,514 distinct blocks. Obtained from an On-Line Transaction Processing system, the OLTP trace contains records of block requests to a CODASYL database for a window of one hour. It contains a total of 914,145 requests to 186,880 distinct blocks.

Sprite traces: The Sprite traces were obtained from 4 file servers and 40 clients running the Sprite distributed file system [12]. This file system environment had roughly 30 consistent users, with an additional 40 or so users who used the system occasionally. Traces were obtained for eight separate periods of 24 or 48 hours. These traces are considered to represent scientific workloads, as most of the users were operating system researchers, computer architecture researchers, VLSI circuit designers, parallel processing researchers, etc. Of these traces, the ones used in our experiments were taken in the 2nd and 3rd periods. The trace that we denote as Sprite 2C39, which is client 39 of the 2nd period, consists of a total of 141,233 block accesses to 19,990 distinct blocks, while the trace that we refer to as Sprite 3C53, which is client 53 of the 3rd period, consists of a total of 239,748 block accesses to 49,277 distinct blocks. Though these two traces were taken from the same system, they are quite different in their characteristics. According to Baker and others [1], the traces in the 2nd period represent general scientific access patterns, while in the 3rd period the workload consisted largely of accesses to very large files, making it different from the general workload of the 2nd period.

Application traces: This set of traces, obtained by executing real application programs, is the set used in a previous study [3]. These traces are very short compared to the database and Sprite traces, ranging from roughly four thousand to thirty-five thousand block accesses. Specific details regarding the characteristics of these applications are given below.

cpp: Cpp is the GNU C-compatible compiler preprocessor. The kernel source was used as input, with header files and C source files of about 1MB and 10MB in size, respectively.

link: Link is the Unix link-editor. This application is used to build the FreeBSD kernel from about 2.5MB of object files.

ld: Ld is the trace of a linking editor. It has random accesses for both reads and writes. There is no reuse of data, but since the size of a read request is not always 8K, there are occasional reuses at the block level. (A block is 8K bytes.)

XDataSlice: XDataSlice is obtained from a 3D volume rendering software working on a 220×220×220 data volume, rendering 22 slices with stride 10, along the X axis, then the Y axis, then the Z axis. This trace accesses blocks in a file with regular strides. There is no reuse of data when rendering along one axis, but moderate reuse occurs between renderings along different axes.

Zipfian traces: The Zipfian traces are synthetic traces corresponding to a Zipfian distribution of reference frequencies, in which the probability of referencing a block with block number less than or equal to i is (i/N)^(log a / log b), with constants a and b between 0 and 1. The meaning of the constants a, b, and N is that fraction a of the references accesses fraction b of the N blocks. We generated two types of distributions, referred to as ZipfianA and ZipfianB. The ZipfianA trace contains 500,000 references to 75,514 distinct blocks with a = 0.8, b = 0.2 and with a = 0.7, b = 0.3. ZipfianB contains 914,145 requests to 186,880 distinct blocks with constants a and b the same as those of the ZipfianA trace. The Zipfian distribution and the respective constants were chosen because they are known to be a good representation of database reference patterns [11]. The numbers of requests and distinct blocks were selected to be the same as in the DB2 and OLTP traces, respectively.
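As an aside on how such a trace can be produced, the sketch below draws block numbers by inverting the cumulative distribution given above: for a uniform u in (0, 1), block = ceil(N * u^(log b / log a)), so that fraction a of the draws falls within the first fraction b of the blocks. The parameter values follow the ZipfianA description, while the generator itself (random source, seed, and output format) is our own illustrative choice rather than the authors' generator.

    /* ZipfianA-style trace generator: P(block <= i) = (i/N)^(log a / log b).
     * Parameters follow the paper; the sampling code is illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main(void)
    {
        const double a = 0.8, b = 0.2;              /* 80/20 skew, as in ZipfianA      */
        const long   N = 75514;                     /* distinct blocks, as in ZipfianA */
        const long   refs = 500000;                 /* number of references            */
        const double exponent = log(b) / log(a);    /* inverse-CDF exponent            */

        srand(42);
        for (long r = 0; r < refs; r++) {
            double u = (rand() + 1.0) / (RAND_MAX + 2.0);     /* u in (0, 1)        */
            long block = (long)ceil(N * pow(u, exponent));    /* skewed block number */
            if (block < 1) block = 1;
            if (block > N) block = N;
            printf("%ld\n", block);                 /* one block request per line */
        }
        return 0;
    }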

4.2 Results

Figure 6 shows the hit rates for the synthetic workloads that follow a Zipfian distribution. These workloads are interesting because no sequentiality is present, and we believe they represent one extreme end of reference characteristics. The results for the ZipfianA trace, shown in Figure 6, indicate that the LRU-OBL scheme is certainly not the scheme of choice. (The results are similar for the ZipfianB traces.) It performs even worse than the traditional LRU replacement policy. SA-W²R shows consistently better performance than both LRU and LRU-OBL.

Now, also consider how the schemes deal with purely sequential references, which is the other extreme end of reference characteristics. The LRU replacement policy will incur a miss on every reference to a new block, resulting in a zero hit rate, while for both the LRU-OBL and SA-W²R schemes the hit rate will approach 100%, as every logical next block will be prefetched. The results at these two extremes of reference characteristics show that SA-W²R is a versatile scheme.

Figures 7, 8, and 9 show the hit rates of the SA-W²R scheme compared with the other schemes for real workloads. Regarding these figures, first note that the scales are all different. Also, in the figures that do not show the hit rates for the LRU and/or OPT (optimal replacement) policies, these lines are omitted because their margin of difference from LRU-OBL and SA-W²R is so large that it would make the remaining lines indecipherable.

Overall, the SA-W²R scheme performs better than all the others. An interesting observation from these results is that the LRU-OBL scheme is superior to the OPT policy. Except for the extreme case where there is minimal or no sequentiality, the LRU-OBL policy is a good general scheme that could be used for general buffer cache management, and is not limited to sequential reference patterns. Of course, one has to consider how the hit rate performance measure translates to other measures, such as response time, in real systems. Extra queueing delays incurred by prefetching may limit the performance benefits observable by the user.

Figure 6: The hit rates of the SA-W²R scheme for the ZipfianA traces: (a) ZipfianA 80/20 distribution, (b) ZipfianA 70/30 distribution. [Plots of hit rate (%) versus buffer cache size for the OPT, SA-W²R, LRU, and LRU-OBL policies omitted.]

Figure 7: The hit rates of the SA-W²R scheme for the DB2 and OLTP traces: (a) DB2, (b) OLTP. [Plots of hit rate (%) versus buffer cache size for the SA-W²R, LRU-OBL, OPT, and LRU policies omitted.]

Figure 8: The hit rates of the SA-W²R scheme for the Sprite traces: (a) Sprite 2C39, (b) Sprite 3C53. [Plots of hit rate (%) versus buffer cache size for the SA-W²R and LRU-OBL policies omitted.]

Figure 9: The hit rates of the SA-W²R scheme for the real application traces: (a) cpp, (b) link, (c) ld, (d) XDataSlice. [Plots of hit rate (%) versus buffer cache size for the SA-W²R, LRU-OBL, and (where shown) OPT policies omitted.]

However, with the advent of RAID systems and better caching techniques, delays due to this type of queueing should not have an aggravating influence on performance. Hence, hit rates should be a good reflection of the actual performance seen by the user for these schemes.

SA-W²R is an even better scheme than LRU-OBL, performing consistently better than LRU-OBL in all situations, including the extreme cases mentioned previously. The maximum performance difference comes from the XDataSlice application, where SA-W²R has a hit rate that is over 11 percentage points higher than that of the LRU-OBL scheme. Overall, the performance improvement is somewhere around a 1 to 4 percentage point increase compared to LRU-OBL.

5 Implementation Experiments

The SA-W²R scheme was implemented in the GNU/Linux kernel version 2.2.14 on a Pentium III 430 MHz PC with 128 MB of memory. (At the time of this implementation, kernel version 2.2.14 was the latest stable version.) For performance comparison purposes, we also implemented the LRU-OBL scheme. The applications used to evaluate the performance are as follows.

gcc: Compile the GNU/Linux kernel version 2.2.14.

cp: Copy the whole GNU/Linux source code from the /usr/src/linux directory to another directory.

tar: Create/extract the tar file for the GNU/Linux source code.

gzip: Compress/uncompress the tar file of the GNU/Linux source code.

grep: Grep the string "linux" from the /usr/src/linuxdirectory.

sort: Sort 1,000,000 random data items.

5.1 Individual Application Performance

Table 2 shows the execution time of each application. The results shown in the table are averages of three executions of each application. Before each execution, the system was rebooted to eliminate the effect of caching from the previous execution.

The results show that the SA-W²R scheme achieves the best performance compared to the original GNU/Linux and LRU-OBL schemes. Specifically, the performance improvements due to the proposed scheme range between 5 and 23 percent relative to the original GNU/Linux scheme.

Note also that the LRU-OBL scheme always performs considerably better than the original GNU/Linux scheme, though worse than SA-W²R. This is because regular reference patterns, such as sequential references, are a dominant characteristic of many of these applications. Hence, it may be argued that this set of applications, specifically this characteristic, unfairly favors the LRU-OBL and SA-W²R schemes. To show that the SA-W²R scheme is a robust solution, we executed alongside each application a 'random' process that randomly references blocks so as to disrupt regular reference sequences such as sequential block references. Those results are shown in Table 3.

The results in Table 3 show that when regular reference behavior is disrupted, LRU-OBL may perform worse than the original GNU/Linux management scheme. They also show, however, that the SA-W²R scheme still performs consistently better, though its improvement is now somewhat smaller. This shows that SA-W²R is quite robust in its management of the buffer.

5.2 Concurrent Execution of Multiple Applications

Using the same experimental methodology, we measured the performance of applications when multiple applications were executed concurrently. Table 4 shows the applications that were executed concurrently (Groups 1 to 3) and their respective execution times using the different buffer management schemes.

As the concurrently executing applications influence the buffer and CPU resource allocation, the execution times of the applications increase considerably. Again, in all of the situations, the SA-W²R scheme shows the best performance. The improvements range from negligible (approximately 1% for gzip (compress) in Group 2) to an approximately 20% reduction in execution time (for gzip (uncompress) in Group 3).

To measure the overhead of the bookkeeping and management of information for adjusting the room sizes, we added a CPU-bound process that continuously performs simple add operations to each of the groups of applications. The results of these experiments are shown in Table 5. Note that the effect of the CPU-bound process on the execution of each application depends on the characteristics of the applications that are executing. For the applications of Group 1, the increase in execution time is small, while for those of Group 2 the increase is substantial. (This can be observed by comparing the execution times in Tables 4 and 5.) Note, though, that the increase in the execution time of the CPU-bound process (for Groups 1 and 3) is quite small, implying that the overhead of maintaining the relevant information for dynamic room partitioning is quite small.
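The CPU-bound competitor can be as simple as a tight addition loop; the stand-in below is our own (the paper does not list its code) and does nothing but additions, so any increase in its running time reflects scheduling and bookkeeping overhead rather than I/O.

    /* Minimal CPU-bound competitor: nothing but additions.  The iteration
     * count is an arbitrary illustrative choice. */
    #include <stdio.h>

    int main(void)
    {
        volatile unsigned long sum = 0;   /* volatile keeps the adds from being optimized away */
        for (unsigned long i = 0; i < 4000000000UL; i++)
            sum += i;
        printf("%lu\n", sum);
        return 0;
    }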

6 Conclusion and Future Work

In this paper, we proposed the SA-W²R scheme, which is in line with the LRU-OBL scheme, that is, a simple and practical scheme that integrates prefetching and replacement policies. It is simple and practical, and yet it is modular in that any replacement policy deemed appropriate may be incorporated into the scheme. Simplicity results in a scheme that is easy to implement, and hence practical.

An extensive implementation study was conducted, and the experimental results show that SA-W²R performs better than the original GNU/Linux and LRU-OBL implementations.

Issues such as quantifying the benefits of the modularity of the Weighing Room and the effect of hint-based prefetching on the SA-W²R scheme are being considered. The performance of prefetching is also strongly influenced by the performance of the disk system, and thus their interaction must be studied more closely.


Table 2: Average execution time for applications using different buffer management schemes (in seconds).

Application       | Original Linux | LRU-OBL | SA-W²R
gcc               | 244            | 241     | 230
cp                | 59.40          | 53.31   | 48.71
tar (create)      | 61.02          | 59.87   | 55.31
tar (extract)     | 41.70          | 39.93   | 36.26
gzip (compress)   | 72.87          | 64.73   | 56.11
gzip (uncompress) | 21.45          | 19.99   | 17.23
sort              | 47.32          | 45.14   | 42.57
grep              | 46.92          | 38.28   | 37.01

Table 3: Average execution time for applications with a 'random' process disrupting regular reference behavior, using different buffer management schemes (in seconds).

Application       | Original Linux | LRU-OBL | SA-W²R
gcc               | 394.15         | 393     | 387.24
cp                | 302.32         | 304.64  | 297.57
tar (create)      | 312            | 304.89  | 298.99
tar (extract)     | 46.55          | 49.22   | 43.58
gzip (compress)   | 76.93          | 69.17   | 66.12
gzip (uncompress) | 22.47          | 22.34   | 20.84
sort              | 54.55          | 55.29   | 54.25
grep              | 294.08         | 277.59  | 272.65

Table 4: Average execution time for groups of applications executed concurrently (in seconds).

Group   | Application       | Original Linux | LRU-OBL | SA-W²R
Group 1 | cp                | 325.73         | 318.44  | 313.23
Group 1 | tar (create)      | 329.57         | 323.79  | 319.53
Group 2 | gzip (compress)   | 95.67          | 98.26   | 94.35
Group 2 | sort              | 91.80          | 92.06   | 88.56
Group 3 | tar (extract)     | 77.96          | 83      | 75.43
Group 3 | gzip (uncompress) | 59.70          | 50.91   | 47.54
Group 3 | grep              | 130.42         | 127.42  | 122.55

Table 5: Average execution time of groups of applications executing concurrently with a CPU-bound process (in seconds).

Group   | Application            | Original Linux | LRU-OBL | SA-W²R
--      | CPU-bound process only | 222            | 222     | 222
Group 1 | cp                     | 329.57         | 315.82  | 311.94
Group 1 | tar (create)           | 331.95         | 319.74  | 314.65
Group 1 | CPU-bound process      | 235            | 235     | 236
Group 2 | gzip (compress)        | 138.77         | 143.08  | 137.52
Group 2 | sort                   | 138.74         | 141.71  | 137.17
Group 2 | CPU-bound process      | 315.07         | 315.47  | 314.96
Group 3 | tar (extract)          | 90.8           | 91.58   | 85.26
Group 3 | gzip (uncompress)      | 79.83          | 61.32   | 58.01
Group 3 | grep                   | 146.02         | 138.91  | 129.57
Group 3 | CPU-bound process      | 246.01         | 247.51  | 248.1


7 Acknowledgement

The authors would like to thank Gerhard Weikum and Theodore Johnson for providing us with the DB2 and OLTP traces, and Jongmoo Choi for helping with the application traces. We would also like to thank our shepherd, Stephen C. Tweedie, for helping us finalize this paper.

References

[1] Mary G. Baker, John H. Hartman, Michael D. Kupfer, Ken W. Shirriff, and John K. Ousterhout. Measurements of a Distributed File System. In Proceedings of the 13th ACM SOSP, pages 198-212, Pacific Grove, CA, October 1991.

[2] Pei Cao and Edward W. Felten. Implementation and Performance of Integrated Application-Controlled File Caching, Prefetching, and Disk Scheduling. ACM Transactions on Computer Systems, 14(4):311-343, November 1996.

[3] Pei Cao, Edward W. Felten, Anna R. Karlin, and Kai Li. A Study of Integrated Prefetching and Caching Strategies. In Proceedings of the 1995 Joint ACM SIGMETRICS and Performance Evaluation Conference, pages 188-197, 1995.

[4] Fay Chang and Garth A. Gibson. Automatic I/O Hint Generation through Speculative Execution. In Proceedings of the Third USENIX Symposium on Operating Systems Design and Implementation, pages 1-14, February 1999.

[5] K. Curewitz, P. Krishnan, and J. S. Vitter. Practical Prefetching via Data Compression. In Proceedings of the 1993 ACM SIGMOD Conference, pages 257-266, May 1993.

[6] Theodore Johnson and Dennis Shasha. 2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm. In Proceedings of the 20th VLDB Conference, pages 439-450, 1994.

[7] David Kotz and Carla Schlatter Ellis. Practical Prefetching Techniques for Multiprocessor File Systems. Journal of Distributed and Parallel Databases, 1(1):33-51, January 1993.

[8] Donghee Lee, Jongmoo Choi, Sam H. Noh, Sang Lyul Min, Yookun Cho, and Chong Sang Kim. On the Existence of a Spectrum of Policies that Subsumes the Least Recently Used (LRU) and Least Frequently Used (LFU) Policies. In Proceedings of the 1999 ACM SIGMETRICS Conference, pages 134-143, 1999.

[9] Hui Lei and Dan Duchamp. An Analytical Approach to File Prefetching. In Proceedings of the 1997 USENIX Annual Technical Conference, pages 275-288, January 1997.

[10] Todd C. Mowry, Angela K. Demke, and Orran Krieger. Automatic Compiler-Inserted I/O Prefetching for Out-of-Core Applications. In Proceedings of the 2nd USENIX Symposium on Operating Systems Design and Implementation, October 1996.

[11] Elizabeth J. O'Neil, Patrick E. O'Neil, and Gerhard Weikum. The LRU-K Page Replacement Algorithm for Database Disk Buffering. In Proceedings of the 1993 ACM SIGMOD Conference, pages 297-306, May 1993.

[12] J. Ousterhout, A. Cherenson, F. Douglis, M. Nelson, and B. Welch. The Sprite Network Operating System. IEEE Computer, 21(2):23-36, February 1988.

[13] R. Hugo Patterson, Garth A. Gibson, Eka Ginting, Daniel Stodolsky, and Jim Zelenka. Informed Prefetching and Caching. In Proceedings of the 15th ACM SOSP, pages 79-95, December 1995.

[14] J. T. Robinson and M. V. Devarakonda. Data Cache Management Using Frequency-Based Replacement. In Proceedings of the 1990 ACM SIGMETRICS Conference, pages 134-142, 1990.

[15] Alan Jay Smith. Sequential Program Prefetching in Memory Hierarchies. IEEE Computer, 3(3):7-21, December 1978.

[16] Alan Jay Smith. Disk Cache-Miss Ratio Analysis and Design Considerations. ACM Transactions on Computer Systems, 3(3):161-203, August 1985.

[17] Andrew Tomkins, R. Hugo Patterson, and Garth A. Gibson. Informed Multi-Process Prefetching and Caching. In Proceedings of the 1997 ACM SIGMETRICS Conference, pages 100-114, June 1997.

