Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

Transcript
Page 1: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

1

Chapter 9, Virtual MemoryOverheads, Part 1Sections 9.1-9.5

Page 2: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

2

9.1 Background

• Remember the sequence of developments of memory management from the last chapter:

• 1. The simplest approach:
• Define a memory block of fixed size large enough for any process and allocate such a block to each process (see MISC)

• This is tremendously rigid and wasteful

Page 3: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

3

• 2. Allocate memory in contiguous blocks of varying size

• This leads to external fragmentation and, by the 50-percent rule, a waste of about 1/3 of memory (for every N units allocated, another .5 N are lost to fragmentation)

• There is overhead in maintaining allocation tables down to the byte level

Page 4: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

4

• 3. Do (simple) paging.
• Memory is allocated in fixed size blocks
• This solves the external fragmentation problem
• This also breaks the need for allocation of contiguous memory
• The costs as discussed so far consist of the overhead incurred from maintaining and using a page table

Page 5: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

5

Limitations of Paging—Loading Complete Programs

• Virtual memory is motivated by several limitations in the paging scheme as presented so far

• One limitation is that it’s necessary to load a complete program for it to run

Page 6: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

6

• These are examples of why it might not be necessary to load a complete program:

• 1. Error handling routines may not be called during most program runs

• 2. Arrays of predeclared sizes may never be completely filled

Page 7: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

7

• 3. Other routines besides error handling may also be rarely used

• 4. For a large program, even if all parts are used at some time during a run, by definition, they can’t all be used at the same time

• This means that at any given time the complete program doesn’t have to be loaded

Page 8: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

8

• Reasons for wanting to be able to run a program that’s only partially loaded

• 1. Without partial loading, the size of a program is limited to the physical memory on the machine

• Given current memory sizes, this by itself is not a serious limitation, although in some environments it might still be

Page 9: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

9

• 2. For a large program, significant parts of it may not be used for significant amounts of time.

• If so, it’s an absolute waste to have the unused parts loaded into memory

• Even with large memory spaces, conserving memory is desirable in order to support multi-tasking

Page 10: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

10

• 3. Another area of saving is in loading or swapping cost from secondary storage

• If parts of a program are never needed, reading and writing from secondary storage can be saved

• In general this means leaving more I/O cycles available for useful work

Page 11: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

11

• Not having to load a complete program also means:

• A program will start faster when initially scheduled because there is less I/O for the long term scheduler to do

• The program will be faster and less wasteful during the course of its run in a system that does medium term scheduling or swapping

Page 12: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

12

Limitations of Paging—Fixed Mapping to Physical Memory

• There is also another, in a sense more general, limitation to paging as presented so far:

• The idea was that once a logical page was allocated a physical frame, it didn’t move

• It’s true that medium term scheduling, swapping, and compaction may move a process, but this has to be specially supported

• Once scheduled and running, a process’s location in memory doesn’t change

Page 13: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

13

• If page locations are fixed in memory, that implies a fixed mapping between the logical and physical address space throughout a program run

• More flexibility can be attained if the logical and physical address spaces are delinked

Page 14: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

14

• The idea is that at one time a logical page would be at one physical address, at another time it would be at another

• Run-time address resolution would handle finding the correct frame for a page when needed

Page 15: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

15

Definition of Virtual Memory

• Definition of virtual memory:
• The complete separation of logical memory space from physical memory space from the programmer’s point of view

Page 16: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

16

• At any given time during a program run, any page, p, in the logical address space could be at any frame, f, in the physical memory space

• Only that part of a program that is running has to have been loaded into main memory from secondary storage

Page 17: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

17

• Not only could any page, p, in the address space be in any frame, f, at run time

• Any logical address, on some page p, could still be located in secondary storage at any point during a run when that address isn’t actually being accessed

Page 18: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

18

Virtual Memory and Segmentation and Paging

• Both segmentation and paging were mentioned in the last chapter

• In theory, virtual memory can be implemented with segmentation

• However, that’s a mess
• The most common implementation is with paging
• That is the only approach that will be covered here

Page 19: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

19

9.2 Demand Paging

• If it’s necessary to load a complete process in order for it to run, then there is an up-front cost of swapping all of its pages in from secondary storage to main memory

• If it’s not necessary to load a complete process in order for it to run, then a page only needs to be swapped into main memory if the process generates an address on that page

• This is known as demand paging

Page 20: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

20

• In general, when a process is scheduled it may be given an initial allocation of frames in memory

• From that point on, additional frames may be allocated through demand paging

• If a process is not even given an initial footprint and it acquires all of its memory through paging, this is known as pure demand paging

Page 21: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

21

Demand Paging Analogy with TLB

• Demand paging from secondary storage to main memory is roughly analogous to what happens on a miss between the page table and the TLB

• Initially, the TLB can be thought of as empty
• The first time the process generates an address on a given page, that causes a TLB miss, and the page entry is put into the TLB

Page 22: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

22

• With pure demand paging, you can think of the memory allocation of a process as being “empty”

• The attempt to access an unloaded page can be thought of as a miss

• This miss is what triggers the allocation of a frame in memory to that page

Page 23: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

23

The Separation of Logical and Physical Address Space

• An earlier statement characterized virtual memory as completely separating the logical and physical address spaces

• Another way to think about this is that from the point of view of the logical address space, there is no difference between main memory and secondary storage

Page 24: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

24

• In other words, the logical address space may refer to parts of programs which have been loaded into memory and parts of programs which haven’t

• Accessing memory that hasn’t been loaded is slower, but the loading is handled by the system

Page 25: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

25

• From the point of view of the address, the running process doesn’t know or care whether it’s in main memory or secondary storage

• The MMU and the disk management system work together to provide transparent access to the logical address space of a program

• The IBM AS/400 is an example of a system where addresses literally extended into the secondary storage space

Page 26: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

26

Maximum Address Space

• The maximum address space is limited by the machine architecture

• It is defined by how many bits are available for holding an address

• The amount of installed main memory might be less than the maximum address space

Page 27: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

27

• If so, then the address space extends into secondary storage

• Virtual memory effectively means that secondary storage functions as a transparent extension of physical main memory

Page 28: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

28

Support for Virtual Memory and Demand Paging

• From a practical point of view, it becomes necessary to have support to tell which pages have been loaded into physical memory and which have not

• This is part of the hardware support for the MMU

Page 29: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

29

• In the earlier discussions of page tables, the idea of a valid/invalid bit was introduced

• Under that scheme, the page table was long enough to accommodate the maximum number of allocated frames

• If a process wasn’t allocated the maximum, then page addresses outside of its allocation were marked invalid

Page 30: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

30

• The scheme can be extended for demand paging:

• Valid means valid and in memory.
• Invalid means either invalid or not loaded

Page 31: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

31

• Under the previous scheme, if an invalid page was accessed, a trap was generated, and the running process was halted due to an attempt to access memory outside of its range

• Under the new scheme, an attempt to access an invalid page also generates a trap, but this is not necessarily an error

• The trap is known as a page fault trap

Page 32: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

32

• This is an interrupt which halts the user process and triggers system software which does the following:

• 1. It checks a table to see whether the address was really invalid or just not loaded

• 2. If invalid, it terminates the process

Page 33: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

33

• 3. If valid, it gets a frame from the list of free frames (the frame table), allocates it to the process, and updates the data structures to show that the frame is allocated to page x of the process

• 4. It schedules (i.e., requests) a disk operation to read the page from secondary storage into the allocated frame

Page 34: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

34

• 5. When the read is complete, it updates the data structures to show that the page is now valid (among other things, setting the valid bit)

• 6. It allows the user process to restart on exactly the same instruction that triggered the page fault trap in the first place
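
• The six steps can be sketched in Python with toy stand-ins; free_frames, valid_pages, and read_from_disk are invented for illustration, not a real kernel interface:

```python
free_frames = [0, 1, 2, 3]          # the free-frame list used in step 3
page_table = {}                     # page -> frame, for pages currently in memory
valid_pages = {0, 1, 2, 5}          # pages actually in the process's address space

def read_from_disk(page, frame):    # stands in for the disk read of steps 4-5
    print(f"reading page {page} into frame {frame}")

def handle_page_fault(page):
    if page not in valid_pages:     # steps 1-2: the reference really was invalid
        raise RuntimeError("invalid reference: terminate the process")
    frame = free_frames.pop()       # step 3: allocate a free frame
    read_from_disk(page, frame)     # step 4: schedule the read
    page_table[page] = frame        # step 5: update tables, set the valid bit
    # step 6: restart the process on the faulting instruction (not modeled here)

handle_page_fault(5)
print(page_table)                   # {5: 3}
```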

Page 35: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

35

• Note two things about the sequence of events outlined above.

• First:
– Restarting is just an example of context switching
– By definition, the user process’s state will have been saved
– It will resume at the IP value it was on when it stopped
– The difference is that the page will now be in memory and no fault will result

Page 36: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

36

• Second:
– The statement was made, “get a frame from the list of free frames”.
– You may be wondering, what if there are no free frames?
– At that point, memory is “over-allocated”.
– That means that it’s necessary to take a frame from one process and give it to another
– This is an important consideration that will be covered in detail later

Page 37: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

37

Demand Paging and TLB’s as a Form of Caching

• Demand paging from secondary storage to main memory is analogous to bringing an entry from the page table to the TLB

• Remember that the TLB is a specialized form of cache

• Its effectiveness relies on locality of reference
• If references were all over the map, it would provide no benefit

Page 38: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

38

• In practice, memory references tend to cluster in certain areas over certain periods of time, and then move on

• This means that entries remain in the TLB and remain useful over a period of time

Page 39: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

39

• The logic of bringing pages from secondary storage to memory is also like caching

• Pages that have been allocated to frames should remain useful over time, and can profitably remain in those frames

• Pages should tend not to be used only once and then have to be swapped out immediately because another page is needed and over-allocation of memory has occurred

Page 40: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

40

Hardware Support for Demand Paging

• Basic hardware support for demand paging is the same as for regular paging

• 1. A page table that records valid/invalid pages
• 2. Secondary storage—a disk.
– Recall that program images are typically not swapped in from the file system.
– The O/S maintains a ready queue of program images in the swap space, a.k.a. the backing store

Page 41: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

41

Problems with Page Faults

• A serious problem can occur when restarting a user process after a page fault

• This is not a problem with context switching per se

• It is a problem that is reminiscent of the problems of concurrency control

Page 42: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

42

• An individual machine instruction actually consists of multiple sub-parts

• Each of the sub-parts may require memory access

• Thus, a page fault may occur on different sub-parts

Page 43: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

43

• Memory is like a shared resource
• When a process is halted mid-instruction, it may leave memory in an inconsistent state
• One approach to dealing with this is to roll back any prior action a process has taken on memory before restarting it

• Another approach is to require that a process acquire all memory needed before taking any further action

Page 44: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

44

• Instruction execution can be broken down into these steps:

• 1. Fetch the instruction
• 2. Decode the instruction
• 3. Fetch operands, if any
• 4. Do the operation (execute)
• 5. Write the results, if any

Page 45: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

45

• A page fault can occur on the instruction fetch
• In other words, during execution, the IP reaches a value that hasn’t been loaded yet
• This presents no problem
• The page fault causes the page containing the next instruction to be loaded
• Then execution continues on that instruction

Page 46: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

46

• A page fault can also occur on the operand fetches
• In other words, the pages containing the addresses referred to by the instruction haven’t been loaded yet
• A little work is wasted on a restart, but there are no problems
• The page fault causes the operand pages to be loaded (making sure not to replace the instruction page)
• Then execution continues on that instruction

Page 47: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

47

Specific Problem Scenario

• In some hardware architectures there are instructions which can modify more than one thing (write >1 result).

• If the page fault occurs in the sequence of modifications, there is a potential problem

• Whether the problem actually occurs is simply due to the vagaries of scheduling and paging

• In other words, this is a problem like interrupting a critical section

Page 48: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

48

• The memory management page fault trap handling mechanism has to be set up to deal with this potential problem

• The book gives two concrete examples of machine instructions which are prone to this

• One example was from a DEC (rest in peace) machine.

• It will not be pursued

Page 49: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

49

• The other example comes from an IBM instruction set

• There was a memory move instruction which would cause a block of 256 bytes to be relocated to a new address

Page 50: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

50

• Memory paging is transparent
• Application programs simply deal in logical addresses
• There is not, and there should not be, any kind of instruction where an application program has to know or refer to its pages/frames when doing memory operations

Page 51: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

51

• Put simply, the move instruction could be from a location on one page to a location on another page

• This should be legal and the fact that the move crossed page boundaries should be of no concern to the application program

Page 52: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

52

• Also, the move instruction in question allowed the new location to overlap with the old location

• In other words, the move instruction could function as a shift

• From an application point of view, this is perfectly reasonable

• Why not have a machine instruction that supports shifting?

Page 53: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

53

• The problem scenario goes like this:
• Suppose you have a 256 byte block of interest, and it is located at the end of a page
• This page is in memory, but the following page is not in memory
• For the sake of argument, let the move instruction in fact cause a shift to the right of 128 bytes

Page 54: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

54

• Instruction execution starts by “picking up” the full 256 bytes

• It shifts to the right and lays down the first 128 of the 256 bytes

• It then page faults because the second page isn’t in memory yet

Page 55: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

55

• Restarting the user process on the same instruction after the page fault without protection will result in inconsistent memory state

• Memory on the first page has already been modified

• When the instruction starts over, it will then shift the modified memory on the first page 128 bytes to the right

Page 56: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

56

• You do not get the original 256 bytes shifted 128 bytes to the right

• At a position 128 bytes to the right you get 128 blank bytes followed by the first 128 bytes of the original 256

• The problem is that the effects of memory access should be all or nothing.

• In this case you get half and half
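
• The inconsistency is easy to simulate. A sketch, assuming a page size of 256 so the block exactly fills the first page; the exact garbage left behind depends on the real instruction’s semantics:

```python
PAGE = 256
memory = bytearray(range(256)) + bytearray(PAGE)  # page 0 is loaded; page 1 is "not loaded"

# The instruction picks up the 256-byte block and starts laying it down 128
# bytes to the right; only the first 128 destination bytes are on page 0.
block = bytes(memory[0:256])
memory[128:256] = block[0:128]
# -- page fault here: page 1 was not in memory --

# On restart, the instruction re-reads the source, which was already modified:
block = bytes(memory[0:256])
memory[128:384] = block

print(list(memory[256:264]))  # [0, 1, ..., 7] where [128, 129, ..., 135] belonged
```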

Page 57: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

57

Solving the Problem

• There are two basic approaches to solving the problem

• They are reminiscent of solutions to the problem of coordinating locking on resources

Page 58: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

58

Solution 1: Locking

• Have the instruction try to access both the source and the destination addresses before trying to shift

• This will force a page fault, if one is needed, before any work is done

• This is the equivalent of having the process acquire a lock on all needed resources in advance so that it will execute correctly

Page 59: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

59

Solution 2: Rollback

• Use temporary registers to hold operand values
• In other words, let the system store the contents of the source memory location before any changes are made

• If a page fault occurs when trying to complete the instruction, restore the prior state to memory before restarting the instruction

• This is the equivalent of rolling back a half finished process

Page 60: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

60

• Solution two could also be described as extending the state saving aspect of context switching from registers, etc., to memory.

• Generalizing state is an important idea
• For example, concurrency control in database management transactions extends the idea of state consistency all the way into secondary storage

Page 61: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

61

Transparency of Paging

• The problem of inconsistent state in memory due to an instruction interrupted by a page fault is not the only difficulty in implementing demand paging

• Other problems will also be discussed

Page 62: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

62

• In any case, demand paging should be transparent

• In other words, the solutions to any problems should not require user applications to do anything but merrily roll along generating logical addresses

• The system implementation has to deal with any problems that actually involve knowledge of the pages and frames involved

Page 63: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

63

Demand Paging Performance

• Let the abbreviation ma stand for memory access time

• This will represent the amount of time needed to simply access a known address that has already been loaded into main memory

• In the previous chapter, a figure of 100 ns. was used in cost calculations

• The author now gives 10-200 ns. as the range of memory access times for current computers

Page 64: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

64

• All previous cost estimates were based on the assumption that all pages were in memory

• The only consideration was whether you had a TLB hit or miss and incurred the cost of one or more additional hits to memory for the page table

Page 65: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

65

• In the following discussion, the memory access time, ma, will not be broken down into separate parts based on TLB hits and misses

• ma will be the average cost of accessing a page that is in memory, given whatever performance characteristics the TLB and page table might have for a given MMU

• In other words, ma as used here will include the details covered in the previous chapter

Page 66: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

66

• Under demand paging an additional, very large cost can be incurred:

• The cost of a page fault, requiring a page to be read from secondary storage

• This turns out to be the most significant part of the calculation of the cost of supporting virtual memory

Page 67: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

67

• This is why, in the following calculations, there is no reason to fiddle with TLB hits and misses

• That portion of the calculation would be completely dominated by the cost of a page fault

Page 68: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

68

Calculating Average Effective Memory Access Time

• Given a probability p, 0 <= p <= 1, of a page fault, the average effective access time of a system can be calculated

• Average effective memory access time
• That is, the average cost of one memory access
• = (1 – p) * ma + p * (page fault time)

Page 69: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

69

Page Fault Time

• Page fault time includes twelve components:
• 1. The time to send the trap to the O/S
• 2. Context switch time (saving process state, registers, etc.)
• 3. Determining that the interrupt was a page fault (i.e., interrupt handling mechanism time)
• 4. Checking that the page reference was legal and determining the location on disk (this is the interrupt handling code in action)

Page 70: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

70

• 5. Issuing a read from the disk to a free frame.

• This means a call through the disk management system code and includes
– A. Waiting in a queue for this device until the read request is serviced
– B. Waiting for the device seek and latency time
– C. Beginning the transfer of the page to a free frame

Page 71: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

71

• 6. While waiting for the disk read to complete, optionally scheduling another process.

• Note what this entails:
– A. It has the desirable effect of increasing multi-programming and CPU utilization
– B. There is a small absolute cost simply to schedule another process
– C. From the point of view of the process that triggered the page fault, there is a long and variable wait before being able to resume

Page 72: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

72

• 7. Receiving the interrupt (from the disk when the disk I/O is completed)

• 8. Context switching the other process out if step 6 was taken

• 9. Handling the disk interrupt
• 10. Correcting (updating) the frame and page tables to show that the desired page is now in a frame in memory

Page 73: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

73

• 11. As noted in 6.C, waiting for the CPU to be allocated again to the process that generated the page fault

• 12. Context switch—restoring the user registers, process state, and updated page table;

• Then the process that generated the page fault can be restarted at the instruction that generated the fault

Page 74: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

74

• The twelve steps listed above fall into three major components of page fault service time:

• 1. Servicing the page fault interrupt
• 2. Reading in the page
• 3. Restarting the process

Page 75: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

75

• The time taken for the whole process is easily summarized in terms of the three components

• The book gives representative times for the components

• 1. Service the page fault interrupt:
– Several hundred machine instructions
– 1-100 microseconds

Page 76: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

76

• 2. Read the page from the drive
– Latency: 3 milliseconds
– Seek: 5 milliseconds
– Transfer: .05 milliseconds
– Total: 8.05 milliseconds
– For all practical purposes, you can round this off to 8 milliseconds

Page 77: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

77

• 3. Restart the process.
– This is similar to #1: 1-100 microseconds

• Points 1 and 3 are negligible compared to the cost of accessing secondary storage

• The overall cost of the three components, 1 + 2 + 3, can be approximated at 8 milliseconds

Page 78: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

78

Demand Paging Performance—with Representative Numbers

• The average performance of demand paging can now be gauged

• Let the following values be used:
• ma = memory access time = 200 ns.
• page fault time = 8 milliseconds as estimated above
• p = probability of a page fault, 0 < p < 1

Page 79: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

79

• Average effective access time
• = (1 – p) * ma + p * (page fault time)
• For some probability p, substituting in the values for memory access and page fault time gives this average effective access time

• = (1 – p) * 200 ns. + p * (8 milliseconds)

Page 80: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

80

• nano is a billionth and milli is a thousandth, so nano is a million times smaller than milli

• Converting the whole expression to nanoseconds, average effective access time

• = (1 – p) * 200 + p * 8,000,000
• = 200 – p * 200 + p * 8,000,000
• = 200 + 7,999,800 * p

Page 81: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

81

• The effective access time is directly proportional to the page fault rate

• Not surprisingly, the cost of a fault (to secondary storage) dominates the expression for the cost overall

Page 82: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

82

• Say, for example, that p = 1/1000.
• In other words, the fault rate was .1%, or the hit rate was 99.9%
• Then the expression for average memory access time becomes:
• 200 + 7,999,800 * p
• = 200 + 7,999,800 * .001
• ~= 200 + 8,000
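
• The arithmetic is easy to check with a short function (the names here are ours, not the book’s):

```python
def effective_access_time(p, ma_ns=200, fault_ns=8_000_000):
    """(1 - p) * ma + p * (page fault time), all in nanoseconds."""
    return (1 - p) * ma_ns + p * fault_ns

print(effective_access_time(0.001))  # 8199.8 ns: roughly 8,000 ns of overhead on a 200 ns base
```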

Page 83: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

83

• 8,000 is 40 times as large as 200, the cost for a simple memory access

• In other words, under this scenario, even with a page fault rate of only .1%, virtual memory with demand paging incurs an overhead cost that is 40 times the base cost

• Put in more frightening terms, the overhead cost is 4,000% of the base cost

Page 84: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

84

• You can use the equation in another way, do a little algebra, and figure out what the page fault rate would have to be in order to attain a reasonable overhead cost

• If 10% overhead is reasonable, 10% of 200 nanoseconds is 20 nanoseconds, and average effective access time would have to be 220 nanoseconds

Page 85: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

85

• Putting this into the equation gives
• 220 = (1 – p) * 200 + p * (8,000,000)
• 220 = 200 – p * 200 + p * 8,000,000
• 20 = 8,000,000p – 200p
• 20 = 7,999,800p

Page 86: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

86

• p = 20/7,999,800
• ~= 20/8,000,000
• = 2/800,000
• = 1/400,000
• = .0000025 = .00025%
• In other words, if you have a page fault rate of 1 page or fewer in 400,000 memory accesses, the overhead of demand paging is 10% or less
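
• The same algebra, done numerically with the figures above:

```python
ma, fault = 200, 8_000_000          # nanoseconds
target = 1.10 * ma                  # 220 ns: base cost plus 10% overhead
p = (target - ma) / (fault - ma)    # 20 / 7,999,800
print(p)                            # ~2.5e-06, i.e. about 1 fault per 400,000 accesses
```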

Page 87: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

87

• Some conclusions that can be drawn from this:

• 1. The page fault rate has to be quite low to get good performance

• 2. This can only happen if there is locality of reference

Page 88: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

88

• 3. This means that memory has to be relatively large.

• It has to be large enough to hold most of the pages a program will use during a period of time

• Keep in mind that this is in the context of multi-programming/multi-tasking

• Complex memory management would not be needed if only one program at a time could run

Page 89: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

89

• Programs themselves have to be relatively large/long/repetitive.

• If they did not reuse pages, the initial page reads to load the program would dominate the overall cost

Page 90: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

90

• All of the foregoing statements were understatements of a sort

• It’s not too soon to try and start seeing the big picture
• Real life programs can be mammoth, long-running beasts
• Paging is the engine which allows them to slowly and smoothly move from executing one part of themselves to another, acquiring the memory needed as they go

Page 91: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

91

Virtual Memory and Swap Space

• Swap space in secondary storage is managed differently from file space

• There does not have to be a complex directory structure

• The size of the blocks in which data are managed in swap space may be bigger than the size in the file system

• When doing virtual memory paging, there are implementation choices to be made in how to use swap space

Page 92: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

92

• There are basically two approaches to using the swap space

• 1. When a program is started, copy it completely from the file system space into the swap space and page from there

• 2. If swap space is limited, do initial program page demand reads from the file system, copying those pages into the swap space as they enter the active address space

Page 93: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

93

Paging Costs

• An implementation of demand paging may bring in an initial, fixed-size set of pages for a process at start up time

• Waiting for the pages to load can cause a time lag before execution starts

• Pure demand paging brings in a page at a time, only when the page is accessed

• This will cause a quicker start-up on the initial page, but complete paging costs are simply incurred over time as each successive page is read in

Page 94: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

94

9.3 Copy on Write

• Paging costs can be reduced, or extended over the course of a program run, by using techniques based on shared memory

• Recall that in a Unix-like system, a fork() call spawns a new process, a child of some parent process

• As explained thus far, the child may be a copy of the parent, or the code of the parent may be replaced by doing an exec() call

Page 95: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

95

• In a copy-on-write implementation, the child does not immediately get its own address space

• The child shares the parent’s address space
• The shared pages are marked copy-on-write
• This information can be stored like valid/invalid bits or protections in the memory management system

Page 96: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

96

• If the child process tries to write to a shared page, then a page fault is triggered

• A new page is allocated to the child
• The new page has the image of the shared (parent’s) page copied into it
• Then the child modifies its copy, not the original
• A new page allocation only has to be made when the child performs a write
• This technique is used in Linux, Solaris, and Windows XP
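
• Copy-on-write can be observed from user code. A minimal sketch, assuming a Unix-like system, using Python’s os.fork() wrapper around fork(); the child’s write lands on its own copy, so the parent’s data is untouched:

```python
import os

data = bytearray(b"x" * 4096)      # about a page of data, shared after the fork

pid = os.fork()
if pid == 0:                       # child: writing triggers copy-on-write underneath
    data[0:5] = b"child"
    os._exit(0)

os.waitpid(pid, 0)
print(bytes(data[0:5]))            # b'xxxxx' -- the parent's page was never modified
```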

Page 97: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

97

• Another system call in Unix and Linux is vfork()
• If this is called, the child shares its parent’s address space
• But the parent is immediately suspended, and stays suspended until the child calls exec() or exits
• For all practical purposes the parent is out of the picture during that time, and its memory is used directly by the child
• The child doesn’t incur any paging costs
• The memory remains loaded and is simply used by the child rather than being copied

Page 98: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

98

Speed of Memory Allocation

• Copy-on-write generates a page request—but it doesn’t trigger the need to read a new page from secondary storage

• A program may also need more heap or stack space during the course of a run which doesn’t require access to secondary storage

• It is important for requests like these for dynamic memory allocation to be met quickly

Page 99: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

99

The Free Frame Pool

• In order to support quick memory allocation not dependent on reading from secondary storage, a free frame pool is maintained

• In other words, memory is never completely allocated to user processes

• The system always maintains a reserve of free frames
• These frames are allocated using zero-fill-on-demand
• This means that any previous contents are zeroed out before a frame is allocated to a process

Page 100: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

100

9.4 Page Replacement

• The underlying scenario of virtual memory with demand paging is this:

• Most programs, most of the time, don’t have to be fully loaded in order to run

• However, if a program isn’t fully loaded, as time passes it may trigger a page fault

Page 101: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

101

Memory Allocation with Multi-Programming

• A simple illustration might go like this:
• Suppose the system has 40 frames
• Suppose you have 8 processes, which if loaded completely would take 10 pages each
• Suppose that the 8 processes can actually run effectively with only 5 pages allocated apiece
• Memory is fully allocated and multi-programming is maximized

Page 102: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

102

O/S Demands on Memory

• Keep in mind that it’s not just user processes that make demands on memory

• The O/S requires memory to maintain the free frame pool

• It also requires memory for the creation/allocation of I/O buffers to processes, and so on

• In some systems, a certain amount of memory may be strictly reserved for system use

• In other systems, the system itself has to compete for the free memory pool with user processes

Page 103: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

103

Process Demands on Memory

• It is true that no one process can literally use all of its pages simultaneously

• However, a program can enter a stage where it accesses all or most of its pages repeatedly in rapid succession

• In order to be efficient (without incurring a paging cost for each page access) all or most of the pages would have to be loaded

Page 104: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

104

Over-Allocation of Memory

• Suppose, under the thumbnail scenario given, that one of the 8 processes needs 10 pages instead of 5

• The demand for memory exceeds the supply
• A system may empty the free frame pool in order to try and meet the demand
• When no free memory is left, this is known as over-allocation of memory

Page 105: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

105

• Formally, you know that memory is over-allocated when this happens:

• 1. A running process triggers a page fault
• 2. The system determines that the address is valid, the page is not in memory, and there is no free frame to load the page into

Page 106: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

106

Dealing with Over-Allocation of Memory

• There are two preliminary, simple-minded, and undesirable solutions to this problem:

• 1. Terminate the process which triggered the fault (very primitive and not good)

• 2. Swap out another process, freeing all of its frames (somewhat less primitive, but also not good)

• The point is that there has to be a better way

Page 107: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

107

Basic Page Replacement

• If no frame is free, find one that’s allocated but not currently being used

• Notice that this is a simplification.
• Strictly speaking, at any instant only the pages the current instruction touches are literally being used
• Ultimately it will be necessary to define “not currently being used”
• At any rate, “free” the unused page by writing its contents to swap space

Page 108: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

108

• Then update the frame table to show that the frame no longer belongs to the one process

• Update the page table to show that the page of the process that originally had the frame allocated to it is no longer in memory

• Read in the page belonging to the new process
• Update the tables to reflect that fact

Page 109: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

109

Swap Space’s Role

• Notice that swap space now has a very important role

• In general, a currently unused page may have been written into by the process that has been using it

• The changes have to be saved
• That means that the page has to be written out

Page 110: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

110

• You don’t write currently unused pages to file space
• There are two reasons for this:
• 1. It’s not efficient to write to file space; it’s more efficient to write to an image in swap space
• 2. By definition, the process the page belongs to is still running
• Therefore, the assumption is that the page could have to be swapped in again in the future

Page 111: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

111

• Swap space becomes the intermediary between the file space and main memory

• Note that in future discussions we’ll consider the possibility of keeping track of which pages have been written to

• If a page hasn’t been written to, you can still rely on the original image that was in swap space when the page was loaded

Page 112: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

112

Managing Storage Hierarchies

• You may recall the storage hierarchy given in chapter 1 that went from registers to cache to main memory to disk

• The main parameters that distinguished the levels in the hierarchy were speed, size, and cost

Page 113: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

113

• TLB’s provided an illustration of mediating between something small and fast and something large and slow

• There had to be an algorithm for entering a subset of page table entries in the TLB

• A reliable strategy was based on recency of reference

• When a victim was needed, the least recently used table entry was chosen

Page 114: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

114

• The same ideas are used over and over again, potentially at every level of a storage hierarchy

• With virtual memory paging, the main memory allocated to a process is a subset of the total address space of the process

• When memory is over-allocated, victim pages have to be chosen in the same way that victim TLB entries have to be chosen on a TLB miss

Page 115: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

115

• The same general idea is recapitulated at other levels

• Cache mediates between the registers on the CPU and main memory

• Swap space mediates between main memory and the file system

Page 116: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

116

Summary of Page Fault Service with Page Replacement

• Returning to the topic at hand, this is a summary of page replacement:

• 0. A process requests a page address.
– A fault occurs because the address is valid but the page hasn’t been loaded into memory
• 1. Find the location of the requested page on the disk

Page 117: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

117

• 2. Find a free frame
– A. If there’s a free frame in the frame pool, use it
– B. If there is no free frame, use a page replacement algorithm to choose a victim frame
– C. Write the victim page to disk and update the page and frame tables accordingly

Page 118: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

118

• 3. Read the desired page into the newly freed frame
– Update the page and frame tables

• 4. Restart the user process
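
• A sketch of step 2 as Python, with invented stand-in structures (free_frames, page_table, dirty, write_to_swap); the dirty-bit test anticipates a discussion a few overheads ahead:

```python
import random

free_frames = []                            # empty pool: memory is over-allocated
page_table = {10: 0, 11: 1, 12: 2}          # resident pages -> frames
dirty = {10: True, 11: False, 12: False}

def write_to_swap(page):                    # stands in for step 2C's disk write
    print(f"writing page {page} to swap space")

def choose_victim():                        # placeholder policy; FIFO/LRU/optimal follow
    return random.choice(list(page_table))

def get_frame():
    if free_frames:                         # 2A: a free frame exists
        return free_frames.pop()
    victim = choose_victim()                # 2B: pick a victim
    if dirty[victim]:
        write_to_swap(victim)               # 2C: save the victim's contents if modified
    return page_table.pop(victim)           # invalidate the victim's mapping

print(f"the requested page can be read into frame {get_frame()}")  # step 3 would follow
```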

Page 119: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

119

Page Fault Cost

• Note that if no frame is free, the page replacement algorithm costs two page transfers from secondary storage

• One to write out the victim
• Another to read in the requested page
• If the victim hasn’t been changed, writing it out is an avoidable expense

Page 120: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

120

Dirty Bits

• In the page table, entries may include valid/invalid bits or protection bits

• It is also possible to maintain a dirty, or modify bit

• If a page is written to, the dirty bit is set
• If it hasn’t been written to, the dirty bit remains unset
• If a page is chosen as a victim, it only needs to be written out if its dirty bit has been set

Page 121: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

121

• Nothing is simple
• Depending on how the system is coded, it may still be necessary to write out clean pages
• Swap space may be managed in such a way that “unneeded” pages in swap space are overwritten
• If so, the only copy of a given page may be in main memory, and if that page has been purged from swap space, it will have to be written out to swap space

Page 122: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

122

• In the context of the previous general discussion of the management of storage hierarchies, note the following:

• An analogous problem exists in the management of cache

• If cache has been written into, then the contents of cache have to be written to memory in order to save them

Page 123: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

123

Paging Algorithm Implementation

• Everything so far has been discussed in general terms

• Specifically, in order to implement demand paging you need two algorithms:

• 1. A frame allocation algorithm:
– Briefly stated, how many frames does a given process get?
– This will be dealt with in the next section

Page 124: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

124

• 2. A page replacement algorithm:
– Briefly stated, how do you choose a victim when memory is over-allocated?
– This is the topic of the remainder of this section

• The overall goal is to pick or devise algorithms that will lead to the lowest possible page fault rate

• We saw earlier that a high page fault rate kills virtual memory access performance

Page 125: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

125

Leading to Thumbnail Analyses with Page Reference Strings

• It is possible to do simple, thumbnail analyses of page replacement algorithms

• These are similar in scope to the thumbnail analyses of scheduling algorithms done using Gantt charts

• Analyses of paging use memory reference strings and representations of memory

• A memory reference string is a sequence of memory addresses that a program refers to

Page 126: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

126

• In practice, examples of memory reference strings can be obtained by tracing the behavior of a program in a system

• Examples can also be devised by running a random number generator

• Paging algorithms can be compared by checking how many page faults each one generates on one or more memory reference strings

Page 127: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

127

Specifying Memory Reference Strings

• Suppose that an address space is of size 10,000

• In other words, valid addresses, base 10, range from 0000 to 9999

• Here is an example of a reference string:
• 0100, 0432, 0101, 0612, 0102, 0103, …

Page 128: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

128

• The string can be simplified as a result of these observations:

• For the purposes of analyzing paging, you’re only interested in which page an attempted memory access goes to, not which address within a page

• For any address within the page, the whole page will have to be brought in in any case

• Therefore, you can simplify the reference string to its page components only

Page 129: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

129

• Suppose the page size is 100
• Then the first two digits of the reference string values represent page id’s
• For the purposes of analyzing paging behavior, the example string given earlier can be simplified to 1, 4, 1, 6, 1, 1, …
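
• The simplification is one line of Python: integer division by the page size keeps only the page component:

```python
addresses = [100, 432, 101, 612, 102, 103]   # the example string, base 10
PAGE_SIZE = 100

pages = [a // PAGE_SIZE for a in addresses]
print(pages)                                 # [1, 4, 1, 6, 1, 1]
```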

Page 130: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

130

• 1, 4, 1, 6, 1, 1, …
• Consecutive accesses to the same page can’t cause a fault
• That means that in the future it will be possible to simplify the reference string further by not including consecutive accesses to the same page

Page 131: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

131

A Simple, Initial Example

• Suppose you had 3 frames available for the reference string:

• 1, 4, 1, 6, 1, 1
• This string would cause exactly 3 page faults
• There would be one page fault each for reading in pages 1, 4, and 6 the first time
• The reference string as given requests no other pages, and 1, 4, and 6 would stay memory resident

Page 132: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

132

Extending the Example

• Here is an extension to an extreme case
• Suppose you had only 1 frame available for the reference string
• 1, 4, 1, 6, 1, 1
• This string would cause 5 page faults
• There would be one page fault each for reading in 1, 4, 1, 6, and 1
• The last page access, to page 1, would not cause a fault since 1 would be memory resident at that point

Page 133: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

133

• Not surprisingly, the more frames are available, the fewer the number of page faults generated

• The behavior is generally negative exponential
• A graph illustrating this is given on the next overhead

Page 134: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

134

[Figure: page faults plotted against number of frames; a generally decreasing, roughly negative exponential curve]

Page 135: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

135

A FIFO Page Replacement Algorithm

• FIFO in this context describes the method for choosing a victim page when memory is over-allocated

• The idea is that the victim is always the oldest page

Page 136: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

136

• Note, this is not least recently used page replacement

• It’s least recently read in page replacement
• There’s no need to timestamp pages as long as you maintain a data structure like a queue that keeps records of pages ordered by age

Page 137: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

137

Thumbnail Examples

• The thumbnail examples, taken from the book, will be based on a reference string of length 20 and 3 available frames

• The frames are drawn for each case where there’s a fault.

• There’s no need to redraw them when there’s no fault

Page 138: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

138

• The book’s representation of the frames does not show which one is oldest.

• While following the examples you have to keep track of which one is the oldest

Page 139: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

139

A Simple FIFO Example

• The example is visually presented on the overhead following the next one

• The reference string goes across the top
• The frame contents are shown in the rectangles below the string

Page 140: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

140

• It’s not hard to keep track of the page that was least recently read in

• FIFO page replacement just cycles through the three frames, top to bottom

• The page least recently read in is the one after the last one you just updated (replaced)

• In the example shown on the next overhead, for a reference string of length 20 and 3 memory frames, there are 15 page faults
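
• A minimal FIFO simulator in Python; the length-20 string below is an assumption about the book’s figure (not reproduced here), but it does yield the 15 faults just mentioned:

```python
from collections import deque

def fifo_faults(refs, nframes):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue                       # resident: no fault
        faults += 1
        if len(frames) == nframes:
            frames.remove(queue.popleft()) # evict the page least recently read in
        frames.add(page)
        queue.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))                # 15
```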

Page 141: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

141

[Figure: FIFO page replacement on the length-20 reference string with 3 frames; 15 page faults]

Page 142: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

142

General Characteristics of FIFO Page Replacement

• FIFO page replacement is easy to understand and program

• Its performance isn’t particularly good
• It goes against locality of reference
• A page that is frequently used will often end up being the one least recently read in
• Under FIFO it will become a victim and have to be read in again right away

Page 143: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

143

Belady’s Anomaly

• Some page replacement algorithms, like FIFO, have a surprising and not very pleasant characteristic, known as Belady’s anomaly

• You would think that increasing the available memory would decrease the number of page faults

Page 144: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

144

• Under FIFO, depending on the reference string, increasing the number of available frames may increase the number of page faults

• In other words, the general, negative exponential curve shown earlier is not smoothly descending under FIFO page replacement

Page 145: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

145

• On the next overhead a reference string is processed using FIFO with 3 and 4 available memory frames

• This reference string exhibits Belady’s anomaly.

• With 3 frames there are 9 page faults
• With 4 frames there are 10 faults
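
• The counts can be checked by reusing the fifo_faults sketch from the FIFO example above; the 12-reference string here is the classic Belady example, an assumption about the book’s figure:

```python
# fifo_faults as defined in the earlier FIFO sketch
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more frames, yet more faults
```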

Page 146: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

146

[Figure: FIFO on a reference string exhibiting Belady’s anomaly; 9 faults with 3 frames, 10 with 4]

Page 147: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

147

Optimal Page Replacement

• An optimal page replacement algorithm would be one with the lowest page fault rate

• This means an algorithm that chooses for replacement the page that will not be used for the longest period of time

• Assuming the number of frames is fixed over time, such an algorithm would not suffer from Belady’s anomaly

Page 148: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

148

A Thumbnail Example of Optimal Page Replacement

• When the time comes to do a replacement, choose the victim by looking ahead (to the right) in the memory reference string

• Choose as a victim that page, among those in the memory, which will be used again farthest in the future

• The first example memory reference string is repeated on the next overhead with 3 frames and optimal replacement
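
• A sketch of the look-ahead rule in Python: on a fault, evict the resident page whose next use lies farthest to the right in the string (a page never used again counts as infinitely far):

```python
def optimal_faults(refs, nframes):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]
            victim = max(frames,
                         key=lambda q: future.index(q) if q in future else float("inf"))
            frames.remove(victim)          # evict the page used farthest in the future
        frames.add(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(optimal_faults(refs, 3))             # 9, matching the count on the next overhead
```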

Page 149: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

149

[Figure: optimal page replacement on the length-20 reference string with 3 frames; 9 page faults]

Page 150: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

150

• The same reference string and number of frames with FIFO replacement had 15 page faults

• Optimal page replacement has 9 faults
• In both cases the first 3 reads are unavoidable
• Then marginally, FIFO had 12 faults and optimal had 6
• In this sense, for this example, optimal is twice as good as FIFO

Page 151: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

151

Optimal Page Replacement is Not Possible

• Notice the similarity of optimal page replacement with SJF scheduling

• To implement the optimum, you would need knowledge of future events

• Since this isn’t possible, it is necessary to approximate optimal page replacement with some other algorithm

Page 152: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

152

LRU Page Replacement: Approximating Optimal Replacement

• LRU page replacement is effectively a rule of thumb for predicting future behavior

• The idea is that something that hasn’t been used lately isn’t likely to be used again soon

• Conversely, something that has been used lately is likely to be used again soon

• This is the same reasoning that is used with caching

Page 153: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

153

• The author makes the following observation:
• LRU page replacement is the optimal page replacement strategy looking backward in time rather than forward

• For those interested in logical puzzles, consider a memory reference string S and the same string reversed, SR

Page 154: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

154

• SR is like “looking at S backwards” and the following relationships hold:

• The page fault rate for optimal page replacement on S and SR are the same

• The page fault rate for LRU page replacement on S and SR are the same

Page 155: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

155

• LRU page replacement applied to the memory reference string of the previous examples is illustrated on the next overhead

Page 156: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

156

[Figure: LRU page replacement on the length-20 reference string with 3 frames; 12 page faults]

Page 157: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

157

• LRU gives 12 faults total, which is midway between the 9 of optimal and the 15 of FIFO page replacement

• It is possible to identify places where LRU is an imperfect predictor of future behavior

• At the spot marked with a star, 2 has to be read in again when it was the value most recently replaced

• When read in, it would be better if it replaced 4 instead of 3, because 4 is never accessed again

• However, the algorithm can’t foresee this, and 4 was the most recently used page, so it isn’t chosen as the victim

Page 158: Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5

158

Implementing LRU

• For the time being, the assumption is that each process has a certain allocation of frames

• When a victim is selected, it is selected from the set of frames belonging to that process

• In other words, victim selection is local, not global

• The second challenge of virtual memory remains to be solved

• How to determine the allocation to give to each process


Aspects of LRU Page Replacement Implementation

• LRU provides a pretty good page replacement strategy, but a real implementation is more involved than tracing a diagram with 3 frames

• The book gives two general approaches to solving the problem

• 1. Use counters (timestamps) to record which page was least recently used

• 2. Use a data structure (like a stack) to organize pages in least recently used order


Counter/Time Stamp Based Implementation of LRU Page Replacement

• A. The CPU has a clock• B. Every page table entry will now have a time

stamp• C. Finding a victim means searching the page

table by time stamp• D. The time stamp is updated for every access

to the page.


• E. The time stamp entries will have to be saved in the page table
– Every time a page is accessed, even on a TLB hit, it will be necessary to write to the page table in memory
– Saving time stamp information in this way is necessary because the TLB is flushed (either immediately or gradually) when a new process is scheduled


• F. The implementation also has to take into account the possibility of time stamp rollover or clock overflow
– This is because the time stamp is likely to be a minutes/seconds subset of the overall system time
– In fact, a counter showing relative recentness of use among pages would be just as useful as an actual time stamp
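
• To make the counter/time stamp idea concrete, here is a minimal Python sketch; the dictionary stands in for the page table, the names are invented, and on real hardware points B through D would be handled by the memory management hardware rather than by code like this

    # Sketch of counter/time stamp based LRU (points A-F above).
    # A relative counter stands in for the CPU clock; as point F notes,
    # relative recentness is all that is actually needed.
    counter = 0
    page_table = {}   # page number -> "time stamp" of last access

    def access(page):
        """Point D: every access to a page updates its time stamp."""
        global counter
        counter += 1
        page_table[page] = counter

    def choose_victim():
        """Point C: finding a victim means searching by time stamp."""
        return min(page_table, key=page_table.get)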


Using a Stack to Implement LRU Page Replacement

• The basic idea is to use a modified stack to maintain a record of the pages in memory in access order

• In a classic stack, elements can only be removed from or added to the top

• For the purposes of LRU paging, it is desirable to be able to access any page in memory, regardless of where it is in the stack


• When a page is accessed, it can then be placed at the top of the stack, indicating that it was the most recently used

• Under this scheme, the least recently used page migrates to the bottom of the stack

• For LRU page replacement, the victim can always be selected from the bottom of the stack


• This modified stack can be implemented as a form of doubly-linked list

• An update would involve at most 6 pointer/reference operations

• It would take up to 4 operations to remove a node from the middle of the stack

• It would take 2 operations to attach it to the top of the stack
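
• The following Python sketch shows one way to realize the modified stack as a doubly-linked list with a dictionary for direct access to any node; the class and method names are invented for illustration. The pointer writes in touch() correspond to the handful of operations counted above

    class Node:
        """One stack node per resident page."""
        def __init__(self, page):
            self.page = page
            self.prev = None   # toward the top of the stack
            self.next = None   # toward the bottom of the stack

    class LRUStack:
        """Modified stack: any node can be unlinked, not just the top."""
        def __init__(self):
            self.top = None      # most recently used page
            self.bottom = None   # least recently used page; the victim
            self.nodes = {}      # page number -> node, for direct access

        def touch(self, page):
            """Record an access: move (or add) the page to the top."""
            node = self.nodes.get(page)
            if node is None:
                node = Node(page)
                self.nodes[page] = node
            elif node is self.top:
                return
            else:
                # Unlink from the middle or bottom of the stack
                node.prev.next = node.next
                if node.next:
                    node.next.prev = node.prev
                else:
                    self.bottom = node.prev
            # Attach at the top
            node.prev = None
            node.next = self.top
            if self.top:
                self.top.prev = node
            self.top = node
            if self.bottom is None:
                self.bottom = node

        def victim(self):
            """The least recently used page is always at the bottom."""
            return self.bottom.page if self.bottom else None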


• 6 pointer operations is not excessively expensive in terms of implementation complexity

• This would seem to be a reasonable approach if implementing in software or microcode

• The reality is that any software solution would be too slow

• Hardware support for page replacement algorithms will be taken up shortly


Concluding Remarks on Optimal and LRU Page Replacement

• Neither optimal nor LRU page replacement suffers from Belady’s anomaly

• They are both examples of what are called stack algorithms, which are free of this problem


• This is the defining characteristic of such algorithms:

• At any point in the execution of a given reference string:

• The pages that would be in memory for an allocation of n frames is always a subset of what would be in memory for an allocation of n + 1 frames
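
• Stated in symbols (the notation here is mine, not the book’s): if M(a, t) denotes the set of pages in memory after the first t references, given an allocation of a frames, then a stack algorithm is one for which M(n, t) ⊆ M(n + 1, t) for every t and every n ≥ 1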


• That this holds for LRU is relatively clear

• The n most recently used pages will always be a subset of the n + 1 most recently used pages

• And as shown, it’s easy to maintain this logical ordering using a stack

• Time will not be taken to prove that FIFO, for example, doesn’t work this way

• Belady’s anomaly was already illustrated using FIFO


Hardware Support for Page Replacement

• Consider the software solutions again

• A counter/timestamp based implementation would require a memory access to update the page table every time a page was accessed

• If a stack were used, it would also be memory resident

• Rearranging the nodes on the stack would require a memory access

• Adding one overhead memory access for each user memory access is expensive


• Also, if completely software driven, an implementation would include interrupt handling costs

• The author asserts that overall this would slow memory access by a factor of 10 (that is, by 1,000%, not 10%)

• This means that in order to be practical, whatever page replacement approach is chosen, it will have to have hardware support


LRU Approximation Page Replacement

• Most systems don’t provide hardware support for full LRU page replacement, either counter or stack based

• It is possible to get most of the benefits of LRU page replacement by providing hardware support for LRU approximation


A Reference Bit Algorithm

• This begins the discussion of a counter/clock/timestamp based approach to LRU approximation

• Consider this simple approach:

• 1. Let every page table entry have one additional bit, a reference bit

• 2. The reference bit is set by hardware (without a separate interrupt) every time the page is referenced (either read or written)


• 3. In theory, this would allow a choice between used and unused pages

• 4. In reality, under demand paging, all pages would be used

• 5. On the other hand, under pre-loading, you could in theory choose an unused page as a victim

• 6. Still, it’s clear that one reference bit alone doesn’t get you everything you might want


The Additional Reference Bits Algorithm

• Adding more bits to the page table entry makes it possible to enhance the algorithm to the point of usefulness

• 1. Let a current reference bit value be maintained in system software

• 2. That bit tells whether during the current time interval the page has been accessed (The question of an interval will be addressed shortly)

• 3. Let the page table keep 8 reference bits for each entry


• 4. At regular clock intervals (say every 100 ms.) an interrupt triggers writing the reference bit into the high order place of the 8 bit entry, shifting everything to the right by 1 bit

• 5. The result is that instead of a simple record of used/not used, you have used/not used for each of the last 8 time intervals


• 6. If you treat the 8 bits as a number, the higher the number, the more recently used the page

• 7. You choose as a victim the page with the smallest number


• 8. The scheme actually gives up to 2^8 = 256 groups of pages by frequency of usage

• You choose the victim from the group with the least frequent usage

• In case there is >1 page in the least frequent group, you can swap out all of the pages in that group or pick one victim using FIFO, for example
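
• A minimal Python sketch of the additional reference bits scheme follows; the names are invented, and on a real system step 2 would be done by hardware while the shift in step 4 would run in the timer interrupt handler

    HISTORY_BITS = 8   # one byte of history per page, as in point 3

    ref_bit = {}   # point 1: the current-interval reference bit
    history = {}   # points 3-5: usage history, high bit = newest

    def load(page):
        """Called when a page is brought into memory."""
        ref_bit[page] = 0
        history[page] = 0

    def access(page):
        """Point 2: hardware sets the bit on any read or write."""
        ref_bit[page] = 1

    def timer_interrupt():
        """Point 4: at each interval, shift the bit into the history."""
        for page in history:
            history[page] = (history[page] >> 1) | (ref_bit[page] << (HISTORY_BITS - 1))
            ref_bit[page] = 0

    def choose_victim():
        """Points 6-7: the smallest history value is the victim."""
        return min(history, key=history.get)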


The Second Chance Algorithm

• This begins the discussion of a stack/data structure based approach to LRU approximation

• The second chance algorithm is also known as the clock algorithm

• However, the second chance/clock algorithm shouldn’t be confused with counter algorithms which depend on the system clock


• The clock algorithm is not based on a stack

• It is called the clock algorithm because it is supported by a circularly linked list

• This list can be visualized as a clock


• 1. The clock algorithm is based on a single reference bit where 1 signifies that the page has been accessed

• 2. As a process acquires pages, let them be recorded in a circularly linked list

• 3. When the list is full (i.e., the process has as many frames as it will be allowed) and a fault occurs, you need to find a victim


• 4. Begin with the oldest page (i.e., the one after the one most recently replaced) and traverse the circular list

• 5. If you encounter a page with a 1 bit, don’t select it

• 6. Give it a second chance, but set its reference bit to 0 so that it won’t get a third chance

• 7. Select as a victim the first page you encounter with a reference bit of 0

• 8. The next time a fault occurs, pick up searching where you left off before (this is just a different way of saying the same thing as in point 4)
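
• The numbered steps above translate almost directly into code. The Python sketch below uses a list plus a hand index in place of a literal circularly linked list, and the names are invented; in a real system the reference bits would be set by hardware

    class Clock:
        """Second chance replacement over a fixed set of frames."""
        def __init__(self, pages):
            self.pages = list(pages)      # the circular list (point 2)
            self.ref = [1] * len(pages)   # reference bits, 1 = used
            self.hand = 0                 # where the last search ended

        def access(self, page):
            """Point 1: a reference sets the page's bit to 1."""
            self.ref[self.pages.index(page)] = 1

        def replace(self, new_page):
            """Points 4-8: find a victim and install new_page."""
            while True:
                if self.ref[self.hand] == 1:
                    self.ref[self.hand] = 0   # second chance (points 5-6)
                    self.hand = (self.hand + 1) % len(self.pages)
                else:
                    victim = self.pages[self.hand]   # point 7
                    self.pages[self.hand] = new_page
                    self.ref[self.hand] = 1
                    # Advance so the next search starts after this slot
                    self.hand = (self.hand + 1) % len(self.pages)
                    return victim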


• If all of the pages have been referenced since the last fault, all of the reference bits will be set to 1

• This means the algorithm will go all the way around the list before selecting a victim

• The victim will end up being the first page that was touched

• This is not a complete waste, though, because all of the 1 bits will have been cleared


• If all of the pages are consistently referenced before the next fault, the clock algorithm degenerates into FIFO page replacement

• Incidentally, this signifies that at least the pages that are sitting in memory are in use

• It may indicate that the process doesn’t have a large enough allocation

• The broader topic of a program’s behavior and the size of its allocation will be discussed later


• A diagram illustrating how the clock algorithm works will be given shortly

• The scenario is that there are 8 pages/frames allocated, and you’re looking for a victim among them

• This means it’s an “8 hour” clock, a circularly linked list with 8 nodes


• The clock has a single hand

• Really, it’s more of a spinner

• The hand, or spinner, is lettered A, B, C, …

• This just indicates the sequence of nodes you look at around the clock as the algorithm progresses


• You start the search by advancing the hand to the node following the one where the search ended during the last round

• The explanations of the steps in the algorithm are given in numbered boxes sitting next to the node in question

• Follow the diagrams and explanations from there

[Diagram (two overheads): the clock algorithm on an 8-node circularly linked list, with hand positions lettered A, B, C, …]

The Enhanced Second Chance Algorithm

• This algorithm makes use of the reference bit and an additional bit, the modify bit

• It prefers as victims those pages that don’t have to be written back because this saves time

• The two bits together make four classes of pages


• These are the four classes, given in descending order of preference for replacement:

• 1. (0, 0), not recently used, not modified

• 2. (0, 1), not recently used, modified

• 3. (1, 0), recently used, not modified

• 4. (1, 1), recently used, modified
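
• Read as a two-bit number, the (reference, modify) pair gives the ranking directly; a one-line, purely illustrative Python helper:

    def page_class(ref_bit, mod_bit):
        """Map the bit pair to classes 1-4 above; a lower class
        number marks a better victim."""
        return 1 + 2 * ref_bit + mod_bit   # (0,0)->1 ... (1,1)->4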


• The victim will be the first page encountered in the lowest non-empty class

• Identifying the victim may require at least one full trip around the linked list

• Since the algorithm changes bits from 1 to 0, it may require more than a single trip in order to identify a member of the lowest class represented in the data structure


• The trade-off is more trips around the data structure in exchange for fewer page writes

• A previous edition of the book said that this algorithm was implemented at some point in the Macintosh


Counting Based Page Replacement

• Do not confuse this with the counter/clock/timestamp ideas given above

• Two algorithms will be given as examples

• They are alternative approaches to finding a suitable victim

• It turns out that they are not easy to implement


• They also don’t approximate the performance of optimal page replacement very well

• They are infrequently used

• I guess they’re given here in the same spirit as the “worst fit” memory allocation scheme: food for thought


• 1. The victim is LFU = least frequently used
– The theory is that active pages should have a large number of accesses per unit time
– This at least seems plausible

• 2. The victim is MFU = most frequently used
– The theory is that a high count of uses should be a sign that the process is ready to move on to another address
– MFU seems loony
– It would depend on magic knowledge of how high the count would go on a certain page before the process was ready to move on
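
• Whatever their merits, both counting schemes reduce to keeping a use count per resident page; this illustrative Python fragment (names invented) shows LFU, and MFU would simply replace min with max

    use_count = {}   # page number -> accesses while resident

    def access(page):
        use_count[page] = use_count.get(page, 0) + 1

    def lfu_victim():
        """Least frequently used: the smallest count is the victim."""
        return min(use_count, key=use_count.get)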


The Page Buffering Algorithm

• This is an enhancement that may be added to any page replacement algorithm that might be chosen

• In general, this refers to the management of writing out the contents of victims when making frames available

• There are three basic approaches


• 1. Always keep a free frame pool
– The currently running process gets a free frame from the pool immediately
– This saves time for the currently running process when a victim is chosen for it
– The contents of the victim can be written out to secondary storage when cycles are available, and the frame is then added into the free frame pool


• 2. Maintain a system data structure providing access to modified pages
– Whenever cycles are available, write out copies of modified pages
– That means that if a certain page is chosen as a victim, the currently running process that is allocated that frame won’t have to wait for a write
– Note that this does in advance what the previous approach accomplishes after the fact


• 3. Keep a free frame pool, but don’t erase the records of which page each frame last held
– The idea is that a page that has been moved to the free frame pool still has a second chance
– Whenever a page fault occurs, always check first to see whether the desired page is in the free frame pool
– This was implemented in the VAX/VMS system
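
• A sketch of this third approach in Python, assuming a simple list of (frame, former page) pairs and a hypothetical read_from_disk() helper; a hit in the pool avoids the disk read entirely

    free_pool = []   # (frame number, page the frame last held)

    def handle_fault(page):
        """Check the pool first; reclaim the frame on a hit."""
        for i, (frame, old_page) in enumerate(free_pool):
            if old_page == page:
                free_pool.pop(i)      # second chance: no I/O needed
                return frame
        frame, _ = free_pool.pop(0)   # otherwise take any free frame
        read_from_disk(page, frame)   # hypothetical I/O helper
        return frame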


Applications and Page Replacement

• Most application programs are completely reliant on the O/S for paging

• The application programs may have no special paging needs that can’t be met by the O/S

• The O/S may not support applications that attempt to manage their own paging

• Even if applications have special needs and the O/S would support separate management, the application programmers don’t want to load down the application with that functionality


• On the other hand, some applications, like database management systems, have their own internal logic for data access

• Such an application is better suited than the underlying operating system to determine an advantageous paging algorithm


• Some systems will allow an application to use a disk partition without O/S support

• This is just an array of addresses in secondary storage known as a raw disk

• In some systems, applications like database management systems are integrated with the O/S

• The AS/400 is an example


9.5 Allocation of Frames

• Recall the two problems identified at the beginning of the previous section:

• Page replacement and frame allocation

• We have dealt with page replacement

• Now it’s time for frame allocation

• The simplest possible case would be a system that allowed one user process at a time


• Under this scenario you could implement demand paging

• The process could acquire frames until memory was entirely consumed

• All of the frames would be allocated to that process

• If a fault then occurred, some sort of page replacement could be used


Multiprogramming and the Minimum Number of Frames

• As soon as you contemplate multiprogramming, you can have >1 process

• In that situation, it becomes undesirable for a single process to acquire all of memory

• If the memory allocated to a process is limited, what is the minimum allowable?

• There is both a practical minimum and an absolute minimum


• If a process has too little memory, its performance will suffer.

• This is the practical minimum

• There is also an absolute theoretical minimum needed in order for paging to work

• This depends on the nature of the machine instruction set and the fetch, decode, execute cycle


• Recall the execution cycle:
– Fetch
– Decode
– Fetch operands
– Do the operation
– Write the results


• Consider this scenario:

• 1. A process is only allocated one frame

• 2. Fetch an instruction from one page

• 3. Decode

• 4. Try to fetch operands and discover they’re on another page

• 5. Page fault to get the operand page, which evicts the instruction page from the single frame

• 6. Restart the halted instruction, faulting on the instruction page again

• 7. Cycle endlessly


• The previous scenario would require a minimum of two pages

• For an instruction set with an instruction that had multiple operands or had operands that could span an address space of >1 page, more than two pages could be required

• Let n be the maximum number of page references possible in a single fetch, decode, execute cycle

• n is the minimum number of frames that can be allocated to a process


• System support for indirection can affect the minimum number of frames

• Indirection refers to the idea that an operand can be a special value which has this meaning:

• Do not use the value found—let the value serve as an address, and use the value found at that address


• If the value found at the address is another of the indirection type, multiple levels of indirection result

• The instruction takes this form:

• act on(the value found at(the value found at(…)))


• Every system architecture has to set a fixed limit on the number of levels of indirection that are allowed

• If n levels of indirection are allowed, then a process needs at least n + 1 frames, 1 for the instruction, n for the levels of indirection

• If a program did use all n levels, which is unknown until run time, then without n + 1 frames it would page cyclically, as described earlier


Allocation Algorithms

• Between the minimum necessary and the maximum allowed (possibly all), there is a wide range of number of frames that could be allocated to a process

• There is also a choice of how to decide how many to allow

• Some suggestions follow


• 1. Equal allocation

• This is the simplest approach

• Divide the number of available frames by the number of processes

• Limit the number of processes so that none fall below the minimum necessary allocation


• 2. Proportional allocation

• Divide the frame pool in proportion to the sizes of the processes (where size is measured in the number of frames in the swap image)

• Again, limit the number of processes so that they all have more than the minimum allocation


• 3. Proportional allocation based on priority

• Devise a formula that gives higher priority jobs more frames

• The formula may include job size as an additional factor

• ***Note that in reality, frame allocation will go beyond these three simplistic suggestions
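
• The first two suggestions reduce to a few lines of code. Below is a minimal Python sketch (function names invented here, not from the book); distributing rounding leftovers and enforcing the minimum allocation are left out

    def equal_allocation(num_processes, m):
        """Suggestion 1: split m frames evenly (remainder ignored)."""
        return [m // num_processes] * num_processes

    def proportional_allocation(sizes, m):
        """Suggestion 2: split m frames in proportion to process size,
        where each size is the process's swap image size in frames."""
        S = sum(sizes)
        return [m * s // S for s in sizes]

• For instance, with two processes of sizes 10 and 127 frames and m = 62 free frames, proportional_allocation([10, 127], 62) returns [4, 57]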


Global Versus Local Allocation

• The question is, when a page fault occurs, do you choose the victim among the pages belonging to the faulting process, or do you choose a victim among all pages?

• In other words, is a process allowed to “steal” pages from other processes—meaning that the frame allocations to processes can vary over time


• Global replacement opens up the possibility of better use of memory and better throughput

• On the other hand, local replacement makes individual processes more autonomous

• With a fixed frame allocation, their paging behavior alone determines their performance


• If processes could have their pages stolen by other processes, their performance would change due to an outside factor

• This is just a preview of this topic

• More will come later


The End

