July 30, 2001 Systems Architecture II 1
Systems Architecture II (CS 282-001)
Lecture 8: Exploiting Memory Hierarchy: Virtual Memory*
Jeremy R. Johnson
Monday July 30, 2001
*This lecture was derived from material in the text (Chap. 7). All figures from Computer Organization and Design: The Hardware/Software Approach, Second Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
Introduction

• Objective: To use main memory as a cache for accessing secondary storage (typically magnetic disk)
• Additional Motivation:
  – To allow multiple programs to efficiently and safely share main memory
  – To provide the programmer with the illusion of an unbounded memory
• Topics
  – Virtual memory
  – Mapping virtual addresses to physical addresses
  – Hardware support for address translation (TLB)
  – Page faults
  – Implementing protection with virtual memory
  – Common framework for memory hierarchies
Virtual Memory
• In virtual memory, blocks of memory (called pages) are mapped from a set of addresses (called virtual addresses) to another set (called physical addresses)
• A physical page resides in memory, while a virtual page may reside on disk
• Sharing is accomplished by pointing two virtual addresses to the same physical address
[Figure: virtual addresses are mapped by address translation either to physical addresses in main memory or to disk addresses]
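The mapping can be pictured as a small table. A minimal Python sketch (all page numbers here are made up for illustration) shows how sharing falls out of pointing two virtual pages at one physical page:

```python
# Toy illustration (hypothetical numbers): each virtual page maps either to
# a physical page in memory or to a location on disk. Sharing is achieved
# by pointing two virtual pages at the same physical page.
page_map = {
    0x00001: ("memory", 0x2A),   # virtual page 1 -> physical page 0x2A
    0x00002: ("disk",   0x913),  # virtual page 2 currently resides on disk
    0x00003: ("memory", 0x2A),   # shares physical page 0x2A with page 1
}

def locate(vpn):
    """Return where a virtual page currently lives."""
    kind, addr = page_map[vpn]
    return kind, addr

# Two different virtual pages resolve to the same physical page:
assert locate(0x00001) == locate(0x00003)
```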
Mapping from a Virtual to Physical Address
• In virtual memory, an address is broken into a virtual page number and a page offset
• The virtual page number is mapped to a physical page number
• The number of addressable virtual pages may not be the same as the number of physical pages
[Figure: a 32-bit virtual address (bits 31-0) is split into a virtual page number (bits 31-12) and a page offset (bits 11-0); translation maps the virtual page number to a physical page number (bits 29-12), which is concatenated with the unchanged page offset to form the physical address]
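The split and recombination can be sketched in a few lines of Python, assuming 4 KB pages (a 12-bit offset, as in the figure):

```python
PAGE_OFFSET_BITS = 12            # 4 KB pages
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def split(vaddr):
    """Break a virtual address into (virtual page number, page offset)."""
    return vaddr >> PAGE_OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

def combine(ppn, offset):
    """Form a physical address from a physical page number and the offset."""
    return (ppn << PAGE_OFFSET_BITS) | offset

vpn, off = split(0x0001_2ABC)
assert (vpn, off) == (0x12, 0xABC)
# Translation replaces the page number; the offset passes through unchanged:
assert combine(0x7F, off) == 0x0007_FABC
```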
Design Decisions
• Many design decisions for virtual memory are motivated by the high cost of a miss (called a page fault)
  – Processing a page fault takes millions of cycles!
1. Pages should be large enough to amortize the long access time (exploit the principle of locality)
  – 4 KB - 64 KB page sizes
2. Reducing the rate of page faults is crucial
  – Use a fully associative placement scheme
3. Page faults can be handled in software, so clever algorithms can be used to choose replacement pages, reducing the number of misses
4. Write-through is too costly for managing writes in virtual memory; a write-back strategy is used instead
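A quick back-of-envelope calculation shows why larger pages amortize the access time; the cycle counts below are invented solely for illustration, not taken from the text:

```python
# Back-of-envelope (illustrative numbers): a disk access has a large fixed
# cost, so larger pages spread that cost over more bytes transferred.
seek_cycles = 10_000_000        # assumed fixed cost per disk access
transfer_cycles_per_byte = 2    # assumed streaming transfer cost

def cycles_per_byte(page_size):
    return (seek_cycles + transfer_cycles_per_byte * page_size) / page_size

# Quadrupling the page size roughly quarters the fixed cost per byte moved:
assert cycles_per_byte(16 * 1024) < cycles_per_byte(4 * 1024) / 3
```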
Locating a Physical Page
• Use a fully associative scheme to reduce page faults and to allow more flexible replacement policies
• Since a full search is too costly, a table lookup is used to find the physical location of a page
• The lookup table is called a page table
• The page table is indexed by the virtual page number
• Each program has its own page table
• The size of the page table is determined by the number of bits in the virtual address
• Since page tables are large and there can be many of them, the entire table is not kept in memory (use dynamic tables, hash functions, multiple levels, and virtual memory itself)
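A minimal page-table lookup can be sketched as follows (the table contents are hypothetical; a real entry is packed into hardware-defined bits, not a Python tuple):

```python
# Minimal page-table sketch: indexed by virtual page number, each entry
# holding a valid bit and a physical page number.
PAGE_OFFSET_BITS = 12

page_table = {                       # entry: (valid, physical page number)
    0x12: (True, 0x7F),
    0x13: (False, None),             # valid bit off: page not in memory
}

def translate(vaddr):
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, ppn = page_table.get(vpn, (False, None))
    if not valid:
        raise RuntimeError("page fault")   # the OS takes over from here
    return (ppn << PAGE_OFFSET_BITS) | offset

assert translate(0x12ABC) == 0x7FABC
```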
Page Table
[Figure: the page table register points to the current page table; the 20-bit virtual page number indexes the table, and each entry holds a valid bit and an 18-bit physical page number (if the valid bit is 0, the page is not present in memory); the physical page number is concatenated with the 12-bit page offset to form the physical address]
Page Faults
• If the valid bit for a virtual page is off, a page fault occurs
• Control is transferred to the OS through an exception
• The OS must get the page from disk (disk locations can be stored in the page table or an auxiliary data structure)
• If there is no room for the new page, a page currently residing in memory must be replaced
• To minimize page faults, the least recently used page (or some approximation to it) is replaced
• Implementing true LRU is too costly (a data structure would have to be updated on every memory access). Instead, approximate it with a reference bit, set by hardware whenever the page is accessed and periodically recorded and cleared by the OS to determine which pages were used during a given time period
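The reference-bit approximation can be sketched as follows (the structures and names are illustrative stand-ins for what hardware and the OS each maintain):

```python
# Sketch of the reference-bit approximation: hardware sets the bit on every
# access; the OS periodically records and clears the bits, then evicts a
# page that was not referenced during the last period.
ref_bit = {}                 # page -> referenced this period? (hardware sets)
history = {}                 # page -> recorded use from the last sample

def access(page):            # what hardware does on every reference
    ref_bit[page] = True

def os_sample():             # periodic OS timer interrupt
    for page, used in ref_bit.items():
        history[page] = used
        ref_bit[page] = False

def pick_victim():           # prefer a page not used in the last period
    for page, used in history.items():
        if not used:
            return page
    return next(iter(history))   # all used recently: fall back to any page

for p in ("A", "B", "C"):
    access(p)
os_sample()
access("A"); access("C")     # B goes untouched in the new period
os_sample()
assert pick_victim() == "B"
```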
Write Strategy and Page Replacement
• The difference in access time between cache and main memory is tens of cycles, so a write-through strategy can be used there (with a write buffer to hide the latency of the write)
• Since writes to disk take millions of cycles, this approach is impractical for virtual memory
• Instead, use a write-back policy
  – perform individual writes to the page in memory and copy the page back to disk when it is replaced
  – copying an entire page is more efficient than the sum of the individual writes to disk
• A write-back operation, though more efficient, is still costly
  – only write back when the contents of the page have changed
  – determined by the setting of the dirty bit
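The dirty-bit mechanism can be sketched in a few lines (the structures are hypothetical; real dirty bits live in page-table and TLB entries):

```python
# Write-back sketch: writes go to the in-memory copy and set the dirty bit;
# the disk copy is updated only on eviction, and only if the page changed.
disk = {}                    # page -> contents on disk
memory = {}                  # page -> (contents, dirty bit)

def load(page):
    memory[page] = (disk.get(page, b""), False)

def write(page, data):
    memory[page] = (data, True)          # cheap: no disk traffic here

def evict(page):
    data, dirty = memory.pop(page)
    if dirty:                            # one bulk copy, not many small writes
        disk[page] = data

disk[0] = b"old"
load(0); load(1)
write(0, b"new")
evict(0); evict(1)
assert disk[0] == b"new" and 1 not in disk   # clean page never written back
```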
Making Address Translation Fast
• With virtual memory, accessing memory requires two memory references (one to determine physical address and one to access the contents of the desired location)
• The key to improving access performance relies on the principle of locality
– it is likely that once a page is accessed it will be accessed again in the near future
– only need to perform address translation once, saving the result in a buffer (a special cache) called the translation-lookaside buffer (TLB)
• Since the TLB may replace access to the page table, reference and dirty bits must be kept in the TLB entries. Write-back is used for them, since the miss rate should be very small
• A TLB miss does not necessarily imply a page fault: the translation may simply be absent from the TLB while the page is still in memory
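The TLB's role as a small translation cache can be sketched as follows (the structures and capacity are illustrative; real TLBs are hardware-associative):

```python
# TLB sketch: a small cache of recent translations consulted before the
# (slower) page table. Note a TLB miss here is not a page fault: the page
# is still in memory, only the translation is missing.
page_table = {0x12: 0x7F, 0x13: 0x40}    # vpn -> ppn, both pages resident
tlb = {}
TLB_CAPACITY = 2

def translate(vpn):
    if vpn in tlb:                       # TLB hit: no page-table access
        return tlb[vpn]
    ppn = page_table[vpn]                # TLB miss handled from the table
    if len(tlb) >= TLB_CAPACITY:
        tlb.pop(next(iter(tlb)))         # evict oldest entry (crude policy)
    tlb[vpn] = ppn
    return ppn

assert translate(0x12) == 0x7F           # miss: fills the TLB
assert 0x12 in tlb and translate(0x12) == 0x7F   # now a hit
```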
Translation-Lookaside Buffer

• Typical TLB parameters:
  – Size: 32 - 4096 entries
  – Block size: 1 - 2 page table entries
  – Hit time: 0.5 - 1 cycles
  – Miss penalty: 10 - 30 cycles
  – Miss rate: 0.01% - 1%
[Figure: each TLB entry holds a valid bit, a tag (the virtual page number), and a physical page address; the page table holds a valid bit and a physical page or disk address for every virtual page, pointing into physical memory or disk storage]
An Example

• DECStation 3100
  – 32-bit addresses
  – 4 KB pages
  – 20-bit virtual page number
  – TLB contains 64 entries
  – TLB is fully associative
  – Exception on TLB miss
  – Average TLB miss takes 16 cycles
  – Special instructions update the TLB
  – Replacement of a TLB entry is determined by hardware
[Figure: DECStation 3100 TLB and cache datapath: the 32-bit virtual address splits into a 20-bit virtual page number and a 12-bit page offset; each TLB entry holds valid and dirty bits, a tag, and a physical page number; on a TLB hit the 20-bit physical page number is concatenated with the page offset to form the physical address, which splits into a 16-bit physical address tag, a 14-bit cache index, and a 2-bit byte offset; the direct-mapped cache compares tags (valid, tag, data) and delivers 32-bit data on a cache hit]
Processing a Read/Write in the DECStation 3100 TLB and Cache
[Flowchart: the virtual address first accesses the TLB; a TLB miss raises a TLB miss exception, while a TLB hit yields the physical address. For a read, try to read the data from the cache: a hit delivers the data to the CPU, a miss causes a cache miss stall. For a write, check the write access bit: if it is off, raise a write protection exception; otherwise write the data into the cache, update the tag, and put the data and the address into the write buffer]
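The same decision sequence can be sketched as code; every structure below is a hypothetical stand-in, not the real DECStation hardware, and exceptions are modeled as return values for brevity:

```python
# The flowchart's branches, one per line of control flow.
tlb = {0x12: {"ppn": 0x7F, "write_access": False}}
cache = {0x7FABC: 42}
write_buffer = []

def process_access(vaddr, is_write, data=None):
    vpn, offset = vaddr >> 12, vaddr & 0xFFF
    entry = tlb.get(vpn)
    if entry is None:
        return "TLB miss exception"
    paddr = (entry["ppn"] << 12) | offset    # TLB hit: physical address
    if not is_write:
        if paddr not in cache:
            return "cache miss stall"
        return cache[paddr]                  # deliver data to the CPU
    if not entry["write_access"]:
        return "write protection exception"
    cache[paddr] = data                      # write into cache, update tag,
    write_buffer.append((paddr, data))       # queue data + address for memory
    return "write done"

assert process_access(0x12ABC, is_write=False) == 42
assert process_access(0x99000, is_write=False) == "TLB miss exception"
```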
Interaction with OS

• Maintain the memory hierarchy (data cannot be in the cache unless it is in memory)
  – flush cache entries belonging to a page being replaced to disk
  – update the page tables and TLB so that an attempt to access data on the replaced page will generate a page fault
• Protection (prevent one program from writing to another program's portion of memory)
  – Each program has its own virtual address space; organize page tables to map these to distinct sets of physical pages
  – Make sure only the OS can modify page tables (done by putting page tables in the address space of the OS)
  – Two modes of execution (user and supervisor), with special instructions to modify the TLB, page table register, and user/supervisor mode bit
  – On a context switch, flush the TLB entries (or use a process ID to distinguish them)
• Sharing (allow programs to share memory, e.g. editor code)
  – Have the OS point virtual pages in both address spaces to the same shared physical page
  – Use the write access bit to restrict sharing to reads
Handling Page Faults

• Handling a page fault requires using the exception mechanism to interrupt the active process, transfer control to the OS, and later resume execution of the interrupted process
  – The exception must be asserted before the instruction completes (so that the state remains as it was before the instruction); otherwise the instruction cannot be properly restarted (making instructions restartable is difficult)
  – The instruction address is placed in the EPC and the cause of the exception in the cause register
  – The faulting virtual address is determined from the EPC or from the instruction at the EPC (depending on whether an instruction or a data access caused the fault)
  – Steps:
    • Save the entire state of the process (including all registers)
    • Look up the page table entry and find the location of the referenced page on disk
    • Choose a replacement page (if it is dirty, it must be written back first)
    • Start a read to bring the referenced page from disk into memory
  – Since the last step takes millions of cycles, the OS usually starts another process while waiting for the read to complete
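The steps above can be sketched as follows; the structures, the disk-location table, and the log are all hypothetical stand-ins for OS bookkeeping:

```python
# Page-fault handler sketch: save state, find the page on disk, pick and
# (if dirty) write back a victim, then start the long disk read.
import collections

disk_location = {0x13: "disk block 913"}         # from page table / aux table
memory_pages = collections.OrderedDict(          # resident pages, LRU first
    [(0x10, {"dirty": True}), (0x11, {"dirty": False})])
log = []

def handle_page_fault(process_state, faulting_vpn):
    saved = dict(process_state)                  # 1. save the entire state
    where = disk_location[faulting_vpn]          # 2. find the page on disk
    victim_vpn, victim = memory_pages.popitem(last=False)  # 3. LRU victim
    if victim["dirty"]:
        log.append(f"write back vpn {victim_vpn:#x}")
    log.append(f"start read of {where}")         # 4. millions of cycles...
    return saved                                 # OS runs another process

state = handle_page_fault({"pc": 0x400}, 0x13)
assert state == {"pc": 0x400}
assert log == ["write back vpn 0x10", "start read of disk block 913"]
```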
Summary

• Virtual memory (main memory acts as a cache for secondary storage)
• Supports a large address space and allows sharing
• Misses (page faults) are very costly
  – large blocks (pages)
  – fully associative placement using a page table
  – LRU (or an approximation) used as the replacement policy
• Write-back strategy used
• TLB used to reduce the cost of the extra address translation
• Virtual-to-physical mapping provides a mechanism for protection and simplifies memory allocation
Common Framework for Memory Hierarchies
• Question 1: Where can a block be placed?
  – One place (direct mapped), a few places (set associative), or any place (fully associative)
• Question 2: How is a block found?
  – Four methods: indexing, limited search, full search, or table lookup
• Question 3: Which block should be replaced on a cache miss?
  – Typically the least recently used or a random block
• Question 4: What happens on writes?
  – Write-through or write-back
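Questions 1 and 2 are two sides of the same choice: placement determines how many slots must be searched. A small Python sketch (the cache size is an arbitrary example) makes the connection concrete:

```python
# Placement vs. search: the same block address maps to one slot, a few
# slots, or any slot, depending on associativity.
NUM_BLOCKS = 8

def candidate_slots(block_addr, associativity):
    """Slots that must be searched for a block under each placement."""
    num_sets = NUM_BLOCKS // associativity
    s = block_addr % num_sets                  # indexing step
    return [s * associativity + way for way in range(associativity)]

assert candidate_slots(12, 1) == [4]               # direct mapped: indexing
assert candidate_slots(12, 2) == [0, 1]            # 2-way: limited search
assert candidate_slots(12, 8) == list(range(8))    # fully assoc: full search
```

Virtual memory is the extreme point of this spectrum: fully associative placement, with the "full search" replaced by a page-table lookup.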
Model for Cache Misses
• Compulsory misses (cold-start)
  – Misses caused by the first access to a block that has never been in the cache
• Capacity misses
  – Misses caused when the cache cannot contain all the blocks needed during execution of a program
• Conflict misses (collision)
  – Misses that occur in a set-associative or direct-mapped cache when multiple blocks compete for the same set. These misses would be eliminated by a fully associative cache of the same size
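A common way to separate these categories, sketched below, is to replay the reference trace against a fully associative LRU model of the same size: first-touch misses are compulsory, misses the model also takes are capacity misses, and conflict misses are whatever extra misses the real (direct-mapped or set-associative) cache adds on top of this count:

```python
# Classify each reference against a fully associative LRU cache model.
from collections import OrderedDict

def classify_misses(trace, num_blocks):
    seen, full_assoc = set(), OrderedDict()      # fully associative LRU model
    kinds = []
    for block in trace:
        if block in full_assoc:
            full_assoc.move_to_end(block)        # refresh LRU order
            kinds.append("hit")
            continue
        kinds.append("compulsory" if block not in seen else "capacity")
        seen.add(block)
        full_assoc[block] = True
        if len(full_assoc) > num_blocks:
            full_assoc.popitem(last=False)       # evict LRU block
    return kinds

# With 2 blocks, re-touching A after B and C have pushed it out is capacity:
assert classify_misses(["A", "B", "C", "A"], 2) == \
    ["compulsory", "compulsory", "compulsory", "capacity"]
```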