COMP 206: Computer Architecture and Implementation
Montek Singh
Mon., Nov. 17, 2003
Topic: Virtual Memory
Transcript
Page 1:

COMP 206: Computer Architecture and Implementation
Montek Singh
Mon., Nov. 17, 2003
Topic: Virtual Memory

Page 2: Outline

Introduction
Address Translation
VM Organization
Examples

Reading: HP3 Section 5.10
For background: refer to PH (Comp. Org.)

Page 3: Characteristics

Characteristic                    Cache-MM                   MM-disk
Access time ratio ("speed gap")   1:5 - 1:15                 1:10,000 - 1:1,000,000
Hit time                          1-2 cycles                 40-100 cycles
Hit ratio                         0.90-0.99                  0.99999-0.9999999
Miss (page fault) ratio           0.01-0.10                  0.00000001-0.000001
Miss penalty                      10-100 cycles              1M-6M cycles
CPU during block transfer         blocking/non-blocking      task switching
Block (page) size                 16-128 bytes               4KB - 64KB
Implemented in                    hardware                   hardware + software
Mapping                           direct or set-associative  page table ("fully associative")
Replacement algorithm             not crucial                very important (LRU)
Write policy                      many choices               write back
Direct access to slow memory      yes                        no

Page 4: Addressing

Always a congruence mapping. Assume:
- 4GB VM composed of 2^20 4KB pages
- 64MB DRAM main memory composed of 16384 page frames (of the same size)
- Only those pages (of the 2^20) that are not empty actually exist
  - Each is either in main memory or on disk
  - Can be located with two mappings (implemented with tables)

Virtual address = (virtual page number, page offset)
  VA = (VPN, offset); 32 bits = (20 bits + 12 bits)
Physical address = (real page number, page offset)
  PA = (RPN, offset); 26 bits = (14 bits + 12 bits)
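The address split above is just bit slicing. A minimal sketch, using the slide's parameters (20-bit VPN, 14-bit RPN, 12-bit offset); the function names are illustrative, not from the slides:

```python
# 32-bit VA = (20-bit VPN, 12-bit offset); 26-bit PA = (14-bit RPN, 12-bit offset)
PAGE_BITS = 12           # 4KB pages
PAGE_SIZE = 1 << PAGE_BITS

def split_va(va):
    """Split a 32-bit virtual address into (VPN, offset)."""
    return va >> PAGE_BITS, va & (PAGE_SIZE - 1)

def make_pa(rpn, offset):
    """Build a 26-bit physical address from (RPN, offset)."""
    return (rpn << PAGE_BITS) | offset

vpn, offset = split_va(0x12345678)   # -> VPN 0x12345, offset 0x678
```

The offset passes through translation unchanged; only the upper bits (VPN to RPN) are remapped.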

Page 5: Address Translation

RPN = f_MM(VPN)
In reality, VPN is mapped to a page table entry (PTE), which contains the RPN ...
... as well as miscellaneous control information (e.g., valid bit, dirty bit, replacement information, access control)

VA -> PA: (VPN, offset within page) -> (RPN, offset within page)
VA -> disk address

Page 6: Single-Level, Direct Page Table in MM

Fully associative mapping: when a VM page is brought in from disk to MM, it may go into any of the real page frames

Simplest addressing scheme: one-level, direct page table
- (page table base address + VPN) -> PTE or page fault
- Assume that PTE size is 4 bytes
- Then the whole table requires 4 * 2^20 bytes = 4MB of main memory

Disadvantage: 4MB of main memory must be reserved for page tables, even when the VM space is almost empty
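The one-level scheme can be sketched as a flat array indexed by VPN. This is a simplified model, assuming a PTE holds just the RPN (a real PTE also carries valid/dirty/protection bits), with None standing in for an invalid entry:

```python
# One-level direct page table: one PTE per virtual page, indexed by VPN.
PAGE_BITS = 12
NUM_PAGES = 1 << 20            # 2^20 virtual pages

page_table = [None] * NUM_PAGES   # the full table exists even if mostly empty

def translate(va):
    """Return the physical address, or raise on a page fault."""
    vpn, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    pte = page_table[vpn]          # one MM access in hardware
    if pte is None:
        raise LookupError("page fault: VPN 0x%x" % vpn)
    return (pte << PAGE_BITS) | offset   # PTE holds the RPN

page_table[0x12345] = 0x00AB      # map one page for the demo
# translate(0x12345678) -> (0xAB << 12) | 0x678 = 0xAB678
```

The slide's disadvantage is visible here: the list has 2^20 slots (4MB at 4 bytes per PTE) no matter how few pages are actually mapped.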

Page 7: Single-Level, Direct Page Table in VM

To avoid tying down 4MB of physical memory:
- Put the page tables in VM
- Bring into MM only those that are actually needed
- "Paging the page tables"

Needs only 1K PTEs in main memory, rather than 4MB

Slows down access to VM pages by possibly needing disk accesses for the PTEs

Page 8: Multi-Level, Direct Page Table in MM

Another solution to the storage problem: break the 20-bit VPN into two 10-bit parts
- VPN = (VPN1, VPN2)

This turns the original one-level page table into a tree structure
- (1st-level base address + VPN1) -> 2nd-level base address
- (2nd-level base address + VPN2) -> PTE or page fault

Storage situation much improved
- Always need the root node (1K 4-byte entries = 1 VM page)
- Need only a few of the second-level nodes
  - Allocated on demand
  - Can be anywhere in main memory

Access time to PTE has doubled
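The two-level walk can be sketched with second-level nodes allocated on demand, as the slide describes. The dict-of-lists representation is an illustrative assumption (hardware would use base addresses in MM):

```python
# Two-level page table: VPN = (VPN1, VPN2), each 10 bits.
PAGE_BITS, LEVEL_BITS = 12, 10

root = {}   # VPN1 -> second-level node (list of 1K PTEs), allocated on demand

def map_page(vpn, rpn):
    vpn1, vpn2 = vpn >> LEVEL_BITS, vpn & ((1 << LEVEL_BITS) - 1)
    node = root.setdefault(vpn1, [None] * (1 << LEVEL_BITS))
    node[vpn2] = rpn

def translate(va):
    vpn, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    vpn1, vpn2 = vpn >> LEVEL_BITS, vpn & ((1 << LEVEL_BITS) - 1)
    node = root.get(vpn1)               # first MM access
    rpn = node[vpn2] if node else None  # second MM access
    if rpn is None:
        raise LookupError("page fault")
    return (rpn << PAGE_BITS) | offset

map_page(0x12345, 0x00AB)
# Only one second-level node exists, instead of a full 2^20-entry table.
```

The doubled access time is also visible: a translation now touches two table nodes instead of one.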

Page 9: Inverted Page Tables

Virtual address spaces may be vastly larger (and more sparsely populated) than real address spaces
- Less-than-full utilization of tree nodes in the multi-level direct page table becomes more significant

The ideal (i.e., smallest possible) page table would have one entry for every VM page actually in main memory
- Need 4 * 16K = 64KB of main memory to store this ideal page table
- Storage overhead = 0.1%

Inverted page table implementations are approximations to this ideal page table
- Associative inverted page table in special hardware (ATLAS)
- Hashed inverted page table in MM (IBM, HP PA-RISC)
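A hashed inverted page table can be sketched as a hash table with one entry per resident frame, found by hashing the VPN. The bucket-chaining layout below is an illustrative assumption, not any specific IBM or PA-RISC format:

```python
# Hashed inverted page table: sized by physical frames, not virtual pages.
NUM_FRAMES = 16384              # 64MB / 4KB page frames
NUM_BUCKETS = NUM_FRAMES

# buckets[h] -> list of (vpn, rpn) pairs whose VPN hashes to bucket h
buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(vpn, rpn):
    """Record that virtual page vpn is resident in frame rpn."""
    buckets[hash(vpn) % NUM_BUCKETS].append((vpn, rpn))

def lookup(vpn):
    """Return the RPN, or None if the page is not resident (page fault)."""
    for v, r in buckets[hash(vpn) % NUM_BUCKETS]:
        if v == vpn:
            return r
    return None

insert(0x12345, 0x00AB)
```

Total table size scales with the 16K frames (the slide's 64KB), regardless of how huge or sparse the virtual address space is.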

Page 10: Translation Lookaside Buffer (TLB)

To avoid two or more MM accesses for each VM access, use a small cache to store (VPN, PTE) pairs
- PTE contains the RPN, from which the RA can be constructed
- This cache is the TLB, and it exploits locality
  - DEC Alpha: 32 entries, fully associative
  - Amdahl V/8: 512 entries, 2-way set-associative

Processor issues VA
- TLB hit: send RA to main memory
- TLB miss:
  - Make two or more MM accesses to the page tables to retrieve the RA
  - Send RA to MM
  - (Any of these may cause a page fault)
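The hit/miss flow above can be sketched with the TLB as a small map in front of the page table. FIFO eviction is an assumption for simplicity; real TLBs use LRU-like hardware replacement:

```python
from collections import OrderedDict

TLB_ENTRIES = 32                 # e.g., the DEC Alpha size from the slide
tlb = OrderedDict()              # VPN -> RPN, modeling a fully associative TLB
page_table = {0x12345: 0x00AB}   # stand-in for the page table in MM

def translate(vpn):
    if vpn in tlb:                    # TLB hit: no MM access for translation
        return tlb[vpn]
    rpn = page_table.get(vpn)         # TLB miss: walk the page table in MM
    if rpn is None:
        raise LookupError("page fault")
    if len(tlb) >= TLB_ENTRIES:
        tlb.popitem(last=False)       # evict the oldest entry (FIFO stand-in)
    tlb[vpn] = rpn                    # refill the TLB
    return rpn
```

Locality is what makes this pay off: after the first miss on a page, subsequent references to the same page hit in the TLB and skip the page-table accesses entirely.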

Page 11: TLB Misses

Causes of a TLB miss:
- VM page is not in main memory
- VM page is in main memory, but its entry has not yet been entered into the TLB
- VM page is in main memory, but its TLB entry has been removed for some reason (removed as LRU, invalidated because the page table was updated, etc.)

Miss rates are remarkably low (~0.1%)
- Miss rate depends on the size of the TLB and on the VM page size (coverage)

Miss penalty varies from a single cache access to several page faults

Page 12: Dirty Bits and TLB: Two Solutions

Solution 1: TLB is a read-only cache
- Dirty bit is contained only in the page table in MM
- TLB contains only a write-access bit
  - Initially set to zero (denying writing of the page)
- On the first attempt to write the VM page:
  - An exception is caused
  - The handler sets the dirty bit in the page table in MM
  - ... and sets the write-access bit to 1 in the TLB

Solution 2: TLB is a read-write cache
- Dirty bit present in both the TLB and the page table in MM
- On the first write to the VM page, only the dirty bit in the TLB is set
- Dirty bit in the page table is brought up to date:
  - when the TLB entry is evicted
  - when the VM page and PTE are evicted
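Solution 1 can be sketched as follows; the entry layout and field names are made up for illustration:

```python
# Read-only TLB: the dirty bit lives only in the MM page table; the TLB
# carries a write-access bit that starts at 0 (writes denied).
page_table = {0x42: {"rpn": 0x7, "dirty": False}}
tlb = {0x42: {"rpn": 0x7, "write_ok": False}}

def write(vpn):
    entry = tlb[vpn]
    if not entry["write_ok"]:            # first write: exception is raised
        page_table[vpn]["dirty"] = True  # handler sets dirty bit in MM table
        entry["write_ok"] = True         # ...and enables writes in the TLB
    # subsequent writes to this page proceed with no exception

write(0x42)
```

The cost is one exception per page per residency; in exchange, the TLB never holds state that the MM page table lacks, so a TLB entry can be discarded without any writeback.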

Page 13: Virtual Memory Access Time

Assume the existence of a TLB, a physical cache, MM, and disk

Processor issues VA
- TLB hit: send RA to cache
- TLB miss: exception; access page tables, update TLB, retry

A memory reference may involve accesses to:
- TLB
- Page table in MM
- Cache
- Page in MM

Each of these can be a hit or a miss: 16 possible combinations

Page 14: Virtual Memory Access Time (2)

Constraints among these accesses:
- Hit in TLB => hit in page table in MM
- Hit in cache => hit in page in MM
- Hit in page in MM => hit in page table in MM

These constraints eliminate eleven combinations:

Case            TLB    MM PTE  Cache  MM data  Comment
Cache hit       Hit    (Hit)   Hit    (Hit)    MM not checked
Cache miss      Hit    (Hit)   Miss   Hit      Cache updated
TLB miss        Miss   Hit     Hit    Hit      TLB updated, TLB access repeated
TLB+cache miss  Miss   Hit     Miss   Hit      TLB+cache updated
Page fault      Miss   Miss    Miss   Miss     Cache miss follows servicing of page fault
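The counting can be checked by enumerating all 16 combinations and filtering by the constraints. One reading assumption is needed to reach the table's five cases: a "hit" in the page table means a valid, resident translation, so a PTE hit and the page being in MM imply each other:

```python
from itertools import product

valid = []
for tlb, pte, cache, data in product([True, False], repeat=4):
    if tlb and not pte:     continue   # TLB hit => PTE hit
    if cache and not data:  continue   # cache hit => page in MM
    if data != pte:         continue   # page in MM <=> valid PTE (assumption)
    valid.append((tlb, pte, cache, data))

print(len(valid))   # 16 - 11 = 5 surviving combinations
```

The five survivors are exactly the table's rows: cache hit, cache miss, TLB miss, TLB+cache miss, and page fault.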

Page 15: Virtual Memory Access Time (3)

The number of MM accesses depends on the page table organization
- MIPS R2000/R4000 accomplishes table walking with CPU instructions (eight instructions per page table level)
- Several CISC machines implement this in microcode, with the MC88200 having dedicated hardware for it
- RS/6000 implements this completely in hardware

TLB miss penalty is dominated by having to go to main memory
- Page tables may not be in the cache
- Further increase in miss penalty if the page table organization is complex
- TLB misses can have a very damaging effect on physical caches

Page 16: Page Size

Choices:
- Fixed at design time (most early VM systems)
- Statically configurable
  - At any moment, only pages of the same size exist in the system
  - MC68030 allowed page sizes between 256B and 32KB this way
- Dynamically configurable
  - Pages of different sizes coexist in the system
  - Alpha 21164, UltraSPARC: 8KB, 64KB, 512KB, 4MB
  - MIPS R10000, PA-8000: 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB
  - All pages are aligned

Dynamic configuration is a sophisticated way to decrease the TLB miss rate
- Increasing the number of TLB entries increases processor cycle time
- Increasing the VM page size increases internal memory fragmentation
- Needs fully associative TLBs

Page 17: Segmentation and Paging

Paged segments: segments are made up of pages

A paging system has a flat, linear address space
- 32-bit VA = (10-bit VPN1, 10-bit VPN2, 12-bit offset)
- If, for a given VPN1, we reach the max value of VPN2 and add 1, we reach the next page, at address (VPN1+1, 0)

The segmented version has a two-dimensional address space
- 32-bit VA = (10-bit segment #, 10-bit page number, 12-bit offset)
- If, for a given segment #, we reach the max page number and add 1, we get an undefined value

Segments are not contiguous

Segments do not need to have the same size
- Size can even vary dynamically
- Implemented by storing an upper bound for each segment and checking every reference against it
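The bound check described above can be sketched as follows; the segment table contents are made up for illustration:

```python
# Segmented VA = (10-bit segment #, 10-bit page number, 12-bit offset).
# Each segment stores an upper bound (its current size in pages), and
# every reference's page number is checked against it.
SEG_BITS, PAGE_NUM_BITS, PAGE_BITS = 10, 10, 12

segment_limit = {0: 4, 1: 1}    # segment # -> number of valid pages

def check(va):
    """Return (segment, page) if the reference is in bounds, else raise."""
    seg = va >> (PAGE_NUM_BITS + PAGE_BITS)
    page = (va >> PAGE_BITS) & ((1 << PAGE_NUM_BITS) - 1)
    if seg not in segment_limit or page >= segment_limit[seg]:
        raise IndexError("segment violation")
    return seg, page

check(0x0000_3FFF)   # segment 0, page 3: within the 4-page bound
```

This is the two-dimensional behavior of the slide: overrunning a segment's last page raises a violation rather than silently continuing into the next segment.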

Page 18: Example 1: Alpha 21264 TLB

Figure 5.36

Page 19: Example 2: Hypothetical Virtual Memory

Figure 5.37

