1 Above: The Burrough B5000 computer. The first commercial machine with virtual memory. Right: First...

Date post: 21-Dec-2015
Above: The Burroughs B5000 computer. The first commercial machine with virtual memory. Right: First experimental virtual memory. The Manchester Atlas computer, which had virtual memory backed on a magnetic drum.
Transcript
Page 1:

Above: The Burroughs B5000 computer. The first commercial machine with virtual memory.

Right: First experimental virtual memory. The Manchester Atlas computer, which had virtual memory backed on a magnetic drum.

Page 2:

COMP 740: Computer Architecture and Implementation

Montek Singh

Thu, Apr 23, 2009

Topic: Virtual Memory

Page 3:

Virtual Memory (App. C)

Several purposes:
  Main: Allowing software to address more than physical memory
  Other benefits:
    Provides for protection; facilitates multi-processing
    Enables relocation
    Enables programs to begin before loading fully (some implementations)
    Programmers used to use overlays and manually control loading/unloading

Page 4:

Characteristics

                                   Cache-MM                    MM-disk
Access time ratio ("speed gap")    1:5 - 1:15                  1:10000 - 1:1000000
Hit time                           1-2 cycles                  40-100 cycles
Hit ratio                          0.90-0.99                   0.99999-0.9999999
Miss (page fault) ratio            0.01-0.10                   0.00000001-0.000001
Miss penalty                       10-100 cycles               1M-6M cycles
CPU during block transfer          blocking/non-blocking       task switching
Block (page) size                  16-128 bytes                4 Kbytes - 64 Kbytes
Implemented in                     hardware                    hardware + software
Mapping                            Direct or set-associative   Page table ("fully associative")
Replacement algorithm              Not crucial                 Very important (LRU)
Write policy                       Many choices                Write back
Direct access to slow memory       Yes                         No
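
To see why the two levels can tolerate such different miss penalties, the table's rows can be combined with the usual average-access-time formula (hit time + miss ratio x miss penalty). The sketch below is only illustrative: the function name and the specific values, picked from inside the ranges above, are assumptions and not measurements.

```python
def average_access_time(hit_time, miss_ratio, miss_penalty):
    """Average access time in cycles: hit time plus the expected miss cost."""
    return hit_time + miss_ratio * miss_penalty


# Illustrative values chosen from inside the table's ranges (assumptions).
cache_level = average_access_time(hit_time=1, miss_ratio=0.05, miss_penalty=50)
vm_level = average_access_time(hit_time=50, miss_ratio=1e-6, miss_penalty=4_000_000)

print(f"cache-MM level: {cache_level:.2f} cycles on average")
print(f"MM-disk level:  {vm_level:.2f} cycles on average")
```

With these numbers the multi-million-cycle page fault penalty adds only a few cycles on average, because the page fault ratio is so small.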

Page 5:

Segmentation and Paging

Paging system has flat, linear address space
  32-bit VA = (10-bit VPN1, 10-bit VPN2, 12-bit offset)
  If, for given VPN1, we reach max value of VPN2 and add 1, we reach next page at address (VPN1+1, 0)
Segmented version has two-dimensional address space
  32-bit VA = (10-bit segment #, 10-bit page number, 12-bit offset)
  If, for given segment #, we reach max page number and add 1, we get an undefined value
Segments are not contiguous
Segments do not need to have the same size
  Size can even vary dynamically
  Implemented by storing upper bound for each segment and checking every reference against it
Pure segmentation not used today
However, variable page sizes have been used to get some of the locality advantages of segmentation
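
The difference between the two address spaces shows up when a page number is incremented past its maximum. The following is a minimal sketch, assuming the 10-bit fields from this slide; the function names and the segment_limit parameter (the stored upper bound, taken here to be the highest valid page number) are hypothetical.

```python
PAGE_BITS = 10                      # width of VPN2 / page-number field
MAX_PAGE = (1 << PAGE_BITS) - 1     # largest value of that field


def next_page_flat(vpn1, vpn2):
    """Flat paging: adding 1 past the last VPN2 carries into VPN1."""
    vpn = ((vpn1 << PAGE_BITS) | vpn2) + 1
    return vpn >> PAGE_BITS, vpn & MAX_PAGE


def next_page_segmented(segment, page, segment_limit):
    """Segmentation: every reference is checked against the segment's bound."""
    if page + 1 > segment_limit:
        raise IndexError("reference beyond the segment's upper bound")
    return segment, page + 1


print(next_page_flat(3, MAX_PAGE))   # (4, 0): we land on the next page
```

In the flat case the carry simply reaches the next page; in the segmented case the same increment is rejected by the bounds check.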

Page 6:

Addressing

Always a "congruence mapping"
Assume
  4GB VM composed of 2^20 4KB pages
  64MB DRAM main memory composed of 16384 page frames (of same size)
  Only those pages (of the 2^20) that are not empty actually exist
    Each is either in main memory or on disk
    Can be located with two mappings (implemented with tables)

Virtual address = (virtual page number, page offset)
VA = (VPN, offset)
32 bits = (20 bits + 12 bits)

Physical address = (real page number, page offset)
PA = (RPN, offset)
26 bits = (14 bits + 12 bits)
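
The field widths above translate directly into shifts and masks. A small sketch, assuming the 20 + 12 bit virtual and 14 + 12 bit physical layouts from this slide (the function names are made up for illustration):

```python
OFFSET_BITS = 12                       # 4KB pages
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def split_va(va):
    """Split a 32-bit virtual address into (VPN, offset)."""
    return va >> OFFSET_BITS, va & OFFSET_MASK


def make_pa(rpn, offset):
    """Combine a 14-bit real page number with the 12-bit page offset."""
    return (rpn << OFFSET_BITS) | offset


vpn, offset = split_va(0x12345678)
print(hex(vpn), hex(offset))           # 0x12345 0x678
```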

Page 7:

Address Translation

RPN = f_M(VPN)
In reality, VPN is mapped to a page table entry (PTE) which contains RPN ...
  ... as well as miscellaneous control information (e.g., valid bit, dirty bit, replacement information, access control)

VA -> PA
(VPN, offset within page) -> (RPN, offset within page)

VA -> disk address
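
The mapping from VPN to a PTE, and from there to a physical address, can be sketched as follows. This is an assumption-laden toy: the PTE class, its field names, the PageFault exception, and the use of a Python dict as the page table are all hypothetical stand-ins for the hardware structures.

```python
from dataclasses import dataclass

OFFSET_BITS = 12


@dataclass
class PTE:
    rpn: int                 # real page number
    valid: bool = True       # page is resident in main memory
    dirty: bool = False      # page has been written since it was loaded
    writable: bool = True    # access-control information


class PageFault(Exception):
    """Raised when the VPN has no valid mapping in main memory."""


def translate(page_table, va):
    """Map VA = (VPN, offset) to PA = (RPN, offset) through the PTE."""
    vpn, offset = va >> OFFSET_BITS, va & ((1 << OFFSET_BITS) - 1)
    pte = page_table.get(vpn)
    if pte is None or not pte.valid:
        raise PageFault(hex(vpn))
    return (pte.rpn << OFFSET_BITS) | offset


table = {0x12345: PTE(rpn=0x2A)}
print(hex(translate(table, 0x12345678)))   # 0x2a678
```

The offset passes through unchanged; only the page number is translated.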

Page 8:

Single-Level, Direct Page Table in MM

Fully associative mapping:
  when VM page is brought in from disk to MM, it may go into any of the real page frames
Simplest addressing scheme: one-level, direct page table
  (page table base address + VPN) = PTE or page fault
  Assume that PTE size is 4 bytes
  Then whole table requires 4 x 2^20 = 4MB of main memory
Disadvantage: 4MB of main memory must be reserved for page tables, even when the VM space is almost empty
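
The 4MB figure follows from the number of virtual pages times the PTE size. A short check of that arithmetic, with a hypothetical helper name and the slide's parameters as defaults:

```python
def flat_table_bytes(va_bits=32, page_bytes=4096, pte_bytes=4):
    """Size of a one-level direct page table: one PTE per virtual page."""
    entries = (1 << va_bits) // page_bytes     # 2^20 pages for the defaults
    return entries * pte_bytes


print(flat_table_bytes() // 2**20, "MB")       # 4 MB
```

The size depends only on the virtual address space, not on how many pages are actually in use, which is the disadvantage noted above.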

Page 9:

Single-Level Direct Page Table in VM

To avoid tying down 4MB of physical memory
  Put page tables in VM
  Bring into MM only those that are actually needed
  "Paging the page tables"
Needs only 1K PTEs (4KB) in main memory, rather than 4MB
Slows down access to VM pages by possibly needing disk accesses for the PTEs

Page 10:

Multi-Level Direct Page Table in MM

Another solution to storage problem
Break 20-bit VPN into two 10-bit parts
  VPN = (VPN1, VPN2)
This turns original one-level page table into a tree structure
  (1st level base address + VPN1) = 2nd level base address
  (2nd level base address + VPN2) = PTE or page fault
Storage situation much improved
  Always need root node (1K 4-byte entries = 1 VM page)
  Need only a few of the second level nodes
    Allocated on demand
    Can be anywhere in main memory
Negative: Access time to PTE has doubled
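
The two-step lookup and the on-demand allocation of second-level nodes can be sketched with nested lists. This is a simplified model, not a real MMU: the names walk, map_page and PageFault are hypothetical, and a PTE is reduced to just the real page number.

```python
LEVEL_BITS = 10
LEVEL_SIZE = 1 << LEVEL_BITS          # 1K entries per node


class PageFault(Exception):
    """Raised when either level has no entry for the VPN."""


def map_page(root, vpn, rpn):
    """Install a mapping, allocating the second-level node on demand."""
    vpn1, vpn2 = vpn >> LEVEL_BITS, vpn & (LEVEL_SIZE - 1)
    if root[vpn1] is None:
        root[vpn1] = [None] * LEVEL_SIZE
    root[vpn1][vpn2] = rpn


def walk(root, vpn):
    """Two memory accesses: root node first, then the second-level node."""
    vpn1, vpn2 = vpn >> LEVEL_BITS, vpn & (LEVEL_SIZE - 1)
    second = root[vpn1]
    if second is None or second[vpn2] is None:
        raise PageFault(hex(vpn))
    return second[vpn2]


root = [None] * LEVEL_SIZE            # the root node always exists
map_page(root, 0x12345, 7)
allocated = sum(node is not None for node in root)
print(walk(root, 0x12345), allocated)  # 7 1
```

Only one of the 1024 possible second-level nodes is allocated after a single mapping, at the price of two lookups per translation.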

Page 11:

Inverted Page Tables

Virtual address spaces may be vastly larger (and more sparsely populated) than real address spaces
  less-than-full utilization of tree nodes in multi-level direct page table becomes more significant
Ideal (i.e., smallest possible) page table would have one entry for every VM page actually in main memory
  Need 4 x 16K = 64KB of main memory to store this ideal page table
  Storage overhead = 0.1%
Inverted page table implementations are approximations to this ideal page table
  Associative inverted page table in special hardware (ATLAS)
  Hashed inverted page table in MM (IBM, HP PA-RISC)
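
A hashed inverted page table keeps one entry per page frame and finds the frame for a VPN by hashing. The sketch below is a generic illustration and does not reproduce the IBM or PA-RISC designs: the class name, the modulo hash, and the chained buckets are assumptions, and replacing an existing mapping is left out for brevity.

```python
NUM_FRAMES = 16384                    # 64MB / 4KB, as on the Addressing slide


class HashedInvertedPageTable:
    def __init__(self, num_frames=NUM_FRAMES):
        self.num_frames = num_frames
        self.frame_vpn = [None] * num_frames   # entry i describes frame i
        self.buckets = {}                      # hash value -> chain of frames

    def _hash(self, vpn):
        return vpn % self.num_frames           # placeholder hash function

    def insert(self, vpn, frame):
        """Record that virtual page vpn now lives in the given frame."""
        self.frame_vpn[frame] = vpn
        self.buckets.setdefault(self._hash(vpn), []).append(frame)

    def lookup(self, vpn):
        """Return the frame holding vpn, or None (page not in main memory)."""
        for frame in self.buckets.get(self._hash(vpn), []):
            if self.frame_vpn[frame] == vpn:
                return frame
        return None


ipt = HashedInvertedPageTable()
ipt.insert(5, frame=1)
ipt.insert(5 + NUM_FRAMES, frame=2)            # collides with VPN 5
print(ipt.lookup(5), ipt.lookup(5 + NUM_FRAMES), ipt.lookup(6))   # 1 2 None
```

The table size is proportional to the number of frames rather than to the virtual address space, which is the point of the inverted organization.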

Page 12:

Translation Lookaside Buffer (TLB)

To avoid two or more MM accesses for each VM access, use a small cache to store (VPN, PTE) pairs
  PTE contains RPN, from which RA can be constructed
This cache is the TLB, and it exploits locality
  DEC Alpha (32 entries, fully associative)
  Amdahl V/8 (512 entries, 2-way set-associative)
Processor issues VA
  TLB hit
    Send RA to main memory
  TLB miss
    Make two or more MM accesses to page tables to retrieve RA
    Send RA to MM
      (Any of these may cause page fault)
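
The hit and miss paths above can be modelled with a small fully associative cache. This is a sketch under assumptions: the TLB class is hypothetical, LRU replacement is chosen only for illustration, the page table is a plain dict from VPN to RPN, and a KeyError stands in for a page fault.

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB model with LRU replacement (an assumption)."""

    def __init__(self, entries=32):
        self.entries = entries
        self.cache = OrderedDict()    # VPN -> RPN, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.cache:                  # TLB hit
            self.cache.move_to_end(vpn)
            self.hits += 1
            return self.cache[vpn]
        self.misses += 1                       # TLB miss: consult page table
        rpn = page_table[vpn]                  # KeyError models a page fault
        self.cache[vpn] = rpn
        if len(self.cache) > self.entries:
            self.cache.popitem(last=False)     # evict the LRU entry
        return rpn


page_table = {1: 10, 2: 20, 3: 30}
tlb = TLB(entries=2)
for vpn in (1, 2, 1, 3, 2):
    tlb.lookup(vpn, page_table)
print(tlb.hits, tlb.misses)                    # 1 4
```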

Page 13:

TLB Misses

Causes for TLB miss
  VM page is not in main memory
  VM page is in main memory, but TLB entry has not yet been entered into TLB
  VM page is in main memory, but TLB entry has been removed for some reason (removed as LRU, invalidated because page table was updated, etc.)
Miss rates are remarkably low (~0.1%)
  Miss rate depends on size of TLB and on VM page size (coverage)
Miss penalty varies from a single cache access to several page faults
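
Coverage here means the amount of memory the TLB can map at once, i.e. entries times page size. A quick calculation, assuming the entry counts from the TLB slide and page sizes chosen for illustration (the deck does not say which page sizes those machines used):

```python
def tlb_coverage(entries, page_bytes):
    """Bytes of memory reachable without a TLB miss."""
    return entries * page_bytes


KB, MB = 1024, 1024 * 1024
print(tlb_coverage(32, 4 * KB) // KB, "KB")     # 128 KB
print(tlb_coverage(512, 4 * KB) // MB, "MB")    # 2 MB
print(tlb_coverage(32, 4 * MB) // MB, "MB")     # 128 MB
```

Raising either factor widens the coverage, which is why both TLB size and page size affect the miss rate.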

Page 14:

Dirty Bits and TLB: Two Solutions

TLB is read-only cache
  Dirty bit is contained only in page table in MM
  TLB contains only a write-access bit
    Initially set to zero (denying writing of page)
  On first attempt to write VM page
    An exception is caused
    Sets the dirty bit in page table in MM
    Resets the write access bit to 1 in TLB

TLB is a read-write cache
  Dirty bit present in both TLB and page table in MM
  On first write to VM page
    Only dirty bit in TLB is set
  Dirty bit in page table is brought up-to-date
    when TLB entry is evicted
    when VM page and PTE are evicted
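
The first solution (read-only TLB with a write-access bit) can be sketched as a tiny state machine. Everything named here is hypothetical: TLBEntry, the write function, and the dict of dirty bits standing in for the page table in MM; the returned strings only label which path was taken.

```python
from dataclasses import dataclass


@dataclass
class TLBEntry:
    rpn: int
    write_ok: bool = False    # write-access bit, initially 0 (writes denied)


def write(entry, vpn, page_table_dirty):
    """Attempt a write through the TLB entry for the given VPN."""
    if not entry.write_ok:
        # First write: the exception handler sets the dirty bit in the
        # page table in MM and then turns on the write-access bit.
        page_table_dirty[vpn] = True
        entry.write_ok = True
        return "exception"
    return "ok"               # later writes proceed without an exception


dirty_bits = {}
entry = TLBEntry(rpn=42)
first = write(entry, vpn=7, page_table_dirty=dirty_bits)
second = write(entry, vpn=7, page_table_dirty=dirty_bits)
print(first, second, dirty_bits)
```

Only the first write to a page pays for the exception; the TLB itself never needs to be written back.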

Page 15:

Virtual Memory Access Time

Assume existence of TLB, physical cache, MM, disk
Processor issues VA
  TLB hit
    Send RA to cache
  TLB miss
    Exception: Access page tables, update TLB, retry
Memory reference may involve accesses to
  TLB
  Page table in MM
  Cache
  Page in MM
Each of these can be a hit or a miss
  16 possible combinations

Page 16:

Virtual Memory Access Time (2)

Constraints among these accesses
  Hit in TLB => hit in page table in MM
  Hit in cache => hit in page in MM
  Hit in page in MM => hit in page table in MM
These constraints eliminate eleven combinations

Case            TLB    MM PTE  Cache  MM data  Comment
Cache hit       Hit    (Hit)   Hit    (Hit)    MM not checked
Cache miss      Hit    (Hit)   Miss   Hit      Cache updated
TLB miss        Miss   Hit     Hit    Hit      TLB updated, TLB access repeated
TLB+cache miss  Miss   Hit     Miss   Hit      TLB+cache updated
Page fault      Miss   Miss    Miss   Miss     Cache miss follows servicing of page fault
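
The five rows of the table can be reproduced by enumerating all 16 hit/miss combinations and filtering them. One caveat, stated as an assumption of this sketch: the three listed implications alone leave seven combinations, so the code also assumes that a hit in the page table means the page itself is in MM (the converse of the third constraint), which is what the table's rows reflect.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def allowed(tlb, pte, cache, data):
    """True means hit. Arguments: TLB, page table in MM, cache, page in MM."""
    return (
        implies(tlb, pte)          # hit in TLB => hit in page table
        and implies(cache, data)   # hit in cache => hit in page in MM
        and implies(data, pte)     # hit in page in MM => hit in page table
        and implies(pte, data)     # assumption: valid PTE => page is resident
    )


combos = [c for c in product([True, False], repeat=4) if allowed(*c)]
for tlb, pte, cache, data in combos:
    print(["Miss", "Hit"][tlb], ["Miss", "Hit"][pte],
          ["Miss", "Hit"][cache], ["Miss", "Hit"][data])
print(len(combos), "combinations remain")      # 5 combinations remain
```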

Page 17:

Virtual Memory Access Time (3)

Number of MM accesses depends on page table organization
  MIPS R2000/R4000 accomplishes table walking with CPU instructions (eight instructions per page table level)
  Several CISC machines implement this in microcode, with MC88200 having dedicated hardware for this
  RS/6000 implements this completely in hardware
TLB miss penalty dominated by having to go to main memory
  Page tables may not be in cache
  Further increase in miss penalty if page table organization is complex

Page 18:

Page Size

Choices
  Fixed at design time (most early VM systems)
  Statically configurable
    At any moment, only pages of same size exist in system
    MC68030 allowed page sizes between 256B and 32KB this way
  Dynamically configurable
    Pages of different sizes coexist in system
    Alpha 21164, UltraSPARC: 8KB, 64KB, 512KB, 4MB
    MIPS R10000, PA-8000: 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB
    All pages are aligned
Dynamic configuration is a sophisticated way to decrease TLB misses
  Increasing # TLB entries increases processor cycle time
  Increasing size of VM page increases internal memory fragmentation
  Needs fully associative TLBs

Page 19:

Example 1: Alpha 21264 TLB

1. VPN is extracted
2. Protections checked
3. One of 40 entries muxed (or miss registered)
4. Physical page address combined with offset to generate real address

Page 20:

Example 2: Hypothetical Virtual Mem

Figure C.24
8K page size
Cache block 64 bytes
L1 8K bytes
L2 4MB
Caches are direct mapped
L1 virtually indexed and physically tagged
Maybe 2 sets of TLB and L1 (I & D)
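
The parameters above determine how the address bits are split, and they show why L1 can be virtually indexed: its index and block offset fit inside the untranslated page offset. The sketch only works out those bit counts from the slide's sizes; the helper name is made up, and total address widths are left out because the slide does not give them.

```python
def log2_exact(n):
    """Number of address bits needed to select one of n items (n a power of 2)."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


PAGE_BYTES = 8 * 1024
BLOCK_BYTES = 64
L1_BYTES = 8 * 1024
L2_BYTES = 4 * 1024 * 1024

page_offset_bits = log2_exact(PAGE_BYTES)               # 13
block_offset_bits = log2_exact(BLOCK_BYTES)             # 6
l1_index_bits = log2_exact(L1_BYTES // BLOCK_BYTES)     # 7 (direct mapped)
l2_index_bits = log2_exact(L2_BYTES // BLOCK_BYTES)     # 16 (direct mapped)

# L1 index + block offset fit in the page offset, so the L1 lookup can
# start with untranslated bits while the TLB translates the VPN.
print(l1_index_bits + block_offset_bits <= page_offset_bits)   # True
```

The L2 index needs 16 + 6 = 22 bits, more than the 13-bit page offset, so it has to use translated (physical) address bits.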

Page 21:

THE END!

Thank you for your participation in this class!
Final Exam Apr 29, 8-11am, SN155
  Open book, open notes
I will post office hours for Friday and Monday on the class website
Graded project, homework in your mailbox tomorrow

