
Virtual Memory. CENG331: Introduction to Computer Systems, 13th Lecture

Page 1: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

Instructor: Erol Sahin

Virtual Memory
CENG331: Introduction to Computer Systems
13th Lecture

Acknowledgement: Most of the slides are adapted from the ones prepared by R.E. Bryant, D.R. O’Hallaron of Carnegie-Mellon Univ.

Page 2: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 2 –

Today

Virtual memory (VM)
  Overview and motivation
  VM as tool for caching
  VM as tool for memory management
  VM as tool for memory protection
  Address translation

Page 3: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 3 –

Virtual Memory

Programs refer to virtual memory addresses
  movl (%ecx),%eax
  Conceptually a very large array of bytes
  Each byte has its own address
  Actually implemented with a hierarchy of different memory types
  System provides an address space private to a particular “process”

Allocation: compiler and run-time system
  Where different program objects should be stored
  All allocation within a single virtual address space

But why virtual memory? Why not physical memory?

[Figure: virtual address space drawn as a byte array running from address 00...0 to FF...F]

Page 4: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 4 –

Problem 1: How Does Everything Fit?

64-bit addresses: 16 exabytes

Physical main memory: a few gigabytes

And there are many processes ...

Page 5: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 5 –

Problem 2: Memory Management

[Figure: Process 1, Process 2, Process 3, ..., Process n each have a stack, heap, .text, and .data; what goes where in physical main memory?]

Page 6: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 6 –

Problem 3: How To Protect

[Figure: Process i and Process j both resident in physical main memory]

Problem 4: How To Share?

[Figure: Process i and Process j both resident in physical main memory]

Page 7: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 7 –

Solution: Level Of Indirection

Each process gets its own private memory space

Solves the previous problems

[Figure: the virtual memory of each process, Process 1 through Process n, goes through a mapping onto physical memory]

Page 8: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 8 –

Address Spaces

Linear address space: Ordered set of contiguous non-negative integer addresses:
  {0, 1, 2, 3, …}

Virtual address space: Set of N = 2^n virtual addresses
  {0, 1, 2, 3, …, N-1}

Physical address space: Set of M = 2^m physical addresses
  {0, 1, 2, 3, …, M-1}

Clean distinction between data (bytes) and their attributes (addresses)

Each object can now have multiple addresses

Every byte in main memory: one physical address, one (or more) virtual addresses

Page 9: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 9 –

A System Using Physical Addressing

Used in “simple” systems like embedded microcontrollers in devices like cars, elevators, and digital picture frames

[Figure: the CPU sends a physical address (PA) directly to main memory (addresses 0 to M-1), which returns a data word]

Page 10: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 10 –

A System Using Virtual Addressing

Used in all modern desktops, laptops, workstations

One of the great ideas in computer science

MMU checks the cache

[Figure: on the CPU chip, the CPU sends a virtual address (VA) to the MMU; the MMU sends a physical address (PA) to main memory (addresses 0 to M-1), which returns a data word]

Page 11: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 11 –

Why Virtual Memory (VM)?

Efficient use of limited main memory (RAM)
  Use RAM as a cache for the parts of a virtual address space
    Some non-cached parts are stored on disk
    Some (unallocated) non-cached parts are stored nowhere
  Keep only active areas of the virtual address space in memory
    Transfer data back and forth as needed

Simplifies memory management for programmers
  Each process gets the same full, private linear address space

Isolates address spaces
  One process can’t interfere with another’s memory
    because they operate in different address spaces
  User process cannot access privileged information
    Different sections of address spaces have different permissions

Page 13: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 13 –

VM as a Tool for Caching

Virtual memory: array of N = 2^n contiguous bytes
  Think of the array (allocated part) as being stored on disk

Physical main memory (DRAM) = cache for allocated virtual memory

Blocks are called pages; size = 2^p

[Figure: virtual memory (addresses 0 to 2^n - 1) consists of virtual pages VP 0, VP 1, ..., VP 2^(n-p)-1, each unallocated, cached, or uncached; virtual pages (VPs) are stored on disk. Physical memory (addresses 0 to 2^m - 1) consists of physical pages PP 0, PP 1, ..., PP 2^(m-p)-1, each empty or holding a cached virtual page; physical pages (PPs) are cached in DRAM.]

Page 14: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 14 –

Memory Hierarchy: Core 2 Duo

Level                    Size      Throughput      Latency
CPU registers
L1 I-cache / D-cache     32 KB     16 B/cycle      3 cycles
L2 unified cache         ~4 MB     8 B/cycle       14 cycles
Main memory              ~4 GB     2 B/cycle       100 cycles
Disk                     ~500 GB   1 B/30 cycles   millions of cycles

L1/L2 cache: 64 B blocks

Miss penalty (latency): about 30x from cache to main memory
Miss penalty (latency): about 10,000x from main memory to disk

(Original figure not drawn to scale)

Page 15: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 15 –

DRAM Cache Organization

DRAM cache organization is driven by the enormous miss penalty
  DRAM is about 10x slower than SRAM
  Disk is about 10,000x slower than DRAM
    For the first byte; faster for subsequent bytes

Consequences
  Large page (block) size: typically 4-8 KB, sometimes 4 MB
  Fully associative
    Any VP can be placed in any PP
    Requires a “large” mapping function, different from CPU caches
  Highly sophisticated, expensive replacement algorithms
    Too complicated and open-ended to be implemented in hardware
  Write-back rather than write-through

Page 16: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 16 –

Address Translation: Page Tables

A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages. Here: 8 VPs
  Per-process kernel data structure in DRAM

[Figure: a memory-resident page table (DRAM) with entries PTE 0 through PTE 7, each holding a valid bit and either a physical page number or a disk address (null for unallocated pages). Physical memory (DRAM), PP 0 through PP 3, holds VP 1, VP 2, VP 7, and VP 4. Virtual memory (disk) holds VP 1, VP 2, VP 3, VP 4, VP 6, and VP 7.]

Page 17: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 17 –

Address Translation With a Page Table

[Figure: the virtual address is split into a virtual page number (VPN) and a virtual page offset (VPO). The page table base register (PTBR) holds the page table address for the process. The VPN selects a page table entry, which holds a valid bit and a physical page number (PPN). The physical address is the PPN followed by the physical page offset (PPO), which is the same as the VPO.]

Valid bit = 0: page not in memory (page fault)
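The translation in the figure can be sketched in a few lines of Python. This is only an illustration: the 4 KB page size, the `translate` and `PageFault` names, and the contents of the eight-entry table are assumptions, not part of the lecture.

```python
# Illustrative sketch only: names and table contents are assumptions.
PAGE_BITS = 12                       # assumed page size of 2**12 = 4 KB
PAGE_MASK = (1 << PAGE_BITS) - 1

class PageFault(Exception):
    """Signals a reference to a page whose valid bit is 0."""

def translate(page_table, va):
    """Translate virtual address `va` using a list of (valid, ppn) PTEs."""
    vpn = va >> PAGE_BITS            # high bits select the PTE
    vpo = va & PAGE_MASK             # low bits pass through unchanged
    valid, ppn = page_table[vpn]
    if not valid:
        raise PageFault(f"VP {vpn} is not in memory")
    return (ppn << PAGE_BITS) | vpo  # PPO is the same as VPO

# Hypothetical 8-entry table: VP 1, VP 2, VP 7, VP 4 sit in PP 0..3.
page_table = [(0, None), (1, 0), (1, 1), (0, None),
              (1, 3), (0, None), (0, None), (1, 2)]

print(hex(translate(page_table, 0x4ABC)))   # VP 4 maps to PP 3
```

In real hardware the lookup is done by the MMU starting from the PTBR; the list index here stands in for that memory access.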

Page 18: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 18 –

Page Hit

Page hit: reference to VM word that is in physical memory

[Figure: page table diagram as in “Address Translation: Page Tables”, with the virtual address selecting a page table entry]

Page 19: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 19 –

Page Miss

Page miss: reference to VM word that is not in physical memory

[Figure: page table diagram as in “Address Translation: Page Tables”, with the virtual address selecting a page table entry]

Page 20: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 20 –

Handling Page Fault

Page miss causes page fault (an exception)

[Figure: page table diagram as in “Address Translation: Page Tables”; physical memory still holds VP 4]

Page 21: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 21 –

Handling Page Fault

Page miss causes page fault (an exception)

Page fault handler selects a victim to be evicted (here VP 4)

[Figure: page table diagram as in “Address Translation: Page Tables”; physical memory still holds VP 4]

Page 22: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 22 –

Handling Page Fault

Page miss causes page fault (an exception)

Page fault handler selects a victim to be evicted (here VP 4)

[Figure: page table diagram as in “Address Translation: Page Tables”; VP 3 now occupies the physical page that held VP 4, and the valid bits are updated]

Page 23: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 23 –

Handling Page Fault

Page miss causes page fault (an exception)

Page fault handler selects a victim to be evicted (here VP 4)

Offending instruction is restarted: page hit!

[Figure: page table diagram as in “Address Translation: Page Tables”; VP 3 is in physical memory and the virtual address now hits]
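The fault-handling sequence on these slides (miss, pick a victim, page in, restart) can be mimicked with a toy pager. The `Pager` class and its FIFO victim choice are assumptions made for illustration; the slides do not say how the victim is chosen.

```python
from collections import OrderedDict

class Pager:
    """Toy demand pager; FIFO victim choice is an assumption for illustration."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()    # vpn -> ppn, oldest page first
        self.faults = 0

    def access(self, vpn):
        if vpn in self.resident:         # page hit
            return self.resident[vpn]
        self.faults += 1                 # page miss causes a page fault
        if len(self.resident) < self.num_frames:
            ppn = len(self.resident)     # a free physical page exists
        else:
            _victim, ppn = self.resident.popitem(last=False)  # evict oldest
        self.resident[vpn] = ppn         # page in and update the mapping
        return ppn                       # the restarted access now hits

pager = Pager(num_frames=4)
for vpn in (4, 1, 2, 7):                 # fill DRAM; VP 4 is the oldest page
    pager.access(vpn)
pager.access(3)                          # faults and evicts VP 4
```

After the last call, VP 3 occupies the physical page that VP 4 used to hold, and a second access to VP 3 is a hit.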

Page 24: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 24 –

Why does it work? Locality

Virtual memory works because of locality

At any point in time, programs tend to access a set of active virtual pages called the working set
  Programs with better temporal locality will have smaller working sets

If (working set size < main memory size)
  Good performance for one process after compulsory misses

If (SUM(working set sizes) > main memory size)
  Thrashing: performance meltdown where pages are swapped (copied) in and out continuously
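A small simulation makes the two cases concrete. The sketch below counts page faults for a reference string under LRU replacement; LRU, the function name `count_faults`, and the cyclic reference pattern are all assumptions chosen for illustration.

```python
from collections import OrderedDict

def count_faults(refs, num_frames):
    """Count page faults for reference string `refs` under LRU replacement."""
    resident = OrderedDict()             # least recently used page first
    faults = 0
    for vpn in refs:
        if vpn in resident:
            resident.move_to_end(vpn)    # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= num_frames:
                resident.popitem(last=False)   # evict least recently used
            resident[vpn] = True
    return faults

refs = [0, 1, 2, 3] * 100                # working set of 4 pages, 400 accesses

fits = count_faults(refs, num_frames=4)      # only the 4 compulsory misses
thrashes = count_faults(refs, num_frames=3)  # every access misses
```

With four frames the working set fits and only the compulsory misses occur; with three frames this cyclic pattern evicts each page just before it is needed again, so all 400 accesses fault.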

Page 26: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 26 –

VM as a Tool for Memory Management

Key idea: each process has its own virtual address space
  It can view memory as a simple linear array
  Mapping function scatters addresses through physical memory
    Well chosen mappings simplify memory allocation and management

[Figure: address translation maps the virtual address spaces of Process 1 and Process 2 (addresses 0 to N-1, pages VP 1, VP 2, ...) onto the physical address space in DRAM (addresses 0 to M-1, pages PP 2, PP 6, PP 8, ...); one physical page is labeled “e.g., read-only library code”]

Page 27: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 27 –

VM as a Tool for Memory Management

Memory allocation
  Each virtual page can be mapped to any physical page
  A virtual page can be stored in different physical pages at different times

Sharing code and data among processes
  Map virtual pages to the same physical page (here: PP 6)

[Figure: address translation maps the virtual address spaces of Process 1 and Process 2 (addresses 0 to N-1, pages VP 1, VP 2, ...) onto the physical address space in DRAM (addresses 0 to M-1, pages PP 2, PP 6, PP 8, ...); the shared page PP 6 holds, e.g., read-only library code]

Page 28: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 28 –

Simplifying Linking and Loading

Linking
  Each program has a similar virtual address space
  Code, stack, and shared libraries always start at the same address

Loading
  execve() allocates virtual pages for the .text and .data sections, i.e., creates PTEs marked as invalid
  The .text and .data sections are copied, page by page, on demand by the virtual memory system

[Figure: process address space layout, from high to low addresses]
  Kernel virtual memory (memory invisible to user code), from 0xc0000000 up
  User stack (created at runtime); %esp (stack pointer) marks its top
  Memory-mapped region for shared libraries, at 0x40000000
  Run-time heap (created by malloc); brk marks its top
  Read/write segment (.data, .bss), loaded from the executable file
  Read-only segment (.init, .text, .rodata), loaded from the executable file, at 0x08048000
  Unused, from address 0

Page 30: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 30 –

VM as a Tool for Memory Protection

Extend PTEs with permission bits
Page fault handler checks these before remapping
  If violated, send process SIGSEGV (segmentation fault)

Process i:
        SUP   READ  WRITE  Address
VP 0:   No    Yes   No     PP 6
VP 1:   No    Yes   Yes    PP 4
VP 2:   Yes   Yes   Yes    PP 2

Process j:
        SUP   READ  WRITE  Address
VP 0:   No    Yes   No     PP 9
VP 1:   Yes   Yes   Yes    PP 6
VP 2:   No    Yes   Yes    PP 11

[Figure: physical address space containing PP 2, PP 4, PP 6, PP 8, PP 9, and PP 11]
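The permission check can be sketched as follows, using the Process i entries from the table. The `PTE` dataclass, the `check_access` function, and the `SegmentationFault` exception are hypothetical names introduced for this example; a real system delivers SIGSEGV through the kernel instead of raising a Python exception.

```python
from dataclasses import dataclass

@dataclass
class PTE:
    ppn: int
    sup: bool      # True: accessible in supervisor (kernel) mode only
    read: bool
    write: bool

class SegmentationFault(Exception):
    """Stands in for the SIGSEGV sent to the offending process."""

def check_access(pte, op, kernel_mode=False):
    """Return the PPN if `op` ("read" or "write") is permitted, else raise."""
    if pte.sup and not kernel_mode:
        raise SegmentationFault("supervisor-only page")
    if op == "read" and not pte.read:
        raise SegmentationFault("page is not readable")
    if op == "write" and not pte.write:
        raise SegmentationFault("page is not writable")
    return pte.ppn

# Process i, transcribed from the table on this slide.
process_i = {
    0: PTE(ppn=6, sup=False, read=True, write=False),
    1: PTE(ppn=4, sup=False, read=True, write=True),
    2: PTE(ppn=2, sup=True, read=True, write=True),
}

ppn = check_access(process_i[1], "write")   # allowed, yields PP 4
```

A user-mode write to VP 0 or any user-mode access to VP 2 would raise `SegmentationFault` in this sketch.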

Page 32: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 32 –

Address Translation: Page Hit

1) Processor sends virtual address to MMU

2-3) MMU fetches PTE from page table in memory

4) MMU sends physical address to cache/memory

5) Cache/memory sends data word to processor

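[Figure: on the CPU chip, the CPU sends the VA to the MMU (1); the MMU sends the PTE address (PTEA) to cache/memory (2) and receives the PTE (3); the MMU sends the PA to cache/memory (4), which returns the data to the CPU (5)]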

Page 33: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 33 –

Address Translation: Page Fault

1) Processor sends virtual address to MMU
2-3) MMU fetches PTE from page table in memory
4) Valid bit is zero, so MMU triggers page fault exception
5) Handler identifies victim (and, if dirty, pages it out to disk)
6) Handler pages in new page and updates PTE in memory
7) Handler returns to original process, restarting faulting instruction

[Figure: CPU, MMU, and cache/memory on the CPU chip, plus the page fault handler and disk; arrows 1-3 fetch the PTE, arrow 4 raises the exception, arrow 5 writes the victim page to disk, arrow 6 reads the new page from disk, and arrow 7 returns to the CPU]

Page 34: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 34 –

Speeding up Translation with a TLB

Page table entries (PTEs) are cached in L1 like any other memory word
  PTEs may be evicted by other data references
  PTE hit still requires a 1-cycle delay

Solution: Translation Lookaside Buffer (TLB)
  Small hardware cache in MMU
  Maps virtual page numbers to physical page numbers
  Contains complete page table entries for a small number of pages

Page 35: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 35 –

TLB Hit

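[Figure: the CPU sends the VA to the MMU (1); the MMU sends the VPN to the TLB (2) and receives the PTE (3); the MMU sends the PA to cache/memory (4), which returns the data to the CPU (5)]

A TLB hit eliminates a memory access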

Page 36: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 36 –

TLB Miss

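[Figure: the CPU sends the VA to the MMU (1); the MMU sends the VPN to the TLB (2), which misses; the MMU sends the PTE address (PTEA) to cache/memory (3) and the PTE comes back (4); the MMU sends the PA to cache/memory (5), which returns the data to the CPU (6)]

A TLB miss incurs an additional memory access (the PTE)
Fortunately, TLB misses are rare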

Page 37: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 37 –

Simple Memory System Example

Addressing
  14-bit virtual addresses
  12-bit physical addresses
  Page size = 64 bytes

Virtual address (bits 13 to 0): VPN (virtual page number) in bits 13-6, VPO (virtual page offset) in bits 5-0
Physical address (bits 11 to 0): PPN (physical page number) in bits 11-6, PPO (physical page offset) in bits 5-0

Page 38: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 38 –

Simple Memory System Page Table

Only show first 16 entries (out of 256)

VPN  PPN  Valid      VPN  PPN  Valid
00   28   1          08   13   1
01   –    0          09   17   1
02   33   1          0A   09   1
03   02   1          0B   –    0
04   –    0          0C   –    0
05   16   1          0D   2D   1
06   –    0          0E   11   1
07   –    0          0F   0D   1

Page 39: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 39 –

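Simple Memory System TLB

16 entries
4-way associative

Virtual address (bits 13 to 0): TLBT (TLB tag) in bits 13-8, TLBI (TLB index) in bits 7-6, VPO in bits 5-0; the VPN is TLBT followed by TLBI

Set  Tag PPN Valid   Tag PPN Valid   Tag PPN Valid   Tag PPN Valid
0    03  –   0       09  0D  1       00  –   0       07  02  1
1    03  2D  1       02  –   0       04  –   0       0A  –   0
2    02  –   0       08  –   0       06  –   0       03  –   0
3    07  –   0       03  0D  1       0A  34  1       02  –   0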

Page 40: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 40 –

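Simple Memory System Cache

16 lines, 4-byte block size
Physically addressed
Direct mapped

Physical address (bits 11 to 0): CT (cache tag) in bits 11-6, CI (cache index) in bits 5-2, CO (cache offset) in bits 1-0; the PPN is the same bits as CT

Idx  Tag  Valid  B0  B1  B2  B3
0    19   1      99  11  23  11
1    15   0      –   –   –   –
2    1B   1      00  02  04  08
3    36   0      –   –   –   –
4    32   1      43  6D  8F  09
5    0D   1      36  72  F0  1D
6    31   0      –   –   –   –
7    16   1      11  C2  DF  03
8    24   1      3A  00  51  89
9    2D   0      –   –   –   –
A    2D   1      93  15  DA  3B
B    0B   0      –   –   –   –
C    12   0      –   –   –   –
D    16   1      04  96  34  15
E    13   1      83  77  1B  D3
F    14   0      –   –   –   –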

Page 41: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 41 –

Address Translation Example #1

Virtual Address: 0x03D4

Virtual address bits (VPN | VPO): 00001111 | 010100
  VPN: 0x0F   TLBI: 3   TLBT: 0x03   TLB Hit? Y   Page Fault? N   PPN: 0x0D

Physical address bits (PPN | PPO): 001101 | 010100
  CO: 0   CI: 0x5   CT: 0x0D   Hit? Y   Byte: 0x36

Page 42: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 42 –

Address Translation Example #2

Virtual Address: 0x0B8F

Virtual address bits (VPN | VPO): 00101110 | 001111
  VPN: 0x2E   TLBI: 2   TLBT: 0x0B   TLB Hit? N   Page Fault? Y   PPN: TBD

Physical address: not available, because the access page-faults

Page 43: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 43 –

Address Translation Example #3

Virtual Address: 0x0020

Virtual address bits (VPN | VPO): 00000000 | 100000
  VPN: 0x00   TLBI: 0   TLBT: 0x00   TLB Hit? N   Page Fault? N   PPN: 0x28

Physical address bits (PPN | PPO): 101000 | 100000
  CO: 0   CI: 0x8   CT: 0x28   Hit? N   Byte: fetched from memory
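The three examples can be checked mechanically. The sketch below encodes only the valid entries of the page table, TLB, and cache shown on the preceding slides and replays the lookups. The function name `lookup` and the dictionary layout are assumptions; so is the treatment of any VPN outside the 16 entries shown as not present, which matches the answer to Example #2.

```python
# Valid entries only, transcribed from the Simple Memory System slides.
PAGE_TABLE = {0x00: 0x28, 0x02: 0x33, 0x03: 0x02, 0x05: 0x16, 0x08: 0x13,
              0x09: 0x17, 0x0A: 0x09, 0x0D: 0x2D, 0x0E: 0x11, 0x0F: 0x0D}
TLB = {0: {0x09: 0x0D, 0x07: 0x02},      # set -> {tag: PPN}
       1: {0x03: 0x2D},
       2: {},
       3: {0x03: 0x0D, 0x0A: 0x34}}
CACHE = {0x0: (0x19, [0x99, 0x11, 0x23, 0x11]),   # idx -> (tag, [B0..B3])
         0x2: (0x1B, [0x00, 0x02, 0x04, 0x08]),
         0x4: (0x32, [0x43, 0x6D, 0x8F, 0x09]),
         0x5: (0x0D, [0x36, 0x72, 0xF0, 0x1D]),
         0x7: (0x16, [0x11, 0xC2, 0xDF, 0x03]),
         0x8: (0x24, [0x3A, 0x00, 0x51, 0x89]),
         0xA: (0x2D, [0x93, 0x15, 0xDA, 0x3B]),
         0xD: (0x16, [0x04, 0x96, 0x34, 0x15]),
         0xE: (0x13, [0x83, 0x77, 0x1B, 0xD3])}

def lookup(va):
    """Replay one access: TLB, then page table, then the cache."""
    vpn, vpo = va >> 6, va & 0x3F            # 64-byte pages: 6 offset bits
    tlbi, tlbt = vpn & 0x3, vpn >> 2         # 4 sets: 2 index bits
    ppn = TLB[tlbi].get(tlbt)
    result = {"VPN": vpn, "TLBI": tlbi, "TLBT": tlbt, "tlb_hit": ppn is not None}
    if ppn is None:
        ppn = PAGE_TABLE.get(vpn)            # TLB miss: consult the page table
    result["page_fault"] = ppn is None
    if ppn is None:
        return result                        # PPN unknown until fault is serviced
    pa = (ppn << 6) | vpo
    co, ci, ct = pa & 0x3, (pa >> 2) & 0xF, pa >> 6
    line = CACHE.get(ci)
    hit = line is not None and line[0] == ct
    result.update(PPN=ppn, PA=pa, CO=co, CI=ci, CT=ct, cache_hit=hit,
                  byte=line[1][co] if hit else None)
    return result

for va in (0x03D4, 0x0B8F, 0x0020):
    print(hex(va), lookup(va))
```

Running it reproduces the filled-in answers: a TLB hit and cache hit returning 0x36 for 0x03D4, a page fault for 0x0B8F, and a TLB miss followed by a cache miss for 0x0020.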

Page 44: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 44 –

Summary

Programmer’s view of virtual memory
  Each process has its own private linear address space
  Cannot be corrupted by other processes

System view of virtual memory
  Uses memory efficiently by caching virtual memory pages
    Efficient only because of locality
  Simplifies memory management and programming
  Simplifies protection by providing a convenient interpositioning point to check permissions

Page 45: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 45 –

Today

Virtual memory (VM)
  Overview and motivation
  VM as tool for caching
  VM as tool for memory management
  VM as tool for memory protection
  Address translation
  Allocation, multi-level page tables

Linux VM system

Page 46: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 46 –

Allocating Virtual Pages

Example: Allocating VP 5

[Figure: page table diagram as in “Address Translation: Page Tables”, before the allocation; VP 5 is not yet present in virtual memory (disk)]

Page 47: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 47 –

Allocating Virtual Pages

Example: Allocating VP 5
  Kernel allocates VP 5 on disk and points PTE 5 to it

[Figure: page table diagram as in “Address Translation: Page Tables”, after the allocation; VP 5 now appears in virtual memory (disk)]

Page 48: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 48 –

Multi-Level Page Tables

Given:
  4 KB (2^12) page size
  48-bit address space
  4-byte PTE

Problem:
  Would need a 256 GB page table!
    2^48 * 2^-12 * 2^2 = 2^38 bytes

Common solution
  Multi-level page tables
  Example: 2-level page table
    Level 1 table: each PTE points to a page table
    Level 2 table: each PTE points to a page (paged in and out like other data)
  Level 1 table stays in memory
  Level 2 tables paged in and out

[Figure: a Level 1 table whose entries point to Level 2 tables]
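The size estimate is simple enough to check in code. The helper name `flat_page_table_bytes` is made up for this sketch; the parameters are the ones given on the slide.

```python
def flat_page_table_bytes(va_bits, page_bits, pte_bytes):
    """Size of a single-level page table: one PTE per virtual page."""
    num_pages = 1 << (va_bits - page_bits)
    return num_pages * pte_bytes

size = flat_page_table_bytes(va_bits=48, page_bits=12, pte_bytes=4)
print(size == 2**38, size // 2**30, "GB")   # 2^36 PTEs of 4 bytes each
```

For comparison, the same formula with a 32-bit address space gives 2^20 entries, or 4 MB per process, which is one reason a single-level table is already wasteful on 32-bit machines when most of the address space is unused.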

Page 49: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 49 –

A Two-Level Page Table Hierarchy

[Figure: two-level page table hierarchy]
  Level 1 page table
    PTE 0 and PTE 1 each point to a level 2 page table (PTE 0 ... PTE 1023); together these map VP 0 ... VP 1023 and VP 1024 ... VP 2047: 2K allocated VM pages for code and data
    PTE 2 through PTE 7 are null: a gap of 6K unallocated VM pages
    PTE 8 points to a level 2 page table with 1023 null PTEs (1023 unallocated pages) followed by PTE 1023, which maps VP 9215: 1 allocated VM page for the stack
    The remaining (1K - 9) PTEs are null

Page 50: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 50 –

Translating with a k-level Page Table

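[Figure: the n-bit virtual address consists of VPN 1, VPN 2, ..., VPN k (bits n-1 down to p) followed by the VPO (bits p-1 down to 0). VPN 1 indexes the level 1 page table, VPN 2 the level 2 page table, and so on; the level k page table yields the PPN. The m-bit physical address is the PPN (bits m-1 down to p) followed by the PPO (bits p-1 down to 0).]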
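A k-level walk can be sketched with nested dictionaries standing in for the page tables. The `walk` function, the dictionary representation, and the use of `None` to signal a missing entry are assumptions for illustration; hardware walks physical memory starting from a base register.

```python
def walk(root, va, page_bits, level_bits):
    """Translate `va` through a k-level table; return None on a missing entry.

    `level_bits` lists the width of VPN 1 .. VPN k, most significant first.
    Tables are nested dicts; the level-k table maps its index to a PPN.
    """
    vpo = va & ((1 << page_bits) - 1)
    vpn = va >> page_bits
    indices = []
    for bits in reversed(level_bits):        # peel off VPN k first
        indices.append(vpn & ((1 << bits) - 1))
        vpn >>= bits
    indices.reverse()                        # now VPN 1 .. VPN k
    node = root
    for index in indices:
        node = node.get(index)
        if node is None:
            return None                      # null entry: page fault
    return (node << page_bits) | vpo         # node is the PPN at level k

# Hypothetical 2-level table (10 + 10 + 12 bits): VPN1 = 0, VPN2 = 1 -> PPN 0x55.
root = {0: {1: 0x55}}
pa = walk(root, 0x00001123, page_bits=12, level_bits=[10, 10])
print(hex(pa))
```

Only the tables that are actually reachable need to exist, which is the space saving that motivates the multi-level design.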

Page 51: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 51 –

Servicing a Page Fault

(1) Processor signals disk controller
  Read block of length P starting at disk address X and store starting at memory address Y

(2) Read occurs
  Direct Memory Access (DMA)
  Under control of I/O controller

(3) Controller signals completion
  Interrupts processor
  OS resumes suspended process

[Figure: processor (registers and cache), memory, and an I/O controller with disks, all on the memory-I/O bus; (1) initiate block read, (2) DMA transfer, (3) read done]

Page 52: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 52 –

Today

Virtual memory (VM) Multi-level page tables

Linux VM system

Case study: VM system on P6

Performance optimization for VM system

Page 53: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 53 –

Linux Organizes VM as Collection of “Areas”

[Figure: task_struct has a field mm pointing to an mm_struct, whose fields are pgd and mmap; mmap points to a list of vm_area_struct nodes (fields vm_end, vm_start, vm_prot, vm_flags, vm_next), each describing one area of process virtual memory: text (at 0x08048000), data (at 0x0804a020), and shared libraries (at 0x40000000)]

pgd: page directory address
vm_prot: read/write permissions for this area
vm_flags: shared with other processes or private to this process

Page 54: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 54 –

Linux Page Fault Handling

Is the VA legal?
  i.e., is it in an area defined by a vm_area_struct?
  If not (#1), then signal segmentation violation

Is the operation legal?
  i.e., can the process read/write this area?
  If not (#2), then signal protection violation

Otherwise
  Valid address (#3): handle fault

[Figure: the vm_area_struct list for the text, data, and shared libraries areas, with three example accesses numbered 1 to 3 (two reads and one write) that correspond to the three outcomes above]
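The three-way decision can be sketched as a scan over the area list. The `Area` class, the `classify_fault` function, the one-letter permission strings, and all end addresses below are hypothetical; only the three start addresses come from the slides. The sketch also assumes `vm_end` is the first address past the area.

```python
from dataclasses import dataclass

@dataclass
class Area:
    vm_start: int   # first address of the area
    vm_end: int     # first address past the area (assumed exclusive)
    vm_prot: str    # permitted operations, e.g. "r" or "rw"

def classify_fault(areas, va, op):
    """Return the outcome for operation `op` ("r" or "w") at address `va`."""
    for area in areas:
        if area.vm_start <= va < area.vm_end:
            if op not in area.vm_prot:
                return "protection violation"     # case #2
            return "handle fault"                 # case #3
    return "segmentation violation"               # case #1

areas = [
    Area(0x08048000, 0x0804A000, "r"),    # text (end address is hypothetical)
    Area(0x0804A020, 0x0804B000, "rw"),   # data (end address is hypothetical)
    Area(0x40000000, 0x40100000, "r"),    # shared libraries (hypothetical end)
]

print(classify_fault(areas, 0x30000000, "r"))
print(classify_fault(areas, 0x08048010, "w"))
print(classify_fault(areas, 0x0804A100, "r"))
```

A Python list replaces the `vm_next` chain here purely for brevity.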

Page 55: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 55 –

Memory System Summary

L1/L2 Memory Cache
  Purely a speed-up technique
  Behavior invisible to application programmer and (mostly) OS
  Implemented totally in hardware

Virtual Memory
  Supports many OS-related functions
    Process creation, task switching, protection
  Software
    Allocates/shares physical memory among processes
    Maintains high-level tables tracking memory type, source, sharing
    Handles exceptions, fills in hardware-defined mapping tables
  Hardware
    Translates virtual addresses via mapping tables, enforcing permissions
    Accelerates mapping via translation cache (TLB)

Page 56: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 56 –

Further Reading

Intel TLBs:
  Application Note: “TLBs, Paging-Structure Caches, and Their Invalidation”, April 2007

Page 58: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 58 –

Intel P6

Internal designation for successor to Pentium
  Which had internal designation P5

Fundamentally different from Pentium
  Out-of-order, superscalar operation

Resulting processors
  Pentium Pro (1996)
  Pentium II (1997)
    L2 cache on same chip
  Pentium III (1999)

Page 59: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 59 –

P6 Memory System

[Figure: the processor package contains the instruction fetch unit, inst TLB, data TLB, L1 i-cache, L1 d-cache, and bus interface unit; the L2 cache is reached over the cache bus and DRAM over the external system bus (e.g. PCI)]

32 bit address space
4 KB page size

L1, L2, and TLBs
  • 4-way set associative

Inst TLB
  • 32 entries
  • 8 sets

Data TLB
  • 64 entries
  • 16 sets

L1 i-cache and d-cache
  • 16 KB
  • 32 B line size
  • 128 sets

L2 cache
  • unified
  • 128 KB–2 MB

Page 60: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 60 –

Review of Abbreviations

Components of the virtual address (VA)
  TLBI: TLB index
  TLBT: TLB tag
  VPO: virtual page offset
  VPN: virtual page number

Components of the physical address (PA)
  PPO: physical page offset (same as VPO)
  PPN: physical page number
  CO: byte offset within cache line
  CI: cache index
  CT: cache tag

Page 61: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 61 –

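Overview of P6 Address Translation

[Figure: P6 address translation datapath]
  The CPU issues a 32-bit virtual address (VA): VPN (20 bits) and VPO (12 bits)
  For the TLB (16 sets, 4 entries/set), the VPN splits into TLBT (16 bits) and TLBI (4 bits)
  On a TLB miss, the VPN splits into VPN1 (10 bits) and VPN2 (10 bits); starting from PDBR, the page tables yield a PDE and then a PTE
  On a TLB hit, or after the page table lookup, the physical address (PA) is PPN (20 bits) followed by PPO (12 bits)
  For L1 (128 sets, 4 lines/set), the PA splits into CT (20 bits), CI (7 bits), and CO (5 bits)
  An L1 hit returns the 32-bit result; an L1 miss goes to L2 and DRAM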
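The field widths in the diagram translate directly into shifts and masks. The function names `split_va` and `split_pa` and the sample address are made up for this sketch; the widths are the ones shown on the slide.

```python
def split_va(va):
    """Split a 32-bit P6 virtual address into the fields used above."""
    vpn, vpo = va >> 12, va & 0xFFF          # 20-bit VPN, 12-bit VPO
    return {
        "VPN": vpn, "VPO": vpo,
        "TLBT": vpn >> 4, "TLBI": vpn & 0xF,     # 16-bit tag, 4-bit index
        "VPN1": vpn >> 10, "VPN2": vpn & 0x3FF,  # two 10-bit table indices
    }

def split_pa(pa):
    """Split a 32-bit physical address into L1 cache tag, index, and offset."""
    return {"CT": pa >> 12, "CI": (pa >> 5) & 0x7F, "CO": pa & 0x1F}

va_fields = split_va(0x12345678)
pa_fields = split_pa(0x12345678)
print({name: hex(value) for name, value in va_fields.items()})
```

Note that CI and CO together occupy exactly the 12 offset bits, so the cache index can be computed from the VPO before translation finishes.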

Page 62: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 62 –

P6 2-level Page Table Structure

Page directory
  1024 4-byte page directory entries (PDEs) that point to page tables
  One page directory per process
  Page directory must be in memory when its process is running
  Always pointed to by PDBR
  Large page support:
    Make PD the page table
    Fixes page size to 4 MB (why?)

Page tables:
  1024 4-byte page table entries (PTEs) that point to pages
  Size: exactly one page
  Page tables can be paged in and out

[Figure: a page directory of 1024 PDEs pointing to up to 1024 page tables of 1024 PTEs each]

Page 63: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 63 –

P6 Page Directory Entry (PDE)

Format when P=1 (bit positions in parentheses):
  Page table physical base address (31-12) | Avail (11-9) | G (8) | PS (7) | unlabeled (6) | A (5) | CD (4) | WT (3) | U/S (2) | R/W (1) | P=1 (0)

Page table physical base address: 20 most significant bits of physical page table address (forces page tables to be 4 KB aligned)

Avail: These bits available for system programmers

G: global page (don’t evict from TLB on task switch)

PS: page size 4K (0) or 4M (1)

A: accessed (set by MMU on reads and writes, cleared by software)

CD: cache disabled (1) or enabled (0)

WT: write-through or write-back cache policy for this page table

U/S: user or supervisor mode access

R/W: read-only or read-write access

P: page table is present in memory (1) or not (0)

Format when P=0:
  Available for OS (page table location in secondary storage) (31-1) | P=0 (0)

Page 64: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 64 –

P6 Page Table Entry (PTE)

Format when P=1 (bit positions in parentheses):
  Page physical base address (31-12) | Avail (11-9) | G (8) | 0 (7) | D (6) | A (5) | CD (4) | WT (3) | U/S (2) | R/W (1) | P=1 (0)

Page base address: 20 most significant bits of physical page address (forces pages to be 4 KB aligned)

Avail: available for system programmers

G: global page (don’t evict from TLB on task switch)

D: dirty (set by MMU on writes)

A: accessed (set by MMU on reads and writes)

CD: cache disabled or enabled

WT: write-through or write-back cache policy for this page

U/S: user/supervisor

R/W: read/write

P: page is present in physical memory (1) or not (0)

Format when P=0:
  Available for OS (page location in secondary storage) (31-1) | P=0 (0)
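Decoding an entry is a matter of masking the bit positions listed on this slide. The sketch below is illustrative: `decode_pte`, `FLAG_BITS`, the `os_bits` key, and the sample values are hypothetical names and data, while the bit positions are the ones shown above.

```python
# Bit positions of the single-bit PTE fields, as listed on the slide.
FLAG_BITS = {"P": 0, "R/W": 1, "U/S": 2, "WT": 3, "CD": 4, "A": 5, "D": 6, "G": 8}

def decode_pte(pte):
    """Decode a 32-bit P6 page table entry into named fields."""
    if pte & 1 == 0:
        # P=0: the remaining 31 bits are free for the OS (e.g. disk location).
        return {"P": 0, "os_bits": pte >> 1}
    fields = {name: (pte >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["Avail"] = (pte >> 9) & 0x7       # bits 11-9
    fields["base"] = pte & 0xFFFFF000        # bits 31-12, 4 KB aligned
    return fields

entry = decode_pte(0x0030A067)               # hypothetical present entry
print(hex(entry["base"]), entry)
```

For the sample value the page base is 0x0030A000 and the P, R/W, U/S, A, and D bits are set. A PDE can be decoded the same way by reading bit 7 as PS instead of treating it as zero.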

Page 65: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 65 –

Representation of VM Address Space

Simplified Example
  16 page virtual address space

Flags
  P: Is entry in physical memory?
  M: Has this part of VA space been mapped?

[Figure: a page directory and the page tables PT 0, PT 2, and PT 3, each entry tagged with its P and M flags, covering Page 0 through Page 15. Entries hold a memory address for pages in memory and a disk address for pages on disk; unmapped pages have neither.]

Page 66: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 66 –

P6 TLB Translation

[Figure: same datapath as in “Overview of P6 Address Translation”: VA = VPN (20 bits) + VPO (12 bits); TLB (16 sets, 4 entries/set) indexed by TLBI (4 bits) and tagged by TLBT (16 bits); page tables reached via PDBR, VPN1 (10 bits), and VPN2 (10 bits); PA = PPN (20 bits) + PPO (12 bits); L1 (128 sets, 4 lines/set) with CT (20 bits), CI (7 bits), and CO (5 bits)]

Page 67: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 67 –

P6 TLB

TLB entry (not all documented, so this is speculative):
  Fields: PPN (20 bits), TLBTag (16 bits), and one bit each for W, S, G, V, and D
  V: indicates a valid (1) or invalid (0) TLB entry
  TLBTag: disambiguates entries cached in the same set
  PPN: translation of the address indicated by index & tag
  G: page is “global” according to PDE, PTE
  S: page is “supervisor-only” according to PDE, PTE
  W: page is writable according to PDE, PTE
  D: PTE has already been marked “dirty” (once is enough)

Structure of the data TLB:
  16 sets (set 0 through set 15), 4 entries/set

Page 68: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 68 –

Translating with the P6 TLB

1. Partition VPN into TLBT and TLBI.

2. Is the PTE for VPN cached in set TLBI?

3. Yes: Check permissions, build physical address

4. No: Read PTE (and PDE if not cached) from memory and build physical address

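[Figure: the CPU’s virtual address is VPN (20 bits) + VPO (12 bits); (1) the VPN is partitioned into TLBT (16 bits) and TLBI (4 bits); (2) the TLB set is searched; (3) on a TLB hit the physical address, PPN (20 bits) + PPO (12 bits), is built; (4) on a TLB miss, page table translation reads the PDE and PTE; one path is labeled “partial TLB hit”]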

Page 69: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 69 –

P6 TLB Translation

[Figure: same datapath as in “Overview of P6 Address Translation”]

Page 70: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 70 –

Translating with the P6 Page Tables (case 1/1)

Case 1/1: page table and page present

MMU action:
  MMU builds physical address and fetches data word

OS action
  None

[Figure: PDBR locates the page directory; VPN1 selects a PDE (p=1) and VPN2 selects a PTE (p=1); the page table and the data page are both in memory; PPN (20 bits) and PPO (12 bits) form the physical address of the data]

Page 71: Virtual Memory CENG331: Introduction to Computer Systems 13 rd  Lecture

– 71 –

Translating with the P6 Page Tables (case 1/0)

Case 1/0: page table present, page missing

MMU action:
  Page fault exception
  Handler receives the following args:
    %eip that caused fault
    VA that caused fault
    Fault caused by non-present page or page-level protection violation
      Read/write
      User/supervisor

[Figure: the PDE has p=1 and the PTE has p=0; the page table is in memory and the data page is on disk]

Translating with the P6 Page Tables (case 1/0, cont.)

[Figure: state after the OS action. PDE p=1, PTE p=1; page directory, page table, and data page are all in memory (Mem). The physical address is PPN (20 bits) + PPO (12 bits).]

OS action:
  Check for a legal virtual address
  Read PTE through PDE
  Find free physical page (swapping out current page if necessary)
  Read virtual page from disk into physical page
  Adjust PTE to point to physical page, set p=1
  Restart faulting instruction by returning from exception handler
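The handler steps above can be imitated with a small user-space simulation. Everything here is an assumption made for illustration: a single-level table `pte[]` stands in for the PDE/PTE pair, `disk` and `mem` are plain arrays, eviction is round-robin, and `handle_page_fault` is a hypothetical name. A real kernel handler is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PAGE_SIZE = 4096, NFRAMES = 2, NPAGES = 4 };

uint8_t  disk[NPAGES][PAGE_SIZE];      /* backing store, one slot per virtual page */
uint8_t  mem[NFRAMES][PAGE_SIZE];      /* physical page frames */
uint32_t pte[NPAGES];                  /* toy table: (frame << 12) | p */
int      owner[NFRAMES] = { -1, -1 };  /* virtual page held by each frame, -1 if free */
int      next_victim;                  /* round-robin eviction (an assumption) */

/* Returns 0 when the page has been brought in, -1 for an illegal address. */
int handle_page_fault(uint32_t va)
{
    uint32_t vpn = va >> 12;
    if (vpn >= NPAGES)
        return -1;                             /* not a legal virtual address */
    assert(!(pte[vpn] & 1u));                  /* protection faults are not modeled */

    int frame = -1;
    for (int i = 0; i < NFRAMES; i++)          /* look for a free physical page */
        if (owner[i] < 0) { frame = i; break; }

    if (frame < 0) {                           /* none free: swap out a victim */
        frame = next_victim;
        next_victim = (next_victim + 1) % NFRAMES;
        int old = owner[frame];
        memcpy(disk[old], mem[frame], PAGE_SIZE);
        pte[old] = 0;                          /* victim is no longer present */
    }

    memcpy(mem[frame], disk[vpn], PAGE_SIZE);  /* read virtual page from disk */
    owner[frame] = (int)vpn;
    pte[vpn] = ((uint32_t)frame << 12) | 1u;   /* point PTE at the frame, set p=1 */
    return 0;                                  /* returning restarts the instruction */
}
```

With two frames, a third fault on a new page forces an eviction, and the evicted page's entry goes back to p=0, which is the state case 1/0 starts from.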

Translating with the P6 Page Tables (case 0/1)

[Figure: PDE p=0, PTE p=1. Page directory and data page are in memory (Mem); the page table is on disk.]

Case 0/1: page table missing, page present

Introduces consistency issue
  Potentially every page-out requires update of disk page table

Linux disallows this
  If a page table is swapped out, then swap out its data pages too

Translating with the P6 Page Tables (case 0/0)

[Figure: PDE p=0, PTE p=0. Page directory is in memory (Mem); page table and data page are on disk.]

Case 0/0: page table and page missing

MMU Action: Page fault

Translating with the P6 Page Tables (case 0/0, cont.)

[Figure: state after the page table is swapped in. PDE p=1, PTE p=0. Page directory and page table are in memory (Mem); the data page is still on disk.]

OS action:
  Swap in page table
  Restart faulting instruction by returning from handler

Like case 1/0 from here on (page table present, page missing)
  Two disk reads in total: one for the page table, one for the page

P6 L1 Cache Access

[Figure: same diagram as on the “P6 TLB Translation” slide. After translation, the physical address (PA) is split into CT (20 bits), CI (7 bits), and CO (5 bits) to access L1 (128 sets, 4 lines/set). An L1 hit returns the 32-bit result; an L1 miss goes to L2 and DRAM.]

L1 Cache Access

Partition physical address: CO, CI, and CT

Use CT to determine if line containing word at address PA is cached in set CI

No: check L2

Yes: extract word at byte offset CO and return to processor
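A small C sketch of this lookup, assuming the widths on the slide (CT 20 bits, CI 7 bits, CO 5 bits, 4 lines per set). The names `cache_fields`, `cache_line`, `split_pa`, and `l1_lookup` are hypothetical, and the line data itself is left out to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

enum { CO_BITS = 5, CI_BITS = 7, NSETS = 128, LINES_PER_SET = 4 };

typedef struct { uint32_t ct, ci, co; } cache_fields;  /* hypothetical helper */
typedef struct { int valid; uint32_t tag; } cache_line;

/* Partition a 32-bit physical address into tag, set index, and byte offset. */
cache_fields split_pa(uint32_t pa)
{
    cache_fields f;
    f.co = pa & ((1u << CO_BITS) - 1);
    f.ci = (pa >> CO_BITS) & ((1u << CI_BITS) - 1);
    f.ct = pa >> (CO_BITS + CI_BITS);
    return f;
}

/* Returns the matching line within set CI on a hit, or -1 on a miss
   (in which case L2 would be checked next). */
int l1_lookup(cache_line cache[NSETS][LINES_PER_SET], uint32_t pa)
{
    cache_fields f = split_pa(pa);
    assert(f.ci < NSETS);
    for (int way = 0; way < LINES_PER_SET; way++)
        if (cache[f.ci][way].valid && cache[f.ci][way].tag == f.ct)
            return way;
    return -1;
}
```

For PA 0xABCDE345 the fields come out as CT 0xABCDE, CI 0x1A, and CO 0x5, so set 26 is searched for a valid line tagged 0xABCDE.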

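[Figure: the physical address (PA) is split into CT (20 bits), CI (7 bits), and CO (5 bits) to access L1 (128 sets, 4 lines/set). An L1 hit returns 32 bits of data; an L1 miss goes to L2 and DRAM.]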

Speeding Up L1 Access

Observation
  Bits that determine CI identical in virtual and physical address
  Can index into cache while address translation taking place
  Generally we hit in TLB, so PPN bits (CT bits) available next
  “Virtually indexed, physically tagged”
  Cache carefully sized to make this possible
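The observation rests on simple arithmetic: CI (7 bits) and CO (5 bits) together occupy 12 bits, exactly the width of the page offset, which translation leaves unchanged. The sketch below checks this in C; `cache_index`, `translate`, and `index_fits_in_offset` are hypothetical helpers, and `translate` just substitutes a given PPN for the VPN.

```c
#include <assert.h>
#include <stdint.h>

enum { VPO_BITS = 12, CO_BITS = 5, CI_BITS = 7 };

/* Cache set index computed from any 32-bit address, virtual or physical. */
uint32_t cache_index(uint32_t addr)
{
    return (addr >> CO_BITS) & ((1u << CI_BITS) - 1);
}

/* Toy translation: replace the VPN with the given PPN, keep the offset. */
uint32_t translate(uint32_t va, uint32_t ppn)
{
    assert(ppn < (1u << 20));
    return (ppn << VPO_BITS) | (va & ((1u << VPO_BITS) - 1));
}

/* The trick only works when the index and offset bits fit in the page offset. */
int index_fits_in_offset(void)
{
    return CO_BITS + CI_BITS <= VPO_BITS;
}
```

Because `cache_index(va)` equals `cache_index(translate(va, ppn))` for any PPN, the set can be selected from the virtual address while the TLB is still producing the tag bits.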

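[Figure: the virtual address (VA) is VPN (20 bits) + VPO (12 bits); the physical address (PA) is PPN + PPO, viewed by the cache as CT (20 bits) + CI (7 bits) + CO (5 bits). The VPN goes through address translation to produce the PPN used for the tag check, while the offset bits pass through with no change and supply CI.]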

x86-64 Paging

Origin
  AMD’s way of extending x86 to 64-bit instruction set
  Intel has followed with “EM64T”

Requirements
  48-bit virtual address
    256 terabytes (TB)
    Not yet ready for full 64 bits
      » Nobody can buy that much DRAM yet
      » Mapping tables would be huge
      » Multi-level array map may not be the right data structure
  52-bit physical address = 40 bits for PPN
    Requires 64-bit table entries
  Keep traditional x86 4KB page size, and same size for page tables
    (4096 bytes per PT) / (8 bytes per PTE) = only 512 entries per page
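The numbers on this slide follow from a few lines of arithmetic, written out here in C. The function names are hypothetical; the constants are the ones stated above. The last two functions derive a consequence the slide leaves implicit: 512 entries need 9 index bits, and 48 - 12 = 36 VPN bits divided by 9 gives four levels of tables.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_BYTES = 4096, PTE_BYTES = 8,
       VA_BITS = 48, PA_BITS = 52, PAGE_OFFSET_BITS = 12 };

unsigned entries_per_table(void) { return PAGE_BYTES / PTE_BYTES; }      /* 4096 / 8 */
unsigned ppn_bits(void)          { return PA_BITS - PAGE_OFFSET_BITS; }  /* 52 - 12 */

/* Size of the 48-bit virtual address space in units of 2^40 bytes. */
uint64_t va_space_tb(void)       { return (UINT64_C(1) << VA_BITS) >> 40; }

/* Number of VA bits needed to index one table. */
unsigned index_bits(void)
{
    unsigned n = entries_per_table(), bits = 0;
    assert((n & (n - 1)) == 0);   /* entry count must be a power of two */
    while (n > 1) { n >>= 1; bits++; }
    return bits;
}

/* Number of table levels needed to cover the whole VPN. */
unsigned levels(void)
{
    return (VA_BITS - PAGE_OFFSET_BITS) / index_bits();
}
```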

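x86-64 Paging

[Figure: four-level page table walk. The 48-bit virtual address is VPN1, VPN2, VPN3, VPN4 (9 bits each) followed by VPO (12 bits). BR locates the Page Map Table, where VPN1 selects a PML4E; that entry points to the Page Directory Pointer Table, where VPN2 selects a PDPE; that entry points to the Page Directory Table, where VPN3 selects a PDE; that entry points to the Page Table, where VPN4 selects the PTE. The PPN (40 bits) from the PTE is joined with the PPO (12 bits) to form the physical address.]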
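As an illustration of the address layout in this figure, the following C sketch extracts the four 9-bit indices and the 12-bit offset from a 48-bit virtual address. The type `va48_fields` and function `split_va48` are hypothetical, and the sketch only accepts addresses whose upper 16 bits are zero; the sign-extended upper half of the address space is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* vpn[0] corresponds to VPN1 (Page Map Table index),
   vpn[3] to VPN4 (Page Table index). */
typedef struct { unsigned vpn[4]; unsigned vpo; } va48_fields;

va48_fields split_va48(uint64_t va)
{
    va48_fields f;
    assert((va >> 48) == 0);   /* simplification: low 48 bits only */
    f.vpo = (unsigned)(va & 0xFFFu);
    for (int level = 0; level < 4; level++) {
        int shift = 12 + 9 * (3 - level);   /* 39, 30, 21, 12 */
        f.vpn[level] = (unsigned)((va >> shift) & 0x1FFu);
    }
    return f;
}
```

Each index picks one of the 512 entries in its table, so a walk that misses everywhere costs four memory reads before the data access itself.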

