
Virtual Memory


Learning to Play Well With Others

[Figure sequence: a single program's stack and heap live in a 64KB physical memory (0x00000 to 0x10000). A call to malloc(0x20000) asks for 128KB, more than the machine has. Running a second program adds another stack and heap that must share the same physical memory. With virtual memory, each process instead sees its own virtual address space (0x00000 to 0x10000 (64KB)) that is mapped onto physical memory, and the virtual space can then grow beyond physical memory, first to 0x400000 (4MB) and then to 0xF000000 (240MB), with the excess backed by disk (GBs).]

Mapping

• Virtual-to-physical mapping
• Virtual --> "virtual address space"
• Physical --> "physical address space"
• We will break both address spaces up into "pages"
• Typically 4KB in size, although sometimes larger
• Use a "page table" to map between virtual pages and physical pages.
• The processor generates "virtual" addresses.
• They are translated via "address translation" into physical addresses.


Implementing Virtual Memory

[Figure: a virtual address space (0 to 2^32 - 1) containing the stack and heap, mapped onto a smaller physical address space (0 to 2^30 - 1, or whatever). We need to keep track of this mapping…]

The Mapping Process

[Figure: a 32-bit virtual address splits into a virtual page number and a page offset (log2(page size) bits). The virtual-to-physical map translates the virtual page number into a physical page number; the page offset passes through untranslated, yielding the 32-bit physical address.]
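To make the split concrete, here is a minimal sketch of the translation arithmetic in C, assuming 4KB pages and a hypothetical lookup_ppn() standing in for the map:

```c
#include <stdint.h>

#define PAGE_SHIFT  12                         /* log2(4KB page size) */
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

/* Hypothetical map lookup: virtual page number -> physical page number. */
uint32_t lookup_ppn(uint32_t vpn);

uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;     /* upper 20 bits: translated   */
    uint32_t offset = vaddr & OFFSET_MASK;     /* lower 12 bits: pass through */
    return (lookup_ppn(vpn) << PAGE_SHIFT) | offset;
}
```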

Two Problems With VM

• How do we store the map compactly?
• How do we translate quickly?


How Big Is the Map?

• 32-bit address space:
• 4GB of virtual addresses
• 1M 4KB pages (2^32 / 2^12 = 2^20)
• Each entry is 4 bytes (a 32-bit physical address)
• 4MB of map
• 64-bit address space:
• 16 exabytes of virtual address space
• 4 peta-pages (2^64 / 2^12 = 2^52)
• Each entry is 8 bytes
• 32PB of map (2^52 entries x 8 bytes = 2^55)


Shrinking the Map

• Only store the entries that matter (i.e., enough for your physical address space)
• 64GB of physical memory on a 64-bit machine: 16M pages, 128MB of map
• This is still pretty big, and representing the map is now hard:
• The OS allocates stuff all over the place, for security, convenience, or caching optimizations.
• How do you represent this "sparse" map?


Hierarchical Page Tables

• Break the virtual page number into several pieces.
• If each piece has N bits, build a 2^N-ary tree.
• Only store the parts of the tree that contain valid pages.
• To do translation, walk down the tree, using the pieces to select which child to visit (see the sketch after the figure below).


Hierarchical Page Table

[Figure: a two-level page table. The 32-bit virtual address splits into a 10-bit L1 index p1 (bits 31-22), a 10-bit L2 index p2 (bits 21-12), and a 12-bit page offset (bits 11-0). A processor register holds the root of the current page table; p1 selects an entry in the level-1 page table, which points to a level-2 page table; p2 selects the entry that points to the data page. Only the parts of the map that exist are stored; the parts that don't are simply absent.]

Adapted from Arvind and Krste's MIT Course 6.823, Fall 2005.
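A minimal sketch of this walk in C, assuming the 10/10/12 split above and a hypothetical in-memory format where a NULL or zero entry marks a missing subtree (real page-table entries also carry permission and status bits):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical format: each level is a 1024-entry array (one entry
 * per 10-bit index); NULL or zero marks a part of the tree that
 * does not exist. */
typedef uint32_t l2_table[1024];    /* entry: PPN, or 0 if invalid    */
typedef l2_table *l1_table[1024];   /* entry: pointer to an L2 table  */

extern l1_table *root;              /* "root of the current page table" */

/* Walk the tree; return the physical address, or -1 if unmapped. */
int64_t walk(uint32_t vaddr)
{
    uint32_t p1     = (vaddr >> 22) & 0x3ff;  /* bits 31-22: 10-bit L1 index */
    uint32_t p2     = (vaddr >> 12) & 0x3ff;  /* bits 21-12: 10-bit L2 index */
    uint32_t offset =  vaddr        & 0xfff;  /* bits 11-0:  page offset     */

    l2_table *l2 = (*root)[p1];
    if (l2 == NULL)
        return -1;                  /* this part of the tree doesn't exist */

    uint32_t ppn = (*l2)[p2];
    if (ppn == 0)
        return -1;                  /* PTE of a nonexistent page */

    return ((int64_t)ppn << 12) | offset;
}
```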

Making Translation Fast

• Address translation has to happen for every memory access.
• This potentially puts it squarely on the critical path for memory operations (which are already slow).


“Solution 1”: Use the Page Table

• We could walk the page table on every memory access

• Result: every load or store requires an additional 3-4 loads to walk the page table.

• Unacceptable performance hit.


Solution 2: TLBs

• We have a large pile of data (i.e., the page table) and we want to access it very quickly (i.e., in one clock cycle).
• So, build a cache for the page mapping, but call it a "translation lookaside buffer" or "TLB".


TLBs

• TLBs are small (maybe 128 entries), highly-associative (often fully-associative) caches for page table entries (a software sketch follows below).
• This raises the possibility of a TLB miss, which can be expensive.
• To make misses cheaper, there are "hardware page table walkers" -- specialized state machines that can load page table entries into the TLB without OS intervention.
• This means that the page table format is now part of the big-A Architecture.
• Typically, the OS can disable the walker and implement its own format.
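A minimal software sketch of a fully-associative TLB lookup, assuming a hypothetical page_table_walk() miss handler and a round-robin replacement policy chosen purely for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 128
#define PAGE_SHIFT  12

typedef struct {
    bool     valid;
    uint32_t vpn;                 /* virtual page number (the tag) */
    uint32_t ppn;                 /* physical page number          */
} tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];

/* Hypothetical miss handler: walk the page table, return the PPN. */
uint32_t page_table_walk(uint32_t vpn);

uint32_t tlb_translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

    /* Fully associative: compare the VPN against every entry.  In
     * hardware all of these comparisons happen in parallel, in one
     * cycle; the loop only models that. */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn)            /* TLB hit */
            return (tlb[i].ppn << PAGE_SHIFT) | offset;

    /* TLB miss: walk the page table and fill an entry. */
    static int victim = 0;
    uint32_t ppn = page_table_walk(vpn);
    tlb[victim] = (tlb_entry){ .valid = true, .vpn = vpn, .ppn = ppn };
    victim = (victim + 1) % TLB_ENTRIES;
    return (ppn << PAGE_SHIFT) | offset;
}
```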


Solution 3: Defer Translating Accesses

• If we translate before we go to the cache, we have a "physical cache", since the cache works on physical addresses.
• Critical path = TLB access time + cache access time
• Alternately, we could translate after the cache:
• Translation is only required on a miss.
• This is a "virtual cache".

[Figure: with a physical cache, the virtual address goes through the TLB first (CPU -> VA -> TLB -> PA -> physical cache -> primary memory). With a virtual cache, the cache is accessed with the virtual address and the TLB is consulted only on a miss (CPU -> VA -> virtual cache -> TLB -> PA -> primary memory).]

The Danger Of Virtual Caches (1)

• Process A is running. It issues a memory request to address 0x10000.
• It is a miss, and the data is brought into the virtual cache.
• A context switch occurs.
• Process B starts running. It issues a request to 0x10000.
• Will B get the right data?


No! We must flush virtual caches on a context switch.

The Danger Of Virtual Caches (2)

• There is no rule that says that each virtual address maps to a different physical address.
• When this occurs, it is called "aliasing".
• Example: an alias exists in the cache.
• Store B to 0x1000.
• Now, a load from 0x2000 will return the wrong value.

[Figure: the page table maps both virtual addresses 0x1000 and 0x2000 to physical address 0xfff0000, so the virtual cache holds the value A under both addresses. After the store of B to 0x1000, the cache holds B at 0x1000 but still holds the stale A at 0x2000.]

The Danger Of Virtual Caches (2)

• There is no rule that says that each virtual address maps to a different physical address.
• Copy-on-write:
• The initial copy is free; the OS will catch attempts to write to the copy, and do the actual copy lazily.
• There are also system calls that let you do this arbitrarily (a sketch follows the figure below).

[Figure: two virtual address spaces. memcpy(A, B, 100000) "copies" between char *A ("My Big Data") and char *B ("My Empty Buffer"). With copy-on-write, both virtual buffers map to the same physical "My Big Data" pages, marked un-writeable, and the big empty buffer is never physically written: two virtual addresses pointing to the same physical address.]

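One such mechanism, sketched here under assumptions (POSIX mmap on a Unix-like system; the file name is hypothetical): a MAP_PRIVATE file mapping is a copy-on-write "copy", so nothing is physically copied until a page is first written. fork() applies the same trick to an entire address space.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("bigdata.bin", O_RDONLY);   /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    size_t len = 1 << 20;                     /* 1MB, for illustration */

    /* MAP_PRIVATE: a copy-on-write "copy" of the file.  Nothing is
     * copied now; pages are shared read-only until first written. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    buf[0] = 'X';    /* first write: the OS copies just this one page */

    munmap(buf, len);
    close(fd);
    return 0;
}
```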

Avoiding Aliases

• If the system has virtual caches, the operating system must prevent aliases from occurring.
• This means that any addresses that may alias must map to the same cache index:
• If VA1 and VA2 are aliases, then VA1 mod (cache size) == VA2 mod (cache size) (see the check sketched below).
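The invariant as code, with the 64KB cache size as an assumed parameter:

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SIZE (64 * 1024)   /* assumed virtually-indexed cache size */

/* Two virtual aliases of one physical page are safe only if they fall
 * at the same cache index, i.e. they agree modulo the cache size. */
bool aliases_are_safe(uintptr_t va1, uintptr_t va2)
{
    return (va1 % CACHE_SIZE) == (va2 % CACHE_SIZE);
}
```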


Solution (4): Virtually Indexed, Physically Tagged

• The index L is available without consulting the TLB ⇒ cache and TLB accesses can begin simultaneously.
• Critical path = max(cache time, TLB time)!
• The tag comparison is made after both accesses are completed.
• This works if cache size ≤ page size (C ≤ P), because then none of the cache's input bits need to be translated.
• Key idea: page offset bits are not translated and thus can be presented to the cache immediately.

[Figure: the virtual address is VPN | L | b, where the "virtual index" L = C - b and the block offset b both lie within the P untranslated page-offset bits, indexing a direct-mapped cache of size 2^C = 2^(L+b). The VPN goes to the TLB in parallel; the TLB's PPN is then compared against the cache line's physical tag to decide a hit.]

Adapted from Arvind and Krste's MIT Course 6.823, Fall 2005.
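A sketch of why this works, with assumed parameters (4KB pages, 32-byte blocks, a 4KB direct-mapped cache, so C ≤ P holds): every bit of the cache index can be taken from the virtual address before translation.

```c
#include <stdint.h>

#define PAGE_BITS  12   /* assumed 4KB pages: P untranslated offset bits */
#define BLOCK_BITS  5   /* assumed 32B cache blocks: b                   */
#define CACHE_BITS 12   /* assumed 4KB direct-mapped cache: C, C <= P    */

/* Because CACHE_BITS <= PAGE_BITS, the index bits [C-1 : b] all lie in
 * the page offset, which is identical in the VA and the PA.  So the
 * cache can be indexed while the TLB is still translating the VPN. */
uint32_t cache_index(uint32_t vaddr)
{
    return (vaddr >> BLOCK_BITS) & ((1u << (CACHE_BITS - BLOCK_BITS)) - 1);
}
```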

Avoiding Aliasing in Large Caches

• The restrictions on cache size might be too painful.

• In this case, we need another mechanism to avoid aliasing.


Anti-Aliasing Using an Inclusive L2: MIPS R10000-style

[Figure: a virtually-indexed L1 cache is accessed in parallel with the TLB; behind it sits a direct-mapped, physically-addressed, inclusive L2. Each L2 line's tag stores the PPN together with the virtual-index bits "a" that locate the line in L1.]

• Suppose VA1 and VA2 both map to PA, and VA1 is already in L1 and L2 (VA1 ≠ VA2).
• After VA2 is resolved to PA, a collision will be detected in L2, because the a bits don't match.
• VA1 will be purged from L1 and L2, and VA2 will be loaded ⇒ no aliasing!
• Once again, this ensures the invariant that only one copy of a physical address is in the virtually-addressed L1 cache at any one time. The physically-addressed L2, which includes the contents of L1, holds the missing virtual-address bits that identify the location of the item in L1.
• (The L2 could be associative too; you would just need to check more entries.)

Adapted from Arvind and Krste's MIT Course 6.823, Fall 2005.

In the Real World

• L1 caches are virtually indexed, physically tagged.
• Lower levels are purely physical.
• Once you go physical, it is not possible (or desirable) to go back.


Power PC: Hashed Page Table

[Figure: the VPN of an 80-bit virtual address is hashed; the hash, added as an offset to the base of the table, gives the physical address of a slot in the page table in primary memory, where <VPN, PPN> pairs are stored.]

• Each hash-table slot has 8 PTEs <VPN, PPN> that are searched sequentially.
• If the first hash slot fails, an alternate hash function is used to look in another slot ("rehashing").
• All these steps are done in hardware!
• The hashed table is typically 2 to 3 times larger than the number of physical pages.
• The full backup page table is a software data structure.

Adapted from Arvind and Krste's MIT Course 6.823, Fall 2005.
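A sketch of the lookup in C, with hypothetical hash functions and slot layout (the real PowerPC format differs in its details):

```c
#include <stdint.h>

#define SLOT_PTES 8

typedef struct { uint64_t vpn, ppn; } pte;     /* a <VPN, PPN> pair */
typedef struct { pte e[SLOT_PTES]; } slot;

extern slot *table_base;                       /* base of the hashed table */
extern uint64_t primary_hash(uint64_t vpn);    /* hypothetical             */
extern uint64_t alternate_hash(uint64_t vpn);  /* hypothetical rehash      */

/* Returns the PPN, or -1 to fall back to the software page table. */
int64_t hashed_lookup(uint64_t vpn)
{
    uint64_t hashes[2] = { primary_hash(vpn), alternate_hash(vpn) };
    for (int h = 0; h < 2; h++) {
        slot *s = &table_base[hashes[h]];      /* base + hashed offset     */
        for (int i = 0; i < SLOT_PTES; i++)    /* search 8 PTEs in order   */
            if (s->e[i].vpn == vpn)
                return (int64_t)s->e[i].ppn;
    }
    return -1;   /* missed both slots: consult the software page table */
}
```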

Page Table With Pages on Disk

[Figure: the same two-level page table (10-bit L1 index p1, 10-bit L2 index p2, 12-bit offset; root in a processor register), but now an entry may point to a page in primary memory, point to a page on disk, or be the PTE of a nonexistent page. Data pages, and even the level-2 page tables themselves, can be on disk.]

Adapted from Arvind and Krste's MIT Course 6.823, Fall 2005.

The TLB With Disk

• TLB entries always point to memory, not disk.

The Value of Paging

• Disks are really, really slow.
• Paging is not very useful for expanding the active memory capacity of a system.
• It's good for "coarse-grain context switching" between apps.
• And for dealing with memory leaks ;-)
• As a result, fast systems don't page.


The Future of Paging

• Non-volatile, solid-state memories significantly alter the trade-offs for paging.
• NAND-based SSDs can be 10-100x faster than disk.
• Is paging viable now? In what circumstances?



Other Uses for VM

• VM provides us a mechanism for adding "metadata" to different regions of memory.
• The primary piece of metadata is the location of the data in physical RAM.
• But we can support other bits of information as well:
• Backing memory to disk (next slide)
• Protection (see the sketch after this list)
• Pages can be readable, writable, or executable.
• Pages can be cacheable or un-cacheable.
• Pages can be write-through or write-back.
• Other tricks
• Array bounds checking
• Copy-on-write, etc.
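As an illustration of the protection metadata, a sketch using POSIX mmap/mprotect on a Unix-like system:

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long psz = sysconf(_SC_PAGESIZE);               /* typically 4096 */

    /* Get one page of writable, page-aligned memory. */
    char *p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    strcpy(p, "hello");

    /* Flip the page's protection metadata to read-only.  A later
     * store through p would now fault (SIGSEGV); the hardware checks
     * these bits on every access via the TLB. */
    if (mprotect(p, psz, PROT_READ) != 0) { perror("mprotect"); return 1; }

    printf("%s\n", p);                              /* reads still work */
    munmap(p, psz);
    return 0;
}
```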

Heterogeneous Memory

[Figure: a single physical address space stitched together from fast/non-dense/volatile memory and slower/denser/non-volatile memory, with a buffer cache in front of really slow, really dense storage (disk/SSD). Several virtual address spaces (VA space 1, VA space 2) map onto it.]

• VM is responsible for combining different types of data into a single, coherent interface.

Universal Memory

• Fast (as DRAM), dense (as flash or better), non-volatile (as disk), reliable (as DRAM and disk), addressable (like DRAM).
• UM has wide-ranging systems implications:
• Disks as fast as DRAM
• Change the file system
• OS overhead will now dominate accesses
• Potentially blurred line between volatile and non-volatile state
• Disk looks like DRAM. Shouldn't we access it like DRAM (i.e., using loads and stores)?
• Need permanent naming and protection (file systems are good at this)
• Allocation in a file system is very expensive.
• What are the semantics of a non-volatile data structure?
• Probably need language support.

Non-volatile Data Structures

• If non-volatile storage is byte-addressable, we can apply everything we know about data structures, etc. to non-volatile data:
• No more serialization
• Radically redesigned databases (tables are built for disks)
• Reuse the tried-and-true programming models we use for volatile data.
• New problems:
• How do you deal with power failure in this model? Transactions.
• Do you have to trust user code?
• No. Use the FS to manage the namespace, and the TLB to enforce protection. The common case is access; the other stuff can be slower.

