Issues and Challenges in Memory Virtualization
CS698E
Virtualization: Resource multiplexing with isolation
[Figure: a virtualized system; VM1 and VM2, each running applications on its own operating system, on top of a hypervisor that multiplexes CPU, memory, and I/O]
➔ Definition¹: “Not physically existing as such but made by software to appear to do so.”
➔ Core objectives
◆ Equivalence
◆ Isolation
◆ Resource control
◆ Efficiency
➔ Today’s lecture: Memory virtualization
1. Oxford Dictionary: https://en.oxforddictionaries.com/definition/virtual
Hang on! Isn’t memory already virtualized?
[Figure: Process-1 through Process-N, each with its own virtual address space 0..V, all mapped onto a single physical memory]
Virtual to physical
➔ Paging is a well-known V2P translation scheme
➔ Who builds the page table: s/w or CPU?
➔ Who walks the page table: s/w or CPU?
➔ What happens during context switch?
Example: 4-level page tables (48-bit virtual address)
[Figure: CR3 ← mm->pgd; the virtual address splits as 9 + 9 + 9 + 9 + 12 bits, indexing pgd_t (pgd_offset) → pud_t (pud_offset) → pmd_t (pmd_offset) → pte_t (pte_offset) → 4K physical frame]
Doubly virtualized memory!
[Figure: VM1 and VM2, each running processes P1 and P2 with their own virtual address spaces; each VM sees its own Guest Physical Address (GPA) space, which the hypervisor maps onto the Machine Physical Address (MPA) space. Where should the CPU's CR3 point?]
➔ Two levels of translation
◆ V to P
◆ P to M
➔ Two types of context switch
◆ Intra-VM
◆ Inter-VM
➔ Two sources of page fault
Shadow paging: Basic design
[Figure: VM1 runs P1 and P2; the CPU's CR3 points to the shadow page table of the running process
Guest PT P1 (VA → GPA): 1 → 3, 2 → 4, 3 → 7, 4 → X
Guest PT P2 (VA → GPA): 1 → 8, 2 → 1, 3 → 3, 4 → X
G to M (GPA → MPA): 1 → 3, 2 → 4, 3 → 5, 4 → X, 5 → 7, 6 → 8, 7 → 9, 8 → A, 9 → B
Shadow PT VM1:P1 (VA → MPA): 1 → 5, 2 → X, 3 → 9, 4 → X
Shadow PT VM1:P2 (VA → MPA): 1 → A, 2 → 3, 3 → 5, 4 → X]
Shadow paging: Basic design
[Figure: same tables as above]
Context switch from P1 to P2
Shadow paging: Basic design
[Figure: same tables; after the context switch, CR3 points to VM1:P2's shadow page table]
Shadow paging: Page fault handling
[Figure: same tables as above]
P1 accesses virtual address 2 → page fault → handled @ hypervisor
Shadow paging: Page fault handling
[Figure: the hypervisor allocates machine frame B for GPA 4 (G to M: 4 → B) and updates the shadow entry: VM1:P1 2 → B]
Shadow paging: Page fault handling
[Figure: same tables as the basic design]
P1 accesses VA 4 → page fault → who handles the page fault?
Shadow paging: Page fault handling
[Figure: the guest OS maps VA 4 → GPA 9 in P1's guest page table (Guest PT P1: 4 → 9); the shadow entry still has 4 → X]
How will the shadow page table be kept in sync?
Shadow paging: Page fault handling
[Figure: the shadow entry is re-synced: VM1:P1 4 → B (guest 4 → 9 composed with G to M 9 → B)]
Shadow paging: Good, Bad and Ugly!
➔ Assume a case where
◆ All virtual addresses are mapped
◆ No updates to page table mappings
◆ Shadow paging performance = ?
◆ TLB effectiveness?
➔ How many shadow page tables must be maintained?
◆ N VMs with M active processes each
➔ Assume a case with
◆ A lot of page faults (memory allocation and deallocation)
◆ Frequent context switches
H/W assisted paging: Basic design
[Figure: VM1 runs P1 and P2; gCR3 points to the active guest page table, while CR3 points to the G to M table walked by hardware
Guest PT P1 (VA → GPA): 1 → 3, 2 → 4, 3 → 7, 4 → X
Guest PT P2 (VA → GPA): 1 → 8, 2 → 1, 3 → 3, 4 → X
G to M (GPA → MPA): 1 → 3, 2 → 4, 3 → 5, 4 → X, 5 → 7, 6 → 8, 7 → 9, 8 → A, 9 → B]
Intel Extended Page Tables (EPT)¹ and AMD Nested Page Tables (NPT)²
1. https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3c-part-3-manual.pdf
2. http://developer.amd.com/wordpress/media/2012/10/NPT-WP-1%201-final-TM.pdf
H/W assisted paging: Page table management
[Figure: same tables as above]
➔ Intra-VM context switch does not require hypervisor involvement
➔ What about page fault handling?
H/W assisted paging: Page fault handling
[Figure: same tables as above]
P1 accesses virtual address 2; how is the page fault handled?
H/W assisted paging: Page fault handling
[Figure: the nested walk faults on GPA 4 (G to M: 4 → X); the hypervisor installs the mapping 4 → C]
Page fault handled at the hypervisor in a guest-OS-transparent manner
H/W assisted paging: Page fault handling
[Figure: same tables as the basic design]
P1 accesses virtual address 4; how is the page fault handled?
H/W assisted paging: Page fault handling
[Figure: the page fault is delivered to the guest OS, which maps VA 4 → GPA 9 in its own page table (Guest PT P1: 4 → 9)]
Page fault handled by the guest OS → no VMExit, no hypervisor involvement
Nested paging: Good, Bad and Ugly!
➔ Good
◆ No VMExits, no cumbersome page table sync
◆ P to M mapping fixed for all processes
➔ Bad
◆ What about TLB efficiency?
◆ One TLB entry in native system == ___ TLB entries in NPT/EPT
➔ Ugly
◆ Costly memory translation in case of a TLB miss
◆ How costly?
Nested page walk
[Figure: 2-level nested walk for VM1:P1: the guest walk (gCR3 → guest PT entry 1 → 3) interleaves with nested-table lookups (1' → 2, 2' → 2, 3' → 2; 1'' → 3, 2'' → 4, 3'' → 5), since every guest-physical reference must itself be translated via G to M]
2-level page table, e.g., 32-bit (10 + 10 + 12)
# of memory accesses for translation?
Nested page walk
2-level page table, e.g., 32-bit (10 + 10 + 12)
[Figure: the same nested walk, annotated with the memory accesses made for translation, numbered in access order]
Lost in translation? Here is a summary.
Memory Virtualization
Shadow Paging
➔ Near-native performance once the page table is in place and not updated
➔ Messy interactions across the two layers for correctness and isolation → overheads
EPT/NPT
➔ Everything is great when there are no TLB misses
➔ The nested walk becomes a bottleneck
◆ Page structure/walk caches
◆ L1/L2/LLC caches
Motivation for a hybrid scheme¹
➔ Neither technique is a clear winner.
➔ What is the solution? How to combine them?
1. Selective Hardware/Software Memory Virtualization, Wang et al., VEE 2011; figures used are from the paper
Dynamic Switching Paging (DSP)¹
[Figure: DSP sits between Shadow Paging and EPT/NPT, switching dynamically between the two]
➔ Dynamic switching
◆ How?
◆ When?
1. Selective Hardware/Software Memory Virtualization, Wang et al., VEE 2011
DSP: Switching challenges
➔ Where is the switch?
➔ Maintain relevant state while a paging mode is inactive
◆ In SP mode → maintain EPT structures
◆ In EPT mode → maintain SP structures
➔ Page table pages: RO ←→ RW
◆ Unanswered in the paper
➔ TLB flush or no TLB flush?
◆ Unanswered in the paper
◆ Carry the TLB forward → any issues?
DSP: Switching strategy
➔ Players in the game: TLB misses and VMExits
◆ Both monitored continuously
◆ In EPT mode
● How to monitor VMExits?
● VMExits ~ _________
➔ Cost (VMExits) vs. Cost (TLB misses)
➔ DSP uses thresholds, why?