CS5460: Operating Systems Lecture: Virtualization 2 Anton Burtsev March, 2013
Page 1: CS5460: Operating Systems Lecture: Virtualization 2cs5460/slides/virt-lecture2.pdf · Memory virtualization: brute force. Hypervisor TLB Hardware Guest PD CR3 PT Helper structures

CS5460: Operating Systems

Lecture: Virtualization 2

Anton Burtsev

March, 2013

Paravirtualization: Xen

Full virtualization

● Complete illusion of physical hardware
● Trap _all_ sensitive instructions
● Example: page table update

Diagram (Virtualized OS / Hypervisor): the virtualized OS performs a PTE update (mov); the write traps into the hypervisor, which runs if (safe) { update_pte(); emulate_mov(); } and resumes the guest at the next instruction.

Performance problems

● Traps are slow
● Binary translation is faster
  ● For some events
  ● Not for PTE updates, why?
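
A minimal Python sketch of the trap-and-emulate path, assuming a toy hypervisor. The class name, the ownership-based safety check, and the trap counter are hypothetical illustrations; only the if (safe) { update_pte(); emulate_mov(); } step comes from the slide. It shows where the cost comes from: every single PTE write pays for one trap.

```python
class Hypervisor:
    """Toy trap-and-emulate model: every guest PTE write traps here."""

    def __init__(self, guest_frames):
        self.guest_frames = set(guest_frames)  # frames the guest may map
        self.page_table = {}                   # virtual page -> frame
        self.traps = 0                         # one trap per PTE write

    def is_safe(self, frame):
        # Hypothetical safety check: the guest may only map its own frames.
        return frame in self.guest_frames

    def on_pte_write(self, vpage, frame):
        """Handle a trapped 'mov' to a page table entry."""
        self.traps += 1
        if self.is_safe(frame):
            self.page_table[vpage] = frame     # update_pte()
            return True                        # emulate_mov(), resume guest
        return False                           # unsafe update is rejected


hv = Hypervisor(guest_frames=range(100, 110))
results = [hv.on_pte_write(v, 100 + v) for v in range(4)]
rejected = hv.on_pte_write(9, 500)   # frame 500 is not owned by the guest
```

Five PTE writes cost five traps here, which is the overhead the slide points at.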

Paravirtualization

● No illusion of hardware
● Instead: paravirtualized interface
● Explicit hypervisor calls to update sensitive state
  – Page tables, interrupt flag
● But the guest OS needs porting
● Applications run natively in Ring 3

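Paravirtualization

Diagram (Paravirtualized OS / Hypervisor): instead of trapping on each PTE update, the paravirtualized OS batches updates (update 1, update 2) and invokes the hypervisor once; the hypervisor applies each update if (safe).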
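
The batching idea can be sketched in Python as follows. All names here (ParavirtHypervisor, apply_batch, GuestKernel) are hypothetical and do not correspond to the real Xen hypercall interface; the safety check is an assumed ownership test. The point is that the guest queues updates locally and enters the hypervisor once per batch.

```python
class ParavirtHypervisor:
    def __init__(self, guest_frames):
        self.guest_frames = set(guest_frames)  # frames the guest may map
        self.page_table = {}                   # virtual page -> frame
        self.hypercalls = 0

    def apply_batch(self, batch):
        """One hypervisor invocation validates and applies a whole batch."""
        self.hypercalls += 1
        applied = 0
        for vpage, frame in batch:
            if frame in self.guest_frames:     # if (safe) update
                self.page_table[vpage] = frame
                applied += 1
        return applied


class GuestKernel:
    def __init__(self, hv):
        self.hv = hv
        self.pending = []

    def set_pte(self, vpage, frame):
        self.pending.append((vpage, frame))    # queued, no hypervisor entry

    def flush(self):
        batch, self.pending = self.pending, []
        return self.hv.apply_batch(batch)


hv = ParavirtHypervisor(guest_frames=range(100, 110))
guest = GuestKernel(hv)
guest.set_pte(0, 100)    # update 1
guest.set_pte(1, 101)    # update 2
applied = guest.flush()  # single hypervisor invocation for both updates
```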

Xen

Segmentation and paging

Hypervisor protection

Hardware support for virtualization: KVM

Basic idea

Diagram: the host instruction stream switches to the guest instruction stream on VM Entry and back on VM Exit; the VMCS holds the Host State and the Guest State.

New mode of operation: VMX root

● VMX root operation
  ● 4 privilege levels
● VMX non-root operation
  ● 4 privilege levels as well, but unable to invoke VMX root instructions
● Guest runs until an event (for example, an exception) causes it to exit
  ● Rich set of exit events
  ● Guest state and exit reason are stored in the VMCS

Virtual machine control structure (VMCS)

● Guest State
  ● Loaded on entries
  ● Saved on exits
● Host State
  ● Written by the hypervisor before entries (the CPU does not save it on entry)
  ● Loaded on exits
● Control fields
  ● Execution controls, exit controls, entry controls
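
The state switching can be sketched in Python under simplifying assumptions: registers are a plain dict, the VMCS and CPU classes are hypothetical models, and the host-state area is assumed to have been filled in by the hypervisor beforehand. This is an illustration of the load/save directions, not of the real VMX instructions.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)
    exit_reason: str = ""


class CPU:
    def __init__(self):
        self.regs = {"rip": 0, "cr3": 0}
        self.non_root = False

    def vm_entry(self, vmcs):
        self.regs = dict(vmcs.guest_state)   # guest state loaded on entry
        self.non_root = True

    def vm_exit(self, vmcs, reason):
        vmcs.guest_state = dict(self.regs)   # guest state saved on exit
        vmcs.exit_reason = reason            # exit information for the host
        self.regs = dict(vmcs.host_state)    # host state loaded on exit
        self.non_root = False


vmcs = VMCS(guest_state={"rip": 0x1000, "cr3": 0x5000},
            host_state={"rip": 0xF000, "cr3": 0x9000})
cpu = CPU()
cpu.vm_entry(vmcs)
cpu.regs["rip"] += 2          # the guest executes an instruction
cpu.vm_exit(vmcs, "HLT")      # hypothetical exit reason label
```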

Guest state

● Register state
● Non-register state
  ● Activity state:
    – active
    – inactive (HLT, Shutdown, wait for Startup IPI (interprocessor interrupt))
  ● Interruptibility state

Host state

● Only register state
  ● ALU registers,
  ● also:
    ● Base page table address (CR3)
    ● Segment selectors
    ● Global descriptor table
    ● Interrupt descriptor table

VM-execution controls (asynchronous events control)

Figure: 32-bit control field, bit 31 down to bit 0; the bits not described here are reserved.

● External interrupts (maskable interrupts, or IRQs) cause exits (yes/no). If not, they are delivered through the guest IDT.
● NMIs cause exits (yes/no). If not, they are delivered normally through the guest IDT (descriptor 2).
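
A small Python sketch of how such a control word could be interpreted. The bit positions (bit 0 for external-interrupt exiting, bit 3 for NMI exiting) follow my reading of the pin-based VM-execution controls in the Intel manual and should be treated as assumptions to verify there; route_event is a hypothetical helper.

```python
EXTERNAL_INTERRUPT_EXITING = 1 << 0   # assumed bit position
NMI_EXITING = 1 << 3                  # assumed bit position


def route_event(controls, event):
    """Return 'vm_exit' or 'guest_idt' for 'external_interrupt' or 'nmi'."""
    masks = {
        "external_interrupt": EXTERNAL_INTERRUPT_EXITING,
        "nmi": NMI_EXITING,
    }
    return "vm_exit" if controls & masks[event] else "guest_idt"


# The hypervisor wants to see device interrupts but lets NMIs go to the guest.
controls = EXTERNAL_INTERRUPT_EXITING
irq_route = route_event(controls, "external_interrupt")
nmi_route = route_event(controls, "nmi")
```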

VM-execution controls (synchronous events control, not all reasons are shown)

Figure: 32-bit control field, bit 31 down to bit 0, with some bits reserved. Bits shown: PAUSE, MONITOR, Activate I/O bitmaps, Unconditional I/O, HLT, INVLPG.

Exception bitmap (one bit for each of the 32 IA-32 exceptions)

Figure: 32-bit field, bit 31 down to bit 0; bit 14 is the page fault.

● IA-32 defines 32 exception vectors (interrupts 0-31)
● Each of them is configured to cause or not cause a VM exit
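
As an illustration, the bitmap check can be written in a few lines of Python. The helper name is hypothetical, and the sketch deliberately ignores any extra filtering that real hardware may apply to page faults.

```python
PAGE_FAULT = 14  # exception vector of the page fault


def exception_causes_exit(bitmap, vector):
    """True if the bit for this exception vector is set in the 32-bit bitmap."""
    if not 0 <= vector < 32:
        raise ValueError("IA-32 exception vectors are 0-31")
    return bool((bitmap >> vector) & 1)


# Intercept page faults only; every other exception goes to the guest.
bitmap = 1 << PAGE_FAULT
pf_exits = exception_causes_exit(bitmap, PAGE_FAULT)
gp_exits = exception_causes_exit(bitmap, 13)
```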

I/O Bitmaps

● Two addresses of 4KB memory areas (bitmaps A and B)

Figure: bitmaps A and B, with the safe I/O addresses (not causing exits) marked.
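
A Python sketch of the lookup, assuming the layout I recall from the Intel manual: bitmap A covers ports 0x0000-0x7FFF, bitmap B covers 0x8000-0xFFFF, and a set bit means the access causes a VM exit. The helper names are hypothetical, and multi-byte port accesses are ignored.

```python
BITMAP_BYTES = 4096
PORTS_PER_BITMAP = BITMAP_BYTES * 8   # 32768 ports per 4KB bitmap


def io_causes_exit(bitmap_a, bitmap_b, port):
    """True if an access to this port exits to the hypervisor."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("x86 I/O ports are 16-bit")
    if port < PORTS_PER_BITMAP:
        bitmap, index = bitmap_a, port
    else:
        bitmap, index = bitmap_b, port - PORTS_PER_BITMAP
    return bool((bitmap[index // 8] >> (index % 8)) & 1)


def allow_port(bitmap, index):
    """Clear the bit so that the port becomes a safe address (no exit)."""
    bitmap[index // 8] &= ~(1 << (index % 8)) & 0xFF


# Start with every port intercepted, then pass one port through.
bitmap_a = bytearray(b"\xff" * BITMAP_BYTES)
bitmap_b = bytearray(b"\xff" * BITMAP_BYTES)
allow_port(bitmap_a, 0x3F8)
```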

Exit information

● Information describing the conditions of a VM exit is saved in the VMCS
● It differs for different types of events

KVM

Memory virtualization: brute force

Figure: Hypervisor, Guest (CR3, PD, PT), and Hardware (TLB), with these annotations:

● Helper structures describe the actual guest VM layout. Maintained for each guest. On VM exit the hypervisor adjusts the guest page accordingly.
● Write/read-protected page table area. Every access results in a VM exit and passes control to the hypervisor.
● The CPU stores a pointer to the guest page table directory.

Memory virtualization: shadow page tables

Figure: Guest (CR3, PD, PT), a second PD/PT hierarchy kept by the hypervisor, and Hardware (TLB), with these annotations:

● Active page table hierarchy. The VMM maintains it for each VM that it supports.
● Guest page table hierarchy. It is writable, but can be inconsistent with the active page table hierarchy stored by the hypervisor.
● The CPU stores a pointer to the active page table hierarchy. On Intel CPUs the TLB is always refilled from the active page table directory.
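
A toy Python model of the two hierarchies, under loud assumptions: single-level tables stored as dicts, a hypothetical guest_to_host frame map, lazy filling of the active table on a miss, and resynchronization only when the guest invalidates an entry. It is meant to show how the guest table can drift from the active one, not how any particular VMM implements shadowing.

```python
class ShadowMMU:
    def __init__(self, guest_to_host):
        self.guest_to_host = guest_to_host  # guest frame -> host frame
        self.guest_pt = {}                  # written freely by the guest
        self.active_pt = {}                 # what the hardware walks
        self.hidden_faults = 0

    def guest_write_pte(self, vpage, gframe):
        self.guest_pt[vpage] = gframe       # no trap: the tables may diverge

    def guest_invlpg(self, vpage):
        self.active_pt.pop(vpage, None)     # drop the stale active entry

    def translate(self, vpage):
        if vpage not in self.active_pt:     # miss in the active hierarchy
            self.hidden_faults += 1
            gframe = self.guest_pt[vpage]   # KeyError models a true guest fault
            self.active_pt[vpage] = self.guest_to_host[gframe]
        return self.active_pt[vpage]


mmu = ShadowMMU(guest_to_host={1: 71, 2: 72})
mmu.guest_write_pte(0, 1)
first = mmu.translate(0)     # filled from the guest table on first use
mmu.guest_write_pte(0, 2)    # guest changes its table without flushing
stale = mmu.translate(0)     # active table still holds the old mapping
mmu.guest_invlpg(0)          # guest flushes, hypervisor resynchronizes
fresh = mmu.translate(0)
```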

Nested page tables

Figure: Guest Virtual addresses are paged by gCR3 (gPT) into Guest Physical addresses, which are paged by hCR3 (hPT) into Host Physical addresses. VMM Host Virtual addresses are paged by CR3 (PT), the CR3 used by the VMM. The translation can be cached in the TLB.

Page table lookup

● 4-level page table

Nested page table lookup
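
A rough Python count of the memory references in a nested (two-dimensional) walk on a TLB miss, assuming no page-walk caching. The function is a hypothetical model: each guest page table level is addressed by a guest-physical address, which itself needs a full host walk before the guest entry can be read.

```python
def nested_walk_refs(guest_levels=4, host_levels=4):
    """Count memory references for one nested page table walk."""
    refs = 0
    for _ in range(guest_levels):
        refs += host_levels   # translate the guest-physical address of this gPT level
        refs += 1             # read the guest page table entry itself
    refs += host_levels       # translate the final guest-physical address
    return refs


native_refs = 4                      # plain 4-level walk, one reference per level
nested_refs = nested_walk_refs(4, 4)
```

With four guest levels and four host levels this gives 24 references instead of 4, which is why caching the combined translation in the TLB matters so much.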

Efficient I/O

Where is the bottleneck

● What is the bottleneck in case of virtualization?
  ● CPU?
    – CPU-bound workloads execute natively on the real CPU
    – Sometimes JIT compilation (binary translation) makes them even faster [Dynamo]
  ● Everything that is inside the VM is fast!
● What is the most frequent operation disturbing execution of the VM?
  ● Device I/O!
    ● Disk, Network, Graphics

Virtual devices in Xen

How to make the I/O fast?

● Take into account specifics of the device-driver communication
  ● Bulk
    – Large packets (512B – 4K)
  ● Session oriented
    – Connection is established once (during boot)
    – No short IPCs, like function calls
    – Costs of establishing an IPC channel are irrelevant
  ● Throughput oriented
    – Devices have high delays anyway
  ● Asynchronous
    – Again, no function calls, devices are already asynchronous

Shared rings and events

Shared rings
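
Since the ring figures did not survive extraction, here is a minimal single-producer, single-consumer ring in Python as a stand-in. It is a sketch under assumptions: free-running producer and consumer indices, a power-of-two size, and no event notification or memory barriers, all of which a real shared-memory ring between a guest and a driver domain would need. The class and method names are hypothetical.

```python
class SharedRing:
    def __init__(self, size=8):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.slots = [None] * size
        self.size = size
        self.prod = 0   # advanced only by the producer
        self.cons = 0   # advanced only by the consumer

    def push(self, item):
        if self.prod - self.cons == self.size:
            return False                        # ring is full
        self.slots[self.prod % self.size] = item
        self.prod += 1                          # publish after the slot is written
        return True

    def pop(self):
        if self.cons == self.prod:
            return None                         # ring is empty
        item = self.slots[self.cons % self.size]
        self.cons += 1                          # free the slot for reuse
        return item


ring = SharedRing(size=4)
accepted = [ring.push(f"req{i}") for i in range(5)]   # fifth push finds the ring full
drained = [ring.pop() for _ in range(2)]
retry = ring.push("req5")                              # space is available again
```

Because each index has exactly one writer, producer and consumer never need a function-call style handshake, which fits the asynchronous, throughput-oriented pattern described earlier.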

Where is a performance bottleneck here?

Eliminate cache thrashing

GPUs

● Sending frames from the framebuffer
  ● No hardware acceleration
  ● Too slow
● OpenGL/DirectX level virtualization
  ● Send high-level OpenGL commands over rings
  ● OpenGL operations will be executed on the real GPU

Devices supporting virtualization

Some VM tricks: suspend/resume, checkpoints, migration

Suspend

Resume

Checkpoints

● Checkpoints are almost suspend/resume
● Except that a copy of the entire VM's state has to be saved
  ● Memory
    – OK, it's relatively small: 128MB-4GB
  ● Disk
    – Problem: disks are huge: 100GB-1TB
● How to save storage efficiently?

Branching storage

Branching storage: snapshot

Branching storage: writes

Branching storage: snapshot
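
The figures for these slides are lost, so the following Python sketch assumes that branching storage means copy-on-write layers: a snapshot freezes the current layer and later writes go to a new child layer. The Branch class is hypothetical, and the read-only status of a frozen parent is a convention here, not something the code enforces.

```python
class Branch:
    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}                 # only blocks written in this branch

    def read(self, n):
        layer = self
        while layer is not None:         # fall through to older layers
            if n in layer.blocks:
                return layer.blocks[n]
            layer = layer.parent
        return b"\0"                     # block was never written

    def write(self, n, data):
        self.blocks[n] = data            # the parent's copy is left untouched

    def snapshot(self):
        """Treat this layer as frozen and return a new writable child."""
        return Branch(parent=self)


base = Branch()
base.write(0, b"boot")
base.write(1, b"data")
live = base.snapshot()       # checkpoint: nothing is copied
live.write(1, b"new")        # only the changed block is stored in the child
```

The snapshot costs no disk copies, and the child stores one block even though the disk has two, which is the storage saving a checkpoint of a 100GB-1TB disk depends on.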

Migration

● Migration is essentially a live checkpoint between machines
● The goal: minimal downtime
  ● How to make the checkpoint faster?

Migration: memory
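
The memory figures are missing from this transcript, so the Python sketch below assumes one common approach, iterative pre-copy: memory is copied while the guest keeps running, pages dirtied in the meantime are re-sent in further rounds, and the guest is paused only for the small remainder. The function, its parameters, and the threshold are hypothetical.

```python
def precopy_migrate(memory, writes_per_round, stop_threshold=1):
    """Simulate pre-copy migration.

    memory: dict page -> contents on the source (mutated by guest writes).
    writes_per_round: one dict of guest writes per copy round.
    Returns the destination memory and the number of pages sent while paused.
    """
    dest = {}
    to_send = set(memory)                 # first round: every page
    for writes in writes_per_round:       # the guest keeps running
        for page in to_send:
            dest[page] = memory[page]
        memory.update(writes)             # guest dirties pages meanwhile
        to_send = set(writes)             # only dirty pages need resending
        if len(to_send) <= stop_threshold:
            break
    # Stop-and-copy: the guest is paused, so nothing gets dirtied any more.
    for page in to_send:
        dest[page] = memory[page]
    return dest, len(to_send)


memory = {0: "a", 1: "b", 2: "c", 3: "d"}
writes = [{1: "b1", 2: "c1"}, {2: "c2"}]
dest, downtime_pages = precopy_migrate(memory, writes)
```

In this run only one of four pages is transferred while the guest is paused, which is the sense in which pre-copy serves the goal of minimal downtime.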

Migration: storage

Migration

References

● Intel® 64 and IA-32 Architectures Software Developer's Manual. Volume 3C: System Programming Guide, Part 3

● Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, and Srilatha Manne. Accelerating two-dimensional page walks for virtualized systems. In ASPLOS'08.

