Date post: 15-Aug-2020
CS5460: Operating Systems Lecture: Virtualization 2 Anton Burtsev March, 2013
Page 1: CS5460: Operating Systems Lecture: Virtualization 2cs5460/slides/virt-lecture2.pdf · N I T O R A c t i v a t e I / O b i t m a p s U n c o n d i t i o n a l I / O H L T I N V L P

CS5460: Operating Systems

Lecture: Virtualization 2

Anton Burtsev
March, 2013

Paravirtualization: Xen

Full virtualization

● Complete illusion of physical hardware
● Trap _all_ sensitive instructions
● Example: page table update

[Figure: the virtualized OS executes a PTE update (mov), which traps to the hypervisor; the hypervisor runs if (safe) { update_pte(); emulate_mov(); } and the guest resumes at the next instruction]

Performance problems

[Figure: a PTE update (mov) in the virtualized OS traps to the hypervisor, which runs if (safe) { update_pte(); emulate_mov(); } and returns to the next instruction]

● Traps are slow
● Binary translation is faster
    ● For some events
    ● Not for PTE updates, why?
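The trap-and-emulate path for a page table update can be sketched as a toy model. This is only an illustration: the `Hypervisor` class, its method names, and the meaning of "safe" (the target frame belongs to the guest) are assumptions, not something the slides define.

```python
class Hypervisor:
    """Toy trap-and-emulate handler for guest page-table writes."""

    def __init__(self, owned_frames):
        self.owned_frames = set(owned_frames)  # machine frames this guest may map (assumed policy)
        self.page_table = {}                   # virtual page -> machine frame
        self.traps = 0

    def on_pte_write_trap(self, vpage, frame):
        """Called once for every trapped mov to a page-table entry."""
        self.traps += 1
        safe = frame in self.owned_frames      # the slide's "if (safe)" check
        if safe:
            self.page_table[vpage] = frame     # update_pte(); emulate_mov();
        return safe                            # guest then resumes at the next instruction


hv = Hypervisor(owned_frames={7, 8, 9})
results = [hv.on_pte_write_trap(v, f) for v, f in [(0, 7), (1, 8), (2, 42)]]
```

Every single PTE write costs one trap here, which is the overhead the slide points at.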

Paravirtualization

● No illusion of hardware
● Instead: paravirtualized interface
● Explicit hypervisor calls to update sensitive state
    – Page tables, interrupt flag
● But guest OS needs porting
● Applications run natively in Ring 3

Paravirtualization

[Figure: the paravirtualized OS batches PTE updates (update 1, update 2) and invokes the hypervisor once; the hypervisor applies each one: if (safe) update]
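Batching can be illustrated with a small model in which the guest queues PTE updates and enters the hypervisor once per batch. All names (`ParavirtHypervisor`, `Guest`, `batch_size`) and the safety check are hypothetical; this is not Xen's actual hypercall interface.

```python
class ParavirtHypervisor:
    def __init__(self, owned_frames):
        self.owned_frames = set(owned_frames)  # assumed notion of "safe"
        self.page_table = {}
        self.hypercalls = 0

    def update_batch(self, updates):
        """One hypervisor entry validates and applies a whole batch."""
        self.hypercalls += 1
        applied = 0
        for vpage, frame in updates:
            if frame in self.owned_frames:     # "if (safe) update"
                self.page_table[vpage] = frame
                applied += 1
        return applied


class Guest:
    def __init__(self, hypervisor, batch_size=4):
        self.hv = hypervisor
        self.batch_size = batch_size
        self.pending = []                      # queued (vpage, frame) updates

    def set_pte(self, vpage, frame):
        self.pending.append((vpage, frame))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.hv.update_batch(self.pending)
            self.pending = []


hv = ParavirtHypervisor(owned_frames=range(100))
guest = Guest(hv, batch_size=4)
for page in range(10):
    guest.set_pte(page, page + 10)
guest.flush()                                  # push out the last partial batch
```

Ten updates cost three hypervisor entries in this run instead of ten traps.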

Xen

Segmentation and paging

Hypervisor protection

Hardware support for virtualization: KVM

Basic idea

[Figure: the host instruction stream and the guest instruction stream alternate through VM Entry and VM Exit; the VMCS holds the Host State and the Guest State]

New mode of operation: VMX root

● VMX root operation
    ● 4 privilege levels
● VMX non-root operation
    ● 4 privilege levels as well, but unable to invoke VMX root instructions
    ● Guest runs until an event (for example, an exception) causes it to exit
    ● Rich set of exit events
    ● Guest state and exit reason are stored in the VMCS

Virtual machine control structure (VMCS)

● Guest State
    ● Loaded on entries
    ● Saved on exits
● Host State
    ● Written by the VMM ahead of time (on Intel VT-x the CPU does not save it on entries)
    ● Loaded on exits
● Control fields
    ● Execution controls, exit controls, entry controls
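How the two state areas are used across an entry/exit pair can be sketched with a toy CPU model. The `VMCS` and `CPU` classes and the register names are illustrative assumptions. In this sketch the VMM fills in the host-state area before the entry, which is how the Intel manual describes it.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)
    exit_reason: str = ""


class CPU:
    def __init__(self):
        self.regs = {"rip": 0x1000, "cr3": 0xA000}   # starts in VMX root (host)
        self.mode = "root"

    def vm_entry(self, vmcs):
        self.regs = dict(vmcs.guest_state)           # guest state loaded on entry
        self.mode = "non-root"

    def vm_exit(self, vmcs, reason):
        vmcs.guest_state = dict(self.regs)           # guest state saved on exit
        vmcs.exit_reason = reason                    # exit information recorded
        self.regs = dict(vmcs.host_state)            # host state loaded on exit
        self.mode = "root"


cpu = CPU()
vmcs = VMCS(guest_state={"rip": 0x7C00, "cr3": 0xB000},
            host_state={"rip": 0x2000, "cr3": 0xA000})  # VMM's exit handler
cpu.vm_entry(vmcs)
cpu.regs["rip"] += 2          # the guest runs for a while, then executes HLT
cpu.vm_exit(vmcs, "HLT")
```

After the exit the CPU is back in root mode at the VMM's handler, and the VMCS holds the guest's progress and the exit reason.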

Guest state

● Register state
● Non-register state
    ● Activity state:
        – active
        – inactive (HLT, shutdown, wait for Startup IPI (interprocessor interrupt))
    ● Interruptibility state

Host state

● Only register state
    ● ALU registers,
    ● also:
        ● Base page table address (CR3)
        ● Segment selectors
        ● Global descriptor table
        ● Interrupt descriptor table

VM-execution controls (asynchronous events control)

[Figure: 32-bit control field, Bit 31 to Bit 0, with reserved bits]

● External interrupts (maskable, or IRQs) cause exits (yes/no). If not, they are delivered through the guest IDT.
● NMIs cause exits (yes/no). If not, they are delivered normally through the guest IDT (descriptor 2).

VM-execution controls (synchronous events control, not all reasons are shown)

[Figure: 32-bit control field, Bit 31 to Bit 0, with reserved bits and control bits for PAUSE, MONITOR, Activate I/O bitmaps, Unconditional I/O, HLT, INVLPG]

Exception bitmap (one bit for each of the 32 IA-32 exceptions)

[Figure: 32-bit bitmap, Bit 31 to Bit 0; bit 14 is the page fault]

● IA-32 defines 32 exception vectors (interrupts 0-31)
● Each of them is configured to cause a VM exit or not
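The bitmap check can be sketched in a few lines. The helper names are made up for illustration, and the sketch ignores the extra error-code filtering that real hardware applies to page faults.

```python
PAGE_FAULT_VECTOR = 14  # from the slide: "14 – page fault"


def set_exit_on(bitmap, vector):
    """Return the bitmap with the bit for `vector` set (that exception causes a VM exit)."""
    if not 0 <= vector < 32:
        raise ValueError("IA-32 exception vectors are 0-31")
    return bitmap | (1 << vector)


def causes_vm_exit(bitmap, vector):
    """True if the bit for `vector` is set in the exception bitmap."""
    return bool((bitmap >> vector) & 1)


bitmap = set_exit_on(0, PAGE_FAULT_VECTOR)
```

With only bit 14 set, a guest page fault exits to the hypervisor while, for example, a general protection fault (vector 13) is delivered through the guest IDT.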

I/O Bitmaps

● The VMCS holds the addresses of two 4 KB memory areas (A and B)

[Figure: bitmaps A and B, with the safe I/O addresses (not causing exits) marked]
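A lookup in the two bitmaps can be sketched as follows, assuming the usual VT-x layout: bitmap A covers ports 0x0000-0x7FFF, bitmap B covers ports 0x8000-0xFFFF, and a set bit means the access causes a VM exit. The function name and the choice of port 0x80 as the "safe" port are hypothetical.

```python
BITMAP_BYTES = 4096                    # each bitmap is one 4 KB area
PORTS_PER_BITMAP = BITMAP_BYTES * 8    # 32768 ports per bitmap


def port_causes_exit(bitmap_a, bitmap_b, port):
    """True if the bit for `port` is set, i.e. an access to it causes a VM exit."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("x86 I/O ports are 16-bit")
    bitmap = bitmap_a if port < PORTS_PER_BITMAP else bitmap_b
    index = port % PORTS_PER_BITMAP
    return bool(bitmap[index // 8] & (1 << (index % 8)))


# Intercept everything, then mark a single port as safe by clearing its bit.
bitmap_a = bytearray(b"\xff" * BITMAP_BYTES)
bitmap_b = bytearray(b"\xff" * BITMAP_BYTES)
bitmap_a[0x80 // 8] &= ~(1 << (0x80 % 8)) & 0xFF
```

Starting from "exit on everything" and clearing bits for safe ports is the conservative default for a hypervisor.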

Exit information

● Information describing the conditions of a VM exit is saved in the VMCS
● It differs for different types of events

KVM

Memory virtualization: brute force

[Figure: hypervisor, hardware TLB, and the guest's page directory (PD) and page table (PT) reached through CR3]

● Helper structures describe the actual guest VM layout. They are maintained for each guest. On VM exit the hypervisor adjusts the guest page tables accordingly.
● The page table area is write/read protected. Every access results in a VM exit and passes control to the hypervisor.
● The CPU stores a pointer to the guest page table directory.

Memory virtualization: shadow page tables

[Figure: hardware TLB; the guest page table hierarchy (PD, PT) and the active page table hierarchy (PD, PT), with CR3 pointing to the active one]

● Active page table hierarchy: the VMM maintains it for each VM that it supports.
● Guest page table hierarchy: it is writable, but can be inconsistent with the active page table hierarchy stored by the hypervisor.
● The CPU stores a pointer to the active page table hierarchy. On Intel CPUs the TLB is always refilled from the active page table directory.
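The relationship between the two hierarchies can be sketched with flat dictionaries standing in for page tables. The names `guest_pt`, `p2m`, and `build_shadow` are assumptions made for this illustration; a real VMM updates the shadow table incrementally instead of rebuilding it.

```python
def build_shadow(guest_pt, p2m):
    """Compose guest virtual -> guest physical -> host physical into one active table."""
    shadow = {}
    for gva_page, gpa_page in guest_pt.items():
        if gpa_page in p2m:                    # ignore mappings to memory the guest does not own
            shadow[gva_page] = p2m[gpa_page]
    return shadow


guest_pt = {0x10: 0x1, 0x11: 0x2}              # guest's table: virtual page -> guest-physical page
p2m = {0x1: 0x80, 0x2: 0x81, 0x3: 0x82}        # hypervisor's map: guest-physical -> host-physical
shadow = build_shadow(guest_pt, p2m)           # the table CR3 really points to

guest_pt[0x12] = 0x3                           # the guest edits its own (writable) table
stale = 0x12 not in shadow                     # the active table is now inconsistent
shadow = build_shadow(guest_pt, p2m)           # the hypervisor resynchronizes, e.g. on a fault
```

The `stale` flag shows the window in which the guest hierarchy and the active hierarchy disagree, as described above.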

Nested page tables

[Figure: address spaces Guest Virtual (paged by gCR3 through gPT), Guest Physical (paged by hCR3 through hPT), Host Physical, and VMM Host Virtual (paged by the CR3 used by the VMM through PT). Translation can be cached in the TLB.]

Page table lookup

● 4-level page table

Nested page table lookup
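The cost of a nested (two-dimensional) walk on a TLB miss can be counted with a short function. It assumes no page-walk caches: every guest page-table pointer is a guest-physical address that first needs a full host walk. The function name is illustrative.

```python
def nested_walk_refs(guest_levels=4, host_levels=4):
    """Memory references for one guest-virtual to host-physical walk on a TLB miss."""
    refs = 0
    for _ in range(guest_levels):
        refs += host_levels   # translate the guest-physical address of this guest table
        refs += 1             # read the guest page-table entry itself
    refs += host_levels       # translate the final guest-physical address of the data
    return refs


native = 4                    # a native 4-level walk reads one entry per level
nested = nested_walk_refs(4, 4)
```

For 4-level guest and host tables this gives (4 + 1) × (4 + 1) − 1 = 24 references, compared with 4 for a native walk.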

Efficient I/O

Where is the bottleneck

● What is the bottleneck in case of virtualization?
    ● CPU?
        – CPU-bound workloads execute natively on the real CPU
        – Sometimes JIT compilation (binary translation) makes them even faster [Dynamo]
    ● Everything that is inside the VM is fast!
● What is the most frequent operation disturbing execution of a VM?
    ● Device I/O!
    ● Disk, network, graphics

Virtual devices in Xen

How to make the I/O fast?

● Take into account specifics of the device-driver communication
● Bulk
    – Large packets (512B – 4K)
● Session oriented
    – Connection is established once (during boot)
    – No short IPCs, like function calls
    – Costs of establishing an IPC channel are irrelevant
● Throughput oriented
    – Devices have high delays anyway
● Asynchronous
    – Again, no function calls, devices are already asynchronous

Shared rings and events

Shared rings
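A shared ring can be sketched as a fixed-size array with producer and consumer indices. The model below is loosely inspired by Xen's split-driver I/O rings, but the class, the field names, and the direct call that stands in for an event notification are all illustrative assumptions.

```python
class SharedRing:
    """Fixed-size ring shared by a frontend (guest driver) and a backend (device driver)."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0     # advanced by the frontend
        self.req_cons = 0     # advanced by the backend
        self.rsp_prod = 0     # advanced by the backend
        self.rsp_cons = 0     # advanced by the frontend

    def put_request(self, req):
        if self.req_prod - self.rsp_cons >= self.size:
            return False                           # full: old responses not consumed yet
        self.slots[self.req_prod % self.size] = req
        self.req_prod += 1
        return True

    def backend_poll(self, handle):
        """Backend consumes all pending requests and writes each response in place."""
        while self.req_cons < self.req_prod:
            i = self.req_cons % self.size
            self.slots[i] = handle(self.slots[i])
            self.req_cons += 1
            self.rsp_prod += 1

    def get_responses(self):
        out = []
        while self.rsp_cons < self.rsp_prod:
            out.append(self.slots[self.rsp_cons % self.size])
            self.rsp_cons += 1
        return out


ring = SharedRing(size=4)
accepted = [ring.put_request(("read", sector)) for sector in range(5)]
ring.backend_poll(lambda req: ("done", req[1]))    # one notification, many requests
responses = ring.get_responses()
```

Requests are queued asynchronously and handled in bulk, which matches the device-driver communication pattern listed earlier.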

Where is a performance bottleneck here?

Eliminate cache thrashing

GPUs

● Sending frames from the framebuffer
    ● No hardware acceleration
    ● Too slow
● OpenGL/DirectX level virtualization
    ● Send high-level OpenGL commands over rings
    ● OpenGL operations will be executed on the real GPU

Devices supporting virtualization

Some VM tricks: suspend/resume, checkpoints, migration

Suspend

Resume

Checkpoints

● Checkpoints are almost suspend/resume
● Except that a copy of the entire VM's state has to be saved
    ● Memory
        – OK, it's relatively small: 128MB-4GB
    ● Disk
        – Problem: disks are huge: 100GB-1TB
● How to save storage efficiently?
● How to make it efficient?

Branching storage

Branching storage: snapshot

Branching storage: writes

Branching storage: snapshot
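The figures for these slides did not survive extraction, so the following is only a guess at the idea: a minimal copy-on-write sketch in which a snapshot freezes the current image and later writes go to a new child branch. The `Branch` class and its methods are hypothetical.

```python
class Branch:
    """A disk image that stores only the blocks written since its parent snapshot."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}          # block number -> data written in this branch
        self.frozen = False

    def read(self, n):
        node = self
        while node is not None:   # fall back to older snapshots for unmodified blocks
            if n in node.blocks:
                return node.blocks[n]
            node = node.parent
        return b"\0"              # block that was never written

    def write(self, n, data):
        if self.frozen:
            raise RuntimeError("snapshots are read-only")
        self.blocks[n] = data

    def snapshot(self):
        """Freeze this branch and return a new, empty, writable child."""
        self.frozen = True
        return Branch(parent=self)


base = Branch()
base.write(0, b"boot")
base.write(1, b"data-v1")
live = base.snapshot()            # checkpoint: no blocks are copied
live.write(1, b"data-v2")         # later writes land in the child only
```

Taking the snapshot costs nothing proportional to the disk size; only blocks modified afterwards consume new space.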

Migration

● Migration is essentially a live checkpoint between machines
● The goal: minimal downtime
● How to make the checkpoint faster?

Migration: memory
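The slide figures are missing here, so this sketch assumes the iterative pre-copy scheme commonly used for live migration: memory is copied while the VM runs, pages dirtied in the meantime are resent, and the VM is paused only for the small remainder. The function, its parameters, and the dirty-page trace are made up for illustration.

```python
def precopy_migrate(pages, dirtied_per_round, stop_threshold=2, max_rounds=10):
    """Return (rounds, pages_sent_while_running, pages_sent_during_downtime).

    pages: set of all guest page numbers.
    dirtied_per_round: list of sets; entry i holds the pages the guest dirties during round i.
    """
    to_send = set(pages)
    sent_live = 0
    rounds = 0
    while len(to_send) > stop_threshold and rounds < max_rounds:
        sent_live += len(to_send)          # copy this set while the VM keeps running
        dirty = dirtied_per_round[rounds] if rounds < len(dirtied_per_round) else set()
        to_send = set(dirty)               # pages dirtied meanwhile must be resent
        rounds += 1
    return rounds, sent_live, len(to_send)  # the remainder is sent with the VM paused


result = precopy_migrate(
    pages=set(range(1000)),
    dirtied_per_round=[set(range(100)), set(range(10)), {1, 2}],
)
```

In this run three live rounds send 1110 pages in total, and only 2 pages remain to be copied during the downtime.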

Migration: storage

Migration

References

● Intel® 64 and IA-32 Architectures Software Developer's Manual. Volume 3C: System Programming Guide, Part 3

● Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, and Srilatha Manne. Accelerating two-dimensional page walks for virtualized systems. In ASPLOS'08.

