Bare-Metal Performance for x86 I/O Virtualization

Muli Ben-Yehuda

Technion & IBM Research

HiPEAC Autumn Computing Systems Week in Barcelona

Muli Ben-Yehuda (Technion & IBM Research) Bare-Metal Perf. for I/O Virtualization HiPEAC CSW Nov, 2011 1 / 23

Background: x86 machine virtualization

Running multiple different unmodified operating systems
Each in an isolated virtual machine
Simultaneously
On the x86 architecture
Many uses: live migration, record & replay, testing, security, ...
Foundation of IaaS cloud computing
Used nearly everywhere

The problem is performance

Machine virtualization can reduce performance by orders of magnitude [Adams06, Santos08, Ram09, Ben-Yehuda10, Amit11, ...]
Overhead limits use of virtualization in many scenarios
We would like to make it possible to use virtualization everywhere
Where does the overhead come from?

The origin of overhead

Popek and Goldberg’s virtualization model [Popek74]: trap and emulate
Privileged instructions trap to the hypervisor
Hypervisor emulates their behavior
Traps cause an exit
I/O intensive workloads cause many exits
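
The trap-and-emulate loop can be illustrated with a toy model. This is a minimal sketch, not a real hypervisor: the instruction names, the `PRIVILEGED` set, and `run_guest` are all hypothetical. It only shows why an I/O-heavy instruction stream produces more exits than a compute-heavy one.

```python
# Toy model of trap-and-emulate. All names here are illustrative assumptions.
PRIVILEGED = {"in", "out", "hlt"}  # hypothetical set of trapping instructions


def run_guest(instructions):
    """Run a list of instruction names and return (exits, emulated).

    Unprivileged instructions run directly; privileged ones trap to the
    hypervisor, which emulates them. Each trap counts as one exit.
    """
    exits = 0
    emulated = []
    for insn in instructions:
        if insn in PRIVILEGED:
            exits += 1             # trap: control transfers to the hypervisor
            emulated.append(insn)  # hypervisor emulates the instruction
        # otherwise the instruction executes natively with no exit
    return exits, emulated


compute_heavy = ["add", "mul", "add", "out"]
io_heavy = ["out", "in", "out", "in"]
print(run_guest(compute_heavy)[0], run_guest(io_heavy)[0])
```

In this model the compute-heavy stream exits once and the I/O-heavy stream exits on every instruction, which is the effect the following slides try to eliminate.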

I/O virtualization via device emulation

[Figure: I/O path with device emulation, spanning GUEST and HOST. Labeled components: device emulation and two device drivers; steps 1 to 4.]

Emulation is usually the default [Sugerman01]
Works for unmodified guests out of the box
Very low performance, due to many exits on the I/O path

I/O virtualization via paravirtualized devices

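[Figure: I/O path with paravirtualized devices, spanning GUEST and HOST. Labeled components: front-end virtual driver, back-end virtual driver, and device driver; steps 1 to 3.]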

Hypervisor-aware drivers and “devices” [Barham03, Russell08]
Requires new guest drivers
Requires hypervisor involvement on the I/O path

I/O virtualization via device assignment

[Figure: I/O path with device assignment, spanning GUEST and HOST. Labeled component: device driver.]

Bypass the hypervisor on I/O path [Levasseur04, Ben-Yehuda06]
SR-IOV devices provide sharing in hardware
Better performance than paravirtual, but far from native

Comparing I/O virtualization methods

IOV method          throughput (Mb/s)   CPU utilization
bare-metal          950                 20%
device assignment   950                 25%
paravirtual         950                 50%
emulation           250                 100%

netperf TCP_STREAM sender on 1Gb/s Ethernet (16K msgs)
Device assignment best performing option
Device assignment still 25% worse than bare metal (25% vs. 20% CPU utilization at the same throughput). Why?

“The Turtles Project: Design and Implementation of Nested Virtualization”, Ben-Yehuda, Day, Dubitzky, Factor, Har’El, Gordon, Liguori, Wasserman and Yassour, OSDI ’10
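
The "25% worse" figure comes from the CPU utilization column, since throughput is identical for bare metal and device assignment. A small sketch of the arithmetic, using only the numbers in the table; the dictionary layout, the function name, and the choice to normalize CPU by throughput are illustrative assumptions:

```python
# (throughput in Mb/s, CPU utilization in percent), copied from the table
results = {
    "bare-metal": (950, 20),
    "device assignment": (950, 25),
    "paravirtual": (950, 50),
    "emulation": (250, 100),
}


def relative_cpu_cost(method, baseline="bare-metal"):
    """CPU utilization per Mb/s of throughput, relative to the baseline."""
    tput, cpu = results[method]
    base_tput, base_cpu = results[baseline]
    return (cpu / tput) / (base_cpu / base_tput)


for name in results:
    print(f"{name}: {relative_cpu_cost(name):.2f}x")
```

Device assignment comes out at 1.25x, that is, 25% more CPU per unit of throughput than bare metal.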

What does it mean, to do I/O?

Programmed I/O (in/out instructions)
Memory-mapped I/O (loads and stores)
Direct memory access (DMA)
Interrupts

Direct memory access (DMA)

All modern devices access memory directly
On bare-metal:
  A trusted driver gives its device an address
  Device reads or writes that address
Protection problem: guest drivers are not trusted
Translation problem: guest memory ≠ host memory
Direct access: the guest bypasses the host
What to do?

IOMMU

The IOMMU mapping memory/performance tradeoff

When does the host map and unmap translation entries?
Direct mapping up-front on virtual machine creation: all memory is pinned, no intra-guest protection
During run-time: high cost in performance
We want: direct mapping performance, intra-guest protection, minimal pinning
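
The tradeoff between the two mapping strategies can be made concrete with a toy model. This is a sketch under stated assumptions: `ToyIOMMU`, `HOST_BASE`, and both strategy functions are hypothetical, pages are plain integers, and "cost" is simply the number of map and unmap operations performed at run time.

```python
HOST_BASE = 1000  # hypothetical offset from guest page number to host page


class ToyIOMMU:
    """Toy IOMMU: a translation table from I/O virtual page to host page."""

    def __init__(self):
        self.table = {}  # iova page -> host page
        self.ops = 0     # map/unmap operations performed so far

    def map(self, iova, hpa):
        self.table[iova] = hpa
        self.ops += 1

    def unmap(self, iova):
        del self.table[iova]
        self.ops += 1

    def translate(self, iova):
        # A device access to an unmapped page is blocked (protection).
        if iova not in self.table:
            raise PermissionError(f"DMA to unmapped page {iova}")
        return self.table[iova]


def direct_mapping(guest_pages, dma_pages):
    """Map all guest memory at VM creation; return (run-time ops, pinned pages)."""
    iommu = ToyIOMMU()
    for page in range(guest_pages):
        iommu.map(page, HOST_BASE + page)
    setup_ops = iommu.ops
    for page in dma_pages:
        iommu.translate(page)
    return iommu.ops - setup_ops, len(iommu.table)


def runtime_mapping(dma_pages):
    """Map each buffer just before its DMA and unmap it right after."""
    iommu = ToyIOMMU()
    peak_pinned = 0
    for page in dma_pages:
        iommu.map(page, HOST_BASE + page)
        peak_pinned = max(peak_pinned, len(iommu.table))
        iommu.translate(page)
        iommu.unmap(page)
    return iommu.ops, peak_pinned


dma = [3, 7] * 200  # 400 DMA transfers to two recycled buffers
print(direct_mapping(1024, dma))  # no run-time ops, every guest page pinned
print(runtime_mapping(dma))       # two ops per transfer, one page pinned at a time
```

Direct mapping pays nothing at run time but pins every guest page and lets the device reach all of them; run-time mapping pins one page at a time but performs two operations per transfer.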

vIOMMU: efficient IOMMU emulation

Emulate an IOMMU so that we know when to map and unmap
Use a sidecore [Kumar07] for efficient emulation: avoid costly exits by running emulation on another core in parallel
Optimistic teardown: relax protection to increase performance by caching translation entries
vIOMMU provides high performance with intra-guest protection and minimal pinning

[Figure: vIOMMU architecture. Domains: Guest Domain, Emulation Domain (Sidecore), System Domain. Components: I/O device driver, IOMMU mapping layer, I/O buffer, emulated IOMMU registers, emulated PTE, IOMMU emulation, physical PTE, IOMMU, I/O device, memory. Numbered steps: (1) map/unmap, (2) update mappings, (3) IOTLB invalidation, (4) poll, (5) read, (6) update mappings, (7) IOTLB invalidations, (8) transaction to IOVA, (9) IOVA access, (10) translate, (11) physical access.]

“vIOMMU: Efficient IOMMU Emulation”, Amit, Ben-Yehuda, Schuster, Tsafrir, USENIX ATC ’11
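
The slide says only that optimistic teardown caches translation entries at the cost of relaxed protection. One way to picture the idea, assuming a simple bounded cache of deferred unmaps, is sketched here. The class name, the `max_stale` bound, and the eviction policy are assumptions made for illustration, not details taken from the paper.

```python
from collections import OrderedDict


class DeferredUnmapIOMMU:
    """Toy model: unmaps are deferred so a quick re-map can reuse the entry."""

    def __init__(self, max_stale=4):
        self.live = {}              # iova -> hpa, currently mapped
        self.stale = OrderedDict()  # iova -> hpa, unmapped but not yet torn down
        self.max_stale = max_stale  # hypothetical bound on deferred entries
        self.invalidations = 0      # costly IOTLB invalidations performed

    def map(self, iova, hpa):
        if self.stale.get(iova) == hpa:
            # Same mapping requested again: revive it without an invalidation.
            del self.stale[iova]
        elif iova in self.stale:
            # Stale entry points elsewhere: it must be torn down first.
            del self.stale[iova]
            self.invalidations += 1
        self.live[iova] = hpa

    def unmap(self, iova):
        # Optimistically keep the entry around instead of invalidating now.
        self.stale[iova] = self.live.pop(iova)
        if len(self.stale) > self.max_stale:
            self.stale.popitem(last=False)  # evict the oldest stale entry
            self.invalidations += 1

    def device_can_access(self, iova):
        # Relaxed protection: stale entries are still reachable by the device.
        return iova in self.live or iova in self.stale


iommu = DeferredUnmapIOMMU()
for _ in range(100):
    iommu.map(0x10, 0xABC)  # the driver keeps recycling the same buffer
    iommu.unmap(0x10)
print(iommu.invalidations, iommu.device_can_access(0x10))
```

In this model a recycled buffer causes no invalidations at all, and the price is visible in `device_can_access`: the device can still reach the buffer after the guest has unmapped it, until the stale entry is evicted.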

Problem solved?

netperf TCP_STREAM sender on 10Gb/s Ethernet with 256 byte messages
Using device assignment with direct mapping in the IOMMU
Only achieves 60% of bare-metal performance
Same results for memcached and apache

Where does the rest go?

Recap: doing I/O

Programmed I/O (in/out instructions)
Memory-mapped I/O (loads and stores)
Direct memory access (DMA)
Interrupts: approximately 49,000 interrupts per second with Linux

ELI: ExitLess Interrupts

[Figure: timelines of guest and hypervisor execution from physical interrupt to interrupt completion. Panels (a) to (d) cover baseline, ELI delivery, ELI delivery & completion, and bare-metal. Labeled events: physical interrupt, interrupt injection, interrupt completion.]

ELI: direct interrupts for unmodified, untrusted guests

“ELI: Bare-Metal Performance for I/O Virtualization”, Gordon, Amit, Har’El, Ben-Yehuda, Landau, Schuster, Tsafrir, ASPLOS ’12

ELI: delivery

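[Figure: ELI delivery. A physical interrupt arriving in the VM is looked up in the shadow IDT. An assigned interrupt hits a present entry (P=1) and reaches the guest's interrupt handler via the guest IDT. A non-assigned interrupt hits a non-present entry (P=0) and raises #NP, or falls beyond the IDTR limit and raises #GP; either one exits to the hypervisor.]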

All interrupts are delivered directly to the guest
Host and other guests’ interrupts are bounced back to the host
... without the guest being aware of it
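
The delivery scheme can be sketched as a lookup in a shadow table. This is a simplified model, not the ELI implementation: `build_shadow_idt`, `deliver`, the handler name, and the vector numbers are hypothetical, and it assumes the hypervisor has already arranged for #NP and #GP to cause exits.

```python
def build_shadow_idt(guest_idt, assigned_vectors, idtr_limit):
    """Return a shadow IDT as a list of (present, handler) entries.

    guest_idt: dict mapping vector -> guest handler name
    assigned_vectors: vectors that belong to the guest's assigned device
    idtr_limit: number of vectors the shadow IDT covers
    """
    shadow = []
    for vector in range(idtr_limit):
        if vector in assigned_vectors:
            shadow.append((True, guest_idt[vector]))  # P=1: direct delivery
        else:
            shadow.append((False, None))              # P=0: forces #NP
    return shadow


def deliver(shadow, vector):
    """Model what happens when a physical interrupt arrives in guest mode."""
    if vector >= len(shadow):
        return "exit:#GP"      # vector lies beyond the IDTR limit
    present, handler = shadow[vector]
    if not present:
        return "exit:#NP"      # non-assigned interrupt bounces to the host
    return f"guest:{handler}"  # assigned interrupt, no exit


shadow = build_shadow_idt({0x41: "nic_irq"}, {0x41}, idtr_limit=0x80)
for vec in (0x41, 0x30, 0xEF):
    print(hex(vec), deliver(shadow, vec))
```

Only the assigned vector reaches a guest handler; the other two take the #NP and #GP paths back to the hypervisor.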

ELI: signaling completion

Guests signal interrupt completions by writing to the Local Advanced Programmable Interrupt Controller (LAPIC) End-of-Interrupt (EOI) register
Old LAPIC: hypervisor traps load/stores to LAPIC page
x2APIC: hypervisor can trap specific registers
Signaling completion without trapping requires x2APIC
ELI gives the guest direct access only to the EOI register
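
In x2APIC mode the APIC registers are accessed as MSRs in the range 0x800 to 0x8FF, with EOI at 0x80B, which is what makes per-register trapping possible. The sketch below models the resulting intercept policy with a plain Python set standing in for a hardware MSR bitmap; the function names are hypothetical and this is not the ELI code.

```python
X2APIC_MSR_BASE = 0x800  # x2APIC registers occupy MSRs 0x800-0x8FF
X2APIC_MSR_END = 0x8FF
X2APIC_EOI = 0x80B       # End-of-Interrupt register in x2APIC mode


def build_write_intercepts():
    """Return the set of x2APIC MSRs whose writes trap to the hypervisor."""
    intercepted = set(range(X2APIC_MSR_BASE, X2APIC_MSR_END + 1))
    intercepted.discard(X2APIC_EOI)  # only EOI is passed through to the guest
    return intercepted


def guest_wrmsr(msr, intercepted):
    """Model a guest MSR write: 'exit' if trapped, 'direct' otherwise."""
    return "exit" if msr in intercepted else "direct"


intercepts = build_write_intercepts()
print(guest_wrmsr(X2APIC_EOI, intercepts))  # EOI write goes straight to hardware
print(guest_wrmsr(0x830, intercepts))       # any other APIC register still traps
```

Every other APIC register remains trapped, so the guest gains exactly one capability: completing interrupts without an exit.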

ELI: threat model

Threats: malicious guests might try to:
  keep interrupts disabled
  signal invalid completions
  consume other guests’ or host interrupts

ELI: protection

VMX preemption timer to force exits instead of timer interrupts
Ignore spurious EOIs
Protect critical interrupts by:
  Delivering them to a non-ELI core if available
  Redirecting them as NMIs → unconditional exit
  Use IDTR limit to force #GP exits on critical interrupts

Bare-metal Performance for I/O Virtualization

Throughput is scaled so 100% means bare-metal throughput
All workloads reach 97–100% of bare metal with ELI!
CPU is saturated; host uses huge pages to back guest memory
Full experimental details and analysis in ASPLOS paper

Conclusion

IOMMUs take the host out of the DMA path
ELI takes the host out of the interrupt path
Achievement unlocked: bare-metal performance for x86 VMs

Thank you! Questions?
