Date post: | 10-Feb-2017 |
Category: |
Technology |
Author: | hajime-tazaki |
Linux Kernel Library: Reusing Monolithic Kernel
Hajime Tazaki (IIJ Innovation Institute)
2016/07
AIST seminar vol.2

LKL in a nutshell
- Linux kernel library
  - a library of Linux
  - Octavian Purdila (Intel)'s work (since 2007?)
  - Proposed on LKML (Nov. 2015)
  - 2809 LoC (as of Apr. 2016)
  - https://lwn.net/Articles/662953/
Purdila et al., LKL: The Linux kernel library, RoEduNet 2010.

LKL (cont'd)
- hardware-independent architecture (arch/lkl)
- provides an interface to the underlying environment
  - outsourced dependencies: clock, memory allocation, scheduler
  - runs on Windows, Linux, FreeBSD
- simplified I/O operation of devices
  - virtio host implementation
  - can use the (virtio) driver in Linux

Benefit
- less ossification of new features
  - operating system personality
  - userspace library has less deployment cost
- Well-matured code base
  - (e.g.) Linux kernel running in userspace
  - small kernel, a bunch of libraries
  - but in a different shape
"Any problem in computer science can be solved with another level of indirection." (Wheeler and/or Lampson)
img src: https://www.flickr.com/photos/thomasclaveirole/305073153

What is reusing monolithic kernel?
- Anykernel: originally in NetBSD rump kernel
  "We define an anykernel to be an organization of kernel code which allows the kernel's unmodified drivers to be run in various configurations such as application libraries and microkernel style servers, and also as part of a monolithic kernel." -- Kantee 2012.
- Using the (unmodified) high-quality code base of a monolithic kernel
  - on different environments
  - in different shapes
  - by gluing additional stuff
2 . 6
2 . 7
(a bitof)
Historyrump: 2007 (NetBSD)LKL: 2007 (Linux)DCE/LibOS: 2008 (Linux/FreeBSD)LibOS/LKL revival: 2015
LibOS merged to LKL
http://news.mynavi.jp/news/2015/03/25/285/https://news.ycombinator.com/item?id=9259292
http://www.phoronix.com/scan.php?page=news_item&px=Linux-Library-LibOShttp://lwn.net/Articles/639333/

LKL vs. LibOS
[Figure panels: LKL, LibOS]

LKL vs. LibOS (cont'd)
- LoC: arch/lkl (LKL) < arch/lib (LibOS)
  - diff: the amount of stub code
- commons
  - no modification to the original Linux code
  - description of kernel context (by POSIX thread)
  - outsourced resources (clock, memory, scheduler)
  - CPU independent architecture
- diffs
  - LibOS: implemented with higher API (timer, irq, kthread) by pthread
  - LKL: implements IRQ, kthread, timer with pthread in a lower layer

Implementation
Internals
1. Host backend (host_ops)
2. CPU independent arch. (arch/lkl)
3. Application interface

1. host backend
- environment-dependent part
- unifies the interface across different platforms (rump-hypercall like)
- device interface with virtio
  - block device: disk image
  - networking: TAP, raw socket, DPDK, VDE
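
The host backend idea can be sketched as a table of function pointers that the kernel-side code calls instead of touching the platform directly. The block below is a minimal illustration under stated assumptions: `struct host_ops_sketch`, its fields, and the `kernel_side_*` helpers are hypothetical names made up for this example and are not LKL's actual host operations structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, simplified host backend table (not the real LKL layout). */
struct host_ops_sketch {
    unsigned long long (*time_ns)(void);   /* outsourced clock */
    void *(*mem_alloc)(size_t size);       /* outsourced memory allocation */
    void (*mem_free)(void *ptr);
};

/* One possible backend built on the C standard library. */
static unsigned long long libc_time_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (unsigned long long)ts.tv_sec * 1000000000ULL
         + (unsigned long long)ts.tv_nsec;
}

static const struct host_ops_sketch libc_host_ops = {
    .time_ns = libc_time_ns,
    .mem_alloc = malloc,
    .mem_free = free,
};

/* A second backend with a fixed clock, to show that backends are swappable. */
static unsigned long long fixed_time_ns(void)
{
    return 42;
}

static const struct host_ops_sketch fake_host_ops = {
    .time_ns = fixed_time_ns,
    .mem_alloc = malloc,
    .mem_free = free,
};

/* "Kernel side" code only ever goes through the table. */
unsigned long long kernel_side_now(const struct host_ops_sketch *ops)
{
    assert(ops != NULL && ops->time_ns != NULL);
    return ops->time_ns();
}

void *kernel_side_alloc(const struct host_ops_sketch *ops, size_t size)
{
    assert(ops != NULL && ops->mem_alloc != NULL);
    return ops->mem_alloc(size);
}
```

Porting to another platform then amounts to supplying another table; the kernel-side callers stay unchanged.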

2. CPU independent architecture
- architecture (arch/lkl)
- transparent architecture bind (as CPU arch)
  - requires no modification to the other parts
- 2800 LoC
  - thread information (struct thread_info)
  - irq, timer, syscall handler
  - access to underlying layer by host_ops

3. Application interface
1. use exposed API (LKL syscall)
2. use host libc (LD_PRELOAD)
3. extend (alternative) libc

API 1: use exposed API (LKL syscall)
- call entry points of the LKL kernel
  - lkl_sys_open(), lkl_sys_socket()
- almost the same as ordinary syscalls
  - return value and errno notification are different
- can use LKL syscalls and host syscalls simultaneously
  - read an ext4 file by lkl_sys_read() => write into the host (Windows) by write()
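
The difference in return value and errno notification can be illustrated with a small adapter. This sketch assumes that an LKL entry point reports failure as a negative error code in its return value, whereas a libc wrapper returns -1 and sets `errno`. `fake_lkl_sys_open()` is a hypothetical stand-in written for this example, not the real `lkl_sys_open()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an LKL entry point such as lkl_sys_open().
 * Assumption: errors come back as a negative error code. */
static long fake_lkl_sys_open(const char *path)
{
    assert(path != NULL);
    if (path[0] == '\0')
        return -ENOENT;   /* kernel-style error: negative code, errno untouched */
    return 3;             /* pretend file descriptor */
}

/* Convert a kernel-style result into the libc convention. */
static long to_libc_result(long ret)
{
    if (ret < 0) {
        errno = (int)-ret;
        return -1;
    }
    return ret;
}

/* A caller that wants libc-like behaviour wraps the entry point. */
int open_like_libc(const char *path)
{
    return (int)to_libc_result(fake_lkl_sys_open(path));
}
```

On a non-Linux host the numeric error codes may not match the host's own `errno.h`, so a real wrapper would also need a translation step.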

API 2: hijack host standard library
- dynamically replace symbols of host syscalls (of libc)
  - LD_PRELOAD
  - socket() => lkl_sys_socket()
- can use host binary (executable) as-is
- limitation of replaceable symbols
- needs syscall translation on non-Linux host
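
Symbol replacement can be sketched as follows: an object that defines `socket()` with the libc signature takes precedence when it is resolved before libc, which is what LD_PRELOAD arranges for a shared library. This is a minimal sketch, not the actual hijack library. `fake_lkl_sys_socket()` and `hijack_calls` are hypothetical names for this example, and the negative-error-code convention is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

static int hijack_calls;   /* counts redirected calls, for demonstration only */

/* Hypothetical stand-in for lkl_sys_socket(); not the real LKL function. */
static long fake_lkl_sys_socket(int domain, int type, int protocol)
{
    (void)type;
    (void)protocol;
    if (domain != AF_INET && domain != AF_INET6)
        return -EAFNOSUPPORT;
    return 100 + hijack_calls;   /* pretend descriptor owned by the library kernel */
}

/* Same name and signature as libc's socket(). When this definition is
 * resolved first, unmodified callers end up here instead of in libc. */
int socket(int domain, int type, int protocol)
{
    long ret = fake_lkl_sys_socket(domain, type, protocol);

    assert(ret >= -4095);   /* Linux error codes fall in [-4095, -1] */
    hijack_calls++;
    if (ret < 0) {
        errno = (int)-ret;  /* translate to the libc convention */
        return -1;
    }
    return (int)ret;
}
```

Built as a shared object, such a file would typically be activated with `LD_PRELOAD=./libhijack.so program`, where `libhijack.so` is a placeholder name. Only symbols resolved through the dynamic linker can be replaced this way, which matches the limitation noted above.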

API 3: extend (alternative) libc
- only calls LKL syscalls, with our own libc
  - also introduced as a virtual CPU architecture
  - a program can link this instead of the host libc
- can't access (underlying) host resources directly via these LKL syscalls
- as a patch for musl libc

Use cases (applications)
- Use Case 1: instant kernel bypass
- Use Case 2: programs reusing kernel code in userspace
- Use Case 3: unikernel

Use Case 1: instant kernel bypass
- syscall redirection by LD_PRELOAD
- can use both LKL and host syscalls
- new features without touching the host kernel
  LD_PRELOAD=liblklsupertcp++.so firefox

Use Case 2: programs reusing kernel code in userspace
- use kernel code without porting
  - mount a filesystem w/o root privilege
- can use both LKL and host syscalls
- e.g., access a disk image in ext4 format on Windows
  1. open the disk image (CreateFile())
  2. mount (lkl_sys_mount())
  3. read a file in the disk image (lkl_sys_read())
  4. write a file to the Windows side (WriteFile())
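
Steps 3 and 4 above amount to a copy loop that reads through the library kernel and writes through the host. The sketch below shows only that loop. To stay self-contained it uses in-memory stand-ins: `struct mem_file`, `guest_read()` (in the role of `lkl_sys_read()`), and `host_write()` (in the role of `write()`/`WriteFile()`) are hypothetical helpers, not real LKL or Windows calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for both the file in the disk image and the host file. */
struct mem_file {
    char data[256];
    size_t len;   /* bytes of valid data */
    size_t pos;   /* read offset */
};

/* Plays the role of lkl_sys_read(): returns bytes read, 0 at end of file. */
static long guest_read(struct mem_file *f, void *buf, size_t count)
{
    size_t left = f->len - f->pos;
    size_t n = count < left ? count : left;

    memcpy(buf, f->data + f->pos, n);
    f->pos += n;
    return (long)n;
}

/* Plays the role of the host write call: returns bytes written or -1. */
static long host_write(struct mem_file *f, const void *buf, size_t count)
{
    if (f->len + count > sizeof(f->data))
        return -1;
    memcpy(f->data + f->len, buf, count);
    f->len += count;
    return (long)count;
}

/* Copy everything from the guest-side file to the host-side file. */
long copy_guest_to_host(struct mem_file *src, struct mem_file *dst)
{
    char buf[16];
    long total = 0;

    assert(src != NULL && dst != NULL);
    for (;;) {
        long n = guest_read(src, buf, sizeof(buf));
        if (n < 0)
            return n;    /* propagate a read error */
        if (n == 0)
            break;       /* end of file */
        if (host_write(dst, buf, (size_t)n) != n)
            return -1;   /* short or failed write */
        total += n;
    }
    return total;
}
```

In a real program the two sides would use different descriptor spaces: one owned by the library kernel, one by the host.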

Use Case 3: Unikernel
- single application containing LKL
  - python + LKL, nginx + LKL
  - only LKL syscalls available
- musl libc extension
- rump hypercall (frankenlibc)
- running on non-OS environments (on Xen Mini-OS via rumprun)
- Work in progress
- http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing-

Demos with Linux kernel library
- Unikernel on Linux (ping6 command with embedded kernel library)
- Unikernel on qemu-arm (hello world)

Kernel bypass / userspace networking

Network Stack
- Why in kernel space?
  - the cost of a packet was high in that era ('70s)
  - now much cheaper
- Getting fat (matured) after decades
  - code path is longer (and slower)
  - hard to add new features
  - faced unknown issues
img src: http://www.makelinux.net/kernel_map/

Alternate network stacks
- lwip (2002~)
- Arrakis [OSDI '14]
- IX [OSDI '14]
- MegaPipe [OSDI '12]
- mTCP [NSDI '14]
- SandStorm [SIGCOMM '14]
- uTCP [CCR '14]
- rumpkernel [ATC '09]
- FastSocket [ASPLOS '16]
- SolarFlare (2007~?)
- StackMap [ATC '16]
- libuinet (2013~)
- SeaStar (2014~)
- Snabb Switch (2012~)

Motivations
- Socket API sucks
  - StackMap, MegaPipe, uTCP, SandStorm, IX
  - New API: no benefit with existing applications
- Network stack in kernel space sucks
  - FastSocket, mTCP, lwip (SolarFlare?)
- Compatibility is (also) important
  - rumpkernel, libuinet, Arrakis, IX, SolarFlare
- Existing programming model sucks
  - SeaStar

Techniques
- batching (syscall/NIC access)
  - Arrakis, IX, MegaPipe, mTCP, SandStorm, uTCP
- Utilize feature-rich kernel stack
  - rumpkernel, fastsocket, StackMap
- Porting to userspace stack
  - libuinet, SandStorm
- Kernel bypass (userspace network stack)
  - mTCP, SandStorm, uTCP, rumpkernel, libuinet, lwip, SeaStar
- bypass technique itself
  - netmap, PF_RING, raw socket, Intel DPDK
- Connection locality (multi-core scalability)
  - SeaStar, MegaPipe, mTCP, fastsocket, .....

Implementation
- Full scratch
  - lwip (Arrakis, IX, SolarFlare?), mTCP, uTCP, SeaStar
- Porting based
  - libuinet, SandStorm
- New API
  - MegaPipe, StackMap
- Anykernel
  - rumpkernel, (LKL)

What's still missing?
- some solve problems by specialization
  - avoiding generality tax
  - performance w/ specialization vs. more features w/ generalization
  - e.g., fewer TCP stack features; a new API breaks support for existing applications
- specialized vs. generalized
  - generalization often involves indirection
  - indirection usually introduces complexity (Wheeler/Lampson)
- performant and generalized?
Performance study

Conditions
- ThinkStation P310 x2
  - CPU: Intel Core i7-6700 CPU @ 3.40GHz (8 cores)
  - Memory: 32GB
  - NIC: X540-T2
- Linux 4.4.6-301 (x86_64) on Fedora 23
  - Linux bridge (X540 + tap/raw socket)
  - no DPDK... can't with hijack, etc
- netperf (git ~v2.7.0)
  - netserver (native)
  - netperf (varied)

Conditions (cont'd)
- combinations
  - netperf (sendmmsg) + host stack (native)
  - + hijack library, native thread (hijack)
  - + frankenlibc/lkl, green thread (lkl-musl)
  - netperf (sendmmsg) + lkl extension + frankenlibc (lkl-musl (skb prealloc))
- pinned to a processor
  - using taskset command
- disable all offload features (tso/gso/gro, rx/tx cksum)

[Figure: TCP_RR (netperf)]
[Figure: UDP_STREAM (netperf)]
[Figure: UDP_STREAM (pps, netperf)]
[Figure: TCP_STREAM (netperf)]

(ref.) LibOS results (as of Feb. 2015)
- 1024 bytes UDP, own-crafted tool
- throughput: [Figure]

Observations (of benchmark)
- Native thread vs. Green thread
  - better TCP_RR w/ native thread (pthread)
  - better TCP_STREAM/UDP_STREAM w/ green thread ???
- avoiding dynamic allocation contributes a lot
- penalized over MTU-sized payload on host stack (?)
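
The slides do not show how the "skb prealloc" variant avoids dynamic allocation. As a generic illustration of the idea only, the sketch below keeps a fixed set of packet buffers on a free list, so the per-packet path never calls the allocator. All names (`pkt_pool`, `pkt_buf`, `pool_get`, `pool_put`) and the sizes are hypothetical and unrelated to the actual LKL code.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BUFS 4      /* illustrative pool depth */
#define BUF_SIZE  2048   /* illustrative buffer size, larger than a 1500-byte MTU */

struct pkt_buf {
    struct pkt_buf *next;            /* free-list link */
    unsigned char data[BUF_SIZE];
};

struct pkt_pool {
    struct pkt_buf bufs[POOL_BUFS];  /* storage reserved once, up front */
    struct pkt_buf *free_list;
};

/* Thread every buffer onto the free list. */
void pool_init(struct pkt_pool *p)
{
    assert(p != NULL);
    p->free_list = NULL;
    for (size_t i = 0; i < POOL_BUFS; i++) {
        p->bufs[i].next = p->free_list;
        p->free_list = &p->bufs[i];
    }
}

/* O(1) acquire; returns NULL when the pool is exhausted. */
struct pkt_buf *pool_get(struct pkt_pool *p)
{
    struct pkt_buf *b = p->free_list;

    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* O(1) release back to the pool. */
void pool_put(struct pkt_pool *p, struct pkt_buf *b)
{
    b->next = p->free_list;
    p->free_list = b;
}
```

The trade-off is a fixed memory footprint and an explicit exhaustion case that the caller has to handle.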

Summary
- Morphing a monolithic kernel into an Anykernel
- Various use cases
  - Userspace network stack (kernel bypass)
  - Unikernel
- Performance study in progress
https://github.com/lkl/linux

Reference
- Linux Kernel Library
  - Purdila et al., LKL: The Linux kernel library, RoEduNet 2010.
- Rumpkernel (dissertation)
  - Kantee, Flexible Operating System Internals: The Design and Implementation of the Anykernel and Rump Kernels, Ph.D. Thesis, 2012.
- Linux LibOS in general
  - Tazaki et al., Direct Code Execution: Revisiting Library OS Architecture for Reproducible Network Experiments, CoNEXT 2013.
(LibOS in general)
https://github.com/lkl/linux
http://libos-nuse.github.io/
https://lwn.net/Articles/637658/
Backups
Recent Updates

Updates (diff to lkl)
- (musl) libc integration
- rump hypercall interface
  - via frankenlibc tools (for POSIX environment)
  - via rumprun framework (for bare-metal/xen/kvm environment)
- more applications
  - netperf (signal handling, etc)
  - nginx
  - ghc (Haskell runtime)
- performance study

libc integration
- standard lib for LKL
  - all syscalls go directly to LKL
  - applications can use LKL transparently
  - no special modifications or hijack needed
- based on musl libc
  - introduces a new (sub) architecture, lkl

rump hypercall interface
- replacement of LKL host_ops
  - or yet another new host environment (rump)
- has two thread primitives
  - pthread-based (as LKL does)
  - ucontext-based (more efficient on non-MP)
- can reduce
  - the effort of host_ops maintenance
  - the complexity of a tall abstraction turtle
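
The ucontext-based primitive listed above can be pictured as cooperative (green) threads that switch in userspace. The sketch below is a generic illustration using the `getcontext`/`makecontext`/`swapcontext` family as provided by glibc on Linux. It is not the frankenlibc implementation, and `worker`, `record`, and `run_green_thread_demo` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)
#define TRACE_MAX  8

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[STACK_SIZE];
static int trace[TRACE_MAX];   /* order in which the steps ran */
static int trace_len;

static void record(int step)
{
    assert(trace_len < TRACE_MAX);
    trace[trace_len++] = step;
}

/* Body of the green thread: runs, yields once, then finishes. */
static void worker(void)
{
    record(1);
    swapcontext(&worker_ctx, &main_ctx);   /* cooperative yield */
    record(3);
    /* returning resumes uc_link, i.e. main_ctx */
}

/* Runs the demo and returns the number of recorded steps, or -1 on error. */
int run_green_thread_demo(void)
{
    trace_len = 0;
    if (getcontext(&worker_ctx) != 0)
        return -1;
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof(worker_stack);
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    record(0);
    swapcontext(&main_ctx, &worker_ctx);   /* run worker until it yields */
    record(2);
    swapcontext(&main_ctx, &worker_ctx);   /* resume worker until it returns */
    record(4);
    return trace_len;
}
```

No kernel thread is created and no lock is taken when switching. That is one plausible reason such a primitive is cheaper on a single processor, at the cost of not running threads in parallel.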

rump hypercall (cont'd)
- integration of
  - libc (musl for LKL, netbsd libc for rumpkernel)
  - rump hypercall (on linux, freebsd, netbsd, qemu-arm, spike)
  - host (platform) support code
- frankenlibc
  - has two namespaced libc(s)
  - hypercall implementation can use libc
- provides
  - a libc.a
  - cross-build toolchains (rumprun-cc, etc)

Usage
- build
  % ./configure CC=rumprun-cc
  % make
- execution (with rexec launcher)
  % rexec ./nginx disknginx.img tap:tap0 -- -c nginx.conf
  rexec executable [disk image file] [NIC] -- [executable specific options]

Codes
https://github.com/libos-nuse/lkl-linux
https://github.com/libos-nuse/musl
https://github.com/libos-nuse/frankenlibc
https://github.com/libos-nuse/rumprun
https://github.com/libos-nuse/nginx
https://github.com/libos-nuse/ghc