Resource Management in Software Programmable Router OS
David Yau
System Software and Architecture Lab
Department of Computer Sciences
Purdue University
http://ssal.cs.purdue.edu
http://www.cs.purdue.edu/people/yau
Motivations
  More sophisticated network contents
    application-specific transport
  More demanding network users
  Value-added services
    intelligent congestion control/packet repair
    security (copyright, intrusion detection)
    packet combining without losing information
    ...
Existing Networks
  [Figure labels: client; router: simple forwarding; ISP server]
Value-added Services Networks
  [Figure labels: client; router: processing + forwarding; Code server; Encryption; Intelligent congestion control; ISP]
Challenges
  Heterogeneous users
    needs, priorities, purchased shares
  Untrusted programs
    greedy, buggy, malicious, …
  Diverse resources
    space-shared, time-shared
  Diverse resource bindings
    multi-processes, multi-threads, multiplexed threads
The CROSS Approach
  Virtualized router resources
    virtual machines installed by service providers
  Resource allocation objects as first-class citizens in kernel
  Flexible/scalable packet classification
    resource binding, per-flow processing
  Efficiency, modularity, configurability
Resource Virtualization
  Hierarchical scheduling
    virtual machines with different APIs
    user allocations on demand
  Target resource types
    CPU time
    network bandwidth
    memory pool capacity (virtual memory)
    disk bandwidth
Resource Abstraction
  Kernel resource allocation objects to account for resource use
  Independent/orthogonal objects relative to resource consumers
  Flexible bindings to resource consumers
    shared binding
    dynamic binding (with run-time information)
    configurable parameters
Resource Allocations
  [Figure labels: CPU scheduler, Disk scheduler, Network scheduler, Memory scheduler; Resource allocation; bind, Shared binding; request; select, Logical multiplexing function]
Resource Allocation API
  Create/delete
    named by object system-wide key
  Bind/unbind
    affect calling thread/process
    key to fine-grained resource management
  Control
    change scheduling parameters, owner, …
  User-level access through pseudo-device
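The create/delete, bind/unbind, and control operations listed on this slide can be illustrated with a small user-level model. This is only a sketch under stated assumptions: the class `AllocationTable`, its method names, and the `rate` parameter are hypothetical and do not reproduce the CROSS kernel or pseudo-device interface.

```python
import threading


class AllocationTable:
    """Toy model of resource allocation objects named by a system-wide key.

    All names here are hypothetical; this is not the CROSS interface.
    """

    def __init__(self):
        self._allocs = {}    # key -> dict of scheduling parameters
        self._bindings = {}  # thread id -> key of the bound allocation

    def create(self, key, **params):
        if key in self._allocs:
            raise KeyError(f"allocation {key!r} already exists")
        self._allocs[key] = dict(params)

    def delete(self, key):
        if key in self._bindings.values():
            raise RuntimeError(f"allocation {key!r} still has bound threads")
        del self._allocs[key]

    def bind(self, key):
        """Bind the calling thread; several threads may share one allocation."""
        if key not in self._allocs:
            raise KeyError(key)
        self._bindings[threading.get_ident()] = key

    def unbind(self):
        """Remove the calling thread's binding, if any."""
        self._bindings.pop(threading.get_ident(), None)

    def control(self, key, **params):
        """Change scheduling parameters of an existing allocation."""
        self._allocs[key].update(params)

    def current(self):
        """Return the key bound to the calling thread, or None."""
        return self._bindings.get(threading.get_ident())

    def params(self, key):
        return dict(self._allocs[key])
```

Because bind and unbind act on the calling thread only, a program can move individual threads between allocations without touching the allocation objects themselves, which is the sense in which the slide calls binding the key to fine-grained resource management.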
Packet Forwarding
  Three possibilities
  Active program dispatch
    trusted (kernel thread), untrusted (user process)
  Per-flow processing
    subscribed by dispatched router programs
    security processing, application-level routing
  Cut-through fast path
    minimal delay
Packet Forwarding Decision
  Based on packet header information
  Packet classification
    scalable to many dimensions
    scalable to many classification rules
    flexible: support multiple and least-cost matches
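To make "multiple and least-cost matches" concrete, here is a minimal sketch of a classifier that returns every matching rule together with the cheapest one. It uses a plain linear scan, which is an assumption made for brevity and is not the scalable classification structure the slides refer to; the `Rule` type, its `cost` field, and the range encoding of header fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    cost: int
    # One inclusive (low, high) range per header field; None is a wildcard.
    # An exact match on value v is written as (v, v).
    fields: Tuple[Optional[Tuple[int, int]], ...]

    def matches(self, header):
        return all(
            f is None or f[0] <= value <= f[1]
            for f, value in zip(self.fields, header)
        )


def classify(rules, header):
    """Return (least-cost match or None, list of all matches)."""
    matches = [r for r in rules if r.matches(header)]
    best = min(matches, key=lambda r: r.cost, default=None)
    return best, matches
```

For example, with a wildcard default rule of cost 100 and a port-80 rule of cost 10, a packet to port 80 matches both rules and the port-80 rule is selected as the least-cost match.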
CROSS Forwarding Paths
  [Figure labels: Input queues; Resource allocation manager; Function dispatcher; Cut-through; subscribe; dispatch; Active packet; send; Per-flow processing; Output network queues]
Resource Binding Decision
  Active packet starts router program
  Program must run with resource allocation
    which allocation?
    retrieved as part of packet classification
      request to create new allocation
      request to use existing allocation with given key
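The two kinds of binding request named on this slide can be sketched as follows. The request encoding (`("create", params)` and `("use", key)`), the function name `resolve_allocation`, and the integer key generator are assumptions made for illustration; the slides do not specify how the request is represented.

```python
import itertools

# Hypothetical source of fresh system-wide keys.
_next_key = itertools.count(1)


def resolve_allocation(allocations, request):
    """Return the key of the allocation a dispatched program should run with.

    `allocations` maps key -> parameters. `request` is the (hypothetical)
    binding request retrieved as part of packet classification:
      ("create", params) asks for a new allocation,
      ("use", key) asks for an existing allocation with the given key.
    """
    kind, arg = request
    if kind == "create":
        key = next(_next_key)
        while key in allocations:  # skip keys that are already taken
            key = next(_next_key)
        allocations[key] = dict(arg)
        return key
    if kind == "use":
        if arg not in allocations:
            raise KeyError(f"no allocation with key {arg!r}")
        return arg
    raise ValueError(f"unknown binding request {kind!r}")
```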
System Integration
  Leverage against Solaris gateway OS
    support for existing applications
    immediate access to software development platform
  Implication
    need to work with existing Solaris abstractions
      threads/processes, stream buffers, page frames, buffer header structures, ...
CPU Scheduler
  Hierarchical partitioning using fair service curves [MMCN 2001]
  Decoupled delay and rate allocation
    good for low delay and low rate applications
  Solution to priority inversion
    lock contention and client/server interaction
  Performance
    rate/delay guarantees, proportional sharing, minimized unfairness
Service Curve
  [Figure: CPU time promised (vertical axis) versus time since thread wakes up (horizontal axis). Linear curve = rate 0.3. Convex curve labeled 0.2 and 0.5. Concave curve labeled 0.5 and 0.2. Other labels: C, D, D']
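A two-piece linear service curve can be evaluated with a few lines of code, which shows why a concave curve suits low-delay, low-rate work. This is a sketch, not the scheduler itself: it assumes the figure's labels 0.5 and 0.2 are segment slopes, and the inflection point `d` is a hypothetical parameter chosen for the example.

```python
def service_curve(t, m1, d, m2):
    """CPU time promised by time t after wake-up for a two-piece linear curve.

    Slope m1 applies for the first d time units, slope m2 afterwards.
    """
    if t <= d:
        return m1 * t
    return m1 * d + m2 * (t - d)


def time_to_receive(work, m1, d, m2):
    """Smallest t at which the curve promises `work` units of CPU time."""
    if work <= m1 * d:
        return work / m1
    return d + (work - m1 * d) / m2
```

With `d = 10`, a concave curve (`m1 = 0.5`, `m2 = 0.2`) promises 4 units of CPU time within 8 time units, while a linear curve at the same long-term rate of 0.2 needs 20 and a convex curve (`m1 = 0.2`, `m2 = 0.5`) needs 14. The long-term rate is unchanged in the concave case; only the delay after wake-up improves.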
CPU Sharing Hierarchy
  [Figure labels: CPU; VM foo, VM bar, VM doe; User allocation A, User allocation B; bind; thread; process]
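One way to read a sharing hierarchy like this is to flatten it into absolute CPU fractions per leaf. The sketch below does that for plain proportional sharing, which is a simplification and not the fair service curve algorithm cited in the slides. The weights, and the placement of user allocations A and B under VM foo, are hypothetical.

```python
def effective_shares(node, share=1.0, path=""):
    """Flatten a weighted hierarchy into absolute CPU fractions per leaf.

    `node` is (name, weight, children). Siblings split their parent's
    share in proportion to their weights.
    """
    name, _weight, children = node
    full_name = f"{path}/{name}" if path else name
    if not children:
        return {full_name: share}
    total = sum(child[1] for child in children)
    shares = {}
    for child in children:
        shares.update(
            effective_shares(child, share * child[1] / total, full_name)
        )
    return shares


# Hypothetical weights for the hierarchy in the figure.
tree = ("CPU", 1, [
    ("VMfoo", 2, [("A", 1, []), ("B", 1, [])]),
    ("VMbar", 1, []),
    ("VMdoe", 1, []),
])
shares = effective_shares(tree)
```

Here VM foo holds half of the CPU, so user allocations A and B each end up with a quarter, the same as VM bar and VM doe.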
Memory Scheduler
  Guaranteed share per allocation
    minimum number of page frames that allocation can map simultaneously
  Guaranteed-share scanner algorithm
    consider pages for replacement in decreasing over-allocation order
    second chance to referenced pages
    allow reserved but unused pages to be utilized
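The scanner's two stated rules, decreasing over-allocation order and a second chance for referenced pages, can be sketched as below. The data layout (a dict of allocations, each with a `guaranteed` count and a list of `[page_id, referenced]` entries) and the function name `pick_victim` are assumptions for illustration; the real scanner works on kernel page frames.

```python
def pick_victim(allocations):
    """Pick a page to replace.

    `allocations` maps name -> {"guaranteed": int,
                                "pages": [[page_id, referenced], ...]}.
    Returns (name, page_id), or None if no allocation exceeds its share.
    """
    over = [
        (len(a["pages"]) - a["guaranteed"], name)
        for name, a in allocations.items()
        if len(a["pages"]) > a["guaranteed"]
    ]
    over.sort(reverse=True)  # decreasing over-allocation order
    for _pass in range(2):   # the second pass sees cleared reference bits
        for _excess, name in over:
            pages = allocations[name]["pages"]
            for i, page in enumerate(pages):
                if page[1]:
                    page[1] = False  # second chance: clear bit, skip for now
                else:
                    del pages[i]
                    return name, page[0]
    return None
```

Allocations at or below their guaranteed share are never scanned, so their pages are safe. An allocation may still grow beyond its share when frames are free, which is how reserved but unused pages get used in this model.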
Memory Allocation
Disk Scheduler
  Provide proportional sharing
    conflict with efficiency goal to minimize seek time overhead
    notion of eligibility to balance between the two goals, using tolerance parameter
  Integrated with file systems
    problem: applications do not access disk directly!
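The slide does not define eligibility, so the following is only one plausible reading, stated as an assumption: a request is eligible when its allocation's normalized service (service received divided by rate) is within `tolerance` of the most underserved allocation, and among eligible requests the one closest to the disk head is served. The function name, the `(virtual_time, block)` encoding, and the use of block distance as a stand-in for seek time are all hypothetical.

```python
def pick_request(pending, head, tolerance):
    """Choose the next disk request.

    `pending` is a list of (virtual_time, block) pairs, where virtual_time
    is the normalized service (service received / rate) of the request's
    allocation. Requests whose virtual time is within `tolerance` of the
    smallest one are eligible; the eligible request nearest the head wins.
    """
    if not pending:
        return None
    min_vt = min(vt for vt, _block in pending)
    eligible = [(vt, block) for vt, block in pending if vt - min_vt <= tolerance]
    # Nearest block first; ties go to the more underserved allocation.
    return min(eligible, key=lambda r: (abs(r[1] - head), r[0]))
```

Under this reading, a tolerance of zero gives strict proportional sharing regardless of seek distance, and a very large tolerance degenerates to shortest-seek-first, so the parameter trades fairness against seek overhead.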
File System Disk Access
  Resource allocations bound to resource consumers (i.e., threads/processes)
  Threads/processes execute read/write/mmap system calls
    disk accesses avoided unless file system page faults
  Page faults occur in interrupt context!
  Solution: Association Map on vnode/offset
Association Map
  [Figure labels: USER; KERNEL; Resource principal; Upper-half file calls: read/write/mmap; Vnode/offset to allocation mapping; Association map; Page fault; Vnode/offset; Lookup allocation; Disk server]
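The idea of the association map is to record, while the calling thread is still known, which allocation should pay for a later disk access to a given vnode/offset. A minimal model, assuming a dictionary keyed by `(vnode, offset)` and a hypothetical default allocation for unmapped pages:

```python
class AssociationMap:
    """Maps (vnode, offset) to the allocation that should be charged.

    A toy model; names and the default-allocation fallback are hypothetical.
    """

    def __init__(self, default=None):
        self._map = {}
        self._default = default

    def record(self, vnode, offset, allocation):
        """Called from the upper-half read/write/mmap path, where the
        calling thread (and hence its allocation) is known."""
        self._map[(vnode, offset)] = allocation

    def lookup(self, vnode, offset):
        """Called when a page fault is serviced, where no thread context
        is available; falls back to the default allocation."""
        return self._map.get((vnode, offset), self._default)
```

The disk server can then call `lookup` with the faulting vnode/offset and schedule the request against the returned allocation.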
System Implementation
  Extension to Solaris 2.5.1
  Deployed on UltraSPARC/Pentium network
    Ethernet, Fast Ethernet, Myrinet
  Modular subsystems with well-defined interfaces
  Simple command interfaces to launch legacy applications
Basic Costs
  Resource Allocation control
    create
    delete
    bind/unbind
  Function dispatch
    thread: about 145 microseconds, low variance
    process: 0.77 to 1.1 ms, application-dependent
Resource Allocation Costs (microseconds)
  Operation        kernel  user
  Bind             4.8     9.0
  Unbind           2.4     6.6
  Create + delete  15.4    19.6
Packet Classification
  Five dimensions
    exact, prefix, range, wildcard
  Database size up to 256 K rules
  Average lookup cost of 7.8 microseconds
    1.1 Gb/s for 1000 byte packets
  Add/delete 10.8/14.9 microseconds
    67,000 updates per second
Packet Classifier Performance
CPU/Network Scheduling
  Network respond application
    driven by received packets
    do some CPU computation, send some network data out
  Total delay budget of 3.5 seconds
    CPU one second, network 2.5 seconds
    CPU two seconds, network 1.5 seconds
  Allow both rate and delay compositions
Rate Composition
  Udpburst CPU rate  Greedy CPU rate  Achieved bandwidth (Mb/s)
  5%                 95%              3.6
  10%                90%              7.8
  15%                85%              9.8
  20%                80%              9.8
Delay Composition (microseconds)
  Run  Mean CPU delay  S.d. CPU delay  Mean net delay  S.d. net delay  Mean total delay  S.d. total delay
  1    1.06            0.007           2.31            0.136           3.39              0.144
  2    2.00            0.096           1.48            0.136           3.49              0.172
Disk Scheduling
  Program: search through an input file sequentially for some pattern
  Two groups of 10 processes each
    group one: reading 65,588 kbytes, with allocation of rate 10
    group two: reading 55,789 kbytes, with allocation of rate 20
  Equal CPU allocations, disk placement not controlled
Proportional Disk Sharing
Memory Scheduling
  Footprint application
    repeatedly touch a set of n distinct pages
  Result summary [JSAC 2001]
    isolation properties
    utilization of over-reserved pages
    reclaim of reserved pages
Related Work
  Router Plugins (Washington U)
    extensibility, quick resource binding through gates
  Extensible Router (Princeton)
    kernel built from scratch, path abstraction
  Bowman (Georgia Tech/U Kentucky)
    POSIX user-level implementation for portability
Related Work (cont'd)
  Active node with ANTS (MIT)
    externally certified program capsules
  Flexible end-system scheduling
    Resource Containers (Rice)
    Software Performance Units (Stanford)
    Reservation Domains (AT&T)
Conclusions
  Resource management important for software-programmable routers
  Presented system prototype as solution step
    packet classification
    router program dispatch
    unified and orthogonal resource abstraction
    schedulers for major resource types
Pointers
  System Software and Architecture Lab
    http://ssal.cs.purdue.edu
    [JSAC 2001] Resource Management in Software-Programmable Router OS
    [MMCN 2001] Performance Evaluation of CPU Isolation Mechanisms in a Multimedia OS Kernel
  Myself
    [email protected]
    www.cs.purdue.edu/people/yau