Why Events Are A Bad Idea (for high-concurrency servers)
Rob von Behren, Jeremy Condit and Eric Brewer
Computer Science Division, University of California at Berkeley {jrvb, jcondit, brewer}@cs.berkeley.edu http://capriccio.cs.berkeley.edu/
Final Version, Proceedings of HotOS IX
Lihue, Kauai, Hawaii, May 2003
Presented by: Ryan Ledbetter
CS533 - Concepts of Operating Systems, Portland State University,
Professor Jonathan Walpole
January 26, 2009
Events are better, according to other literature
• Inexpensive synchronization (cooperative multitasking)
• Lower overhead for state (short stacks)
• Better scheduling and locality (application-level information)
• More flexible control flow (not just call/return)
Threads are better, according to von Behren et al.
• Threads are more natural
• Historically threads had some drawbacks, but not anymore
• With small improvements to compilers and runtime systems, the older drawbacks are eliminated
• It is easier to apply compiler optimizations to a threaded program
Threads Vs. Events
• An old debate: Lauer and Needham discussed it in 1978
• Process-based and message-passing systems are duals
• Conclusion:
  Both systems, if implemented correctly, should yield similar performance
  Use whichever is the better fit for the task at hand
What they missed
• Most event systems use cooperative scheduling for synchronization
• They use shared memory/global data structures, which Lauer and Needham said was not typical
• SEDA is the only event system that matches Lauer and Needham
“Problems” with Threads
• Performance: Many attempts to use threads for high concurrency have not performed well
• Control Flow: Threads have restrictive control flow
• Synchronization: Thread synchronization mechanisms are too heavyweight
• State Management: Thread stacks are an ineffective way to manage live state
• Scheduling: The virtual processor model provided by threads forces the runtime system to be too generic and prevents it from making optimal scheduling decisions.
Performance
• Historically accurate, due to poor implementations:
  Did not account for both high concurrency and blocking operations
  Overhead of O(n) operations (where n is the number of threads)
  High context switch overhead
• None of the above is a property of the threaded model
Performance, the Fix
• Modifications to the GNU Pth user-level threading package
• Removed most of the O(n) operations
• Repeated SEDA’s threaded server benchmark
• Result: the threaded server scales nicely to 100,000 threads (matching the event-based server)
Control Flow
• Threads push the programmer to think too linearly
• This may cause more efficient control flow patterns to be overlooked
Control Flow, reply
• Flash, Ninja, SEDA, and TinyOS each use one of three control flow patterns: call/return, parallel calls, and pipelines
• These can be naturally expressed using threads
• Complex patterns are rare
Control Flow, reply cont.
• Event-based systems often obfuscate the control flow of the system
• The programmer must keep track of call and return state (which may be in different parts of the code)
• Programmers usually have to use “stack ripping” to save state
• Race conditions can arise and can be hard to find
Control Flow, Threads are better
• Threads allow for more natural encapsulation of state
• Calls and returns are grouped
• Debugging with current tools is easier, as the call stack contains the live state
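The contrast between “stack ripping” and threaded control flow can be sketched in a few lines. This is an illustrative example with hypothetical names, not code from the paper: the event-style handler splits one logical operation across callbacks and must package its live state by hand, while the threaded version reads linearly and keeps state on the call stack.

```python
# Hypothetical sketch contrasting event-style "stack ripping" with
# straight-line threaded code. All names are illustrative.

# Event style: one logical operation ripped into two callbacks; the
# live state (the request) must be saved and threaded through by hand.
def handle_request_event(request, enqueue_io):
    state = {"request": request}              # manually saved state
    enqueue_io(request["path"], lambda data: on_read_done(state, data))

def on_read_done(state, data):
    # Control resumes here later; the call/return pairing is implicit.
    state["response"] = data.upper()
    return state["response"]

# Thread style: the same logic reads top to bottom; the call stack
# holds the live state across the (conceptually) blocking read.
def handle_request_threaded(request, read_file):
    data = read_file(request["path"])         # blocks; stack keeps state
    return data.upper()

# Drive both versions against a fake file system.
fake_fs = {"/index": b"hello"}
read_file = lambda path: fake_fs[path]

results = []
enqueue_io = lambda path, cb: results.append(cb(fake_fs[path]))
handle_request_event({"path": "/index"}, enqueue_io)

assert results[0] == handle_request_threaded({"path": "/index"}, read_file)
```

Both functions compute the same result, but only the event version forces the programmer to decide, at every blocking point, which state to carry forward.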
Synchronization
• Events can get “free” synchronization because of cooperative multitasking
• No need for the runtime system to worry about mutexes, wait queues, etc.
Synchronization, reply
• The benefit comes from cooperative multitasking, not from events themselves
• Threads can have the same “free” synchronization
• NOTE: cooperative multitasking only works on uniprocessors
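The “free” synchronization can be demonstrated with a minimal cooperative scheduler sketch (assumed structure, not from the paper): because tasks only switch at explicit yield points, an unlocked read-modify-write on shared data cannot be interleaved, which is exactly the guarantee that disappears on a multiprocessor.

```python
# Minimal cooperative round-robin scheduler (illustrative sketch).
# Tasks switch only at explicit `yield` points, so the unlocked
# read-modify-write below is safe -- the "free" synchronization the
# slide describes. This only holds on a single processor.

shared = {"count": 0}

def worker(n):
    for _ in range(n):
        tmp = shared["count"]    # read
        tmp += 1                 # modify
        shared["count"] = tmp    # write: atomic w.r.t. other tasks,
        yield                    # because switches happen only here

def run(tasks):
    # Round-robin until every task finishes.
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
            tasks.append(task)
        except StopIteration:
            pass

run([worker(1000), worker(1000)])
assert shared["count"] == 2000   # no lost updates, no mutex needed
```

With preemptive threads, the same read-modify-write would need a mutex, since a context switch could land between the read and the write.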
State Management
• Threads historically face the decision of wasting virtual address space or risking a stack overflow
• Events usually have short stacks that fully unwind after each event
• Event systems usually minimize state before a blocking point (as state is managed by the programmers)
State Management, reply
• Dynamic stack growth would solve the problem of over-allocating or overflowing the stack
• Threads provide automatic state management using the call stack, and state at blocking calls can be reduced
State Management, Exceptions and State Lifetime
• State cleanup is easier with threads (stack allocated)
• Event-based state is usually heap allocated, and it may be difficult to know when to free the memory, risking:
  Memory leaks
  Access to deallocated memory
• Garbage collection (as in Java) is inappropriate for high-performance systems
Scheduling
• Events have more control over scheduling, because scheduling can be done at the application level
• Applications can choose the best schedule (shortest time, priority, etc.)
• Runs of the same type of event can be grouped together
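Grouping runs of the same event type is easy to express as an application-level policy. The sketch below is hypothetical (the paper’s systems do this inside their event queues): it reorders a queue so same-type events run back to back, keeping each handler’s code and data hot in cache.

```python
# Illustrative sketch of application-level event scheduling: batch
# same-type events so their handler runs repeatedly, improving locality.
# Event names and payloads are hypothetical.
from collections import defaultdict

def batch_schedule(events):
    """Return events reordered so same-type events run back to back,
    preserving first-seen order of types and arrival order within a type."""
    by_type = defaultdict(list)
    type_order = []
    for etype, payload in events:
        if etype not in by_type:
            type_order.append(etype)
        by_type[etype].append((etype, payload))
    return [ev for etype in type_order for ev in by_type[etype]]

arrivals = [("read", 1), ("write", 2), ("read", 3), ("write", 4)]
assert batch_schedule(arrivals) == [
    ("read", 1), ("read", 3), ("write", 2), ("write", 4)
]
```

A thread scheduler exposed only as a generic virtual-processor abstraction cannot apply this kind of application-specific reordering, which is the point of the slide.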
Scheduling, reply
• Lauer and Needham’s duality says we should be able to do the same scheduling with threads
Why Not Fix Events
• Create tools/languages that enforce:
  Call/reply matching
  Live state management
  Shared state management
• The result would basically be the same as threads
• Some such tools/techniques have been created, and their syntax is very similar to threads
• Improved events == threads
Compiler Support for Threads
• With minor modifications, the compiler can improve safety, programmer productivity, and performance
• Compilers can mainly address three key areas:
  Dynamic stack growth
  Live state management
  Synchronization
Compiler, Dynamic Stack Growth
• A compiler could determine an upper bound on the stack space a function call will need
• Thus it could determine when growth may be needed
• Recursion and function pointers are obstacles, but they can be dealt with through further analysis
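The bound the slide describes is a longest-path computation over the static call graph. A toy sketch, with hypothetical frame sizes and a recursion-free call graph (the hard cases the slide mentions are deliberately excluded):

```python
# Sketch of the compile-time analysis described above: given per-function
# frame sizes and a recursion-free static call graph, compute an upper
# bound on stack usage, so the runtime knows when a stack must grow.
# Frame sizes (bytes) and the call graph are hypothetical.

frame_size = {"main": 64, "parse": 128, "handle": 256, "log": 32}
calls = {
    "main": ["parse", "handle"],
    "parse": ["log"],
    "handle": ["log"],
    "log": [],
}

def stack_bound(fn, memo={}):
    """Worst-case stack bytes for fn: its frame plus its deepest callee."""
    if fn not in memo:
        callees = calls[fn]
        deepest = max(map(stack_bound, callees)) if callees else 0
        memo[fn] = frame_size[fn] + deepest
    return memo[fn]

# Worst-case chain: main -> handle -> log = 64 + 256 + 32
assert stack_bound("main") == 352
```

Recursion makes this bound unbounded and function pointers make the call graph imprecise, which is why the slide flags both as needing further analysis.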
Live State Management
• The issue is that state is not minimized before a subroutine call is made
• Compilers could look ahead at the code and see what temporary variables can safely be popped off or if the entire frame could be popped off
• Warn users if large amounts of state data are held across a blocking call
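The underlying check is a liveness question: at a blocking call, only variables read afterwards must stay on the stack. A toy sketch over a hypothetical statement list (nothing here is the paper’s actual analysis):

```python
# Toy sketch of the live-state check suggested above: variables read
# after the blocking statement are live across it; everything else can
# be popped before blocking. The "program" representation is hypothetical:
# a list of (statement_text, names_read) pairs.

def live_across_block(stmts, block_index):
    """Return the set of variables read after the blocking statement."""
    live = set()
    for _stmt, reads in stmts[block_index + 1:]:
        live |= set(reads)
    return live

program = [
    ("buf = read_config()", []),
    ("n = len(buf)",        ["buf"]),
    ("data = block_io()",   []),            # blocking call at index 2
    ("out = data + str(n)", ["data", "n"]),
]

# 'buf' dies before the blocking call; only 'n' and the call's own
# result 'data' must be kept live across it.
assert live_across_block(program, 2) == {"data", "n"}
```

A compiler doing this for real would also warn when the live set at a blocking point is large, per the last bullet.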
Synchronization
• Compilers could better analyze the code for race conditions and raise a warning if one is likely
• Support atomic sections:
  nesC, a language for networked sensors, supports atomic sections
  An atomic section cannot yield or block
The Test
• Compared two models with two servers:
  Event-based model: Haboob (SEDA)
  Threaded model: Knot, tested in two versions
    Knot-C favored current connections
    Knot-A favored accepting new connections
• Both used the standard poll system call
Results
Stats
• Knot-C: 700 Mbit/s maximum bandwidth, 1024 clients at peak
• Haboob: 500 Mbit/s maximum bandwidth, 512 clients at peak
Conclusion
• Event-based Haboob must context switch between event handlers:
  30,000 context switches per second at peak (6 times that of Knot)
• Small modules == lots of queuing
• Temporary object creation, and thus garbage collection
• Order of events determined at runtime, reducing compiler optimizations