Why Events Are A Bad Idea (for high-concurrency servers)
Rob von Behren, Jeremy Condit and Eric Brewer
Computer Science Division, University of California at Berkeley {jrvb, jcondit, brewer}@cs.berkeley.edu http://capriccio.cs.berkeley.edu/
Final Version, Proceedings of HotOS IX
Lihue, Kauai, Hawaii, May 2003
Presented by: Ryan Ledbetter
CS533 - Concepts of Operating Systems, Portland State University,
Professor Jonathan Walpole
January 26, 2009
Events are better, according to other literature
• Inexpensive synchronization (cooperative multitasking)
• Lower overhead for state (short stacks)
• Better scheduling and locality (application-level information)
• More flexible control flow (not just call/return)
Threads are better, according to von Behren et al.
• Threads are more natural
• Historically threads had some drawbacks, but not anymore
• With small improvements to compilers and runtime systems, the older drawbacks are eliminated
• It is easier to apply compiler optimizations to a threaded program
Threads Vs. Events
• An old debate: Lauer and Needham discussed it in 1978
• Process-based and message-passing systems are duals
• Conclusion:
  Both systems, if implemented correctly, should yield similar performance
  Use whichever is the better fit for the task at hand
What they missed
• Most event systems use cooperative scheduling for synchronization
• They use shared memory/global data structures, which Lauer and Needham said was not typical
• SEDA is the only event system that matches Lauer and Needham
“Problems” with Threads
• Performance: Many attempts to use threads for high concurrency have not performed well
• Control Flow: Threads have restrictive control flow
• Synchronization: Thread synchronization mechanisms are too heavyweight
• State Management: Thread stacks are an ineffective way to manage live state
• Scheduling: The virtual processor model provided by threads forces the runtime system to be too generic and prevents it from making optimal scheduling decisions.
Performance
• Historically accurate, due to poor implementations:
  Did not account for both high concurrency and blocking operations
  Overhead of O(n) operations (where n is the number of threads)
  High context switch overhead
• None of the above is a property of the threaded model
Performance, the Fix
• Modifications to the GNU Pth user-level threading package
• Removed most of the O(n) operations
• Repeated SEDA’s threaded server benchmark
• Result: the threaded server scales nicely to 100,000 threads (matching the event-based server)
Control Flow
• Threads push the programmer to think too linearly
• This may cause more efficient control flow patterns to be overlooked
Control Flow, reply
• Flash, Ninja, SEDA, and TinyOS each use one of three control flow patterns: call/return, parallel calls, and pipelines
• These can be naturally expressed using threads
• Complex patterns are rare
Control Flow, reply cont.
• Event-based systems often obfuscate the control flow of the system
• The programmer must keep track of call and return state (which may be in different parts of the code)
• Programmers usually have to use “stack ripping” to save state
• Race conditions can arise and can be hard to find
Control Flow, Threads are better
• Threads allow for more natural encapsulation of state
• Calls and returns are grouped
• Debugging with current tools is easier, as the call stack contains the live state
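The contrast between “stack ripping” and threaded control flow can be sketched in a few lines. This is an illustrative example with hypothetical names, not code from the paper: the event-style handler splits one logical operation across callbacks and must package its live state by hand, while the threaded version reads linearly and keeps state on the call stack.

```python
# Hypothetical sketch contrasting event-style "stack ripping" with
# straight-line threaded code. All names are illustrative.

# Event style: one logical operation ripped into two callbacks; the
# live state (the request) must be saved and threaded through by hand.
def handle_request_event(request, enqueue_io):
    state = {"request": request}              # manually saved state
    enqueue_io(request["path"], lambda data: on_read_done(state, data))

def on_read_done(state, data):
    # Control resumes here later; the call/return pairing is implicit.
    state["response"] = data.upper()
    return state["response"]

# Thread style: the same logic reads top to bottom; the call stack
# holds the live state across the (conceptually) blocking read.
def handle_request_threaded(request, read_file):
    data = read_file(request["path"])         # blocks; stack keeps state
    return data.upper()

# Drive both versions against a fake file system.
fake_fs = {"/index": b"hello"}
read_file = lambda path: fake_fs[path]

results = []
enqueue_io = lambda path, cb: results.append(cb(fake_fs[path]))
handle_request_event({"path": "/index"}, enqueue_io)

assert results[0] == handle_request_threaded({"path": "/index"}, read_file)
```

Both functions compute the same result, but only the event version forces the programmer to decide, at every blocking point, which state to carry forward.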
Synchronization
• Events can get “free” synchronization because of cooperative multitasking
• No need for the runtime system to worry about mutexes, wait queues, etc.
Synchronization, reply
• The benefit comes from cooperative multitasking, not from events themselves
• Threads can have the same “free” synchronization
• NOTE: cooperative multitasking only works on uniprocessors
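The “free” synchronization can be demonstrated with a minimal cooperative scheduler sketch (assumed structure, not from the paper): because tasks only switch at explicit yield points, an unlocked read-modify-write on shared data cannot be interleaved, which is exactly the guarantee that disappears on a multiprocessor.

```python
# Minimal cooperative round-robin scheduler (illustrative sketch).
# Tasks switch only at explicit `yield` points, so the unlocked
# read-modify-write below is safe -- the "free" synchronization the
# slide describes. This only holds on a single processor.

shared = {"count": 0}

def worker(n):
    for _ in range(n):
        tmp = shared["count"]    # read
        tmp += 1                 # modify
        shared["count"] = tmp    # write: atomic w.r.t. other tasks,
        yield                    # because switches happen only here

def run(tasks):
    # Round-robin until every task finishes.
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
            tasks.append(task)
        except StopIteration:
            pass

run([worker(1000), worker(1000)])
assert shared["count"] == 2000   # no lost updates, no mutex needed
```

With preemptive threads, the same read-modify-write would need a mutex, since a context switch could land between the read and the write.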
State Management
• Threads historically face the decision of wasting virtual address space or risking a stack overflow
• Events usually have short stacks that fully unwind after each event
• Event systems usually minimize state before a blocking point (as state is managed by the programmers)
State Management, reply
• Dynamic stack growth would solve the problem of over-allocating or overflowing the stack
• Threads provide automatic state management using the call stack, and state at blocking calls can be reduced
State Management, Exceptions and State Lifetime
• State cleanup is easier with threads (stack allocated)
• Event-based state is usually heap allocated, and it may be difficult to know when to free the memory, risking:
  Memory leaks
  Access to deallocated memory
• Garbage collection (as in Java) is inappropriate for high-performance systems
Scheduling
• Events have more control over scheduling, because scheduling can be done at the application level
• Applications can choose the best schedule (shortest time, priority, etc.)
• Runs of the same type of event can be grouped together
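Grouping runs of the same event type is easy to express as an application-level policy. The sketch below is hypothetical (the paper’s systems do this inside their event queues): it reorders a queue so same-type events run back to back, keeping each handler’s code and data hot in cache.

```python
# Illustrative sketch of application-level event scheduling: batch
# same-type events so their handler runs repeatedly, improving locality.
# Event names and payloads are hypothetical.
from collections import defaultdict

def batch_schedule(events):
    """Return events reordered so same-type events run back to back,
    preserving first-seen order of types and arrival order within a type."""
    by_type = defaultdict(list)
    type_order = []
    for etype, payload in events:
        if etype not in by_type:
            type_order.append(etype)
        by_type[etype].append((etype, payload))
    return [ev for etype in type_order for ev in by_type[etype]]

arrivals = [("read", 1), ("write", 2), ("read", 3), ("write", 4)]
assert batch_schedule(arrivals) == [
    ("read", 1), ("read", 3), ("write", 2), ("write", 4)
]
```

A thread scheduler exposed only as a generic virtual-processor abstraction cannot apply this kind of application-specific reordering, which is the point of the slide.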
Scheduling, reply
• Lauer and Needham’s duality says we should be able to do the same scheduling with threads
Why Not Fix Events
• Create tools/languages that enforce:
  Call/reply matching
  Live state management
  Shared state management
• The result would basically be the same as threads
• Some such tools/techniques have been created, and their syntax is very similar to threads
• Improved events == threads
Compiler Support for Threads
• With minor modifications, the compiler can improve safety, programmer productivity, and performance
• Compilers can mainly address three key areas:
  Dynamic stack growth
  Live state management
  Synchronization
Compiler, Dynamic Stack Growth
• A compiler could determine an upper bound on the stack space a function call will need
• Thus it could determine when growth may be needed
• Recursion and function pointers are obstacles, but they can be dealt with through further analysis
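The bound the slide describes is a longest-path computation over the static call graph. A toy sketch, with hypothetical frame sizes and a recursion-free call graph (the hard cases the slide mentions are deliberately excluded):

```python
# Sketch of the compile-time analysis described above: given per-function
# frame sizes and a recursion-free static call graph, compute an upper
# bound on stack usage, so the runtime knows when a stack must grow.
# Frame sizes (bytes) and the call graph are hypothetical.

frame_size = {"main": 64, "parse": 128, "handle": 256, "log": 32}
calls = {
    "main": ["parse", "handle"],
    "parse": ["log"],
    "handle": ["log"],
    "log": [],
}

def stack_bound(fn, memo={}):
    """Worst-case stack bytes for fn: its frame plus its deepest callee."""
    if fn not in memo:
        callees = calls[fn]
        deepest = max(map(stack_bound, callees)) if callees else 0
        memo[fn] = frame_size[fn] + deepest
    return memo[fn]

# Worst-case chain: main -> handle -> log = 64 + 256 + 32
assert stack_bound("main") == 352
```

Recursion makes this bound unbounded and function pointers make the call graph imprecise, which is why the slide flags both as needing further analysis.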
Live State Management
• The issue is that state is not minimized before a subroutine call is made
• Compilers could look ahead at the code and see what temporary variables can safely be popped off or if the entire frame could be popped off
• Warn users if large amounts of state data are held across a blocking call
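The underlying check is a liveness question: at a blocking call, only variables read afterwards must stay on the stack. A toy sketch over a hypothetical statement list (nothing here is the paper’s actual analysis):

```python
# Toy sketch of the live-state check suggested above: variables read
# after the blocking statement are live across it; everything else can
# be popped before blocking. The "program" representation is hypothetical:
# a list of (statement_text, names_read) pairs.

def live_across_block(stmts, block_index):
    """Return the set of variables read after the blocking statement."""
    live = set()
    for _stmt, reads in stmts[block_index + 1:]:
        live |= set(reads)
    return live

program = [
    ("buf = read_config()", []),
    ("n = len(buf)",        ["buf"]),
    ("data = block_io()",   []),            # blocking call at index 2
    ("out = data + str(n)", ["data", "n"]),
]

# 'buf' dies before the blocking call; only 'n' and the call's own
# result 'data' must be kept live across it.
assert live_across_block(program, 2) == {"data", "n"}
```

A compiler doing this for real would also warn when the live set at a blocking point is large, per the last bullet.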
Synchronization
• Compilers could better analyze the code for race conditions and raise a warning if one is likely
• Support atomic sections:
  nesC, a language for networked sensors, supports atomic sections
  An atomic section cannot yield or block
The Test
• Compared two models with two servers:
  Event-based model: Haboob (SEDA)
  Threaded model: Knot, tested in two versions
    Knot-C favored current connections
    Knot-A favored accepting new connections
• Both used the standard poll system call
Results
Stats
• Knot-C: 700 Mbit/s maximum bandwidth, 1024 clients at peak
• Haboob: 500 Mbit/s maximum bandwidth, 512 clients at peak
Conclusion
• Event-based Haboob must context switch between event handlers:
  30,000 context switches per second at peak (6 times that of Knot)
• Small modules == lots of queuing
• Temporary object creation, and thus garbage collection
• Order of events determined at runtime, reducing compiler optimizations