Date post: | 18-Dec-2014 |
Upload: | mahendra-m |
Kqueue : Generic Event Notification
Mahendra [email protected]
http://www.infosys.com
This work is licensed under a Creative Commons License: http://creativecommons.org/licenses/by-sa/2.5/
Agenda
Traditional ways of multiplexing I/O
Methods and issues in handling asynchronous events
Enter Kqueue
The Kqueue architecture
Kqueue possibilities
Traditional File/Socket handling
Traditionally a single file can be handled as below
/* No error checking here */
while ( ( n = read( fd, buf, sizeof( buf ) ) ) > 0 ) {
    do_something( buf, n );
}
The above case works fine for one file descriptor.
What about the case where we have two or more such descriptors ( for sockets ) and data can appear on any one of the sockets at any point in time ?
– Basically, we need a mechanism for event-driven applications.
– This is a case for multiplexing I/O ( or events ) !!
Traditional I/O multiplexing
Use select() and/or poll()
select() and poll() pass a list of file descriptors to the kernel and wait for updates to happen.
On return, these calls report which of the file descriptors got updated.
select() passes the file descriptors as a bitmap – with each bit being set or unset to represent a file descriptor; poll() passes an array of struct pollfd.
select() and poll() can watch for read/write/exception events on the list of file descriptors.
On return, the application has to scan the entire bitmap ( or array ) to see which file descriptors have to be handled.
Traditional I/O multiplexing ( contd.. )
fd_set fds;
FD_ZERO( &fds );
FD_SET( 5, &fds );
/* First argument is the highest descriptor number plus one */
n = select( 5 + 1, &fds, NULL, NULL, NULL );
j = 0;
for ( i = 0; (i < MAX) && (j < n); i++ ) {
    if ( FD_ISSET( i, &fds ) ) {
        read_something_from_socket( i );
        j++;
    }
}
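For comparison, the same idea with poll() can be sketched as below. This is only an illustration: the helper name first_readable() and the cap of 16 descriptors are assumptions made for the example, not part of the talk. Note that the whole array is still handed to the kernel and scanned again on return.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_WATCHED 16   /* illustrative cap, chosen for this sketch */

/* Wait until one of fds[0..nfds-1] has data to read.
 * Returns the index of the first readable descriptor,
 * or -1 on timeout or error. timeout_ms < 0 blocks indefinitely. */
int first_readable( const int *fds, int nfds, int timeout_ms )
{
    struct pollfd pfds[MAX_WATCHED];
    int i, n;

    if ( nfds < 0 || nfds > MAX_WATCHED )
        return -1;

    /* The full array is rebuilt and passed to the kernel on every call */
    for ( i = 0; i < nfds; i++ ) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    n = poll( pfds, (nfds_t) nfds, timeout_ms );
    if ( n <= 0 )
        return -1;

    /* Results come back in revents; the application scans the array */
    for ( i = 0; i < nfds; i++ ) {
        if ( pfds[i].revents & POLLIN )
            return i;
    }
    return -1;
}
```

A caller would pass its socket descriptors in fds and call read() on fds[index] when the return value is not -1.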
Issues with select()/poll()
Problems of scalability
– Entire descriptor set has to be passed on each invocation of the system call ( especially with poll(), which uses an array )
– Massive copies from user space to kernel space and vice-versa
– Not all descriptors have activity all the time
– On return, apps have to scan the entire list to check for updated descriptors ( duplicated effort in kernel and app ) – O(N) activity
– Results in inefficient memory usage within the kernel
– If the call has to sleep, the list has to be parsed three times.
select()/poll() can handle only file descriptors
Coding is clunky for select()
– Descriptor set is a bitmap of fixed size ( FD_SETSIZE, typically 256 or 1024 depending on the system )
Other forms of interesting events
Asynchronous signal notifications
– Required in libraries that may want to be notified of signals
Asynchronous timer expiry
Asynchronous Read/Write ( aio_read(), aio_write() )
VFS changes
Process state changes
Thread state changes
Device driver notifications
Anything else that will require some asynchronous event notification – and the design allowing it.
Available solutions
Linux 2.4 : SIGIO
Sun Solaris : /dev/poll
Linux 2.4 : /dev/epoll
– Use ioctl() to manipulate the above.
Even Microsoft Windows had something to offer.
Kqueue – for BSD boxes.
– We shall be talking about that now !!
Kqueue - Goals
A generic event notification framework
– File descriptors (read/write/exceptions), Signals, Asynchronous I/O ( not in OSFR ), Vnodes monitoring, process monitoring, Timer events.
A single system call to handle all this.
Capability to add new functionality.
Efficient use of memory
– Memory should be allocated as per need.
– Should be able to register/receive only as many events as the application is interested in.
– Events should be combined ( eg: data arriving over a socket )
Should be a good replacement for standard calls.
Should be possible to extend this functionality easily
Kqueue APIs
int kqueue( void );
– Creates a kernel queue and returns a file descriptor for it. It can be deleted using the close() system call.
int kevent( kq, changes, nc, events, ne, timeout );
– To register events in the kernel queue
– To receive events that occurred between consecutive calls.
– Can simulate select(), poll() – using different values of timeout
– No need to store the event descriptors locally in the application.
EV_SET( &event, ident, filter, flags, fflags, data, udata )
– Used to prepare an event for registering in the kernel queue.
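How the timeout argument selects between blocking, polling and a bounded wait can be sketched with a small helper. The name kq_timeout() and the millisecond convention are assumptions made for this example; kevent() itself only takes a pointer to a struct timespec, where NULL means "wait indefinitely" and a zero-valued timespec means "return immediately".

```c
#include <stddef.h>
#include <time.h>

/* Convert a millisecond timeout into the pointer kevent() expects:
 *   ms <  0 : NULL, so kevent() blocks until an event arrives
 *   ms == 0 : zero timespec, so kevent() returns at once ( pure poll )
 *   ms >  0 : kevent() waits at most ms milliseconds
 * ts is caller-provided storage for the converted value. */
const struct timespec *kq_timeout( long ms, struct timespec *ts )
{
    if ( ms < 0 )
        return NULL;
    ts->tv_sec  = ms / 1000;
    ts->tv_nsec = ( ms % 1000 ) * 1000000L;
    return ts;
}
```

On a system with kqueue, a bounded wait would then look like n = kevent( kq, NULL, 0, events, ne, kq_timeout( 250, &ts ) );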
Kqueue sample code
struct kevent kev[10];
kq = kqueue();
// Prepare an event
EV_SET( &kev[0], fd, EVFILT_READ, EV_ADD, 0, 0, 0 );
// Register the one event prepared above
kevent( kq, kev, 1, NULL, 0, timeout );
// Receive up to 10 events
n = kevent( kq, NULL, 0, kev, 10, timeout );
for ( i = 0; i < n; i++ ) {
    // Do something with kev[i]
}
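A slightly fuller sketch of the same flow, with error checking, is given below. The function name kq_bytes_ready() and the HAVE_KQUEUE macro are made up for this example. Since kqueue is not available everywhere, the sketch assumes a BSD-derived system and simply fails with ENOSYS elsewhere. It registers the event and waits for it in a single kevent() call.

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1   /* hypothetical macro, local to this sketch */
#endif

/* Returns the number of bytes ready to be read on fd, 0 on timeout,
 * -1 on error. Without kqueue support it fails with errno = ENOSYS. */
long kq_bytes_ready( int fd, long timeout_ms )
{
#ifdef HAVE_KQUEUE
    struct kevent change, event;
    struct timespec ts;
    int kq, n, saved;

    ts.tv_sec  = timeout_ms / 1000;
    ts.tv_nsec = ( timeout_ms % 1000 ) * 1000000L;

    kq = kqueue();
    if ( kq == -1 )
        return -1;

    EV_SET( &change, fd, EVFILT_READ, EV_ADD, 0, 0, 0 );

    /* Register the change and wait for one event in the same call */
    n = kevent( kq, &change, 1, &event, 1, &ts );
    saved = errno;
    close( kq );
    errno = saved;

    if ( n <= 0 )
        return n;                    /* 0 = timeout, -1 = error */
    if ( event.flags & EV_ERROR ) {  /* registration failed */
        errno = (int) event.data;
        return -1;
    }
    return (long) event.data;        /* amount of data to be read */
#else
    (void) fd;
    (void) timeout_ms;
    errno = ENOSYS;
    return -1;
#endif
}
```

A long-running server would keep the kqueue open and register each descriptor only once, instead of creating and closing a queue per call as this sketch does.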
Kqueue filter types
READ : Returns when data is available for read from sockets, vnodes, fifos, pipes
– ident = descriptor
– data = amount of data to be read
– flags = can be EOF etc.
WRITE : Returns when it is possible to write to a descriptor ( ident ).
– data = amount of data that can be written
VNODE : Returns when the file behind a descriptor changes
– fflags = delete, write, extend, attrib, link, rename, revoke
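As an illustration of the VNODE filter, the sketch below watches a file for writes, deletion and renames. The names vnode_watch(), vnode_wait() and HAVE_KQUEUE are assumptions made for this example, and the code assumes a BSD-derived system; elsewhere it just fails with ENOSYS.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1   /* hypothetical macro, local to this sketch */
#endif

/* Start watching path. Returns a kqueue descriptor ( close it when done )
 * or -1 on error. The descriptor of the watched file goes to *fd_out. */
int vnode_watch( const char *path, int *fd_out )
{
#ifdef HAVE_KQUEUE
    struct kevent change;
    int fd, kq;

    fd = open( path, O_RDONLY );
    if ( fd == -1 )
        return -1;
    kq = kqueue();
    if ( kq == -1 ) {
        close( fd );
        return -1;
    }
    /* EV_CLEAR resets the event state once it has been retrieved */
    EV_SET( &change, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
            NOTE_WRITE | NOTE_DELETE | NOTE_RENAME, 0, 0 );
    if ( kevent( kq, &change, 1, NULL, 0, NULL ) == -1 ) {
        close( kq );
        close( fd );
        return -1;
    }
    *fd_out = fd;
    return kq;
#else
    (void) path;
    (void) fd_out;
    errno = ENOSYS;
    return -1;
#endif
}

/* Wait up to timeout_ms for a change. Returns 1 if the file was written,
 * 2 if it was deleted or renamed, 0 on timeout, -1 on error. */
int vnode_wait( int kq, long timeout_ms )
{
#ifdef HAVE_KQUEUE
    struct kevent event;
    struct timespec ts;
    int n;

    ts.tv_sec  = timeout_ms / 1000;
    ts.tv_nsec = ( timeout_ms % 1000 ) * 1000000L;
    n = kevent( kq, NULL, 0, &event, 1, &ts );
    if ( n <= 0 )
        return n;
    return ( event.fflags & NOTE_WRITE ) ? 1 : 2;
#else
    (void) kq;
    (void) timeout_ms;
    errno = ENOSYS;
    return -1;
#endif
}
```

The event is tied to the open descriptor, so the watch keeps working even if the file is renamed while it is being monitored.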
Kqueue filter types ( contd... )
PROC : Monitors a process
– ident = pid of the process to be monitored.
– fflags = exit, fork, exec, track, trackerr
SIGNAL : Returns when a signal is delivered to a process.
– ident = signal number
– data = no. of times the signal was delivered.
– Co-exists with signal() and sigaction() – and has a lower precedence.
– Is delivered even if SIG_IGN is set for the signal
TIMER : Establishes a timer
– ident = timer id
– data = timeout in milliseconds when registering; no. of times the timer expired when the event is returned
– Periodic by default unless ONESHOT is specified
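The TIMER filter can be sketched as below: a periodic timer is registered once and the expirations reported in data are added up. The function name kq_timer_ticks(), the timer id 1 and the HAVE_KQUEUE macro are arbitrary choices for this example, which again assumes a BSD-derived system and fails with ENOSYS elsewhere.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1   /* hypothetical macro, local to this sketch */
#endif

/* Wait until a periodic timer has fired at least count times in total.
 * Returns the number of expirations seen, or -1 on error. */
int kq_timer_ticks( int period_ms, int count )
{
#ifdef HAVE_KQUEUE
    struct kevent change, event;
    int kq, total = 0;

    kq = kqueue();
    if ( kq == -1 )
        return -1;

    /* ident 1 is an arbitrary timer id; data is the period in milliseconds.
     * Without EV_ONESHOT the timer keeps firing periodically. */
    EV_SET( &change, 1, EVFILT_TIMER, EV_ADD, 0, period_ms, 0 );
    if ( kevent( kq, &change, 1, NULL, 0, NULL ) == -1 ) {
        close( kq );
        return -1;
    }

    while ( total < count ) {
        if ( kevent( kq, NULL, 0, &event, 1, NULL ) != 1 ) {
            total = -1;
            break;
        }
        /* data = expirations since the event was last retrieved */
        total += (int) event.data;
    }
    close( kq );
    return total;
#else
    (void) period_ms;
    (void) count;
    errno = ENOSYS;
    return -1;
#endif
}
```

Because several expirations can be combined into one event, the result may be larger than count if the caller is slow to collect events.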
Kqueue Flags
ADD : To add an event to the queue
ENABLE : To enable a disabled event
DISABLE : To temporarily disable an event ( not deleted )
DELETE : Remove an event from the kernel queue
ONESHOT : Cause the event to happen only once.
CLEAR : Clear the state of the filter after it is received
EOF : End-of-File
ERROR : Specific errors.
Kqueue – Good things
As you would have seen – it is extremely scalable in handling large numbers of file descriptors
– Eliminates most of the deficiencies of select()/poll()
– Currently, efforts are underway to migrate some popular daemons ( Apache ) to use Kqueue.
It supports a wide range of events – not just file descriptors.
Is easily extensible.
New kqueue filters can be added very easily inside the BSD kernels.
Opens up a lot of interesting possibilities.
Issues with Kqueue
Kqueue calls are not part of POSIX specifications.
– Most of the Unix systems do not implement it.
– Breaks portability across Unices
Third party code may still use select(), poll() etc.
We may have to migrate this or allow these to co-exist
Relatively new in the field – not time-tested.
References
Kqueue: A generic and scalable event notification facility – Jonathan Lemon
http://people.freebsd.org/~jlemon/papers/kqueue.pdf
Man pages for kqueue, knote, kfilter_register
Read the source, Luke !!
Finally ...
Questions ??
Thanks to
– Organizers for giving me a chance to speak at GNUnify 2006
– NetBSD and Linux developers who helped me during my work
– Infosys for sponsoring my visit to GNUnify 2006
Special thanks to YOU for listening...
You can contact me at :