OS09, K. Gopinath, IISc (drona.csa.iisc.ernet.in/~gopi/os09/extra-sep-os09.pdf)

Date post: 10-Feb-2018

Indian Inst of Science

OS09
K. Gopinath, IISc

Slides are from:
● Tanenbaum, provided as part of his book
● Updated Tanenbaum slides by Darrell Long/Ethan Miller at UCSC
● Mine own (K. Gopinath)
● Some papers/books too numerous to list

Hence, please DO NOT CIRCULATE


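Sleep and wakeup: Producer-Consumer problem

#include "prototypes.h"
#define N 100                               /* number of slots in the buffer */
int count = 0;                              /* number of items in the buffer */

void producer(void) {
    int item;
    while (TRUE) {
        produce_item(&item);
        if (count == N) sleep();            /* buffer full: go to sleep */
        enter_item(item);
        count = count + 1;
        if (count == 1) wakeup(consumer);   /* buffer was empty */
    }
}

void consumer(void) {
    int item;
    while (TRUE) {
        if (count == 0) sleep();            /* buffer empty: go to sleep */
        remove_item(&item);
        count = count - 1;
        if (count == N - 1) wakeup(producer);   /* buffer was full */
        consume_item(item);
    }
}

Race problem: if the consumer reads count == 0 but is swapped out just before calling sleep(), and the producer then signals, the wakeup is lost and the consumer goes to sleep anyway. The producer eventually fills the buffer and sleeps too.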
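One common way to close this race is to make "test count, then sleep" atomic by holding a mutex across both and sleeping on a condition variable. The sketch below is a minimal illustration under that assumption, not the slide author's solution; the names put, get, run_producer_consumer, and the small N and ITEMS values are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define N 4            /* buffer slots (small, so both sides have to wait) */
#define ITEMS 1000     /* items transferred in the demo */

static int buf[N], head, tail, count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

static void put(int item) {
    pthread_mutex_lock(&lock);
    while (count == N)                    /* check and sleep are atomic under the lock */
        pthread_cond_wait(&not_full, &lock);
    buf[tail] = item;
    tail = (tail + 1) % N;
    count++;
    pthread_cond_signal(&not_empty);      /* cannot be lost: consumer holds or waits on lock */
    pthread_mutex_unlock(&lock);
}

static int get(void) {
    pthread_mutex_lock(&lock);
    while (count == 0)
        pthread_cond_wait(&not_empty, &lock);
    int item = buf[head];
    head = (head + 1) % N;
    count--;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return item;
}

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= ITEMS; i++) put(i);
    return NULL;
}

/* One producer thread feeds the calling thread; returns the sum of items consumed. */
long run_producer_consumer(void) {
    pthread_t p;
    long sum = 0;
    head = tail = count = 0;
    if (pthread_create(&p, NULL, producer, NULL) != 0) return -1;
    for (int i = 0; i < ITEMS; i++) sum += get();
    pthread_join(p, NULL);
    assert(count == 0);
    return sum;
}
```

The `while` loops (rather than `if`) re-test the condition after every wakeup, so a spurious or stale wakeup cannot make a thread proceed on a full or empty buffer.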


Interrupts & Kernel code
● Interrupts (or process scheduling) can occur anytime
  ● Interrupt handler can also call brelse just like kernel code
  ● However, interrupt handler should not block
  ● Otherwise, the process on whose (kernel) stack the interrupt handler runs blocks
● Can expose data structures in an inconsistent state
  ● List manipulation requires multiple steps
  ● Interrupt can expose intermediate state
● Interrupt handler can manipulate linked lists that kernel code could also be manipulating
  ● Need to raise “processor execution level” to mask interrupts (or scheduling)
  ● Check & sleep (or test & set) should be atomic
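A user-space analogy may help here: a signal handler plays the role of the interrupt handler, and sigprocmask() plays the role of raising and lowering the processor execution level. The sketch below assumes a single-threaded process; push, on_usr1, and run_masked_update are hypothetical names, and the raise() in the middle of the update merely simulates an interrupt arriving at the worst moment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

struct node { int val; struct node *next; };

static struct node pool[8];
static struct node *head;               /* list shared with the handler */
static volatile sig_atomic_t seen;      /* list length observed by the handler */

/* "Interrupt handler": walks the shared list. */
static void on_usr1(int signo) {
    (void)signo;
    int n = 0;
    for (struct node *p = head; p != NULL; p = p->next) n++;
    seen = n;
}

/* Multi-step list update, done with the "interrupt" masked. */
static void push(struct node *n, int val) {
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);   /* raise the "execution level" */
    n->val = val;
    n->next = head;
    raise(SIGUSR1);                         /* arrives mid-update, stays pending */
    head = n;
    sigprocmask(SIG_SETMASK, &old, NULL);   /* lower it: the handler runs now */
}

/* Returns the list length the handler saw after two pushes. */
int run_masked_update(void) {
    struct sigaction sa = {0};
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;
    head = NULL;
    seen = -1;
    push(&pool[0], 1);
    push(&pool[1], 2);
    return seen;
}
```

Because the signal is only delivered once the mask is restored, the handler never sees the half-linked node; without the two sigprocmask() calls it would run between `n->next = head` and `head = n`.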


Buffer Allocation Algs
● getblk: given a filesystem number and disk block number, get the buffer for it locked
● brelse: given a locked buffer, wakeup waiting procs and unlock it
● bread: read a given disk block into a buffer
● breada: bread + asynch. read ahead
● bwrite: write a given buffer to a disk block
● Buffer properties
  ● No file block in 2 different buffers
  ● Can be in free list or hash list: Search free list if any buffer needed; hash list if a particular buffer needed
  ● Buf alloc safe: allocs during syscall & frees at end
    – Disk drive hw problem: cannot interrupt CPU: buf lost!
    – But no starvation guarantees


Data Structures
● inode: owner, file type (REG, DIR, FIFO, CHR, BLK, ...), access perms/times, #links, disk addrs for blocks in file, file size
● incore inode: addl fields: locked?, process-waiting?, dirty?, mount point?; reference count (# of opens), ptrs to other incore inodes (free and hash q)
● superblock: size of FS/inode list, #free blocks/inodes, dirty?, list/bitmap of free blocks/inodes, index of next free block/inode, locks for lists/bitmap


algorithm getblk
input: file sys #, block #; output: locked buffer that can now be used for block

while (buffer not found) {                           /* 1 */
    if (block in hash queue) {                       /* 2 */
        if (buffer busy) {                           /* 3 */
            sleep (event buffer becomes free);       /* 4 */
            continue;                                /* 5 */
        }                                            /* 6 */
        mark buffer busy;                            /* 7 */
        remove buffer from free list;                /* 8 */
        return buffer;                               /* 9 */
    } else {                                         /* 10 */
        if (there are no buffers on free list) {     /* 11 */
            sleep (event any buffer becomes free);
            continue;
        }
        remove buffer from free list;
        if (buffer marked for delayed write) {
            asynchronous write buffer to disk;
            continue;
        }
        mark buffer busy;
        remove buffer from old hash queue;
        put buffer onto new hash queue;
        return buffer;
    }
}

bread(filesystem f, block n) {
    buffer = getblk(f, n);
    if (buffer data valid) return buffer;
    initiate disk read;
    sleep(event disk read complete);
    return (buffer);
}
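The bookkeeping in getblk and brelse can be tried out in a single-threaded sketch. This is a simplification under stated assumptions, not the kernel algorithm: a linear scan of a tiny pool stands in for the hash queues, getblk returns NULL at the points where the pseudocode would sleep, delayed writes are omitted, and brelse always queues at the tail of the free list. The names NBUF, binit, free_remove, and run_getblk_demo are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 3                       /* tiny pool so exhaustion is easy to reach */

struct buf {
    int dev, blkno;                  /* block currently cached here (-1: none) */
    int busy;                        /* locked by a caller */
    int valid;                       /* contents match the disk block */
};

static struct buf pool[NBUF];
static struct buf *free_list[NBUF];  /* index 0 = head = least recently used */
static int nfree;

void binit(void) {
    nfree = 0;
    for (int i = 0; i < NBUF; i++) {
        pool[i] = (struct buf){ .dev = -1, .blkno = -1 };
        free_list[nfree++] = &pool[i];
    }
}

static void free_remove(struct buf *b) {
    int i = 0;
    while (i < nfree && free_list[i] != b) i++;
    if (i == nfree) return;                          /* not on the free list */
    for (; i + 1 < nfree; i++) free_list[i] = free_list[i + 1];
    nfree--;
}

/* Returns a locked buffer, or NULL where the pseudocode would sleep. */
struct buf *getblk(int dev, int blkno) {
    for (int i = 0; i < NBUF; i++) {                 /* stands in for the hash queue */
        struct buf *b = &pool[i];
        if (b->dev == dev && b->blkno == blkno) {
            if (b->busy) return NULL;                /* lines 3-5: wait for this buffer */
            b->busy = 1;                             /* line 7 */
            free_remove(b);                          /* line 8 */
            return b;
        }
    }
    if (nfree == 0) return NULL;                     /* line 11: wait for any buffer */
    struct buf *b = free_list[0];                    /* reuse least recently used */
    free_remove(b);
    b->busy = 1;
    b->dev = dev;                                    /* "move to the new hash queue" */
    b->blkno = blkno;
    b->valid = 0;
    return b;
}

void brelse(struct buf *b) {
    b->busy = 0;
    free_list[nfree++] = b;                          /* tail: most recently used */
}

/* Walks through hit, busy, exhausted, and reassigned cases; returns buffers left free. */
int run_getblk_demo(void) {
    binit();
    struct buf *a = getblk(0, 10);
    assert(a != NULL && getblk(0, 10) == NULL);      /* busy: a second caller would sleep */
    brelse(a);
    assert(getblk(0, 10) == a);                      /* cache hit */
    struct buf *b = getblk(0, 11), *c = getblk(0, 12);
    assert(b != NULL && c != NULL && getblk(0, 13) == NULL);   /* pool exhausted */
    brelse(a);
    assert(getblk(0, 13) == a && a->blkno == 13);    /* freed buffer reassigned */
    return nfree;
}
```

The last assertion is the situation the race-condition slides worry about: a process that slept waiting for block 10 must restart the search, because the buffer it was waiting for may now hold a different block.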


Race Conditions

P1                        P2                        P3
block b not on hash Q
no free bufs
sleep
                          block b not on hash Q
                          no free bufs
                          sleep
                                                    free a buf
                                                    wakeup
use freed buf
                          wakeup
                          Try from beg!

P1                        P2                        P3
alloc buf to block b
lock buf
init I/O
sleep until I/O done
                          buf locked
                          sleep
                                                    wait for any buf
                                                    get buf b & reassign to b'
                          wakeup; try again!


algorithm brelse
input: locked buffer
output: none
{
    wake up all procs: event, waiting for any buffer to become free;
    wake up all procs: event, waiting for this buffer to become free;
    raise processor execution level to block interrupts;
    if (buffer contents are valid and buffer not old)
        enqueue buffer at end of free list;
    else
        enqueue buffer at the beginning of the free list;
    lower processor execution level to allow interrupts;
    unlock(buffer);
}

bwrite(buf b) {
    initiate disk write;
    if (I/O synchronous) {
        sleep(event I/O complete);
        brelse(b);
    } else if (b marked for delayed write)
        mark buffer to put at head of free list;
}


Problems
● 1st prob
  ● P1 finds buffer is busy (line 3) & “starts” to sleep
  ● P1 gets preempted as P2 finishes use of buffer (thru interrupt) and releases it
  ● P1 now begins sleep even though buffer free
● 2nd prob
  ● P1 finds no free buffers (line 11)
  ● P1 gets preempted as P2 finishes use of buffer (thru interrupt) and releases it
  ● P1 now begins sleep even though buffer free
● 3rd prob
  ● Both getblk (line 8) and brelse manipulate free list

Atomicity violated in each case: interleaved execution of getblk with interrupt handler or with brelse

Need to block interrupts; Also: have to watch out for hardware error!


Other file ops
● iget: get a locked inode (doing bread if necessary) given inode number

● iput: release an inode; if ref count 0, writes dirty inode

● bmap: given inode and byte offset, returns disk block num and offset

● namei: given a path, get the locked inode

● ialloc: assign a new disk inode for a newly created file

● ifree: free an inode (link count 0)

● alloc: allocate a free disk block and return buffer using getblk

● free: free a disk block


initsem(semaphore *sem, int val) {
    *sem = val;
}

void P(semaphore *sem) {
    *sem -= 1;
    if (*sem < 0) sleep;    /* each V() wakes exactly one sleeper, which then proceeds */
}

void V(semaphore *sem) {
    *sem += 1;
    if (*sem <= 0) wakeup thread blocked on sem;
}

boolean_t CP(semaphore *sem) {
    if (*sem > 0) { *sem -= 1; return(TRUE); } else return(FALSE);
}

● Mutex thru Semaphore
    semaphore sem;
    initsem(&sem, 1);
    P(&sem);
    use resource
    V(&sem);

● Event-wait
    semaphore event;
    initsem(&event, 0);
    P(&event);          /* waiting thread blocks here until the event */
    event processing
    V(&event);          /* done by the thread that signals the event */

● Countable Resources
    semaphore counter;
    initsem(&counter, count);
    P(&counter); use resource; V(&counter);
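The countable-resource pattern maps fairly directly onto POSIX unnamed semaphores: sem_init() for initsem(), sem_wait() for P(), sem_post() for V(), and sem_trywait() for the conditional CP(). The sketch below is single-threaded so that its outcome is deterministic, and it assumes a platform where unnamed semaphores are supported (Linux is; some systems are not). The function name run_counting_semaphore is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Takes every free unit without blocking; returns how many units were obtained. */
int run_counting_semaphore(unsigned units) {
    sem_t counter;
    int value = -1, taken = 0;

    if (sem_init(&counter, 0, units) != 0) return -1;   /* initsem(&counter, units) */

    while (sem_trywait(&counter) == 0)   /* CP(): take a unit only if one is free */
        taken++;
    assert(errno == EAGAIN);             /* nothing left: P() would block here */

    sem_post(&counter);                  /* V(): hand one unit back */
    sem_getvalue(&counter, &value);
    assert(value == 1);

    sem_destroy(&counter);
    return taken;
}
```

Note that POSIX semaphores never expose a negative count the way the slide's `*sem` does; the number of waiters is kept internally.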


Have we solved the problem?
● P() and V() must be executed atomically
● In uniprocessor system may disable interrupts
● In multi-processor system, use hardware synchronization primitives
  ● TS, FAA, etc…

● Involves some limited amount of busy waiting
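As an illustration of the test-and-set route, here is a minimal spin lock built on C11's atomic_flag, which is specified as a test-and-set primitive. A lock like this would guard the few instructions inside P() and V(), which is where the "limited amount of busy waiting" comes from. The type and function names (tas_lock, tas_acquire, and so on) are hypothetical, and the demo is single-threaded purely to keep its result deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } tas_lock;

static bool tas_try(tas_lock *l) {
    /* test-and-set returns the previous value: false means we now own the lock */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void tas_acquire(tas_lock *l) {
    while (!tas_try(l))
        ;                                /* busy-wait until the holder releases */
}

static void tas_release(tas_lock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Returns the value of a counter incremented once inside the critical section. */
int run_spinlock_demo(void) {
    tas_lock l = { ATOMIC_FLAG_INIT };
    int counter = 0;

    tas_acquire(&l);
    assert(!tas_try(&l));                /* a second test-and-set sees the flag set */
    counter++;                           /* critical section */
    tas_release(&l);

    assert(tas_try(&l));                 /* free again after release */
    tas_release(&l);
    return counter;
}
```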


Simulation of a monitor with semaphores

typedef int semaphore;
semaphore mutex = 1;

void enter_monitor(void) {
    down(mutex);
}

void leave_normally(void) {
    up(mutex);
}

void leave_with_signal(semaphore c) {
    /* signal on c & exit monitor */
    up(c);
}

void wait(semaphore c) {
    up(mutex);
    down(c);
}


Java Monitors
● void wait(); Enter a monitor's wait set until notified by another thread
● void wait(long timeout); Enter a monitor's wait set until notified by another thread or timeout milliseconds elapse
● void wait(long timeout, int nanos); Enter a monitor's wait set until notified by another thread or timeout milliseconds plus nanos nanoseconds elapse
● void notify(); Wake up one thread waiting in the monitor's wait set. (If no threads are waiting, do nothing.)
● void notifyAll(); Wake up all threads waiting in the monitor's wait set. (If no threads are waiting, do nothing.)


Java (contd)
● Each Java monitor has a single nameless anonymous condition variable on which a thread can wait() or signal one waiting thread with notify() or signal all waiting threads with notifyAll().
● This nameless condition variable corresponds to a lock on the object that must be obtained whenever a thread calls a synchronized method in the object.
  ● Only inside a synchronized method may wait(), notify(), and notifyAll() be called.
● Methods that are static can also be synchronized. There is a lock associated with the class that must be obtained when a static synchronized method is called.


Problems with Semaphores
● Too complex?
  ● Needs low-level atomic op to construct, blocking & unblocking involve context switches, manipulates scheduler and sleep Qs
● Good for resources held for long times, not for short
● Good as V only wakes up if someone can run
● But this can result in convoys
  ● Low priority process P1 that has locked an imp lock (L) preempted by P2 which then waits for L
    – Imp lock: Often log lock in txnal systems
    – P3 also needs L, P4 also, ... all wait
  ● P1 scheduled again (FIFO) & unlocks L
  ● P2 gets lock (P1 preempted), P2 uses lock, then P3, ...
  ● For next upd, P2 goes back to Q again, then P3, P4,...
  ● Lock-unlock: 100's of insts; lock-wait-dispatch-unlock: 1000's


Semaphore Problem?
● 1: T2 (P2) in cs (using a sem): blocks T3; T4 on run Q
● 2: T2 exits cs but active; T3 now gets sem but inactive
● 3: T1 (P1) now wants to enter cs but blocked by T3
  ● T1 blocked even if no one in cs! FIFO property!
  ● T4 scheduled on P1
    – T1 & T3 cannot run unless T2 or T4 giveup
    – Processor 1: T1 -> T1 -> T4
    – Processor 2: T2 -> T2 -> T2
● Problem in step 2:
  ● Have to make sure that T3 does not get sem but on ready Q. T1 will then get sem & no context sw.
  ● Need different semantics: eg: condition variables


Message Passing: Mailboxes, Ports, CSP
Send/Receive; Blocking/non-blocking

#define N 100                           /* number of message slots, as in the sleep/wakeup example */
typedef int message[MSIZE];

void producer(void) {
    int item;
    message m;
    while (TRUE) {
        produce_item(&item);
        receive(consumer, &m);          /* wait for an empty message to arrive */
        build_message(&m, item);
        send(consumer, &m);
    }
}

void consumer(void) {
    int item, i;
    message m;
    for (i = 0; i < N; i++) send(producer, &m);   /* send N empties */
    while (TRUE) {
        receive(producer, &m);
        extract_item(&m, &item);
        send(producer, &m);             /* send back the empty reply */
        consume_item(item);
    }
}


Fork & fork1 in MT processes
● Process with exactly 1 LWP => same semantics as “old Unix” process
● copy all LWPs on fork? Solaris9 but not Posix
  ● one LWP blocked in parent: what about in child? Restart? Concurrent syscalls? EINTR or wait(disk)?
  ● one LWP has open netw cnxn: if closed, unexpected user msg to remote node
  ● one LWP changing a shared data structure: corruption thru the new copy of LWP? How to make a “consistent” copy?
● copy only calling LWP? Fork1: Solaris10; good for exec'ing
  ● some user thrs not on LWPs that were in parent
  ● child process should not try to acq locks held by LWPs not in child (deadlock!) but user code cannot know! these locks may be held by ulib POSIX


fork1

fork1(): only calling LWP created in child

   registration of fork_handlers (_atfork)

       prepare: prior to fork in the ctxt of calling LWP. LIFO

       parent: after fork. FIFO

       child: after fork in context of 1 thr in child. FIFO

       LIFO/FIFO order to enable preserving of locking order

          int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));

   handles orphaned mutexes

      prepare fork handlers lock all mutexes (by calling thr)

      parent/child fork handlers unlock mutexes

   indep libs & appl progs can protect themselves

      lib provides fork handlers
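The handler scheme above can be sketched for a library that owns a single mutex. In this illustration lib_lock stands for a hypothetical library-internal lock, and the handler names are likewise made up; the child simply reports, through its exit status, whether it was able to take the lock after the fork.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;  /* hypothetical library lock */
static int registered;

static void prepare(void)   { pthread_mutex_lock(&lib_lock); }    /* before fork, in caller */
static void in_parent(void) { pthread_mutex_unlock(&lib_lock); }  /* after fork, in parent */
static void in_child(void)  { pthread_mutex_unlock(&lib_lock); }  /* after fork, in child */

/* Forks once; returns the child's exit status (0 if the child could take lib_lock). */
int run_atfork_demo(void) {
    if (!registered) {                   /* register only once: handlers accumulate */
        if (pthread_atfork(prepare, in_parent, in_child) != 0) return -1;
        registered = 1;
    }

    pid_t pid = fork();                  /* prepare() runs first, then the two unlocks */
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: the inherited copy of the lock was released by in_child(). */
        _exit(pthread_mutex_trylock(&lib_lock) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) return -1;
    assert(pthread_mutex_trylock(&lib_lock) == 0);   /* parent side is unlocked too */
    pthread_mutex_unlock(&lib_lock);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since prepare() must acquire the lock before fork() proceeds, the fork cannot happen while another thread is in the middle of an update protected by lib_lock, so the child's copy of the protected data is consistent.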


Fork and threads

Thr A                     Thr B        Thr B_ch
locks mutex
modifies shared data
                          fork
                                       copy of locked mutex and
                                       inconsistent data struct.
                                       /* cannot drop mutex as
                                          data inconsistent nor can
                                          it take mutex: deadlock */
                                       /* memory leaks also! */


Solutions?
● programs that use fork() call an exec function soon afterwards in child process, thus resetting all states
  ● In the meantime, only a short list of async-signal-safe library routines are promised to be available
● But not good wrt multi-threaded (MT) libraries.
  ● Applications may not be aware that a MT library is in use, and feel free to call any number of library routines between the fork() and exec calls. They may be extant 1-threaded programs that cannot be expected to obey new restrictions imposed by the threads library.

● A MT library needs a way to protect its internal state during fork() in case it is re-entered later in the child process. eg. MT I/O libraries, which are invoked between the fork() and exec calls to effect I/O redirection.


Fork handling
● Lock global mutexes
  ● Other threads locked out of the critical regions of code protected by these mutexes
  ● Can take snapshot: copy of valid, stable data
● Reset synchr objects in the child process
  ● ensures they are properly cleansed of any artifacts from the threading subsystem of parent process
  ● eg. a mutex may inherit a wait queue of threads waiting for the lock; this wait queue makes no sense in child. Initialize mutex to remedy (deletes unnecessary data structures in child). Otherwise memory leaks!

● But how to correct or otherwise deal with the inconsistent state in the child?


With pthread_atfork: no orphaned locks!

prepare: lock(mutex)

parent: unlock(mutex)

child: unlock(mutex)

Thr A                        Thr B                             ThrB_ch
locks mutex
modifies shared data
                             attempt fork but blocked as
                             prepare (lock mutex) blocked
drops mutex (shared data
now consistent!)
                             prepare succeeds (locks mutex)
                             fork completes
                             unlocks mutex (parent)            unlock mutex (child)


Solutions

0: pthread_atfork: provides MT libraries with a means to protect themselves from innocent appls that call fork(), and provides MT appls with a std mech for protecting themselves from fork() calls in a lib routine or the appl itself. But COMPLEX!!! Avoid problems by

1: If possible, fork before creating any threads

2: Instead of fork, create a new thread. If forking to exec a binary, can attempt to convert binary to a shared lib that can be linked to.

3: Try a surrogate parent method. Fork at init time; the child will be a "surrogate" parent that will remain 1-threaded. When exec is needed, child is informed and it does a fork/exec


Posix Model of Concurrency
● Creation
  ● pthread_create(tp, attrp, fptr, argp)
  ● pthread_attr_xxx(): manipulate attr of a thread
    – Init/destroy; set/get detachstate, inheritsched, schedparam, schedpolicy, scope, stackaddr, stacksize
● Exit
  ● pthread_exit(retvalp)
  ● pthread_join(t, **v): wait for another thread termination
  ● pthread_detach(t): storage for thread can be reclaimed when thread terminates (no zombie)
● Thread Specific Data (indexed by key)
  ● pthread_key_create(keyp, fpdestructor)/_delete()
  ● pthread_setspecific()/_getspecific() mapping betw key and thread


● Signal: pthread_sigmask(how, newmask, saveprev): change signal mask for calling thread
  ● pthread_kill(t, sig)
  ● sigwait: suspend thr till sig
● ID: pthread_self()
  ● pthread_equal(t1, t2)
  ● pthread_once(once?, fptr): ensure some init at most once
● Scheduling
  ● pthread_setschedparam()/_getschedparam()
● Cancellation (cancellation pts: _join, _cond_wait, _cond_timedwait, sem_wait, sigwait, _testcancel)
  ● pthread_cancel(t) by others / pthread_testcancel(void) by self
  ● pthread_setcancelstate()/type()
  ● pthread_cleanup_pop()/_push(): if a thread exits or cancelled (with locked mutexes?), cleanup handlers executed; LIFO order
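A few of the calls listed so far (pthread_create, pthread_join, pthread_once) fit together as in the following sketch. It is only an illustration: worker, init_once, and run_threads are hypothetical names, and small integers are passed through the void * argument and return value for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_once_t once = PTHREAD_ONCE_INIT;
static int init_calls;

static void init_once(void) { init_calls++; }     /* runs at most once per process */

static void *worker(void *arg) {
    pthread_once(&once, init_once);               /* every thread asks, one performs it */
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);                       /* equivalent to pthread_exit(...) */
}

/* Starts up to 8 workers numbered 1..nthreads; returns the sum of their squares. */
long run_threads(int nthreads) {
    pthread_t t[8];
    long sum = 0;

    if (nthreads > 8) nthreads = 8;
    for (intptr_t i = 0; i < nthreads; i++)
        if (pthread_create(&t[i], NULL, worker, (void *)(i + 1)) != 0) return -1;

    for (int i = 0; i < nthreads; i++) {
        void *ret;
        pthread_join(t[i], &ret);                 /* waits and collects the exit value */
        sum += (long)(intptr_t)ret;
    }
    assert(init_calls == 1);
    return sum;
}
```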


● Mutex
  ● pthread_mutex_init()/_destroy()
  ● pthread_mutexattr_xxx()
    – Init/destroy; set/get pshared, protocol, prioceiling
  ● pthread_mutex_setprioceiling()/_getprioceiling()
  ● pthread_mutex_lock()/_trylock()/_unlock()
● Condition Variable
  ● pthread_cond_init()/_destroy()
  ● pthread_condattr_xxx()
    – Init/destroy; set/get pshared
  ● pthread_cond_wait()/_timedwait()
  ● pthread_cond_signal()
  ● pthread_cond_broadcast()


Condition variables

int x, y;
pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

// (waiter) Wait until x is greater than y
pthread_mutex_lock(&mut);
while (x <= y) pthread_cond_wait(&cond, &mut);
/* operate on x and y */
pthread_mutex_unlock(&mut);

// (signaller) Signal if modifications on x and y such that x > y
pthread_mutex_lock(&mut);
/* modify x and y */
if (x > y) pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mut);


// (waiter) if timeout also
struct timeval now;
struct timespec timeout;
int retcode;

pthread_mutex_lock(&mut);
gettimeofday(&now, NULL);               /* second argument (timezone) is required */
timeout.tv_sec = now.tv_sec + 5;        /* absolute deadline: 5 seconds from now */
timeout.tv_nsec = now.tv_usec * 1000;
retcode = 0;
while (x <= y && retcode != ETIMEDOUT)
    retcode = pthread_cond_timedwait(&cond, &mut, &timeout);
if (retcode == ETIMEDOUT) { /* timeout occurred */ }
else { /* operate on x and y */ }
pthread_mutex_unlock(&mut);
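The same timed wait can be packaged as a self-contained function. This variant is a sketch that swaps in clock_gettime(CLOCK_REALTIME) for gettimeofday() and takes the timeout in milliseconds; the names wait_for_x_gt_y and set_xy are hypothetical. The deadline is absolute, so it is computed once before the loop and reused across spurious wakeups.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int x, y;

/* (signaller) Updates x and y, broadcasting if the waiters' condition now holds. */
void set_xy(int nx, int ny) {
    pthread_mutex_lock(&mut);
    x = nx;
    y = ny;
    if (x > y) pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mut);
}

/* (waiter) Waits up to ms milliseconds for x > y; returns 0, or an error such as ETIMEDOUT. */
int wait_for_x_gt_y(long ms) {
    struct timespec deadline;
    int rc = 0;

    clock_gettime(CLOCK_REALTIME, &deadline);    /* default clock of a condition variable */
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {       /* keep tv_nsec below one second */
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mut);
    while (x <= y && rc == 0)                    /* stop on timeout or any other error */
        rc = pthread_cond_timedwait(&cond, &mut, &deadline);
    if (x > y) rc = 0;                           /* the condition wins over a late timeout */
    pthread_mutex_unlock(&mut);
    return rc;
}
```

For example, with x and y both 0, `wait_for_x_gt_y(50)` gives up after about 50 ms, while after `set_xy(2, 1)` the same call returns immediately.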


● Semaphore
  ● sem_init()/_destroy()
  ● sem_open()/_close()
  ● sem_wait()/_trywait()
  ● sem_post()
  ● sem_getvalue()
  ● sem_unlink()
● fork() Clean Up Handling
  ● pthread_atfork()
● Async safe? Some pthread calls not safe to call from sig handlers
  ● A user thr lib may have taken a lock to ensure, say, that only one user changing Qs. If pthread_mutex_lock, etc, may deadlock


Signals:
● oldest ipc method used by UNIX systems to signal asynchronous events. ONLY 1BIT INFO!
● can be generated by a keyboard interrupt or an error condition or by other processes in the system (if they have the correct privileges)
  ● kernel & superuser can send a signal to any process
  ● a process can also send a signal to other processes with same uid/gid
● Processes can handle signals themselves or allow kernel to handle
  ● If kernel handles the signal, default action for the signal: eg, SIGFPE causes core dump and causes the process to exit
  ● SIGSTOP (causes a process to halt its execution) and SIGKILL handled only by kernel
● List of signals on a Linux/Intel machine: SIGHUP SIGINT SIGQUIT SIGILL SIGTRAP SIGIOT SIGBUS SIGFPE SIGKILL SIGUSR1 SIGSEGV SIGUSR2 SIGPIPE SIGALRM SIGTERM SIGCHLD SIGCONT SIGSTOP SIGTSTP SIGTTIN SIGTTOU SIGURG SIGXCPU SIGXFSZ SIGVTALRM SIGPROF SIGWINCH SIGIO SIGPWR


Signals (cont’d)

● void (*signal(int signo, void (*func)(int)))(int) =
  ● typedef void Sigfunc(int); Sigfunc *signal(int, Sigfunc *)
  ● signal is a func that returns a ptr to a func that returns void (the previous signal handler)
  ● Or, sighandler_t signal(int signum, sighandler_t handler);
● Linux implements signals using information stored in task_struct of process:
  ● struct sigpending pending: currently pending signals
  ● blocked: mask of blocked signals
  ● struct signal_struct *sig has array of sigactions that holds info about how the process handles each signal
● Signals generated by setting appropriate bit in signal field of pending. If not blocked, scheduler will run handler in the next system scheduling.

● Every time a process exits from a system call, the signal and blocked fields are checked, and if there is any unblocked signal, the handler is called.


#include <signal.h>
#include <stdio.h>
#include <unistd.h>
/* err_sys() and err_dump() are error helpers defined elsewhere, as in Stevens' APUE */

static void sig_usr(int); /* one handler for both signals */

int main(void) {
    if (signal(SIGUSR1, sig_usr) == SIG_ERR)
        err_sys("can't catch SIGUSR1");
    if (signal(SIGUSR2, sig_usr) == SIG_ERR)
        err_sys("can't catch SIGUSR2");
    for ( ; ; )
        pause();
}

static void sig_usr(int signo) { /* argument is signal number */
    if (signo == SIGUSR1)
        printf("received SIGUSR1\n");
    else if (signo == SIGUSR2)
        printf("received SIGUSR2\n");
    else
        err_dump("received signal %d\n", signo);
    return;
}


No Qing for non-real time signals!

#include <signal.h>

main() {
    int childPid, i;
    void SigIntHandler();

    sigblock(sigmask(SIGINT));
    signal(SIGINT, SigIntHandler);
    childPid = fork();
    if (childPid > 0) { /* parent */
        for (i = 0; i < 10; i++) kill(childPid, SIGINT);
        printf("Parent has issued %d signals to the child\n", i);
    } else { /* child */
        sleep(2); /* sleep for 2 secs so that signals overwritten */
        while (1) sigpause(0);
    }
}

void SigIntHandler(int signo) {
    printf("Child : received a signal\n");
}


signal: V7, SVR2/3/4 (handler uninstalled, no blocking of signals, no autostart of interrupted system calls)

sigset, sighold, sigrelse, sigignore, sigpause: SVR3/4 (no autostart)

signal, sigvec, sigblock, sigsetmask (unblock a signal), sigpause: 4.x BSD (autostart 4.2; default 4.3/4.4)

sigaction, sigprocmask, sigpending, sigsuspend: autostart unspecified (POSIX.1), optional (SVR4, 4.3/4.4BSD, Linux)
    sigprocmask: change the list of currently blocked signals
    sigpending: allows examination of pending signals (ones which have been raised while blocked)
    sigsuspend: replaces the signal mask with the given one & suspends process until a signal arrives

int sigaction(int signo, const struct sigaction *act, struct sigaction *oact);

struct sigaction {
    void (*sa_handler)();
    sigset_t sa_mask;   /* addl signals to block */
    int sa_flags;       /* restart?, alt stack?, waitchild?, uninstall handler? ... */
};


Unreliable signals

old V7 code: race with a new signal for process before signal reinstalled

int sig_int();
...
signal(SIGINT, sig_int);
...
sig_int() {
    /* another signal can come here! can cause default action */
    signal(SIGINT, sig_int);
    ...
}


Another race

int sig_int_flag;

main() {
    int sig_int();
    ...
    signal(SIGINT, sig_int);
    ...
    while (sig_int_flag == 0)
        /* signal can come here! */
        pause();
    ...
}

sig_int() {
    signal(SIGINT, sig_int);
    sig_int_flag = 1;
}


int sighold(int sig); int sigrelse(int sig);        SysV

sighold(SIGQUIT); sighold(SIGINT);
/* critical section */
sigrelse(SIGINT); sigrelse(SIGQUIT);

int sig_int_flag;

main() {
    int sig_int();
    ...
    signal(SIGINT, sig_int);
    ...
    sighold(SIGINT);
    while (sig_int_flag == 0)
        sigpause(SIGINT);   /* atomically release signal and pause: wait for a signal to occur */
    ...
}

...


Restarting of interrupted system calls by signals        4.3BSD
Can only call reentrant functions within signal handlers

int oldmask; /* SIGQUIT: quit key + core image; SIGINT: interrupt key ^C */
oldmask = sigblock(sigmask(SIGQUIT) | sigmask(SIGINT)); /* block SIGQUIT/INT */
/* critical section */
sigsetmask(oldmask); /* reset to old mask */

int sig_int_flag;

main() {
    int sig_int();
    ...
    signal(SIGINT, sig_int);
    ...
    sigblock(sigmask(SIGINT)); /* sigblock returns mask before */
    while (sig_int_flag == 0)
        sigpause(0); /* wait for signal to occur */
    /* sigpause(0) is not the same as sigsetmask(0) + pause(), as a signal can arrive in between */
    /* process signal... */
    ...
}


Executing Signal Handlers in Linux
● On signal (either from kernel or another process), ker checks some conditions (disp, etc) before calling do_signal
● do_signal in kernel while (user) signal handler in user mode
● After signal handler run, kernel code executed further
● However, ker stack no longer contains hw context of interrupted program as ker stack emptied on user mode
● Also, sig handlers can reenter kernel (syscalls, etc.)
● Solution: copy hw context saved in ker stack to user stack of curr process

● When sig handler terminates, sigreturn syscall automatically invoked to copy hw context back to kernel stack & restore the user stack

● Sigframe struct pushed on stack has some code for calling sigreturn: stack has to be executable!!!


pselect
● int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *utimeout);
● int pselect(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *ntimeout, const sigset_t *sigmask);
● pselect() used to wait for a signal as well as data from a fd
  ● Programs that receive signals as events normally use the signal handler only to raise a global flag.
  ● The global flag indicates that the event must be processed in the main loop of the program.
  ● A signal will cause the select()/pselect() to return with errno set to EINTR. This behavior is essential so that signals can be processed in the main loop of the program, otherwise select() would block indefinitely.


Race condition

● Somewhere in the main loop, a conditional checks the global flag.

● What if a signal arrives after the conditional, but before the select() call?

● select() would block indefinitely, even though an event is actually pending.
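One way to close this window is to keep the signal blocked while the flag is tested and let pselect() unblock it atomically for the duration of the wait. A minimal sketch under that assumption; `event_pending`, `on_event` and `wait_for_event` are hypothetical names, and the `raise()` call only stands in for a signal that arrives between the test and the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t event_pending = 0;

static void on_event(int signo) {
    (void)signo;
    event_pending = 1;
}

/* Returns 1 if pselect() was interrupted by the signal and the flag is set,
 * 0 if it timed out instead, and -1 on a setup error. */
int wait_for_event(void) {
    struct sigaction sa;
    sigset_t block, orig, during_wait;
    struct timespec timeout;
    int rc = 0, result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_event;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Block SIGUSR1 so it cannot be delivered between the test and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) != 0)
        return -1;
    during_wait = orig;
    sigdelset(&during_wait, SIGUSR1);   /* mask in force only inside pselect() */

    timeout.tv_sec = 5;
    timeout.tv_nsec = 0;

    event_pending = 0;
    if (!event_pending) {               /* the conditional in the main loop */
        raise(SIGUSR1);                 /* arrives "too late", but stays pending */
        rc = pselect(0, NULL, NULL, NULL, &timeout, &during_wait);
    }
    result = (rc == -1 && errno == EINTR && event_pending) ? 1 : 0;

    sigprocmask(SIG_SETMASK, &orig, NULL);  /* restore the caller's mask */
    return result;
}
```

With plain select() and an unblocked signal, the handler could run before the wait starts and the wait would then sit out its full timeout; here the pending signal interrupts pselect() immediately.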


pselect example

● udp broadcast


Linux Concurrency Model

● Within an application: clones (incl. the threads & processes of other systems)
● Inside the kernel:
  ● Kernel threads: do not have USER context
  ● Deferrable and interruptible kernel functions:
    – Softirq: reentrant: multiple softirqs of the same type can be run concurrently on several CPUs.
      ● No dynamic allocation! Have to be statically defined at compile time.
    – Tasklet: multiple tasklets of the same type cannot run concurrently on several CPUs.
      ● Dynamic allocation OK! Can be allocated and initialized at run time (loadable modules). Implemented through softirqs
    – Bottom Half: multiple bottom halves cannot be run concurrently on several CPUs. No dynamic allocation!
      ● Implemented through tasklets
● Across HW: IPI (inter-processor interrupts)


Spinlocks & Semaphores

● Shared data between different parts of code in the kernel
  ● most common: access to data structures shared between user process context and interrupt context
● In a uniprocessor system: mutual exclusion by disabling and enabling interrupts (+ saving/restoring flags)
● SMP: three types of spinlocks: vanilla (basic), read-write, big-reader
  ● Read-write spinlocks when there are many readers and few writers
    – Eg: access to the list of registered filesystems.
  ● Big-reader spinlocks: a form of read-write spinlocks optimized for very light read access, with a penalty for writes
    – limited number of big-reader spinlock users.
    – used in the networking part of the kernel.
● Semaphores: two types, basic and read-write semaphores. Different from IPC's
  ● Mutex or counting; up() & down(); interruptible/non-interruptible


Spinlocks (cont'd)

● A good example of using spinlocks: accessing a data structure shared between a user context and an interrupt handler

spinlock_t my_lock = SPIN_LOCK_UNLOCKED;

my_ioctl() {                     // _ioctl: definitely process context!
    spin_lock_irq(&my_lock);     // and known that interrupts are enabled!
    /* critical section */       // hence, _irq to disable interrupts
    spin_unlock_irq(&my_lock);
}

my_irq_handler() {               // _irq_handler: definitely system (or interrupt) context
    spin_lock(&my_lock);         // & hence known that interrupts are disabled!
    /* critical section */       // can use the simpler lock
    spin_unlock(&my_lock);
}

● spin_lock: if interrupts are disabled or there is no race with interrupt context
● spin_lock_irq: if interrupts are enabled and have to be disabled
● spin_lock_irqsave: if the interrupt state is not known
● Basic premise of a spin lock: one thread busy-waits on a resource on one processor while another thread uses it on another processor (only true for MP). But the code has to work for 1 or more processors. If all threads run on 1 processor and a thread tries to take a spin lock that is already held by another thread: deadlock.
● Never give up the CPU when holding a spinlock!
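The busy-wait premise can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's implementation; `toy_spinlock`, its functions and `spin_demo` are hypothetical names, and nothing here touches interrupts.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;
} toy_spinlock;

/* One acquisition attempt: true if the lock was free and is now ours. */
bool toy_spin_trylock(toy_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-waits until the lock is released by whoever holds it. */
void toy_spin_lock(toy_spinlock *l) {
    while (!toy_spin_trylock(l))
        ;   /* spin */
}

void toy_spin_unlock(toy_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Single-threaded walk-through; returns a bitmask of the checks that held. */
int spin_demo(void) {
    toy_spinlock l = { ATOMIC_FLAG_INIT };
    int score = 0;

    toy_spin_lock(&l);              /* uncontended: returns at once */
    if (!toy_spin_trylock(&l))      /* a second taker on the same CPU fails; */
        score |= 1;                 /* toy_spin_lock() here would spin forever */
    toy_spin_unlock(&l);
    if (toy_spin_trylock(&l))       /* free again after the unlock */
        score |= 2;
    toy_spin_unlock(&l);
    return score;
}
```

The failed second attempt is the single-processor deadlock from the last bullets: if the holder cannot run, nobody ever clears the flag, which is also why the holder must never give up the CPU.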


Linux 2.4 buffer cache

struct buffer_head * getblk(kdev_t dev, int block, int size) {
    for (;;) {
        struct buffer_head * bh;

        bh = get_hash_table(dev, block, size);   /* already in the cache? */
        if (bh)
            return bh;

        if (!grow_buffers(dev, block, size))     /* else try to add buffers; */
            free_more_memory();                  /* if that fails, reclaim and retry */
    }
}


struct buffer_head * get_hash_table(kdev_t dev, int block, int size) {
    struct buffer_head *bh, **p = &hash(dev, block);

    read_lock(&hash_table_lock);
    for (;;) {
        bh = *p;
        if (!bh) break;
        p = &bh->b_next;
        if (bh->b_blocknr != block) continue;
        if (bh->b_size != size) continue;
        if (bh->b_dev != dev) continue;
        get_bh(bh);
        break;
    }
    read_unlock(&hash_table_lock);
    return bh;
}

static inline void get_bh(struct buffer_head * bh) {
    atomic_inc(&(bh)->b_count);
}

#define hash(dev,block) hash_table[(_hashfn(HASHDEV(dev),block) & bh_hash_mask)]
#define HASHDEV(dev) ((unsigned int) (dev))

bh_hash_mask = (nr_hash - 1)

Lock hierarchy: lru_list_lock > hash_table_lock > unused_list_lock
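The lookup is a masked hash followed by a walk along one bucket chain. A user-space sketch of that pattern, with locking left out; `struct buf`, `hashfn`, `NR_HASH` and the helper functions are hypothetical stand-ins (in particular, `hashfn` is not the kernel's `_hashfn`), and the mask only works because the table size is a power of two.

```c
#include <stddef.h>

#define NR_HASH 8u                     /* hypothetical size; must be a power of two */

struct buf {
    unsigned int dev;
    int blocknr;
    int size;
    int count;                         /* reference count, like b_count */
    struct buf *next;                  /* bucket chain, like b_next */
};

static struct buf *hash_table[NR_HASH];
static const unsigned int hash_mask = NR_HASH - 1;

/* Hypothetical hash function, chosen only for this illustration. */
static unsigned int hashfn(unsigned int dev, int block) {
    return dev ^ (unsigned int)block;
}

void insert_buf(struct buf *b) {
    struct buf **head = &hash_table[hashfn(b->dev, b->blocknr) & hash_mask];
    b->next = *head;
    *head = b;
}

/* Walks one chain; takes a reference on a match, returns NULL on a miss. */
struct buf *lookup_buf(unsigned int dev, int block, int size) {
    struct buf *b = hash_table[hashfn(dev, block) & hash_mask];
    for (; b != NULL; b = b->next) {
        if (b->blocknr != block) continue;
        if (b->size != size) continue;
        if (b->dev != dev) continue;
        b->count++;
        return b;
    }
    return NULL;
}

/* Returns a bitmask of the checks that held. */
int lookup_demo(void) {
    struct buf a = { 1u, 5, 1024, 0, NULL };
    struct buf b = { 1u, 13, 1024, 0, NULL };   /* lands in the same bucket as a */
    unsigned int i;
    int score = 0;

    for (i = 0; i < NR_HASH; i++) hash_table[i] = NULL;
    insert_buf(&a);
    insert_buf(&b);
    if (lookup_buf(1u, 5, 1024) == &a && a.count == 1) score |= 1;  /* hit behind b */
    if (lookup_buf(1u, 13, 1024) == &b) score |= 2;                 /* hit at chain head */
    if (lookup_buf(1u, 5, 512) == NULL) score |= 4;                 /* size mismatch */
    if (lookup_buf(2u, 5, 1024) == NULL) score |= 8;                /* empty bucket */
    for (i = 0; i < NR_HASH; i++) hash_table[i] = NULL;             /* drop pointers to locals */
    return score;
}
```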


Linux down

static inline void down(struct semaphore * sem) {
    __asm__ __volatile__(
        "# atomic down operation\n\t"
        LOCK "decl %0\n\t"           /* --sem->count */
        "js 2f\n"
        "1:\n"
        LOCK_SECTION_START("")
        "2:\tcall __down_failed\n\t"
        "jmp 1b\n"
        LOCK_SECTION_END
        :"=m" (sem->count)
        :"c" (sem)
        :"memory");
}


asm( ".text\n" ".align 4\n"

".globl __down_failed\n"

"__down_failed:\n\t"

#if defined(CONFIG_FRAME_POINTER)

"pushl %ebp\n\t" "movl %esp,%ebp\n\t"

#endif

"pushl %eax\n\t"

"pushl %edx\n\t"

"pushl %ecx\n\t"

"call __down\n\t"

"popl %ecx\n\t"

"popl %edx\n\t"

"popl %eax\n\t"

#if defined(CONFIG_FRAME_POINTER)

"movl %ebp,%esp\n\t" "popl %ebp\n\t"

#endif

"ret" );


void __down(struct semaphore * sem) {
    struct task_struct *tsk = current;
    DECLARE_WAITQUEUE(wait, tsk);
    tsk->state = TASK_UNINTERRUPTIBLE;
    add_wait_queue_exclusive(&sem->wait, &wait);

    spin_lock_irq(&semaphore_lock);
    sem->sleepers++;
    for (;;) {
        int sleepers = sem->sleepers;

        /* Add "everybody else" into it. They aren't
         * playing, because we own the spinlock. */
        if (!atomic_add_negative(sleepers - 1, &sem->count)) {
            sem->sleepers = 0;
            break;
        }
        sem->sleepers = 1;      /* us - see -1 above */
        spin_unlock_irq(&semaphore_lock);

        schedule();
        tsk->state = TASK_UNINTERRUPTIBLE;
        spin_lock_irq(&semaphore_lock);
    }
    spin_unlock_irq(&semaphore_lock);
    remove_wait_queue(&sem->wait, &wait);
    tsk->state = TASK_RUNNING;
    wake_up(&sem->wait);
}


Analysis

● Semaphore open: count=1, sleepers=0: down makes count 0; __down not executed
● Semaphore closed & no sleeping processes: count=0, sleepers=0 => count -1 & sleepers 1
  ● Each iteration checks if count is negative
    – Negative: schedule() & check again
    – Otherwise: sleepers=0; wakeup another (but Q empty)
● Semaphore closed & other sleeping processes: count, sleepers (-1,1) => (-2, 1)
  ● Sleepers temporarily 2, count becomes -1 again
    – Checks if count is still negative, as the holding process may V
      ● If negative: schedule
      ● Not negative:
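The three cases can be checked by replaying the count/sleepers arithmetic of down() and __down() in a toy, single-threaded model. This is a sketch of the bookkeeping only: there is no locking, no wait queue and no scheduler, and `sem_state`, `down_result`, `model_down` and `model_wakeup_retry` are hypothetical names.

```c
struct sem_state {
    int count;
    int sleepers;
};

struct down_result {
    struct sem_state after;   /* state once the caller owns the semaphore or is about to sleep */
    int slow_path;            /* 1 if __down would have been entered */
    int must_sleep;           /* 1 if the caller would call schedule() */
};

/* One pass of the __down loop body. */
static void loop_pass(struct down_result *r) {
    /* models atomic_add_negative(sleepers - 1, &sem->count) */
    r->after.count += r->after.sleepers - 1;
    if (r->after.count < 0) {
        r->after.sleepers = 1;        /* us */
        r->must_sleep = 1;
    } else {
        r->after.sleepers = 0;
        r->must_sleep = 0;
    }
}

/* A fresh down(): fast-path decrement, then the first slow-path pass if needed. */
struct down_result model_down(struct sem_state s) {
    struct down_result r;

    r.after = s;
    r.slow_path = 0;
    r.must_sleep = 0;

    r.after.count--;                  /* LOCK decl */
    if (r.after.count >= 0)
        return r;                     /* semaphore acquired on the fast path */

    r.slow_path = 1;
    r.after.sleepers++;               /* sem->sleepers++ on entry to __down */
    loop_pass(&r);
    return r;
}

/* A sleeper re-running the loop after schedule() returns. */
struct down_result model_wakeup_retry(struct sem_state s) {
    struct down_result r;

    r.after = s;
    r.slow_path = 1;
    r.must_sleep = 0;
    loop_pass(&r);
    return r;
}
```

Starting from (count, sleepers) = (1,0), (0,0) and (-1,1), `model_down` ends in (0,0), (-1,1) and (-1,1) respectively, matching the bullets; after the holder's up() turns (-1,1) into (0,1), `model_wakeup_retry` yields (0,0) with no further sleep.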


#define DECLARE_WAITQUEUE(name, tsk) \
    wait_queue_t name = __WAITQUEUE_INITIALIZER(name, tsk)

#define __WAITQUEUE_INITIALIZER(name, tsk) { \
    task: tsk, \
    task_list: { NULL, NULL }, \
    __WAITQUEUE_DEBUG_INIT(name) }

static spinlock_t semaphore_lock = SPIN_LOCK_UNLOCKED;

struct semaphore {
    atomic_t count;
    int sleepers;
    wait_queue_head_t wait;
#if WAITQUEUE_DEBUG
    long __magic;
#endif
};

#define spin_lock_irq(lock) do { local_irq_disable(); spin_lock(lock); } while (0)
#define local_irq_disable() __cli()

struct __wait_queue {
    unsigned int flags;
#define WQ_FLAG_EXCLUSIVE 0x01
    struct task_struct * task;
    struct list_head task_list;
#if WAITQUEUE_DEBUG
    long __magic;
    long __waker;
#endif
};
typedef struct __wait_queue wait_queue_t;


static inline void up(struct semaphore * sem) {

__asm__ __volatile__( "# atomic up operation\n\t"

LOCK "incl %0\n\t" /* ++sem->count */

"jle 2f\n"

"1:\n"

LOCK_SECTION_START("")

"2:\tcall __up_wakeup\n\t"

"jmp 1b\n"

LOCK_SECTION_END

".subsection 0\n"

:"=m" (sem->count)

:"c" (sem)

:"memory");

}


asm(

".text\n" ".align 4\n"

".globl __up_wakeup\n"

"__up_wakeup:\n\t"

"pushl %eax\n\t"

"pushl %edx\n\t"

"pushl %ecx\n\t"

"call __up\n\t"

"popl %ecx\n\t"

"popl %edx\n\t"

"popl %eax\n\t"

"ret");

#define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE |TASK_INTERRUPTIBLE, 1)

void __up(struct semaphore *sem) {

wake_up(&sem->wait);

}


void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr_exclusive) {

unsigned long flags;

if (unlikely(!q)) return;

spin_lock_irqsave(&q->lock, flags);

__wake_up_common(q, mode, nr_exclusive, 0);

spin_unlock_irqrestore(&q->lock, flags);

}

static inline void __wake_up_common(wait_queue_head_t *q, unsigned int mode, int nr_exclusive, int sync) {

struct list_head *tmp; unsigned int state; wait_queue_t *curr; task_t *p;

list_for_each(tmp, &q->task_list) {

curr = list_entry(tmp, wait_queue_t, task_list);

p = curr->task;

state = p->state;

if ((state & mode) && try_to_wake_up(p, sync) && ((curr->flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)) break;

}

}
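The break condition is the subtle part of this loop: every matching waiter is woken, but the walk stops once `nr_exclusive` exclusive waiters have been woken. A toy model of just that rule, in which `toy_waiter`, `toy_wake_up` and `wake_demo` are hypothetical names and a `sleeping` field stands in for both the state/mode test and a successful try_to_wake_up():

```c
#define WQ_FLAG_EXCLUSIVE 0x01

struct toy_waiter {
    unsigned int flags;    /* WQ_FLAG_EXCLUSIVE or 0 */
    int sleeping;          /* stands in for (state & mode) */
};

/* Walks the queue in order and returns how many waiters were woken. */
int toy_wake_up(struct toy_waiter *q, int n, int nr_exclusive) {
    int woken = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (!q[i].sleeping)
            continue;
        q[i].sleeping = 0;             /* stands in for try_to_wake_up() succeeding */
        woken++;
        /* nr_exclusive is only decremented for exclusive waiters */
        if ((q[i].flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
    return woken;
}

/* Queue layout: non-exclusive, exclusive, exclusive, non-exclusive. */
int wake_demo(int nr_exclusive) {
    struct toy_waiter q[4] = {
        { 0, 1 },
        { WQ_FLAG_EXCLUSIVE, 1 },
        { WQ_FLAG_EXCLUSIVE, 1 },
        { 0, 1 },
    };
    return toy_wake_up(q, 4, nr_exclusive);
}
```

With `nr_exclusive` of 1 (the value wake_up passes) the walk stops at the first exclusive waiter; a value of 0 never reaches zero by decrementing, so every waiter is woken.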


Process Tree

● Init reads /etc/inittab
● Opens tty
● Fd 0,1,2 set to dev
● Login printed
● Read user name
● Initial env set (-p: add to existing env; envp: TERM, etc.)
● uid, gid=0
● execle("/bin/login", "login", "-p", username, (char *)0, envp)
● getpwnam (get password file entry); getpass (get a password); use crypt/md5 to validate pwd
● Fail: login calls exit(1); noticed by init; respawn action
● Success: chdir; chown for terminal device; setgid; initgroups; initenv (HOME, SHELL, USER, PATH, ...)
● Setuid; then execl("/bin/sh", "-sh", (char *)0) (2nd arg: login shell)

# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
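The failure path above (login calls exit(1), init notices) rests on fork, exec and wait. A minimal user-space sketch of that mechanism, assuming /bin/sh exists; `spawn_and_reap` is a hypothetical helper, and a plain `sh -c` child stands in for getty/login rather than a real login shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that execs /bin/sh to run `script`, then waits for it.
 * Returns the child's exit status as seen by the parent, or -1 on error. */
int spawn_and_reap(const char *script) {
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image, as getty does with login. */
        execl("/bin/sh", "sh", "-c", script, (char *)0);
        _exit(127);                    /* only reached if exec failed */
    }
    /* Parent: this is how an init-like process notices the exit. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A respawning parent would call this in a loop and start a fresh child each time it returns, which is what the `respawn` action in the inittab lines asks init to do.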


[Figure: terminal login. init forks one child per tty; the child init execs getty, which execs login. Thru getty/login, the login shell ends up with fds 0,1,2 on the term dev driver, which the user reaches over an RS-232 cnxn.]


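[Figure: network login. init does a fork/exec of /bin/sh that executes the /etc/rc script, which starts inetd; on a telnet req, inetd forks and the child execs telnetd. Thru inetd, telnetd, login, the login shell ends up with fds 0,1,2 on the term dev driver, which the user reaches over a netw cnxn.]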


Network Logins

● Terminal device driver thru, say, RS232
  ● Shell (fd 0,1,2): user level
  ● Kernel level:
    – Terminal line discipline (echo chars, assemble chars to lines, bs, C-u, gen SIGINT/SIGQUIT, C-S, C-Q, newline (CR+LF), ...)
    – terminal device driver
● Network login: similar to terminal login
  ● init, inetd, telnetd/sshd, login
  ● Pseudo-terminal device driver
    ● a pseudo-terminal is a special IPC that acts like a terminal
    ● data written to the master side is received by the slave side as if it was the result of a user typing at an ordinary terminal & vice versa
  ● Netw cnxn thru telnetd/sshd server & telnet/ssh client


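[Figure: pseudo-terminal data path for a network login. rlogind (fork; exec, exec to reach the login shell) connects to the network thru TCP/IP and the netw dev driver, and to the pty master. The login shell's stdin and stdout/err go thru the term disc to the pty slave. TCP/IP, the netw dev driver, the pty master/slave and the term disc are in the KERNEL.]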

