Processes, Threads, Synchronization
CS 519: Operating System Theory
Computer Science, Rutgers University
Fall 2011
2Computer Science, Rutgers CS 519: Operating System Theory
Process
Process = system abstraction for the set of resources required for executing a program
= a running instance of a program
= memory image + registers (+ I/O state)
The stack + registers form the execution context
Process Image
Each variable must be assigned a storage class
Global (static) variables
Allocated in the global region at compile-time
Local variables and parameters
Allocated dynamically on the stack
Dynamically created objects
Allocated from the heap
[Figure: process memory image with Code, Globals, Stack, and Heap regions]
What About The OS Image?
Recall that one of the functions of an OS is to provide a virtual machine interface that makes programming the machine easier
So, a process memory image must also contain the OS
[Figure: process memory image containing the OS region alongside the process's Code, Globals, Stack, and Heap]
OS data space is used to store things like file descriptors for files being accessed by the process, status of I/O devices, etc.
What Happens When There Is More Than One Running Process?
[Figure: memory containing the OS and the images (Code, Globals, Stack, Heap) of processes P0, P1, and P2]
Process Control Block
Each process has per-process state maintained by the OS
Identification: process, parent process, user, group, etc.
Execution contexts: threads
Address space: virtual memory
I/O state: file handles (file system), communication endpoints (network), etc.
Accounting information
For each process, this state is maintained in a process control block (PCB)
This is just data in the OS data space
Process Creation
How to create a process? System call.
In UNIX, a process can create another process using the fork() system call
int pid = fork(); /* this is in C */
The creating process is called the parent and the new process is called the child
The child process is created as a copy of the parent process (process image and process control structure) except for the identification and scheduling state
Parent and child processes run in two different address spaces
By default, there is no memory sharing
Process creation is expensive because of this copying
The exec() call is provided for the newly created process to run a different program than that of the parent
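A minimal sketch of the copy semantics described above, assuming a POSIX system. The function name fork_copy_demo and the values used are illustrative only. The child changes its own copy of a variable and reports it through its exit status; the parent's copy is unaffected.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that modifies its own copy of `value`.
 * Returns the parent's copy after the child exits (or -1 on error)
 * and stores the child's exit status in *child_status. */
int fork_copy_demo(int *child_status) {
    int value = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {
        value = 42;                /* changes only the child's copy */
        _exit(value);              /* report it via the exit status */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *child_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return value;                  /* unchanged in the parent */
}
```

The parent still sees value == 1 even though the child wrote 42, because the two processes run in different address spaces.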
System Call In Monolithic OS
[Figure: system call path in a monolithic OS. Labels: user mode, kernel mode, id = fork(), trap, PC and PSW, interrupt vector for trap instruction, code for fork system call, iret, in-kernel file system]
Process Creation
[Figure: process creation. Labels: fork(), fork() code, exec(), PCBs]
Example of Process Creation Using Fork
The UNIX shell is a command-line interpreter whose basic purpose is to let the user run applications on a UNIX system
cmd arg1 arg2 ... argn
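A sketch of how a shell might run one such command with fork(), exec(), and wait(), assuming a POSIX system. The helper name run_command is hypothetical, and a real shell does much more (parsing, redirection, job control).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its arguments in a child process and wait for it.
 * Returns the child's exit status, or -1 if it did not exit normally. */
int run_command(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);     /* returns only on failure */
        _exit(127);                /* conventional "command not found" status */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child replaces its image with the requested program, while the parent (the shell) blocks in waitpid() until the command finishes.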
Process Termination
One process can wait for another process to finish using the wait() system call
Can wait for a child to finish as shown in the example
Can also wait for a specific process if it knows its PID (in POSIX, waitpid() applies only to the caller's own children)
Can kill another process using the kill() system call
What all happens when kill() is invoked?
What if the victim process does not want to die?
Process Swapping
May want to swap out entire process
Thrashing if too many processes competing for resources
To swap out a process
Suspend its execution
Copy all of its information to backing store (except for PCB)
To swap a process back in
Copy needed information back into memory, e.g. page table, thread control blocks
Restore state to blocked or ready
Must check whether event(s) has (have) already occurred
Process State Diagram
[Figure: process state diagram; a ready (in memory) process becomes suspended (swapped out) on swap out and returns on swap in]
Signals
OS may need to “upcall” into user processes
Signals
UNIX mechanism to upcall when an event of interest occurs
Potentially interesting events are predefined: e.g., segmentation violation, message arrival, kill, etc.
When interested in “handling” a particular event (signal), a process indicates its interest to the OS and gives the OS a procedure that should be invoked in the upcall.
Signals (Cont’d)
When an event of interest occurs, the kernel handles the event first, then modifies the process's stack to look as if the process's code made a procedure call to the signal handler.
When the user process is scheduled next, it executes the handler first
From the handler, the user process returns to where it was when the event occurred
[Figure: process stack before the upcall (A, B) and after the kernel adds the handler frame (A, B, Handler)]
Inter-Process Communication
Most operating systems provide several abstractions for inter-process communication: message passing, shared memory, etc
Communication requires synchronization between processes (i.e. data must be produced before it is consumed)
Synchronization can be implicit (message passing) or explicit (shared memory)
Explicit synchronization can be provided by the OS (semaphores, monitors, etc) or can be achieved exclusively in user-mode (if processes share memory)
Message Passing Implementation
two copy operations in a conventional implementation
process 1: X = 1; send(process2, &X)
process 2: receive(process1, &Y); print Y
[Figure: X is copied from process 1 into kernel buffers (1st copy), then from the kernel buffers into Y in process 2 (2nd copy)]
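One way to see the two copies on a POSIX system is a pipe between a parent and a child. This is only a sketch: the slide's send() and receive() are abstract, and the pipe, the function name send_via_pipe, and the use of fork() here are assumptions made for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child ("process 1") sends x through a pipe; the parent
 * ("process 2") receives it into y. Returns y, or -1 on error. */
int send_via_pipe(int x) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(fds[0]);
        ssize_t n = write(fds[1], &x, sizeof x);   /* 1st copy: user to kernel buffer */
        _exit(n == (ssize_t)sizeof x ? 0 : 1);
    }
    close(fds[1]);
    int y = -1;
    ssize_t n = read(fds[0], &y, sizeof y);        /* 2nd copy: kernel buffer to user */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == (ssize_t)sizeof y ? y : -1;
}
```

Synchronization is implicit: read() blocks until the sender has written the data.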
Shared Memory Implementation
no copying but synchronization is necessary
process 1: X = 1
process 2: print Y
[Figure: X in process 1 and Y in process 2 map to the same shared region of physical memory; the kernel sets up the mapping]
Inter-Process Communication
More on shared memory and message passing later
Synchronization after we talk about threads
A Tree of Processes On A Typical UNIX System
Process: Summary
System abstraction – the set of resources required for executing a program (an instantiation of a program)
Execution context
Address space
File handles, communication endpoints, etc.
Historically, all of the above “lumped” into a single abstraction
More recently, split into several abstractions
Threads, address space, protection domain, etc.
OS process management:
Supports creation of processes and interprocess communication (IPC)
Allocates resources to processes according to specific policies
Interleaves the execution of multiple processes to increase system utilization
Threads
Thread of execution: stack + registers (including PC)
Informally: where an execution stream currently is in the program, and the method invocation chain that brought the execution stream to the current place
Example: A called B, which called C, which called B, which called C
The PC should be pointing somewhere inside C at this point
The stack should contain 5 activation records: A/B/C/B/C
Process model discussed thus far implies a single thread
Multi-Threading
Why limit ourselves to a single thread?
Think of a web server that must service a large stream of requests
With only one thread, the server can process only one request at a time
What to do when reading a file from disk?
Multi-threading model
Each process can have multiple threads
Each thread has a private stack
Registers are also private
All threads of a process share the code, the global data and heap
Process Address Space Revisited
[Figure: (a) Single-threaded address space with OS, Code, Globals, Heap, and one Stack; (b) Multi-threaded address space with OS, Code, Globals, Heap, and multiple Stacks]
Multi-Threading (cont)
Implementation
Each thread is described by a thread-control block (TCB)
A TCB typically contains
Thread ID
Space for saving registers
Pointer to thread-specific data not on stack
Observation
Although the model is that each thread has a private stack, threads actually share the process address space
There’s no memory protection!
Threads could potentially write into each other’s stack
Posix Thread (Pthread) API
thread creation and termination
pthread_create(&tid,NULL,start_fn,arg);
pthread_exit(status);
thread join
pthread_join(tid, &status);
mutual exclusion
pthread_mutex_lock(&lock);
pthread_mutex_unlock(&lock);
condition variable
pthread_cond_wait(&c,&lock);
pthread_cond_signal(&c);
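A small sketch that puts the creation, join, and mutex calls together, assuming a POSIX threads implementation (compile with -pthread where required). The names run_counter, struct job, and MAX_THREADS are illustrative and not part of the Pthread API.

```c
#include <pthread.h>

#define MAX_THREADS 16

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;               /* shared by all threads of the process */

struct job { long iters; };

/* Thread start function: increment the shared counter under the lock. */
static void *start_fn(void *arg) {
    struct job *j = arg;
    for (long i = 0; i < j->iters; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Start nthreads threads, join them all, and return the final count.
 * Returns -1 on bad arguments or if a thread could not be created. */
long run_counter(int nthreads, long iters) {
    pthread_t tids[MAX_THREADS];
    struct job j = { iters };
    int started = 0;
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    counter = 0;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, start_fn, &j) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? counter : -1;
}
```

Because every increment happens while holding the mutex, the final count equals nthreads * iters regardless of how the threads are interleaved.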
Thread Creation
[Figure: thread creation. Labels: thread_create(), thread_create() code, PCBs, TCBs, stacks, new_thread_starts_here, PC, SP]
Context Switching
Suppose a process has multiple threads but the machine is a uniprocessor with only one CPU. What do we do?
In fact, even if we only had one thread per process, we would have to do something about running multiple processes …
We multiplex the multiple threads on the single CPU
At any instant in time, only one thread is running
At some point in time, the OS may decide to stop the currently running thread and allow another thread to run
This switching from one running thread to another is called context switching
Context Switching (cont)
How to do a context switch?
Save state of currently executing thread
Copy all “live” registers to the thread control block
Restore state of thread to run next
Copy values of live registers from thread control block to registers
When does context switching take place?
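The save and restore steps can be tried at user level with the <ucontext.h> routines, assuming a glibc-style implementation (the interface is obsolescent in POSIX but still widely available). This is only a sketch: swapcontext() plays the role of the switch code, the ucontext_t structures stand in for TCBs, and the names worker, run_switch_demo, and trace are illustrative.

```c
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;    /* saved register state, like a TCB */
static char worker_stack[64 * 1024];       /* private stack for the second context */
int trace[4];                              /* records the order of execution */
int trace_len;

static void worker(void) {
    trace[trace_len++] = 2;
    swapcontext(&worker_ctx, &main_ctx);   /* save worker state, restore main */
    trace[trace_len++] = 4;
}                                          /* on return, uc_link (main) resumes */

/* Switch back and forth between two contexts; returns the number of steps. */
int run_switch_demo(void) {
    trace_len = 0;
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &worker_ctx);   /* save main state, start worker */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &worker_ctx);   /* resume worker where it left off */
    return trace_len;
}
```

Each swapcontext() call copies the live registers of the caller into one structure and loads them from another, so execution alternates between the two contexts.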
Context Switching (cont)
When does context switching occur?
When the OS decides that a thread has run long enough and that another thread should be given the CPU
Remember how the OS gets control of the CPU back when it is executing user code?
When a thread performs an I/O operation and needs to block to wait for the completion of this operation
To wait for some other thread
Thread synchronization
How Is the Switching Code Invoked?
user thread executing
clock interrupt
PC modified by hardware to “vector” to interrupt handler
user thread state is saved for later resume
clock interrupt handler is invoked
disable interrupt checking
check whether current thread has run “long enough”
if yes, post asynchronous software trap (AST)
enable interrupt checking
exit interrupt handler
enter “return-to-user” code
check whether AST was posted
if not, restore user thread state and return to executing user thread; if AST was posted, call context switch code
Why do we need the AST?
How Is the Switching Code Invoked? (cont)
user thread executing
system call to perform I/O
user thread state is saved for later resume
OS code to perform system call is invoked
I/O operation started (by invoking I/O driver)
set thread status to waiting
move thread’s TCB from run queue to wait queue associated with specific device
call context switching code
Context Switching
At entry to the context switch code (CS), the return address is either in a register or on the stack (in the current activation record)
CS saves this return address to the TCB instead of the current PC
To the thread, it looks like CS just took a while to return!
If the context switch was initiated from an interrupt, the thread never knows that it has been context switched out and back in unless it looks at the “wall” clock
Context Switching (cont)
Even that is not quite the whole story
When a thread is switched out, what happens to it?
How do we find it to switch it back in?
This is what the TCB is for. The system typically has
A run queue that points to the TCBs of threads ready to run
A blocked queue per device to hold the TCBs of threads blocked waiting for an I/O operation on that device to complete
When a thread is switched out at a timer interrupt, it is still ready to run so its TCB stays on the run queue
When a thread is switched out because it is blocking on an I/O operation, its TCB is moved to the blocked queue of the device
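A sketch of the two kinds of queues as linked lists of TCBs. All names here (struct tcb, tcb_queue, preempt, block_on_io, io_done) are hypothetical. One modelling assumption differs slightly from the wording above: the running thread is kept off the run queue, so a preempted thread is re-appended to it.

```c
#include <stddef.h>

struct tcb {
    int tid;
    struct tcb *next;
};

struct tcb_queue {
    struct tcb *head, *tail;
};

void enqueue(struct tcb_queue *q, struct tcb *t) {
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

struct tcb *dequeue(struct tcb_queue *q) {
    struct tcb *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
        t->next = NULL;
    }
    return t;
}

/* Timer interrupt: the current thread is still ready, so it goes back
 * on the run queue. Returns the thread to run next. */
struct tcb *preempt(struct tcb_queue *run_q, struct tcb *current) {
    enqueue(run_q, current);
    return dequeue(run_q);
}

/* Blocking I/O: the current thread's TCB moves to the device's queue.
 * Returns the thread to run next (NULL if none is ready). */
struct tcb *block_on_io(struct tcb_queue *run_q, struct tcb_queue *dev_q,
                        struct tcb *current) {
    enqueue(dev_q, current);
    return dequeue(run_q);
}

/* I/O completion: the first waiter on the device becomes ready again. */
void io_done(struct tcb_queue *run_q, struct tcb_queue *dev_q) {
    struct tcb *t = dequeue(dev_q);
    if (t) enqueue(run_q, t);
}
```

A switched-out thread is always reachable through exactly one queue, which is how the OS finds it when it is time to switch it back in.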
Ready Queue And Various I/O Device Queues
Switching Between Threads of Different Processes
What if switching to a thread of a different process?
Caches, TLB, page table, etc.?
Caches
Physical addresses: no problem
Virtual addresses: cache must either have process tag or must flush cache on context switch
TLB
Each entry must have process tag or must flush TLB on context switch
Page table
Typically have page table pointer (register) that must be reloaded on context switch
Threads & Signals
What happens if kernel wants to signal a process when all of its threads are blocked?
When there are multiple threads, which thread should the kernel deliver the signal to?
OS writes into process control block that a signal should be delivered
Next time any thread from this process is allowed to run, the signal is delivered to that thread as part of the context switch
What happens if kernel needs to deliver multiple signals?
Thread Implementation
Kernel-level threads (lightweight processes)
Kernel sees multiple execution contexts
Thread management done by the kernel
User-level threads
Implemented as a thread library, which contains the code for thread creation, termination, scheduling and switching
Kernel sees one execution context and is unaware of thread activity
Can be preemptive or not
User-Level Thread Implementation
[Figure: user-level thread implementation. Within one process above the kernel: code, data, thread stacks, and a saved pc and sp for each of thread 1 and thread 2]
User-Level vs. Kernel-Level Threads
Advantages of user-level threads
Performance: low-cost thread operations (do not require crossing protection domains)
Flexibility: scheduling can be application specific
Portability: user-level thread library easy to port
Disadvantages of user-level threads
If a user-level thread is blocked in the kernel, the entire process (all threads of that process) is blocked
Cannot take advantage of multiprocessing (the kernel assigns one process to only one processor)
User-Level vs. Kernel-Level Threads
[Figure: user-level threads versus kernel-level threads. Labels: process, threads, thread scheduling, process scheduling, user, kernel, processor]
User-Level vs. Kernel-Level Threads
No reason why we should not have both
Most systems now support kernel threads
User-level threads are available as linkable libraries
[Figure: combined user-level and kernel-level threads. Labels: user-level threads, kernel-level threads, thread scheduling (shown twice), process scheduling, user, kernel, processor]
Kernel Support for User-Level Threads
Even kernel threads are not quite the right abstraction for supporting user-level threads
Mismatch between where the scheduling information is available (user) and where scheduling on real processors is performed (kernel)
When the kernel thread is blocked, the corresponding physical processor is lost to all user-level threads although there may be some ready to run.
Why Kernel Threads Are Not The Right Abstraction
[Figure: user-level threads scheduled at user level onto a kernel thread, which kernel-level scheduling maps onto a physical processor; the kernel thread is marked blocked]
Scheduler Activations: Kernel Support for User-Level Threads
Each process contains a user-level thread system (ULTS) that controls the scheduling of the allocated processors
Kernel allocates processors to processes as scheduler activations (SAs). An SA is similar to a kernel thread, but it also transfers control from the kernel to the ULTS on a kernel event as described below
Kernel notifies a process whenever the number of allocated processors changes or when an SA is blocked due to the user-level thread running on it (e.g., for I/O or on a page fault)
The process notifies the kernel when it needs more or fewer SAs (processors)
Ex.: (1) Kernel notifies ULTS that user-level thread blocked by creating an SA and upcalling the process; (2) ULTS removes the state from the old SA, tells the kernel that it can be reused, and decides which user-level thread to run on the new SA
User-Level Threads On Top of Scheduler Activations
[Figure: user-level threads scheduled at user level onto scheduler activations, which kernel-level scheduling maps onto physical processors; activations are marked blocked or active]
Source: T. Anderson et al. “Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism”. ACM TOCS, 1992.
Threads vs. Processes
Why multiple threads?
Can’t we use multiple processes to do whatever it is that we do with multiple threads?
Of course, we need to be able to share memory (and other resources) between multiple processes …
But this sharing is already supported by threads
Operations on threads (creation, termination, scheduling, etc.) are cheaper than the corresponding operations on processes
This is because thread operations do not involve manipulations of other resources associated with processes (I/O descriptors, address space, etc)
Inter-thread communication is supported through shared memory without kernel intervention
Why not? Have multiple other resources, why not threads?
Thread/Process Operation Latencies
Operation     User-level Threads (μs)   Kernel Threads (μs)   Processes (μs)
Null fork     34                        948                   11,300
Signal-wait   37                        441                   1,840
VAX uniprocessor running UNIX-like OS, 1992.
2.8-GHz Pentium 4 uniprocessor running Linux, 2004.
Operation     Kernel Threads (μs)   Processes (μs)
Null fork     45                    108
Synchronization
Problem
Threads must share data
Data consistency must be maintained
The Critical Section Problem
When a process executes code that manipulates shared data (or resource), we say that the process is in its critical section (for that shared data)
The execution of critical sections must be mutually exclusive: at any time, only one process is allowed to execute in its critical section (even with multiple CPUs)
Each process must request the permission to enter a critical section
Terminology
Critical section: a section of code which reads or writes shared data
Race condition: potential for interleaved execution of a critical section by multiple threads
Results are non-deterministic
Mutual exclusion: synchronization mechanism to avoid race conditions by ensuring exclusive execution of critical sections
Deadlock: permanent blocking of threads
Starvation: execution but no progress
The Critical Section Problem
• The section of code implementing the request to enter a CS is called the entry section
• The critical section might be followed by an exit section
• The remaining code is the remainder section
• The critical section problem: to design a protocol that, if executed by concurrent processes, ensures that their action will not depend on the order in which their execution is interleaved (possibly on many processors)
Framework for Critical Section Solution Analysis
•Each thread executes at nonzero speed, but no assumption on the relative speed of n threads
•Structure of a concurrent thread:
repeat
  entry section
  critical section
  exit section
  remainder section
forever
•No assumption about order of interleaved execution
•Basically, a ME solution must specify the entry and exit sections
Requirements for Mutual Exclusion
• No assumptions on hardware: speed, # of processors
• Mutual exclusion is maintained – that is, only one thread at a time can be executing inside a CS
• Execution of CS takes a finite time
• A thread/process not in CS cannot prevent other threads/processes from entering the CS
• Entering CS cannot be delayed indefinitely: no deadlock or starvation
What about thread failures?
If all three criteria (ME, progress, bounded waiting) are satisfied, then a valid solution will provide robustness against failure of a thread in its remainder section (RS)
Because failure in RS is just like having an infinitely long RS
However, no valid solution can provide robustness against a thread failing in its critical section (CS)
A thread Ti that fails in its CS does not signal that fact to other threads: for them Ti is still in its CS
Synchronization Primitives
Most common primitives
Locks (mutual exclusion)
Condition variables
Semaphores
Monitors
Need
Semaphores, or
Locks and condition variables, or
Monitors
Solutions for Mutual Exclusion
•software reservation: a thread must register its intent to enter CS and then, wait until no other thread has registered a similar intention before proceeding
•spin-locks using memory-interlocked instructions: require special hardware to ensure that a given location can be read, modified and written without interruption (e.g., TST: test&set instruction)
•OS-based mechanisms for ME: semaphores, monitors, message passing, lock files
Locks
Mutual exclusion: we want to be the only thread modifying a set of data items
Can look at it as exclusive access to data items or to a piece of code
Have three components:
Acquire, Release, Waiting
Example
public class BankAccount {
    Lock aLock = new Lock();
    int balance = 0;
    ...
    public void deposit(int amount) {
        aLock.acquire();
        balance = balance + amount;
        aLock.release();
    }
    public void withdrawal(int amount) {
        aLock.acquire();
        balance = balance - amount;
        aLock.release();
    }
}
Implementing Locks Inside OS Kernel
From Nachos (with some simplifications)
public class Lock {
    private KThread lockHolder = null;
    private ThreadQueue waitQueue = ThreadedKernel.scheduler.newThreadQueue(true);

    public void acquire() {
        KThread thread = KThread.currentThread();  // Get thread object (TCB)
        if (lockHolder != null) {                  // Gotta wait
            waitQueue.waitForAccess(thread);       // Put thread on wait queue
            KThread.sleep();                       // Context switch
        }
        else {
            lockHolder = thread;                   // Got the lock
        }
    }
Implementing Locks Inside OS Kernel (cont)
This implementation is not quite right … what’s missing?
    public void release() {
        if ((lockHolder = waitQueue.nextThread()) != null)
            lockHolder.ready();  // Wake up a waiting thread
    }
Implementing Locks Inside OS Kernel (cont)
    public void release() {
        boolean intStatus = Machine.interrupt().disable();
        if ((lockHolder = waitQueue.nextThread()) != null)
            lockHolder.ready();
        Machine.interrupt().restore(intStatus);
    }
Unfortunately, disabling interrupts only works for uniprocessors.
Implementing Locks At User-Level
Why?
Expensive to enter the kernel
Parallel programs on multiprocessor systems
What’s the problem?
Can’t disable interrupt …
Many software algorithms for mutual exclusion
See any OS book
Disadvantages: very difficult to get correct
So what do we do?
Implementing Locks At User-Level
Simple with a “little bit” of help from the hardware
Atomic read-modify-write instructions
Test-and-set
Atomically read a variable and, if the value of the variable is currently 0, set it to 1
Fetch-and-increment
Compare-and-swap
Hardware Solutions: Interrupt Disabling
•On a uniprocessor, mutual exclusion is preserved but efficiency of execution is degraded
• while in CS, execution cannot be interleaved with other processes in RS
•On a multiprocessor, mutual exclusion is not preserved
• CS is atomic but not mutually exclusive
•Generally not an acceptable solution
Process Pi:
repeat
  disable interrupts
  critical section
  enable interrupts
  remainder section
forever
Hardware Solutions: Special Machine Instructions
• Normally, an access to a memory location excludes other access to that same location
• Extension: designers have proposed machine instructions that perform two actions atomically (indivisibly) on the same memory location (ex: reading and writing)
• The execution of such an instruction is also mutually exclusive (even with multiple CPUs)
• They can be used to provide mutual exclusion but need to be complemented by other mechanisms to satisfy the other two requirements of the CS problem (and avoid starvation and deadlock)
Test-and-Set Instruction
•A C++ description of test-and-set:
bool testset(int& i)
{
  if (i==0) {
    i=1;
    return true;
  } else {
    return false;
  }
}
•An algorithm that uses test&set for mutual exclusion:
Process Pi:
repeat
  repeat{}
  until testset(b);
  CS
  b:=0;
  RS
forever
Test-and-Set Instruction (cont.)
Shared variable b is initialized to 0
Only the first Pi that sets b enters CS
Mutual exclusion is preserved
•if Pi enters CS, the other Pj are busy waiting
When Pi exits CS, the selection of the next Pj that enters CS is arbitrary
No bounded waiting
Starvation is possible
Using xchg for Mutual Exclusion
•Shared variable b is initialized to 0
•Each Pi has a local variable k
•The only Pi that can enter CS is the one that finds b=0
•This Pi excludes all the other Pj by setting b to 1
Process Pi:
repeat
  k:=1
  repeat xchg(k,b)
  until k=0;
  CS
  b:=0;
  RS
forever
Atomic Read-Modify-Write Instructions
Test-and-set
Read a memory location and, if the value is currently 0, set it to 1
Fetch-and-increment
Return the value of a memory location
Increment the value by 1 (in memory, not the value returned)
Compare-and-swap
Compare the value of a memory location with an old value
If the same, replace with a new value
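These three operations can be sketched with C11 <stdatomic.h>, assuming a compiler that supports it. The wrapper names below are illustrative. Test-and-set is written as an atomic exchange with 1, which has the same effect on memory (a location that already holds 1 stays 1) and returns the old value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Test-and-set: leave the location at 1 and return its previous value. */
int test_and_set(atomic_int *loc) {
    return atomic_exchange(loc, 1);
}

/* Fetch-and-increment: return the old value; memory holds old + 1. */
int fetch_and_increment(atomic_int *loc) {
    return atomic_fetch_add(loc, 1);
}

/* Compare-and-swap: if *loc equals old_val, store new_val and return true;
 * otherwise leave *loc unchanged and return false. */
bool compare_and_swap(atomic_int *loc, int old_val, int new_val) {
    return atomic_compare_exchange_strong(loc, &old_val, new_val);
}
```

Each wrapper performs its read and write as one indivisible step, even when several processors access the same location.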
Mutual Exclusion Machine Instructions
Advantages
Applicable to any number of processes/threads on either a single processor or multiple processors sharing main memory
It is simple and easy to verify
It can be used to support multiple critical sections
Mutual Exclusion Machine Instructions
Disadvantages
Busy-waiting consumes processor time
Starvation is possible when a process leaves a critical section and more than one process is waiting.
Deadlock
If a low priority process has the critical region and a higher priority process needs it, the higher priority process will obtain the processor just to wait for the critical region
Implementing Spin Locks Using Test&Set
#define UNLOCKED 0
#define LOCKED 1
Spin_acquire(lock)
{
while (test-and-set(lock) == LOCKED);
}
Spin_release(lock)
{
lock = UNLOCKED;
}
Problems? Lots of memory traffic if TAS always sets; lots of traffic when lock is released; no ordering guarantees. Solutions?
Spin Locks Using Test and Test&Set
Spin_acquire(lock)
{
while (1) {
while (lock == LOCKED);
if (test-and-set(lock) == UNLOCKED) break;
}
}
Spin_release(lock)
{
lock = UNLOCKED;
}
Better, since TAS is guaranteed not to generate traffic unnecessarily. But there is still lots of traffic after a release. Still no ordering guarantees.
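A runnable sketch of test and test&set, assuming C11 atomics and POSIX threads (compile with -pthread where required). atomic_exchange stands in for the test-and-set instruction, and the harness names (spinner, run_spin_test, MAX_SPINNERS) are illustrative.

```c
#include <pthread.h>
#include <stdatomic.h>

#define UNLOCKED 0
#define LOCKED 1
#define MAX_SPINNERS 8

static atomic_int spin = UNLOCKED;
static long shared_total;                      /* protected by the spin lock */

static void spin_acquire(atomic_int *lock) {
    for (;;) {
        while (atomic_load(lock) == LOCKED)
            ;                                  /* test: spin on reads only */
        if (atomic_exchange(lock, LOCKED) == UNLOCKED)
            break;                             /* test&set succeeded */
    }
}

static void spin_release(atomic_int *lock) {
    atomic_store(lock, UNLOCKED);
}

static void *spinner(void *arg) {
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        spin_acquire(&spin);
        shared_total++;
        spin_release(&spin);
    }
    return NULL;
}

/* Run nthreads threads that each add iters to the shared total.
 * Returns the total, or -1 on bad arguments or thread creation failure. */
long run_spin_test(int nthreads, long iters) {
    pthread_t tids[MAX_SPINNERS];
    int started = 0;
    if (nthreads < 1 || nthreads > MAX_SPINNERS)
        return -1;
    shared_total = 0;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, spinner, &iters) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? shared_total : -1;
}
```

The inner read-only loop keeps waiters spinning on a locally cached copy; the atomic exchange is attempted only once the lock appears free.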
OS Solutions: Semaphores
Synchronization tool (provided by the OS) that does not require busy waiting
A semaphore S is an integer variable that, apart from initialization, can only be accessed through two atomic and mutually exclusive operations:
wait(S)
signal(S)
Avoids busy waiting
when a thread has to wait, the OS will put it in a blocked queue of threads waiting for that semaphore
Semaphores
•Internally, a semaphore is a record (structure):
type semaphore = record
  count: integer;
  queue: list of threads
end;
var S: semaphore;
•When a thread must wait for a semaphore S, it is blocked and put on the semaphore’s queue
•The signal operation removes (according to a fair policy like FIFO) one thread from the queue and puts it in the list of ready threads
Semaphore Operations
wait(S):
  S.count--;
  if (S.count<0) {
    block this thread
    place this thread in S.queue
  }
signal(S):
  S.count++;
  if (S.count<=0) {
    remove a thread P from S.queue
    place this thread P on ready list
  }
•S.count must be initialized to a nonnegative value (depending on application)
Semaphores: Observations
S.count >=0
the number of threads that can execute wait(S) without being blocked is S.count
S.count<0
the number of threads waiting on S is = |S.count|
Atomicity and mutual exclusion
no two threads can be in wait(S) and signal(S) (on the same S) at the same time (even with multiple CPUs)
The code defining wait(S) and signal(S) must be executed in critical sections
Semaphores: Implementation
•Key observation: the critical sections defined by wait(S) and signal(S) are very short (typically 10 instructions)
•Uniprocessor solutions:
disable interrupts during these operations (i.e., for a very short period)
does not work on a multiprocessor machine.
•Multiprocessor solutions:
use software or hardware mutual exclusion solutions in the OS.
the amount of busy waiting is small.
Using Semaphores for Solving Critical Section Problems
For n threads
Initialize S.count to 1
Only one thread is allowed into CS (mutual exclusion)
To allow k threads into CS, we initialize S.count to k
Process Pi:
repeat
  wait(S);
  CS
  signal(S);
  RS
forever
Using Semaphores to Synchronize Threads
We have two threads: P1 and P2
Problem: Statement S1 in P1 must be performed before statement S2 in P2
Solution: define a semaphore “synch”
Initialize synch to 0
P1 code:
S1;
signal(synch);
P2 code:
wait(synch);
S2;
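The same ordering constraint can be sketched with POSIX unnamed semaphores, assuming a platform that supports sem_init() (Linux does; macOS does not). sem_post() and sem_wait() play the roles of signal and wait, and the names p1, p2, run_synch_demo, and order are illustrative.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t synch;
static int order[2];       /* records which statement ran first */
static int pos;

static void *p1(void *arg) {
    (void)arg;
    order[pos++] = 1;      /* S1 */
    sem_post(&synch);      /* signal(synch) */
    return NULL;
}

static void *p2(void *arg) {
    (void)arg;
    sem_wait(&synch);      /* wait(synch): blocks until P1 has signalled */
    order[pos++] = 2;      /* S2 */
    return NULL;
}

/* Returns 12 when S1 ran before S2, or -1 on setup failure. */
int run_synch_demo(void) {
    pthread_t t1, t2;
    pos = 0;
    if (sem_init(&synch, 0, 0) != 0)           /* initialize synch to 0 */
        return -1;
    if (pthread_create(&t2, NULL, p2, NULL) != 0)   /* P2 is started first on purpose */
        return -1;
    if (pthread_create(&t1, NULL, p1, NULL) != 0) {
        sem_post(&synch);                      /* let P2 finish before bailing out */
        pthread_join(t2, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&synch);
    return order[0] * 10 + order[1];
}
```

Even though P2 is created first, it cannot reach S2 until P1 has executed S1 and signalled the semaphore.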
Binary Semaphores
•Similar to general (counting) semaphores except that “count” is Boolean valued
•Counting semaphores can be implemented using binary semaphores
•More difficult to use than counting semaphores (eg: they cannot be initialized to an integer k > 1)
Binary Semaphore Operations
waitB(S):
  if (S.value = 1) {
    S.value := 0;
  } else {
    block this process
    place this process in S.queue
  }
signalB(S):
  if (S.queue is empty) {
    S.value := 1;
  } else {
    remove a process P from S.queue
    place this process P on ready list
  }
Problems with Semaphores
• Semaphores are a powerful tool for enforcing mutual exclusion and coordinating threads
• Problem: wait(S) and signal(S) are scattered among several threads
•It is difficult to understand their effects
•Usage must be correct in all threads
•One badly coded (or malicious) thread can cause the entire collection of threads to fail
Monitors
• Are high-level language constructs that provide equivalent functionality to semaphores but are easier to control
• Found in many concurrent programming languages
• Concurrent Pascal, Modula-3, uC++, Java...
• Can be implemented using semaphores
Monitors
A monitor is a software module containing:
•one or more procedures
•an initialization sequence
•local data variables
Characteristics:
•local variables accessible only by monitor’s procedures
•a process enters the monitor by invoking one of its procedures
•only one process can be in the monitor at any one time
Monitors
• The monitor ensures mutual exclusion
• no need to program this constraint explicitly
• Shared data are protected by placing them in the monitor
•The monitor locks the shared data on process entry
• Process/thread synchronization is done using condition variables, which represent conditions a process may need to wait for before executing in the monitor
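C has no monitor construct, but the idea can be imitated by hand: keep the shared data file-local, and have every procedure take the same lock on entry and release it on exit. The sketch below assumes a hypothetical bank-account module; `account_init`, `account_deposit`, `account_withdraw`, `account_balance`, and `run_account_demo` are illustrative names only.

```c
#include <assert.h>
#include <pthread.h>

/* Local data of the "monitor": reachable only through the procedures below. */
static pthread_mutex_t monitor_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  funds_available = PTHREAD_COND_INITIALIZER;
static long balance;

/* Initialization sequence. */
void account_init(long initial)
{
    pthread_mutex_lock(&monitor_lock);
    balance = initial;
    pthread_mutex_unlock(&monitor_lock);
}

void account_deposit(long amount)
{
    pthread_mutex_lock(&monitor_lock);            /* enter the monitor */
    balance += amount;
    pthread_cond_broadcast(&funds_available);     /* the condition may now hold */
    pthread_mutex_unlock(&monitor_lock);          /* leave the monitor */
}

void account_withdraw(long amount)
{
    pthread_mutex_lock(&monitor_lock);
    while (balance < amount)                      /* wait for the condition */
        pthread_cond_wait(&funds_available, &monitor_lock);
    balance -= amount;
    pthread_mutex_unlock(&monitor_lock);
}

long account_balance(void)
{
    pthread_mutex_lock(&monitor_lock);
    long b = balance;
    pthread_mutex_unlock(&monitor_lock);
    return b;
}

static void *withdrawer(void *arg)
{
    (void)arg;
    account_withdraw(50);       /* blocks until at least 50 is available */
    return NULL;
}

/* Deposits 30 twice while another thread withdraws 50; returns the balance. */
long run_account_demo(void)
{
    pthread_t tid;

    account_init(0);
    pthread_create(&tid, NULL, withdrawer, NULL);
    account_deposit(30);
    account_deposit(30);
    pthread_join(tid, NULL);
    return account_balance();
}
```

In a language with real monitors the compiler inserts the lock and unlock calls; here the discipline depends on every procedure following the same pattern.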
Condition Variables
A condition variable is always associated with a condition and a lock
Typically used to wait for a condition to take on a given value
Three operations:
cond_wait(lock, cond_var)
cond_signal(cond_var)
cond_broadcast(cond_var)
Condition Variables
Local to the monitor (accessible only within the monitor)
Can be accessed and changed only by two functions:
•cwait(a): blocks execution of the calling thread on condition (variable) a
• the process can resume execution only if another process executes csignal(a)
•csignal(a): resume execution of some process blocked on condition (variable) a.
• If several such processes exist: choose any one
• If no such process exists: do nothing
Condition Variables
cond_wait(lock, cond_var)
    Release the lock
    Sleep on cond_var
    When awakened by the system, reacquire the lock and return
cond_signal(cond_var)
    If at least 1 thread is sleeping on cond_var, wake 1 up
    Otherwise, no effect
cond_broadcast(cond_var)
    If at least 1 thread is sleeping on cond_var, wake everyone up
    Otherwise, no effect
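The difference between signal and broadcast shows up when several threads wait on the same condition. The following sketch uses the pthread equivalents of the operations above; `NUM_WAITERS`, `gate_open`, `passed`, and `run_broadcast_demo` are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_WAITERS 3   /* hypothetical demo value */

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static bool gate_open;  /* the condition being waited for */
static int  passed;     /* number of threads that got through */

static void *gate_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)                               /* re-check after every wakeup */
        pthread_cond_wait(&gate_cond, &gate_lock);   /* releases and reacquires the lock */
    passed++;
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Opens the gate with one broadcast; returns how many waiters passed. */
int run_broadcast_demo(void)
{
    pthread_t tids[NUM_WAITERS];

    gate_open = false;
    passed = 0;
    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_create(&tids[i], NULL, gate_waiter, NULL);

    pthread_mutex_lock(&gate_lock);
    gate_open = true;
    pthread_cond_broadcast(&gate_cond);   /* wake every sleeping waiter */
    pthread_mutex_unlock(&gate_lock);

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(tids[i], NULL);
    return passed;
}
```

Because each waiter tests `gate_open` under the lock before sleeping, a waiter that starts after the broadcast simply skips the wait, so no wakeup is lost.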
POSIX Thread (Pthread) API
thread creation and termination
    pthread_create(&tid,NULL,start_fn,arg);
    pthread_exit(status);
thread join
    pthread_join(tid, &status);
mutual exclusion
    pthread_mutex_lock(&lock);
    pthread_mutex_unlock(&lock);
condition variable
    pthread_cond_wait(&c,&lock);
    pthread_cond_signal(&c);
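As a small illustration of how creation, termination, and join fit together, the sketch below passes an integer to a thread and collects its exit status. The functions `square_fn` and `run_join_demo` are hypothetical, and packing the integer into the `void *` argument is a convenience assumed for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Thread start function: squares its argument and exits with the result. */
static void *square_fn(void *arg)
{
    intptr_t n = (intptr_t)arg;
    pthread_exit((void *)(n * n));   /* status is handed to the joiner */
}

/* Creates a thread, waits for it, and returns its exit status (-1 on error). */
long run_join_demo(long n)
{
    pthread_t tid;
    void *status;

    if (pthread_create(&tid, NULL, square_fn, (void *)(intptr_t)n) != 0)
        return -1;
    pthread_join(tid, &status);      /* blocks until the thread terminates */
    return (long)(intptr_t)status;
}
```

Returning a value from the start function has the same effect as calling `pthread_exit` with that value.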
Condition Variables (example)
thread 1
    pthread_mutex_lock(&lock);
    while (!my_condition)               /* re-check the condition after every wakeup */
        pthread_cond_wait(&c, &lock);
    do_critical_section();
    pthread_mutex_unlock(&lock);
thread 2
    pthread_mutex_lock(&lock);
    my_condition = true;
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&c);
Producer/Consumer Example
Producer
    lock(lock_bp)
    while (free_bp.is_empty())
        cond_wait(lock_bp, cond_freebp_empty)
    buffer = free_bp.get_buffer()
    unlock(lock_bp)
    … produce data in buffer …
    lock(lock_bp)
    data_bp.add_buffer(buffer)
    cond_signal(cond_databp_empty)
    unlock(lock_bp)
Consumer
    lock(lock_bp)
    while (data_bp.is_empty())
        cond_wait(lock_bp, cond_databp_empty)
    buffer = data_bp.get_buffer()
    unlock(lock_bp)
    … consume data in buffer …
    lock(lock_bp)
    free_bp.add_buffer(buffer)
    cond_signal(cond_freebp_empty)
    unlock(lock_bp)
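The same wait-in-a-loop pattern can be sketched in C with pthreads. Note that this is a simplification and not the slide's exact design: instead of separate free and data buffer pools, it uses a single fixed-size ring buffer of integers, with one condition variable per direction. `CAPACITY`, `NUM_ITEMS`, `not_full`, `not_empty`, and `run_producer_consumer_demo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

#define CAPACITY  4     /* hypothetical demo values */
#define NUM_ITEMS 100

static pthread_mutex_t lock_bp   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static int  ring[CAPACITY];
static int  head, tail, count;
static long consumed_sum;       /* touched only by the consumer thread */

static void *producer(void *arg)
{
    (void)arg;
    for (int item = 1; item <= NUM_ITEMS; item++) {
        pthread_mutex_lock(&lock_bp);
        while (count == CAPACITY)                    /* no free slot yet */
            pthread_cond_wait(&not_full, &lock_bp);
        ring[tail] = item;
        tail = (tail + 1) % CAPACITY;
        count++;
        pthread_cond_signal(&not_empty);             /* data is available */
        pthread_mutex_unlock(&lock_bp);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < NUM_ITEMS; i++) {
        pthread_mutex_lock(&lock_bp);
        while (count == 0)                           /* nothing to consume yet */
            pthread_cond_wait(&not_empty, &lock_bp);
        int item = ring[head];
        head = (head + 1) % CAPACITY;
        count--;
        pthread_cond_signal(&not_full);              /* a slot is free again */
        pthread_mutex_unlock(&lock_bp);
        consumed_sum += item;                        /* "consume" outside the lock */
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of consumed items. */
long run_producer_consumer_demo(void)
{
    pthread_t p, c;

    head = tail = count = 0;
    consumed_sum = 0;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumed_sum;
}
```

As in the pseudocode, the actual production and consumption of data happen outside the lock; only the buffer bookkeeping is protected.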
Monitors
•Waiting processes are either in the entrance queue or in a condition queue
•A process puts itself into condition queue cn by issuing cwait(cn)
•csignal(cn) brings into the monitor one process from the condition cn queue
•csignal(cn) blocks the calling process and puts it in the urgent queue (unless csignal is the last operation of the monitor procedure)
Deadlock (Cont’d)
Deadlock can occur whenever multiple parties are competing for exclusive access to multiple resources
How can we deal with deadlocks?
Deadlock prevention
    Design a system without one of mutual exclusion, hold and wait, no preemption, or circular wait (the four necessary conditions)
    To prevent circular wait, impose a strict ordering on resources. For instance, if we need to lock variables A and B, always lock A first, then lock B
Deadlock avoidance
    Deny requests that may lead to unsafe states (Banker’s algorithm)
    Running the algorithm on all resource requests is expensive
Deadlock detection and recovery
    Check for circular wait periodically. If circular wait is found, abort all deadlocked processes (extreme solution but very common)
    Checking for circular wait is expensive
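The resource-ordering rule for preventing circular wait can be sketched as follows. The example assumes a hypothetical pair of accounts, each with its own mutex, and uses the address of each account as the global order; `account_t`, `lock_both`, `transfer`, and `run_lock_ordering_demo` are illustrative names, and ordering by address is one possible choice among many.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define TRANSFERS 10000   /* hypothetical demo value */

typedef struct {
    pthread_mutex_t lock;
    long balance;
} account_t;

static account_t acct_a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static account_t acct_b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

/* Lock both accounts in one fixed global order (here: lower address first). */
static void lock_both(account_t *x, account_t *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        account_t *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(&x->lock);
    pthread_mutex_lock(&y->lock);
}

static void unlock_both(account_t *x, account_t *y)
{
    pthread_mutex_unlock(&x->lock);
    pthread_mutex_unlock(&y->lock);
}

static void transfer(account_t *from, account_t *to, long amount)
{
    lock_both(from, to);      /* same order no matter who is "from" */
    from->balance -= amount;
    to->balance   += amount;
    unlock_both(from, to);
}

static void *a_to_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < TRANSFERS; i++)
        transfer(&acct_a, &acct_b, 1);
    return NULL;
}

static void *b_to_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < TRANSFERS; i++)
        transfer(&acct_b, &acct_a, 1);
    return NULL;
}

/* Runs opposing transfers concurrently; returns the combined balance. */
long run_lock_ordering_demo(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, a_to_b, NULL);
    pthread_create(&t2, NULL, b_to_a, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acct_a.balance + acct_b.balance;
}
```

If each thread instead locked its "from" account first, the two threads could each hold one lock while waiting for the other, which is exactly the circular wait the ordering rule excludes.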