COT 5611 Operating Systems Design Principles Spring 2012

Dan C. Marinescu
Office: HEC 304
Office hours: M-Wd 5:00-6:00 PM

Lecture 19

Lecture 19 – Wednesday March 21, 2012

Reading assignment: Chapter 9 from the on-line text

Last time
• All-or-nothing and before-or-after atomicity
• Atomicity and processor management
• Processes, threads, and address spaces
• Thread coordination with a bounded buffer – the naïve approach
• Thread management
• Address spaces and multi-level memories
• Kernel structures for the management of multiple cores/processors and threads/processes


Today
• Locks and before-or-after actions; hardware support for locks
• YIELD
• Conditions for thread coordination – Safety, Liveness, Bounded-Wait, Fairness
• Critical sections – a solution to the critical section problem
• Deadlocks
• Signals
• Semaphores
• Monitors
• Thread coordination with a bounded buffer: WAIT, NOTIFY, AWAIT, ADVANCE, SEQUENCE, TICKET


Locks; Before-or-After actions

Locks: shared variables which act as flags to coordinate access to shared data. They are manipulated with two primitives:
• ACQUIRE
• RELEASE

Locks support the implementation of before-or-after actions; only one thread can acquire the lock, the others have to wait.

All threads must obey the convention regarding the locks. The two operations ACQUIRE and RELEASE must be atomic.

Hardware support for the implementation of locks:
• RSM – Read and Set Memory
• CMP – Compare and Swap

RSM (mem)
• If mem=LOCKED then RSM returns r=LOCKED and leaves mem=LOCKED
• If mem=UNLOCKED then RSM returns r=UNLOCKED and sets mem=LOCKED
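The way ACQUIRE and RELEASE can be built on RSM may be easier to see in code. The following is a minimal sketch in Python, not the course's implementation: the class name Memory and the helper demo are hypothetical, and the internal threading.Lock only emulates the atomicity that the hardware would give to the RSM instruction.

```python
import threading

LOCKED, UNLOCKED = 1, 0


class Memory:
    """One memory word with an emulated atomic RSM instruction."""

    def __init__(self):
        self.value = UNLOCKED
        # Assumption: this lock stands in for the atomicity of the hardware RSM.
        self._bus = threading.Lock()

    def rsm(self):
        """Return the old value and set the word to LOCKED in one atomic step."""
        with self._bus:
            old = self.value
            self.value = LOCKED
            return old


def acquire(mem):
    # Spin until RSM reports that the word was UNLOCKED before we set it.
    while mem.rsm() == LOCKED:
        pass


def release(mem):
    mem.value = UNLOCKED


def demo(threads=4, rounds=200):
    """Several threads increment a shared counter under the spin lock."""
    lock = Memory()
    counter = [0]

    def worker():
        for _ in range(rounds):
            acquire(lock)
            counter[0] += 1      # critical section
            release(lock)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter[0]
```

Only the thread that sees UNLOCKED returned by RSM owns the lock; every other caller sees LOCKED and keeps spinning.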


Important facts to remember

• Each thread has a unique ThreadId.
• Threads save their state on the stack.
• The stack pointer of a thread is stored in the thread table.
• To activate a thread the registers of the processor are loaded with information from the thread state.
• What if no thread is able to run?
   • create a dummy thread for each processor called a processor_thread which is scheduled to run when no other thread is available
   • the processor_thread runs in the thread layer
   • the SCHEDULER runs in the processor layer
• We have a processor thread for each processor/core.
• We can use spin locks only if the two processes (the producer and the consumer) run on different CPUs; we need an active process to release a spin lock.


Switching threads with dynamic thread creation

• Switching from one user thread to another requires two steps:
   • Switch from the thread releasing the processor to the processor thread
   • Switch from the processor thread to the new thread which is going to have control of the processor
• The last step requires the SCHEDULER to circle through the thread_table until a thread ready to run is found.
• The boundary between user layer threads and the processor layer thread is crossed twice.
• Example: switch from thread 0 to thread 6 using
   • YIELD
   • ENTER_PROCESSOR_LAYER
   • EXIT_PROCESSOR_LAYER


The control flow when switching from one thread to another

• The control flow is not obvious as some of the procedures reload the stack pointer (SP).
• When a procedure reloads the stack pointer, the place where it transfers control when it executes a return is the procedure whose SP was saved on the stack and was reloaded before the execution of the return.
• ENTER_PROCESSOR_LAYER
   • Changes the state of the thread calling YIELD from RUNNING to RUNNABLE
   • Saves the state of the procedure calling it, YIELD, on the stack
   • Loads the processor's registers with the state of the processor thread, thus starting the SCHEDULER
• EXIT_PROCESSOR_LAYER
   • Saves the state of the processor thread into the corresponding PROCESSOR_TABLE and loads the state of the thread selected by the SCHEDULER to run (in our example, thread 6) into the processor's registers
   • Loads the SP with the values saved by ENTER_PROCESSOR_LAYER
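The two-step switch (user thread to processor thread, processor thread to the next user thread) can be imitated with Python generators. This is only an analogy under stated assumptions: a generator's `yield` plays the role of YIELD, resuming a generator with `next` plays the role of EXIT_PROCESSOR_LAYER, and the names worker, scheduler, and run_demo are hypothetical. No real stack pointers are saved or reloaded here.

```python
RUNNABLE, RUNNING, DONE = "RUNNABLE", "RUNNING", "DONE"


def worker(name, steps, log):
    """A user-layer thread: does one unit of work, then gives up the processor."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # YIELD: control goes back to the scheduler loop


def scheduler(threads):
    """Processor-layer loop: circle through the thread table until all threads finish."""
    table = [{"gen": g, "state": RUNNABLE} for g in threads]
    current = -1
    while any(entry["state"] != DONE for entry in table):
        # Circle through the thread table looking for a RUNNABLE thread.
        current = (current + 1) % len(table)
        entry = table[current]
        if entry["state"] != RUNNABLE:
            continue
        entry["state"] = RUNNING
        try:
            next(entry["gen"])          # resume the selected thread
            entry["state"] = RUNNABLE   # it called YIELD: RUNNING -> RUNNABLE
        except StopIteration:
            entry["state"] = DONE       # the thread ran to completion
    return table


def run_demo():
    log = []
    scheduler([worker("T0", 2, log), worker("T6", 2, log)])
    return log
```

Each pass through the loop crosses the boundary twice, once when the scheduler resumes a thread and once when that thread yields back.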


Correction: in ENTER_PROCESSOR_LAYER, instead of SCHEDULER() there should be SP ← processor_table[processor].topstack


[Figure: circular buffer with cells 0, 1, 2, …, N-2, N-1. The receiver reads from the buffer location pointed to by out; the sender writes to the buffer location pointed to by in.]

shared structure buffer
    message instance message[N]
    integer in initially 0
    integer out initially 0

procedure SEND (buffer reference p, message instance msg)
    while p.in – p.out = N do nothing      /* if buffer full wait
    p.message [p.in modulo N] ← msg        /* insert message into buffer cell
    p.in ← p.in + 1                        /* increment pointer to next free cell

procedure RECEIVE (buffer reference p)
    while p.in = p.out do nothing          /* if buffer empty wait for message
    msg ← p.message [p.out modulo N]       /* copy message from buffer cell
    p.out ← p.out + 1                      /* increment pointer to next message
    return msg


Implicit assumptions for the correctness of the implementation
1. One sending and one receiving thread. Only one thread updates each shared variable.
2. Sender and receiver threads run on different processors to allow spin locks.
3. in and out are implemented as integers large enough so that they do not overflow (e.g., 64-bit integers).
4. The shared memory used for the buffer provides read/write coherence.
5. The memory provides before-or-after atomicity for the shared variables in and out.
6. The result of executing a statement becomes visible to all threads in program order. No compiler optimization is supported.


In practice…
• Threads run concurrently
• Race conditions may occur: data in the buffer may be overwritten
• Solution: a lock for the bounded buffer
   • the producer acquires the lock before writing
   • the consumer acquires the lock before reading


[Figure: Two senders execute the code concurrently. Processor 1 runs thread A and processor 2 runs thread B; both are connected by the processor-memory bus to the memory that contains the shared data (buffer, in, out). Timeline, with t1 < t2 < t3 < t4:
• Initially the buffer is empty, in = out = 0.
• t1: thread A fills entry 0 with item b.
• t2: thread B fills entry 0 with item a.
• t3: the pointer is incremented, in ← 1.
• t4: the pointer is incremented, in ← 2.
Item b is overwritten; it is lost.]


We have to avoid deadlocks

If a producer thread cannot write because the buffer is full, it has to release the lock to allow the consumer thread to acquire the lock to read; otherwise we have a deadlock.

If a consumer thread cannot read because there is no new item in the buffer, it has to release the lock to allow the producer thread to acquire the lock to write; otherwise we have a deadlock.


In practice…

We have to ensure atomicity of some operations, e.g., updating the pointers


One more pitfall of the previous implementation of bounded buffer

If in and out are long integers (64 or 128 bit), then a load requires two registers, e.g., R1 and R2.

int “00000000FFFFFFFF”
    L R1,int      /* R1 ← 00000000
    L R2,int+1    /* R2 ← FFFFFFFF

Race conditions could affect a load or a store of the long integer.


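Another manifestation of race conditions: incrementing a pointer is not atomic

in ← in+1 is executed as three instructions:
1. L R1,in
2. ADD R1,1
3. ST R1,in

[Figure: two timelines for threads A and B, each executing steps 1, 2, 3 (A1 A2 A3 and B1 B2 B3).
Correct execution: the two sequences do not interleave, giving in ← 1 and then in ← 2.
Incorrect execution: the two sequences interleave, and both threads store in ← 1.]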
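The lost update can be reproduced deterministically by replaying the three instructions in a chosen order. The sketch below is a small simulation, not real machine code: the function run and the step labels ("A1" for thread A executing instruction 1, and so on) are hypothetical names used only for illustration.

```python
def run(schedule):
    """Replay steps such as 'A1' or 'B3' in order and return the final value of in."""
    memory = {"in": 0}
    regs = {"A": 0, "B": 0}   # each thread has its own register R1
    for step in schedule:
        thread, op = step[0], step[1]
        if op == "1":         # L R1,in
            regs[thread] = memory["in"]
        elif op == "2":       # ADD R1,1
            regs[thread] += 1
        elif op == "3":       # ST R1,in
            memory["in"] = regs[thread]
    return memory["in"]


# The two sequences do not interleave: both increments take effect.
correct = run(["A1", "A2", "A3", "B1", "B2", "B3"])

# Both threads load the old value before either stores: one increment is lost.
incorrect = run(["A1", "B1", "A2", "B2", "A3", "B3"])
```

In the interleaved schedule both threads load 0, so both store 1, which is exactly the incorrect execution in the figure.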


In practice the threads may run on the same system…
We cannot use spin locks for a thread to wait until an event occurs.
That’s why we have spent time on YIELD…


[Figure: circular buffer with cells 0, 1, 2, …, N-2, N-1. The receiver reads from the buffer location pointed to by out; the sender writes to the buffer location pointed to by in.]

shared structure buffer
    message instance message[N]
    integer in initially 0
    integer out initially 0
    lock instance buffer_lock initially UNLOCKED

procedure SEND (buffer reference p, message instance msg)
    ACQUIRE (p.buffer_lock)
    while p.in – p.out = N do              /* if buffer full wait
        RELEASE (p.buffer_lock)
        YIELD()
        ACQUIRE (p.buffer_lock)
    p.message [p.in modulo N] ← msg        /* insert message into buffer cell
    p.in ← p.in + 1                        /* increment pointer to next free cell
    RELEASE (p.buffer_lock)

procedure RECEIVE (buffer reference p)
    ACQUIRE (p.buffer_lock)
    while p.in = p.out do                  /* if buffer empty wait for message
        RELEASE (p.buffer_lock)
        YIELD()
        ACQUIRE (p.buffer_lock)
    msg ← p.message [p.out modulo N]       /* copy message from buffer cell
    p.out ← p.out + 1                      /* increment pointer to next message
    RELEASE (p.buffer_lock)
    return msg
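A runnable approximation of the lock-plus-YIELD bounded buffer is sketched below in Python. Several things are assumptions rather than part of the slides: threading.Lock stands in for ACQUIRE/RELEASE, time.sleep(0) stands in for YIELD, the field is named in_ because in is a Python keyword, and demo is a hypothetical driver. The reader releases the lock before returning and reads the cell indexed by out.

```python
import threading
import time


class BoundedBuffer:
    def __init__(self, n):
        self.n = n
        self.message = [None] * n
        self.in_ = 0                      # next free cell ("in" is a keyword)
        self.out = 0                      # next message to read
        self.buffer_lock = threading.Lock()


def thread_yield():
    time.sleep(0)   # stand-in for YIELD: let another thread run


def send(p, msg):
    p.buffer_lock.acquire()
    while p.in_ - p.out == p.n:           # buffer full: release, yield, retry
        p.buffer_lock.release()
        thread_yield()
        p.buffer_lock.acquire()
    p.message[p.in_ % p.n] = msg
    p.in_ += 1
    p.buffer_lock.release()


def receive(p):
    p.buffer_lock.acquire()
    while p.in_ == p.out:                 # buffer empty: release, yield, retry
        p.buffer_lock.release()
        thread_yield()
        p.buffer_lock.acquire()
    msg = p.message[p.out % p.n]
    p.out += 1
    p.buffer_lock.release()
    return msg


def demo(count=50, n=4):
    """One sender (this thread) and one receiver thread exchange count messages."""
    buf = BoundedBuffer(n)
    received = []

    def consumer():
        for _ in range(count):
            received.append(receive(buf))

    t = threading.Thread(target=consumer)
    t.start()
    for i in range(count):
        send(buf, i)
    t.join()
    return received
```

Releasing the lock before yielding is what avoids the deadlock described earlier: a waiting thread never holds the lock that the other side needs to make progress.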


Thread coordination

• Critical section: code that accesses a shared resource.
• Race conditions: two or more threads access shared data and the result depends on the order in which the threads access the shared data.
• Mutual exclusion: only one thread should execute a critical section at any one time.
• Scheduling algorithms: decide which thread to choose when multiple threads are in a RUNNABLE state
   • FIFO – first in first out
   • LIFO – last in first out
   • Priority scheduling
   • EDF – earliest deadline first
• Preemption: the ability to stop a running activity and start another one with a higher priority.
• Side effects of thread coordination
   • Deadlock
   • Priority inversion: a lower priority activity is allowed to run before one with a higher priority


Solutions to thread coordination problems must satisfy a set of conditions

1. Safety: The required condition will never be violated.
2. Liveness: The system should eventually progress irrespective of contention.
3. Freedom From Starvation: No process should be denied progress forever. That is, every process should make progress in a finite time.
4. Bounded Wait: Every process is assured of not more than a fixed number of overtakes by other processes in the system before it makes progress.
5. Fairness: dependent on the scheduling algorithm
   • FIFO: No process will ever overtake another process.
   • LRU: The process which received the service least recently gets the service next.

For example, for the mutual exclusion problem the solution should guarantee that:
• Safety: the mutual exclusion property is never violated
• Liveness: a thread will access the shared resource in a finite time
• Freedom from starvation: a thread will access the shared resource in a finite time
• Bounded wait: a thread will access the shared resource after at most a fixed number of accesses by other threads.


Thread coordination problems

• Dining philosophers
• Critical section


A solution to critical section problem

• Applies only to two threads Ti and Tj, with i, j ∈ {0,1} and i ≠ j, which share
   • integer turn: if turn=i then it is the turn of Ti to enter the critical section
   • boolean flag[2]: if flag[i]=TRUE then Ti is ready to enter the critical section
• To enter the critical section thread Ti
   • sets flag[i]=TRUE
   • sets turn=j
• If both threads want to enter then turn will end up with a value of either i or j and the corresponding thread will enter the critical section.
• Ti enters the critical section only if either flag[j]=FALSE or turn=i
• The solution is correct
   • Mutual exclusion is guaranteed
   • Liveness is ensured
   • Bounded waiting is met
• But this solution may not work, as load and store instructions can be interrupted on modern computer architectures.
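The entry protocol above can be written out as a short Python sketch. It is an illustration under a strong assumption: it relies on the standard CPython interpreter executing these loads and stores one at a time and in program order, which is exactly the property the last bullet warns is not guaranteed on real hardware. The class name Peterson and the driver demo are hypothetical.

```python
import threading
import time


class Peterson:
    """Two-thread mutual exclusion with a shared turn variable and flag[2]."""

    def __init__(self):
        self.flag = [False, False]
        self.turn = 0

    def enter(self, i):
        j = 1 - i
        self.flag[i] = True      # Ti is ready to enter
        self.turn = j            # give priority to the other thread
        # Wait while Tj is ready and it is Tj's turn.
        while self.flag[j] and self.turn == j:
            time.sleep(0)        # let the other thread run instead of pure spinning

    def leave(self, i):
        self.flag[i] = False


def demo(rounds=300):
    """Two threads increment a shared counter inside the critical section."""
    mutex = Peterson()
    total = [0]

    def worker(i):
        for _ in range(rounds):
            mutex.enter(i)
            total[0] += 1        # critical section
            mutex.leave(i)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0]
```

The wait condition is the negation of the entry condition on the slide: Ti proceeds as soon as flag[j] is FALSE or turn equals i.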


Deadlocks

They happen quite often in real life and the proposed solutions are not always logical: “When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone.” (a pearl from Kansas legislation)

Other examples: a deadlocked jury, a deadlocked legislative body.


Examples of deadlock

• Traffic only in one direction.
• Solution: one car backs up (preempt resources and rollback). Several cars may have to be backed up.
• Starvation is possible.


Thread deadlock

• Deadlocks prevent sets of concurrent threads/processes from completing their tasks.
• How does a deadlock occur? A set of blocked threads, each holding a resource and waiting to acquire a resource held by another thread in the set.
• Example: locks A and B, initialized to 1

    P0            P1
    wait (A);     wait (B);
    wait (B);     wait (A);

• Aim: prevent or avoid deadlocks


System model

• Resource types R1, R2, . . ., Rm (CPU cycles, memory space, I/O devices)
• Each resource type Ri has Wi instances.
• Resource access model:
   • request
   • use
   • release


Simultaneous conditions for deadlock

• Mutual exclusion: only one process at a time can use a resource.
• Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes.
• No preemption: a resource can be released only voluntarily by the process holding it (presumably after that process has finished).
• Circular wait: there exists a set {P0, P1, …, Pn} of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, …, Pn–1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.


Wait for graphs
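A wait-for graph has one node per thread and an edge from each waiting thread to the thread that holds the resource it wants; a cycle in the graph corresponds to the circular wait condition. The sketch below shows one way to look for such a cycle with a depth-first search. The function name find_cycle and the dictionary representation of the graph are assumptions made for this example.

```python
def find_cycle(wait_for):
    """Return a list of threads forming a cycle, or None if there is no cycle.

    wait_for maps each thread name to the list of threads it is waiting for.
    """
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in wait_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path: the path closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# P0 holds A and waits for B; P1 holds B and waits for A.
deadlocked = find_cycle({"P0": ["P1"], "P1": ["P0"]})
no_deadlock = find_cycle({"P0": ["P1"], "P1": []})
```

For the two-lock example with P0 and P1, the graph has the edges P0 → P1 and P1 → P0, so the search reports the cycle P0, P1, P0.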


Semaphores

• Abstract data structure introduced by Dijkstra to reduce the complexity of thread coordination; a semaphore s has two components
   • C: count giving the status of the contention for the resource guarded by s
   • L: list of threads waiting for the semaphore s
• Counting semaphore – for an arbitrary resource count. Supports two operations:
   • V – signal() increments the semaphore count C
   • P – wait() decrements the semaphore count C
• Binary semaphore: C is either 0 or 1.


The wait and signal operations

P (s) (wait) {
    If s.C > 0 then s.C − −;
    else join s.L;
}

V (s) (signal) {
    If s.L is empty then s.C + +;
    else release a process from s.L;
}
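The P and V operations above translate almost line by line into Python. The sketch below is one possible rendering, with assumptions stated here and in the comments: each waiting thread is represented in the list L by a threading.Event, and an internal guard lock makes P and V before-or-after actions. The class is named Semaphore after the slide and is unrelated to threading.Semaphore; demo is a hypothetical driver.

```python
import threading
from collections import deque


class Semaphore:
    def __init__(self, count):
        self.C = count                     # count of available units
        self.L = deque()                   # waiting threads, one Event each
        self._guard = threading.Lock()     # assumption: makes P and V atomic

    def P(self):
        """wait: take a unit if one is available, otherwise join the list L."""
        with self._guard:
            if self.C > 0:
                self.C -= 1
                return
            ticket = threading.Event()
            self.L.append(ticket)
        ticket.wait()                      # block outside the guard

    def V(self):
        """signal: increment C if nobody waits, otherwise release one waiter."""
        with self._guard:
            if not self.L:
                self.C += 1
            else:
                self.L.popleft().set()     # hand the unit directly to a waiter


def demo(threads=4, rounds=100):
    """A binary semaphore guards a shared counter."""
    s = Semaphore(1)
    total = [0]

    def worker():
        for _ in range(rounds):
            s.P()
            total[0] += 1                  # critical section
            s.V()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0], s.C
```

When V finds a waiter it does not increment C; the unit passes straight to the released thread, which matches the else branch on the slide.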


Monitors

• Semaphores can be used incorrectly
   • multiple threads may be allowed to enter the critical section guarded by the semaphore
   • may cause deadlocks
• Threads may access the shared data directly without checking the semaphore.
• Solution: encapsulate shared data with access methods to operate on them.
• Monitors: an abstract data type that allows access to shared data with specific methods that guarantee mutual exclusion
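The idea of encapsulating shared data behind methods that enforce mutual exclusion can be sketched as a small Python class. This is an illustration, not a full monitor with condition variables: CounterMonitor and demo are hypothetical names, and the leading underscores only signal by convention that callers should not touch the data directly.

```python
import threading


class CounterMonitor:
    """Shared counter that can only be reached through mutually exclusive methods."""

    def __init__(self):
        self._lock = threading.Lock()   # one lock guards every access method
        self._value = 0                 # the encapsulated shared data

    def increment(self):
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def demo(threads=4, rounds=250):
    monitor = CounterMonitor()

    def worker():
        for _ in range(rounds):
            monitor.increment()         # callers never see the lock

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return monitor.value()
```

Because the lock is acquired inside the methods, a caller cannot forget to take it, which is the misuse that the bullets attribute to raw semaphores.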


Asynchronous events and signals

• Signals, or software interrupts, were originally introduced in Unix to notify a process about the occurrence of a particular event in the system.
• Signals are analogous to hardware I/O interrupts:
   • When a signal arrives, control will abruptly switch to the signal handler.
   • When the handler is finished and returns, control goes back to where it came from.
• After receiving a signal, the receiver reacts to it in a well-defined manner. That is, a process can tell the system (OS) what it wants to do when a signal arrives:
   • Ignore it.
   • Catch it and deliver it. In this case, it must specify (register) the signal handling procedure. This procedure resides in the user space. The kernel will make a call to this procedure during the signal handling and control returns to the kernel after it is done.
   • Kill the process (default for most signals).
• Examples: Event - child exit, signal - to parent. Control signal from keyboard.


Signals state and implementation

A signal has the following states:
• Signal sent - A process can send a signal to one of its group member processes (parent, sibling, children, and further descendants).
• Signal delivered - Signal bit is set.
• Pending signal - delivered but not yet received (action has not been taken).
• Signal lost - either ignored or overwritten.

Implementation: Each process has an area in kernel space (created by default) called the signal descriptor, with a bit for each signal. Setting a bit delivers the signal, and resetting the bit indicates that the signal has been received. A signal could be blocked/ignored; this requires an additional bit for each signal. Most signals are system controlled signals.


Back to thread coordination with a bounded buffer

• The bounded buffer is a shared resource, thus it must be protected; the critical section is implemented with a lock.
• The lock must be released if the thread cannot continue.
• Spin lock: a lock which involves busy wait.
• The thread must relinquish control of the processor; it must YIELD.




Coordination with events and signals

• We introduce two events
   • p_room: event which signals that there is room in the buffer
   • p_notempty: event which signals that there is a new item in the buffer
• We also introduce two new system calls
   • WAIT(ev): wait until the event ev occurs
   • NOTIFY(ev): notify the other process that event ev has occurred
• SEND will wait if the buffer is full until it is notified that RECEIVE has created more room: SEND → WAIT(p_room) and RECEIVE → NOTIFY(p_room)
• RECEIVE will wait if there is no new item in the buffer until it is notified by SEND that a new item has been written: RECEIVE → WAIT(p_notempty) and SEND → NOTIFY(p_notempty)


NOTIFY could be sent before the WAIT and this causes problems

• The NOTIFY should always be sent after the WAIT. If the sender and the receiver run on two different processors there could be a race condition for the notempty event.
• Tension between modularity and locks
• Several possible solutions: AWAIT/ADVANCE, semaphores, etc.


AWAIT - ADVANCE solution

• A new state, WAITING, and two before-or-after actions that take a RUNNING thread into the WAITING state and back to the RUNNABLE state.
• eventcount: variables with an integer value shared between threads and the thread manager; they are like events but have a value.
• A thread in the WAITING state waits for a particular value of the eventcount.
• AWAIT(eventcount,value)
   • If eventcount > value, control is returned to the thread calling AWAIT and this thread will continue execution
   • If eventcount ≤ value, the state of the thread calling AWAIT is changed to WAITING and the thread is suspended.
• ADVANCE(eventcount)
   • increments the eventcount by one, then
   • searches the thread_table for threads waiting for this eventcount
   • if it finds a thread and the eventcount exceeds the value the thread is waiting for, then the state of the thread is changed to RUNNABLE
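The AWAIT and ADVANCE rules can be approximated in Python with a condition variable. The sketch below is an assumption-laden illustration: threading.Condition replaces the thread_table search (blocked threads re-check their own value when woken), the method is called await_ because await is a Python keyword, and Eventcount and demo are hypothetical names.

```python
import threading


class Eventcount:
    def __init__(self):
        self.count = 0
        # Assumption: the condition variable plays the role of the thread manager.
        self._cond = threading.Condition()

    def await_(self, value):
        """Return once count > value; otherwise block (the WAITING state)."""
        with self._cond:
            while self.count <= value:
                self._cond.wait()

    def advance(self):
        """Increment count by one and wake the waiters so they re-check their value."""
        with self._cond:
            self.count += 1
            self._cond.notify_all()


def demo():
    """A waiter that needs count > 2 resumes only after the third ADVANCE."""
    ec = Eventcount()
    done = threading.Event()

    def waiter():
        ec.await_(2)
        done.set()

    t = threading.Thread(target=waiter)
    t.start()
    ec.advance()
    ec.advance()
    early = done.is_set()     # count is 2, which does not exceed 2
    ec.advance()
    t.join(timeout=5)
    return early, done.is_set()
```

Because the eventcount keeps its value, an ADVANCE that happens before the matching AWAIT is not lost; the later AWAIT simply finds count already above the value and returns at once.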


Implementation of AWAIT and ADVANCE


Solution for a single sender and single receiver


Supporting multiple senders: the sequencer

• Sequencer: shared variable supporting thread sequence coordination; it allows threads to be ordered and is manipulated using two before-or-after actions.
• TICKET(sequencer) returns a non-negative value which increases by one at each call. Two concurrent threads calling TICKET on the same sequencer will receive different values based upon the timing of the call; the one calling first will receive a smaller value.
• READ(sequencer) returns the current value of the sequencer.
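A sequencer behaves like the ticket dispenser at a service counter. The following Python sketch shows TICKET and READ under two assumptions: an internal threading.Lock provides the before-or-after property, and READ is taken to return the value that the next TICKET call would hand out. The names Sequencer and demo are hypothetical.

```python
import threading


class Sequencer:
    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()   # assumption: makes TICKET a before-or-after action

    def ticket(self):
        """Return the current value and advance it by one, atomically."""
        with self._lock:
            value = self._next
            self._next += 1
            return value

    def read(self):
        """Return the current value without changing it."""
        with self._lock:
            return self._next


def demo(threads=8, per_thread=50):
    """Concurrent callers must all receive different ticket values."""
    seq = Sequencer()
    taken = []
    taken_lock = threading.Lock()

    def worker():
        mine = [seq.ticket() for _ in range(per_thread)]
        with taken_lock:
            taken.extend(mine)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sorted(taken), seq.read()
```

With several senders, each one can take a ticket first and then wait for its turn, which is how the values order the threads.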


Multiple sender solution; only the SEND must be modified


The modular design

• VM attempts to translate the virtual memory address to a physical memory address.
• If the page is not in main memory, VM generates a page-fault exception.
• The exception handler uses a SEND to send the page number to an MLMM port.
• The SEND invokes ADVANCE, which wakes up a thread of MLMM.
• The MLMM invokes AWAIT on behalf of the thread interrupted due to the page fault.
• The AWAIT releases the processor to the SCHEDULER thread.


[Figure: handling of a page fault. Components shown: Application thread 1, Virtual Memory Manager, Exception Handler, Scheduler, Multi-Level Memory Manager, Application thread 2. Steps recoverable from the figure, in order:
• IR ← PC
• Translate (PC) into (Page#, Displ). Is (Page#) in primary storage?
   • YES: compute the physical address of the instruction; IR ← PC
   • NO: page fault; save PC; handle page fault
• Identify Page #
• Issue AWAIT on behalf of thread 1
• AWAIT
• SEND(Page #)
• Thread 1 WAITING, thread 2 RUNNING
• Load PC of thread 2; IR ← PC
• Find a block in primary storage. Is the “dirty” bit of the block ON?
   • YES: write block to secondary storage
   • NO: fetch block corresponding to missing page
• I/O operation completes: ADVANCE
• Thread 1 RUNNING
• Load PC of thread 1; IR ← PC]


Name resolution in multi-level memories

• We consider pairs of layers:
   • Upper level of the pair: primary
   • Lower level of the pair: secondary
• The top level is managed by the application, which generates LOAD and STORE instructions to/from CPU registers from/to named memory locations.
• The processor issues READs/WRITEs to named memory locations. The name goes to the primary memory device located on the same chip as the processor, which searches the name space of the on-chip cache (L1 cache), the primary device with the L2 cache as secondary device.
• If the name is not found in the L1 cache name space, the Multi-Level Memory Manager (MLMM) looks at the L2 cache (off-chip cache), which becomes the primary with the main memory as secondary.
• If the name is not found in the L2 cache name space, the MLMM looks at the main memory name space. Now the main memory is the primary device.
• If the name is not found in the main memory name space, then the Virtual Memory Manager is invoked.
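The search order described above (L1 cache, then L2 cache, then main memory, then the Virtual Memory Manager) can be sketched as a simple lookup chain. The example is a simplification under stated assumptions: each level is modeled as a Python dictionary from names to values, nothing is copied into the upper level on a miss, and resolve and page_fault are hypothetical names.

```python
def resolve(name, l1, l2, main_memory, page_fault):
    """Return (level that answered, value) for a named memory location.

    l1, l2, and main_memory are dicts standing in for the name space of each level.
    page_fault is a callable standing in for the Virtual Memory Manager.
    """
    for level_name, level in (("L1", l1), ("L2", l2), ("main", main_memory)):
        if name in level:
            return level_name, level[name]
    # Not found at any level: the Virtual Memory Manager is invoked.
    return "VM", page_fault(name)


# Hypothetical contents of each level.
l1 = {"a": 1}
l2 = {"a": 1, "b": 2}
main_memory = {"a": 1, "b": 2, "c": 3}


def page_fault(name):
    # Stand-in for fetching the page from secondary storage.
    return f"page containing {name} fetched from secondary storage"


hit_l1 = resolve("a", l1, l2, main_memory, page_fault)
hit_l2 = resolve("b", l1, l2, main_memory, page_fault)
hit_main = resolve("c", l1, l2, main_memory, page_fault)
miss = resolve("d", l1, l2, main_memory, page_fault)
```

Each miss moves the search one pair of layers down, so the level that was secondary in one step becomes the primary in the next.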
