
Concurrent Programming I

Anna Lina Ruscelli – Scuola Superiore Sant’Anna

Contact info
• Email: [email protected]
• RTOS course web page: http://retis.sssup.it/~marko/rtos.html
• RTOS course mailing list: [email protected]

Many course slides and other material are courtesy of Prof. Giuseppe Lipari and Ing. Paolo Gai.

Real-Time Operating Systems A.A. 2009-2010 A.L.Ruscelli


Reference material
• PDF notes and slides (available on the web page)
• P. Ancilotti and M. Boari, Principi e tecniche di programmazione concorrente, Utet libreria, 1987 (in the library, ONLY pages 67-107)
• Paolo Ancilotti, Maurelio Boari, Anna Ciampolini and Giuseppe Lipari, Sistemi Operativi, McGraw-Hill, June 2004
• Linux man pages (as a reference for POSIX programming)
• Notes on concurrent programming in UNIX systems
• Other reference books are listed on the web page


Outline
• Introduction to concurrency
• Model of concurrency: shared memory
• Critical sections
• Synchronization
• Semaphores

The need for concurrency
There are many reasons for concurrency:
• functional
• performance
• expressive power

Functional
• many users may be connected to the same system at the same time
  • each user can have their own processes, which execute concurrently with the processes of the other users
• a single user may perform many operations concurrently
  • for example: listen to music, write with a word processor, burn a CD, etc.
  • they are all different and independent activities
  • they can be done at the same time

The need for concurrency (2)
Performance
• take advantage of blocking time
  • while one thread waits for a blocking condition, another thread performs another operation
• parallelism in multi-processor machines
  • on a multi-processor machine, independent activities can be carried out on different processors at the same time

Expressive power
• many control applications are inherently concurrent
• concurrency support helps in expressing that concurrency, making application development simpler


Concurrency model (theoretical)
• A system is a set of concurrent activities
  • they can be processes or threads
• They interact in two ways
  • they access the hardware resources (processor, disk, memory, etc.)
  • they exchange data
• These activities compete for the resources and/or cooperate for some common objective


Resource
• A resource can be
  • a HW resource, like an I/O device
  • a SW resource, i.e. a data structure
• In both cases, access to a resource must be regulated to avoid interference
  • example 1: if two processes want to print on the same printer, their accesses must be serialized, otherwise the two printouts could be intermingled!
  • example 2: if two threads access the same data structure, the operations on the data must be serialized, otherwise the data could become inconsistent!


Interaction model
• Activities can interact according to two fundamental models
  • shared memory
    • all activities access the same memory space
  • message passing
    • all activities communicate with each other by sending messages through OS primitives
• We will analyze both models in the following slides


Cooperative vs. Competitive
The interaction between concurrent activities (threads or processes) can be classified into:
• competitive concurrency
  • different activities compete for the resources
  • one activity does not know anything about the other ones
  • the OS must manage the resources so as to
    • avoid conflicts
    • be fair
• cooperative concurrency
  • many activities cooperate to perform an operation
  • every activity knows about the others
  • they must synchronize on particular events
• interference


Competition
• Cooperative and competitive activities need different models of execution and synchronization
  • competing activities need to be “protected” from each other
    • separate memory spaces
  • the allocation of the resource and the synchronization must be centralized
    • competitive activities request services from a central manager (the OS or some dedicated process), which allocates the resources in a fair way
• Client/Server model
  • communication is usually done through messages
• The process model of execution is the best one


Competition (2)
In a client/server system
• a server manages the resource exclusively
  • for example, the printer
• if a process needs to access the resource, it sends a request to the server
  • for example, printing a file, or asking for the status
• the server can send back the responses
• the server can also be on a remote system
Two basic primitives
• send and receive

[Figure: Client 1 and Client 2 exchanging messages with the Server]


Cooperation
• Cooperative activities know about each other
  • they do not need memory protection
    • without memory protection, there is less overhead
  • they need to access the same data structures
  • allocation of the resource is de-centralized
• Shared memory is the best model
• The thread model of execution is the best one


Cooperation and Competition
• Competition is best resolved by using the message passing model
  • however, it can be implemented using a shared memory paradigm too
• Cooperation is best implemented by using the shared memory paradigm
  • however, it can be realized by using pure message passing mechanisms
• Shared memory or message passing?
  • in the past, there were OSs that supported only shared memory or only message passing


Cooperation and Competition (2)
• A general purpose OS needs to support both models
  • we need at least protection for competing activities
  • we need to support client/server models, so we need message passing primitives
  • we need to support shared memory to reduce the overhead
• Some special OSs support only one of the two
  • for example, many RTOSs support only shared memory


Interference
• There is a third kind of interaction: interference
• It is due to two kinds of programming errors:
  • interactions between processes that are not required by the semantics of the problem
  • erroneous solutions to the problems of interaction
• Interference problems are usually time-dependent problems


Model of Concurrency
• Shared memory
• Critical section
• Synchronization


Shared memory
• Shared memory communication
  • it was the first model to be supported in old OSs
  • it is the simplest one and the closest to the machine
  • all threads can access the same memory locations


Hardware analogy
• An abstract model that gives a good analogy is the following
  • many HW CPUs, each one running one activity (thread)
  • one shared memory


Resource allocation
• Allocation of a resource can be
  • static: once the resource is granted, it is never revoked
  • dynamic: the resource can be granted and revoked dynamically (manager)
• Access to a resource can be
  • dedicated: only one activity at a time is granted access to the resource
  • shared: many activities can access the resource at the same time
  (mutual exclusion)


Mutual exclusion problem
• We do not know in advance the relative speed of the processes
  • hence, we do not know the order of execution of the hardware instructions
• Recall the example of incrementing variable x
  • incrementing x is not an atomic operation
  • atomic behavior can be obtained using interrupt disabling or special atomic instructions


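Example 1

    /* Shared memory */
    int x;

    void *threadA(void *arg)
    {
        ...;
        x = x + 1;
        ...;
    }

    void *threadB(void *arg)
    {
        ...;
        x = x + 1;
        ...;
    }

A possible sequence of machine instructions (each thread has its own copy of R0):

    ...
    LD  R0, x    (TA)   x = 0, R0 = 0
    LD  R0, x    (TB)   x = 0, R0 = 0
    INC R0       (TB)   x = 0, R0 = 1
    ST  x, R0    (TB)   x = 1, R0 = 0
    INC R0       (TA)   x = 1, R0 = 1
    ST  x, R0    (TA)   x = 1, R0 = 0
    ...

Bad interleaving! Two increments were executed, but x is 1 instead of 2.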
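The bad interleaving can be replayed deterministically in a single thread. The following is only a sketch: the function name replay_lost_update and the variables r0_a and r0_b (the private copies of register R0 of TA and TB) are hypothetical names introduced for the illustration.

```c
#include <assert.h>

/* Shared variable, as in Example 1. */
static int x = 0;

/* Replays the bad interleaving by hand: each "thread" has a private copy
 * of register R0, and the six instructions run in the order of the trace. */
int replay_lost_update(void)
{
    int r0_a, r0_b;   /* per-thread copies of R0 (hypothetical names) */

    x = 0;
    r0_a = x;         /* LD  R0, x   (TA) */
    r0_b = x;         /* LD  R0, x   (TB) */
    r0_b = r0_b + 1;  /* INC R0      (TB) */
    x = r0_b;         /* ST  x, R0   (TB) */
    r0_a = r0_a + 1;  /* INC R0      (TA) */
    x = r0_a;         /* ST  x, R0   (TA): overwrites the update of TB */
    return x;
}
```

Calling replay_lost_update() yields 1: the update made by TB is lost because TA stores a value computed from a stale copy of x.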


Example 2

    // Shared object (sw resource)
    struct A_t {
        int a;
        int b;
    } A;

    void A_init(struct A_t *x) { x->a = 1; x->b = 1; }
    void A_inc(struct A_t *x)  { x->a++; x->b++; }
    void A_mul(struct A_t *x)  { x->b *= 2; x->a *= 2; }

    void *threadA(void *arg)
    {
        ...
        A_inc(&A);
        ...
    }

    void *threadB(void *arg)
    {
        ...
        A_mul(&A);
        ...
    }

Consistency: after each operation, a == b

    a = a + 1;   TA   a = 2
    b = b * 2;   TB   b = 2
    b = b + 1;   TA   b = 3
    a = a * 2;   TB   a = 4

Resource in a non-consistent state!
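As with Example 1, the trace can be replayed step by step in one thread to check the final state. This is a sketch under the assumption that the four statements execute exactly in the order shown; A_consistent and replay_inc_mul are hypothetical helper names, not part of the original example.

```c
#include <assert.h>
#include <stdbool.h>

struct A_t { int a; int b; };

/* Consistency property of Example 2: a == b. */
bool A_consistent(const struct A_t *x) { return x->a == x->b; }

/* Replays the trace: the statements of A_inc (TA) and A_mul (TB),
 * interleaved in the order shown in the example. */
struct A_t replay_inc_mul(void)
{
    struct A_t A = { 1, 1 };  /* state after A_init */

    A.a = A.a + 1;   /* TA: a = 2 */
    A.b = A.b * 2;   /* TB: b = 2 */
    A.b = A.b + 1;   /* TA: b = 3 */
    A.a = A.a * 2;   /* TB: a = 4 */
    return A;
}
```

The returned state has a = 4 and b = 3, so A_consistent() is false, whereas either sequential order (A_inc then A_mul, or A_mul then A_inc) leaves a == b.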


Consistency
• For any resource, we can state a set of consistency properties
  • a consistency property Ci is a boolean expression on the values of the internal variables
  • a consistency property must hold before and after each operation
  • it does not need to hold during an operation
  • if the operations are properly serialized, the consistency properties will always hold
• Formal verification
  • let R be a resource and let C(R) be a set of consistency properties on the resource R
  • C(R) = {Ci}

Definition: a concurrent program is correct if, for every possible interleaving of the operations on the resource, ∀ Ci ∈ C(R), Ci holds.


Example 3: Circular array implementation of a FIFO queue
• head: index of the first free element in the queue
  • the next element will be inserted here
• tail: index of the first occupied element in the queue
  • it is the one that will be extracted next

[Figure: circular array with the tail and head indices marked]

Circular array: implementation of a FIFO queue

    #include <stdbool.h>
    typedef bool boolean;

    struct CA {
        int array[10];
        int head, tail, num;
    };

    void init(struct CA *ca)
    {
        ca->head = 0; ca->tail = 0;
        ca->num = 0;
    }

    boolean insert(struct CA *ca, int elem)
    {
        if (ca->num == 10) return false;   /* queue full */
        else {
            ca->array[ca->head] = elem;
            ca->head = (ca->head + 1) % 10;
            ca->num++;
            return true;
        }
    }

    boolean extract(struct CA *ca, int *elem)
    {
        if (ca->num == 0) return false;    /* queue empty */
        else {
            *elem = ca->array[ca->tail];
            ca->tail = (ca->tail + 1) % 10;
            ca->num--;
        }
        return true;
    }

Consistency properties (suppose num++ and num-- are atomic)

    C1: if (num == 0 || num == 10) head == tail;
    C2: if (0 < num < 10) num == (head - tail) % 10
    C3: num == NI – NE
    C4: (insert x)
        pre:  if (num < 10)
        post: num == num + 1 && array[(head-1)%10] == x;
    C5: (extract &x)
        pre:  if (num > 0)
        post: num == num - 1 && x == array[(tail-1)%10];



C1: when the queue is empty, or when the queue is full, head == tail
C3: num is equal to the number of times that insert() has been called (NI) minus the number of times that extract() has been called (NE)
C4: if element x has been inserted, it must eventually be extracted, after an appropriate number of extracts
C5: every element that is extracted has been inserted at some time in the past

The last two can also be expressed as follows: let (x1, x2, ..., xk) be the sequence of inserted elements, and let (y1, y2, ..., yk) be the sequence of extracted elements; then ∀i = 1, ..., k: yi = xi
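Properties C1 and C2 are boolean expressions on the internal variables, so they can be written as a checking function. The sketch below is illustrative only: the name ca_consistent and the constant CA_SIZE are assumptions introduced here, and CA_SIZE is added before the modulo because in C the result of % can be negative when head < tail.

```c
#include <assert.h>
#include <stdbool.h>

#define CA_SIZE 10

struct CA { int array[CA_SIZE]; int head, tail, num; };

void init(struct CA *ca) { ca->head = 0; ca->tail = 0; ca->num = 0; }

bool insert(struct CA *ca, int elem)
{
    if (ca->num == CA_SIZE) return false;
    ca->array[ca->head] = elem;
    ca->head = (ca->head + 1) % CA_SIZE;
    ca->num++;
    return true;
}

bool extract(struct CA *ca, int *elem)
{
    if (ca->num == 0) return false;
    *elem = ca->array[ca->tail];
    ca->tail = (ca->tail + 1) % CA_SIZE;
    ca->num--;
    return true;
}

/* Checks C1 and C2 on the current state of the queue. */
bool ca_consistent(const struct CA *ca)
{
    if (ca->num < 0 || ca->num > CA_SIZE) return false;
    if (ca->num == 0 || ca->num == CA_SIZE)
        return ca->head == ca->tail;                              /* C1 */
    return ca->num == (ca->head - ca->tail + CA_SIZE) % CA_SIZE;  /* C2 */
}
```

In a sequential program, ca_consistent() holds after every call to init(), insert() and extract(); such a check can be placed in assertions while debugging, but it says nothing about what happens in the middle of an operation.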




Example 3: empty queue
• head: index of the first free element in the queue
  • the next element will be inserted here
• tail: index of the first occupied element in the queue
  • it is the one that will be extracted next
• the queue is empty, hence head == tail

[Figure: empty circular array, with tail and head pointing to the same element]

Example 3: circular array - insert
(worked out on the blackboard)


Example 3: insert

    boolean insert(struct CA *ca, int elem)
    {
        if (ca->num == 10) return false;
        ca->array[ca->head] = elem;
        ca->head = (ca->head + 1) % 10;
        ca->num++;
        return true;
    }

[Figure: circular array containing 3, 8, 2, 5, with the tail index marked; num = (head - tail) % 8 ↦ num = 4. After insert(ca, 9), the value 9 is stored and head and num have been increased.]


Example 3: circular array - extract
(worked out on the blackboard)


Example 3: concurrent insert (interference)
If the insert() operation is performed by two processes, some consistency property may be violated!

    int insert_CA(struct CircularArray_t *a, int elem);

    void *threadA(void *arg)
    {
        ...
        insert_CA(&queue, 5);
        ...
    }

    void *threadB(void *arg)
    {
        ...
        insert_CA(&queue, 2);
        ...
    }


Example 4: concurrent insert (interference)
TA executes:

    if (a->num == 10) return 0;
    else {
        a->array[a->head] = 5;
        a->head = (a->head + 1) % 10;
        a->num++;                         /* (**) */
        return 1;
    }

TB executes:

    if (a->num == 10) return 0;
    else {
        a->array[a->head] = 2;
        a->head = (a->head + 1) % 10;
        a->num++;                         /* (*) */
        return 1;
    }

A possible interleaving:

    if (a->num == 10) return 0;           (TA)
    a->array[a->head] = 5;                (TA)
    if (a->num == 10) return 0;           (TB)
    a->array[a->head] = 2;                (TB)
    a->head = (a->head + 1) % 10;         (TB)
    a->num++;                     (*)     (TB)
    return 1;                             (TB)
    a->head = (a->head + 1) % 10;         (TA)
    a->num++;                     (**)    (TA)
    return 1;                             (TA)


Example 4: concurrent insert
[Figure: circular array containing 3, 8, 2, with the tail and head indices marked]
Two threads both call insert(ca, 9).

Thread 1 calls insert(ca, 9) and executes the first part of the function:

    boolean insert(struct CA *ca, int elem)
    {
        if (ca->num == 10) return false;
        ca->array[ca->head] = elem;

Thread 1 is preempted by the second thread, which executes a complete insert:

    boolean insert(struct CA *ca, int elem)
    {
        if (ca->num == 10) return false;
        ca->array[ca->head] = elem;
        ca->head = (ca->head + 1) % 10;
        ca->num++;
        return true;
    }

The second thread completes and thread 1 resumes:

        ca->head = (ca->head + 1) % 10;
        ca->num++;
        return true;
    }

There is a hole! Both threads wrote 9 into the same element, but head was advanced twice. At some point, the extract will read a 9 and a random value, instead of two 9s.
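The hole can be reproduced without real threads by executing the statements of the two insert() calls in the interleaved order. This is only a sketch: it starts from an empty queue rather than the one in the figure, skips the full-queue test (the queue is known not to be full), and uses a hypothetical marker EMPTY_SLOT to make the unwritten element visible.

```c
#include <assert.h>

#define CA_SIZE 10
#define EMPTY_SLOT (-1)  /* hypothetical marker for an element never written */

struct CA { int array[CA_SIZE]; int head, tail, num; };

/* Replays the interleaving: thread 1 is preempted right after the store. */
struct CA replay_concurrent_insert(void)
{
    struct CA ca;

    for (int i = 0; i < CA_SIZE; i++) ca.array[i] = EMPTY_SLOT;
    ca.head = 0; ca.tail = 0; ca.num = 0;

    /* thread 1: first part of insert(&ca, 9) */
    ca.array[ca.head] = 9;
    /* preemption: thread 2 runs a complete insert(&ca, 9) */
    ca.array[ca.head] = 9;               /* same element as thread 1 */
    ca.head = (ca.head + 1) % CA_SIZE;
    ca.num++;
    /* thread 1 resumes */
    ca.head = (ca.head + 1) % CA_SIZE;   /* skips an element: the hole */
    ca.num++;
    return ca;
}
```

After the replay num is 2 and head is 2, but only array[0] contains 9; array[1] still holds EMPTY_SLOT, so the second extract would return a value that was never inserted.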


Example 3: Correctness of the Circular Array implementation
• The previous program is not correct, as the last property does not hold
  • the sequence of extracted elements does not correspond to the sequence of inserted elements
• The problem is that the first thread was preempted while updating the data structure at a critical point
  • we must prevent thread 2 from accessing the data structure while another thread is completing an operation on it


Example 3: Correctness
• Proving that a program is not correct is easy, in the sense that it is enough to find a counterexample
• Proving correctness is a very complex task!
  • it is necessary to prove correctness for every possible interleaving of every operation, for every possible input data and for every possible internal state


Insert and Extract
• What happens if an insert() and an extract() are interleaved?
  • let's assume that increments and decrements are atomic operations
  • Producer: thread that inserts elements
  • Consumer: thread that extracts elements
• It can be proved that interleaving exactly one producer and one consumer does not cause any problem
• Proof:
  • if 0 < num < 10, insert() and extract() are independent
  • if num == 0
    • if extract() begins before insert(), it immediately returns false
    • if insert() begins first, extract() still returns false until insert() has incremented num, so it cannot interfere with insert()
  • the same holds when num == 10
• Correctness is guaranteed for one consumer and one producer.


Insert and Extract II
What happens if we exchange the sequence of instructions in insert()?

    boolean insert(struct CA *ca, int elem)
    {
        if (ca->num == 10) return false;
        else {
            ca->num++;                        /* moved before the store */
            ca->array[ca->head] = elem;
            ca->head = (ca->head + 1) % 10;
            return true;
        }
    }

    boolean extract(struct CA *ca, int *elem)
    {
        if (ca->num == 0) return false;
        else {
            *elem = ca->array[ca->tail];
            ca->tail = (ca->tail + 1) % 10;
            ca->num--;
            return true;
        }
    }

It is easy to prove that in this case insert() cannot be safely interleaved with extract(): extract() may see num > 0 before the element has been stored.


Circular array properties
a) If more than one thread executes insert_CA(): inconsistency!
b) If we have only two threads, where one thread calls insert_CA() and the other thread calls extract_CA(): no inconsistency!

The order of the operations is important!
• a wrong order can make the object inconsistent even under assumption b)
  • this happens when num is incremented but the data has not yet been inserted
• in any case, the final result depends on the timing of the different requests (e.g. an insertion with the buffer full)


Exercise: non-atomic increment
• Problem: in the previous examples, we supposed that num++ and num-- are atomic operations
  • what happens if they are not atomic?
• Question: assuming that the operations -- and ++ are not atomic, and assuming that we have only one producer and one consumer, can we make the Circular Array safe?
• Hint: try to replace variable num with two boolean variables, bool empty and bool full


Outline
• Model of concurrency: shared memory
• Critical sections

Critical section: definitions
• The shared object where the conflict may happen is a resource
• The parts of the code where the problem may happen are called critical sections
• A critical section is a sequence of operations that cannot be interleaved with other operations on the same resource
• Two critical sections on the same resource must be properly serialized
• We say that two critical sections on the same resource must execute in MUTUAL EXCLUSION


Mutual Exclusion
There are three ways to obtain mutual exclusion:
1. implementing the critical section as an atomic operation
2. disabling the preemption (system-wide)
3. selectively disabling the preemption (using semaphores and mutual exclusion)

1 - Implementing atomic operations
• In single-processor systems
  • disable interrupts during a critical section
  • non-voluntary context switches are disabled during the critical section!

    CLI;
    <critical section>
    STI;

• Limitations:
  • if the critical section is long, no interrupt can arrive during the critical section
    • consider a timer interrupt that arrives every 1 msec
    • if a critical section lasts for more than 1 msec, a timer interrupt could be lost
  • it must be done only for very short critical sections
  • disabling interrupts is a very low-level solution: it is not possible in user space
  • concurrency is disabled during the critical section!
    • the goal is to avoid conflicts on the resource, not to disable interrupts!


Atomic operations on multiprocessors
• Disabling interrupts is not sufficient
  • disabling interrupts on one processor leaves a thread on another processor free to access the resource
• Solution: use lock() and unlock() operations
  • define a flag s for each resource, and then surround a critical section with lock(&s) and unlock(&s)
• Problems:
  • busy waiting: if the critical section is long, we waste a lot of time
  • cannot be used in single processors!

    int s;
    ...
    lock(&s);
    <critical section>
    unlock(&s);
    ...


Low level synchronisation in SMP
• The atomicity problem cannot be solved by disabling the interrupts!
  • if we disable the interrupts, we protect the code from interrupts
  • it is not easy to protect it from other processors

Both CPU 0 and CPU 1 execute:

    ...
    LD  R0, x
    INC R0
    ST  x, R0
    ...

A possible interleaving:

    ...
    LD  R0, x    (CPU 0)
    LD  R0, x    (CPU 1)
    INC R0       (CPU 0)
    INC R0       (CPU 1)
    ST  x, R0    (CPU 0)
    ST  x, R0    (CPU 1)
    ...

Low level synchronisation in SMP
• Most processors support some special instructions
  • XCH: exchange register with memory location
  • TST: if memory location == 0, set location to 1 and return true (1), else return false (0)
• XCH and TST are atomic!

Their behavior, written as C functions (the hardware executes each body atomically):

    void xch(int *R, int *x)      /* R: register, x: memory location */
    {
        int tmp;
        tmp = *R; *R = *x; *x = tmp;
    }

    int tst(int *x)
    {
        if (*x == 1) return 0;
        else {
            *x = 1;
            return 1;
        }
    }

Locking in multi-processors
• We define one variable s
  • if s == 0, then we can perform the critical operation
  • if s == 1, then we must wait before performing the critical operation
• Using XCH or TST we can implement two functions: lock() and unlock()

lock() with XCH:

    void lock(int *s)
    {
        int a = 1;
        while (a == 1) XCH(s, &a);   /* a receives the old value of *s */
    }

lock() with TST (alternative):

    void lock(int *s)
    {
        while (TST(s) == 0);         /* spin until TST sets *s from 0 to 1 */
    }

unlock():

    void unlock(int *s)
    {
        *s = 0;
    }
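On current compilers, the XCH-based lock can be written portably with the C11 atomic exchange operation. The following is a minimal sketch, not the implementation used by any particular OS: the type spinlock_t and the functions spin_init, spin_trylock, spin_lock and spin_unlock are hypothetical names chosen for the illustration; only atomic_init, atomic_exchange and atomic_store come from <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int s; } spinlock_t;   /* 0 = free, 1 = taken */

void spin_init(spinlock_t *l) { atomic_init(&l->s, 0); }

/* One attempt: atomically write 1 and look at the previous value (XCH). */
bool spin_trylock(spinlock_t *l)
{
    return atomic_exchange(&l->s, 1) == 0;
}

/* lock(): repeat the exchange until the previous value was 0. */
void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ;   /* busy wait */
}

/* unlock(): mark the resource as free again. */
void spin_unlock(spinlock_t *l) { atomic_store(&l->s, 0); }
```

A critical section is then surrounded by spin_lock(&l) and spin_unlock(&l). The busy wait in spin_lock() is exactly the problem mentioned above: a thread that finds the lock taken keeps its processor busy without doing useful work.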

2 - Disabling preemption
• On single-processor systems
  • in some schedulers, it is possible to disable preemption for a limited interval of time
  • no context switch may happen during the critical section, but interrupts are enabled

    disable_preemption();
    <critical section>
    enable_preemption();

• Problems:
  • if a high priority critical thread needs to execute, it cannot preempt the running thread and it is delayed
  • this happens even if the high priority task does not access the resource!


3 - Selectively disabling preemption
• Some general mechanisms exist to implement mutual exclusion only between the processes that use a resource:
  • Semaphores
  • Mutex


Critical section: a general approach
• General techniques exist to protect critical sections
  • Semaphores
  • Mutex
• Properties:
  • interrupts always enabled
  • preemption always enabled
• Basic idea:
  • if a thread is inside a critical section on a given resource, all other threads are blocked upon entrance to a critical section on the same resource -> selectivity
• We will study such techniques in the following slides


Outline
• Model of concurrency: shared memory
• Synchronization

Synchronization: Producer/Consumer model
• Mutual exclusion is not the only problem
  • we need a way of synchronizing two or more threads
• Example: producer/consumer
  • suppose we have two threads
  • one produces some integers and sends them to another thread (PRODUCER)
  • another one takes the integers and processes them (CONSUMER)

[Figure: the Producer sends data to the Consumer]


Producer/Consumer: implementation with circular array
• Suppose that the two threads have different speeds
  • for example, the producer is much faster than the consumer
  • we need to store the temporary results of the producer in some memory buffer, so that no data are lost
  • for our example, we will use the circular array structure, the CircularArray_t structure


Producer/Consumer II

    struct CircularArray_t queue;

    void *producer(void *arg)
    {
        bool res;
        int data;
        while (1) {
            <obtain data>
            while (!insert_CA(&queue, data));    /* busy wait while the queue is full */
        }
    }

    void *consumer(void *arg)
    {
        bool res;
        int data;
        while (1) {
            while (!extract_CA(&queue, &data));  /* busy wait while the queue is empty */
            <use data>
        }
    }

Problems with this approach:
• if the queue is full, the producer actively waits
• if the queue is empty, the consumer actively waits


A more general approach
• We need to provide a general mechanism for synchronization and mutual exclusion
• Requirements
  • provide mutual exclusion between critical sections
    • avoid two interleaved insert() operations (semaphores, mutexes)
  • synchronize two threads on one condition
    • for example, block the producer when the queue is full (semaphores, condition variables)


Outline
• Semaphores

A general mechanism for blocking tasks: semaphores
• The semaphore mechanism was first proposed by Dijkstra
• A semaphore is an abstract data type that consists of
  • a counter
  • a blocking queue
  • operation wait
  • operation signal
• The operations on a semaphore must be atomic
  • the OS makes them atomic by appropriate low-level mechanisms


Semaphores: definitions
• Semaphores are a basic mechanism for providing synchronization
  • it has been shown that every kind of synchronization and mutual exclusion can be implemented by using semaphores
  • we will analyze possible implementations of the semaphore mechanism later

    typedef struct {
        <blocked queue> blocked;
        int counter;
    } sem_t;

    void sem_init(sem_t *s, int n);
    void sem_wait(sem_t *s);
    void sem_post(sem_t *s);

Note: the real prototype of sem_init is slightly different!


Wait and signal
• A wait operation has the following behavior
  • if counter == 0, the requesting thread is blocked
    • it is removed from the ready queue
    • it is inserted in the blocked queue
  • if counter > 0, then counter--
• A post operation has the following behavior
  • if counter == 0 and there is some blocked thread, unblock it
    • the thread is removed from the blocked queue
    • it is inserted in the ready queue
  • otherwise, increment counter


Semaphores

    void sem_init(sem_t *s, int n)
    {
        s->counter = n;
        ...
    }

    void sem_wait(sem_t *s)
    {
        if (s->counter == 0)
            <block the thread>
        else
            s->counter--;
    }

    void sem_post(sem_t *s)
    {
        if (<there are blocked threads>)
            <unblock a thread>
        else
            s->counter++;
    }
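In user space, a counting semaphore with this interface can be sketched on top of a POSIX mutex and condition variable. This is an illustration under stated assumptions, not how an OS implements semaphores internally: the names my_sem_t, my_sem_init, my_sem_wait, my_sem_trywait and my_sem_post are hypothetical, the condition variable stands in for the blocked queue, and, unlike the pseudocode above, post always increments the counter and the awakened thread decrements it itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;      /* makes wait and post atomic */
    pthread_cond_t  nonzero;   /* plays the role of the blocked queue */
    int counter;
} my_sem_t;

void my_sem_init(my_sem_t *s, int n)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->counter = n;
}

void my_sem_wait(my_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->counter == 0)                    /* block while the counter is 0 */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->counter--;
    pthread_mutex_unlock(&s->lock);
}

/* Non-blocking variant: returns false instead of blocking. */
bool my_sem_trywait(my_sem_t *s)
{
    bool taken = false;
    pthread_mutex_lock(&s->lock);
    if (s->counter > 0) {
        s->counter--;
        taken = true;
    }
    pthread_mutex_unlock(&s->lock);
    return taken;
}

void my_sem_post(my_sem_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->counter++;
    pthread_cond_signal(&s->nonzero);          /* wake one waiter, if any */
    pthread_mutex_unlock(&s->lock);
}
```

The while loop around pthread_cond_wait() re-checks the counter after every wake-up, so the sketch does not depend on which blocked thread is awakened or on when it actually runs.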

Signal semantics
• What happens when a thread blocks on a semaphore?
  • in general, it is inserted in a BLOCKED queue
• Extraction from the blocking queue can follow different semantics:
  • strong semaphore
    • the threads are removed in a well-specified order
    • for example, FIFO order (the fairest policy), priority-based ordering, ...
  • signal and suspend
    • after the new thread has been unblocked, a thread switch happens
  • signal and continue
    • after the new thread has been unblocked, the thread that executed the signal continues to execute
• Concurrent programs should not rely too much on the semaphore semantics


Mutual exclusion with semaphores: Mutex
• How to use a semaphore for critical sections
  • define a semaphore initialized to 1
  • before entering the critical section, perform a wait
  • after leaving the critical section, perform a post

    sem_t s;
    ...
    sem_init(&s, 1);

    void *threadA(void *arg)
    {
        ...
        sem_wait(&s);
        <critical section>
        sem_post(&s);
        ...
    }

    void *threadB(void *arg)
    {
        ...
        sem_wait(&s);
        <critical section>
        sem_post(&s);
        ...
    }
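A complete program following this pattern might look like the sketch below, which protects the increment of Example 1 with a semaphore initialized to 1. It uses the real POSIX calls (sem_init takes three arguments: the semaphore, a sharing flag and the initial count); the names worker, run_two_threads and N_INCREMENTS are hypothetical and the number of iterations is arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define N_INCREMENTS 100000   /* arbitrary value chosen for the demo */

static sem_t s;               /* semaphore used as a mutex */
static long x;                /* shared variable */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_INCREMENTS; i++) {
        sem_wait(&s);         /* enter the critical section */
        x = x + 1;
        sem_post(&s);         /* leave the critical section */
    }
    return NULL;
}

/* Runs two threads that increment x and returns the final value. */
long run_two_threads(void)
{
    pthread_t ta, tb;

    x = 0;
    sem_init(&s, 0, 1);       /* local to the process, counter = 1 */
    pthread_create(&ta, NULL, worker, NULL);
    pthread_create(&tb, NULL, worker, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    sem_destroy(&s);
    return x;
}
```

With the semaphore in place no increment is lost, so the result is always 2 * N_INCREMENTS; removing the sem_wait/sem_post pair makes the result depend on the interleaving.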


Mutual exclusion: example
Two threads, TA and TB, use a semaphore s whose counter is initially 1. Both threads start in the ready queue and the blocked queue is empty.

    Step  Operation                      Counter  Effect
    0     (initial state)                1        TA and TB in the ready queue
    1     s.wait();               (TA)   0        TA enters its critical section
    2     <critical section (1)>  (TA)   0
    3     s.wait();               (TB)   0        TB is moved to the blocked queue
    4     <critical section (2)>  (TA)   0
    5     s.signal();             (TA)   0        TB is moved back to the ready queue
    6     <critical section>      (TB)   0
    7     s.signal();             (TB)   1        no thread is blocked: counter incremented


Synchronization with semaphores
• How to use a semaphore for synchronization
  • define a semaphore initialized to 0
  • at the synchronization point, perform a wait
  • when the synchronization point is reached, perform a post
• In the example, threadA blocks until threadB wakes it up

    sem_t s;
    ...
    sem_init(&s, 0);

    void *threadA(void *arg)
    {
        ...
        sem_wait(&s);
        ...
    }

    void *threadB(void *arg)
    {
        ...
        sem_post(&s);
        ...
    }


Problem 1
• How can we make each thread wait for the other one?
  • the first one that arrives at the synchronization point waits for the other one
• Solution: use two semaphores!

    Semaphore sa(0), sb(0);

    void *threadA(void *arg)
    {
        ...
        sa.signal();
        sb.wait();
        ...
    }

    void *threadB(void *arg)
    {
        ...
        sb.signal();
        sa.wait();
        ...
    }
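The two-semaphore solution translates directly to POSIX semaphores. In the sketch below each thread records its own arrival before signalling and, after the wait, checks that the other thread has arrived too; the flags a_arrived, b_arrived, a_saw_b and b_saw_a and the function run_rendezvous are hypothetical names added only to make the property observable.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

static sem_t sa, sb;
static bool a_arrived, b_arrived;   /* written before the post, read after the wait */
static bool a_saw_b, b_saw_a;

static void *thread_a(void *arg)
{
    (void)arg;
    a_arrived = true;
    sem_post(&sa);          /* "A has arrived" */
    sem_wait(&sb);          /* wait until B has arrived */
    a_saw_b = b_arrived;
    return NULL;
}

static void *thread_b(void *arg)
{
    (void)arg;
    b_arrived = true;
    sem_post(&sb);          /* "B has arrived" */
    sem_wait(&sa);          /* wait until A has arrived */
    b_saw_a = a_arrived;
    return NULL;
}

/* Returns true if each thread passed the synchronization point
 * only after the other one had reached it. */
bool run_rendezvous(void)
{
    pthread_t ta, tb;

    a_arrived = b_arrived = a_saw_b = b_saw_a = false;
    sem_init(&sa, 0, 0);
    sem_init(&sb, 0, 0);
    pthread_create(&ta, NULL, thread_a, NULL);
    pthread_create(&tb, NULL, thread_b, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    sem_destroy(&sa);
    sem_destroy(&sb);
    return a_saw_b && b_saw_a;
}
```

Note the order of the two operations in each thread: signal first, then wait. If both threads waited first, each would block forever on a semaphore that nobody posts.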


Semaphores in POSIX

    sem_t sema;
    int sem_init(sem_t *s, int flag, int count);
    int sem_wait(sem_t *s);
    int sem_trywait(sem_t *s);
    int sem_post(sem_t *s);

• sem_t is the semaphore type; it is an “opaque” C structure
• sem_init initializes the semaphore; if flag = 0, the semaphore is local to the process; if flag = 1, the semaphore is shared with other processes; count is the initial value of the counter
• sem_wait is the normal wait operation
• sem_trywait does not block the task, but returns with an error (< 0) if the semaphore counter is 0
• sem_post is the normal signal operation
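A short single-threaded sequence shows the difference between sem_wait and sem_trywait. This is a sketch: the function name posix_sem_demo is hypothetical, and sem_getvalue and sem_destroy are POSIX calls that are not listed on the slide.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Returns the counter value observed at the end of the sequence. */
int posix_sem_demo(void)
{
    sem_t sema;
    int value = -1;

    if (sem_init(&sema, 0, 1) != 0)    /* flag 0: local to the process; count 1 */
        return -1;
    sem_wait(&sema);                   /* counter 1 -> 0, does not block */
    if (sem_trywait(&sema) == -1 && errno == EAGAIN)
        sem_post(&sema);               /* trywait failed without blocking; 0 -> 1 */
    sem_getvalue(&sema, &value);       /* read the current counter */
    sem_destroy(&sema);
    return value;
}
```

When the counter is 0, sem_trywait returns -1 and sets errno to EAGAIN instead of blocking, so the function ends with the counter back at 1.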


Producer/consumer
• Consider a producer/consumer system: now we want to implement a mailbox with a circular array, avoiding busy wait
• One producer executes insert_CA()
  • the producer must be blocked when the mailbox is full
  • the producer will be unblocked when there is some space again
• One consumer executes extract_CA()
  • the consumer must be blocked when the mailbox is empty
  • the consumer will be unblocked when there is one new element and the queue is not empty
• We use appropriate semaphores to block these threads
• Initially we consider only one producer and one consumer


Producer/Consumer implementation

    struct CircularArray_t {
        int array[10];
        int head, tail;
        sem_t empty, full;
    };

    void init_CA(struct CircularArray_t *c)
    {
        c->head = 0; c->tail = 0;
        sem_init(&c->empty, 0, 0);    /* number of elements in the queue */
        sem_init(&c->full, 0, 10);    /* number of free elements */
    }

    void insert_CA(struct CircularArray_t *c, int elem)
    {
        sem_wait(&c->full);           /* blocks if the queue is full */
        c->array[c->head] = elem;
        c->head = (c->head + 1) % 10;
        sem_post(&c->empty);
    }

    void extract_CA(struct CircularArray_t *c, int *elem)
    {
        sem_wait(&c->empty);          /* blocks if the queue is empty */
        *elem = c->array[c->tail];
        c->tail = (c->tail + 1) % 10;
        sem_post(&c->full);
    }


Producer/consumer properties
• Notice that the value of the counter of empty is the number of elements in the queue
  • it is the number of times we can call extract without blocking
• The value of the counter of full is the number of free elements in the queue
  • it is the number of times we can call insert without blocking
• Exercise: prove that the implementation is correct
  • insert_CA() never overwrites elements
  • extract_CA() always gets an element of the queue


Proof of correctness
• When the number of elements in the queue is between 1 and 9, there is no problem
  • insert and extract work on different variables (head and tail respectively) and on different elements of the array
  • the counters of full and empty are both greater than 0, so neither the producer nor the consumer can block
• When there is no element in the queue, head = tail, counter of empty = 0, counter of full = N
  • if extract begins before the end of insert, it will be blocked
  • after an insert, there is an element in the queue, so we are in the previous case
• By symmetry, the same holds in the case of N elements in the queue. Again, head = tail, counter of empty = N, counter of full = 0
  • if insert begins before the end of an extract, it will be blocked
  • after an extract, we fall back into the previous case


Multiple producers/consumers
• Suppose now there are many producers and many consumers
• All producers will act on the same variable head, and all consumers on the same variable tail
• If one producer preempts another producer, an inconsistency can arise
  • Exercise: prove the above statement
• Therefore, we need to combine synchronization and mutual exclusion
  • we want to implement synchronization
  • we want to protect the data structure


First solution

    struct CircularArray_t {
        int array[10];
        int head, tail;
        sem_t full, empty;
        sem_t mutex;
    };

    void init_CA(struct CircularArray_t *c)
    {
        c->head = 0; c->tail = 0;
        sem_init(&c->empty, 0, 0);
        sem_init(&c->full, 0, 10);
        sem_init(&c->mutex, 0, 1);
    }

    void insert_CA(struct CircularArray_t *c, int elem)
    {
        sem_wait(&c->mutex);
        sem_wait(&c->full);
        c->array[c->head] = elem;
        c->head = (c->head + 1) % 10;
        sem_post(&c->empty);
        sem_post(&c->mutex);
    }

    void extract_CA(struct CircularArray_t *c, int *elem)
    {
        sem_wait(&c->mutex);
        sem_wait(&c->empty);
        *elem = c->array[c->tail];
        c->tail = (c->tail + 1) % 10;
        sem_post(&c->full);
        sem_post(&c->mutex);
    }


Wrong solution
• The previous solution is wrong!
• Counterexample:
  • a consumer thread executes first, locks the mutex and blocks on the empty semaphore
  • all other threads (producers or consumers) will block on the mutex
• Lesson learned: never block inside a mutex!


Correct solution

    struct CircularArray_t {
        int array[10];
        int head, tail;
        sem_t full, empty;
        sem_t mutex;
    };

    void init_CA(struct CircularArray_t *c)
    {
        c->head = 0; c->tail = 0;
        sem_init(&c->empty, 0, 0);
        sem_init(&c->full, 0, 10);
        sem_init(&c->mutex, 0, 1);
    }

    void insert_CA(struct CircularArray_t *c, int elem)
    {
        sem_wait(&c->full);       /* synchronization first ... */
        sem_wait(&c->mutex);      /* ... then mutual exclusion */
        c->array[c->head] = elem;
        c->head = (c->head + 1) % 10;
        sem_post(&c->mutex);
        sem_post(&c->empty);
    }

    void extract_CA(struct CircularArray_t *c, int *elem)
    {
        sem_wait(&c->empty);
        sem_wait(&c->mutex);
        *elem = c->array[c->tail];
        c->tail = (c->tail + 1) % 10;
        sem_post(&c->mutex);
        sem_post(&c->full);
    }
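The correct solution can be exercised with several producers and consumers. The program below is a sketch with assumed parameters: the constants N_PRODUCERS, N_CONSUMERS and ITEMS_PER_THREAD, the thread bodies and run_producers_consumers are hypothetical additions, while the queue operations follow the scheme above, written with the three-argument POSIX sem_init.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define SIZE 10
#define N_PRODUCERS 2          /* demo parameters, chosen arbitrarily */
#define N_CONSUMERS 2
#define ITEMS_PER_THREAD 1000

struct CircularArray_t {
    int array[SIZE];
    int head, tail;
    sem_t full, empty, mutex;
};

static struct CircularArray_t queue;

static void init_CA(struct CircularArray_t *c)
{
    c->head = 0; c->tail = 0;
    sem_init(&c->empty, 0, 0);      /* elements available */
    sem_init(&c->full, 0, SIZE);    /* free elements */
    sem_init(&c->mutex, 0, 1);
}

static void insert_CA(struct CircularArray_t *c, int elem)
{
    sem_wait(&c->full);             /* may block: done outside the mutex */
    sem_wait(&c->mutex);
    c->array[c->head] = elem;
    c->head = (c->head + 1) % SIZE;
    sem_post(&c->mutex);
    sem_post(&c->empty);
}

static void extract_CA(struct CircularArray_t *c, int *elem)
{
    sem_wait(&c->empty);            /* may block: done outside the mutex */
    sem_wait(&c->mutex);
    *elem = c->array[c->tail];
    c->tail = (c->tail + 1) % SIZE;
    sem_post(&c->mutex);
    sem_post(&c->full);
}

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= ITEMS_PER_THREAD; i++)
        insert_CA(&queue, i);
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;                /* private accumulator of this consumer */
    int v;
    for (int i = 0; i < ITEMS_PER_THREAD; i++) {
        extract_CA(&queue, &v);
        *sum += v;
    }
    return NULL;
}

/* Returns the sum of all the values extracted by the consumers. */
long run_producers_consumers(void)
{
    pthread_t p[N_PRODUCERS], c[N_CONSUMERS];
    long sums[N_CONSUMERS] = { 0 };
    long total = 0;

    init_CA(&queue);
    for (int i = 0; i < N_PRODUCERS; i++)
        pthread_create(&p[i], NULL, producer, NULL);
    for (int i = 0; i < N_CONSUMERS; i++)
        pthread_create(&c[i], NULL, consumer, &sums[i]);
    for (int i = 0; i < N_PRODUCERS; i++)
        pthread_join(p[i], NULL);
    for (int i = 0; i < N_CONSUMERS; i++) {
        pthread_join(c[i], NULL);
        total += sums[i];
    }
    return total;
}
```

Each producer inserts the values 1..ITEMS_PER_THREAD, so if no element is lost or duplicated the consumers together extract a total of N_PRODUCERS * ITEMS_PER_THREAD * (ITEMS_PER_THREAD + 1) / 2. Swapping the two sem_wait calls in either function reintroduces the deadlock of the first solution.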


Producers/Consumers: deadlock situation
• Deadlock situation
  • a thread executes sem_wait(&c->mutex) and then blocks on a synchronisation semaphore
  • to unblock it, another thread must enter a critical section guarded by the same mutex semaphore!
  • so, the first thread cannot be unblocked and cannot free the mutex!
  • the situation cannot be resolved, and the two threads will never proceed
• As a rule, never insert a blocking synchronization inside a critical section!


Internal implementation of semaphores
• wait() and signal() involve a possible thread switch
  • therefore they must be implemented as system calls!
  • a blocked thread must be removed from the RUNNING state and moved into the semaphore blocking queue
• A semaphore is itself a shared resource
  • wait() and signal() are critical sections!
  • they must run with interrupts disabled and by using lock() and unlock() primitives


Readers/Writers
One shared buffer
Readers
  they read the content of the buffer
  many readers can read at the same time
Writers
  they write in the buffer
  while one writer is writing, no other readers or writers can access the buffer
Use semaphores to implement the resource


Readers/Writers: simple implementation



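struct Buffer_t {
    sem_t synch;   /* exclusive access to the buffer */
    sem_t s_R;     /* protects nr */
    int nr;        /* number of readers currently reading */
};

void init_B(struct Buffer_t *b)
{
    sem_init(&b->synch, 0, 1);
    sem_init(&b->s_R, 0, 1);
    b->nr = 0;
}

void read_B(struct Buffer_t *b)
{
    sem_wait(&b->s_R);
    b->nr++;
    if (b->nr == 1)
        sem_wait(&b->synch);   /* the first reader locks out the writers */
    sem_post(&b->s_R);

    /* <read the buffer> */

    sem_wait(&b->s_R);
    b->nr--;
    if (b->nr == 0)
        sem_post(&b->synch);   /* the last reader lets the writers in */
    sem_post(&b->s_R);
}

void write_B(struct Buffer_t *b)
{
    sem_wait(&b->synch);
    /* <write the buffer> */
    sem_post(&b->synch);
}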
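A small test harness can make the mutual exclusion visible. In the sketch below, which is only an illustration, the buffer carries a hypothetical payload of two integers `a` and `b` that writers always increment together; a reader that ever observes `a != b` has overlapped with a writer. The names `reader`, `writer`, `violations` and `run_rw_demo` are assumptions made for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

struct Buffer_t {
    sem_t synch;   /* exclusive access to the buffer */
    sem_t s_R;     /* protects nr */
    int nr;
    int a, b;      /* hypothetical payload: writers keep a == b */
};

static struct Buffer_t buf;
static atomic_int violations;   /* reads that saw a half-written buffer */

static void read_B(struct Buffer_t *b)
{
    sem_wait(&b->s_R);
    if (++b->nr == 1)
        sem_wait(&b->synch);
    sem_post(&b->s_R);

    if (b->a != b->b)                        /* <read the buffer> */
        atomic_fetch_add(&violations, 1);

    sem_wait(&b->s_R);
    if (--b->nr == 0)
        sem_post(&b->synch);
    sem_post(&b->s_R);
}

static void write_B(struct Buffer_t *b)
{
    sem_wait(&b->synch);
    b->a++;                                  /* <write the buffer> */
    b->b++;
    sem_post(&b->synch);
}

static void *reader(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        read_B(&buf);
    return NULL;
}

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        write_B(&buf);
    return NULL;
}

/* Runs 3 readers and 2 writers; returns the number of inconsistent reads. */
int run_rw_demo(void)
{
    pthread_t r[3], w[2];
    int i;

    buf.nr = buf.a = buf.b = 0;
    sem_init(&buf.synch, 0, 1);
    sem_init(&buf.s_R, 0, 1);

    for (i = 0; i < 3; i++) pthread_create(&r[i], NULL, reader, NULL);
    for (i = 0; i < 2; i++) pthread_create(&w[i], NULL, writer, NULL);
    for (i = 0; i < 3; i++) pthread_join(r[i], NULL);
    for (i = 0; i < 2; i++) pthread_join(w[i], NULL);

    assert(buf.a == buf.b);
    return atomic_load(&violations);
}
```

With the semaphores in place, `run_rw_demo()` should report zero inconsistent reads. The harness says nothing about fairness: writers may still be delayed for as long as readers keep arriving.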

Readers/Writers: more than one pending writer


struct Buffer_t {
    sem_t synch, mutex;   /* synch: readers vs. writers; mutex: one writer at a time */
    sem_t s_R, s_W;       /* s_R protects nr, s_W protects nw */
    int nr, nw;
};

void init_B(struct Buffer_t *b)
{
    sem_init(&b->synch, 0, 1);
    sem_init(&b->mutex, 0, 1);
    sem_init(&b->s_R, 0, 1);
    sem_init(&b->s_W, 0, 1);
    b->nr = 0;
    b->nw = 0;
}

void read_B(struct Buffer_t *b)
{
    sem_wait(&b->s_R);
    b->nr++;
    if (b->nr == 1)
        sem_wait(&b->synch);
    sem_post(&b->s_R);

    /* <read the buffer> */

    sem_wait(&b->s_R);
    b->nr--;
    if (b->nr == 0)
        sem_post(&b->synch);
    sem_post(&b->s_R);
}

void write_B(struct Buffer_t *b)
{
    sem_wait(&b->s_W);
    b->nw++;
    if (b->nw == 1)
        sem_wait(&b->synch);
    sem_post(&b->s_W);

    sem_wait(&b->mutex);
    /* <write the buffer> */
    sem_post(&b->mutex);

    sem_wait(&b->s_W);
    b->nw--;
    if (b->nw == 0)
        sem_post(&b->synch);
    sem_post(&b->s_W);
}


Readers/Writers: starvation
A reader will be blocked for a finite time
The writer suffers starvation
  Suppose we have 2 readers (R1 and R2) and 1 writer (W1)
  suppose that R1 starts to read
  while R1 is reading, W1 blocks because it wants to write
  now R2 starts to read
  now R1 finishes but, since R2 is reading, W1 cannot be unblocked
  before R2 finishes reading, R1 starts to read again
  when R2 finishes, W1 cannot be unblocked because R1 is reading
A solution: new readers should not be counted while a writer is waiting for the current readers to finish


Readers/Writers: priority to writers!


struct Buffer_t {
    sem_t synch, synch1;   /* synch: access to the buffer; synch1: gate for incoming readers */
    sem_t s_R, s_W;        /* s_R protects nr, s_W protects nw */
    int nr, nw;
};

void init_B(struct Buffer_t *b)
{
    sem_init(&b->synch, 0, 1);
    sem_init(&b->synch1, 0, 1);
    sem_init(&b->s_R, 0, 1);
    sem_init(&b->s_W, 0, 1);
    b->nr = 0;
    b->nw = 0;
}

void read_B(struct Buffer_t *b)
{
    sem_wait(&b->synch1);      /* blocks here while any writer is pending */
    sem_wait(&b->s_R);
    b->nr++;
    if (b->nr == 1)
        sem_wait(&b->synch);
    sem_post(&b->s_R);
    sem_post(&b->synch1);

    /* <read the buffer> */

    sem_wait(&b->s_R);
    b->nr--;
    if (b->nr == 0)
        sem_post(&b->synch);
    sem_post(&b->s_R);
}

void write_B(struct Buffer_t *b)
{
    sem_wait(&b->s_W);
    b->nw++;
    if (b->nw == 1)
        sem_wait(&b->synch1);  /* the first writer closes the gate for new readers */
    sem_post(&b->s_W);

    sem_wait(&b->synch);
    /* <write the buffer> */
    sem_post(&b->synch);

    sem_wait(&b->s_W);
    b->nw--;
    if (b->nw == 0)
        sem_post(&b->synch1);  /* the last writer reopens the gate */
    sem_post(&b->s_W);
}


Readers/Writers: problem
Now, there is starvation for readers
Can the readers/writers problem be solved in general?
  no starvation for readers
  no starvation for writers
Solution
  maintain a FIFO ordering of requests
    if at least one writer is blocked, every subsequent reader blocks
    if at least one reader is blocked, every subsequent writer blocks
  we can do that using the private semaphores technique
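As a rough illustration of the ordering idea, the sketch below makes every request pass through one extra semaphore before competing for the buffer. This is NOT the private semaphores technique named above; it is a simpler, swapped-in variant (a single "turnstile" semaphore, here called `order`, a hypothetical name). It only approximates FIFO service, because the order in which threads blocked on a POSIX semaphore are woken depends on the implementation and the scheduling policy. The payload `a`, `b` and the names `read_FB`, `write_FB` and `run_fair_demo` are likewise assumptions for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

struct FairBuffer_t {
    sem_t order;   /* hypothetical turnstile: every request passes through it */
    sem_t synch;   /* exclusive access to the buffer */
    sem_t s_R;     /* protects nr */
    int nr;
    int a, b;      /* hypothetical payload: writers keep a == b */
};

static struct FairBuffer_t fb;
static atomic_int violations;

static void read_FB(struct FairBuffer_t *f)
{
    sem_wait(&f->order);               /* queue up behind earlier requests */
    sem_wait(&f->s_R);
    if (++f->nr == 1)
        sem_wait(&f->synch);
    sem_post(&f->s_R);
    sem_post(&f->order);               /* let the next request in */

    if (f->a != f->b)                  /* <read the buffer> */
        atomic_fetch_add(&violations, 1);

    sem_wait(&f->s_R);
    if (--f->nr == 0)
        sem_post(&f->synch);
    sem_post(&f->s_R);
}

static void write_FB(struct FairBuffer_t *f)
{
    sem_wait(&f->order);               /* a waiting writer holds back later readers */
    sem_wait(&f->synch);
    sem_post(&f->order);

    f->a++;                            /* <write the buffer> */
    f->b++;

    sem_post(&f->synch);
}

static void *reader(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        read_FB(&fb);
    return NULL;
}

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        write_FB(&fb);
    return NULL;
}

/* Runs 3 readers and 2 writers; returns the number of inconsistent reads. */
int run_fair_demo(void)
{
    pthread_t r[3], w[2];
    int i;

    fb.nr = fb.a = fb.b = 0;
    sem_init(&fb.order, 0, 1);
    sem_init(&fb.synch, 0, 1);
    sem_init(&fb.s_R, 0, 1);

    for (i = 0; i < 3; i++) pthread_create(&r[i], NULL, reader, NULL);
    for (i = 0; i < 2; i++) pthread_create(&w[i], NULL, writer, NULL);
    for (i = 0; i < 3; i++) pthread_join(r[i], NULL);
    for (i = 0; i < 2; i++) pthread_join(w[i], NULL);

    assert(fb.a == fb.b);
    return atomic_load(&violations);
}
```

While a writer waits on `synch` it still holds `order`, so readers arriving later queue behind it instead of overtaking it. Private semaphores would give explicit control over who is woken next, which this sketch leaves to the semaphore implementation.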


Problem2: Synchronization of N threads


Generalize the thread synchronization problem to N threads
  The first N-1 threads must block waiting for the last one
First solution (more elegant)
Second solution (more practical)


First solution to problem 2


#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <stdio.h>

#define N 8

sem_t s[N][N];   /* s[i][j]: thread i tells thread j that it has arrived */

void init()
{
    int i, j;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            sem_init(&s[i][j], 0, 0);
}

void *thread(void *arg)
{
    int k = (int)(intptr_t)arg;
    int j;

    printf("TH%d: before synch\n", k);
    for (j = 0; j < N; j++)
        if (j != k) sem_post(&s[k][j]);   /* notify every other thread */
    for (j = 0; j < N; j++)
        if (j != k) sem_wait(&s[j][k]);   /* wait for every other thread */
    printf("TH%d: after synch\n", k);
    return NULL;
}

int main()
{
    pthread_t tid[N];
    int i;

    init();
    for (i = 0; i < N; i++)
        pthread_create(&tid[i], 0, thread, (void *)(intptr_t)i);
    for (i = 0; i < N; i++)
        pthread_join(tid[i], 0);
    printf("Main: exiting\n");
    return 0;
}


Elegant solution, but it uses many semaphores (an N x N matrix)!

Second solution to problem 2


Practical solution. We need a mutex semaphore, a counter and a semaphore to block threads.

struct synch {
    int count;
    sem_t m;   // mutex
    sem_t b;   // blocked
    int N;     // number of threads
};

void initsynch(struct synch *s, int n)
{
    s->count = 0;
    sem_init(&s->m, 0, 1);
    sem_init(&s->b, 0, 0);
    s->N = n;
}

void my_synch(struct synch *s)
{
    int i;

    sem_wait(&s->m);
    if (++s->count < s->N) {
        sem_post(&s->m);   // release the mutex BEFORE blocking
        sem_wait(&s->b);
    }
    else {
        for (i = 0; i < s->N - 1; i++)
            sem_post(&s->b);   // the last thread wakes up the other N-1
        sem_post(&s->m);
    }
}

struct synch sp;

void *thread(void *arg)
{
    ...
    my_synch(&sp);
    ...
}
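The second solution can be exercised with a short harness. The sketch below is an illustration only: each thread records its arrival before calling `my_synch` and afterwards checks that all threads had arrived. `NTHREADS`, `arrived`, `early` and `run_barrier_demo` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define NTHREADS 8   /* hypothetical: number of threads in the demo */

struct synch {
    int count;
    sem_t m;   /* mutex */
    sem_t b;   /* blocked threads wait here */
    int N;     /* number of threads */
};

static void initsynch(struct synch *s, int n)
{
    assert(n > 0);
    s->count = 0;
    sem_init(&s->m, 0, 1);
    sem_init(&s->b, 0, 0);
    s->N = n;
}

static void my_synch(struct synch *s)
{
    int i;

    sem_wait(&s->m);
    if (++s->count < s->N) {
        sem_post(&s->m);               /* release the mutex before blocking */
        sem_wait(&s->b);
    } else {
        for (i = 0; i < s->N - 1; i++)
            sem_post(&s->b);           /* the last thread wakes up the others */
        sem_post(&s->m);
    }
}

static struct synch sp;
static atomic_int arrived;   /* threads that reached the synchronization point */
static atomic_int early;     /* threads that got past it too soon */

static void *thread(void *arg)
{
    (void)arg;
    atomic_fetch_add(&arrived, 1);             /* "before synch" */
    my_synch(&sp);
    if (atomic_load(&arrived) != NTHREADS)     /* "after synch" */
        atomic_fetch_add(&early, 1);
    return NULL;
}

/* Returns the number of threads that passed the synchronization point early. */
int run_barrier_demo(void)
{
    pthread_t tid[NTHREADS];
    int i;

    atomic_store(&arrived, 0);
    atomic_store(&early, 0);
    initsynch(&sp, NTHREADS);
    for (i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, thread, NULL);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&early);
}
```

No thread should get past `my_synch` before the last one arrives, so `run_barrier_demo()` is expected to return 0. Note that `count` is never reset, so as written the structure synchronizes the threads only once.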
