
Concurrent Programming · Parallel Programming Or they could drive in parallel lanes, thus arriving...

Concurrent Programming Introduction Frédéric Haziza <[email protected]> Department of Computer Systems Uppsala University Ericsson - Fall 2007

Concurrent Programming: Introduction

Frédéric Haziza <[email protected]>

Department of Computer Systems

Uppsala University

Ericsson - Fall 2007


Good to know Scenario Definitions Hardware Classical Paradigms

Outline

1 Good to know

2 Scenario

3 Definitions

4 Hardware

5 Classical Paradigms: Iterative Parallelism, Recursive Parallelism, Producer/Consumer, Client/Server, Interacting Peers

MP’07 | MP’07 (Introduction)


Literature

Gregory Andrews. Foundations of Multithreaded, Parallel, and Distributed Programming. Addison-Wesley, 1999 (ISBN: 0-201-35752-6)


Schedule

Date         Time         Comments
Tue 6 Nov    9.00-12.00   Setting the decor
Tue 13 Nov   9.00-12.00   Locks, Barriers + Lab
Tue 27 Nov   9.00-12.00   Remainder...



Scenario

Several cars want to drive from point A to point B.

Sequential Programming: They can compete for space on the same road and end up either:
• following each other
• or competing for positions (and having accidents!).

Parallel Programming: Or they could drive in parallel lanes, thus arriving at about the same time without getting in each other’s way.

Distributed Programming: Or they could travel different routes, using separate roads.


Definitions

Concurrent Program

Two or more processes working together to perform a task.

Each process is a sequential program (= a sequence of statements executed one after another).

Single thread of control vs. multiple threads of control

Communication
• Shared Variables
• Message Passing

Synchronization
• Mutual Exclusion
• Condition Synchronization


Correctness

Wanna write a concurrent program?

What kinds of processes?

How many processes?

How should they interact?

Correctness: ensure that the interaction between processes is properly synchronized.

Mutual Exclusion: ensuring that critical sections of statements do not execute at the same time.

Condition Synchronization: delaying a process until a given condition is true.

Our focus: imperative programs and asynchronous execution
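The two kinds of synchronization can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `run_demo` and its parameters are made up for the example, and Python's `threading` module stands in for whatever thread library the course actually uses. The lock provides mutual exclusion around the shared counter; the condition variable delays the main thread until all workers have finished.

```python
import threading

def run_demo(num_workers=4, increments=1000):
    """Hypothetical demo: returns the final value of a shared counter."""
    total = 0
    finished = 0
    lock = threading.Lock()                 # mutual exclusion
    all_done = threading.Condition(lock)    # condition synchronization

    def worker():
        nonlocal total, finished
        for _ in range(increments):
            with lock:                      # critical section: one thread at a time
                total += 1
        with all_done:
            finished += 1
            all_done.notify_all()           # the awaited condition may now hold

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    with all_done:
        # Delay this thread until the condition "all workers finished" is true.
        all_done.wait_for(lambda: finished == num_workers)
        result = total

    for t in threads:
        t.join()
    return result

if __name__ == "__main__":
    print(run_demo())
```

Without the lock, the read-modify-write on `total` could interleave between threads and lose updates; without the condition, the main thread could read `total` too early.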


Amdahl’s law

P is the fraction of a calculation that can be parallelized

(1 − P) is the fraction that is sequential (i.e. cannot benefit from parallelization)

N processors

⇒ maximum speedup $= \frac{1}{(1 - P) + P/N}$

Example: if P = 90%, the maximum speedup is 10, no matter how large N is (i.e. as N → ∞, the P/N term vanishes and the speedup tends to 1/(1 − P) = 1/0.1).
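As a quick numerical check of the formula, here is a minimal Python sketch. The function name `amdahl_speedup` is an assumption made for this example; it simply evaluates 1 / ((1 − P) + P/N).

```python
def amdahl_speedup(p, n):
    """Maximum speedup for parallel fraction p (0..1) on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)

if __name__ == "__main__":
    # With P = 0.9 the speedup grows with N but stays below 1 / (1 - P) = 10.
    for n in (1, 2, 10, 100, 1000, 10**6):
        print(n, amdahl_speedup(0.9, n))
```

Even with a million processors the value stays just under 10, because the sequential 10% of the work is never shortened.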


Single-Processor Machine


Memory Hierarchy

[Figure: memory hierarchy stack: Main Memory, Level 2 cache, Level 1 cache, CPU]


Why do we miss in the cache?

Compulsory miss: touching the data for the first time

Capacity miss: the cache is too small

Conflict miss: non-ideal cache implementation (data hash to the same cache line)

[Figure: CPU, Cache (Hit), Main Memory (Miss)]


Locality

Temporal locality

Spatial locality

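Inner loop stepping through an array; the sequence of addresses touched is:

A, B, C, A+1, B, C, A+2, B, C, ...

The accesses A, A+1, A+2 show spatial locality (neighbouring addresses); the repeated accesses to B and C show temporal locality (the same addresses again, soon after).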
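To see how such an access pattern interacts with a cache, here is a toy simulator in Python. It is a sketch under explicit assumptions, not a model of any real machine: the cache is direct-mapped, the sizes are tiny, and the names `simulate_direct_mapped` and `loop_trace` as well as the example addresses are invented for illustration.

```python
def simulate_direct_mapped(trace, num_lines=4, line_size=4):
    """Return (hits, misses) for a toy direct-mapped cache."""
    cache = [None] * num_lines          # block number currently held by each line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size       # neighbouring addresses share a block
        index = block % num_lines       # direct-mapped: one possible line per block
        if cache[index] == block:
            hits += 1
        else:
            misses += 1                 # compulsory, capacity or conflict miss
            cache[index] = block
    return hits, misses

def loop_trace(a, b, c, iterations):
    """Address sequence a, b, c, a+1, b, c, a+2, b, c, ..."""
    trace = []
    for i in range(iterations):
        trace += [a + i, b, c]
    return trace

if __name__ == "__main__":
    # a, b and c fall into different cache lines: only first touches miss.
    print(simulate_direct_mapped(loop_trace(0, 100, 200, 4)))
    # a, b and c all map to line 0: they keep evicting each other.
    print(simulate_direct_mapped(loop_trace(0, 16, 32, 4)))
```

In the first trace the three first touches are compulsory misses and everything after that hits, thanks to spatial locality on the array and temporal locality on the other two addresses. In the second trace the three addresses compete for one line, so every access is a miss even though most of the cache is empty.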


MultiProcessor world - Taxonomy

SIMD
MIMD
  Message Passing: Fine-grained, Coarse-grained
  Shared Memory: UMA, NUMA, COMA


Shared-Memory Multiprocessors

[Figure: several CPUs, each with its own cache, connected through an interconnection network / bus to several memory modules]


Programming Model

[Figure: many threads, each with its own cache ($), on top of a single Shared Memory]


Cache coherency

[Figure: three threads, each with its own cache ($), share a memory holding A and B; the access trace shows several reads of A, a write to A, a read of B, and a later read of A]


Summing up Coherence

There can be many copies of a datum, but only one value

Too strong!!!

There is a single global order of value changes to each datum


Memory Ordering

Coherence defines a per-datum order of value changes.

The memory model defines the order of value changes for all the data.

What ordering does the memory system guarantee?

“Contract” between the HW and the SW developers

Without it, we can’t say much about the result of a parallel execution


What order for these threads?

A’ denotes a modified value to the datum at address A

Thread 1: LD A; ST B’; LD C; ST D’; LD E; ...

Thread 2: ST A’; LD B’; ST C’; LD D; ST E’; ...

LD A happens before ST A’


Other possible orders?

Thread 1: LD A; ST B’; LD C; ST D’; LD E; ...

Thread 2: ST A’; LD B’; ST C’; LD D; ST E’; ...

[Figure: the two instruction sequences are shown twice, each time with a different possible interleaving]


Memory model flavors

Sequentially Consistent: Programmer’s intuition

Total Store Order: Almost Programmer’s intuition

Weak/Release Consistency: No guarantee


Dekker’s algorithm

Initially A = 0, B = 0

“fork”

Left thread:
  A := 1
  if (B == 0) print(“A wins”);

Right thread:
  B := 1
  if (A == 0) print(“B wins”);

Can both A and B win?

Does the write become globally visible before the read is performed?

Left: the read (i.e. the test B == 0) can bypass the store (A := 1)
Right: the read (i.e. the test A == 0) can bypass the store (B := 1)
⇒ Both loads can be performed before any of the stores
⇒ Yes, it is possible that both win!


Dekker’s algorithm for Total Store Order

Initially A = 0, B = 0

“fork”

Left thread:
  A := 1
  Membar #StoreLoad;
  if (B == 0) print(“A wins”);

Right thread:
  B := 1
  Membar #StoreLoad;
  if (A == 0) print(“B wins”);

Can both A and B win?

Does the write become globally visible before the read is performed?

Membar: the read is started after all previous stores have been “globally ordered”
⇒ Behaves like a sequentially consistent machine
⇒ No, they won’t both win. Good job Mister Programmer!


Dekker’s algorithm, in general

Initially A = 0, B = 0

“fork”

Left thread:
  A := 1
  if (B == 0) print(“A wins”);

Right thread:
  B := 1
  if (A == 0) print(“B wins”);

Can both A and B win?

The answer depends on the memory model

Remember? ...Contract between the HW and SW developers.
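The dependence on the memory model can be made concrete with a small Python sketch. It is a toy model, not a description of any real processor: it enumerates global orders of the four memory operations and, as a simplifying assumption, treats "a load may be performed before an earlier store of the same thread" as the only possible relaxation. The names `LEFT`, `RIGHT` and `outcomes` are invented for this example.

```python
from itertools import permutations

# Each operation is (thread, kind, variable), listed in program order.
LEFT = [("L", "store", "A"), ("L", "load", "B")]    # A := 1; if (B == 0) ...
RIGHT = [("R", "store", "B"), ("R", "load", "A")]   # B := 1; if (A == 0) ...

def outcomes(allow_store_load_reordering):
    """Return the set of reachable (a_wins, b_wins) pairs."""
    results = set()
    for order in permutations(LEFT + RIGHT):
        if not allow_store_load_reordering:
            # Sequential consistency (or a StoreLoad barrier between the two
            # operations): each thread's store is performed before its load.
            if order.index(LEFT[0]) > order.index(LEFT[1]):
                continue
            if order.index(RIGHT[0]) > order.index(RIGHT[1]):
                continue
        memory = {"A": 0, "B": 0}
        seen = {}
        for thread, kind, var in order:
            if kind == "store":
                memory[var] = 1
            else:
                seen[thread] = memory[var]
        results.add((seen["L"] == 0, seen["R"] == 0))
    return results

if __name__ == "__main__":
    print("program order kept:   both win?", (True, True) in outcomes(False))
    print("loads may bypass:     both win?", (True, True) in outcomes(True))
```

When program order is kept, "both win" would require each load to come before the other thread's store while each store comes before its own load, which is a cycle, so the outcome never appears. Once loads may bypass stores, the order load, load, store, store becomes legal and both threads print.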


So....

The memory model is a tricky issue


New issues

Compulsory miss

Capacity miss

Conflict miss

[Figure: several CPUs, each with its own cache, connected through an interconnection network / bus to several memory modules]

Communication miss: cache-to-cache transfer

False sharing: side effect of large cache lines

What about the compiler? Code reordering? The volatile keyword in C...


Good to know

Performance ⇒ use of caches

Memory hierarchy ⇒ consistency problems

To get maximal performance on a given machine, the programmer has to know the characteristics of the memory system and has to write programs that account for them.


Distributed Memory Architecture

[Figure: several nodes, each with its own CPU, cache and memory, connected through an interconnection network]

Communication through Message Passing

Own cache, but memory not shared
⇒ No coherency problems


Classical Paradigms

Data Parallel

Task Parallel

5 paradigms:

Iterative parallelism

Recursive parallelism

Producer/Consumer

Client/Server

Interacting peers


Iterative Parallelism: Matrix multiplication

1: double a[n,n], b[n,n], c[n,n];
2: for [i=0 to n-1] {          ⊲ iterating through the rows
3:   for [j=0 to n-1] {        ⊲ iterating through the columns
4:     ⊲ Computes inner product of a[i,*] and b[*,j]
5:     c[i,j] = 0.0;
6:     for [k = 0 to n-1] {
7:       c[i,j] = c[i,j] + a[i,k]*b[k,j];
8:     }
9:   }
10: }

What can we parallelize? Lines 5 to 7
⇒ c[i,j] is written to, and a[i,k], b[k,j] are only read
⇒ every c[i,j] computation!


Iterative Parallelism: Matrix multiplication

Parallelizing the rows

co [i=0 to n-1] {              ⊲ compute rows in parallel
  for [j=0 to n-1] {
    c[i,j] = 0.0;
    for [k = 0 to n-1] {
      c[i,j] = c[i,j] + a[i,k]*b[k,j];
    }
  }
}


Iterative Parallelism: Matrix multiplication

Parallelizing the columns

co [j=0 to n-1] {              ⊲ compute columns in parallel
  for [i=0 to n-1] {
    c[i,j] = 0.0;
    for [k = 0 to n-1] {
      c[i,j] = c[i,j] + a[i,k]*b[k,j];
    }
  }
}


Iterative Parallelism: Matrix multiplication

Parallelizing all rows and columns

co [i=0 to n-1, j=0 to n-1] {
  c[i,j] = 0.0;
  for [k = 0 to n-1] {
    c[i,j] = c[i,j] + a[i,k]*b[k,j];
  }
}
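For readers who want something executable, the row-parallel variant can be sketched in Python. This is a rough illustration under assumptions: matrices are plain lists of lists, the names `compute_row` and `matmul_rows_parallel` are made up, and a thread pool from `concurrent.futures` plays the role of the `co` statement. Each task writes only its own row of the result and only reads `a` and `b`, so no locking is needed.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_row(a_row, b):
    """Inner products of one row of a with every column of b."""
    inner = len(b)
    n_cols = len(b[0])
    return [sum(a_row[k] * b[k][j] for k in range(inner)) for j in range(n_cols)]

def matmul_rows_parallel(a, b, max_workers=4):
    """Compute a * b with one task per row of the result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the rows in submission order, so row i of the
        # result comes from row i of a.
        return list(pool.map(lambda row: compute_row(row, b), a))

if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_rows_parallel(a, b))
```

In standard CPython builds the global interpreter lock keeps these threads from running the arithmetic truly in parallel, so the sketch shows the decomposition into independent tasks rather than an actual speedup.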


Recursive Parallelism: Adaptive Quadrature

[Figure: curve f(x) plotted against x, between x = a and x = b; the area under the curve is $\int_a^b f(x)\,dx$]


Recursive Parallelism: Adaptive Quadrature

double fleft = f(a), fright, area = 0.0;
double width = (b-a) / INTERVALS;
for [x = (a+width) to b by width] {
  fright = f(x);
  ⊲ Compute the area of the small trapezoid
  area = area + (fleft + fright) * width / 2;
  fleft = fright;              ⊲ the right-hand value becomes the new left-hand value
}

[Figure: plot of f(x) against x]


Divide and Conquer

[Figure: two plots of f(x) against x]

|area_new − area_old| > EPSILON


Divide and Conquer

double quad (double left, right, fleft, fright, oldarea) {
  double mid = (left + right)/2;       ⊲ find the middle point
  double fmid = f(mid);                ⊲ get its value
  double larea = (fleft + fmid) * (mid - left)/2;
  double rarea = (fmid + fright) * (right - mid)/2;
  if |(larea + rarea) - oldarea| > EPSILON {
    ⊲ Recurse to integrate both halves
    larea = quad (left, mid, fleft, fmid, larea);
    rarea = quad (mid, right, fmid, fright, rarea);
  }
  return (larea + rarea);
}

$\int_a^b f(x)\,dx \approx$ quad(a, b, f(a), f(b), (f(a) + f(b)) * (b - a)/2);


Divide and Conquer - Parallel

double quad (double left, right, fleft, fright, oldarea) {
  double mid = (left + right)/2;       ⊲ find the middle point
  double fmid = f(mid);                ⊲ get its value
  double larea = (fleft + fmid) * (mid - left)/2;
  double rarea = (fmid + fright) * (right - mid)/2;
  if |(larea + rarea) - oldarea| > EPSILON {
    ⊲ Recurse to integrate both halves
    co [] {
      larea = quad (left, mid, fleft, fmid, larea);
      ⊲ in parallel!
      rarea = quad (mid, right, fmid, fright, rarea);
    }                                  ⊲ Must wait for larea and rarea
  }
  return (larea + rarea);
}
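A runnable approximation of the parallel recursion might look as follows in Python. Several things here are assumptions added for the example and are not on the slide: the tolerance value, the helper `integrate`, and the `max_parallel_depth` cut-off, which stops spawning new threads below a certain recursion depth so that the number of threads stays small. The left half runs in a new thread while the current thread handles the right half, and `join()` implements the "must wait" step.

```python
import math
import threading

EPSILON = 1e-9  # assumed tolerance, chosen for this example

def quad(f, left, right, fleft, fright, old_area, depth=0, max_parallel_depth=3):
    """Adaptive trapezoid integration of f over [left, right]."""
    mid = (left + right) / 2
    fmid = f(mid)
    larea = (fleft + fmid) * (mid - left) / 2
    rarea = (fmid + fright) * (right - mid) / 2
    if abs((larea + rarea) - old_area) > EPSILON:
        if depth < max_parallel_depth:
            left_result = []

            def integrate_left(initial_area):
                left_result.append(quad(f, left, mid, fleft, fmid, initial_area,
                                        depth + 1, max_parallel_depth))

            worker = threading.Thread(target=integrate_left, args=(larea,))
            worker.start()                       # left half in a new thread
            rarea = quad(f, mid, right, fmid, fright, rarea,
                         depth + 1, max_parallel_depth)
            worker.join()                        # must wait for the left half
            larea = left_result[0]
        else:
            # Deep in the recursion: plain sequential divide and conquer.
            larea = quad(f, left, mid, fleft, fmid, larea,
                         depth + 1, max_parallel_depth)
            rarea = quad(f, mid, right, fmid, fright, rarea,
                         depth + 1, max_parallel_depth)
    return larea + rarea

def integrate(f, a, b):
    """Hypothetical wrapper that supplies the initial trapezoid estimate."""
    fa, fb = f(a), f(b)
    return quad(f, a, b, fa, fb, (fa + fb) * (b - a) / 2)

if __name__ == "__main__":
    print(integrate(math.sin, 0.0, math.pi))     # exact value is 2
```

The two recursive calls touch disjoint intervals and only read `f`, which is why they can run side by side without any locking.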


Producer / Consumer

[Figure: a Producer and a Consumer connected through a Shared Resource]
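A minimal producer/consumer pair can be sketched in Python with a bounded queue as the shared resource. The details are assumptions made for this illustration: the function name `run_pipeline`, the buffer size of 2, the squaring done by the consumer, and the sentinel object used to signal the end of the stream.

```python
import queue
import threading

def run_pipeline(items):
    """Produce items in one thread, consume (square) them in another."""
    buffer = queue.Queue(maxsize=2)     # bounded shared buffer
    sentinel = object()                 # marks the end of the stream
    consumed = []

    def producer():
        for item in items:
            buffer.put(item)            # blocks while the buffer is full
        buffer.put(sentinel)

    def consumer():
        while True:
            item = buffer.get()         # blocks while the buffer is empty
            if item is sentinel:
                break
            consumed.append(item * item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed

if __name__ == "__main__":
    print(run_pipeline(range(5)))
```

The queue hides both kinds of synchronization: its internal lock gives mutual exclusion on the buffer, and the blocking `put` and `get` calls are condition synchronization on "not full" and "not empty".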


Client / Server

[Figure: Client 1 ... Client n each send a Request to the Server and receive a Reply]


Interacting Peers - Coordinator/Workers

[Figure: a Coordinator sends Data to Worker 1 ... Worker n−1 and receives Results from each of them]


Interacting Peers - Circular Pipeline

[Figure: Worker 1 ... Worker n−1 connected in a circular pipeline]


Interacting Peers

Coordinator/Workers

[Figure: a Coordinator sends Data to Worker 1 ... Worker n−1 and receives Results from each of them]

Circular pipeline

[Figure: Worker 1 ... Worker n−1 connected in a circular pipeline]
