Graph Algorithms (Chapter 10) - Aalborg...

Post on 13-Jul-2020


Graph Algorithms (Chapter 10)

Alexandre David, 1.2.05

16-04-2008 Alexandre David, MVP'08

Today

Recall on graphs.
Minimum spanning tree (Prim’s algorithm).
Single-source shortest paths (Dijkstra’s algorithm).
All-pairs shortest paths (Floyd’s algorithm).
Connected components.

Graphs – Definition

A graph is a pair (V, E):
V: finite set of vertices.
E: finite set of edges.
e ∈ E is a pair (u, v) of vertices.
Ordered pair → directed graph.
Unordered pair → undirected graph.

[Figure: two example graphs with labelled vertices and edges; the values of V and E did not survive extraction.]

Graphs – Edges

Directed graph:
(u, v) ∈ E is incident from u and incident to v.
(u, v) ∈ E: vertex v is adjacent to u.

Undirected graph:
(u, v) ∈ E is incident on u and v.
(u, v) ∈ E: vertices u and v are adjacent to each other.

[Figure: example graph in which vertex 4 is adjacent to vertex 6.]

Graphs – Paths

A path is a sequence of adjacent vertices.
Length of a path = number of edges.
Path from v to u ⇒ u is reachable from v.
Simple path: all vertices are distinct.
A path is a cycle if its starting and ending vertices are the same.
Simple cycle: all intermediate vertices are distinct.

[Figure: examples of a simple path, a simple cycle, and a non-simple cycle.]

Graphs

Connected graph: ∃ path between any pair of vertices.
G’=(V’,E’) is a sub-graph of G=(V,E) if V’⊆V and E’⊆E.
Sub-graph of G induced by V’⊆V: take all edges of E connecting vertices of V’.
Complete graph: each pair of vertices is adjacent.
Tree: connected acyclic graph.

[Figure: examples of a sub-graph and an induced sub-graph.]

Graph Representation

Sparse graph (|E| much smaller than |V|²): adjacency list representation.
Dense graph: adjacency matrix.
For weighted graphs (V,E,w): weighted adjacency list/matrix.
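The two representations can be compared on a small example. The sketch below is only an illustration: the function names, the 0-based vertex numbering, and the example edge list are assumptions, not taken from the slides.

```python
def to_adjacency_matrix(n, edges):
    """Build the n x n 0/1 adjacency matrix of an undirected graph."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected graph: the matrix is symmetric
    return matrix


def to_adjacency_list(n, edges):
    """Build one neighbour list per vertex."""
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


# Hypothetical example: a path 0 - 1 - 2 - 3.
example_edges = [(0, 1), (1, 2), (2, 3)]
matrix = to_adjacency_matrix(4, example_edges)
adjacency = to_adjacency_list(4, example_edges)
```

The matrix always stores n² entries, while the lists store one entry per vertex plus two per undirected edge, which is why lists suit sparse graphs.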

Adjacency matrix:

$$a_{i,j} = \begin{cases} 1 & \text{if } (v_i, v_j) \in E \\ 0 & \text{otherwise} \end{cases}$$

Undirected graph ⇒ symmetric adjacency matrix.

[Figure: |V| × |V| adjacency matrix, |V|² entries.]

[Figure: adjacency list representation with |V| lists, |V|+|E| entries.]

Minimum Spanning Tree

We consider undirected graphs.
Spanning tree of (V,E) = sub-graph that is a tree and contains all vertices of V.
Minimum spanning tree of (V,E,w) = spanning tree with minimum weight.
Example: minimum length of cable to connect a set of computers.


Spanning Trees

Prim’s Algorithm

Greedy algorithm:
Select a vertex.
Choose a new vertex and edge guaranteed to be in a spanning tree of minimum cost.
Continue until all vertices are selected.
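The greedy steps above can be sketched as an array-based implementation with Θ(n²) running time. This is a minimal sketch under assumptions: the graph is given as a symmetric weight matrix with `math.inf` for missing edges, and the names `prim`, `in_tree`, and `parent` are illustrative.

```python
import math

INF = math.inf


def prim(weights, root=0):
    """Return (cost, parent) of a minimum spanning tree.

    weights is a symmetric n x n matrix, INF where there is no edge.
    """
    n = len(weights)
    in_tree = [False] * n   # membership in the tree built so far
    d = [INF] * n           # d[v]: cheapest edge from the tree to v
    parent = [None] * n
    d[root] = 0
    for _ in range(n):
        # Select: the vertex outside the tree with the smallest d.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: d[v])
        if d[u] == INF:
            raise ValueError("graph is not connected")
        # Add it to the tree.
        in_tree[u] = True
        # Update d for the vertices still outside the tree.
        for v in range(n):
            if not in_tree[v] and weights[u][v] < d[v]:
                d[v] = weights[u][v]
                parent[v] = u
    return sum(d), parent


# Hypothetical 4-vertex example.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
cost, parent = prim(example_weights)  # cost 6, tree edges 0-1, 1-2, 2-3
```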

[Figure: pseudocode of Prim’s algorithm with annotations. VT: vertices of the minimum spanning tree. d: weights from VT to V. Each iteration has a select step followed by an add and update step.]

Prim’s Algorithm

Complexity Θ(n²).
Cost of the minimum spanning tree: $\sum_{v \in V} d[v]$.

How to parallelize?
Iterative algorithm.
Any d[v] may change after every loop.
But it is possible to run each iteration in parallel.

1-D Block Mapping

p processes
n vertices
n/p vertices per process

Parallel Prim’s Algorithm

1-D block partitioning: Vi per Pi.
For each iteration:
Pi computes a local min di[u].
All-to-one reduction to P0 to compute the global min.
One-to-all broadcast of u.
Local updates of d[v].

Every process needs its own columns of the adjacency matrix to compute the update.
Θ(n²/p) space per process.
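The four steps of one iteration can be illustrated with a sequential simulation. This is a sketch, not a message-passing program: the p processes are simulated by loops, the reduction and broadcast are plain Python operations, and all names (`parallel_prim_simulated`, `blocks`) are assumptions.

```python
import math

INF = math.inf


def parallel_prim_simulated(weights, p, root=0):
    """Simulate Prim's algorithm with a 1-D block partitioning over p processes."""
    n = len(weights)
    block = math.ceil(n / p)
    # blocks[i] is the set of vertices V_i owned by process P_i.
    blocks = [range(i * block, min((i + 1) * block, n)) for i in range(p)]
    in_tree = [False] * n
    d = [INF] * n
    d[root] = 0
    for _ in range(n):
        # Step 1: each P_i computes a local minimum over its own vertices.
        local_minima = []
        for owned in blocks:
            candidates = [(d[v], v) for v in owned if not in_tree[v]]
            if candidates:
                local_minima.append(min(candidates))
        # Step 2: all-to-one reduction (simulated) gives the global minimum.
        d_u, u = min(local_minima)
        if d_u == INF:
            raise ValueError("graph is not connected")
        # Step 3: one-to-all broadcast of u (simulated by shared state).
        in_tree[u] = True
        # Step 4: each P_i updates d[v] for its own vertices only,
        # reading weights[u][v] from the columns it owns.
        for owned in blocks:
            for v in owned:
                if not in_tree[v] and weights[u][v] < d[v]:
                    d[v] = weights[u][v]
    return sum(d)


# Hypothetical 4-vertex example.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
mst_cost = parallel_prim_simulated(example_weights, p=2)
```

The result does not depend on p, since the partitioning only changes who computes each part of the minimum and the update.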

Analysis

The cost to select the minimum entry is O(n/p + log p).
The cost of a broadcast is O(log p).
The cost of the local update of the d vector is O(n/p).
The parallel run-time per iteration is O(n/p + log p).
The total parallel time (n iterations) is O(n²/p + n log p).

Analysis

Efficiency = speedup / number of processes: E = S/p = 1/(1 + Θ((p log p)/n)).
Maximal degree of concurrency = n.
To be cost-optimal we can only use up to n/log n processes.
Not very scalable.

The limit is reached when n²/p = Θ(n log p), with the bound p = O(n).

Keeping cost optimality: p log p = O(n), so log p + log log p = O(log n) and log p = O(log n) → p = O(n/log n).
pTP = TS + TO → TO = O(pn log p) = O((p log p)²).
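As a worked version of the efficiency formula, assuming the serial run-time is $T_S = \Theta(n^2)$ and the parallel run-time is the total given in the analysis of one iteration times n iterations:

```latex
\begin{align*}
T_P &= \Theta\!\left(\frac{n^2}{p}\right) + \Theta(n \log p) \\
S   &= \frac{T_S}{T_P}
     = \frac{\Theta(n^2)}{\Theta(n^2/p) + \Theta(n \log p)} \\
E   &= \frac{S}{p}
     = \frac{\Theta(n^2)}{\Theta(n^2) + \Theta(p\, n \log p)}
     = \frac{1}{1 + \Theta\!\left(\frac{p \log p}{n}\right)}
\end{align*}
```

The efficiency stays bounded away from zero only while $(p \log p)/n$ stays bounded, which is the condition p log p = O(n).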

Single-Source Shortest Paths: Dijkstra’s Algorithm

For (V,E,w), find the shortest paths from a vertex to all other vertices.
Shortest path = minimum weight path.
Algorithm for directed and undirected graphs with non-negative weights.

Similar to Prim’s algorithm.
Prim: store d[u], the minimum cost edge connecting a vertex of VT to u.
Dijkstra: store l[u], the minimum cost to reach u from s by a path in VT.
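The similarity with Prim’s algorithm shows in code: only the update rule changes. The sketch below assumes a weight matrix with `math.inf` for missing edges and non-negative weights; the name `dist` plays the role of l[u], and the other names are illustrative.

```python
import math

INF = math.inf


def dijkstra(weights, source):
    """Return the list of shortest-path costs from source (array-based, O(n^2))."""
    n = len(weights)
    done = [False] * n   # vertices whose cost is final (the set VT)
    dist = [INF] * n     # dist[u] is l[u] in the slides
    dist[source] = 0
    for _ in range(n):
        u = min((v for v in range(n) if not done[v]), key=lambda v: dist[v])
        if dist[u] == INF:
            break  # the remaining vertices are unreachable
        done[u] = True
        for v in range(n):
            # Prim would compare weights[u][v] alone; Dijkstra adds dist[u].
            if not done[v] and dist[u] + weights[u][v] < dist[v]:
                dist[v] = dist[u] + weights[u][v]
    return dist


# Hypothetical 4-vertex example.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
costs = dijkstra(example_weights, 0)  # [0, 1, 3, 6]
```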


Parallel formulation: Same as Prim’s algorithm.

All-Pairs Shortest Paths

For (V,E,w), find the shortest paths between all pairs of vertices.
Dijkstra’s algorithm: execute the single-source algorithm for n vertices → Θ(n³).
Floyd’s algorithm.

All-Pairs Shortest Paths – Dijkstra – Parallel Formulation

Source-partitioned formulation: each process has a set of vertices and computes their shortest paths.
No communication, E=1, but maximal degree of concurrency = n. Poor scalability.
Up to n processes. Solve in Θ(n²).

Source-parallel formulation (p>n): partition the processes (p/n processes per subset); each partition solves one single-source problem (in parallel).
In parallel: n single-source problems.
Up to n² processes, n²/log n for cost-optimality, in which case solve in Θ(n log n).

Floyd’s Algorithm

For any pair of vertices $v_i, v_j \in V$, consider all paths from $v_i$ to $v_j$ whose intermediate vertices belong to the set $\{v_1, v_2, \ldots, v_k\}$. Let $p_{i,j}^{(k)}$ (of weight $d_{i,j}^{(k)}$) be the minimum-weight path among them.

[Figure: example graph on vertices 1 to 8 showing the path $p_{i,j}^{(k)}$ from i to j.]

Floyd’s Algorithm

If vertex $v_k$ is not in the shortest path from $v_i$ to $v_j$, then $p_{i,j}^{(k)} = p_{i,j}^{(k-1)}$.

[Figure: example graph on vertices 1 to 8 in which the path from i to j avoids k, so $p_{i,j}^{(k)} = p_{i,j}^{(k-1)}$.]

Floyd’s Algorithm

If $v_k$ is in $p_{i,j}^{(k)}$, then we can break $p_{i,j}^{(k)}$ into two paths, one from $v_i$ to $v_k$ and one from $v_k$ to $v_j$. Each of these paths uses vertices from $\{v_1, v_2, \ldots, v_{k-1}\}$.

$$d_{i,j}^{(k)} = d_{i,k}^{(k-1)} + d_{k,j}^{(k-1)}$$

[Figure: example graph on vertices 1 to 8 in which the path $p_{i,j}^{(k)}$ from i to j goes through k.]

Floyd’s Algorithm

Recurrence equation:

$$d_{i,j}^{(k)} = \begin{cases} w(v_i, v_j) & \text{if } k = 0 \\ \min\left\{ d_{i,j}^{(k-1)},\; d_{i,k}^{(k-1)} + d_{k,j}^{(k-1)} \right\} & \text{if } k \geq 1 \end{cases}$$

Length of shortest path from $v_i$ to $v_j$ = $d_{i,j}^{(n)}$. Solution set = a matrix.

Floyd’s Algorithm

Complexity Θ(n³).

Also works in place.
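A serial, in-place version can be sketched as follows. It assumes the input is a weight matrix with 0 on the diagonal, `math.inf` for missing edges, and no negative cycles; the function name and the example are illustrative.

```python
import math

INF = math.inf


def floyd(d):
    """Run Floyd's algorithm in place on the weight matrix d and return it."""
    n = len(d)
    for k in range(n):          # allow v_k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                through_k = d[i][k] + d[k][j]
                if through_k < d[i][j]:
                    d[i][j] = through_k
    return d


# Hypothetical 4-vertex example.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
shortest = floyd([row[:] for row in example_weights])
```

Updating in place is safe because row k and column k do not change during iteration k when there are no negative cycles.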

How to parallelize?

Parallel Formulation

2-D block mapping:
Each of the p processes has a sub-matrix of (n/√p)² entries and computes its part of D(k).
Processes need access to the corresponding segments of the kth row and column of D(k-1).
kth iteration: each process containing part of the kth row sends it to the other processes in the same column. Same for the column broadcast along rows.
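The data each process needs can be made explicit with a sequential simulation of the 2-D block mapping. This is a sketch under assumptions: the process grid is q × q with p = q², n is divisible by q, broadcasts are simulated by copying row and column k, and all names are hypothetical.

```python
import math

INF = math.inf


def floyd_2d_blocks(weights, q):
    """Simulate Floyd's algorithm on a q x q grid of processes (p = q * q)."""
    n = len(weights)
    if n % q != 0:
        raise ValueError("this sketch assumes n is divisible by q")
    b = n // q  # each process owns a b x b block, b = n / sqrt(p)
    d = [row[:] for row in weights]
    for k in range(n):
        # Simulated broadcasts of row k and column k of D^(k-1).
        row_k = d[k][:]
        col_k = [d[i][k] for i in range(n)]
        for r in range(q):
            for c in range(q):
                # Process (r, c) receives only the segments matching its block.
                row_seg = row_k[c * b:(c + 1) * b]
                col_seg = col_k[r * b:(r + 1) * b]
                for bi in range(b):
                    for bj in range(b):
                        i, j = r * b + bi, c * b + bj
                        candidate = col_seg[bi] + row_seg[bj]
                        if candidate < d[i][j]:
                            d[i][j] = candidate
    return d


# Hypothetical 4-vertex example on a 2 x 2 process grid.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
shortest = floyd_2d_blocks(example_weights, q=2)
```

Each process touches only its own block plus two segments of length n/√p per iteration, which is what the row and column broadcasts deliver.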

2-D Mapping

[Figure: the matrix divided into blocks of side n/√p, one per process.]


Communication


Parallel Algorithm

Analysis

E = 1/(1 + Θ((√p log p)/n)).
Cost-optimal with up to O((n/log n)²) processes.
Possible to improve with pipelined 2-D block mapping: no broadcast, send to neighbour. Communication: Θ(n), up to O(n²) processes and still cost-optimal.
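The efficiency can be worked out under a common cost model, assumed here, in which broadcasting m words among q processes costs $\Theta(m \log q)$. Each of the n iterations broadcasts segments of $n/\sqrt{p}$ words among $\sqrt{p}$ processes and then updates $n^2/p$ entries:

```latex
\begin{align*}
T_P &= \Theta\!\left(\frac{n^3}{p}\right)
     + \Theta\!\left(\frac{n^2}{\sqrt{p}} \log p\right) \\
E   &= \frac{T_S}{p\,T_P}
     = \frac{\Theta(n^3)}{\Theta(n^3) + \Theta(n^2 \sqrt{p} \log p)}
     = \frac{1}{1 + \Theta\!\left(\frac{\sqrt{p} \log p}{n}\right)}
\end{align*}
```

Cost-optimality requires $\sqrt{p} \log p = O(n)$, which gives $p = O((n/\log n)^2)$.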

All-Pairs Shortest Paths: Matrix Multiplication Based Algorithm

Multiply the weighted adjacency matrix with itself, except that multiplications are replaced by additions, and additions by minimizations.
The result is a matrix that contains the shortest paths of at most 2 edges between any pair of nodes.
It follows that A^n contains all shortest paths.
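The modified product and its repeated squaring can be sketched as below. The sketch assumes a weight matrix with 0 on the diagonal and `math.inf` for missing edges; the function names are illustrative.

```python
import math

INF = math.inf


def min_plus_product(a, b):
    """Matrix product with (+, *) replaced by (min, +)."""
    n = len(a)
    return [
        [min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def all_pairs_by_squaring(weights):
    """Square the matrix until paths of up to n - 1 edges are covered."""
    n = len(weights)
    d = [row[:] for row in weights]
    max_edges = 1  # d currently holds shortest paths of at most max_edges edges
    while max_edges < n - 1:
        d = min_plus_product(d, d)
        max_edges *= 2
    return d


# Hypothetical 4-vertex example.
example_weights = [
    [0, 1, 4, INF],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [INF, 6, 3, 0],
]
shortest = all_pairs_by_squaring(example_weights)
```

Squaring means only about log n products are needed instead of n, at Θ(n³) serial work per product.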

Serial algorithm not optimal, but we can use n³/log n processes to run in O(log² n).

Transitive Closure

Find out if any two vertices are connected.
G*=(V,E*) where E*={(vi,vj) | ∃ a path from vi to vj in G}.

Transitive Closure

Start with D = (d_{i,j}), where d_{i,j} = a_{i,j} for an edge and ∞ otherwise.
Apply one all-pairs shortest paths algorithm.
Solution:

$$a^*_{i,j} = \begin{cases} \infty & \text{if } d_{i,j} = \infty \\ 1 & \text{if } d_{i,j} > 0 \text{ or } i = j \end{cases}$$

Also possible to modify Floyd’s algorithm by replacing min by logical or and + by logical and.
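The boolean variant of Floyd’s algorithm can be sketched as follows: alternatives are combined with logical or, and the two halves of a path through $v_k$ with logical and. The function name and the example graph are assumptions for illustration.

```python
def transitive_closure(adjacency):
    """Return reach, where reach[i][j] is True iff j is reachable from i."""
    n = len(adjacency)
    # Every vertex reaches itself; edges give the initial reachability.
    reach = [[bool(adjacency[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # "min" becomes "or", "+" becomes "and".
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach


# Hypothetical directed example: 0 -> 1 -> 2, vertex 3 isolated.
example_adjacency = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
reach = transitive_closure(example_adjacency)
```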

Connected Components

Connected components of G=(V,E) are the maximal disjoint sets C1,…,Ck such that V = C1 ∪ … ∪ Ck and u,v ∈ Ci iff u is reachable from v and v is reachable from u.

DFS Based Algorithm

DFS traversal of the graph → forest of (DFS) spanning trees.
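A minimal sketch of the DFS-based labelling, assuming an undirected graph given as an edge list on vertices 0..n-1. Each DFS tree of the forest becomes one component; the iterative stack and the names are illustrative choices.

```python
def connected_components(n, edges):
    """Return label, where label[v] is the index of the component containing v."""
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    label = [None] * n
    count = 0
    for start in range(n):
        if label[start] is not None:
            continue  # already reached from an earlier root
        # A new DFS tree rooted at start: a new component.
        label[start] = count
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if label[v] is None:
                    label[v] = count
                    stack.append(v)
        count += 1
    return label


# Hypothetical example with three components: {0, 1, 2}, {3, 4}, {5}.
labels = connected_components(6, [(0, 1), (1, 2), (3, 4)])
```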

Parallel Formulation

Partition G into p sub-graphs. Pi has Gi=(V,Ei).
Each Pi computes the spanning forest of Gi.
Merge the forests pair-wise.
Each merge possible in Θ(n).
Not described in the book – out of scope.
Find if an edge of A has its vertices in B:
no for all → union of 2 disjoint sets.
yes for one → merge.
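One possible way to realize the forest computation and the pair-wise merge is with a disjoint-set (union-find) structure. This is an assumption made for illustration, since the slides leave the merge unspecified; the edges are dealt round-robin to the simulated processes instead of by matrix stripes, and all names are hypothetical.

```python
def find(parent, x):
    """Return the representative of x, with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def spanning_forest(n, edges):
    """Keep only the edges that join two different trees."""
    parent = list(range(n))
    forest = []
    for u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v
            forest.append((u, v))
    return forest


def merge_forests(n, forest_a, forest_b):
    """Keep B, then add each edge of A whose end points lie in different trees."""
    return spanning_forest(n, forest_b + forest_a)


def parallel_components_simulated(n, edges, p):
    """Simulate p processes; P_i gets the sub-graph (V, E_i) with E_i = edges[i::p]."""
    forests = [spanning_forest(n, edges[i::p]) for i in range(p)]
    while len(forests) > 1:  # pair-wise merging
        merged = []
        for i in range(0, len(forests), 2):
            if i + 1 < len(forests):
                merged.append(merge_forests(n, forests[i], forests[i + 1]))
            else:
                merged.append(forests[i])
        forests = merged
    parent = list(range(n))
    for u, v in forests[0]:
        parent[find(parent, u)] = find(parent, v)
    return [find(parent, v) for v in range(n)]


# Hypothetical example: components {0, 1, 2}, {3, 4}, {5}.
labels = parallel_components_simulated(6, [(0, 1), (1, 2), (3, 4), (0, 2)], p=3)
```

Since a forest on n vertices has at most n - 1 edges, each merge handles O(n) edges regardless of how many edges the original sub-graphs had.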

Partition the adjacency matrix.
1-D partitioning in p stripes of n/p consecutive rows.

[Figure: example with the work divided between two processes, P1 and P2.]

Analysis

E = 1/(1 + Θ((p log p)/n)).
Up to O(n/log n) processes to be cost-optimal.
Performance similar to Prim’s algorithm.