Greedy Algorithms and Dynamic Programming1 Lectures on Greedy Algorithms and Dynamic Programming...


Greedy Algorithms and Dynamic Programming 1

Lectures on Greedy Algorithms and Dynamic Programming

COMP 523: Advanced Algorithmic Techniques

Lecturer: Dariusz Kowalski

Overview

Previous lectures:

• Algorithms based on recursion - call to the same procedure to solve the problem for the smaller-size sub-input(s)

• Graph algorithms: searching, with applications

These lectures:

• Greedy algorithms

• Dynamic programming

Greedy algorithm’s paradigm

An algorithm is greedy if:
• it builds up a solution in small consecutive steps
• it makes a decision at each step myopically, to optimize some underlying criterion

The optimality of a greedy algorithm is analyzed by showing that:
• in every step it is not worse than any other algorithm, or
• every algorithm can be gradually transformed into the greedy one without hurting its quality

Interval scheduling

Input: set of intervals on the line, represented by pairs of points (ends of intervals)

Output: the largest set of intervals such that no two of them overlap

Generic greedy solution:
• Consider intervals one after another using some rule

Rule 1

Select the interval that starts earliest (but does not overlap the already chosen intervals)

Underestimated solution: the rule can select fewer intervals than the optimum!

[Figure: optimal solution vs. the algorithm's solution]

Rule 2

Select the shortest interval (but not overlapping the already chosen intervals)

Underestimated solution: the rule can select fewer intervals than the optimum!

[Figure: optimal solution vs. the algorithm's solution]

Rule 3

Select the interval intersecting the smallest number of remaining intervals (but still not overlapping the already chosen intervals)

Underestimated solution: the rule can select fewer intervals than the optimum!

[Figure: optimal solution vs. the algorithm's solution]

Rule 4

Select the interval that ends first (but still not overlapping the already chosen intervals)

Hurray! Exact solution!
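The four rules can be compared on small inputs. The sketch below is only an illustration: the instances in BAD_FOR_RULE are made-up counterexamples (not the ones drawn in the lecture figures), the names overlaps, greedy, RULES and compare_rules are hypothetical, and intervals are assumed to be closed, so intervals that merely touch count as overlapping.

```python
def overlaps(a, b):
    """Closed intervals (start, end) overlap if they share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def greedy(intervals, key):
    """Generic greedy: repeatedly pick the remaining interval that minimizes key."""
    remaining = list(intervals)
    chosen = []
    while remaining:
        best = min(remaining, key=lambda iv: key(iv, remaining))
        chosen.append(best)
        # Drop the chosen interval and every interval overlapping it.
        remaining = [iv for iv in remaining if not overlaps(iv, best)]
    return chosen


RULES = {
    1: lambda iv, rem: iv[0],                              # starts earliest
    2: lambda iv, rem: iv[1] - iv[0],                      # shortest
    3: lambda iv, rem: sum(overlaps(iv, o) for o in rem),  # fewest conflicts
    4: lambda iv, rem: iv[1],                              # ends first
}

# Made-up instances on which Rules 1-3 select too few intervals.
BAD_FOR_RULE = {
    1: [(0, 10), (1, 2), (3, 4)],
    2: [(0, 5), (4, 7), (6, 11)],
    3: [(0, 3), (4, 7), (8, 11), (12, 15), (6, 9)]
       + [(2, 5)] * 3 + [(10, 13)] * 3,
}


def compare_rules():
    """For each failing rule: (intervals it selects, intervals Rule 4 selects)."""
    return {
        rule: (len(greedy(inst, RULES[rule])), len(greedy(inst, RULES[4])))
        for rule, inst in BAD_FOR_RULE.items()
    }


print(compare_rules())  # {1: (1, 2), 2: (1, 2), 3: (3, 4)}
```

This generic version recomputes the key in every round, so it takes quadratic time or worse; it is meant for comparing rules, not as an efficient implementation.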

Analysis - exact solution

The algorithm gives non-overlapping intervals: obvious, since we always choose an interval which does not overlap the previously chosen intervals

The solution is exact:

Let:
• A be the set of intervals obtained by the algorithm,
• Opt be the largest set of pairwise non-overlapping intervals

We show that A must be as large as Opt

Analysis - exact solution cont.

Let A = {A_1,…,A_k} and Opt = {B_1,…,B_m} be sorted by finishing time. By definition of Opt we have k ≤ m.

Fact: for every i ≤ k, A_i finishes not later than B_i.

Proof: by induction.

For i = 1: by definition of the first step of the algorithm.

From i-1 to i: suppose that A_{i-1} finishes not later than B_{i-1}.

From the definition of a single step of the algorithm, A_i is the first interval that finishes after A_{i-1} and does not overlap it.

If B_i finished before A_i, then the algorithm must have rejected it, so B_i would overlap one of the previous A_1,…,A_{i-1}, that is, it would start not later than A_{i-1} finishes. Consequently, by the inductive assumption, it would overlap B_{i-1} or end before it, which would be a contradiction.

[Figure: intervals B_{i-1} and B_i]

Analysis - exact solution cont.

Theorem: A is the exact solution.

Proof: we show that k = m.

Suppose to the contrary that k < m.

We already know that A_k finishes not later than B_k.

Since B_{k+1} starts after B_k finishes, it does not overlap any interval in A. Hence the algorithm could add B_{k+1} to A and obtain a bigger solution instead of finishing: a contradiction.

[Figure: intervals B_{k-1} and B_k; the algorithm finishes selection]

Implementation & time complexity

Efficient implementation:

• Sort intervals according to their right-most ends
• For every consecutive interval:
  – If its left-most end is after the right-most end of the last selected interval then we select this interval
  – Otherwise we skip it and go to the next interval

Time complexity: O(n log n + n) = O(n log n)
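The implementation described above can be sketched as follows. This is a minimal illustration, assuming intervals are given as (start, end) pairs; the function name select_intervals is hypothetical. As on the slide, an interval is selected only if it starts strictly after the last selected interval ends.

```python
def select_intervals(intervals):
    """Return a largest set of pairwise non-overlapping (start, end) intervals."""
    selected = []
    last_end = float("-inf")  # right-most end of the last selected interval
    # Sorting by right end costs O(n log n); the scan below is O(n).
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start > last_end:          # starts after the last selected one ends
            selected.append((start, end))
            last_end = end
        # otherwise skip the interval and go to the next one
    return selected


print(select_intervals([(0, 10), (1, 2), (3, 4)]))  # [(1, 2), (3, 4)]
```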

Textbook and Exercises

READING:

• Chapter 4 “Greedy Algorithms”, Section 4.1

EXERCISE:

• All Interval Scheduling problem from Section 4.1

Minimum spanning tree


Minimum spanning tree

Input: weighted graph G = (V,E)
• every edge in E has a positive weight

Output: spanning tree such that the sum of its weights is not bigger than the sum of weights of any other spanning tree

Spanning tree: subgraph with
– no cycle, and
– spanning and connected (every two nodes in V are connected by a path)

Properties of minimum spanning trees (MST)

Properties of spanning trees:
• n nodes
• n - 1 edges
• at least 2 leaves, for n ≥ 2 (leaf: a node with only one neighbor)

MST cycle property:
• after adding an edge to an MST we obtain exactly one cycle, and each edge from the MST in this cycle has no bigger weight than the weight of the added edge

[Figure: a cycle, with labels 2 and 3]

Crucial observation about MST

Consider sets of nodes A and V - A
• Let F be the set of edges between A and V - A
• Let a be the smallest weight of an edge in F

Theorem: Every MST must contain at least one edge of weight a from set F

Proof of the Theorem

Let e be the edge in F with the smallest weight; for simplicity assume that such an edge is unique. Suppose to the contrary that e is not in some MST. Consider one such MST.

Add e to this MST: a cycle is obtained, in which e has weight not smaller than the weight of any other edge in this cycle, by the MST cycle property. Since the two ends of e are in different sets A and V - A, there is another edge f that is both in the cycle and in F. By the definition of e, such f must have a bigger weight than e, which is a contradiction.

Greedy algorithms finding MST

Kruskal’s algorithm:
• Sort all edges according to their weights
• Choose n - 1 edges, one after another in the sorted order, as follows:
  – If a newly added edge does not create a cycle with the previously selected edges then we keep it in the (partial) solution; otherwise we remove it

Remark: we always have a partial forest

Greedy algorithms finding MST

Prim’s algorithm:
• Select an arbitrary node as a root
• Choose n - 1 edges, one after another, as follows:
  – Consider all edges which are incident to the currently built (partial) solution and which do not create a cycle in it, and select one having the smallest weight

Remark: we always have a connected partial tree

Why do the algorithms work?

It follows from the crucial observation:

Kruskal’s algorithm:
• Suppose we add edge {v,w};
• This edge has the smallest weight among edges between the set of nodes already connected with v (by a path in the already selected subgraph) and the other nodes

Prim’s algorithm:
• Always chooses an edge with the smallest weight among edges between the set of already connected nodes and free nodes (i.e., non-connected nodes)

Time complexity

There are implementations using
• Union-find data structure (Kruskal’s algorithm)
• Priority queue (Prim’s algorithm)

achieving time complexity O(m log n), where n is the number of nodes and m is the number of edges in a given graph G
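Kruskal’s algorithm with a union-find structure can be sketched as below. The slides do not specify the union-find details, so the variant here (union by size with path halving) is an assumption, as are the function name kruskal, the node numbering 0..n-1 and the (weight, u, v) edge format.

```python
def kruskal(n, edges):
    """Return (total_weight, tree_edges) for a connected graph on nodes 0..n-1.

    edges is a list of (weight, u, v) triples.
    """
    parent = list(range(n))  # union-find forest: parent[x] == x marks a root
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False          # same component: the edge would close a cycle
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra           # attach the smaller tree under the larger
        size[ra] += size[rb]
        return True

    tree = []
    for w, u, v in sorted(edges):  # edges in order of increasing weight
        if union(u, v):
            tree.append((w, u, v))
            if len(tree) == n - 1:
                break
    return sum(w for w, _, _ in tree), tree


EDGES = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, EDGES))  # (6, [(1, 0, 1), (2, 1, 3), (3, 1, 2)])
```

Sorting dominates the running time, which gives the O(m log n) bound quoted above.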

READING:

• Chapter 4 “Greedy Algorithms”, Section 4.5

EXERCISES:

• Solved Exercise 3 from Chapter 4

• Generalize the proof of the Theorem to the case where there may be more than one edge of smallest weight in F

Priority Queues (PQ)

Implementation of Prim’s algorithm using PQ


Priority queue

Set of n elements, each has its priority value (key)
– the smaller the key, the higher the priority of the element

Operations provided in time O(log n):

• Adding new element to PQ

• Removing an element from PQ

• Taking element with the smallest key

Implementation of PQ based on heaps

Heap: rooted (almost) complete binary tree, each node has its
• value
• key
• 3 pointers: to the parent and children (or nil(s) if parent or child(ren) not available)

Required property: in each subtree the smallest key is always in the root

[Figure: example heap; keys listed level by level: 2, 3, 4, 7, 5, 6]

Operations on the heap

PQ operations:
• Add
• Remove
• Take

Additional supporting operation:
• Last leaf: updating the pointer to the right-most leaf on the lowest level of the tree, after each operation (take, add, remove)

Construction of the heap

Construction:
• Start with an arbitrary element
• Keep adding the next elements using the add operation provided by the heap data structure (which will be defined in the next slide)

Implementing operations on heap

Smallest key element: trivially read from the root

Adding new element:
• find the next last leaf location in the heap
• put the new element as the last leaf
• recursively compare it with its parent’s key:
  – if the element has the smaller key then swap the element and its parent and continue; otherwise stop

Remark: finding the next last leaf may require searching through the path up and then down (exercise)

Implementing operations on heap

Removing element:
• remove it from the tree
• move the element from the last leaf into its place
• update the last leaf
• compare the moved element recursively either
  – “up” if its key is smaller than the key of its current parent: swap the elements and continue going up until reaching a smaller parent or the root, or
  – “down” otherwise: while it has a child with a smaller key, swap it with the smallest of its children and continue going down until reaching a node with no smaller child or a leaf
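The add and remove operations can be sketched on an array-based heap. Note that this swaps in a different representation from the pointer-based tree of the slides: the node at position i has its children at positions 2i+1 and 2i+2, so the last leaf is simply the end of the list and needs no separate update. The class name MinHeap and its method names are hypothetical, and only keys are stored.

```python
class MinHeap:
    """Array-based binary min-heap (a sketch; stores keys only)."""

    def __init__(self):
        self.keys = []

    def take_min(self):
        """Smallest key: read from the root, O(1)."""
        return self.keys[0]

    def add(self, key):
        """Put the key as the last leaf, then move it up while smaller than its parent."""
        self.keys.append(key)
        self._up(len(self.keys) - 1)

    def remove(self, i):
        """Remove and return the key at position i, refilling it from the last leaf."""
        keys = self.keys
        removed = keys[i]
        last = keys.pop()
        if i < len(keys):
            keys[i] = last
            self._up(i)    # at most one of the two calls moves anything
            self._down(i)
        return removed

    def _up(self, i):
        keys = self.keys
        while i > 0 and keys[i] < keys[(i - 1) // 2]:
            parent = (i - 1) // 2
            keys[i], keys[parent] = keys[parent], keys[i]
            i = parent

    def _down(self, i):
        keys = self.keys
        n = len(keys)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and keys[child] < keys[smallest]:
                    smallest = child
            if smallest == i:
                return            # no smaller child (or a leaf): stop
            keys[i], keys[smallest] = keys[smallest], keys[i]
            i = smallest


heap = MinHeap()
for k in [2, 3, 4, 7, 5, 6]:
    heap.add(k)
heap.add(1)
print(heap.keys)  # [1, 3, 2, 7, 5, 6, 4]
```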

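Examples - adding

[Figure: adding key 1 to the heap 2, 3, 4, 7, 5, 6; keys listed level by level]
• add 1 at the end: 2, 3, 4, 7, 5, 6, 1
• swap 1 and 4: 2, 3, 1, 7, 5, 6, 4
• swap 1 and 2: 1, 3, 2, 7, 5, 6, 4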

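Examples - removing

[Figure: removing key 2 from the heap 2, 3, 4, 7, 5, 6; keys listed level by level]
• removing 2: 2, 3, 4, 7, 5, 6
• swap 2 and last element: 6, 3, 4, 7, 5, 2
• remove 2 and swap 6 and 3: 3, 6, 4, 7, 5
• swap 6 and 5: 3, 5, 4, 7, 6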

Heap operations - time complexity

• Taking minimum: O(1)

• Adding:
  – Updating last leaf: O(log n)
  – Going up with swaps through the (almost) complete binary tree: O(log n)

• Removing:
  – Updating last leaf: O(log n)
  – Going up or down (the direction is selected only once) doing swaps through the (almost) complete binary tree: O(log n)

Prim’s algorithm - time complexity

Input: graph is given as an adjacency list

• Select a root node as an initial partial tree
• Construct PQ with all edges incident to the root (weights are keys)
• Repeat until PQ is empty
  – Take the smallest edge from PQ and remove it
  – If exactly one end of the edge is in the partial tree then
    • Add this edge and its other end to the partial tree
    • Add to PQ all edges, one after another, which are incident to the new node, and remove all their copies from the graph representation

Time complexity: O(m log n), since each edge is added to and removed from PQ at most once and each PQ operation takes O(log m) = O(log n) time

where n is the number of nodes, m is the number of edges
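The steps above can be sketched with Python's standard heapq module as the priority queue. This is an illustration under assumptions: the graph is a dict mapping each node to a list of (weight, neighbor) pairs, the function name prim is hypothetical, and instead of deleting edge copies from the graph representation the sketch simply never pushes edges that lead back into the partial tree.

```python
import heapq


def prim(adj, root):
    """Return (total_weight, tree_edges) for a connected undirected graph.

    adj maps each node to a list of (weight, neighbor) pairs.
    """
    in_tree = {root}
    tree = []
    # PQ holds edges (weight, tree_end, other_end); weights are the keys.
    pq = [(w, root, v) for w, v in adj[root]]
    heapq.heapify(pq)
    while pq:
        w, u, v = heapq.heappop(pq)   # take the smallest edge and remove it
        if v in in_tree:
            continue                  # both ends in the tree: would create a cycle
        in_tree.add(v)
        tree.append((w, u, v))
        for w2, x in adj[v]:
            if x not in in_tree:      # skip edges leading back into the tree
                heapq.heappush(pq, (w2, v, x))
    return sum(w for w, _, _ in tree), tree


ADJ = {
    0: [(1, 1), (4, 2)],
    1: [(1, 0), (3, 2), (2, 3)],
    2: [(4, 0), (3, 1), (5, 3)],
    3: [(2, 1), (5, 2)],
}
print(prim(ADJ, 0))  # (6, [(1, 0, 1), (2, 1, 3), (3, 1, 2)])
```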

Textbook and Exercises

READING:
• Chapters 2 and 4, Sections 2.5 and 4.5

EXERCISES:
• Solved Exercises 1 and 2 from Chapter 4
• Prove that a spanning tree of an n-node graph has n - 1 edges
• Prove that an n-node connected graph has at least n - 1 edges
• Show how to implement the update of the last leaf in time O(log n)

Dynamic programming

Two problems:

• Weighted interval scheduling

• Sequence alignment

Dynamic Programming paradigm

Dynamic Programming (DP):
• Decompose the problem into a series of sub-problems
• Build up correct solutions to larger and larger sub-problems

Similar to:
• Recursive programming vs. DP: in DP sub-problems may strongly overlap
• Exhaustive search vs. DP: in DP we try to find redundancies and reduce the space for searching
• Greedy algorithms vs. DP: sometimes DP orders sub-problems and processes them one after another

(Weighted) Interval scheduling

(Weighted) Interval scheduling:

Input: set of intervals (with weights) on the line, represented by pairs of points - ends of intervals

Output: the largest (maximum sum of weights) set of intervals such that no two of them overlap

Greedy algorithm doesn’t work for weighted case!

Example

Greedy algorithm:
• Repeatedly select the interval that ends first (but still not overlapping the already chosen intervals)

Exact solution of the unweighted case.

[Figure: three intervals with weights 1, 3 and 1]

Greedy algorithm gives total weight 2 instead of optimal 3

Basic structure and definition

• Sort the intervals according to their right ends
• Define function p as follows:
  – p(1) = 0
  – p(i) is the number of intervals which finish before the ith interval starts; since the intervals are sorted by right ends, these are exactly intervals 1,…,p(i)

[Figure: four intervals with weights 1, 3, 1, 2, for which p(1)=0, p(2)=1, p(3)=0, p(4)=2]

Basic property

• Let w_j be the weight of the jth interval
• The optimal solution for the set of the first j intervals satisfies

OPT(j) = max{ w_j + OPT(p(j)) , OPT(j-1) }, with OPT(0) = 0

Proof:

If the jth interval is in the optimal solution O then the other intervals in O are among intervals 1,…,p(j).

Otherwise search for the solution among the first j-1 intervals.


Sketch of the algorithm

• Additional array M[0…n] initialized by 0, p(1),…,p(n) (intuitively, M[j] stores the value of the optimal solution OPT(j))

Algorithm
• For j = 1,…,n do
  – Read p(j) = M[j]
  – Set M[j] := max{ w_j + M[p(j)] , M[j-1] }


Complexity of solution

Time: O(n log n)
• Sorting: O(n log n)
• Initialization of M[0…n] by 0, p(1),…,p(n): O(n log n)
• Algorithm: n iterations, each takes constant time, total O(n)

Memory: O(n) - additional array M
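The whole computation can be sketched as follows. This is a minimal illustration under assumptions: intervals are (start, end, weight) triples, p is computed by binary search and kept in its own list (rather than being stored in M first, as in the lecture sketch), and the function name weighted_schedule is hypothetical. The sample instance is made up so that its p values agree with p(1)=0, p(2)=1, p(3)=0, p(4)=2; the coordinates are not taken from the figure.

```python
from bisect import bisect_left


def weighted_schedule(intervals):
    """Return (p, M) with M[j] = OPT(j) for intervals sorted by right end.

    intervals is a list of (start, end, weight) triples; index 0 of p and M
    corresponds to the empty prefix.
    """
    ivs = sorted(intervals, key=lambda iv: iv[1])   # O(n log n)
    ends = [end for _, end, _ in ivs]
    # p[j]: number of intervals finishing strictly before interval j starts,
    # found by binary search in the sorted right ends, O(log n) each.
    p = [0] + [bisect_left(ends, start) for start, _, _ in ivs]
    M = [0] * (len(ivs) + 1)
    for j in range(1, len(ivs) + 1):                # O(n) in total
        w_j = ivs[j - 1][2]
        M[j] = max(w_j + M[p[j]], M[j - 1])
    return p, M


p, M = weighted_schedule([(0, 4, 1), (6, 10, 3), (2, 12, 1), (11, 16, 2)])
print(p[1:])  # [0, 1, 0, 2]
print(M)      # [0, 1, 4, 4, 6]
```

The last entry of M is the optimal total weight; the selected intervals can be recovered by walking back from j = n and checking which term of the maximum was used.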


Sequence alignment problem

Popular problem from word processing and computational biology
• Input: two words X = x_1 x_2 … x_n and Y = y_1 y_2 … y_m
• Output: largest alignment

Alignment A: set of pairs (i_1,j_1),…,(i_k,j_k) such that
• If (i,j) is in A then x_i = y_j
• If (i,j) is before (i’,j’) in A then i < i’ and j < j’ (no crossing matches)

Example
• Input: X = c t t t c t c c, Y = t c t t c c

Alignment A: [Figure: an alignment of X and Y]

Finding the size of max alignment

Optimal alignment OPT(i,j) for prefixes of X and Y of lengths i and j respectively:

OPT(i,j) = max{ α_ij + OPT(i-1,j-1) , OPT(i,j-1) , OPT(i-1,j) }

where α_ij equals 1 if x_i = y_j, otherwise it is equal to -∞

Proof:

If x_i and y_j are matched in the optimal solution O then the optimal alignment contains one match (x_i, y_j) and the optimal solution for prefixes of length i-1 and j-1 respectively.

Otherwise at most one of the two ends x_i, y_j is matched. It follows that either only x_1 x_2 … x_{i-1} are matched with letters from y_1 y_2 … y_j (x_i is unmatched), or only y_1 y_2 … y_{j-1} are matched with letters from x_1 x_2 … x_i (y_j is unmatched). Hence the optimal solution is the same as for OPT(i-1,j) or for OPT(i,j-1).

Algorithm finding max alignment

• Initialize matrix M[0..n,0..m] to zeros

Algorithm
• For i = 1,…,n do
  – For j = 1,…,m do
    • Compute α_ij
    • Set M[i,j] := max{ α_ij + M[i-1,j-1] , M[i,j-1] , M[i-1,j] }
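The table-filling loop can be sketched directly. This is a minimal illustration; the function name alignment_table is hypothetical, and the match score (1 for equal letters, minus infinity otherwise) is written as the local variable alpha.

```python
def alignment_table(X, Y):
    """Return the matrix M with M[i][j] = size of a max alignment of X[:i] and Y[:j]."""
    n, m = len(X), len(Y)
    M = [[0] * (m + 1) for _ in range(n + 1)]   # row 0 and column 0 stay zero
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # alpha is 1 for equal letters, -infinity otherwise, so a pair of
            # different letters is never chosen as a match by max().
            alpha = 1 if X[i - 1] == Y[j - 1] else float("-inf")
            M[i][j] = max(alpha + M[i - 1][j - 1], M[i][j - 1], M[i - 1][j])
    return M


M = alignment_table("ctttctcc", "tcttcc")
print(M[8][6])  # 5
```

For the example words X = ctttctcc and Y = tcttcc the largest alignment has 5 pairs: Y itself cannot be embedded into X in order, while t t t c c can.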

Complexity

Time: O(nm)
• Initialization of matrix M[0..n,0..m]: O(nm)
• Algorithm: O(nm)

Memory: O(nm)

Reconstruction of optimal alignment

Input: matrix M[0..n,0..m] containing OPT values

Algorithm
• Set i = n, j = m
• While both i,j > 0 do
  • Compute α_ij
  • If M[i,j] = α_ij + M[i-1,j-1] then match x_i and y_j and set i = i - 1, j = j - 1; else
  • If M[i,j] = M[i,j-1] then set j = j - 1 (skip letter y_j); else
  • If M[i,j] = M[i-1,j] then set i = i - 1 (skip letter x_i)
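The reconstruction can be sketched as below. To keep the example self-contained it first rebuilds the matrix M and then walks back from (n, m); the function name max_alignment is hypothetical, and pairs are reported 1-indexed, as in the definition of an alignment.

```python
def max_alignment(X, Y):
    """Return a largest alignment of X and Y as a list of 1-indexed pairs (i, j)."""
    n, m = len(X), len(Y)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 1 + M[i - 1][j - 1] if X[i - 1] == Y[j - 1] else float("-inf")
            M[i][j] = max(diag, M[i][j - 1], M[i - 1][j])

    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1] and M[i][j] == 1 + M[i - 1][j - 1]:
            pairs.append((i, j))      # match x_i with y_j
            i, j = i - 1, j - 1
        elif M[i][j] == M[i][j - 1]:
            j -= 1                    # skip letter y_j
        else:
            i -= 1                    # here M[i][j] == M[i-1][j]: skip letter x_i
    pairs.reverse()                   # pairs were collected from the end
    return pairs


print(len(max_alignment("ctttctcc", "tcttcc")))  # 5
```

The walk takes at most n + m steps, so reconstruction adds only O(n + m) time on top of filling the matrix.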

Distance between words

Generalization of alignment problem

• Input:
  – two words X = x_1 x_2 … x_n and Y = y_1 y_2 … y_m
  – mismatch costs α_pq, for every pair of letters p and q
  – gap penalty δ
• Output:
  – (smallest) distance between words X and Y

Example
• Input: X = c t t t c t c c, Y = t c t t c c

Alignment A (4 gaps of cost δ each, 1 mismatch of cost α_ct):

[Figure: X = c t t t c t c c aligned above Y = t c t t c c, with four matches marked | and one mismatch marked ^]

Finding the distance between words

Optimal alignment OPT(i,j) for prefixes of X and Y of lengths i and j respectively:

OPT(i,j) = min{ α_ij + OPT(i-1,j-1) , δ + OPT(i,j-1) , δ + OPT(i-1,j) }

where α_ij is the mismatch cost of the letters x_i and y_j, and δ is the gap penalty

Proof:

If x_i and y_j are (mis)matched in the optimal solution O then the optimal alignment contains one (mis)match (x_i, y_j) of cost α_ij and the optimal solution for prefixes of length i-1 and j-1 respectively.

Otherwise at most one of the two ends x_i, y_j is (mis)matched. It follows that either only x_1 x_2 … x_{i-1} are (mis)matched with letters from y_1 y_2 … y_j, or only y_1 y_2 … y_{j-1} are (mis)matched with letters from x_1 x_2 … x_i. Hence the optimal solution is the same as counted for OPT(i-1,j) or for OPT(i,j-1), plus the gap penalty δ.

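Algorithm and complexity remain the same, with max replaced by min and with the first row and column of M initialized to gap costs (M[i,0] = iδ, M[0,j] = jδ) instead of zeros.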
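The distance computation can be sketched as follows. The function name word_distance and the convention of passing the mismatch costs as a function alpha(p, q) are assumptions made for this illustration. One boundary detail matters: a prefix aligned against the empty word costs one gap per letter, so the first row and column hold multiples of delta rather than zeros.

```python
def word_distance(X, Y, delta, alpha):
    """Smallest alignment cost between X and Y.

    delta is the gap penalty; alpha(p, q) is the cost of pairing letters p and q
    (typically 0 when p == q).
    """
    n, m = len(X), len(Y)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        M[i][0] = i * delta      # x_1..x_i against the empty word: i gaps
    for j in range(1, m + 1):
        M[0][j] = j * delta      # the empty word against y_1..y_j: j gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i][j] = min(
                alpha(X[i - 1], Y[j - 1]) + M[i - 1][j - 1],  # (mis)match x_i, y_j
                delta + M[i][j - 1],                          # gap: skip y_j
                delta + M[i - 1][j],                          # gap: skip x_i
            )
    return M[n][m]


def unit_cost(p, q):
    """Mismatch cost 1 for different letters, 0 for equal ones."""
    return 0 if p == q else 1


print(word_distance("kitten", "sitting", 1, unit_cost))  # 3
```

With delta = 1 and unit mismatch costs this is the classical edit distance; other cost choices change which alignments are preferred but not the O(nm) running time.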

READING:

• Chapter 6 “Dynamic Programming”, Sections 6.1 and 6.6

EXERCISES:

• All Shortest Paths problem, Section 6.8

Conclusions

• Greedy algorithms: algorithms constructing solutions step after step by using a local rule
• Exact greedy algorithm for the interval selection problem in time O(n log n), illustrating the “greedy stays ahead” rule
• Greedy algorithms for finding a minimum spanning tree in a graph
  – Kruskal’s algorithm
  – Prim’s algorithm
• Priority Queues
  – greedy Prim’s algorithm for finding a minimum spanning tree in a graph in time O(m log n)

Conclusions cont.

• Dynamic programming
  – Weighted interval scheduling in time O(n log n)
  – Sequence alignment in time O(nm)
