Chapter 7

Date post: 06-Jan-2016
Upload: jacqui
Page 1: Chapter 7

© The McGraw-Hill Companies, Inc., 2005

Chapter 7

Dynamic Programming

Page 2: Chapter 7

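Fibonacci Sequence

Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, …

$$F_i = \begin{cases} i & \text{if } i \le 1 \\ F_{i-1} + F_{i-2} & \text{if } i \ge 2 \end{cases}$$

Solved by a recursive program: much replicated computation is done. It should be solved by a simple loop.

[Figure: recursion tree for f5. f5 calls f4 and f3; f4 calls f3 and f2; each f3 calls f2 and f1; each f2 calls f1 and f0. In total f3 is computed 2 times, f2 3 times, f1 5 times and f0 3 times.]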
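The contrast between the recursive program and the simple loop can be sketched as follows. This is a minimal illustration, not code from the slides; the function names `fib_recursive` and `fib_loop` are made up for the example.

```python
def fib_recursive(i: int) -> int:
    """Direct transcription of the recurrence; subproblems are recomputed."""
    if i <= 1:
        return i
    return fib_recursive(i - 1) + fib_recursive(i - 2)


def fib_loop(i: int) -> int:
    """Bottom-up loop: every F_k is computed exactly once."""
    prev, curr = 0, 1  # F_0 and F_1
    for _ in range(i):
        prev, curr = curr, prev + curr
    return prev


print([fib_loop(i) for i in range(9)])
```

The loop runs in O(i) time, while the recursive version repeats work in the way the recursion tree shows.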

Page 3: Chapter 7

Dynamic Programming

Dynamic programming is an algorithm design method that can be used when the solution to a problem may be viewed as the result of a sequence of decisions.

Page 4: Chapter 7

The Shortest Path

To find a shortest path in a multi-stage graph

Apply the greedy method: the shortest path from S to T: 1 + 2 + 5 = 8.

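[Figure: a graph with nodes S, A, B, T in consecutive stages, joined by parallel edges with weights 1, 2, 3, 4, 5, 5, 6, 7; taking the cheapest edge at each stage gives 1, 2 and 5.]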

Page 5: Chapter 7

The Shortest Path in Multi-stage Graphs

e.g. The greedy method cannot be applied to this case: it yields the path (S, A, D, T) with cost 1 + 4 + 18 = 23.

The real shortest path is (S, C, F, T), with cost 5 + 2 + 2 = 9.

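[Figure: multi-stage graph with stages {S}, {A, B, C}, {D, E, F}, {T}. Edge weights: S→A 1, S→B 2, S→C 5; A→D 4, A→E 11; B→D 9, B→E 5, B→F 16; C→F 2; D→T 18, E→T 13, F→T 2.]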

Page 6: Chapter 7

Dynamic Programming Approach

Dynamic programming approach (forward approach):

d(S, T) = min{1+d(A, T), 2+d(B, T), 5+d(C, T)}

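[Figure: S has edges of weight 1, 2 and 5 to A, B and C, which are at distances d(A, T), d(B, T) and d(C, T) from T.]

[Figure: A has edges of weight 4 and 11 to D and E, which are at distances d(D, T) and d(E, T) from T.]

d(A, T) = min{4 + d(D, T), 11 + d(E, T)} = min{4 + 18, 11 + 13} = 22.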

Page 7: Chapter 7

d(B, T) = min{9 + d(D, T), 5 + d(E, T), 16 + d(F, T)} = min{9 + 18, 5 + 13, 16 + 2} = 18.

d(C, T) = min{2 + d(F, T)} = 2 + 2 = 4.

d(S, T) = min{1 + d(A, T), 2 + d(B, T), 5 + d(C, T)} = min{1 + 22, 2 + 18, 5 + 4} = 9.

The above way of reasoning is called backward reasoning.
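The forward approach (backward reasoning) can be sketched as a memoized recursion on d(v, T). The edge table `GRAPH` is read off the recurrences in this example; the helper names `d` and `shortest_path` are illustrative and not from the slides.

```python
from functools import lru_cache

# Edge weights taken from the recurrences of the example graph.
GRAPH = {
    "S": {"A": 1, "B": 2, "C": 5},
    "A": {"D": 4, "E": 11},
    "B": {"D": 9, "E": 5, "F": 16},
    "C": {"F": 2},
    "D": {"T": 18},
    "E": {"T": 13},
    "F": {"T": 2},
    "T": {},
}


@lru_cache(maxsize=None)
def d(node):
    """d(node, T): cost of a shortest path from node to T."""
    if node == "T":
        return 0
    return min(w + d(nxt) for nxt, w in GRAPH[node].items())


def shortest_path(start="S"):
    """Follow, from each node, the edge that attains the minimum in d."""
    path = [start]
    while path[-1] != "T":
        node = path[-1]
        path.append(min(GRAPH[node], key=lambda nxt: GRAPH[node][nxt] + d(nxt)))
    return path


print(d("A"), d("B"), d("C"), d("S"))
print(shortest_path())
```

Each d(v, T) is evaluated once thanks to the cache, so the work is proportional to the number of edges.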

Page 8: Chapter 7

Backward Approach (Forward Reasoning)

d(S, A) = 1
d(S, B) = 2
d(S, C) = 5

d(S, D) = min{d(S, A) + d(A, D), d(S, B) + d(B, D)} = min{1 + 4, 2 + 9} = 5
d(S, E) = min{d(S, A) + d(A, E), d(S, B) + d(B, E)} = min{1 + 11, 2 + 5} = 7
d(S, F) = min{d(S, B) + d(B, F), d(S, C) + d(C, F)} = min{2 + 16, 5 + 2} = 7

Page 9: Chapter 7

d(S,T) = min{d(S, D)+d(D, T),d(S,E)+ d(E,T), d(S, F)+d(F, T)}

= min{ 5+18, 7+13, 7+2 } = 9
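The backward approach (forward reasoning) fills in d(S, v) stage by stage, starting from S. The sketch below assumes the same example graph; the names `EDGES`, `STAGES` and `dist` are illustrative only.

```python
# Edge weights of the example graph, keyed by (tail, head).
EDGES = {
    ("S", "A"): 1, ("S", "B"): 2, ("S", "C"): 5,
    ("A", "D"): 4, ("A", "E"): 11,
    ("B", "D"): 9, ("B", "E"): 5, ("B", "F"): 16,
    ("C", "F"): 2,
    ("D", "T"): 18, ("E", "T"): 13, ("F", "T"): 2,
}
STAGES = [["S"], ["A", "B", "C"], ["D", "E", "F"], ["T"]]

dist = {"S": 0}  # dist[v] = d(S, v)
for stage in STAGES[1:]:
    for v in stage:
        # Every predecessor of v lies in the previous stage, so dist[u] is known.
        dist[v] = min(dist[u] + w for (u, head), w in EDGES.items() if head == v)

print(dist)
```

Both directions give the same d(S, T); they differ only in which end of the graph the recurrence is anchored to.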

Page 10: Chapter 7

Principle of Optimality

Principle of optimality: Suppose that in solving a problem, we have to make a sequence of decisions D1, D2, …, Dn. If this sequence is optimal, then the last k decisions, 1 ≤ k ≤ n, must be optimal.

e.g. the shortest path problem: if i, i1, i2, …, j is a shortest path from i to j, then i1, i2, …, j must be a shortest path from i1 to j.

In summary, if a problem can be described by a multi-stage graph, then it can be solved by dynamic programming.

Page 11: Chapter 7

Dynamic Programming

Forward approach and backward approach:

If the recurrence relations are formulated using the forward approach, the relations are solved backward, i.e., beginning with the last decision.

If the relations are formulated using the backward approach, they are solved forward.

To solve a problem by using dynamic programming:

Find out the recurrence relations.

Represent the problem by a multi-stage graph.

Page 12: Chapter 7

The Resource Allocation Problem

m resources, n projects

p(i, j): the profit obtained when j resources are allocated to project i.

Maximize the total profit.

Project \ Resources    1    2    3
1                      2    8    9
2                      5    6    7
3                      4    4    4
4                      2    4    5

Page 13: Chapter 7

The Multi-Stage Graph Solution

The resource allocation problem can be described as a multi-stage graph.

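(i, j): i resources allocated to projects 1, 2, …, j.

e.g. node H = (3, 2): 3 resources allocated to projects 1, 2.

[Figure: multi-stage graph from S to T with nodes A = (0, 1), B = (1, 1), C = (2, 1), D = (3, 1), E = (0, 2), F = (1, 2), G = (2, 2), H = (3, 2), I = (0, 3), J = (1, 3), K = (2, 3), L = (3, 3). Each edge is weighted by the profit of the allocation it represents, e.g. S→C has weight p(1, 2) = 8 and C→H has weight p(2, 1) = 5.]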

Page 14: Chapter 7

Find the longest path from S to T: (S, C, H, L, T), 8 + 5 + 0 + 0 = 13.

2 resources allocated to project 1.

1 resource allocated to project 2.

0 resources allocated to projects 3, 4.
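The longest-path computation can be sketched as a table over (number of projects considered, resources available). The profit table is the one from the example; the function `allocate` and its return shape are assumptions made for this illustration.

```python
# PROFIT[i][j]: profit when j resources go to project i + 1 (j = 0 gives 0).
PROFIT = [
    [0, 2, 8, 9],  # project 1
    [0, 5, 6, 7],  # project 2
    [0, 4, 4, 4],  # project 3
    [0, 2, 4, 5],  # project 4
]


def allocate(profit, m):
    """Return (best total profit, resources given to each project)."""
    n = len(profit)
    # best[j][r]: maximum profit from projects 1..j with r resources available.
    best = [[0] * (m + 1) for _ in range(n + 1)]
    choice = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for r in range(m + 1):
            for k in range(r + 1):  # k resources go to project j
                value = best[j - 1][r - k] + profit[j - 1][k]
                if value > best[j][r]:
                    best[j][r], choice[j][r] = value, k
    # Trace the recorded choices back from the last project.
    alloc, r = [], m
    for j in range(n, 0, -1):
        alloc.append(choice[j][r])
        r -= choice[j][r]
    return best[n][m], alloc[::-1]


print(allocate(PROFIT, 3))
```

The recorded choices play the role of the path (S, C, H, L, T) in the graph formulation.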

Page 15: Chapter 7

The Longest Common Subsequence (LCS) Problem

A string: A = b a c a d

A subsequence of A: obtained by deleting 0 or more symbols from A (not necessarily consecutive).

e.g. ad, ac, bac, acad, bacad, bcd.

Common subsequences of A = b a c a d and B = a c c b a d c b: ad, ac, bac, acad.

The longest common subsequence (LCS) of A and B: a c a d.

Page 16: Chapter 7

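The LCS Algorithm

Let $A = a_1 a_2 \cdots a_m$ and $B = b_1 b_2 \cdots b_n$.

Let $L_{i,j}$ denote the length of the longest common subsequence of $a_1 a_2 \cdots a_i$ and $b_1 b_2 \cdots b_j$.

$$L_{i,j} = \begin{cases} L_{i-1,j-1} + 1 & \text{if } a_i = b_j \\ \max\{L_{i-1,j},\ L_{i,j-1}\} & \text{if } a_i \ne b_j \end{cases}$$

$$L_{0,0} = L_{0,j} = L_{i,0} = 0 \quad \text{for } 1 \le i \le m,\ 1 \le j \le n.$$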

Page 17: Chapter 7

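The dynamic programming approach for solving the LCS problem:

[Figure: grid of the entries $L_{1,1}, L_{1,2}, L_{1,3}, \ldots, L_{2,1}, L_{2,2}, \ldots, L_{3,1}, \ldots, L_{m,n}$ showing the order of computation.]

Time complexity: O(mn)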

Page 18: Chapter 7

Tracing Back in the LCS Algorithm

e.g. A = b a c a d, B = a c c b a d c b

After all Li,j’s have been found, we can trace back to find the longest common subsequence of A and B.

A \ B       a  c  c  b  a  d  c  b
         0  0  0  0  0  0  0  0  0
    b    0  0  0  0  1  1  1  1  1
    a    0  1  1  1  1  2  2  2  2
    c    0  1  2  2  2  2  2  3  3
    a    0  1  2  2  2  3  3  3  3
    d    0  1  2  2  2  3  4  4  4
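The table construction and the trace back can be sketched together. The function name `lcs` and the tie-breaking rule in the trace back (prefer moving up when both neighbours are equal) are choices made for this example; other rules may recover a different subsequence of the same length.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # L[i][j]: LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])

    # Trace back from L[m][n] to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("bacad", "accbadcb"))
```

Filling the table takes O(mn) time and the trace back takes O(m + n) steps.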

Page 19: Chapter 7

The Two-Sequence Alignment Problem

Sequence alignment may be viewed as a method to measure the similarity of two sequences.

Let $A = a_1 a_2 \cdots a_m$ and $B = b_1 b_2 \cdots b_n$.

A sequence alignment of A and B: a $2 \times k$ matrix M ($k \ge m, n$) of characters over $\Sigma \cup \{-\}$.

e.g. if A = a b c d and B = c b d, a possible alignment of them would be:

a b c - d
- - c b d

Page 20: Chapter 7

Let f(x, y) denote the score for aligning x with y.

If x and y are the same, f(x, y) = 2.

If x and y are not the same, f(x, y) = 1.

If x or y is "-", f(x, y) = -1.

e.g.

A = a b c d
B = c b - d

The total score of the alignment is 1 + 2 - 1 + 2 = 4.

Page 21: Chapter 7

Let $A_{i,j}$ denote the optimal alignment score between $a_1 a_2 \cdots a_i$ and $b_1 b_2 \cdots b_j$, where $1 \le i \le m$ and $1 \le j \le n$.

Then, $A_{i,j}$ can be expressed as follows:

$$A_{0,0} = 0$$

$$A_{i,0} = i \cdot f(a_i, -)$$

$$A_{0,j} = j \cdot f(-, b_j)$$

$$A_{i,j} = \max \begin{cases} A_{i-1,j} + f(a_i, -) \\ A_{i-1,j-1} + f(a_i, b_j) \\ A_{i,j-1} + f(-, b_j) \end{cases}$$

$A_{i,0}$: $a_1 a_2 \cdots a_i$ are all aligned with "-".

$A_{0,j}$: $b_1 b_2 \cdots b_j$ are all aligned with "-".

$f(a_i, -)$: $a_i$ is aligned with "-".

$f(a_i, b_j)$: $a_i$ is aligned with $b_j$.

$f(-, b_j)$: $b_j$ is aligned with "-".

Page 22: Chapter 7

The $A_{i,j}$'s for A = a b d a d and B = b a c d using the above recursive formula are listed below:

             a_i        a    b    d    a    d
b_j               0     1    2    3    4    5
        0         0    -1   -2   -3   -4   -5
 b      1        -1     1    1    0   -1   -2
 a      2        -2     1    2    2    2    1
 c      3        -3     0    2    3    3    3
 d      4        -4    -1    1    4    4    5

Page 23: Chapter 7

In the above table, we have also recorded how each $A_{i,j}$ is obtained.

An arrow from $(a_i, b_j)$ to $(a_{i-1}, b_{j-1})$: $a_i$ matched with $b_j$.

An arrow from $(a_i, b_j)$ to $(a_{i-1}, b_j)$: $a_i$ matched with "-".

An arrow from $(a_i, b_j)$ to $(a_i, b_{j-1})$: $b_j$ matched with "-".

Based upon the arrows in the table, we can trace back and find that the optimal alignment is as follows:

a b d a d

- b a c d
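The alignment recurrence can be sketched with the scoring function f defined earlier (2 for a match, 1 for a mismatch, -1 against a gap). The function names below are illustrative, and the sketch returns only the optimal score, not the arrows.

```python
def f(x: str, y: str) -> int:
    """Score for aligning x with y, as defined on the slides."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else 1


def alignment_score(a: str, b: str) -> int:
    """Optimal alignment score between a and b."""
    m, n = len(a), len(b)
    A = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):  # a_1..a_i all aligned with "-"
        A[i][0] = A[i - 1][0] + f(a[i - 1], "-")
    for j in range(1, n + 1):  # b_1..b_j all aligned with "-"
        A[0][j] = A[0][j - 1] + f("-", b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            A[i][j] = max(
                A[i - 1][j] + f(a[i - 1], "-"),
                A[i - 1][j - 1] + f(a[i - 1], b[j - 1]),
                A[i][j - 1] + f("-", b[j - 1]),
            )
    return A[m][n]


print(alignment_score("abdad", "bacd"))
```

The printed value corresponds to the bottom-right entry of the table for A = a b d a d and B = b a c d.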

Page 24: Chapter 7

Edit Distance

Edit distance is also used quite often to measure the similarity between two sequences.

Transform A to B by three edit operations:

deletion of a character from A.

insertion of a character into A.

substitution of a character in A with another character.

Page 25: Chapter 7

For example, let A = GTAAHTY and B = TAHHYC.

(1) Delete the first character G of A.
GTAAHTY → TAAHTY

(2) Substitute the third character of A by H.
TAAHTY → TAHHTY

(3) Delete the fifth character of A.
TAHHTY → TAHHY

(4) Insert C after the last character of A.
TAHHY → TAHHYC = B

Page 26: Chapter 7

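Associate a cost with each operation. Let $\alpha$, $\beta$ and $\gamma$ denote the costs of insertion, deletion and substitution, respectively.

$A_{i,j}$: the edit distance between $a_1 a_2 \cdots a_i$ and $b_1 b_2 \cdots b_j$.

$$A_{0,0} = 0$$

$$A_{i,0} = i\beta$$

$$A_{0,j} = j\alpha$$

$$A_{i,j} = \begin{cases} A_{i-1,j-1} & \text{if } a_i = b_j \\ \min\{A_{i-1,j} + \beta,\ A_{i-1,j-1} + \gamma,\ A_{i,j-1} + \alpha\} & \text{otherwise} \end{cases}$$

It takes O(nm) time to find an optimal alignment.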
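The edit distance recurrence can be sketched as below. The parameter names `ins`, `dele` and `sub` stand for the costs of insertion, deletion and substitution; setting all three to 1 is an assumption for the example, since the slides leave the costs unspecified.

```python
def edit_distance(a: str, b: str, ins: int = 1, dele: int = 1, sub: int = 1) -> int:
    """Minimum total cost of transforming a into b."""
    m, n = len(a), len(b)
    A = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):  # delete a_1..a_i
        A[i][0] = i * dele
    for j in range(1, n + 1):  # insert b_1..b_j
        A[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                A[i][j] = A[i - 1][j - 1]
            else:
                A[i][j] = min(
                    A[i - 1][j] + dele,     # delete a_i
                    A[i - 1][j - 1] + sub,  # substitute a_i by b_j
                    A[i][j - 1] + ins,      # insert b_j
                )
    return A[m][n]


print(edit_distance("GTAAHTY", "TAHHYC"))
```

With unit costs, the four operations of the GTAAHTY to TAHHYC example turn out to be optimal.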

Page 27: Chapter 7

The RNA Maximum Base Pair Matching Problem

Ribonucleic acid (RNA) is a single strand of nucleotides (bases): adenine (A), guanine (G), cytosine (C) and uracil (U).

The sequence of the bases A, G, C and U is called the primary structure of an RNA.

The primary structure of an RNA can fold back on itself to form its secondary structure.

G and C can form a base pair G≡C by a triple-hydrogen bond.

A and U can form a base pair A=U by a double-hydrogen bond.

G and U can form a base pair G–U by a single hydrogen bond.

Page 28: Chapter 7

E.g. The primary structure of an RNA, R:

A–G–G–C–C–U–U–C–C–U

Six possible secondary structures of RNA, R:

[Figure: six possible secondary structures of R.]

Page 29: Chapter 7

An RNA sequence will be represented as a string of n characters $R = r_1 r_2 \cdots r_n$, where $r_i \in \{A, C, G, U\}$.

A secondary structure of R is a set S of base pairs $(r_i, r_j)$, where $1 \le i < j \le n$, such that:

(1) $j - i > t$, where t is a small positive constant. Typically, t = 3.

(2) If $(r_i, r_j)$ and $(r_k, r_l)$ are two base pairs in S and $i \le k$, then either
(a) i = k and j = l, i.e., $(r_i, r_j)$ and $(r_k, r_l)$ are the same base pair,
(b) i < j < k < l, i.e., $(r_i, r_j)$ precedes $(r_k, r_l)$, or
(c) i < k < l < j, i.e., $(r_i, r_j)$ includes $(r_k, r_l)$.

Pseudoknot: two base pairs $(r_i, r_j)$ and $(r_k, r_l)$ form a pseudoknot if i < k < j < l; condition (2) rules this out.

Due to different hydrogen bonds, the energies of base pairs are usually assigned different values. For example, reasonable values for G≡C, A=U and G–U are -3, -2 and -1, respectively.

Page 30: Chapter 7

Dynamic Programming

Given an RNA sequence $R = r_1 r_2 \cdots r_n$, find a secondary structure of RNA with the maximum number of base pairs.

Let $S_{i,j}$ denote the secondary structure with the maximum number of base pairs on the substring $R_{i,j} = r_i r_{i+1} \cdots r_j$.

Denote the number of matched base pairs in $S_{i,j}$ by $M_{i,j}$.

If $j - i \le 3$, then $r_i$ and $r_j$ cannot be a base pair of $S_{i,j}$ and $M_{i,j} = 0$.

Let WW = {(A, U), (U, A), (G, C), (C, G), (G, U), (U, G)}.

$\rho(r_i, r_j)$: whether two bases $r_i$ and $r_j$ can be a legal base pair:

$$\rho(r_i, r_j) = \begin{cases} 1 & \text{if } (r_i, r_j) \in WW \\ 0 & \text{otherwise} \end{cases}$$

Page 31: Chapter 7

To compute $M_{i,j}$, where $j - i > 3$, we consider the following cases from $r_j$'s point of view.

Case 1: In the optimal solution, $r_j$ is not paired with any other base. In this case, find an optimal solution for $r_i r_{i+1} \cdots r_{j-1}$ and $M_{i,j} = M_{i,j-1}$.

Page 32: Chapter 7

Case 2: In the optimal solution, $r_j$ is paired with $r_i$. In this case, find an optimal solution for $r_{i+1} r_{i+2} \cdots r_{j-1}$ and $M_{i,j} = 1 + M_{i+1,j-1}$.

Page 33: Chapter 7

Case 3: In the optimal solution, $r_j$ is paired with some $r_k$, where $i + 1 \le k \le j - 4$. In this case, find the optimal solutions for $r_i r_{i+1} \cdots r_{k-1}$ and $r_{k+1} r_{k+2} \cdots r_{j-1}$ and $M_{i,j} = 1 + M_{i,k-1} + M_{k+1,j-1}$.

Page 34: Chapter 7

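To find the k between i + 1 and j - 4 such that $M_{i,j}$ is the maximum, we have

$$M_{i,j} = \max_{i+1 \le k \le j-4} \{1 + M_{i,k-1} + M_{k+1,j-1}\}$$

Compute $M_{i,j}$ by the following recursive formula:

If $j - i \le 3$, then $M_{i,j} = 0$.

If $j - i > 3$, then

$$M_{i,j} = \max \begin{cases} M_{i,j-1} \\ (1 + M_{i+1,j-1}) \times \rho(r_i, r_j) \\ \max_{i+1 \le k \le j-4} \{(1 + M_{i,k-1} + M_{k+1,j-1}) \times \rho(r_k, r_j)\} \end{cases}$$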

Page 35: Chapter 7

The following table illustrates the computation of $M_{i,j}$, where $1 \le i < j \le 10$, for an RNA sequence $R_{1,10}$ = A–G–G–C–C–U–U–C–C–U.

Maximum number of base pairs in $S_{1,10}$ is 3 since $M_{1,10} = 3$.

i\j   1  2  3  4  5  6  7  8  9  10
1     0  0  0  0  0  1  2  2  2  3
2     -  0  0  0  0  1  1  2  2  2
3     -  -  0  0  0  0  1  1  1  1
4     -  -  -  0  0  0  0  0  0  0
5     -  -  -  -  0  0  0  0  0  0
6     -  -  -  -  -  0  0  0  0  0
7     -  -  -  -  -  -  0  0  0  0
8     -  -  -  -  -  -  -  0  0  0
9     -  -  -  -  -  -  -  -  0  0
10    -  -  -  -  -  -  -  -  -  0

Page 36: Chapter 7

Algorithm to Compute M1,n

Input: An RNA sequence $R = r_1 r_2 \cdots r_n$.

Output: A secondary structure of RNA with the maximum number of base pairs.

Step 1: /* Computation of ρ(ri, rj) function for 1 ≤ i < j ≤ n */
WW = {(A, U), (U, A), (G, C), (C, G), (G, U), (U, G)};
for i = 1 to n do
    for j = i to n do
        if (ri, rj) ∈ WW then ρ(ri, rj) = 1; else ρ(ri, rj) = 0;
    end for
end for

Page 37: Chapter 7

Step 2: /* Initialization of Mi,j for j - i ≤ 3 */
for i = 1 to n do
    for j = i to i + 3 do
        if j ≤ n then Mi,j = 0;
    end for
end for

Step 3: /* Calculation of Mi,j for j - i > 3 */
for h = 4 to n - 1 do
    for i = 1 to n - h do
        j = i + h;
        case1 = Mi,j-1;
        case2 = (1 + Mi+1,j-1) × ρ(ri, rj);
        case3 = $\max_{i+1 \le k \le j-4} \{(1 + M_{i,k-1} + M_{k+1,j-1}) \times \rho(r_k, r_j)\}$;
        Mi,j = max{case1, case2, case3};
    end for
end for

Time complexity: $O(n^3)$.
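Steps 1 to 3 can be sketched as follows. The function name `max_base_pairs` is illustrative; the sketch evaluates ρ on demand instead of tabulating it in a separate step, and it returns only $M_{1,n}$ rather than the structure itself.

```python
WW = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def rho(x: str, y: str) -> int:
    """1 if x and y can form a legal base pair, else 0."""
    return 1 if (x, y) in WW else 0


def max_base_pairs(r: str) -> int:
    """Maximum number of base pairs in a secondary structure of r."""
    n = len(r)
    # M is indexed from 1 as in the slides; entries with j - i <= 3 stay 0.
    M = [[0] * (n + 2) for _ in range(n + 2)]
    for h in range(4, n):
        for i in range(1, n - h + 1):
            j = i + h
            case1 = M[i][j - 1]
            case2 = (1 + M[i + 1][j - 1]) * rho(r[i - 1], r[j - 1])
            # k runs over i+1 .. j-4; the range is empty when h = 4.
            case3 = max(
                ((1 + M[i][k - 1] + M[k + 1][j - 1]) * rho(r[k - 1], r[j - 1])
                 for k in range(i + 1, j - 3)),
                default=0,
            )
            M[i][j] = max(case1, case2, case3)
    return M[1][n]


print(max_base_pairs("AGGCCUUCCU"))
```

There are O(n^2) entries and case 3 scans O(n) values of k for each, which gives the O(n^3) bound.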

Page 38: Chapter 7

0/1 Knapsack Problem

n objects, weights $W_1, W_2, \ldots, W_n$

profits $P_1, P_2, \ldots, P_n$

capacity M

maximize $\sum_{1 \le i \le n} P_i x_i$

subject to $\sum_{1 \le i \le n} W_i x_i \le M$, $x_i = 0$ or 1, $1 \le i \le n$

e.g.

i    Wi   Pi
1    10   40
2     3   20
3     5   30

M = 10

Page 39: Chapter 7

The Multi-Stage Graph Solution

The 0/1 knapsack problem can be described by a multi-stage graph.

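[Figure: multi-stage graph from S to T. Stage i branches on $x_i = 1$ or $x_i = 0$, and each node is labeled by the decisions made so far: 1, 0; then 10, 00, 01; then 100, 010, 011, 000, 001. An edge for $x_i = 1$ carries the profit $P_i$ (40, 20 or 30) and an edge for $x_i = 0$ carries 0.]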

Page 40: Chapter 7

The Dynamic Programming Approach

The longest path represents the optimal solution:

x1 = 0, x2 = 1, x3 = 1

$\sum P_i x_i = 20 + 30 = 50$

Let $f_i(Q)$ be the value of an optimal solution to objects 1, 2, 3, …, i with capacity Q.

$f_i(Q) = \max\{ f_{i-1}(Q),\ f_{i-1}(Q - W_i) + P_i \}$

The optimal solution is $f_n(M)$.
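The recurrence for $f_i(Q)$ can be sketched as a table over objects and capacities. This assumes integer weights; the function name `knapsack` and the trace back that recovers the $x_i$ are additions for the example.

```python
def knapsack(weights, profits, capacity):
    """Return (f_n(M), [x_1, ..., x_n]) for the 0/1 knapsack problem."""
    n = len(weights)
    # f[i][q]: best profit using objects 1..i with capacity q.
    f = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, p = weights[i - 1], profits[i - 1]
        for q in range(capacity + 1):
            f[i][q] = f[i - 1][q]  # x_i = 0
            if w <= q:             # x_i = 1 is feasible
                f[i][q] = max(f[i][q], f[i - 1][q - w] + p)
    # Object i was taken exactly when dropping it changes the optimum.
    x, q = [0] * n, capacity
    for i in range(n, 0, -1):
        if f[i][q] != f[i - 1][q]:
            x[i - 1] = 1
            q -= weights[i - 1]
    return f[n][capacity], x


print(knapsack([10, 3, 5], [40, 20, 30], 10))
```

The table has (n + 1)(M + 1) entries, so the running time is O(nM), which depends on the numeric value of the capacity.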

Page 41: Chapter 7

Optimal Binary Search Trees

e.g. binary search trees for 3, 7, 9, 12;

[Figure: four binary search trees, labeled (a) to (d), for the keys 3, 7, 9, 12.]

Page 42: Chapter 7

Optimal Binary Search Trees

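n identifiers: $a_1 < a_2 < a_3 < \cdots < a_n$

$P_i$, $1 \le i \le n$: the probability that $a_i$ is searched.

$Q_i$, $0 \le i \le n$: the probability that x is searched, where $a_i < x < a_{i+1}$ ($a_0 = -\infty$, $a_{n+1} = \infty$).

$$\sum_{i=1}^{n} P_i + \sum_{i=0}^{n} Q_i = 1$$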

Page 43: Chapter 7

Identifiers : 4, 5, 8, 10, 11, 12, 14

Internal node: successful search, Pi

External node: unsuccessful search, Qi

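[Figure: a binary search tree with root 10. The left child 5 has children 4 (external children E0, E1) and 8 (external children E2, E3). The right child 14 has left child 11 and external right child E7; 11 has external left child E4 and right child 12, whose external children are E5 and E6.]

The expected cost of a binary tree:

$$\sum_{i=1}^{n} P_i \cdot \mathrm{level}(a_i) + \sum_{i=0}^{n} Q_i \cdot (\mathrm{level}(E_i) - 1)$$

The level of the root: 1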

Page 44: Chapter 7

The Dynamic Programming Approach

Let C(i, j) denote the cost of an optimal binary search tree containing ai,…,aj .

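The cost of the optimal binary search tree with $a_k$ as its root:

[Figure: root $a_k$. The left subtree holds $a_1 \ldots a_{k-1}$ with probabilities $P_1 \ldots P_{k-1}$ and $Q_0 \ldots Q_{k-1}$ and has cost C(1, k-1); the right subtree holds $a_{k+1} \ldots a_n$ with probabilities $P_{k+1} \ldots P_n$ and $Q_k \ldots Q_n$ and has cost C(k+1, n).]

$$C(1, n) = \min_{1 \le k \le n} \left\{ P_k + \left[ Q_0 + \sum_{m=1}^{k-1} (P_m + Q_m) + C(1, k-1) \right] + \left[ Q_k + \sum_{m=k+1}^{n} (P_m + Q_m) + C(k+1, n) \right] \right\}$$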

Page 45: Chapter 7

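General Formula

$$C(i, j) = \min_{i \le k \le j} \left\{ P_k + \left[ Q_{i-1} + \sum_{m=i}^{k-1} (P_m + Q_m) + C(i, k-1) \right] + \left[ Q_k + \sum_{m=k+1}^{j} (P_m + Q_m) + C(k+1, j) \right] \right\}$$

$$= \min_{i \le k \le j} \left\{ C(i, k-1) + C(k+1, j) \right\} + Q_{i-1} + \sum_{m=i}^{j} (P_m + Q_m)$$

[Figure: root $a_k$. The left subtree holds $a_i \ldots a_{k-1}$ with probabilities $P_i \ldots P_{k-1}$ and $Q_{i-1} \ldots Q_{k-1}$ and has cost C(i, k-1); the right subtree holds $a_{k+1} \ldots a_j$ with probabilities $P_{k+1} \ldots P_j$ and $Q_k \ldots Q_j$ and has cost C(k+1, j).]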

Page 46: Chapter 7

Computation Relationships of Subtrees

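e.g. n = 4

[Figure: C(1, 4) depends on C(1, 3) and C(2, 4), which in turn depend on C(1, 2), C(2, 3) and C(3, 4).]

Time complexity: $O(n^3)$

(n - m) C(i, j)'s are computed when j - i = m.

Each C(i, j) with j - i = m can be computed in O(m) time.

$$O\left(\sum_{1 \le m \le n} m(n - m)\right) = O(n^3)$$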
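The general formula for C(i, j) can be sketched by filling the table in order of increasing j - i. The function name `optimal_bst_cost` is illustrative, and the weights in the usage line are hypothetical integer frequencies (not normalized probabilities) chosen so the arithmetic stays exact.

```python
def optimal_bst_cost(P, Q):
    """Cost C(1, n) of an optimal binary search tree.

    P[1..n] are the weights of successful searches (P[0] is unused) and
    Q[0..n] are the weights of unsuccessful searches.
    """
    n = len(Q) - 1
    # C[i][j] for 1 <= i <= n + 1 and i - 1 <= j <= n; C[i][i-1] = 0 (empty tree).
    C = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(1, n + 1):        # length = j - i + 1
        for i in range(1, n - length + 2):
            j = i + length - 1
            weight = Q[i - 1] + sum(P[m] + Q[m] for m in range(i, j + 1))
            C[i][j] = weight + min(
                C[i][k - 1] + C[k + 1][j] for k in range(i, j + 1)
            )
    return C[1][n]


# Hypothetical example with n = 4 identifiers.
print(optimal_bst_cost([0, 3, 3, 1, 1], [2, 3, 1, 1, 1]))
```

Dividing the weights and the result by the total weight (16 in this example) gives the expected cost in terms of probabilities.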

