10/9/2014
1
15.2 Matrix-chain multiplication
• Given a sequence (chain) ⟨A_1, A_2, …, A_n⟩ of n matrices, we wish to compute the product A_1 A_2 ⋯ A_n
• We can evaluate the expression using the standard algorithm for multiplying pairs of matrices once we have parenthesized it to resolve all ambiguities in how the matrices are multiplied together
• Matrix multiplication is associative, and so all parenthesizations yield the same product
9-Oct-14 MAT-72006 AA+DS, Fall 2014 472
• A product of matrices is fully parenthesized if it is
  – either a single matrix or
  – the product of two fully parenthesized matrix products, surrounded by parentheses
• For example, we can fully parenthesize the product A_1 A_2 A_3 A_4 in five distinct ways:
  (A_1 (A_2 (A_3 A_4)))
  (A_1 ((A_2 A_3) A_4))
  ((A_1 A_2) (A_3 A_4))
  ((A_1 (A_2 A_3)) A_4)
  (((A_1 A_2) A_3) A_4)
• How we parenthesize a chain of matrices has a dramatic impact on the cost of product evaluation
• Standard algorithm for multiplying two matrices:

MATRIX-MULTIPLY(A, B)
1. if A.columns ≠ B.rows
2.   error “incompatible dimensions”
3. else let C be a new A.rows × B.columns matrix
4.   for i = 1 to A.rows
5.     for j = 1 to B.columns
6.       c_ij = 0
7.       for k = 1 to A.columns
8.         c_ij = c_ij + a_ik · b_kj
9.   return C
• We can multiply two matrices A and B only if they are compatible: the number of columns of A must equal the number of rows of B
• If A is a p × q matrix and B is a q × r matrix, the resulting matrix C is a p × r matrix
• The time to compute C is dominated by the number of scalar multiplications in line 8, which is p · q · r
• We shall express costs in terms of the number of scalar multiplications
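The pseudocode above translates directly into Python. The following sketch (matrices as lists of rows, plus a counter for scalar multiplications; the function name and the counter are additions for illustration) makes the p · q · r cost of line 8 concrete:

```python
def matrix_multiply(A, B):
    """Multiply A (p x q) by B (q x r), with matrices given as lists of rows.

    Returns the p x r product C together with the number of scalar
    multiplications performed, which is always p * q * r.
    """
    p, q = len(A), len(A[0])
    if q != len(B):
        raise ValueError("incompatible dimensions")
    r = len(B[0])
    C = [[0] * r for _ in range(p)]
    mults = 0
    for i in range(p):
        for j in range(r):
            for k in range(q):  # the scalar multiplication of line 8
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults
```

Multiplying a 2 × 3 matrix by a 3 × 2 matrix, for instance, performs exactly 2 · 3 · 2 = 12 scalar multiplications.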
• Consider the matrix product of a chain of three matrices ⟨A_1, A_2, A_3⟩ with dimensions 10 × 100, 100 × 5, and 5 × 50
• If we apply the parenthesization ((A_1 A_2) A_3), we perform 10 · 100 · 5 = 5000 scalar multiplications to compute the 10 × 5 matrix product A_1 A_2, plus 10 · 5 · 50 = 2500 further ones to multiply this matrix by A_3: a total of 7500 scalar multiplications
• If instead we use (A_1 (A_2 A_3)), we perform 100 · 5 · 50 = 25,000 scalar multiplications to compute the 100 × 50 matrix product A_2 A_3, plus another 10 · 100 · 50 = 50,000 scalar multiplications to multiply A_1 by this matrix: a total of 75,000 scalar multiplications
• Thus, computing the product according to the first parenthesization is 10 times faster
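The two totals above can be checked mechanically from the dimension sequence alone (variable names here are for illustration):

```python
# Dimensions of the chain <A1, A2, A3>: A1 is 10x100, A2 is 100x5, A3 is 5x50.
p = [10, 100, 5, 50]

# ((A1 A2) A3): A1*A2 costs 10*100*5 and yields a 10x5 matrix,
# then (A1 A2)*A3 costs 10*5*50.
cost_left = p[0] * p[1] * p[2] + p[0] * p[2] * p[3]

# (A1 (A2 A3)): A2*A3 costs 100*5*50 and yields a 100x50 matrix,
# then A1*(A2 A3) costs 10*100*50.
cost_right = p[1] * p[2] * p[3] + p[0] * p[1] * p[3]

print(cost_left, cost_right)  # 7500 75000
```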
• Matrix-chain multiplication problem: given a chain ⟨A_1, A_2, …, A_n⟩ of n matrices, where matrix A_i has dimension p_{i−1} × p_i, fully parenthesize the product A_1 A_2 ⋯ A_n in a way that minimizes the number of scalar multiplications
• We are not actually multiplying matrices
• Our goal is only to determine an order for multiplying matrices that has the lowest cost
• Typically, the time invested in determining this optimal order is more than paid for by the time saved later on when actually performing the matrix multiplications
Counting the number of parenthesizations
• Exhaustive checking of all possible parenthesization combinations doesn’t yield an efficient algorithm
• Let the number of alternative parenthesizations of a sequence of n matrices be P(n)
• When n = 1, we have just one matrix and only one way to fully parenthesize the matrix product
• When n ≥ 2, a fully parenthesized matrix product is the product of two fully parenthesized matrix subproducts, and the split between the two may occur between the k-th and (k+1)st matrices for any k = 1, 2, …, n − 1
• Thus, we obtain the recurrence

  P(n) = 1                           if n = 1
  P(n) = Σ_{k=1}^{n−1} P(k) P(n−k)   if n ≥ 2

• Exercise: show that the solution to the recurrence is Ω(2^n)
• The number of solutions is exponential in n
• The brute-force method of exhaustive search makes for a poor strategy when determining how to optimally parenthesize a matrix chain
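The recurrence can be transcribed directly; memoization (assumed here purely so small values can be tabulated quickly) keeps the computation manageable:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(n):
    """Number of distinct full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # The outermost split falls between the k-th and (k+1)st matrices.
    return sum(P(k) * P(n - k) for k in range(1, n))
```

P(4) = 5 matches the five parenthesizations of A_1 A_2 A_3 A_4 listed earlier; the values 1, 1, 2, 5, 14, 42, … are the Catalan numbers, which indeed grow as Ω(2^n).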
Applying dynamic programming
Step 1: The structure of an optimal parenthesization
• Let us adopt the notation A_{i..j}, for i ≤ j, for the matrix that results from evaluating the product A_i A_{i+1} ⋯ A_j
• Observe that if i < j, then to parenthesize the product A_i A_{i+1} ⋯ A_j, we must split the product between A_k and A_{k+1} for some integer k in the range i ≤ k < j
• I.e., for some value of k, we first compute the matrices A_{i..k} and A_{k+1..j} and then multiply them together to produce the final product A_{i..j}
• The cost: the cost of computing the matrix A_{i..k}, plus the cost of computing A_{k+1..j}, plus the cost of multiplying them together
• The optimal substructure of this problem:
• Suppose that to optimally parenthesize A_i A_{i+1} ⋯ A_j, we split the product between A_k and A_{k+1}
• The way we parenthesize A_i A_{i+1} ⋯ A_k within this optimal parenthesization of A_i A_{i+1} ⋯ A_j must be an optimal parenthesization of A_i A_{i+1} ⋯ A_k
• If there were a less costly way to parenthesize A_i A_{i+1} ⋯ A_k, then we could substitute that parenthesization in the optimal parenthesization of A_i A_{i+1} ⋯ A_j to produce a way to parenthesize A_i A_{i+1} ⋯ A_j with lower cost than the optimum: a contradiction
• We can construct an optimal solution to the problem from optimal solutions to subproblems
• A solution to a nontrivial instance requires us to split the product, and any optimal solution contains within it optimal solutions to subproblem instances
• We can build an optimal solution by splitting the problem into two subproblems (optimally parenthesizing A_i A_{i+1} ⋯ A_k and A_{k+1} A_{k+2} ⋯ A_j), finding optimal solutions to these, and then combining these optimal subproblem solutions
• We must ensure that when we search for the correct place to split the product, we have considered all possible places, so that we are sure of having examined the optimal one
Step 2: A recursive solution
• Next, we define the cost of an optimal solution recursively in terms of the optimal solutions to subproblems
• For the matrix-chain multiplication problem, we pick as our subproblems the problems of determining the minimum cost of parenthesizing A_i A_{i+1} ⋯ A_j for 1 ≤ i ≤ j ≤ n
• Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix A_{i..j}; for the full problem, the lowest-cost way to compute A_{1..n} would thus be m[1, n]
• If i = j, no scalar multiplications are necessary
• Thus, m[i, i] = 0 for i = 1, 2, …, n
• To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution from step 1
• Assume that to optimally parenthesize, we split the product A_i A_{i+1} ⋯ A_j between A_k and A_{k+1}, where i ≤ k < j
• Then m[i, j] = the minimum cost for computing the subproducts A_{i..k} and A_{k+1..j}, plus the cost of multiplying these two matrices together
• Recalling that each matrix A_i is p_{i−1} × p_i, we see that computing the matrix product A_{i..k} A_{k+1..j} takes p_{i−1} p_k p_j scalar multiplications
• Thus, we obtain

  m[i, j] = m[i, k] + m[k+1, j] + p_{i−1} p_k p_j

• This recursive equation assumes that we know the value of k, which we do not
• There are only j − i possible values for k, however, namely k = i, i+1, …, j − 1
• Since the optimal parenthesization must use one of these values for k, we need only check them all to find the best
• Thus, our recursive definition for the minimum cost of parenthesizing the product A_i A_{i+1} ⋯ A_j becomes

  m[i, j] = 0                                                          if i = j
  m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i−1} p_k p_j }  if i < j
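The recurrence transcribes directly into a top-down recursion. The sketch below (function names are mine) memoizes it so that each (i, j) pair is solved only once, previewing the tabular idea developed next:

```python
from functools import lru_cache

def matrix_chain_cost(p):
    """Minimum scalar multiplications for A_1 ... A_n,
    where A_i has dimensions p[i-1] x p[i]."""
    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0
        # Try every split point k with i <= k < j, as in the recurrence.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))
    return m(1, len(p) - 1)
```

For the earlier three-matrix example, `matrix_chain_cost([10, 100, 5, 50])` returns 7500, the cost of the better parenthesization.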
• The m[i, j] values give the costs of optimal solutions to subproblems, but they do not provide an optimal solution
• To help us construct one, we define s[i, j] to be a value of k at which we split the product A_i A_{i+1} ⋯ A_j in an optimal parenthesization
• That is, s[i, j] equals a value k such that

  m[i, j] = m[i, k] + m[k+1, j] + p_{i−1} p_k p_j
Step 3: Computing the optimal costs
• We could easily write a recursive algorithm based on the recurrence to compute the minimum cost m[1, n] for multiplying A_1 A_2 ⋯ A_n
• This algorithm takes exponential time
• It is no better than the brute-force method of checking each way of parenthesizing a product
• We have relatively few distinct subproblems: one subproblem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n, or Θ(n²) in all
• We encounter each subproblem many times in different branches of its recursion tree
• We implement the tabular, bottom-up method in MATRIX-CHAIN-ORDER, which assumes that matrix A_i has dimensions p_{i−1} × p_i for i = 1, 2, …, n
• Its input is a sequence p = ⟨p_0, p_1, …, p_n⟩, where p.length = n + 1
• The procedure uses
  – table m[1..n, 1..n] for storing the m[i, j] costs and
  – table s[1..n − 1, 2..n], which records which index k achieved the optimal cost in computing m[i, j]
• We use the table s to construct an optimal solution
• By the recurrence, the cost m[i, j] of computing a matrix-chain product of j − i + 1 matrices depends only on the costs of computing matrix-chain products of fewer than j − i + 1 matrices
• I.e., for k = i, i + 1, …, j − 1, the matrix A_{i..k} is a product of k − i + 1 < j − i + 1 matrices and A_{k+1..j} is a product of j − k < j − i + 1 matrices
• Thus, we should fill in the table m in a manner that corresponds to solving the problem on matrix chains of increasing length
• For the subchain A_i A_{i+1} ⋯ A_j, we consider the subproblem size to be the length j − i + 1 of the chain
MATRIX-CHAIN-ORDER(p)
1. n = p.length − 1
2. let m[1..n, 1..n] and s[1..n − 1, 2..n] be new tables
3. for i = 1 to n
4.   m[i, i] = 0
5. for l = 2 to n        // l is the chain length
6.   for i = 1 to n − l + 1
7.     j = i + l − 1
8.     m[i, j] = ∞
9.     for k = i to j − 1
10.      q = m[i, k] + m[k + 1, j] + p_{i−1} p_k p_j
11.      if q < m[i, j]
12.        m[i, j] = q
13.        s[i, j] = k
14. return m and s
• The algorithm computes in lines 5–13 the minimum costs m[i, i + 1] for chains of length l = 2, for i = 1, 2, …, n − 1, during the first execution of the for loop
• The second time through the loop, it computes the minimum costs m[i, i + 2] for chains of length l = 3, and so forth
• At each step, the m[i, j] cost computed in lines 10–13 depends only on table entries m[i, k] and m[k + 1, j] already computed
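MATRIX-CHAIN-ORDER carries over to Python almost line for line. In this sketch the 0-based lists are padded so the pseudocode's 1-based indices can be kept (the function name is mine):

```python
def matrix_chain_order(p):
    """Bottom-up MATRIX-CHAIN-ORDER; A_i has dimensions p[i-1] x p[i].

    Returns tables m and s, indexed m[i][j] and s[i][j] for 1 <= i <= j <= n;
    index 0 is unused padding so that the 1-based pseudocode indices carry over.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                   # l is the chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For the six-matrix example that follows, with p = ⟨30, 35, 15, 5, 10, 20, 25⟩, this yields m[1][6] = 15,125 and m[2][5] = 7125.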
[Figure: the m and s tables computed by MATRIX-CHAIN-ORDER for n = 6 matrices with dimension sequence p = ⟨30, 35, 15, 5, 10, 20, 25⟩]
• The call PRINT-OPTIMAL-PARENS(s, 1, 6) prints the parenthesization ((A_1 (A_2 A_3)) ((A_4 A_5) A_6))
• The minimum number of scalar multiplications to multiply the 6 matrices is m[1, 6] = 15,125
• Of the darker entries, the pairs that have the same shading are taken together in line 10 when computing

  m[2, 5] = min of
    m[2, 2] + m[3, 5] + p_1 p_2 p_5 = 0 + 2500 + 35 · 15 · 20 = 13,000
    m[2, 3] + m[4, 5] + p_1 p_3 p_5 = 2625 + 1000 + 35 · 5 · 20 = 7125
    m[2, 4] + m[5, 5] + p_1 p_4 p_5 = 4375 + 0 + 35 · 10 · 20 = 11,375
  = 7125
• The nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n³) for the algorithm
  – The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values
• The running time of this algorithm is in fact also Ω(n³)
• The algorithm requires Θ(n²) space to store the m and s tables
• MATRIX-CHAIN-ORDER is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one
Step 4: Constructing an optimal solution
• Table s[1..n − 1, 2..n] gives us the information we need to multiply the matrices in an optimal order
• Entry s[i, j] records a value of k s.t. an optimal parenthesization of A_i A_{i+1} ⋯ A_j splits the product between A_k and A_{k+1}
• Thus, we know that the final matrix multiplication in computing A_{1..n} optimally is A_{1..s[1,n]} A_{s[1,n]+1..n}
• s[1, s[1, n]] determines the last matrix multiplication when computing A_{1..s[1,n]}, and s[s[1, n] + 1, n] determines the last matrix multiplication when computing A_{s[1,n]+1..n}
• The following procedure prints an optimal parenthesization of ⟨A_i, A_{i+1}, …, A_j⟩, given the s table computed by MATRIX-CHAIN-ORDER and the indices i and j
• The call PRINT-OPTIMAL-PARENS(s, 1, n) prints an optimal parenthesization of ⟨A_1, A_2, …, A_n⟩

PRINT-OPTIMAL-PARENS(s, i, j)
1. if i == j
2.   print “A_i”
3. else print “(”
4.   PRINT-OPTIMAL-PARENS(s, i, s[i, j])
5.   PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
6.   print “)”
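A Python sketch of the same recursion, building the string rather than printing so the result can be inspected. The dict representation of s and the hard-coded table for the earlier three-matrix example are illustrative choices of mine:

```python
def optimal_parens(s, i, j):
    """Return the optimal parenthesization of A_i ... A_j as a string,
    given a split table s with s[(i, j)] = the optimal split index k."""
    if i == j:
        return "A%d" % i
    k = s[(i, j)]
    return "(" + optimal_parens(s, i, k) + " " \
               + optimal_parens(s, k + 1, j) + ")"

# Split table for the earlier 3-matrix example (10x100, 100x5, 5x50),
# whose optimal order is ((A1 A2) A3):
s = {(1, 3): 2, (1, 2): 1}
print(optimal_parens(s, 1, 3))  # ((A1 A2) A3)
```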
15.4 Longest common subsequence
• Biological applications often need to compare the DNA of two (or more) different organisms
• A strand of DNA consists of a string of molecules called bases, where the possible bases are adenine, guanine, cytosine, and thymine
• We express a strand of DNA as a string over the alphabet {A, C, G, T}
• E.g., the DNA of two organisms may be
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
• By comparing two strands of DNA we determine how “similar” they are, as some measure of how closely related the two organisms are
• We can define similarity in many different ways
• E.g., we can say that two DNA strands are similar if one is a substring of the other
  – Neither S_1 nor S_2 is a substring of the other
• Alternatively, we could say that two strands are similar if the number of changes needed to turn one into the other is small
• Yet another way to measure the similarity of S_1 and S_2 is by finding a third strand S_3 in which the bases in S_3 appear in each of S_1 and S_2
  – these bases must appear in the same order, but not necessarily consecutively
• The longer the strand S_3 we can find, the more similar S_1 and S_2 are
• In our example, the longest such strand is
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
  – S_3 = GTCGTCGGAAGCCGGCCGAA
• We formalize this last notion of similarity as the longest-common-subsequence problem
• A subsequence is just the given sequence with zero or more elements left out
• Formally, given a sequence X = ⟨x_1, x_2, …, x_m⟩, another sequence Z = ⟨z_1, z_2, …, z_k⟩ is a subsequence of X if there exists a strictly increasing sequence ⟨i_1, i_2, …, i_k⟩ of indices of X such that for all j = 1, 2, …, k, we have x_{i_j} = z_j
• For example, Z = ⟨B, C, D, B⟩ is a subsequence of X = ⟨A, B, C, B, D, A, B⟩ with corresponding index sequence ⟨2, 3, 5, 7⟩
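The definition suggests a simple linear-time membership check; this sketch (the function name is mine) scans X once, greedily matching the elements of Z in order:

```python
def is_subsequence(Z, X):
    """True iff sequence Z can be obtained from X by deleting elements."""
    it = iter(X)
    # 'z in it' advances the iterator past each match, so the matched
    # positions in X are strictly increasing, as the definition requires.
    return all(z in it for z in Z)
```

For the example above, `is_subsequence("BCDB", "ABCBDAB")` is True.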
• We say that a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y
• For example, if X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩, the sequence ⟨B, C, A⟩ is a common subsequence of both X and Y
• ⟨B, C, A⟩ is not a longest common subsequence (LCS) of X and Y
• The sequence ⟨B, C, B, A⟩, which is also common to both X and Y, has length 4
• This sequence is an LCS of X and Y, as is ⟨B, D, A, B⟩; X and Y have no common subsequence of length 5 or greater
• In the longest-common-subsequence problem, we are given sequences X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ and wish to find a maximum-length common subsequence of X and Y

Step 1: Characterizing a longest common subsequence
• In a brute-force approach, we would enumerate all subsequences of X and check each of them to see whether it is also a subsequence of Y, keeping track of the longest subsequence we find
• Each subsequence of X corresponds to a subset of the indices {1, 2, …, m} of X
• Because X has 2^m subsequences, this approach requires exponential time, making it impractical for long sequences
• The LCS problem has an optimal-substructure property, however, as the following theorem shows
• We shall see that the natural classes of subproblems correspond to pairs of “prefixes” of the two input sequences
• Precisely, given a sequence X = ⟨x_1, x_2, …, x_m⟩, we define the i-th prefix of X, for i = 0, 1, …, m, as X_i = ⟨x_1, x_2, …, x_i⟩
• For example, if X = ⟨A, B, C, B, D, A, B⟩, then X_4 = ⟨A, B, C, B⟩ and X_0 is the empty sequence
Theorem 15.1 (Optimal substructure of an LCS)
Let X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ be sequences, and let Z = ⟨z_1, z_2, …, z_k⟩ be any LCS of X and Y.
1. If x_m = y_n, then z_k = x_m = y_n and Z_{k−1} is an LCS of X_{m−1} and Y_{n−1}.
2. If x_m ≠ y_n, then z_k ≠ x_m implies that Z is an LCS of X_{m−1} and Y.
3. If x_m ≠ y_n, then z_k ≠ y_n implies that Z is an LCS of X and Y_{n−1}.
Proof (1) If z_k ≠ x_m, then we could append x_m = y_n to Z to obtain a common subsequence of X and Y of length k + 1, contradicting the supposition that Z is an LCS of X and Y. Thus, we must have z_k = x_m = y_n. Now, the prefix Z_{k−1} is a length-(k − 1) common subsequence of X_{m−1} and Y_{n−1}. We wish to show that it is an LCS. Suppose for the purpose of contradiction that there exists a common subsequence W of X_{m−1} and Y_{n−1} with length greater than k − 1. Then, appending x_m = y_n to W produces a common subsequence of X and Y whose length is greater than k, which is a contradiction.
(2) If z_k ≠ x_m, then Z is a common subsequence of X_{m−1} and Y. If there were a common subsequence W of X_{m−1} and Y with length greater than k, then W would also be a common subsequence of X and Y, contradicting the assumption that Z is an LCS of X and Y.
(3) The proof is symmetric to (2). ∎
• Theorem 15.1 tells us that an LCS of two sequences contains within it an LCS of prefixes of the two sequences
• Thus, the LCS problem has an optimal-substructure property
Step 2: A recursive solution
• We examine either one or two subproblems when finding an LCS of X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩
• If x_m = y_n, we find an LCS of X_{m−1} and Y_{n−1}
• Appending x_m = y_n to this LCS yields an LCS of X and Y
• If x_m ≠ y_n, then we (1) find an LCS of X_{m−1} and Y and (2) find an LCS of X and Y_{n−1}
• Whichever of these two LCSs is longer is an LCS of X and Y
• These cases exhaust all possibilities, and we know that one of the optimal subproblem solutions must appear within an LCS of X and Y
• To find an LCS of X and Y, we may need to find the LCSs of X and Y_{n−1} and of X_{m−1} and Y
• Each of these subproblems has the subsubproblem of finding an LCS of X_{m−1} and Y_{n−1}
• Many other subproblems share subsubproblems
• As in the matrix-chain multiplication problem, a recursive solution to the LCS problem involves a recurrence for the value of an optimal solution
• Let us define c[i, j] to be the length of an LCS of the prefixes X_i and Y_j
• If either i = 0 or j = 0, one of the sequences has length 0, and so the LCS has length 0
• The optimal substructure of the LCS problem gives

  c[i, j] = 0                            if i = 0 or j = 0
  c[i, j] = c[i−1, j−1] + 1              if i, j > 0 and x_i = y_j
  c[i, j] = max(c[i, j−1], c[i−1, j])    if i, j > 0 and x_i ≠ y_j
• Observe that a condition in the problem restricts which subproblems we may consider
• When x_i = y_j, we consider finding an LCS of X_{i−1} and Y_{j−1}
• Otherwise, we instead consider the two subproblems of finding an LCS of X_i and Y_{j−1} and of X_{i−1} and Y_j
• In the previous dynamic-programming algorithms, for rod cutting and matrix-chain multiplication, we ruled out no subproblems due to conditions in the problem
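The c[i, j] recurrence, transcribed as a memoized top-down recursion (the function name is mine; the 1-based prefix indices of the recurrence map to 0-based string positions inside):

```python
from functools import lru_cache

def lcs_length_recursive(X, Y):
    """Length of an LCS of X and Y via the c[i, j] recurrence."""
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:
            return 0
        if X[i - 1] == Y[j - 1]:          # x_i = y_j
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))
    return c(len(X), len(Y))
```

For the running example, `lcs_length_recursive("ABCBDAB", "BDCABA")` returns 4, matching the LCSs ⟨B, C, B, A⟩ and ⟨B, D, A, B⟩ found earlier.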
Step 3: Computing the length of an LCS
• Since the LCS problem has only Θ(mn) distinct subproblems, we can use dynamic programming to compute the solutions bottom up
• LCS-LENGTH stores the c[i, j] values in c[0..m, 0..n], and it computes the entries in row-major order
  – I.e., the procedure fills in the first row of c from left to right, then the second row, and so on
• The procedure also maintains the table b[1..m, 1..n]
• Intuitively, b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j]
• c[m, n] contains the length of an LCS of X and Y
LCS-LENGTH(X, Y)
1. m = X.length
2. n = Y.length
3. let b[1..m, 1..n] and c[0..m, 0..n] be new tables
4. for i = 1 to m
5.   c[i, 0] = 0
6. for j = 0 to n
7.   c[0, j] = 0
8. for i = 1 to m
9.   for j = 1 to n
10.    if x_i == y_j
11.      c[i, j] = c[i−1, j−1] + 1
12.      b[i, j] = “↖”
13.    elseif c[i−1, j] ≥ c[i, j−1]
14.      c[i, j] = c[i−1, j]
15.      b[i, j] = “↑”
16.    else c[i, j] = c[i, j−1]
17.      b[i, j] = “←”
18. return c and b

Running time: Θ(mn)
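A direct Python transcription of LCS-LENGTH, with the string labels "NW", "UP", "LEFT" standing in for the arrows ↖, ↑, ← (those labels and the function name are representational choices of mine):

```python
def lcs_length(X, Y):
    """Bottom-up LCS-LENGTH: returns the c and b tables.

    c[i][j] is the LCS length of prefixes X_i and Y_j; b[i][j] is one of
    "NW", "UP", "LEFT", standing in for the arrows of the pseudocode.
    """
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 / column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:            # x_i == y_j
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "UP"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "LEFT"
    return c, b
```

With X = "ABCBDAB" and Y = "BDCABA", the returned table has c[7][6] = 4, the LCS length of the running example.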
[Figure: the c and b tables computed by LCS-LENGTH on X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩]
Step 4: Constructing an LCS
• The b table returned by LCS-LENGTH enables us to quickly construct an LCS of X and Y
• We simply begin at b[m, n] and trace through the table by following the arrows
• Whenever we encounter a “↖” in entry b[i, j], it implies that x_i = y_j is an element of the LCS that LCS-LENGTH found
• With this method, we encounter the elements of this LCS in reverse order
• A recursive procedure prints out an LCS of X and Y in the proper, forward order
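The recursive trace-back can be sketched as follows. The block first rebuilds the c and b tables (the same computation LCS-LENGTH performs, repeated so the sketch is self-contained) and then follows the arrows, returning the LCS as a string rather than printing it; names and the "NW"/"UP"/"LEFT" arrow encoding are mine:

```python
def lcs_length(X, Y):
    """Bottom-up c and b tables; "NW"/"UP"/"LEFT" stand in for the arrows."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "UP"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "LEFT"
    return c, b

def print_lcs(b, X, i, j):
    """Return (rather than print) an LCS of X_i and Y_j by following b.

    Recursing first and appending x_i afterwards yields the elements
    in forward order, exactly as described above.
    """
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "NW":                    # x_i = y_j is in the LCS
        return print_lcs(b, X, i - 1, j - 1) + X[i - 1]
    if b[i][j] == "UP":
        return print_lcs(b, X, i - 1, j)
    return print_lcs(b, X, i, j - 1)
```

With X = "ABCBDAB" and Y = "BDCABA", the call `print_lcs(b, X, 7, 6)` yields "BCBA" under this tie-breaking (the ↑ arrow is preferred on ties).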
• The square in row i and column j contains the value of c[i, j] and the appropriate arrow for the value of b[i, j]
• The entry 4 in c[7, 6], the lower right-hand corner of the table, is the length of an LCS
• For i, j > 0, entry c[i, j] depends only on whether x_i = y_j and the values in entries c[i−1, j], c[i, j−1], and c[i−1, j−1], which are computed before c[i, j]
• To reconstruct the elements of an LCS, follow the b[i, j] arrows from the lower right-hand corner
• Each “↖” on the shaded sequence corresponds to an entry (highlighted) for which x_i = y_j is a member of an LCS
Improving the code
• Each entry c[i, j] depends on only 3 other table entries: c[i−1, j], c[i, j−1], and c[i−1, j−1]
• Given the value of c[i, j], we can determine in O(1) time which of these three values was used to compute c[i, j], without inspecting table b
• We can reconstruct an LCS in O(m + n) time
• The auxiliary space requirement for computing an LCS does not asymptotically decrease, since we need Θ(mn) space for the c table anyway
• We can, however, reduce the asymptotic space requirements for LCS-LENGTH, since it needs only two rows of table c at a time
  – the row being computed and the previous row
• This improvement works if we need only the length of an LCS
  – if we need to reconstruct the elements of an LCS, the smaller table does not keep enough information to retrace our steps in O(m + n) time
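The two-row idea can be sketched as follows (the function name is mine). Only the current and previous rows of c are kept, so the space used is O(n) rather than Θ(mn), at the cost of losing the trace-back information:

```python
def lcs_length_two_rows(X, Y):
    """LCS length using only two rows of the c table at a time."""
    n = len(Y)
    prev = [0] * (n + 1)            # row i-1 of c (row 0 is all zeros)
    for i in range(1, len(X) + 1):
        curr = [0] * (n + 1)        # row i, being computed left to right
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr                 # discard row i-1; keep row i
    return prev[n]
```

The length returned matches the full-table computation, e.g. 4 for X = "ABCBDAB" and Y = "BDCABA".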