10/9/2014
1
15.2 Matrix-chain multiplication
• Given a sequence (chain) ⟨A_1, A_2, …, A_n⟩ of n matrices, we wish to compute the product A_1 A_2 ⋯ A_n
• We can evaluate the expression using the standard algorithm for multiplying pairs of matrices once we have parenthesized it to resolve all ambiguities in how the matrices are multiplied together
• Matrix multiplication is associative, and so all parenthesizations yield the same product
9-Oct-14 MAT-72006 AA+DS, Fall 2014 472
• A product of matrices is fully parenthesized if it is
  – either a single matrix or
  – the product of two fully parenthesized matrix products, surrounded by parentheses
• For example, we can fully parenthesize the product A_1 A_2 A_3 A_4 in five distinct ways:
  (A_1 (A_2 (A_3 A_4)))
  (A_1 ((A_2 A_3) A_4))
  ((A_1 A_2) (A_3 A_4))
  ((A_1 (A_2 A_3)) A_4)
  (((A_1 A_2) A_3) A_4)
• How we parenthesize a chain of matrices has a dramatic impact on the cost of product evaluation
• Standard algorithm for multiplying two matrices:

MATRIX-MULTIPLY(A, B)
1. if A.columns ≠ B.rows
2.   error “incompatible dimensions”
3. else let C be a new A.rows × B.columns matrix
4.   for i = 1 to A.rows
5.     for j = 1 to B.columns
6.       c_ij = 0
7.       for k = 1 to A.columns
8.         c_ij = c_ij + a_ik · b_kj
9.   return C
• We can multiply two matrices A and B only if they are compatible: the number of columns of A must equal the number of rows of B
• If A is a p × q matrix and B is a q × r matrix, the resulting matrix C is a p × r matrix
• The time to compute C is dominated by the number of scalar multiplications in line 8, which is p · q · r
• We shall express costs in terms of the number of scalar multiplications
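The pseudocode above translates directly into Python. The following sketch (matrices as lists of rows, plus a counter for scalar multiplications; the function name and the counter are additions for illustration) makes the p · q · r cost of line 8 concrete:

```python
def matrix_multiply(A, B):
    """Multiply A (p x q) by B (q x r), with matrices given as lists of rows.

    Returns the p x r product C together with the number of scalar
    multiplications performed, which is always p * q * r.
    """
    p, q = len(A), len(A[0])
    if q != len(B):
        raise ValueError("incompatible dimensions")
    r = len(B[0])
    C = [[0] * r for _ in range(p)]
    mults = 0
    for i in range(p):
        for j in range(r):
            for k in range(q):  # the scalar multiplication of line 8
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults
```

Multiplying a 2 × 3 matrix by a 3 × 2 matrix, for instance, performs exactly 2 · 3 · 2 = 12 scalar multiplications.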
• Consider the matrix product of a chain of three matrices ⟨A_1, A_2, A_3⟩ with dimensions 10 × 100, 100 × 5, and 5 × 50
• If we apply the parenthesization ((A_1 A_2) A_3), we perform 10 · 100 · 5 = 5000 scalar multiplications to compute the 10 × 5 matrix product A_1 A_2, plus 10 · 5 · 50 = 2500 further ones to multiply this matrix by A_3: a total of 7500 scalar multiplications
• If instead we use (A_1 (A_2 A_3)), we perform 100 · 5 · 50 = 25,000 scalar multiplications to compute the 100 × 50 matrix product A_2 A_3, plus another 10 · 100 · 50 = 50,000 scalar multiplications to multiply A_1 by this matrix: a total of 75,000 scalar multiplications
• Thus, computing the product according to the first parenthesization is 10 times faster
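The two totals above can be checked mechanically from the dimension sequence alone (variable names here are for illustration):

```python
# Dimensions of the chain <A1, A2, A3>: A1 is 10x100, A2 is 100x5, A3 is 5x50.
p = [10, 100, 5, 50]

# ((A1 A2) A3): A1*A2 costs 10*100*5 and yields a 10x5 matrix,
# then (A1 A2)*A3 costs 10*5*50.
cost_left = p[0] * p[1] * p[2] + p[0] * p[2] * p[3]

# (A1 (A2 A3)): A2*A3 costs 100*5*50 and yields a 100x50 matrix,
# then A1*(A2 A3) costs 10*100*50.
cost_right = p[1] * p[2] * p[3] + p[0] * p[1] * p[3]

print(cost_left, cost_right)  # 7500 75000
```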
• Matrix-chain multiplication problem: given a chain ⟨A_1, A_2, …, A_n⟩ of n matrices, where matrix A_i has dimension p_{i−1} × p_i, fully parenthesize the product A_1 A_2 ⋯ A_n in a way that minimizes the number of scalar multiplications
• We are not actually multiplying matrices
• Our goal is only to determine an order for multiplying matrices that has the lowest cost
• Typically, the time invested in determining this optimal order is more than paid for by the time saved later on when actually performing the matrix multiplications
Counting the number of parenthesizations
• Exhaustive checking of all possible parenthesization combinations doesn’t yield an efficient algorithm
• Let the number of alternative parenthesizations of a sequence of n matrices be P(n)
• When n = 1, we have just one matrix and only one way to fully parenthesize the matrix product
• When n ≥ 2, a fully parenthesized matrix product is the product of two fully parenthesized matrix subproducts, and the split between the two may occur between the k-th and (k+1)st matrices for any k = 1, 2, …, n − 1
• Thus, we obtain the recurrence

  P(n) = 1                           if n = 1
  P(n) = Σ_{k=1}^{n−1} P(k) P(n−k)   if n ≥ 2

• Exercise: show that the solution to the recurrence is Ω(2^n)
• The number of solutions is exponential in n
• The brute-force method of exhaustive search makes for a poor strategy when determining how to optimally parenthesize a matrix chain
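The recurrence can be transcribed directly; memoization (assumed here purely so small values can be tabulated quickly) keeps the computation manageable:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(n):
    """Number of distinct full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # The outermost split falls between the k-th and (k+1)st matrices.
    return sum(P(k) * P(n - k) for k in range(1, n))
```

P(4) = 5 matches the five parenthesizations of A_1 A_2 A_3 A_4 listed earlier; the values 1, 1, 2, 5, 14, 42, … are the Catalan numbers, which indeed grow as Ω(2^n).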
Applying dynamic programming
Step 1: The structure of an optimal parenthesization
• Let us adopt the notation A_{i..j}, for i ≤ j, for the matrix that results from evaluating the product A_i A_{i+1} ⋯ A_j
• Observe that if i < j, then to parenthesize the product A_i A_{i+1} ⋯ A_j, we must split the product between A_k and A_{k+1} for some integer k in the range i ≤ k < j
• I.e., for some value of k, we first compute the matrices A_{i..k} and A_{k+1..j} and then multiply them together to produce the final product A_{i..j}
• The cost: the cost of computing the matrix A_{i..k}, plus the cost of computing A_{k+1..j}, plus the cost of multiplying them together
• The optimal substructure of this problem:
• Suppose that to optimally parenthesize A_i A_{i+1} ⋯ A_j, we split the product between A_k and A_{k+1}
• The way we parenthesize A_i A_{i+1} ⋯ A_k within this optimal parenthesization of A_i A_{i+1} ⋯ A_j must be an optimal parenthesization of A_i A_{i+1} ⋯ A_k
• If there were a less costly way to parenthesize A_i A_{i+1} ⋯ A_k, then we could substitute that parenthesization in the optimal parenthesization of A_i A_{i+1} ⋯ A_j to produce a way to parenthesize A_i A_{i+1} ⋯ A_j with lower cost than the optimum: a contradiction
• We can construct an optimal solution to the problem from optimal solutions to subproblems
• A solution to a nontrivial instance requires us to split the product, and any optimal solution contains within it optimal solutions to subproblem instances
• We can build an optimal solution by splitting the problem into two subproblems (optimally parenthesizing A_i A_{i+1} ⋯ A_k and A_{k+1} A_{k+2} ⋯ A_j), finding optimal solutions to these, and then combining these optimal subproblem solutions
• We must ensure that when we search for the correct place to split the product, we have considered all possible places, so that we are sure of having examined the optimal one
Step 2: A recursive solution
• Next, we define the cost of an optimal solution recursively in terms of the optimal solutions to subproblems
• For the matrix-chain multiplication problem, we pick as our subproblems the problems of determining the minimum cost of parenthesizing A_i A_{i+1} ⋯ A_j for 1 ≤ i ≤ j ≤ n
• Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix A_{i..j}; for the full problem, the lowest-cost way to compute A_{1..n} would thus be m[1, n]
• If i = j, no scalar multiplications are necessary
• Thus, m[i, i] = 0 for i = 1, 2, …, n
• To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution from step 1
• Assume that to optimally parenthesize, we split the product A_i A_{i+1} ⋯ A_j between A_k and A_{k+1}, where i ≤ k < j
• Then m[i, j] = the minimum cost for computing the subproducts A_{i..k} and A_{k+1..j}, plus the cost of multiplying these two matrices together
• Recalling that each matrix A_i is p_{i−1} × p_i, we see that computing the matrix product A_{i..k} A_{k+1..j} takes p_{i−1} p_k p_j scalar multiplications
• Thus, we obtain

  m[i, j] = m[i, k] + m[k+1, j] + p_{i−1} p_k p_j

• This recursive equation assumes that we know the value of k, which we do not
• There are only j − i possible values for k, however, namely k = i, i+1, …, j − 1
• Since the optimal parenthesization must use one of these values for k, we need only check them all to find the best
• Thus, our recursive definition for the minimum cost of parenthesizing the product A_i A_{i+1} ⋯ A_j becomes

  m[i, j] = 0                                                          if i = j
  m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k+1, j] + p_{i−1} p_k p_j }  if i < j
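The recurrence transcribes directly into a top-down recursion. The sketch below (function names are mine) memoizes it so that each (i, j) pair is solved only once, previewing the tabular idea developed next:

```python
from functools import lru_cache

def matrix_chain_cost(p):
    """Minimum scalar multiplications for A_1 ... A_n,
    where A_i has dimensions p[i-1] x p[i]."""
    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0
        # Try every split point k with i <= k < j, as in the recurrence.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))
    return m(1, len(p) - 1)
```

For the earlier three-matrix example, `matrix_chain_cost([10, 100, 5, 50])` returns 7500, the cost of the better parenthesization.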
• The m[i, j] values give the costs of optimal solutions to subproblems, but they do not provide an optimal solution
• To help us construct one, we define s[i, j] to be a value of k at which we split the product A_i A_{i+1} ⋯ A_j in an optimal parenthesization
• That is, s[i, j] equals a value k such that

  m[i, j] = m[i, k] + m[k+1, j] + p_{i−1} p_k p_j
Step 3: Computing the optimal costs
• We could easily write a recursive algorithm based on the recurrence to compute the minimum cost m[1, n] for multiplying A_1 A_2 ⋯ A_n
• This algorithm takes exponential time
• It is no better than the brute-force method of checking each way of parenthesizing a product
• We have relatively few distinct subproblems: one subproblem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n, or Θ(n²) in all
• We encounter each subproblem many times in different branches of its recursion tree
• We implement the tabular, bottom-up method in MATRIX-CHAIN-ORDER, which assumes that matrix A_i has dimensions p_{i−1} × p_i for i = 1, 2, …, n
• Its input is a sequence p = ⟨p_0, p_1, …, p_n⟩, where p.length = n + 1
• The procedure uses
  – table m[1..n, 1..n] for storing the m[i, j] costs and
  – table s[1..n − 1, 2..n], which records which index k achieved the optimal cost in computing m[i, j]
• We use the table s to construct an optimal solution
• By the recurrence, the cost m[i, j] of computing a matrix-chain product of j − i + 1 matrices depends only on the costs of computing matrix-chain products of fewer than j − i + 1 matrices
• I.e., for k = i, i + 1, …, j − 1, the matrix A_{i..k} is a product of k − i + 1 < j − i + 1 matrices and A_{k+1..j} is a product of j − k < j − i + 1 matrices
• Thus, we should fill in the table m in a manner that corresponds to solving the problem on matrix chains of increasing length
• For the subchain A_i A_{i+1} ⋯ A_j, we consider the subproblem size to be the length j − i + 1 of the chain
MATRIX-CHAIN-ORDER(p)
1. n = p.length − 1
2. let m[1..n, 1..n] and s[1..n − 1, 2..n] be new tables
3. for i = 1 to n
4.   m[i, i] = 0
5. for l = 2 to n        // l is the chain length
6.   for i = 1 to n − l + 1
7.     j = i + l − 1
8.     m[i, j] = ∞
9.     for k = i to j − 1
10.      q = m[i, k] + m[k + 1, j] + p_{i−1} p_k p_j
11.      if q < m[i, j]
12.        m[i, j] = q
13.        s[i, j] = k
14. return m and s
• The algorithm computes in lines 5–13 the minimum costs m[i, i + 1] for chains of length l = 2, for i = 1, 2, …, n − 1, during the first execution of the for loop
• The second time through the loop, it computes the minimum costs m[i, i + 2] for chains of length l = 3, and so forth
• At each step, the m[i, j] cost computed in lines 10–13 depends only on table entries m[i, k] and m[k + 1, j] already computed
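MATRIX-CHAIN-ORDER carries over to Python almost line for line. In this sketch the 0-based lists are padded so the pseudocode's 1-based indices can be kept (the function name is mine):

```python
def matrix_chain_order(p):
    """Bottom-up MATRIX-CHAIN-ORDER; A_i has dimensions p[i-1] x p[i].

    Returns tables m and s, indexed m[i][j] and s[i][j] for 1 <= i <= j <= n;
    index 0 is unused padding so that the 1-based pseudocode indices carry over.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                   # l is the chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For the six-matrix example that follows, with p = ⟨30, 35, 15, 5, 10, 20, 25⟩, this yields m[1][6] = 15,125 and m[2][5] = 7125.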
[Figure: the m and s tables computed by MATRIX-CHAIN-ORDER for n = 6 matrices with dimension sequence p = ⟨30, 35, 15, 5, 10, 20, 25⟩]
• The call PRINT-OPTIMAL-PARENS(s, 1, 6) prints the parenthesization ((A_1 (A_2 A_3)) ((A_4 A_5) A_6))
• The minimum number of scalar multiplications to multiply the 6 matrices is m[1, 6] = 15,125
• Of the darker entries, the pairs that have the same shading are taken together in line 10 when computing

  m[2, 5] = min of
    m[2, 2] + m[3, 5] + p_1 p_2 p_5 = 0 + 2500 + 35 · 15 · 20 = 13,000
    m[2, 3] + m[4, 5] + p_1 p_3 p_5 = 2625 + 1000 + 35 · 5 · 20 = 7125
    m[2, 4] + m[5, 5] + p_1 p_4 p_5 = 4375 + 0 + 35 · 10 · 20 = 11,375
  = 7125
• The nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n³) for the algorithm
  – The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values
• The running time of this algorithm is in fact also Ω(n³)
• The algorithm requires Θ(n²) space to store the m and s tables
• MATRIX-CHAIN-ORDER is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one
Step 4: Constructing an optimal solution
• Table s[1..n − 1, 2..n] gives us the information we need to multiply the matrices in an optimal order
• Entry s[i, j] records a value of k s.t. an optimal parenthesization of A_i A_{i+1} ⋯ A_j splits the product between A_k and A_{k+1}
• Thus, we know that the final matrix multiplication in computing A_{1..n} optimally is A_{1..s[1,n]} A_{s[1,n]+1..n}
• s[1, s[1, n]] determines the last matrix multiplication when computing A_{1..s[1,n]}, and s[s[1, n] + 1, n] determines the last matrix multiplication when computing A_{s[1,n]+1..n}
• The following procedure prints an optimal parenthesization of ⟨A_i, A_{i+1}, …, A_j⟩, given the s table computed by MATRIX-CHAIN-ORDER and the indices i and j
• The call PRINT-OPTIMAL-PARENS(s, 1, n) prints an optimal parenthesization of ⟨A_1, A_2, …, A_n⟩

PRINT-OPTIMAL-PARENS(s, i, j)
1. if i == j
2.   print “A_i”
3. else print “(”
4.   PRINT-OPTIMAL-PARENS(s, i, s[i, j])
5.   PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
6.   print “)”
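A Python sketch of the same recursion, building the string rather than printing so the result can be inspected. The dict representation of s and the hard-coded table for the earlier three-matrix example are illustrative choices of mine:

```python
def optimal_parens(s, i, j):
    """Return the optimal parenthesization of A_i ... A_j as a string,
    given a split table s with s[(i, j)] = the optimal split index k."""
    if i == j:
        return "A%d" % i
    k = s[(i, j)]
    return "(" + optimal_parens(s, i, k) + " " \
               + optimal_parens(s, k + 1, j) + ")"

# Split table for the earlier 3-matrix example (10x100, 100x5, 5x50),
# whose optimal order is ((A1 A2) A3):
s = {(1, 3): 2, (1, 2): 1}
print(optimal_parens(s, 1, 3))  # ((A1 A2) A3)
```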
15.4 Longest common subsequence
• Biological applications often need to compare the DNA of two (or more) different organisms
• A strand of DNA consists of a string of molecules called bases, where the possible bases are adenine, guanine, cytosine, and thymine
• We express a strand of DNA as a string over the alphabet {A, C, G, T}
• E.g., the DNA of two organisms may be
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
• By comparing two strands of DNA we determine how “similar” they are, as some measure of how closely related the two organisms are
• We can define similarity in many different ways
• E.g., we can say that two DNA strands are similar if one is a substring of the other
  – Neither S_1 nor S_2 is a substring of the other
• Alternatively, we could say that two strands are similar if the number of changes needed to turn one into the other is small
• Yet another way to measure the similarity of S_1 and S_2 is by finding a third strand S_3 in which the bases in S_3 appear in each of S_1 and S_2
  – these bases must appear in the same order, but not necessarily consecutively
• The longer the strand S_3 we can find, the more similar S_1 and S_2 are
• In our example, the longest such strand is
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
  – S_3 = GTCGTCGGAAGCCGGCCGAA
• We formalize this last notion of similarity as the longest-common-subsequence problem
• A subsequence is just the given sequence with zero or more elements left out
• Formally, given a sequence X = ⟨x_1, x_2, …, x_m⟩, another sequence Z = ⟨z_1, z_2, …, z_k⟩ is a subsequence of X if there exists a strictly increasing sequence ⟨i_1, i_2, …, i_k⟩ of indices of X such that for all j = 1, 2, …, k, we have x_{i_j} = z_j
• For example, Z = ⟨B, C, D, B⟩ is a subsequence of X = ⟨A, B, C, B, D, A, B⟩ with corresponding index sequence ⟨2, 3, 5, 7⟩
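The definition suggests a simple linear-time membership check; this sketch (the function name is mine) scans X once, greedily matching the elements of Z in order:

```python
def is_subsequence(Z, X):
    """True iff sequence Z can be obtained from X by deleting elements."""
    it = iter(X)
    # 'z in it' advances the iterator past each match, so the matched
    # positions in X are strictly increasing, as the definition requires.
    return all(z in it for z in Z)
```

For the example above, `is_subsequence("BCDB", "ABCBDAB")` is True.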
• We say that a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y
• For example, if X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩, the sequence ⟨B, C, A⟩ is a common subsequence of both X and Y
• ⟨B, C, A⟩ is not a longest common subsequence (LCS) of X and Y
• The sequence ⟨B, C, B, A⟩, which is also common to both X and Y, has length 4
• This sequence is an LCS of X and Y, as is ⟨B, D, A, B⟩; X and Y have no common subsequence of length 5 or greater
• In the longest-common-subsequence problem, we are given sequences X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ and wish to find a maximum-length common subsequence of X and Y

Step 1: Characterizing a longest common subsequence
• In a brute-force approach, we would enumerate all subsequences of X and check each of them to see whether it is also a subsequence of Y, keeping track of the longest subsequence we find
• Each subsequence of X corresponds to a subset of the indices {1, 2, …, m} of X
• Because X has 2^m subsequences, this approach requires exponential time, making it impractical for long sequences
• The LCS problem has an optimal-substructure property, however, as the following theorem shows
• We shall see that the natural classes of subproblems correspond to pairs of “prefixes” of the two input sequences
• Precisely, given a sequence X = ⟨x_1, x_2, …, x_m⟩, we define the i-th prefix of X, for i = 0, 1, …, m, as X_i = ⟨x_1, x_2, …, x_i⟩
• For example, if X = ⟨A, B, C, B, D, A, B⟩, then X_4 = ⟨A, B, C, B⟩ and X_0 is the empty sequence
Theorem 15.1 (Optimal substructure of an LCS)
Let X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ be sequences, and let Z = ⟨z_1, z_2, …, z_k⟩ be any LCS of X and Y.
1. If x_m = y_n, then z_k = x_m = y_n and Z_{k−1} is an LCS of X_{m−1} and Y_{n−1}.
2. If x_m ≠ y_n, then z_k ≠ x_m implies that Z is an LCS of X_{m−1} and Y.
3. If x_m ≠ y_n, then z_k ≠ y_n implies that Z is an LCS of X and Y_{n−1}.
Proof (1) If z_k ≠ x_m, then we could append x_m = y_n to Z to obtain a common subsequence of X and Y of length k + 1, contradicting the supposition that Z is an LCS of X and Y. Thus, we must have z_k = x_m = y_n. Now, the prefix Z_{k−1} is a length-(k − 1) common subsequence of X_{m−1} and Y_{n−1}. We wish to show that it is an LCS. Suppose for the purpose of contradiction that there exists a common subsequence W of X_{m−1} and Y_{n−1} with length greater than k − 1. Then, appending x_m = y_n to W produces a common subsequence of X and Y whose length is greater than k, which is a contradiction.
(2) If z_k ≠ x_m, then Z is a common subsequence of X_{m−1} and Y. If there were a common subsequence W of X_{m−1} and Y with length greater than k, then W would also be a common subsequence of X and Y, contradicting the assumption that Z is an LCS of X and Y.
(3) The proof is symmetric to (2). ∎
• Theorem 15.1 tells us that an LCS of two sequences contains within it an LCS of prefixes of the two sequences
• Thus, the LCS problem has an optimal-substructure property
Step 2: A recursive solution
• We examine either one or two subproblems when finding an LCS of X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩
• If x_m = y_n, we find an LCS of X_{m−1} and Y_{n−1}
• Appending x_m = y_n to this LCS yields an LCS of X and Y
• If x_m ≠ y_n, then we (1) find an LCS of X_{m−1} and Y and (2) find an LCS of X and Y_{n−1}
• Whichever of these two LCSs is longer is an LCS of X and Y
• These cases exhaust all possibilities, and we know that one of the optimal subproblem solutions must appear within an LCS of X and Y
• To find an LCS of X and Y, we may need to find the LCSs of X and Y_{n−1} and of X_{m−1} and Y
• Each of these subproblems has the subsubproblem of finding an LCS of X_{m−1} and Y_{n−1}
• Many other subproblems share subsubproblems
• As in the matrix-chain multiplication problem, a recursive solution to the LCS problem involves a recurrence for the value of an optimal solution
• Let us define c[i, j] to be the length of an LCS of the prefixes X_i and Y_j
• If either i = 0 or j = 0, one of the sequences has length 0, and so the LCS has length 0
• The optimal substructure of the LCS problem gives

  c[i, j] = 0                            if i = 0 or j = 0
  c[i, j] = c[i−1, j−1] + 1              if i, j > 0 and x_i = y_j
  c[i, j] = max(c[i, j−1], c[i−1, j])    if i, j > 0 and x_i ≠ y_j
• Observe that a condition in the problem restricts which subproblems we may consider
• When x_i = y_j, we consider finding an LCS of X_{i−1} and Y_{j−1}
• Otherwise, we instead consider the two subproblems of finding an LCS of X_i and Y_{j−1} and of X_{i−1} and Y_j
• In the previous dynamic-programming algorithms, for rod cutting and matrix-chain multiplication, we ruled out no subproblems due to conditions in the problem
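The c[i, j] recurrence, transcribed as a memoized top-down recursion (the function name is mine; the 1-based prefix indices of the recurrence map to 0-based string positions inside):

```python
from functools import lru_cache

def lcs_length_recursive(X, Y):
    """Length of an LCS of X and Y via the c[i, j] recurrence."""
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:
            return 0
        if X[i - 1] == Y[j - 1]:          # x_i = y_j
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))
    return c(len(X), len(Y))
```

For the running example, `lcs_length_recursive("ABCBDAB", "BDCABA")` returns 4, matching the LCSs ⟨B, C, B, A⟩ and ⟨B, D, A, B⟩ found earlier.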
Step 3: Computing the length of an LCS
• Since the LCS problem has only Θ(mn) distinct subproblems, we can use dynamic programming to compute the solutions bottom up
• LCS-LENGTH stores the c[i, j] values in c[0..m, 0..n], and it computes the entries in row-major order
  – I.e., the procedure fills in the first row of c from left to right, then the second row, and so on
• The procedure also maintains the table b[1..m, 1..n]
• Intuitively, b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j]
• c[m, n] contains the length of an LCS of X and Y
LCS-LENGTH(X, Y)
1. m = X.length
2. n = Y.length
3. let b[1..m, 1..n] and c[0..m, 0..n] be new tables
4. for i = 1 to m
5.   c[i, 0] = 0
6. for j = 0 to n
7.   c[0, j] = 0
8. for i = 1 to m
9.   for j = 1 to n
10.    if x_i == y_j
11.      c[i, j] = c[i−1, j−1] + 1
12.      b[i, j] = “↖”
13.    elseif c[i−1, j] ≥ c[i, j−1]
14.      c[i, j] = c[i−1, j]
15.      b[i, j] = “↑”
16.    else c[i, j] = c[i, j−1]
17.      b[i, j] = “←”
18. return c and b

Running time: Θ(mn)
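A direct Python transcription of LCS-LENGTH, with the string labels "NW", "UP", "LEFT" standing in for the arrows ↖, ↑, ← (those labels and the function name are representational choices of mine):

```python
def lcs_length(X, Y):
    """Bottom-up LCS-LENGTH: returns the c and b tables.

    c[i][j] is the LCS length of prefixes X_i and Y_j; b[i][j] is one of
    "NW", "UP", "LEFT", standing in for the arrows of the pseudocode.
    """
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 / column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:            # x_i == y_j
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "UP"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "LEFT"
    return c, b
```

With X = "ABCBDAB" and Y = "BDCABA", the returned table has c[7][6] = 4, the LCS length of the running example.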
[Figure: the c and b tables computed by LCS-LENGTH on X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩]
Step 4: Constructing an LCS
• The b table returned by LCS-LENGTH enables us to quickly construct an LCS of X and Y
• We simply begin at b[m, n] and trace through the table by following the arrows
• Whenever we encounter a “↖” in entry b[i, j], it implies that x_i = y_j is an element of the LCS that LCS-LENGTH found
• With this method, we encounter the elements of this LCS in reverse order
• A recursive procedure prints out an LCS of X and Y in the proper, forward order
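The recursive trace-back can be sketched as follows. The block first rebuilds the c and b tables (the same computation LCS-LENGTH performs, repeated so the sketch is self-contained) and then follows the arrows, returning the LCS as a string rather than printing it; names and the "NW"/"UP"/"LEFT" arrow encoding are mine:

```python
def lcs_length(X, Y):
    """Bottom-up c and b tables; "NW"/"UP"/"LEFT" stand in for the arrows."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "UP"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "LEFT"
    return c, b

def print_lcs(b, X, i, j):
    """Return (rather than print) an LCS of X_i and Y_j by following b.

    Recursing first and appending x_i afterwards yields the elements
    in forward order, exactly as described above.
    """
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "NW":                    # x_i = y_j is in the LCS
        return print_lcs(b, X, i - 1, j - 1) + X[i - 1]
    if b[i][j] == "UP":
        return print_lcs(b, X, i - 1, j)
    return print_lcs(b, X, i, j - 1)
```

With X = "ABCBDAB" and Y = "BDCABA", the call `print_lcs(b, X, 7, 6)` yields "BCBA" under this tie-breaking (the ↑ arrow is preferred on ties).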
• The square in row i and column j contains the value of c[i, j] and the appropriate arrow for the value of b[i, j]
• The entry 4 in c[7, 6], the lower right-hand corner of the table, is the length of an LCS
• For i, j > 0, entry c[i, j] depends only on whether x_i = y_j and the values in entries c[i−1, j], c[i, j−1], and c[i−1, j−1], which are computed before c[i, j]
• To reconstruct the elements of an LCS, follow the b[i, j] arrows from the lower right-hand corner
• Each “↖” on the shaded sequence corresponds to an entry (highlighted) for which x_i = y_j is a member of an LCS
Improving the code
• Each entry c[i, j] depends on only 3 other table entries: c[i−1, j], c[i, j−1], and c[i−1, j−1]
• Given the value of c[i, j], we can determine in O(1) time which of these three values was used to compute c[i, j], without inspecting table b
• We can reconstruct an LCS in O(m + n) time
• The auxiliary space requirement for computing an LCS does not asymptotically decrease, since we need Θ(mn) space for the c table anyway
• We can, however, reduce the asymptotic space requirements for LCS-LENGTH, since it needs only two rows of table c at a time
  – the row being computed and the previous row
• This improvement works if we need only the length of an LCS
  – if we need to reconstruct the elements of an LCS, the smaller table does not keep enough information to retrace our steps in O(m + n) time
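The two-row idea can be sketched as follows (the function name is mine). Only the current and previous rows of c are kept, so the space used is O(n) rather than Θ(mn), at the cost of losing the trace-back information:

```python
def lcs_length_two_rows(X, Y):
    """LCS length using only two rows of the c table at a time."""
    n = len(Y)
    prev = [0] * (n + 1)            # row i-1 of c (row 0 is all zeros)
    for i in range(1, len(X) + 1):
        curr = [0] * (n + 1)        # row i, being computed left to right
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr                 # discard row i-1; keep row i
    return prev[n]
```

The length returned matches the full-table computation, e.g. 4 for X = "ABCBDAB" and Y = "BDCABA".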