CS 340 — Chapter 10: Algorithm Design Techniques
• Greedy Algorithms
• Divide-And-Conquer Algorithms
• Dynamic Programming
• Randomized Algorithms
• Backtracking Algorithms
Greedy Algorithms

Certain problems lend themselves to a greedy approach, i.e., to obtain an optimal global solution, just make a series of optimal local steps.

Greedy Example 1: Multiprocessor Scheduling

To schedule n jobs j1, j2, …, jn, with respective running times t1, t2, …, tn, on p processors, just cycle through the processors, assigning the unscheduled job with the smallest running time to the next processor. The result will be a schedule that minimizes the average completion time for all of the jobs.

JOB  RUNNING TIME    JOB  RUNNING TIME    JOB  RUNNING TIME
j1   8               j6   30              j11  32
j2   19              j7   13              j12  10
j3   24              j8   40              j13  27
j4   37              j9   5               j14  35
j5   3               j10  21              j15  16

The resulting schedule (each job shown with its running time):

Processor 1:  j5 (3)    j15 (16)   j6 (30)
Processor 2:  j9 (5)    j2 (19)    j11 (32)
Processor 3:  j1 (8)    j10 (21)   j14 (35)
Processor 4:  j12 (10)  j3 (24)    j4 (37)
Processor 5:  j7 (13)   j13 (27)   j8 (40)

JOB  COMPLETION TIME    JOB  COMPLETION TIME    JOB  COMPLETION TIME
j1   8                  j6   49                 j11  56
j2   24                 j7   13                 j12  10
j3   34                 j8   80                 j13  40
j4   71                 j9   5                  j14  64
j5   3                  j10  29                 j15  19

Average Completion Time: 33.67
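The rule above can be sketched in a few lines of Python (a minimal illustrative implementation; the function name and list-based bookkeeping are assumptions, not from the text):

```python
def greedy_schedule(times, p):
    """Assign jobs (shortest first) round-robin to p processors;
    return the schedule and the average completion time."""
    order = sorted(range(len(times)), key=lambda j: times[j])
    finish = [0] * p                       # running finish time per processor
    schedule = [[] for _ in range(p)]
    completions = []
    for i, job in enumerate(order):
        proc = i % p                       # cycle through the processors
        finish[proc] += times[job]
        schedule[proc].append(job + 1)     # 1-based job numbers, as above
        completions.append(finish[proc])
    return schedule, sum(completions) / len(completions)

# The 15 jobs from the example:
times = [8, 19, 24, 37, 3, 30, 13, 40, 5, 21, 32, 10, 27, 35, 16]
schedule, avg = greedy_schedule(times, 5)
print(schedule[0])        # [5, 15, 6]: processor 1 runs j5, j15, j6
print(round(avg, 2))      # 33.67
```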
Greedy Example 2: Huffman Codes

To compress the binary representation of textual data as much as possible, a greedy approach known as Huffman coding can be used.

Original Text: I DO NOT LIKE GREEN EGGS AND HAM. I DO NOT LIKE THEM, SAM-I-AM.
CHARACTER   FREQUENCY
A           4
D           3
E           6
G           3
H           2
I           5
K           2
L           2
M           4
N           4
O           4
R           1
S           2
T           3
space       14
,           1
.           2
-           2
Create a binary tree by placing the characters in the leaf nodes and repeatedly joining the pair of least frequent characters with a common parent, replacing that pair with a new “character” that has their combined frequencies.
(Figure: step-by-step construction of the Huffman tree. The two least frequent subtrees are repeatedly joined under a common parent — starting with R and the comma, each of frequency 1 — with each new internal "character" carrying the combined frequency of its children, until a single tree of total frequency 64 remains.)
The Huffman code for each character is determined by traversing the tree from the root to the character's leaf node, adding a zero to the code for each left offspring encountered and adding a one to the code for each right offspring encountered.

CHARACTER   HUFFMAN CODE
A           0000
D           10000
E           1001
G           0010
H           11000
I           1010
K           11001
L           11010
M           0001
N           1110
O           1111
R           100010
S           11011
T           0011
space       01
,           100011
.           10110
-           10111

Note that if five bits per character had been used, then the original message would have needed 320 bits, but with the Huffman code it only needs 247 bits.
(Figure: the final Huffman tree, whose root splits into subtrees of total frequency 28 — holding A, M, G, T, and the space — and 36, holding the remaining characters; the frequent space character sits near the root, while the rarest characters, R and the comma, lie deepest in the tree.)
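The construction above can be sketched as follows, feeding in the frequency table directly (the heap-based bookkeeping is an assumed implementation detail; any tie-breaking order yields the same optimal 247-bit total):

```python
import heapq

# Frequencies from the table above.
freq = {'A': 4, 'D': 3, 'E': 6, 'G': 3, 'H': 2, 'I': 5, 'K': 2, 'L': 2,
        'M': 4, 'N': 4, 'O': 4, 'R': 1, 'S': 2, 'T': 3, ' ': 14,
        ',': 1, '.': 2, '-': 2}

# Heap entries carry a tiebreak counter so that trees are never compared.
heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freq.items()))]
heapq.heapify(heap)
count = len(heap)
while len(heap) > 1:
    f1, _, t1 = heapq.heappop(heap)   # join the two least frequent subtrees
    f2, _, t2 = heapq.heappop(heap)
    heapq.heappush(heap, (f1 + f2, count, (t1, t2)))  # common parent
    count += 1

codes = {}
def walk(tree, prefix):
    if isinstance(tree, tuple):
        walk(tree[0], prefix + '0')   # left offspring: append a zero
        walk(tree[1], prefix + '1')   # right offspring: append a one
    else:
        codes[tree] = prefix

walk(heap[0][2], '')
total = sum(freq[ch] * len(codes[ch]) for ch in freq)
print(total)   # 247 bits, versus 64 characters x 5 bits = 320
```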
Divide-And-Conquer Algorithms

Another common algorithmic technique is the divide-and-conquer approach, i.e., recursively divide the problem into smaller problems, from which the global solution can be obtained.

Divide-And-Conquer Example 1: Closest-Points Problem

Given n points p1, p2, …, pn in two-space, to find the pair with the smallest distance between them, just sort the points by their x-coordinates, sort them in a separate list by their y-coordinates, and split the problem.

(Figure: the point set split by a vertical partition line, with one minimum distance found recursively in the left partition and another found recursively in the right partition.)

The closest pair is either the pair discovered recursively on the left side of the partition, the pair discovered recursively on the right side of the partition, or some pair that straddles the partition.
Let δ be the minimum of the two recursive values, and examine all possible straddling pairs in the strip of width 2δ surrounding the partition line.

For each point in the strip, only the δ × δ square below the point's y-coordinate and on the opposite side of the partition needs to be examined, resulting in at most four comparisons per point. This yields O(n log n) time complexity for the entire algorithm.
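A minimal sketch of the recursion just described, assuming distinct points (the helper names are assumptions, and the code conservatively scans a few more strip successors than the four-comparison bound strictly requires):

```python
import math

def closest_pair(points):
    """Return the smallest pairwise distance among 2 or more distinct points."""
    px = sorted(points)                      # sorted by x-coordinate
    py = sorted(points, key=lambda p: p[1])  # separately sorted by y-coordinate
    return _closest(px, py)

def _closest(px, py):
    n = len(px)
    if n <= 3:   # base case: brute force on a tiny set
        return min(math.dist(a, b)
                   for i, a in enumerate(px) for b in px[i + 1:])
    mid = n // 2
    x_mid = px[mid][0]                       # the partition line
    left_set = set(px[:mid])
    py_left = [p for p in py if p in left_set]
    py_right = [p for p in py if p not in left_set]
    d = min(_closest(px[:mid], py_left), _closest(px[mid:], py_right))
    # Examine straddling pairs inside the strip of width 2d, in y-order;
    # each point is compared with only a constant number of successors.
    strip = [p for p in py if abs(p[0] - x_mid) < d]
    for i, a in enumerate(strip):
        for b in strip[i + 1:i + 8]:
            if b[1] - a[1] >= d:
                break
            d = min(d, math.dist(a, b))
    return d

pts = [(2, 3), (12, 30), (40, 50), (5, 1), (12, 10), (3, 4)]
print(closest_pair(pts))   # about 1.41421: the pair (2,3) and (3,4)
```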
Divide-And-Conquer Example 2: Matrix Multiplication

Traditional matrix multiplication of two n × n matrices takes Θ(n³) time. A divide-and-conquer approach (Strassen's algorithm) can reduce this to O(n^(log 7)) ≈ O(n^2.81).

Splitting every n × n matrix into four n/2 × n/2 matrices, we note that if:

    [ A11  A12 ] [ B11  B12 ]   [ C11  C12 ]
    [ A21  A22 ] [ B21  B22 ] = [ C21  C22 ]

then C11 = A11B11 + A12B21, C12 = A11B12 + A12B22, C21 = A21B11 + A22B21, and C22 = A21B12 + A22B22.

Using the following matrices:

    P = (A11 + A22)(B11 + B22)        T = (A11 + A12)B22
    Q = (A21 + A22)B11                U = (A21 − A11)(B11 + B12)
    R = A11(B12 − B22)                V = (A12 − A22)(B21 + B22)
    S = A22(B21 − B11)

we obtain C11 = P + S − T + V, C12 = R + T, C21 = Q + S, and C22 = P + R − Q + U.

This technique uses 7 matrix multiplications and 18 matrix additions, compared to the traditional 8 multiplications and 4 additions. (Note that one matrix multiplication performs n³ scalar multiplications and n³ − n² scalar additions, while one matrix addition performs only n² scalar additions.)
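Strassen's recursion can be sketched for power-of-two matrix sizes as follows (an assumed illustrative implementation using plain nested lists):

```python
def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M, i, j):            # the n/2 x n/2 quadrant at block (i, j)
        return [row[j*h:(j+1)*h] for row in M[i*h:(i+1)*h]]
    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, 1), quad(A, 1, 0), quad(A, 1, 1)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, 1), quad(B, 1, 0), quad(B, 1, 1)
    # The seven products from the slide:
    P = strassen(add(A11, A22), add(B11, B22))
    Q = strassen(add(A21, A22), B11)
    R = strassen(A11, sub(B12, B22))
    S = strassen(A22, sub(B21, B11))
    T = strassen(add(A11, A12), B22)
    U = strassen(sub(A21, A11), add(B11, B12))
    V = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(P, S), T), V)   # C11 = P + S - T + V
    C12 = add(R, T)                   # C12 = R + T
    C21 = add(Q, S)                   # C21 = Q + S
    C22 = add(sub(add(P, R), Q), U)   # C22 = P + R - Q + U
    return [r1 + r2 for r1, r2 in zip(C11, C12)] + \
           [r1 + r2 for r1, r2 in zip(C21, C22)]

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```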
Dynamic Programming

Dynamic programming techniques determine a problem's solution by means of a sequence of smaller decisions.

Dynamic Programming Example: Optimal Binary Search Trees

Given n words w1, w2, …, wn with respective probabilities of occurrence p1, p2, …, pn, we wish to place the words in a binary search tree in such a way that the average access time is minimized.

We take advantage of the fact that subtrees in the binary search tree contain complete ranges of values. To minimize the cost of the entire tree, then, we select the root word wi for which the sum of the left subtree's cost, the right subtree's cost, and the sum of all word probabilities is minimal.

For instance, assume the following word probabilities:

WORD   PROBABILITY
a      .09
and    .13
he     .07
I      .14
it     .07
not    .10
or     .07
she    .07
the    .15
you    .11

The minimum-cost trees for each range can be obtained dynamically, each represented by its range of word values, its cost, and its root.
Range Size: 1
RANGE:  a..a   and..and  he..he  I..I   it..it  not..not  or..or  she..she  the..the  you..you
ROOT:   a      and       he      I      it      not       or      she       the       you
COST:   .09    .13       .07     .14    .07     .10       .07     .07       .15       .11

Range Size: 2
RANGE:  a..and  and..he  he..I  I..it  it..not  not..or  or..she  she..the  the..you
ROOT:   and     and      I      I      not      not      or       the       the
COST:   .31     .27      .28    .28    .24      .24      .21      .29       .37

Range Size: 3
RANGE:  a..he  and..I  he..it  I..not  it..or  not..she  or..the  she..you
ROOT:   and    he      I       it      not     or        the      the
COST:   .45    .61     .42     .55     .38     .41       .50      .51

Range Size: 4
RANGE:  a..I  and..it  he..not  I..or  it..she  not..the  or..you
ROOT:   and   I        I        not    not      or        the
COST:   .80   .75      .69      .73    .59      .78       .72

Range Size: 5
RANGE:  a..it  and..not  he..or  I..she  it..the  not..you
ROOT:   and    I         I       not     or       the
COST:   1.01   1.02      .90     .94     .99      1.02

Range Size: 6
RANGE:  a..not  and..or  he..she  I..the  it..you
ROOT:   I       not      not      not     the
COST:   1.29    1.40     1.15     1.38    1.27

Range Size: 7
RANGE:  a..or  and..she  he..the  I..you
ROOT:   I      I         not      not
COST:   1.50   1.51      1.59     1.71

Range Size: 8
RANGE:  a..she  and..the  he..you
ROOT:   I       not       not
COST:   1.78    2.05      1.92

Range Size: 9
RANGE:  a..the  and..you
ROOT:   I       not
COST:   2.33    2.38

Range Size: 10
RANGE:  a..you
ROOT:   I
COST:   2.72
For instance, to determine the best size-5 subtree for the range I..she, this process compares the five possibilities:

Root I:    left subtree NULL,    right subtree it..she:   COST = 0 + .59 + .45 = 1.04
Root it:   left subtree I..I,    right subtree not..she:  COST = .14 + .41 + .45 = 1.00
Root not:  left subtree I..it,   right subtree or..she:   COST = .28 + .21 + .45 = .94   (cheapest)
Root or:   left subtree I..not,  right subtree she..she:  COST = .55 + .07 + .45 = 1.07
Root she:  left subtree I..or,   right subtree NULL:      COST = .73 + 0 + .45 = 1.18

The cheapest binary search tree:

                I
            /       \
         and         the
        /   \       /    \
       a     he   not     you
                 /   \
                it    or
                        \
                         she
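The table-building process above can be sketched as a short dynamic program (an assumed implementation; `cost` and `root` mirror the COST and ROOT columns, and ties may resolve to a different root than the slides show):

```python
words = ["a", "and", "he", "I", "it", "not", "or", "she", "the", "you"]
p = [.09, .13, .07, .14, .07, .10, .07, .07, .15, .11]
n = len(words)

cost = [[0.0] * n for _ in range(n)]   # cost[i][j]: best cost for words i..j
root = [[None] * n for _ in range(n)]  # root[i][j]: best root for words i..j

def c(i, j):
    return cost[i][j] if i <= j else 0.0   # an empty range costs nothing

# Build the tables by increasing range size, as in the slides.
for size in range(1, n + 1):
    for i in range(n - size + 1):
        j = i + size - 1
        total = sum(p[i:j + 1])            # sum of probabilities in the range
        best = min(range(i, j + 1),
                   key=lambda r: c(i, r - 1) + c(r + 1, j))
        cost[i][j] = c(i, best - 1) + c(best + 1, j) + total
        root[i][j] = words[best]

print(root[0][n - 1], round(cost[0][n - 1], 2))   # I 2.72
```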
Randomized Algorithms

Randomized algorithms use random numbers at certain key points in the program to make decisions. While the worst-case time complexity of such algorithms is unaffected, the expected time complexity becomes correlated to the distribution function for the random number generator, which can improve the performance significantly.

Randomized Example: Skip Lists

Let's alter the definition of a linked list to facilitate the application of binary searches to it. Specifically, assuming that the list will have at most n elements, let each node have between 1 and log n pointers to later nodes, and let's ensure that the ith pointer of each node points to a node with at least i pointers.

We do this by implementing the insert operation as follows: starting at the header's highest pointer, traverse the list until the next node is larger than the new value (or null), at which point the process is continued at the next lower level pointer. When this process halts, a new node is inserted after the last node where a level shift occurred; this node contains the new value and a number of pointers between 1 and log n.

Here's the randomized part! This number of pointers is chosen randomly, with a distribution such that 1 is chosen half the time, 2 is chosen one-quarter of the time, 3 is chosen one-eighth of the time, etc.
(Figure: a skip list built by inserting, in order, 35 (R=2 pointers), 50 (R=1), 20 (R=1), 65 (R=1), 40 (R=3), 10 (R=2), 60 (R=2), 25 (R=1), 55 (R=4), and 15 (R=1), where each R value is the randomly chosen number of pointers for the new node.)
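A minimal skip-list insert along these lines (an assumed implementation; `MAX_LEVEL` caps the pointer count at roughly log n for this small example):

```python
import random

MAX_LEVEL = 4   # roughly log2(n) for a ten-element list

class Node:
    def __init__(self, value, level):
        self.value = value
        self.next = [None] * level   # next[i] is the level-(i+1) pointer

def random_level():
    # 1 half the time, 2 a quarter of the time, 3 an eighth of the time, ...
    level = 1
    while level < MAX_LEVEL and random.random() < 0.5:
        level += 1
    return level

class SkipList:
    def __init__(self):
        self.head = Node(None, MAX_LEVEL)   # header with the maximum level

    def insert(self, value):
        update = [self.head] * MAX_LEVEL
        node = self.head
        # Start at the highest pointer and drop a level whenever the next
        # node would overshoot the new value.
        for i in reversed(range(MAX_LEVEL)):
            while node.next[i] and node.next[i].value < value:
                node = node.next[i]
            update[i] = node   # last node visited at this level
        new = Node(value, random_level())
        for i in range(len(new.next)):
            new.next[i] = update[i].next[i]
            update[i].next[i] = new

    def to_list(self):
        out, node = [], self.head.next[0]
        while node:
            out.append(node.value)
            node = node.next[0]
        return out

sl = SkipList()
for v in [35, 50, 20, 65, 40, 10, 60, 25, 55, 15]:   # the example's insertions
    sl.insert(v)
print(sl.to_list())   # [10, 15, 20, 25, 35, 40, 50, 55, 60, 65]
```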
Backtracking Algorithms

Backtracking algorithms are usually a variation of exhaustive searching, where the search is halted whenever the situation becomes untenable, and the algorithm “backtracks” to the last point where a dubious decision was made.
Backtracking Example: Game Playing

When developing a computer game program, one common technique is the minimax procedure, in which the program determines the next move to take based upon its attempt to maximize its chances of victory while assuming that its human opponent will try to minimize those chances. A tree structure is used for this purpose:

(Figure: a game tree rooted at the current game status; its children are the computer's possible moves #1 through #4, their children the human's possible responses to each, and so on, level by level.)

At the odd levels, choose the move that will maximize the computer's chances of victory. At the even levels, choose the move that will minimize the computer's chances of victory (i.e., the move the human would make).
Let’s try a simple tic-tac-toe example (assuming that the computer is ‘X’), starting from this position:

    X | O | –
    O | – | X
    – | O | –

(Figure: the game tree expanding each of the computer's four possible moves, the human's responses to each, and the computer's replies, down to the terminal positions.)

Tallying the terminal positions under each of the computer's four candidate moves:

• Move #1: five outcomes — 4 human wins, 1 computer win
• Move #2: six outcomes — 2 human wins, 2 computer wins, 2 draws
• Move #3: five outcomes — 3 human wins, 2 draws
• Move #4: five outcomes — 3 human wins, 2 computer wins

Although the human could win no matter which move the computer makes next, the computer’s odds are better with the second move. A more thorough analysis reveals that if the computer makes that move, the human’s response will lead to either a computer victory or a draw. Furthermore, each of the other moves leads to an inevitable human victory.
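The minimax evaluation of this position can be sketched as follows (an assumed implementation; scoring +1 for a computer win, −1 for a human win, and 0 for a draw, with '.' marking empty cells):

```python
def winner(b):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    lines = [b[0:3], b[3:6], b[6:9],          # rows
             b[0::3], b[1::3], b[2::3],       # columns
             b[0::4], b[2:7:2]]               # diagonals
    for line in lines:
        if line[0] != '.' and line.count(line[0]) == 3:
            return line[0]
    return None

def minimax(b, to_move):
    w = winner(b)
    if w:
        return 1 if w == 'X' else -1
    if '.' not in b:
        return 0                              # board full: a draw
    nxt = 'O' if to_move == 'X' else 'X'
    values = [minimax(b[:i] + to_move + b[i+1:], nxt)
              for i in range(9) if b[i] == '.']
    # Maximize at the computer's levels, minimize at the human's.
    return max(values) if to_move == 'X' else min(values)

# The position from the slides, in row-major order:
board = "XO.O.X.O."
moves = {i: minimax(board[:i] + 'X' + board[i+1:], 'O')
         for i in range(9) if board[i] == '.'}
print(moves)   # {2: -1, 4: 0, 6: -1, 8: -1}: only the center avoids a loss
```

With perfect play, taking the center cell (index 4) is the only move that avoids an inevitable human victory, matching the conclusion above.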