Page 1: Chapter 17

Chapter 17

Greedy Algorithms

Page 2: Chapter 17

Introduction

Dynamic programming is a good technique for many optimization problems, but may be overkill for many others
It relies on global optimization of all choices at all steps

Greedy algorithms make the choice that seems best at the moment
Choices are locally optimal, in the hope that this will lead to a globally optimal solution
Because of this, greedy algorithms do not always yield optimal solutions, but many times they do

Page 3: Chapter 17

An Activity-Selection Problem

Suppose we have a set S={1, 2, …, n} of n proposed activities that wish to use a resource - a lecture hall, for example - that can only be used by one activity at a time

Each activity i has a start time s_i and a finish time f_i, where s_i <= f_i

Selected activities take place during the interval [s_i, f_i)

Activities i and j are compatible if their respective intervals [s_i, f_i) and [s_j, f_j) do not overlap

The activity-selection problem is to select a maximum-size set of mutually compatible activities

Page 4: Chapter 17

An Activity-Selection Problem

Here is a greedy algorithm to select a maximum-size set of mutually compatible activities
It assumes that the input activities have been sorted by increasing finish time

// Assumes the input activities are sorted by increasing finish time,
// and that "time" is an integral typedef (e.g., typedef int time)
void GreedyActivitySelector(time s[], time f[], stack<int> &A, int size)
{
    A.push(0);      // the first activity is always selected
    int j = 0;      // index of the most recent addition to A

    for ( int i = 1 ; i < size ; ++i ) {
        if ( s[i] >= f[j] ) {   // compatible with the last selection
            A.push(i);
            j = i;
        }
    }
}

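To see it in action, here is a minimal driver (a sketch: the eleven activities are illustrative sample data, already sorted by finish time):

#include <stack>
using std::stack;

typedef int time;   // assumed integral "time" type

void GreedyActivitySelector(time s[], time f[], stack<int> &A, int size);

int main()
{
    // (s[i], f[i]) pairs, sorted by increasing finish time
    time s[] = { 1, 3, 0, 5, 3, 5, 6,  8,  8,  2, 12 };
    time f[] = { 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 };
    stack<int> A;

    GreedyActivitySelector(s, f, A, 11);
    // A now holds activities 0, 3, 7, 10 - a maximum-size compatible set
    return 0;
}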

Page 5: Chapter 17

An Activity-Selection Problem

How does it work?
A collects the selected activities
j specifies the most recent addition to A
As such, f_j is always the maximum finishing time of any activity in A
We use this to check compatibility: an activity is compatible (and added to A) if its start time is not earlier than the finish time of the most recent addition to A

Page 6: Chapter 17

An Activity-Selection Problem

As the algorithm progresses, the activity picked next is always the one with the earliest finish time that can legally be scheduled
This is thus a "greedy" choice, in that it leaves as much opportunity as possible for the remaining activities to be scheduled
i.e., it maximizes the amount of unscheduled time remaining

Page 7: Chapter 17

An Activity-Selection Problem

How efficient is this algorithm?
There's only a single loop, so it can schedule a set of activities that have already been sorted in O(n) time
How does this compare with a dynamic programming approach?
Base the approach on computing m_i iteratively for i = 1, 2, …, n, where m_i is the size of the largest set of mutually compatible activities among activities {1, 2, …, i}
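A hedged sketch of that dynamic-programming approach (illustrative names, not the chapter's code): with the activities sorted by finish time, m_i = max(m_{i-1}, 1 + m_{p(i)}), where p(i) is the last activity that finishes no later than activity i starts:

#include <algorithm>
#include <vector>

// m[i] = size of the largest mutually compatible set among activities 0..i
int MaxCompatible(const std::vector<int> &s, const std::vector<int> &f)
{
    int n = (int)s.size();
    std::vector<int> m(n, 0);
    for ( int i = 0 ; i < n ; ++i ) {
        // p = last activity ending no later than s[i] (or -1 if none);
        // sorted finish times mean every activity up to p is compatible with i
        int p = -1;
        for ( int k = i-1 ; k >= 0 ; --k )
            if ( f[k] <= s[i] ) { p = k; break; }
        int take = 1 + (p >= 0 ? m[p] : 0);   // schedule activity i
        int skip = (i > 0) ? m[i-1] : 0;      // or skip it
        m[i] = std::max(take, skip);
    }
    return n > 0 ? m[n-1] : 0;
}

Even with the input pre-sorted, this spends O(n^2) time (O(n log n) if p(i) is found by binary search), compared with the greedy algorithm's single O(n) scan.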

Page 8: Chapter 17

Elements of the Greedy Strategy

Greedy algorithms solve problems by making locally optimal choices
This won't always lead to a globally optimal solution, but many times it will
Problems that lend themselves to the greedy strategy exhibit two main properties:
The greedy-choice property
Optimal substructure

Page 9: Chapter 17

Elements of the Greedy Strategy

The greedy-choice property
A globally optimal solution can be arrived at by making a locally optimal (greedy) choice
In dynamic programming, choices may depend on the answers to previously computed subproblems
In a greedy algorithm, we make the best choice we can at the moment, then solve the resulting subproblems
Choices may depend on choices made so far, but cannot depend on future choices or on solutions to subproblems
A greedy algorithm usually progresses in top-down fashion

Page 10: Chapter 17

Elements of the Greedy Strategy

Proving that greedy choices yield globally optimal solutions:
Examine a globally optimal solution
Modify the solution to make a greedy choice at the first step, and show that this reduces the problem to a similar but smaller problem
This reduces the problem of correctness to demonstrating that an optimal solution must exhibit optimal substructure
Apply induction to show that a greedy choice can be used at every step

Page 11: Chapter 17

Elements of the Greedy Strategy

Optimal substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems

This is a key ingredient of both dynamic programming and greedy algorithms

Page 12: Chapter 17

Greedy vs. Dynamic Programming

Consider two similar problems
The 0-1 knapsack problem
A thief robbing a store finds n items; the ith item is worth v_i dollars and weighs w_i pounds (both integers)
The thief wants to take the most valuable load he can, subject to the maximum weight W that he can carry
What items does he choose?
The fractional knapsack problem
Same thief, same store, except that the thief may take fractions of items rather than having to take or leave each item whole

Page 13: Chapter 17

Greedy vs. Dynamic Programming

Both problems exhibit the optimal substructure property
Removing an item means that the items that remain must be the most valuable load that can be carried within the remaining weight allowance (W - w, where w is the weight of the item or fraction of an item removed)

Page 14: Chapter 17

Greedy vs. Dynamic Programming

Solving the fractional knapsack problem
First compute the value per pound v_i/w_i for each item
By the greedy strategy, the thief starts by grabbing as much as possible of the item with the greatest value per pound
If he runs out of the first item, he takes the item with the next highest value per pound and grabs as much of it as possible
And so on, until no more can be carried
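A sketch of that strategy in code (illustrative names, not from the slides):

#include <algorithm>
#include <utility>
#include <vector>

// Returns the maximum value the thief can carry with capacity W,
// taking fractions of items as needed; each item is a (value, weight) pair
double FractionalKnapsack(std::vector<std::pair<double,double>> items, double W)
{
    // Sort by value per pound, greatest first
    std::sort(items.begin(), items.end(),
              [](const std::pair<double,double> &a,
                 const std::pair<double,double> &b)
              { return a.first / a.second > b.first / b.second; });

    double total = 0.0;
    for ( const auto &item : items ) {
        if ( W <= 0 ) break;                      // knapsack is full
        double take = std::min(item.second, W);   // whole item, or what fits
        total += item.first * (take / item.second);
        W -= take;
    }
    return total;
}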

Page 15: Chapter 17

Greedy vs. Dynamic Programming

Solving the 0-1 knapsack problem
The greedy solution doesn't work here
Grabbing the item with the highest value per pound doesn't guarantee that you get the most value, nor does grabbing the biggest item
Instead, solutions to subproblems must be compared to determine which one provides the most valuable load
This is the overlapping-subproblems property
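For contrast, a sketch of that subproblem comparison done by dynamic programming (illustrative names; assumes integer values and weights, as the slide states):

#include <algorithm>
#include <vector>

// best[j] = most valuable load achievable with weight allowance j
int Knapsack01(const std::vector<int> &v, const std::vector<int> &w, int W)
{
    std::vector<int> best(W + 1, 0);
    for ( size_t i = 0 ; i < v.size() ; ++i )
        for ( int j = W ; j >= w[i] ; --j )   // descending j: item i used at most once
            best[j] = std::max(best[j], best[j - w[i]] + v[i]);
    return best[W];
}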

Page 16: Chapter 17

Huffman Codes

Huffman codes are an effective technique for compressing data
The algorithm builds a table of the frequencies of each character in a file
The table is then used to determine an optimal way of representing each character as a binary string

Page 17: Chapter 17

Huffman Codes

Consider a file of 100,000 characters from a-f, with these frequencies:
a = 45,000; b = 13,000; c = 12,000; d = 16,000; e = 9,000; f = 5,000

Page 18: Chapter 17

Huffman Codes

Typically each character in a file is stored as a single byte (8 bits)
If we know we only have six characters, we can use a 3-bit code for the characters instead:
a = 000, b = 001, c = 010, d = 011, e = 100, f = 101
This is called a fixed-length code
With this scheme, we can encode the whole file with 300,000 bits (100,000 characters × 3 bits each)
We can do better:
Better compression
More flexibility

Page 19: Chapter 17

Huffman Codes

Variable-length codes can perform significantly better
Frequent characters are given short code words, while infrequent characters get longer code words
Consider this scheme:
a = 0; b = 101; c = 100; d = 111; e = 1101; f = 1100
How many bits are now required to encode our file?
45,000·1 + 13,000·3 + 12,000·3 + 16,000·3 + 9,000·4 + 5,000·4 = 224,000 bits
This is in fact an optimal character code for this file

Page 20: Chapter 17

Huffman Codes

Prefix codes
Huffman codes are constructed in such a way that they can be unambiguously translated back to the original data, yet still be an optimal character code

Huffman codes are really considered "prefix codes"
No code word is a prefix of any other code word
This guarantees unambiguous decoding
Once a code word is recognized, we can replace it with the decoded data, without worrying about whether it might also match the beginning of some other code word
For example, under the scheme above, the string 001011101 can only be parsed as 0·0·101·1101, i.e. "aabe"

Page 21: Chapter 17

Huffman Codes

Both the encoder and decoder make use of a binary tree to recognize codes
The leaves of the tree represent the unencoded characters
Each left branch indicates a "0" placed in the encoded bit string
Each right branch indicates a "1" placed in the bit string

Page 22: Chapter 17

Huffman Codes

A Huffman Code Tree:

                (100)
              0/     \1
           a:45       (55)
                    0/    \1
                 (25)      (30)
                0/  \1    0/   \1
             c:12  b:13  (14)  d:16
                        0/  \1
                      f:5    e:9

To encode:
Search the tree for the character to encode
As you progress, append a "0" or "1" to the right of the code
The code is complete when you reach the character

To decode:
Proceed through the bit string left to right
For each bit, branch left or right as indicated
When you reach a leaf, that is the decoded character
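A sketch of that decoding walk in code (the Node type is illustrative, mirroring the tree built later in the chapter):

#include <string>

struct Node { char ch; Node *left; Node *right; };

std::string Decode(const Node *root, const std::string &bits)
{
    std::string out;
    const Node *t = root;
    for ( char b : bits ) {
        t = (b == '0') ? t->left : t->right;  // follow the branch
        if ( !t->left && !t->right ) {        // reached a leaf
            out += t->ch;                     // emit the decoded character
            t = root;                         // restart for the next code word
        }
    }
    return out;
}

With the tree above, Decode(root, "001011101") would return "aabe".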

Page 23: Chapter 17

Huffman Codes

Using this representation, an optimal code will always be represented by a full binary tree
Every non-leaf node has two children
If this were not true, there would be wasted bits, as in the fixed-length code, leading to a non-optimal compression
For a set of c characters, this requires c leaves and c-1 internal nodes

Page 24: Chapter 17

Huffman Codes

Given a Huffman tree, how do we compute the number of bits required to encode a file?
For every character c:
Let f(c) denote the character's frequency
Let d_T(c) denote the character's depth in the tree
This is also the length of the character's code word
The total bits required is then:

B(T) = Σ_{c ∈ C} f(c)·d_T(c)

For the example file, this gives 45,000·1 + 13,000·3 + 12,000·3 + 16,000·3 + 9,000·4 + 5,000·4 = 224,000 bits, as computed earlier
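A direct transcription of B(T) into code (a sketch; the maps and names are illustrative, with code-word lengths standing in for tree depths):

#include <map>
#include <string>

// Total bits = sum over characters of frequency * code-word length
long TotalBits(const std::map<char,long> &freq,
               const std::map<char,std::string> &code)
{
    long bits = 0;
    for ( const auto &entry : freq )
        bits += entry.second * (long)code.at(entry.first).size();  // f(c) * d_T(c)
    return bits;
}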

Page 25: Chapter 17

Constructing a Huffman Code

Huffman developed a greedy algorithm for constructing an optimal prefix code
The algorithm builds the tree in a bottom-up manner
It begins with the leaves, then performs merging operations to build up the tree
At each step, it merges the two least frequent members together
It removes these characters from the set, and replaces them with a "metacharacter" whose frequency is the sum of the removed characters' frequencies

Page 26: Chapter 17

Constructing a Huffman Code

Huffman(table C[], int n, tree T)
{
    // Create a priority queue that has all characters
    // sorted by their frequencies - each entry happens
    // to also be a complete tree node
    PriorityQueue Q(C);
    for ( int i = 0 ; i < n-1 ; ++i ) {
        Node *z = T.AllocateNode();
        Node *x = z->Left  = Q.ExtractMin();   // the two least frequent
        Node *y = z->Right = Q.ExtractMin();   //   entries in the queue
        z->freq = x->freq + y->freq;           // merged "metacharacter"
        Q.Insert(z);
    }
    // The one entry left in Q is the root of the finished tree
}

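For something runnable end to end, here is a hedged sketch that swaps the slides' PriorityQueue for std::priority_queue (buildHuffman and collectCodes are illustrative names, not the chapter's):

#include <iostream>
#include <map>
#include <queue>
#include <string>
#include <vector>

struct Node {
    char ch;                 // meaningful only at leaves
    int  freq;
    Node *left  = nullptr;
    Node *right = nullptr;
};

// Order so the queue pops the *lowest* frequency first
struct ByFreq {
    bool operator()(const Node *a, const Node *b) const
    { return a->freq > b->freq; }
};

Node *buildHuffman(const std::vector<std::pair<char,int>> &freqs)
{
    std::priority_queue<Node*, std::vector<Node*>, ByFreq> q;
    for ( const auto &p : freqs )
        q.push(new Node{p.first, p.second});
    while ( q.size() > 1 ) {            // n-1 merge steps
        Node *x = q.top(); q.pop();     // two least frequent nodes
        Node *y = q.top(); q.pop();
        q.push(new Node{'\0', x->freq + y->freq, x, y});
    }
    return q.top();                     // root of the Huffman tree
}

// Collect code words: "0" for each left branch, "1" for each right branch
void collectCodes(const Node *t, const std::string &prefix,
                  std::map<char,std::string> &out)
{
    if ( !t->left && !t->right ) { out[t->ch] = prefix; return; }
    collectCodes(t->left,  prefix + "0", out);
    collectCodes(t->right, prefix + "1", out);
}

int main()
{
    // The chapter's example frequencies, in thousands
    Node *root = buildHuffman({ {'a',45}, {'b',13}, {'c',12},
                                {'d',16}, {'e',9},  {'f',5} });
    std::map<char,std::string> codes;
    collectCodes(root, "", codes);
    for ( const auto &c : codes )
        std::cout << c.first << " = " << c.second << "\n";
    return 0;
}

With these frequencies the merges happen in the same order as on the following slides, so the output matches the chapter's code: a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100.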

Page 27: Chapter 17

Constructing a Huffman Code

The algorithm starts by placing an entry in a priority queue for each character, sorted by frequency:

    f:5   e:9   c:12   b:13   d:16   a:45

Next it removes the two entries with the lowest frequency (f and e) and inserts them into the Huffman tree, replacing them with a node whose frequency equals the sum of the removed frequencies:

    c:12   b:13   (14: f:5 / e:9)   d:16   a:45

Page 28: Chapter 17

Constructing a Huffman Code

The tree after removing and merging c & b
Because we're using a priority queue, each node is inserted into its sorted position, based on the sum of the frequencies of its children:

    (14: f:5 / e:9)   d:16   (25: c:12 / b:13)   a:45

Page 29: Chapter 17

Constructing a Huffman Code

The tree after removing the previously merged "14" node (f & e) and d, and merging them:

    (25: c:12 / b:13)   (30: 14 / d:16)   a:45

Page 30: Chapter 17

Constructing a Huffman Code

The tree after removing and merging the "25" and "30" frequency nodes:

    a:45   (55: 25 / 30)

Page 31: Chapter 17

Constructing a Huffman Code

The final tree: a:45 and the "55" node merge into the root, with frequency 100, giving the complete tree shown earlier

    (100: a:45 / 55)

Page 32: Chapter 17

Constructing a Huffman Code

What is the running time?
If we use a binary heap for the priority queue, it takes O(n) time to build
The loop is executed n-1 times
Each heap operation requires O(log₂ n) time
So, building the Huffman tree requires O(n log₂ n) time

