Data Structures and Algorithms
Jörg Endrullis
Vrije Universiteit Amsterdam
Department of Computer Science
Section Theoretical Computer Science
2009–2010
http://www.few.vu.nl/~tcs/ds/
The Lecture
Lectures:
- Tuesday 11:00 – 12:45 (weeks 36–42 in WN-KC159)
- Wednesday 9:00 – 10:45 (weeks 36–41 in WN-Q105, week 42 in WN-Q112)

Exercise classes:
- Thursday 11:00 – 12:45 (weeks 36–42 in WN-M607(50))
- Thursday 13:30 – 15:15 (weeks 36–42 in WN-M623(50))
- Thursday 15:30 – 17:15 (weeks 36–42 in WN-C121)

Exam on 19.10.2009 at 8:45 – 11:30.
Retake exam on 12.01.2010 at 18:30 – 21:15.
Important Information: Voortentamen
Voortentamen (pre-exam):
- Tuesday 29 September, 11:00 – 12:45 in WN-KC159
- Participation is not obligatory, but recommended.
- The result can only influence the final grade positively:
- Let V be the pre-exam grade, and T the exam grade. The final grade is calculated by:

    max(T, (2T + V)/3)

Thus if the final exam grade T is higher than the pre-exam grade V, then only the final grade T counts. But if the pre-exam grade V is higher, then it counts with weight 1/3.
Contact Persons
- lecture:
  - Jörg Endrullis, [email protected]
- exercise classes:
  - Dennis, [email protected]
  - Michel, [email protected]
  - Atze van der, [email protected]
Literature
Introduction
Definition
An algorithm is a list of instructions for completing a task.

- Algorithms are central to computer science.
- Important aspects when designing algorithms:
  - correctness
  - termination
  - efficiency
Evaluation of Algorithms
- Algorithms with equal functionality may have huge differences in efficiency (complexity)
- Important measures:
  - time complexity
  - space (memory) complexity
- Time and memory usage increase with the size of the input.

[Figure: running time versus input size, with best-case, average-case, and worst-case curves]
Time Complexity
- Running time of programs depends on various factors:
  - input for the program
  - compiler
  - speed of the computer (CPU, memory)
  - time complexity of the used algorithm
- The average case is often difficult to determine.
- We focus on worst-case running time:
  - easier to analyse
  - usually the crucial measure for applications
Methods for Analysing Time Complexity
Experiments (measurements on a certain machine):
- requires implementation
- comparisons require equal hardware and software
- experiments may miss important inputs

Calculation of complexity for an idealised computer model:
- requires exact counting
- computer model: Random Access Machine (RAM)

Asymptotic complexity estimation depending on input size n:
- allows for approximations
- logarithmic (log2 n), linear (n), . . . , exponential (a^n), . . .
Random Access Machine (RAM)
A Random Access Machine is a CPU connected to a memory:
- potentially unbounded number of memory cells
- each memory cell can store an arbitrary number
- primitive operations are executed in constant time
- memory cells can be accessed with one primitive operation
Pseudo-code
We use pseudo-code to describe algorithms:
- programming-like, high-level description
- independent of any specific programming language
- primitive operations:
  - assigning a value to a variable
  - calling a method
  - arithmetic operations, comparing two numbers
  - array access
  - returning from a method
Counting Primitive Operations: arrayMax
We analyse the worst case complexity of arrayMax.
Algorithm arrayMax(A, n):
  Input: an array A storing n ≥ 1 integers.
  Output: the maximum element in A.

  currentMax = A[0]                // 1 + 1
  for i = 1 to n − 1 do            // 1 + n + (n − 1)·(…)
    if currentMax < A[i] then      //   1 + 1
      currentMax = A[i]            //   1 + 1
  done                             //   1 + 1 (increment of i)
  return currentMax                // 1

Hence we have a worst case time complexity:

  T(n) = 2 + 1 + n + (n − 1) · 6 + 1 = 7 · n − 2
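The pseudo-code above can be transcribed directly; a minimal Python sketch (names follow the slides):

```python
def array_max(A, n):
    """Return the maximum element of A[0..n-1], assuming n >= 1."""
    current_max = A[0]
    for i in range(1, n):
        if current_max < A[i]:     # worst case: every comparison succeeds
            current_max = A[i]
    return current_max
```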
Counting Primitive Operations: arrayMax
We analyse the best case complexity of arrayMax.
Algorithm arrayMax(A, n):
  Input: an array A storing n ≥ 1 integers.
  Output: the maximum element in A.

  currentMax = A[0]                // 1 + 1
  for i = 1 to n − 1 do            // 1 + n + (n − 1)·(…)
    if currentMax < A[i] then      //   1 + 1
      currentMax = A[i]            //   0 (not taken in best case)
  done                             //   1 + 1 (increment of i)
  return currentMax                // 1

Hence we have a best case time complexity:

  T(n) = 2 + 1 + n + (n − 1) · 4 + 1 = 5 · n
Important Growth Functions
Examples of growth functions, ordered by speed of growth:
- log n       logarithmic growth
- n           linear growth
- n · log n   (n · log n)-growth
- n^2         quadratic growth
- n^3         cubic growth
- a^n         exponential growth (a > 1)

We recall the definition of the logarithm:

  log_a b = c such that a^c = b
Speed of Growth, Pictorial
[Figure: plots of f(x) = log x, f(x) = x, f(x) = x log x, f(x) = x^2, and f(x) = e^x]
Speed of Growth
  n         10     100    1000    10^4     10^5      10^6
  --------  -----  -----  ------  -------  --------  ---------
  log2 n    3      7      10      13       17        20
  n         10     100    1000    10^4     10^5      10^6
  n·log2 n  30     700    13000   10^5     10^6      10^7
  n^2       100    10^4   10^6    10^8     10^10     10^12
  n^3       10^3   10^6   10^9    10^12    10^15     10^18
  2^n       1024   10^30  10^300  10^3000  10^30000  10^300000

Note that log2 n grows very slowly, whereas 2^n grows explosively!
Time depending on Problem Size
Computation time assuming that 1 step takes 1 µs (0.000001 s).

  n        10      100      1000    10^4   10^5   10^6
  -------  ------  -------  ------  -----  -----  --------
  log n    <       <        <       <      <      0.00002s
  n        <       <        0.001s  0.01s  0.1s   1s
  n·log n  <       <        0.013s  0.1s   1s     10s
  n^2      <       0.01s    1s      100s   3h     1000h
  n^3      0.001s  1s       1000s   1000h  100y   10^5 y
  2^n      0.001s  10^23 y  >       >      >      >

Here < means ‘fast’ (< 0.001 s), and > means ‘more than 10^300 years’.

A problem for which only exponential algorithms exist is usually considered intractable.
Search in Arrays
We analyse the worst case complexity of search.
Algorithm search(A, n, x):
  Input: an array A storing n ≥ 1 integers.
  Output: true if A contains x, and false otherwise.

  for i = 0 to n − 1 do      // 1 + (n + 1) + n·(…)
    if A[i] == x then        //   1 + 1
      return true            //   (not taken in worst case)
  done                       //   1 + 1 (increment of i)
  return false               // 1

Hence we have a worst case time complexity:

  T(n) = 1 + (n + 1) + n · 4 + 1 = 5 · n + 3
Search in Arrays
We analyse the best case complexity of search.
Algorithm search(A, n, x):
  Input: an array A storing n ≥ 1 integers.
  Output: true if A contains x, and false otherwise.

  for i = 0 to n − 1 do      // 1 + 1
    if A[i] == x then        // 1 + 1
      return true            // 1
  done
  return false

Hence we have a best case time complexity:

  T(n) = 5
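A Python sketch of the linear search above; the early return is what makes the best case constant:

```python
def search(A, n, x):
    """Linear search: True iff x occurs in A[0..n-1]."""
    for i in range(n):
        if A[i] == x:
            return True    # best case: hit at i = 0
    return False           # worst case: all n elements scanned
```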
Search in Sorted Arrays: Binary Search
Algorithm binSearch(A, n, x):
  Input: an array A storing n integers in ascending order.
  Output: true if A contains x, and false otherwise.

  low = 0
  high = n − 1
  while low ≤ high do
    mid = ⌊(low + high)/2⌋
    y = A[mid]
    if x < y then high = mid − 1
    if x == y then return true
    if x > y then low = mid + 1
  done
  return false

Example: search x = 8 in A = [1, 2, 2, 4, 6, 7, 8, 11, 13, 15, 16].
Step 1: low = 0, high = 10, mid = 5, y = A[5] = 7; since x > y, set low = mid + 1 = 6.
Step 2: low = 6, high = 10, mid = 8, y = A[8] = 13; since x < y, set high = mid − 1 = 7.
Step 3: low = 6, high = 7, mid = 6, y = A[6] = 8; since x == y, return true.
Search in Sorted Arrays: Binary Search
We analyse the worst case complexity of binSearch.
  low = 0                  // 1
  high = n − 1             // 2
  while low ≤ high do      // (number of loops) · (1 + …)
    . . .
  done
  return false             // 1

We analyse the maximal number of loops L for minimal arrays A:

  L   A                      search for
  1   [1]                    x = 2
  2   [1 2]                  x = 3
  3   [1 2 3 4]              x = 5
  4   [1 2 3 4 5 6 7 8]      x = 9

For an array A of length n we need (1 + log2 n) loops in the worst case.
Search in Sorted Arrays: Binary Search
We analyse the worst case complexity of binSearch.
  low = 0                               // 1
  high = n − 1                          // 2
  while low ≤ high do                   // (1 + log2 n) · (1 +
    mid = ⌊(low + high)/2⌋              //   4
    y = A[mid]                          //   2
    if x < y then high = mid − 1        //   1 (false)
    if x == y then return true          //   1 (false)
    if x > y then low = mid + 1         //   3 (true)
  done                                  // )
  return false                          // 1 + 1

Hence we have a worst case time complexity:

  T(n) = 5 + (1 + log2 n) · 12 = 17 + 12 · log2 n
Search in Sorted Arrays: Binary Search
We analyse the best case complexity of binSearch.
  low = 0                               // 1
  high = n − 1                          // 2
  while low ≤ high do                   // 1 +
    mid = ⌊(low + high)/2⌋              //   4
    y = A[mid]                          //   2
    if x < y then high = mid − 1        //   1 (false)
    if x == y then return true          //   2 (true)
    if x > y then low = mid + 1
  done
  return false

Hence we have a best case time complexity:

  T(n) = 13
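The three-way comparison in the pseudo-code translates directly; a Python sketch:

```python
def bin_search(A, n, x):
    """Binary search in ascending A[0..n-1]; True iff x occurs in A."""
    low, high = 0, n - 1
    while low <= high:
        mid = (low + high) // 2   # floor division, as in the pseudo-code
        y = A[mid]
        if x < y:
            high = mid - 1        # continue in the left half
        elif x == y:
            return True
        else:
            low = mid + 1         # continue in the right half
    return False
```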
Summary: Counting Primitive Operations
Assumption (Random Access Machine):
- primitive operations are executed in constant time

Allows for analysis of worst, best and average case complexity.
(average case analysis requires a probability distribution for the inputs)

Disadvantages:
- exact counting is cumbersome and error-prone
- in the real world:
  - different operations take different time
  - computers have different CPU/memory speeds
Estimating Running Time
Assume we have an algorithm that performs (for input size n):
- n^2 additions, and
- n multiplications.

Assume that we have computers A and B for which:
- A needs 3s per addition and 7s per multiplication,
- B needs 1s per addition and 10s per multiplication.

Then the algorithm runs on A with time complexity:

  T(n) = 3 · n^2 + 7 · n

and on computer B with:

  T(n) = 1 · n^2 + 10 · n

The constant factors may differ on different computers.
Estimating Running Time
Let
- A be an algorithm,
- T(n) its time complexity on the Random Access Machine,
- T_C(n) its time complexity on a (real) computer C.

For every computer C there exist factors a, b > 0 such that:

  a · T(n) ≤ T_C(n) ≤ b · T(n)

We can choose:
- a = time of the fastest primitive operation on C
- b = time of the slowest primitive operation on C

Thus the idealized computation is precise up to constant factors:
- changing hardware/software affects only the constant factors
The Role of Constant Factors
Problem size that can be solved in 1 hour:
        current computer   100× faster   10000× faster
  n     N1                 100 · N1      10000 · N1
  n^2   N2                 10 · N2       100 · N2
  n^3   N3                 4.6 · N3      21.5 · N3
  2^n   N4                 N4 + 7        N4 + 13

A 10000× faster computer is equivalent to:
- a normal computer with 10000 h of time, or
- constant factors in T(n) that are 10000 times smaller: (1/10000) · T(n).

The growth rate of the running time (e.g. n, n^2, n^3, 2^n, . . . ) is more important than the constant factors.
Methods for Analysing Time Complexity
Experiments (measurements on a certain machine):
- requires implementation
- comparisons require equal hardware and software
- experiments may miss important inputs

Calculation of complexity for an idealised computer model:
- requires exact counting
- computer model: Random Access Machine (RAM)

Asymptotic complexity estimation depending on input size n:
- allows for approximations
- linear (n), quadratic (n^2), exponential (a^n), . . .
Asymptotic estimation
Magnitude of growth of complexity depending on input size n.

Mostly used: upper bounds, the big-Oh notation (worst case).

Definition (Big-Oh notation)
Given functions f(n) and g(n), we say that f(n) is O(g(n)), denoted f(n) ∈ O(g(n)), if there exist c, n0 such that:

  ∀n ≥ n0 : f(n) ≤ c · g(n)

In words: the growth rate of f(n) is ≤ the growth rate of g(n).

Example (2n + 1 ∈ O(n))
We need to find c, n0 such that for all n ≥ n0 we have

  2n + 1 ≤ c · n

We choose c = 3, n0 = 1. Then 3 · n = 2n + n ≥ 2n + 1.
Examples
Example (7n − 2 ∈ O(n))
We need to find c, n0 such that for all n ≥ n0 we have

  7n − 2 ≤ c · n

We choose c = 7, n0 = 1.

Example (3n^3 + 5n^2 + 2 ∈ O(n^3))
We need to find c, n0 such that for all n ≥ n0 we have

  3n^3 + 5n^2 + 2 ≤ c · n^3

We choose c = 4, n0 = 7; then for n ≥ 7:

  4 · n^3 = 3n^3 + n^3 ≥ 3n^3 + 7n^2 = 3n^3 + 5n^2 + 2n^2 ≥ 3n^3 + 5n^2 + 2
Examples
Example (7 ∈ O(1))
We need to find c, n0 such that for all n ≥ n0 we have

  7 ≤ c · 1

Take c = 7, n0 = 1.

In general: O(1) means ‘constant time’.

Example (n^2 ∉ O(n))
Assume there would exist c, n0 such that for all n ≥ n0 we have

  n^2 ≤ c · n

Take an arbitrary n ≥ max(n0, c + 1); then

  n^2 ≥ (c + 1) · n > c · n

This contradicts n^2 ≤ c · n.
Big-Oh Rules
- If f(n) is a polynomial of degree d, then f(n) ∈ O(n^d).
- More generally, we can drop:
  - constant factors, and
  - lower order terms (≻ means higher order/growth rate):

  3^n ≻ 2^n ≻ n^5 ≻ n^2 ≻ n · log2 n ≻ n ≻ log2 n ≻ log2 log2 n

Example

  5n^6 + 3n^2 + 2n^7 + n ∈ O(n^7)

Example

  n^200 + 3 · 2^n + 50n^3 + log2 n ∈ O(2^n)
Asymptotic estimation: arrayMax
We analyse the worst case complexity of arrayMax.
Algorithm arrayMax(A, n):
  Input: an array A storing n ≥ 1 integers.
  Output: the maximum element in A.

  currentMax = A[0]                // 1 (constant time)
  for i = 1 to n − 1 do            // (n − 1) ·
    if currentMax < A[i] then      //   1
      currentMax = A[i]
  done
  return currentMax                // 1

Hence we have a (worst case) time complexity:

  T(n) ∈ O(1 + (n − 1) · 1 + 1) = O(n)
Asymptotic estimation: prefixAverage
We analyse the complexity of prefixAverage.
Algorithm prefixAverage(A, n):
  Input: an array A storing n ≥ 1 integers.
  Output: an array B of length n such that for all i < n:
    B[i] = (1/(i+1)) · (A[0] + A[1] + . . . + A[i])

  B = new array of length n        // n
  for i = 0 to n − 1 do            // n
    sum = 0                        // n · 1
    for j = 0 to i do              // (1 + 2 + . . . + n) ·
      sum = sum + A[j]             //   1
    done
    B[i] = sum/(i + 1)             // n · 1
  done
  return B                         // 1

The (worst case) time complexity is: T(n) ∈ O(n^2)
Asymptotic estimation: prefixAverage2
We analyse the complexity of prefixAverage2.
Algorithm prefixAverage2(A, n):
  Input: an array A storing n ≥ 1 integers.
  Output: an array B of length n such that for all i < n:
    B[i] = (1/(i+1)) · (A[0] + A[1] + . . . + A[i])

  B = new array of length n    // n
  sum = 0                      // 1
  for i = 0 to n − 1 do        // n ·
    sum = sum + A[i]           //   1
    B[i] = sum/(i + 1)         //   1
  done
  return B                     // 1

The (worst case) time complexity is:

  T(n) ∈ O(n)
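Both versions side by side in Python; the second maintains a running sum instead of recomputing each prefix, which is exactly what drops the complexity from O(n^2) to O(n):

```python
def prefix_average(A, n):
    """O(n^2): recompute each prefix sum from scratch."""
    B = [0] * n
    for i in range(n):
        s = 0
        for j in range(i + 1):   # inner loop runs 1 + 2 + ... + n times
            s += A[j]
        B[i] = s / (i + 1)
    return B

def prefix_average2(A, n):
    """O(n): maintain a running prefix sum."""
    B = [0] * n
    s = 0
    for i in range(n):
        s += A[i]
        B[i] = s / (i + 1)
    return B
```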
Efficiency for Small Problem Size
Asymptotic complexity (big-Oh notation) speaks about large n:
- For small n, an algorithm with good asymptotic complexity can be slower than one with bad complexity.

Example
- 1000 · n ∈ O(n) is asymptotically better than n^2 ∈ O(n^2),
but for n = 10:
- 1000 · n = 10000 is much slower than n^2 = 100.
Relatives of Big-Oh
Lower bounds: big-Omega-notation.
Definition (Big-Omega notation)
f(n) is Ω(g(n)), denoted f(n) ∈ Ω(g(n)), if there exist c, n0 such that:

  ∀n ≥ n0 : f(n) ≥ c · g(n)

Exact bound (lower and upper): big-Theta notation.

Definition (Big-Theta notation)
f(n) is Θ(g(n)), denoted f(n) ∈ Θ(g(n)), if:

  f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n))
Abstract Data Types
Definition
An abstract data type (ADT) is an abstraction of a data type. An ADT specifies:
- the data stored
- the operations on the data
- the error conditions associated with the operations
The choice of data types is important for efficient algorithms.
The Stack ADT
- stores arbitrary objects: last-in first-out principle (LIFO)
- main operations:
  - push(o): inserts object o at the top of the stack
  - pop(): removes & returns the object on top of the stack. An error occurs (an exception is thrown) if the stack is empty.

[Figure: push(3) puts 3 on top of the stack; pop() removes and returns the top element 5]

- auxiliary operations:
  - isEmpty(): indicates whether the stack is empty
  - size(): returns the number of elements in the stack
  - top(): returns the top element without removing it. An error occurs if the stack is empty.
Applications of Stacks
Direct applications:
- page-visited history in a web browser
- undo-sequence in a text editor
- chain of method calls in the Java Virtual Machine

Indirect applications:
- auxiliary data structure for algorithms
- component of other data structures
Stack in the Jave Virtual Machine (JVM)
The JVM keeps track of the chain of active methods with a stack:
- each frame stores local variables and the program counter

Example:

  main()     { int i = 5; foo(i); }
  foo(int j) { int k; k = j + 1; bar(k); }
  bar(int m) { ... }

While bar is executing, the stack holds (top to bottom):

  bar:  PC = 1, m = 6
  foo:  PC = 3, j = 5, k = 6
  main: PC = 2, i = 5
Array-based Implementation of Stacks
- we fix the maximal size N of the stack in advance
- elements are added to the array from left to right
- a variable t points to the top of the stack

  S:  3  2  5  4  1  4
      0  1  2  …     t

- initially t = −1, the stack is empty
Array-based Implementation of Stacks
size():
  return t + 1

push(o):
  if size() == N then
    throw FullStackException
  else
    t = t + 1
    S[t] = o

(throws an exception if the stack is full)
Array-based Implementation of Stacks
pop():
  if size() == 0 then
    throw EmptyStackException
  else
    o = S[t]
    t = t − 1
    return o

(throws an exception if the stack is empty)
Array-based Implementation of Stacks, Summary
Performance:
- every operation runs in O(1)

Limitations:
- the maximum size of the stack must be chosen a priori
- trying to add more than N elements causes an exception

These limitations do not hold for stacks in general:
- only for this specific implementation
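The array-based stack can be sketched as a small Python class (a sketch following the slides; the exception names stand in for the slides' FullStackException/EmptyStackException):

```python
class ArrayStack:
    """Fixed-capacity array-based stack; all operations O(1)."""
    def __init__(self, N):
        self.N = N
        self.S = [None] * N
        self.t = -1                 # index of top element; -1 means empty

    def size(self):
        return self.t + 1

    def is_empty(self):
        return self.t == -1

    def push(self, o):
        if self.size() == self.N:
            raise OverflowError("stack is full")
        self.t += 1
        self.S[self.t] = o

    def pop(self):
        if self.size() == 0:
            raise IndexError("stack is empty")
        o = self.S[self.t]
        self.t -= 1
        return o

    def top(self):
        if self.size() == 0:
            raise IndexError("stack is empty")
        return self.S[self.t]
```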
The Queue ADT
- stores arbitrary objects: first-in first-out principle (FIFO)
- main operations:
  - enqueue(o): inserts object o at the end of the queue
  - dequeue(): removes & returns the object at the beginning of the queue. An error occurs if the queue is empty.

[Figure: enqueue(3) appends 3 at the rear; dequeue() removes and returns the front element 7]

- auxiliary operations:
  - isEmpty(): indicates whether the queue is empty
  - size(): returns the number of elements in the queue
  - front(): returns the first element without removing it. An error occurs if the queue is empty.
Applications of Queues
Direct applications:
- waiting lists (bureaucracy)
- access to shared resources (CPU, printer, . . . )

Indirect applications:
- auxiliary data structure for algorithms
- component of other data structures
Array-based Implementation of Queues
- we fix the maximal size N − 1 of the queue in advance
- uses an array of size N in circular fashion:
  - variable f points to the front of the queue
  - variable r points one behind the rear of the queue (that is, array location r is kept empty)

[Figure: normal configuration (f before r), wrapped configuration (r before f), and the empty queue (f == r)]
Array-based Implementation of Queues
size():
  return (N + r − f) mod N

isEmpty():
  return size() == 0
Array-based Implementation of Queues
enqueue(o):
  if size() == N − 1 then
    throw FullQueueException
  else
    Q[r] = o
    r = (r + 1) mod N

(throws an exception if the queue is full)
Array-based Implementation of Queues
dequeue():
  if size() == 0 then
    throw EmptyQueueException
  else
    o = Q[f]
    f = (f + 1) mod N
    return o

(throws an exception if the queue is empty)
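The circular-array queue as a Python sketch (again following the slides; exception names are stand-ins):

```python
class ArrayQueue:
    """Circular array-based queue of capacity N - 1; all operations O(1)."""
    def __init__(self, N):
        self.N = N
        self.Q = [None] * N
        self.f = 0                  # index of the front element
        self.r = 0                  # one behind the rear; Q[r] stays empty

    def size(self):
        return (self.N + self.r - self.f) % self.N

    def is_empty(self):
        return self.size() == 0

    def enqueue(self, o):
        if self.size() == self.N - 1:
            raise OverflowError("queue is full")
        self.Q[self.r] = o
        self.r = (self.r + 1) % self.N   # wrap around the array end

    def dequeue(self):
        if self.size() == 0:
            raise IndexError("queue is empty")
        o = self.Q[self.f]
        self.f = (self.f + 1) % self.N
        return o
```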
What is a Tree?
- stores elements in a hierarchical structure
- consists of nodes in a parent-child relation
- children are ordered (we speak about ordered trees)

[Figure: an example tree with a root node and ordered children]

Applications of trees:
- file systems, databases
- organization charts
Tree Terminology
- Root: node without a parent (A)
- Inner node: at least one child (A, B, C, D, F)
- Leaf (external node): no children (H, I, E, J, G)
- Depth of a node: length of the path to the root
- Height of the tree: maximum depth of any node (3)

[Figure: example tree — root A with children B, C; B has children D, E; D has children H, I; C has children F, G; F has child J]

- Ancestors:
  - A is the parent of C
  - B is the grandparent of H
- Descendants:
  - C is a child of A
  - H is a grandchild of B
- Subtree:
  - a node plus all its descendants
The Tree ADT
- Accessor operations:
  - root(): returns the root of the tree
  - children(v): returns the list of children of node v
  - parent(v): returns the parent of node v. An error occurs (an exception is thrown) if v is the root.
- Generic operations:
  - size(): returns the number of nodes in the tree
  - isEmpty(): indicates whether the tree is empty
  - elements(): returns the set of all elements of the tree
  - positions(): returns the set of all positions of the tree
- Query methods:
  - isInternal(v): indicates whether v is an inner node
  - isExternal(v): indicates whether v is a leaf
  - isRoot(v): indicates whether v is the root node
Preorder Traversal
A traversal visits the nodes of a tree in a systematic manner.

In a preorder traversal:
- a node is visited before its descendants.

preOrder(v):
  visit(v)
  for each child w of v do
    preOrder(w)

[Figure: preorder visit order on the example tree: A, B, D, H, I, E, C, F, J, G]
Postorder Traversal
In a postorder traversal:
- a node is visited after its descendants.

postOrder(v):
  for each child w of v do
    postOrder(w)
  visit(v)

[Figure: postorder visit order on the example tree: H, I, D, E, B, J, F, G, C, A]
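Both traversals can be sketched in a few lines of Python (the Node class is a minimal stand-in; the example tree follows the slides):

```python
class Node:
    """Minimal ordered-tree node: an element plus a list of children."""
    def __init__(self, element, children=None):
        self.element = element
        self.children = children or []

def pre_order(v, visit):
    visit(v)                      # visit the node before its descendants
    for w in v.children:
        pre_order(w, visit)

def post_order(v, visit):
    for w in v.children:          # visit the node after its descendants
        post_order(w, visit)
    visit(v)

# the example tree from the slides
tree = Node("A", [
    Node("B", [Node("D", [Node("H"), Node("I")]), Node("E")]),
    Node("C", [Node("F", [Node("J")]), Node("G")]),
])
```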
Binary Trees
A binary tree is a tree with the property:
- each inner node has exactly two children

We call the children of an inner node its left and right child.

Applications of binary trees:
- arithmetic expressions
- decision processes
- searching

[Figure: example binary tree — A has children B, C; B has children D, E; D has children H, I; C has children F, G]
Binary Trees and Arithmetic Expressions
Binary trees can represent arithmetic expressions:
- inner nodes are operators
- leaves are operands

Example: (3 × (a − 2)) + (b/4)

[Figure: expression tree — root +; left subtree × with children 3 and − (which has children a, 2); right subtree / with children b, 4]

(a postorder traversal can be used to evaluate an expression)
Binary Decision Trees
A binary tree can represent a decision process:
- inner nodes are questions with yes/no answers
- leaves are decisions

[Figure: example decision tree with questions A–D at the inner nodes and results 1–5 at the leaves]
Properties of Binary Trees
Notation:
- n: number of nodes
- i: number of internal nodes
- e: number of leaves
- h: height

Properties:
- e = i + 1
- n = 2e − 1
- h ≤ i
- h ≤ (n − 1)/2
- e ≤ 2^h
- h ≥ log2 e
- h ≥ log2(n + 1) − 1
Binary Tree ADT
- Inherits all methods from the Tree ADT.
- Additional methods:
  - leftChild(v): returns the left child of v
  - rightChild(v): returns the right child of v
  - sibling(v): returns the sibling of v

(exceptions are thrown if the left/right child or sibling does not exist)
Inorder Traversal
In an inorder traversal:
- a node is visited after its left and before its right subtree.

Application: drawing binary trees
- x(v) = inorder rank of v
- y(v) = depth of v

inOrder(v):
  if isInternal(v) then
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v) then
    inOrder(rightChild(v))

[Figure: inorder visit order on the example binary tree: H, D, I, B, E, A, F, C, G]
Inorder Traversal: Printing Arithmetic Expressions
A specialization of the inorder traversal for printing expressions:

printExpression(v):
  if isInternal(v) then
    print(‘(’)
    printExpression(leftChild(v))
  print(v.element())
  if isInternal(v) then
    printExpression(rightChild(v))
    print(‘)’)

Example: printExpression applied to the expression tree for (3 × (a − 2)) + (b/4) yields

  ((3 × (a − 2)) + (b/4))
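A Python sketch of printExpression; the Node class is a hypothetical minimal binary-tree node (element, left, right; leaves have no children), and the function returns the string rather than printing piecewise:

```python
class Node:
    """Minimal binary-tree node; leaves have left == right == None."""
    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right

def print_expression(v):
    """Return the fully parenthesised string for expression tree v."""
    if v.left is None:                 # external node: an operand
        return str(v.element)
    return ("(" + print_expression(v.left) + str(v.element)
            + print_expression(v.right) + ")")

# tree for (3 * (a - 2)) + (b / 4)
t = Node("+", Node("*", Node(3), Node("-", Node("a"), Node(2))),
              Node("/", Node("b"), Node(4)))
```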
Postorder Traversal: Evaluating Expressions
Postorder traversal for evaluating arithmetic expressions:

eval(v):
  if isExternal(v) then
    return v.element()
  else
    x = eval(leftChild(v))
    y = eval(rightChild(v))
    ◊ = operator stored at v
    return x ◊ y

[Figure: expression tree for (3 × (4 − 2)) + (8/4)]
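The evaluation can be sketched the same way (the Node class is the same hypothetical binary-tree node, redefined here so the example is self-contained):

```python
class Node:
    """Minimal binary-tree node; leaves have left == right == None."""
    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right

OPS = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
       "*": lambda x, y: x * y, "/": lambda x, y: x / y}

def evaluate(v):
    """Postorder evaluation: children first, then the operator at v."""
    if v.left is None:                 # external node: a number
        return v.element
    x = evaluate(v.left)
    y = evaluate(v.right)
    return OPS[v.element](x, y)

# tree for (3 * (4 - 2)) + (8 / 4)
t = Node("+", Node("*", Node(3), Node("-", Node(4), Node(2))),
              Node("/", Node(8), Node(4)))
```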
Euler Tour Traversal
- Generic traversal of a binary tree:
  - preorder, postorder and inorder are special cases
- Walk around the tree; each node is visited three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)

[Figure: Euler tour around the expression tree for (3 × (5 − 2)) + (8/4), with visits on the left (L), from below (B), and on the right (R)]
Vector ADT
A Vector stores a list of elements:
- access via rank/index.

Accessor methods: elementAtRank(r)
Update methods: replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)
Generic methods: size(), isEmpty()

Here r is of type integer and o is an object (data).
List ADT
A List consists of a sequence of nodes:
- the nodes store arbitrary objects
- access via the before/after relation between nodes

[Figure: four linked nodes holding element 1 … element 4]

Accessor methods: first(), last(), before(n), after(n)
Query methods: isFirst(n), isLast(n)
Generic methods: size(), isEmpty()
Update methods: replaceElement(n, o), swapElements(n, m), insertBefore(n, o), insertAfter(n, o), insertFirst(o), insertLast(o), remove(n)

Here r is of type integer, n, m are nodes, and o is an object (data).
Sequence ADT
- The Sequence ADT is the union of the Vector and List ADTs:
  - inherits all methods
- Additional methods:
  - atRank(r): returns the node at rank r
  - rankOf(n): returns the rank of node n
- Elements can be accessed by:
  - rank/index, or
  - navigation between nodes

Remarks
The distinction between Vector, List and Sequence is artificial:
- every element in a list naturally has a rank

Important are the different access methods:
- access via rank/index, or navigation between nodes
Singly Linked List
- A singly linked list provides an implementation of the List ADT.
- Each node stores:
  - an element (data)
  - a link to the next node
- A variable first (and an optional variable last) points to the first/last node.
- An optional variable size stores the size of the list.

[Figure: first points at the head node; each node links to the next; the last node’s next link is ∅]
Singly Linked List: size
If we store the size in the variable size, then:

size():
  return size

Running time: O(1).

If we do not store the size in a variable, then:

size():
  s = 0
  m = first
  while m != ∅ do
    s = s + 1
    m = m.next
  done
  return s

Running time: O(n).
Singly Linked List: insertAfter
insertAfter(n, o):
  x = new node with element o
  x.next = n.next
  n.next = x
  if n == last then last = x
  size = size + 1

[Figure: the new node x with element o is linked in between n and its former successor]

Running time: O(1).
Singly Linked List: insertFirst
insertFirst(o):
  x = new node with element o
  x.next = first
  first = x
  if last == ∅ then last = x
  size = size + 1

[Figure: the new node x becomes the first node, pointing at the old first node]

Running time: O(1).
Singly Linked List: insertBefore
insertBefore(n, o):
  if n == first then
    insertFirst(o)
  else
    m = first
    while m.next != n do
      m = m.next
    done
    insertAfter(m, o)

(the error check m != ∅ has been left out for simplicity)

We need to find the node m before n:
- requires a search through the list

Running time: O(n), where n is the number of elements in the list.
(in the worst case we have to search through the whole list)
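The singly linked list operations above, sketched as a small Python class (names follow the slides):

```python
class SNode:
    """Node of a singly linked list."""
    def __init__(self, element, next=None):
        self.element = element
        self.next = next

class SinglyLinkedList:
    def __init__(self):
        self.first = None
        self.last = None
        self.size = 0

    def insert_first(self, o):                 # O(1)
        x = SNode(o, self.first)
        self.first = x
        if self.last is None:
            self.last = x
        self.size += 1

    def insert_after(self, n, o):              # O(1)
        x = SNode(o, n.next)
        n.next = x
        if n is self.last:
            self.last = x
        self.size += 1

    def insert_before(self, n, o):             # O(n): search for predecessor
        if n is self.first:
            self.insert_first(o)
        else:
            m = self.first
            while m.next is not n:
                m = m.next
            self.insert_after(m, o)

    def elements(self):
        m, out = self.first, []
        while m is not None:
            out.append(m.element)
            m = m.next
        return out
```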
Singly Linked List: Performance
  Operation                      Worst case complexity
  size, isEmpty                  O(1) (1)
  first, last, after             O(1) (2)
  before                         O(n)
  replaceElement, swapElements   O(1)
  insertFirst, insertLast        O(1) (2)
  insertAfter                    O(1)
  insertBefore                   O(n)
  remove                         O(n) (3)
  atRank, rankOf, elemAtRank     O(n)
  replaceAtRank                  O(n)
  insertAtRank, removeAtRank     O(n)

(1) size needs O(n) if we do not store the size in a variable.
(2) last and insertLast need O(n) if we have no variable last.
(3) remove(n) runs in O(1) in the best case, if n == first.
Stack with a Singly Linked List
We can implement a stack with a singly linked list:
- the top element is stored at the first node

Each stack operation runs in O(1) time:
- push(o): insertFirst(o)
- pop():
    o = first().element
    remove(first())
    return o

[Figure: first (top) points at the head of the list]
Queue with a Singly Linked List
We can implement a queue with a singly linked list:
- the front element is stored at the first node
- the rear element is stored at the last node

Each queue operation runs in O(1) time:
- enqueue(o): insertLast(o)
- dequeue():
    o = first().element
    remove(first())
    return o

[Figure: first (front) points at the head, last (rear) at the tail of the list]
Doubly Linked List
- A doubly linked list provides an implementation of the List ADT.
- Each node stores:
  - an element (data)
  - a link to the next node
  - a link to the previous node
- Special header and trailer sentinel nodes.
- A variable size stores the size of the list.

[Figure: header and trailer sentinels enclosing nodes element 1 … element 4, linked in both directions]
Doubly Linked List: insertAfter
insertAfter(n, o):
  m = n.next
  x = new node with element o
  x.prev = n
  x.next = m
  n.next = x
  m.prev = x
  size = size + 1

[Figure: the new node x is linked in between n and m]
Doubly Linked List: insertBefore
insertBefore(n, o):
  insertAfter(n.prev, o)

[Figure: the new node x is linked in between n.prev and n]
Doubly Linked List: remove
remove(n):
  p = n.prev
  q = n.next
  p.next = q
  q.prev = p
  size = size − 1

[Figure: node n is unlinked by connecting p and q directly]
Doubly Linked List: Performance
  Operation                      Worst case complexity
  size, isEmpty                  O(1)
  first, last, after             O(1)
  before                         O(1)
  replaceElement, swapElements   O(1)
  insertFirst, insertLast        O(1)
  insertAfter                    O(1)
  insertBefore                   O(1)
  remove                         O(1)
  atRank, rankOf, elemAtRank     O(n)
  replaceAtRank                  O(n)
  insertAtRank, removeAtRank     O(n)

Now all operations of the List ADT are O(1):
- only the operations accessing via index/rank are O(n)
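The doubly linked list with sentinels can be sketched compactly; note how insertBefore and remove become O(1) because the prev links make the neighbours directly reachable:

```python
class DNode:
    """Node of a doubly linked list."""
    def __init__(self, element=None, prev=None, next=None):
        self.element = element
        self.prev = prev
        self.next = next

class DoublyLinkedList:
    """Doubly linked list with header/trailer sentinel nodes."""
    def __init__(self):
        self.header = DNode()
        self.trailer = DNode(prev=self.header)
        self.header.next = self.trailer
        self.size = 0

    def insert_after(self, n, o):      # O(1)
        m = n.next
        x = DNode(o, prev=n, next=m)
        n.next = x
        m.prev = x
        self.size += 1
        return x

    def insert_before(self, n, o):     # O(1): reuse insert_after
        return self.insert_after(n.prev, o)

    def remove(self, n):               # O(1): unlink n
        p, q = n.prev, n.next
        p.next = q
        q.prev = p
        self.size -= 1
        return n.element

    def elements(self):
        m, out = self.header.next, []
        while m is not self.trailer:
            out.append(m.element)
            m = m.next
        return out
```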
Amortization
Amortization is a tool for understanding algorithm complexity when steps have widely varying performance:
- analyses the running time of a series of operations
- takes into account the interaction between different operations

The amortized running time of an operation in a series of operations is the worst-case running time of the series divided by the number of operations.

Example:
- N operations take 1s each,
- 1 operation takes N s.
Amortized running time:

  (N · 1 + N)/(N + 1) ≤ 2

That is: O(1) per operation.
Amortization Techniques: Accounting Method
The accounting method uses a scheme of credits and debits to keep track of the running time of operations in a sequence:
- we pay one cyber-dollar for a constant amount of computation time

We charge for every operation some amount of cyber-dollars:
- this amount is the amortized running time of the operation

When an operation is executed, we need enough cyber-dollars saved to pay for its running time.

For the example above we charge:
- 2 cyber-dollars per operation

The first N operations consume 1 dollar each, so for the last operation we have N dollars saved.

Amortized O(1) per operation.
Amortization, Example: Clearable Table
A Clearable Table supports the operations:
- add(o): adds an object to the table
- clear(): empties the table

Implementation using an array:
- add(o) ∈ O(1)
- clear() ∈ O(m), where m is the number of elements in the table
  (clear() removes the entries of the table one by one)
Amortization, Example: Clearable Table
[Figure: a sequence of add operations interleaved with clear() calls; each operation is charged 2 cyber-dollars]

We charge 2 per operation:
- add consumes 1; thus we save 1 per add.

Whenever m elements are in the table, we have saved m dollars:
- we use the m dollars to pay for clear() ∈ O(m)

Thus the amortized cost per operation is 2 dollars, that is, O(1).
Array-based Implementation of Lists
Uses an array in circular fashion (as for queues):
- variable f points to the first position
- variable l points one behind the last position

Adding elements runs in O(1) as long as the array is not full.

When the array A is full:
- we need to allocate a larger array B
- copy all elements from A to B
- set A = B (that is, from then on work with B)
Array-based Implementation: Constant Increase
Each time the array is full we increase its size by k elements:
- creating B of size n + k and copying n elements is O(n)

Example: k = 3

[Figure: the running time spikes at every k-th insert operation, with spikes growing linearly]

Every k-th insert operation we need to resize the array.

Worst-case complexity for a sequence of n insert operations:

  Σ_{i=1}^{⌊n/k⌋} k · i ∈ O(n^2)

Average cost of each operation in the sequence: O(n).
Array-based Implementation: Doubling the Size
Each time the array is full we double its size:
- creating B of size 2n and copying n elements is O(n)

[Figure: the running time spikes at each doubling; each insert is charged 3 cyber-dollars, and the saved dollars pay for the next resize]
Array-based Implementation: Doubling the Size
Each time the array is full we double its size:
- insert costs O(1) if the array is not full
- insert costs O(n) if the array is full (for the resize n → 2n)

We charge 3 cyber-dollars per insert.

After doubling the size n → 2n:
- we have at least n inserts without a resize
- so we save 2n cyber-dollars before the next resize
Thus we have 2n cyber-dollars to pay for the next resize 2n → 4n.

Hence the amortized cost of insert is 3: O(1).

(this can also be used for fast queue and stack implementations)
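The doubling strategy can be made concrete with a small instrumented sketch; the copies counter records the total O(n) copy work done by resizes, which stays below 2 per inserted element, matching the amortized analysis:

```python
class DynamicArray:
    """Grow-able array that doubles its capacity when full (a sketch)."""
    def __init__(self):
        self.capacity = 1
        self.A = [None] * self.capacity
        self.n = 0
        self.copies = 0               # total elements copied by resizes

    def append(self, o):
        if self.n == self.capacity:   # full: allocate double-size array B
            B = [None] * (2 * self.capacity)
            for i in range(self.n):   # copy all n elements: O(n)
                B[i] = self.A[i]
                self.copies += 1
            self.A = B
            self.capacity *= 2
        self.A[self.n] = o            # the insert itself: O(1)
        self.n += 1

d = DynamicArray()
for i in range(1000):
    d.append(i)
```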
Array-based Lists: Performance
The performance under the assumption of the doubling strategy.

  Operation                      Amortized complexity
  size, isEmpty                  O(1)
  first, last, after             O(1)
  before                         O(1)
  replaceElement, swapElements   O(1)
  insertFirst, insertLast        O(1)
  insertAfter                    O(n)
  insertBefore                   O(n)
  remove                         O(n)
  atRank, rankOf, elemAtRank     O(1)
  replaceAtRank                  O(1)
  insertAtRank, removeAtRank     O(n)
Comparison of the Amortized Complexities
  Operation                      Singly  Doubly  Array
  size, isEmpty                  O(1)    O(1)    O(1)
  first, last, after             O(1)    O(1)    O(1)
  before                         O(n)    O(1)    O(1)
  replaceElement, swapElements   O(1)    O(1)    O(1)
  insertFirst, insertLast        O(1)    O(1)    O(1)
  insertAfter                    O(1)    O(1)    O(n)
  insertBefore                   O(n)    O(1)    O(n)
  remove                         O(n)    O(1)    O(n)
  atRank, rankOf, elemAtRank     O(n)    O(n)    O(1)
  replaceAtRank                  O(n)    O(n)    O(1)
  insertAtRank, removeAtRank     O(n)    O(n)    O(n)

- singly linked lists (Singly)
- doubly linked lists (Doubly)
- array-based implementation with doubling strategy (Array)
Iterators
- An iterator allows traversal of the elements in a list or set.
- The Iterator ADT provides the following methods:
  - object(): returns the current object
  - hasNext(): indicates whether there are more elements
  - nextObject(): goes to the next object and returns it
Priority Queue ADT
A priority queue stores a list of items:
- the items are pairs (key, element)
- the key represents the priority: smaller key = higher priority

Main methods:
- insertItem(k, o): inserts element o with key k
- removeMin(): removes the item (k, o) with the smallest key and returns its element o

Additional methods:
- minKey(): returns, but does not remove, the smallest key
- minElement(): returns, but does not remove, the element with the smallest key
- size(), isEmpty()
Applications of Priority Queues
Direct applications:
I bandwidth management in routers
I task scheduling (execution of tasks in priority order)
Indirect applications:
I auxiliary data structure for algorithms, e.g.:
I shortest path in graphs
I component of other data structures
Priority Queues: Order on the Keys
I Keys can be arbitrary objects with a total order.
I Two distinct items in a queue may have the same key.
A relation ≤ is called an order if for all x, y, z:
I reflexivity: x ≤ x
I transitivity: x ≤ y and y ≤ z implies x ≤ z
I antisymmetry: x ≤ y and y ≤ x implies x = y
Implementation of the order: external class or function
I Comparator ADT
I isLessThan(x, y), isLessOrEqualTo(x, y)
I isEqual(x, y)
I isGreaterThan(x, y), isGreaterOrEqualTo(x, y)
I isComparable(x)
Sorting with Priority Queues
We can use a priority queue for sorting as follows:
I Insert all elements e stepwise with insertItem(e, e)
I Remove the elements in sorted order using removeMin()
Algorithm PriorityQueueSort(A, C):
Input: List A, Comparator C
Output: List A sorted in ascending order

  P = new priority queue with comparator C
  while ¬A.isEmpty() do
    e = A.remove(A.first())
    P.insertItem(e, e)
  while ¬P.isEmpty() do
    A.insertLast(P.removeMin())

Running time depends on:
I the implementation of the priority queue
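The algorithm above can be sketched in Python; as an assumption not made by the slides, we let the standard-library `heapq` module stand in for the priority queue (so keys are compared with Python's built-in order rather than a Comparator).

```python
import heapq

def priority_queue_sort(a):
    """PriorityQueueSort sketch: empty the list into a priority queue,
    then remove the elements again in ascending key order."""
    pq = []
    while a:                         # phase 1: insertItem(e, e) for all e
        e = a.pop(0)
        heapq.heappush(pq, e)
    while pq:                        # phase 2: removeMin() until empty
        a.append(heapq.heappop(pq))
    return a
```

With a heap as the priority queue this runs in O(n log n); with an unsorted or sorted list it degrades to O(n²), as discussed on the following slides.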
List-based Priority Queues
Implementation with an unsorted List:
5 7 3 4 1

I Performance:
I insertItem is O(1) since we can insert at the front or the end
I removeMin, minKey and minElement are O(n) since wehave to search the whole list for the smallest key
Implementation with a sorted list:

1 3 4 5 7

I Performance:
I insertItem is O(n)
I for singly/doubly linked list O(n) for finding the insert position
I for array-based list O(n) for insertAtRank
I removeMin, minKey and minElement are O(1) since the smallest key is at the front of the list
Sorting
I Given is a sequence of pairs
(k1, e1), (k2, e2), . . . , (kn, en)
of elements ei with keys ki and an order ≤ on the keys.
keys:     4   2   3   1   5
elements: e1  e2  e3  e4  e5
I We search a permutation π of the pairs such that the keys kπ(1) ≤ kπ(2) ≤ . . . ≤ kπ(n) are in ascending order.
keys:     1   2   3   4   5
elements: e4  e2  e3  e1  e5
Selection-Sort
Selection-sort takes an unsorted list A and sorts as follows:
I search the smallest element of A and swap it with the first
I afterwards continue sorting the remainder of A
5 4 3 7 1
1 4 3 7 5
1 3 4 7 5
1 3 4 7 5
1 3 4 5 7
1 3 4 5 7
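The steps above can be sketched in Python as a minimal in-place selection sort (our own function name, not from the slides):

```python
def selection_sort(a):
    """In-place selection sort: repeatedly swap the minimum of the
    unsorted part to the front of that part."""
    n = len(a)
    for i in range(n):
        m = i                          # index of the minimal element so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]        # swap minimum into the sorted part
    return a
```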
Selection-Sort: Properties
I Time complexity (best, average and worst-case) O(n²):

  n + (n − 1) + . . . + 1 = (n² + n)/2 ∈ O(n²)
(caused by searching the minimal element)
I Selection-sort is an in-place sorting algorithm.
In-Place Algorithm
An algorithm is in-place if, apart from the space for the input data, only a constant amount of space is used: space complexity O(1).
Stable Sorting Algorithms
Stable Sorting Algorithm
A sorting algorithm is called stable if the order of items with equal key is preserved.
Example: not stable

keys:     3 2 2        2 2 3
elements: A B C   →    C B A

(B and C exchanged their order, although they have equal keys)

Example: stable

keys:     3 2 2        2 2 3
elements: A B C   →    B C A
Stable Sorting Algorithms
Applications of stable sorting:
I preserving the original order of elements with equal key
For example:
I we have an alphabetically sorted list of names
I we want to sort by date of birth while keeping the alphabetical order for persons with the same birthday
Selection-sort is stable if
I we always select the first (leftmost) minimal element
Insertion-Sort
Insertion-sort takes an unsorted list A and sorts as follows:
I distinguish a sorted and an unsorted part of A
I in each step we remove an element from the unsorted part, and insert it at the correct position in the sorted part
5 3 4 7 1
5 3 4 7 1
3 5 4 7 1
3 4 5 7 1
3 4 5 7 1
1 3 4 5 7
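The steps above can be sketched in Python; the sorted prefix is scanned from the end, which gives the best case O(n) on already-sorted input (function name is ours):

```python
def insertion_sort(a):
    """In-place insertion sort; a[:i] is the sorted part."""
    for i in range(1, len(a)):
        e = a[i]                       # next element from the unsorted part
        j = i - 1
        while j >= 0 and a[j] > e:     # shift strictly larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = e                   # insert behind equal keys: stable
    return a
```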
Insertion-Sort: Complexity
I Time complexity worst-case O(n²):

  1 + 2 + . . . + n = (n² + n)/2 ∈ O(n²)

(searching the insertion position together with inserting)
I Time complexity best-case O(n):
I if list is already sorted, and
I we start searching insertion position from the end
I More generally: time complexity O(n · (n − d + 1))
I if the first d elements are already sorted
I Insertion-sort is an in-place sorting algorithm:
I space complexity O(1)
Insertion-Sort: Properties
I Simple implementation.
I Efficient for:
I small lists
I big lists of which a large prefix is already sorted
I Insertion-sort is stable if:
I we always pick the first element from the unsorted part
I we always insert behind all elements with equal keys
I Insertion-sort can be used online:
I it does not need all the data at once,
I it can sort a list while receiving it
Heaps
A heap is a binary tree storing keys at its inner nodes such that:
I if A is a parent of B, then key(A) ≤ key(B)
I the heap is a complete binary tree: let h be the heap height
I for i = 0, . . . , h − 1 there are 2^i nodes at depth i
I at depth h − 1 the inner nodes are left of the external nodes
We call the rightmost inner node at depth h − 1 the ‘last node’.
      2
     / \
    5   6
   / \
  9   7  ← last node
Height of Heaps
Theorem
A heap storing n keys has height O(log2 n).

Proof.
A heap of height h contains 2^i nodes at every depth i = 0, . . . , h − 2, and at least one node at depth h − 1. Thus n ≥ 1 + 2 + 4 + . . . + 2^(h−2) + 1 = 2^(h−1). Hence h ≤ 1 + log2 n.
depth    keys
0        1
1        2
i        2^i
h − 1    ≥ 1
Heaps and Priority Queues
We can use a heap to implement a priority queue:
I the inner nodes store (key, element) pairs
I a variable last points to the last node

         (2, Sue)
        /        \
   (5, Pat)    (6, Mark)
   /      \
(9, Jeff) (7, Anna) ← last
I For convenience, in the sequel, we only show the keys.
Heaps: Insertion
The insertion of a key k into the heap consists of 3 steps:
I Find the insertion node z (the new last node).
I Expand z to an internal node and store k at z.
I Restore the heap-order property (see following slides).
      2
     / \
    5   6
   / \   \
  9   7   (insertion node z)

Example: insertion of key 1 (without restoring heap-order)

      2
     / \
    5   6
   / \  /
  9  7 1 ← z
Heaps: Insertion, Upheap
After insertion of k the heap-order may be violated.
We restore the heap-order using the upheap algorithm:
I we swap k upwards along the path to the root as long as the parent of k has a larger key
Time complexity is O(log2 n) since the heap height is O(log2 n).
      2                        1
     / \                      / \
    5   6      upheap        5   2
   / \  /      ======>      / \  /
  9  7 1                   9  7 6

Now the heap-order property is restored.
Heaps: Insert, Finding the Insertion Position
An algorithm for finding the insertion position (new last node):
I start from the current last node
I while the current node is a right child, go to the parent node
I if the current node is a left child, go to its right sibling
I while the current node has a left child, go to the left child
Time complexity is O(log2 n) since the heap height is O(log2 n).
(we walk at most once completely up and down again)
Heaps: Removal of the Root
The removal of the root consists of 3 steps:
I Replace the root key with the key of the last node w.
I Compress w and its children into a leaf.
I Restore the heap-order property (see following slides).
      2
     / \
    5   6
   / \
  9   7 ← w

Example: removal of the root (without restoring heap-order)

      7
     / \
    5   6
   /
  9
Heaps: Removal, Downheap
Replacing the root key by k may violate the heap-order.
We restore the heap-order using the downheap algorithm:
I we swap k with its smallest child as long as a child of k has a smaller key
Time complexity is O(log2 n) since the heap height is O(log2 n).
      7                        5
     / \       downheap       / \
    5   6      =======>      7   6
   /                        /
  9                        9

Now the heap-order property is restored. The new last node can be found similarly to finding the insertion position (but now walking in the opposite direction).
Heaps: Removal, Finding the New Last Node
After the removal we have to find the new last node:
I start from the old last node (which has been removed)
I while the current node is a left child, go to the parent node
I if the current node is a right child, go to its left sibling
I while the current node has a right child that is not a leaf, go to the right child
Time complexity is O(log2 n) since the heap height is O(log2 n).
(we walk at most once completely up and down again)
Heap-Sort
We implement a priority queue by means of a heap:
I insertItem(k, e) corresponds to adding (k, e) to the heap
I removeMin() corresponds to removing the root of the heap
Performance:
I insertItem(k, e) and removeMin() run in O(log2 n) time
I size(), isEmpty(), minKey(), and minElement() are O(1)
Heap-sort is O(n log2 n)
Using a heap-based priority queue we can sort a list of n elements in O(n · log2 n) time (n times insert + n times removeMin).
Thus heap-sort is much faster than quadratic sorting algorithms(e.g. selection sort).
Vector-based Heap Implementation
We can represent a heap with n keys by a vector of size n + 1:
      2
     / \
    5   6
   / \
  9   7

rank: 0  1  2  3  4  5
key:  –  2  5  6  9  7

I The root node has rank 1 (the cell at rank 0 is not used).
I For a node at rank i:
I the left child is at rank 2i
I the right child is at rank 2i + 1
I the parent (if i > 1) is located at rank ⌊i/2⌋
I Leaves and links between the nodes are not stored explicitly.
Vector-based Heap Implementation, continued
We can represent a heap with n keys by a vector of size n + 1:
I The last element in the heap has rank n, thus:
I insertItem corresponds to inserting at rank n + 1
I removeMin corresponds to removing at rank n
I Yields in-place heap-sort (space complexity O(1)):
I uses a max-heap (largest element on top)
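The vector representation with upheap and downheap can be sketched as follows; the class and method names are ours. A min-heap is shown (for in-place heap-sort one would flip the comparisons to get a max-heap).

```python
class VectorHeap:
    """Min-heap in a vector; the root is at rank 1, cell 0 is unused."""

    def __init__(self):
        self._v = [None]               # cell at rank 0 is not used

    def size(self):
        return len(self._v) - 1

    def insert_item(self, k):
        self._v.append(k)              # new last node at rank n + 1
        i = len(self._v) - 1
        while i > 1 and self._v[i // 2] > self._v[i]:   # upheap
            self._v[i], self._v[i // 2] = self._v[i // 2], self._v[i]
            i //= 2

    def remove_min(self):
        v = self._v
        k = v[1]
        v[1] = v[-1]                   # move the last key to the root
        v.pop()
        i, n = 1, len(v) - 1
        while 2 * i <= n:              # downheap: swap with smallest child
            c = 2 * i
            if c + 1 <= n and v[c + 1] < v[c]:
                c += 1
            if v[i] <= v[c]:
                break
            v[i], v[c] = v[c], v[i]
            i = c
        return k
```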
Merging two Heaps
We are given two heaps h1, h2 and a key k:
I create a new heap with root k and h1, h2 as children
I we perform downheap to restore the heap-order

h1:   2       h2:   3       k = 7
     / \           / \
    6   5         4   5

           7
         /   \
        2     3
       / \   / \
      6   5 4   5
After downheap the heap-order is restored:

           2
         /   \
        5     3
       / \   / \
      6   7 4   5
Bottom-up Heap Construction
We have n keys and want to construct a heap from them.

Possibility one:
I start from the empty heap and use n times insert
I needs O(n log2 n) time

Possibility two: bottom-up heap construction
I for simplicity we assume n = 2^h − 1 (for some h)
I take 2^(h−1) elements and turn them into heaps of size 1
I for phase i = 1, . . . , log2 n:
I merge the heaps of size 2^i − 1 into heaps of size 2^(i+1) − 1

  (2^i − 1)  +  (2^i − 1)   --merge-->   2^(i+1) − 1
Bottom-up Heap Construction, Example
We construct a heap from the following 2^4 − 1 = 15 elements:

16, 15, 4, 12, 6, 9, 23, 20, 25, 5, 11, 27, 7, 8, 10
16 15 4 12 6 9 23 20
Merging pairs with the roots 25, 5, 11, 27 and performing downheap yields heaps of size 3:

   15       4        6       20
  /  \     / \      / \     /  \
 16   25  5   12  11   9  23    27

Merging pairs with the roots 7 and 8 and performing downheap yields heaps of size 7:

        4                 6
      /   \             /   \
    15     5           8     20
    / \   / \         / \   /  \
  16  25 7   12     11   9 23   27

Finally, merging with the root 10 and performing downheap yields:

              4
           /     \
          5       6
        /   \    /  \
      15     7  8    20
     /  \   / \ / \  / \
   16  25 10 12 11 9 23 27

We are ready: this is the final heap.
Bottom-up Heap Construction, Performance
Visualization of the worst-case of the construction:
I displays the longest possible downheap paths (may not be the actual path, but of maximal length)
I each edge is traversed at most once
I there are at most 2n such edge traversals, hence the time complexity is O(n)
I faster than n successive insertions
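Bottom-up construction can be sketched directly on a 0-based vector (children at 2i+1 and 2i+2): downheap every inner node from the last one up to the root. The function name is ours; this is the same idea as Python's `heapq.heapify`.

```python
def build_heap(keys):
    """Bottom-up heap construction in O(n): downheap at every inner
    node, processed from the last inner node up to the root."""
    v = list(keys)
    n = len(v)
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while 2 * j + 1 < n:                 # downheap at position j
            c = 2 * j + 1
            if c + 1 < n and v[c + 1] < v[c]:
                c += 1                       # pick the smaller child
            if v[j] <= v[c]:
                break
            v[j], v[c] = v[c], v[j]
            j = c
    return v
```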
Dictionary ADT
Dictionary ADT models:
I a searchable collection of key-element items
I search via the key
Main operations are searching, inserting, and deleting items:
I findElement(k): returns the element with key k if the dictionary contains such an element (otherwise returns the special element No_Such_Key)
I insertItem(k, e): inserts (k, e) into the dictionary
I removeElement(k): like findElement(k), but additionally removes the item if present
I size(), isEmpty()
I keys(), elements()
Log File
A log file is a dictionary implemented on top of an unsorted list:
I Performance:
I insertItem is O(1) (we can insert anywhere, e.g. at the end)
I findElement and removeElement are O(n) (we have to search the whole sequence in the worst case)
I Efficient for small dictionaries, or if insertions are much more frequent than searches and removals (e.g. the access log of a workstation).
Ordered Dictionaries
Ordered Dictionaries:
I Keys are assumed to come from a total order.
I New operations:
I closestKeyBefore(k)
I closestElementBefore(k)
I closestKeyAfter(k)
I closestElementAfter(k)
Binary Search
Binary search performs findElement(k) on sorted arrays:
I in each step the search space is halved
I time complexity is O(log2 n)
I for pseudo code and complexity analysis see lecture 1

Example: findElement(7)

  0  1  3  5  7  8  9  11  15  16  18
  low                              high
The probed values are 8, 3, 5, and finally 7:

  0  1  3  5  7 [8] 9  11  15  16  18     8 > 7: continue in the left half
  0  1 [3] 5  7  8  9  11  15  16  18     3 < 7: continue in the right half
  0  1  3 [5] 7  8  9  11  15  16  18     5 < 7: continue in the right half
  0  1  3  5 [7] 8  9  11  15  16  18     7 found
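The search above can be sketched as an iterative Python function over `low` and `high` (the function name is ours):

```python
def find_element(a, k):
    """Binary search on a sorted array; returns the index of k,
    or None (No_Such_Key) if k is not present."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2        # halve the search space each step
        if a[mid] == k:
            return mid
        elif a[mid] < k:
            low = mid + 1              # continue in the right half
        else:
            high = mid - 1             # continue in the left half
    return None
```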
Lookup Table
A lookup table is an (ordered) dictionary based on a sorted array:
I Performance:
I findElement takes O(log2 n) using binary search
I insertItem is O(n) (shifting n/2 items in the worst case)
I removeElement takes O(n) (shifting n/2 items in the worst case)
I Efficient for small dictionaries, or if searches are much more frequent than insertions and removals (e.g. user authentication with passwords).
Data Structure for Trees

A node is represented by an object storing:
I an element
I link to the parent node
I list of links to children nodes
        A
      / | \
     B  C  D
       / \
      E   F

(each node stores its element, a link to its parent (∅ for the root), and a list of links to its children)
Data Structure for Binary Trees

A node is represented by an object storing:
I an element
I link to the parent node
I left child
I right child
      A
     / \
    B   C
   / \
  D   E

Binary trees can also be represented by a vector, see heaps.
Binary Search Tree
A binary search tree is a binary tree such that:
I The inner nodes store keys (or key-element pairs).
(the leaves are empty, and are usually left out in the literature)
I For every node n:
I the left subtree of n contains only keys < key(n)
I the right subtree of n contains only keys > key(n)
      6
     / \
    2   9
   / \  /
  1  4 8
I Inorder traversal visits the keys in increasing order.
Binary Search Tree: Searching
Searching for a key k in a binary search tree t works as follows:
I if the root of t is an inner node with key k′:
I if k == k′, then return the element stored at the root
I if k < k′, then search in the left subtree
I if k > k′, then search in the right subtree
I if t is a leaf, then return No_Such_Key (key not found)
Example: findElement(4)
      6       4 < 6: go left
     / \
    2   9     4 > 2: go right
   / \  /
  1  4 8      4 found
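The search above can be sketched in Python; empty leaves are represented by `None`, and the node class is our own minimal choice.

```python
class Node:
    """Inner node of a binary search tree; leaves are None."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_find(t, k):
    """Search key k in BST t; returns the key if found, else None."""
    if t is None:
        return None                    # reached a leaf: No_Such_Key
    if k == t.key:
        return t.key
    elif k < t.key:
        return bst_find(t.left, k)     # search in the left subtree
    else:
        return bst_find(t.right, k)    # search in the right subtree
```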
Binary Search Tree: Minimal Key
Finding the minimal key in a binary search tree:
I walk left until the left child is a leaf

minKey(n):
  if isExternal(n) then
    return No_Such_Key
  if isExternal(leftChild(n)) then
    return key(n)
  else
    return minKey(leftChild(n))
Example: for the tree below, minKey() returns 1
      6
     / \
    2   9
   / \  /
  1  4 8
Binary Search Tree: Maximal Key
Finding the maximal key in a binary search tree:
I walk right until the right child is a leaf

maxKey(n):
  if isExternal(n) then
    return No_Such_Key
  if isExternal(rightChild(n)) then
    return key(n)
  else
    return maxKey(rightChild(n))
Example: for the tree below, maxKey() returns 9
      6
     / \
    2   9
   / \  /
  1  4 8
Binary Search Tree: Insertion
Insertion of a key k into a binary search tree t:
I search for k and remember the leaf w where we ended up (assuming that we did not find k)
I insert k at node w, and expand w to an inner node
Example: insert(5)
      6                  6
     / \                / \
    2   9      =>      2   9
   / \  /             / \  /
  1  4 8             1  4 8
                         \
                          5
Binary Search Tree: Deletion above External
The algorithm removeAboveExternal(n) removes a node n for which at least one child is a leaf (external node):
I if the left child of n is a leaf, replace n by its right subtree
I if the right child of n is a leaf, replace n by its left subtree
Example: removeAboveExternal(9)
      6                    6
     / \                  / \
    2   9      =>        2   8
   / \  /               / \  /
  1  4 8               1  4 7
      /
     7
Binary Search Tree: Deletion
The algorithm remove(n) removes a node:
I if at least one child of n is a leaf, then removeAboveExternal(n)
I if both children of n are internal nodes, then:
I find the minimal node m in the right subtree of n
I replace the key of n by the key of m
I remove the node m using removeAboveExternal(m)
(this works also with the maximal node of the left subtree)
Example: remove(6)
      6                    7
     / \                  / \
    2   9      =>        2   9
   / \  /               / \  /
  1  4 8               1  4 8
      /
     7
Performance of Binary Search Trees
Binary search tree storing n keys, with height h:
I the space used is O(n)
I findElement, insertItem, removeElement take O(h) time
The height h is O(n) in the worst case and O(log2 n) in the best case, thus:
I findElement, insertItem, and removeElement take O(n) time in the worst case
AVL Trees
AVL trees are binary search trees such that for every inner node n, the heights of the children of n differ by at most 1.
I AVL trees are often called height-balanced.
                3 (4)
              /      \
          0 (2)       8 (3)
              \      /     \
            2 (1)  6 (2)   9 (1)
                   /    \
                4 (1)  7 (1)

I The heights of the subtrees are displayed in parentheses.
AVL Trees: Balance Factor
The balance factor of a node is the height of its right subtree minus the height of its left subtree.
                3 (+1)
              /      \
          0 (+1)      8 (−1)
              \      /     \
            2 (0)  6 (0)   9 (0)
                   /    \
                4 (0)  7 (0)

I The balance factor is displayed in parentheses.
I Nodes with balance factor -1,0, or 1 are called balanced.
The Height of AVL Trees
The height of an AVL tree storing n keys is O(log n).
Proof.
Let n(h) be the minimal number of inner nodes of an AVL tree of height h.
I we have n(1) = 1 and n(2) = 2
I we know

  n(h) = 1 + n(h − 1) + n(h − 2)
       > 2 · n(h − 2)        since n(h − 1) > n(h − 2)
       > 4 · n(h − 4)
       > 8 · n(h − 6)
       . . .
       > 2^i · n(h − 2i)     by induction
       ≥ 2^(h/2)

Thus h ≤ 2 · log2 n(h).
AVL Trees: Insertion
Insertion of a key k into an AVL tree works in the following steps:
I We insert k as for binary search trees:
I let the inserted node be w
I After the insertion we might need to rebalance the tree:
I the balance factors of the ancestors of w may be affected
I nodes with balance factor -2 or 2 need rebalancing
Example: insertItem(1) (without rebalancing)

      3 (-1)                   3 (-2)
     /     \     insert       /     \
  0 (+1)  8 (0)  ======>   0 (+2)  8 (0)
     \                        \
    2 (0)                    2 (-1)
                             /
                          1 (0)

The node 0 now has balance factor +2 and must be rebalanced.
AVL Trees: Rebalancing after Insertion
After an insertion the AVL tree may need to be rebalanced:
I walk from the inserted node to the root
I we rebalance the first node with balance factor -2 or 2
There are only the following 4 cases (2 modulo symmetry):
Left Left:   balance factors -2, -1; inserted into subtree A (height h + 1), with B, C of height h
Left Right:  balance factors -2, +1; inserted into subtree B (height h + 1), with A, C of height h
Right Right: balance factors +2, +1; inserted into subtree C (height h + 1), with A, B of height h
Right Left:  balance factors +2, -1; inserted into subtree B (height h + 1), with A, C of height h
AVL Trees: Rebalancing, Case Left Left
The case ‘Left Left’ requires a right rotation:
        x (-2)                         y (0)
       /      \                       /     \
     y (-1)    C (h)    ======>      A       x (0)
    /     \             rotate     (h+1)    /     \
   A       B            right    inserted  B       C
 (h+1)    (h)                             (h)     (h)
inserted
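The right rotation above can be sketched on a minimal node class (our own names; height bookkeeping of a full AVL implementation is omitted):

```python
class AvlNode:
    """Minimal node for illustrating rotations (not a full AVL tree)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(x):
    """Right rotation for the 'Left Left' case: y moves up, x becomes
    y's right child, and y's old right subtree B becomes x's left
    subtree. Returns the new subtree root y."""
    y = x.left
    x.left = y.right                   # subtree B changes its parent
    y.right = x
    return y
```

The symmetric left rotation for 'Right Right' swaps `left` and `right` everywhere; the double-rotation cases chain two such calls.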
Example: Case Left Left
Example: insertItem(0), then rotate right at 2, 4

        7                 7                   7
      /   \             /   \               /   \
     4     8    =>     4     8      =>     2     8
    / \     \         / \     \           / \     \
   2   6     9       2   6     9         1   4     9
  / \               / \                 /   / \
 1   3             1   3               0   3   6
                  /
                 0
AVL Trees: Rebalancing, Case Right Right
The case ‘Right Right’ requires a left rotation:
(this case is symmetric to the case ‘Left Left’)

      x (+2)                           y (0)
     /      \                         /     \
    A        y (+1)    ======>      x (0)    C
   (h)      /     \    rotate      /    \   (h+1)
           B       C   left       A      B  inserted
          (h)    (h+1)           (h)    (h)
               inserted
Example: Case Right Right
Example: insertItem(7), then rotate left at 3, 5

      3               3                   5
     / \             / \                /   \
    2   5     =>    2   5       =>     3     8
       / \             / \            / \   /
      4   8           4   8          2   4 7
                         /
                        7
AVL Trees: Rebalancing, Case Left Right
The case ‘Left Right’ requires a left and a right rotation:
      x (-2)                   x (-2)                      z (0)
     /     \                  /     \                    /      \
   y (+1)   D (h)           z (-2)   D (h)            y (0)      x (+1)
   /   \          rotate   /    \           rotate   /   \      /    \
  A     z (-1)    left    y      C (h−1)    right   A     B    C      D
 (h)   /   \      y, z   / \                x, z   (h)   (h) (h−1)   (h)
      B     C           A   B
     (h)  (h−1)        (h) (h)
  inserted            inserted

(insertion into C or at z instead of B works exactly the same)
Example: Case Left Right
Example: insertItem(5), then rotate left at 4, 6 and rotate right at 6, 7

      7              7                7                 6
     / \            / \              / \              /   \
    4   8    =>    4   8     =>     6   8     =>     4     7
   / \            / \              /                / \     \
  2   6          2   6            4                2   5     8
                    /            / \
                   5            2   5
AVL Trees: Rebalancing, Case Right Left
The case ‘Right Left’ requires a right and a left rotation:
      x (+2)                  x (+2)                       z (0)
     /     \                 /     \                     /      \
    A       y (-1)          A       z (+2)            x (-1)     y (0)
   (h)     /   \    rotate (h)     /    \     rotate  /   \     /   \
          z     D   right         B      y    left   A     B   C     D
         / \   (h)  y, z        (h−1)   / \   x, z  (h) (h−1) (h)   (h)
        B   C                          C   D
      (h−1)(h)                        (h) (h)
      inserted                       inserted

(this case is symmetric to the case ‘Left Right’)
Example: Case Right Left
Example: insertItem(7), then rotate right at 7, 8 and rotate left at 5, 7

  5           5             5                7
   \           \             \             /   \
    8    =>     8     =>      7     =>    5     8
               /               \
              7                 8
AVL Trees: Rebalancing after Deletion
After a deletion the AVL tree may need to be rebalanced:
I walk from the parent of the removed node to the root
I we rebalance every node with balance factor -2 or 2
There are only the following 6 cases (2 of which are new):
Left Left:   balance factors -2, -1; deleted from subtree C (height h), with A of height h + 1 and B of height h
Left Right:  balance factors -2, +1; deleted from subtree C (height h), with B of height h + 1 and A of height h
Left:        balance factors -2, 0;  deleted from subtree C (height h), with A and B of height h + 1
Right Right: balance factors +2, +1; deleted from subtree A (height h), with C of height h + 1 and B of height h
Right Left:  balance factors +2, -1; deleted from subtree A (height h), with B of height h + 1 and C of height h
Right:       balance factors +2, 0;  deleted from subtree A (height h), with B and C of height h + 1
AVL Trees: Rebalancing, Case Left
The case ‘Left’ requires a right rotation:
        x (-2)                          y (+1)
       /      \                        /     \
     y (0)     C (h)    ======>      A        x (-1)
    /     \    deleted  rotate     (h+1)     /     \
   A       B            right               B       C
 (h+1)   (h+1)                            (h+1)    (h)
Example: Case Left
Example: remove(9), then rotate right at 4, 7

        7                  7                     4
      /   \              /   \                 /   \
     4     8     =>     4     8       =>      2     7
    / \     \          / \                   / \   / \
   2   6     9        2   6                 1   3 6   8
  / \ /              / \ /                       /
 1  3 5             1  3 5                      5
AVL Trees: Rebalancing, Case Right
The case ‘Right’ requires a left rotation:
      x (+2)                           y (-1)
     /      \                         /     \
    A        y (0)     ======>      x (+1)   C
   (h)      /     \    rotate      /    \   (h+1)
 deleted   B       C   left       A      B
         (h+1)   (h+1)           (h)   (h+1)
Example: Case Right
Example: remove(4), then rotate left at 6, 8

      6              6                  8
     / \              \               /   \
    4   8     =>       8      =>     6     9
       / \            / \             \
      7   9          7   9             7
AVL Trees: Performance
A single rotation (restructure) is O(1):
I assuming a linked-structure binary tree
Insertion runs in O(log n):
I the initial find is O(log n)
I rebalancing and updating the heights is O(log n)
(at most 2 rotations are needed, no further rebalancing)
Deletion runs in O(log n):
I the initial find is O(log n)
I rebalancing and updating the heights is O(log n)
I rebalancing may decrease the height of a subtree, thus further rebalancing above the node may be necessary
Lookup (find) is O(log n) (height of the tree is O(log n)).
Example from Last Year's Exam
(AVL tree diagram with root 8 and the keys 1, 2, 3, 4, 5, 6, 7, 9, 10, 11)
I Show with pictures, step by step, how item 9 is removed.
I Indicate in every picture:
I which node is not balanced, and
I which nodes are involved in a rotation.
Dictionaries: Performance Overview
Dictionary methods:
              search      insert      remove
Log File      O(n)        O(1)        O(n)
Lookup Table  O(log2 n)   O(n)        O(n)
AVL Tree      O(log2 n)   O(log2 n)   O(log2 n)
Ordered dictionary methods:
              closestAfter   closestBefore
Log File      O(n)           O(n)
Lookup Table  O(log2 n)      O(log2 n)
AVL Tree      O(log2 n)      O(log2 n)
I Log File corresponds to an unsorted list.
I Lookup Table corresponds to a sorted array.
Hash Functions and Hash Tables
A hash function h maps keys of a given type to integers in a fixed interval [0, . . . , N − 1]. We call h(x) the hash value of x.

Examples:
I h(x) = x mod N is a hash function for integer keys
I h((x, y)) = (5 · x + 7 · y) mod N is a hash function for pairs of integers

Example: h(x) = x mod 5

cell   key   element
0
1      6     tea
2      2     coffee
3
4      14    chocolate

A hash table consists of:
I a hash function h
I an array (called table) of size N
The idea is to store an item (k, e) at index h(k).
Hash Tables: Example 1
Example: phone book with table size N = 5
I hash function h(w) = (length of the word w) mod 5

0  (Alice, 020598555)
1
2
3  (Sue, 060011223)
4  (John, 020123456)
I Ideal case: one access for find(k) (that is, O(1)).
I Problem: collisions
I Where to store Joe (collides with Sue)?
I This is an example of a bad hash function:
I Lots of collisions, even if we make the table size N larger.
Hash Tables: Example 2
A dictionary based on a hash table for:
I items (social security number, name)
I 700 persons in the database
We choose a hash table of size N = 1000 with:
I hash function h(x) = last three digits of x

0
1    (025-611-001, Mr. X)
2    (987-067-002, Brad Pit)
3
...
997  (431-763-997, Alan Turing)
998
999  (007-007-999, James Bond)
Collisions
Collisions occur when different elements are mapped to the same cell:
I Keys k1, k2 with h(k1) = h(k2) are said to collide.

0
1    (025-611-001, Mr. X)
2    (987-067-002, Brad Pit)   ←  where to put (123-456-002, Dipsy)?
3
...
Different possibilities of handling collisions:
I chaining,
I linear probing,
I double hashing, . . .
Collisions continued
Usual setting:
I The set of keys is much larger than the available memory.
I Hence collisions are unavoidable.
How probable are collisions?
I We have a party with p persons. What is the probability that at least 2 persons have their birthday on the same day (N = 365)?
I Probability of no collision:

  q(p, N) = (N/N) · ((N − 1)/N) · · · ((N − p + 1)/N)
          = ((N − 1) · (N − 2) · · · (N − p + 1)) / N^(p−1)

I Already for p ≥ 23 the probability of a collision is > 0.5.
Hashing: Efficiency Factors
The efficiency of hashing depends on various factors:
I hash function
I type of the keys: integers, strings,. . .
I distribution of the actually used keys
I occupancy of the hash table (how full is the hash table)
I method of collision handling
The load factor α of a hash table is the ratio n/N, that is, the number of elements in the table divided by the size of the table.
A high load factor α ≥ 0.85 has a negative effect on efficiency:
I lots of collisions
I low efficiency due to collision overhead
What is a good Hash Function?
Hash functions should have the following properties:
I Fast computation of the hash value (O(1)).
I Hash values should be distributed (nearly) uniformly:
I Every hash value (cell in the hash table) has equal probability.
I This should hold even if the keys are non-uniformly distributed.
The goal of a hash function is:
I to ‘disperse’ the keys in an apparently random way

Example (Hash Function for Strings in Python)
We display Python hash values modulo 997:

h('a') = 535    h('b') = 80    h('c') = 618    h('d') = 163
h('ab') = 354   h('ba') = 979  . . .

At least at first glance they look random.
Hash Code Map and Compression Map
A hash function is usually specified as the composition of:
I hash code map: h1 : keys → integers
I compression map: h2 : integers → [0, . . . , N − 1]
The hash code map is applied before the compression map:
I h(x) = h2(h1(x)) is the composed hash function
The compression map is usually of the form h2(x) = x mod N:
I The actual work is done by the hash code map.
I What are good N to choose? . . . see the following slides
Compression Map: Example
We revisit the example (social security number, name):
I hash function h(x) = (x as number) mod 1000
Assume the last digit is always 0 or 1, indicating male/female.

0    (025-611-000, Mr. X)
1    (987-067-001, Ms. X)
2
...
10   (431-763-010, Alan Turing)
11   (007-011-011, Madonna)
12
...
Then 80% of the cells in the table stay unused! Bad hash!
Compression Map: Division Remainder
A better hash function for ‘social security numbers’:
I hash function h(x) = (x as number) mod 997
I e.g. h(025-611-000) = 025611000 mod 997 = 64
Why 997? Because 997 is a prime number!
I Let the hash function be of the form h(x) = x mod N.
I Assume the keys are distributed with equidistance ∆ < N:

  k_i = z + i · ∆

We get a collision if:

  k_i mod N = k_j mod N
  ⇐⇒ z + i · ∆ ≡ z + j · ∆ (mod N)
  ⇐⇒ i ≡ j (mod N)   (for prime N, since then ∆ is invertible mod N)

that is, i = j + m · N for some m ∈ Z.
Thus a prime maximizes the distance of keys with collisions!
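The effect can be checked with a small experiment (our own helper function): for keys in equidistance 10, the composite modulus 1000 uses only a fraction of the table, while the prime 997 spreads the keys over all cells.

```python
def used_cells(keys, N):
    """Number of distinct hash values for h(x) = x mod N."""
    return len({k % N for k in keys})

# 1000 keys in equidistance 10 (like the male/female last-digit example)
keys = range(0, 10000, 10)
few = used_cells(keys, 1000)   # only the multiples of 10 are hit
many = used_cells(keys, 997)   # prime modulus: all cells are hit
```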
Hash Code Maps
What if the keys are not integers?
I Integer cast: interpret the bits of the key as an integer.

  a = 0001, b = 0010, c = 0011:   000100100011 = 291

What if keys are longer than 32/64-bit integers?
I Component sum:
I partition the bits of the key into parts of fixed length
I combine the components into one integer using sum (other combinations are possible, e.g. bitwise xor, . . . )

  1001010 | 0010111 | 0110000
  1001010 + 0010111 + 0110000 = 74 + 23 + 48 = 145
Hash Code Maps, continued
Other possible hash code maps:
I Polynomial accumulation:
I partition the bits of the key into parts of fixed length a0, a1, a2, . . . , an
I take as hash value the value of the polynomial:

  a0 + a1 · z + a2 · z² + . . . + an · zⁿ

I especially suitable for strings (e.g. z = 33 gives at most 6 collisions for 50,000 English words)
I Mid-square method:
I pick m bits from the middle of x²
I Random method:
I take x as the seed of a random number generator
Collision Handling: Chaining
Chaining: each cell of the hash table points to a linked list of the elements that are mapped to this cell.
I colliding items are stored outside of the table
I simple, but requires additional memory outside of the table

Example: keys = birthdays, elements = names
I hash function: h(x) = (month of birth) mod 5
0
1  → (01.01., Sue) → ∅
2
3  → (12.03., John) → (16.08., Madonna) → ∅
4
Worst-case: everything in one cell, that is, linear list.
Collision Handling: Linear Probing
Open addressing:
I the colliding items are placed in a different cell of the table
Linear probing:
I colliding items are stored in the next (circularly) available cell
I testing whether cells are free is called ‘probing’

Example: h(x) = x mod 13
I we insert: 18, 41, 22, 44, 59, 32, 31, 73

index: 0  1  2  3  4  5  6  7  8  9  10 11 12
key:         41       18 44 59 32 22 31 73
Colliding items might lump together causing new collisions.
Linear Probing: Search
Searching for a key k (findElement(k)) works as follows:
I Start at cell h(k), and probe consecutive locations until:
I an item with key k is found, or
I an empty cell is found, or
I all N cells have been probed unsuccessfully.

findElement(k):
  i = h(k)
  p = 0
  while p < N do
    c = A[i]
    if c == ∅ then return No_Such_Key
    if c.key == k then return c.element
    i = (i + 1) mod N
    p = p + 1
  return No_Such_Key
Linear Probing: Deleting
Deletion remove(k) is expensive:
I When removing 15, all consecutive elements would have to be moved:

index: 0  1  2  3  4  5  6  7  8  9  10 11 12
key:         15 2  3  4  5  6  7

index: 0  1  2  3  4  5  6  7  8  9  10 11 12
key:            2  3  4  5  6  7

To avoid the moving we introduce a special element Available:
I Instead of deleting, we replace items by Available (A).

index: 0  1  2  3  4  5  6  7  8  9  10 11 12
key:         A  2  3  4  5  6  7

I From time to time we need to ‘clean up’:
I remove all Available cells and reorder the items
Linear Probing: Inserting
Inserting insertItem(k, o):
I Start at cell h(k), probe consecutive cells until:
  I an empty or Available cell is found, then store the item there, or
  I all N cells have been probed (table full, throw exception)
index:  0   1   2   3   4   5   6   7   8   9  10  11  12
item:              16  17   4   A   6   7   A
Example: insert(3) into the above table yields (h(x) = x mod 13):
index:  0   1   2   3   4   5   6   7   8   9  10  11  12
item:              16  17   4   3   6   7   A
Important: for findElement, cells with Available are treated as filled, that is, the search continues.
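The probing, Available-marker, and search rules above can be sketched in Python. This is an illustrative sketch, not code from the slides; the class and method names are assumptions.

```python
AVAILABLE = object()  # sentinel that replaces removed items

class LinearProbingTable:
    def __init__(self, capacity=13):
        self.N = capacity
        self.slots = [None] * capacity  # None = empty, else (key, value)

    def _h(self, key):
        return key % self.N

    def insert(self, key, value):
        i = self._h(key)
        for _ in range(self.N):
            c = self.slots[i]
            if c is None or c is AVAILABLE:   # free or reusable cell
                self.slots[i] = (key, value)
                return
            i = (i + 1) % self.N              # probe next cell circularly
        raise OverflowError("table full")

    def find(self, key):
        i = self._h(key)
        for _ in range(self.N):
            c = self.slots[i]
            if c is None:                     # truly empty: key absent
                return None
            if c is not AVAILABLE and c[0] == key:
                return c[1]
            i = (i + 1) % self.N              # AVAILABLE is treated as filled
        return None

    def remove(self, key):
        i = self._h(key)
        for _ in range(self.N):
            c = self.slots[i]
            if c is None:
                return
            if c is not AVAILABLE and c[0] == key:
                self.slots[i] = AVAILABLE     # mark instead of emptying
                return
            i = (i + 1) % self.N
```

With h(x) = x mod 13, inserting 18, 41, 22, 44 places 44 in cell 6 because cell 5 is taken by 18; after removing 18, a search for 44 still succeeds because the Available cell does not stop the probe sequence.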
Linear Probing: Possible Extensions
Disadvantages of linear probing:
I Colliding items lump together, causing:
  I longer sequences of probes
  I reduced performance
Possible improvements / modifications:
I instead of probing successive cells, compute the i-th probing index hi depending on i and k:
  hi(k) = h(k) + f(i, k)
Examples:
I Fixed increment c: hi(k) = h(k) + c · i.
I Changing directions: hi(k) = h(k) + c · i · (−1)^i.
I Double hashing: hi(k) = h(k) + i · h′(k).
Double Hashing
Double hashing uses a secondary hash function d(k):
I Handles collisions by placing items in the first available cell
  h(k) + j · d(k)
  for j = 0, 1, . . . , N − 1.
I The function d(k) must always be > 0 and < N.
I The size N of the table should be a prime.
Double Hashing: Example
We use double hashing with:
I N = 13
I h(k) = k mod 13
I d(k) = 7 − (k mod 7)

k    h(k)  d(k)  probes
18   5     3     5
41   2     1     2
22   9     6     9
44   5     5     5, 10
59   7     4     7
32   6     3     6
31   5     4     5, 9, 0
73   8     4     8

index:  0   1   2   3   4   5   6   7   8   9  10  11  12
item:  31      41          18  32  59  73  22  44
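The probe sequences in this example can be reproduced with a few lines of Python, using the slide's parameters N = 13, h(k) = k mod 13 and d(k) = 7 − (k mod 7); the function name is an assumption.

```python
def probes(k, taken, N=13):
    """Cells inspected when inserting key k; stops at the first free cell."""
    h, d = k % N, 7 - (k % 7)
    seq = []
    for j in range(N):
        cell = (h + j * d) % N   # j-th probe: h(k) + j * d(k) mod N
        seq.append(cell)
        if cell not in taken:
            return seq
    return seq

taken, trace = set(), {}
for k in [18, 41, 22, 44, 59, 32, 31, 73]:
    trace[k] = probes(k, taken)
    taken.add(trace[k][-1])      # the key is stored in the last probed cell
```

For instance, key 31 probes cells 5 and 9 (both taken) before landing in cell 0, matching the table above.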
Performance of Hashing
In the worst case, insertion, lookup and removal take O(n) time:
I occurs when all keys collide (end up in one cell)
The load factor α = n/N affects the performance:
I Assuming that the hash values behave like random numbers, it can be shown that the expected number of probes is:
  1/(1 − α)
[Plot: expected number of probes 1/(1 − α) against the load factor α; the curve stays small up to α ≈ 0.8 and grows rapidly as α → 1.]
Performance of Hashing
In practice hashing is very fast as long as α < 0.85:
I O(1) expected running time for all Dictionary ADT methods
Applications of hash tables:
I small databases
I compilers
I browser caches
Universal Hashing
No hash function is good in general:
I there always exist keys that are mapped to the same value
Hence no single hash function h can be proven to be good.
However, we can consider a set of hash functions H.
(assume that keys are from the interval [0, M − 1])
We say that H is universal (good) if for all keys 0 ≤ i ≠ j < M:
  probability(h(i) = h(j)) ≤ 1/N
for h randomly selected from H.
Universal Hashing: Example
The following set of hash functions H is universal:
I Choose a prime p between M and 2 · M.
I Let H consist of the functions
  h(k) = ((a · k + b) mod p) mod N
  for 0 < a < p and 0 ≤ b < p.
Proof Sketch.
Let 0 ≤ i ≠ j < M. For every pair i′ ≠ j′ < p there exist unique a, b such that i′ = (a · i + b) mod p and j′ = (a · j + b) mod p. Thus every pair (i′, j′) with i′ ≠ j′ has equal probability. Consequently the probability of i′ mod N = j′ mod N is ≤ 1/N.
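Drawing one member of this universal family is a one-liner. The sketch below fixes illustrative values M = 100, N = 13 and the prime p = 101 (all assumptions for the example):

```python
import random

def make_hash(M=100, N=13, p=101, rng=random):
    """Pick one random member of the family h(k) = ((a*k + b) mod p) mod N."""
    a = rng.randrange(1, p)   # 0 < a < p
    b = rng.randrange(0, p)   # 0 <= b < p
    return lambda k: ((a * k + b) % p) % N

h = make_hash()
```

Every drawn function maps keys from [0, M − 1] into table cells [0, N − 1]; the universality guarantee is about the random choice of a and b, not about any single function.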
Comparison AVL Trees vs. Hash Tables
Dictionary methods:

             search       insert       remove
AVL Tree     O(log2 n)    O(log2 n)    O(log2 n)
Hash Table   O(1)*        O(1)*        O(1)*

* expected running time of hash tables; the worst case is O(n).

Ordered dictionary methods:

             closestAfter   closestBefore
AVL Tree     O(log2 n)      O(log2 n)
Hash Table   O(n + N)       O(n + N)

Examples of when to use AVL trees instead of hash tables:
1. if you need to be sure about worst-case performance
2. if keys are imprecise (e.g. measurements),
   e.g. find the key closest to 3.24: closestTo(3.24)
Sorting Algorithms
We have already seen:
I Selection-sort
I Insertion-sort
I Heap-sort
We will see:
I Bubble-sort
I Merge-sort
I Quick-sort
We will show that:
I O(n · log n) is optimal for comparison-based sorting.
Bubble-Sort
The basic idea of bubble-sort is as follows:
I exchange neighboring elements that are in the wrong order
I stop when no elements were exchanged
bubbleSort(A):
  n = length(A)
  swapped = true
  while swapped == true do
    swapped = false
    for i = 0 to n − 2 do
      if A[i] > A[i + 1] then
        swap(A[i], A[i + 1])
        swapped = true
    done
    n = n − 1
  done
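The pseudocode above transcribes directly to Python (a sketch; the function name is illustrative). Shrinking n after each pass is safe because the largest remaining element has bubbled to the end:

```python
def bubble_sort(A):
    n = len(A)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):
            if A[i] > A[i + 1]:
                A[i], A[i + 1] = A[i + 1], A[i]   # swap neighbors in wrong order
                swapped = True
        n -= 1          # last element of this pass is in its final position
    return A
```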
Bubble-Sort: Example
5 4 3 7 1
4 5 3 7 1
4 3 5 7 1
4 3 5 7 1
4 3 5 1 7
3 4 5 1 7
4 3 5 1 7
4 3 1 5 7
3 4 1 5 7
3 1 4 5 7
1 3 4 5 7
(in each row, the sorted part grows at the right end of the array)
Bubble-Sort: Properties
Time complexity:
I worst-case:
  (n − 1) + . . . + 1 = ((n − 1)² + (n − 1)) / 2 ∈ O(n²)
  (caused by sorting an inversely sorted list)
I best-case: O(n)
Bubble-sort is:
I slow
I in-place
Divide-and-Conquer
Divide-and-Conquer is a general algorithm design paradigm:
I Divide: divide the input S into two disjoint subsets S1, S2
I Recur: recursively solve the subproblems S1, S2
I Conquer: combine solutions for S1, S2 to a solution for S
(the base cases of the recursion are problems of size 0 or 1)
Example: merge-sort
7 2 | 9 4 ↦ 2 4 7 9
  7 | 2 ↦ 2 7
    7 ↦ 7    2 ↦ 2
  9 | 4 ↦ 4 9
    9 ↦ 9    4 ↦ 4
I | indicates the splitting point
I ↦ indicates merging of the sub-solutions
Merge Sort
Merge-sort of a list S with n elements works as follows:
I Divide: divide S into two lists S1, S2 of ≈ n/2 elements
I Recur: recursively sort S1, S2
I Conquer: merge S1 and S2 into a sorted version of S
Algorithm mergeSort(S, C):
  Input: a list S of n elements and a comparator C
  Output: the list S sorted according to C
  if size(S) > 1 then
    (S1, S2) = partition S into sizes ⌊n/2⌋ and ⌈n/2⌉
    mergeSort(S1, C)
    mergeSort(S2, C)
    S = merge(S1, S2, C)
Merging two Sorted Sequences
Algorithm merge(A, B, C):
  Input: sorted lists A, B
  Output: sorted list containing the elements of A and B
  S = empty list
  while ¬A.isEmpty() and ¬B.isEmpty() do
    if A.first().element < B.first().element then
      S.insertLast(A.remove(A.first()))
    else
      S.insertLast(B.remove(B.first()))
  done
  while ¬A.isEmpty() do S.insertLast(A.remove(A.first()))
  while ¬B.isEmpty() do S.insertLast(B.remove(B.first()))
Performance:
I Merging two sorted lists of length about n/2 is O(n) time.
  (for singly linked lists, doubly linked lists, and arrays)
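The merge-sort scheme above can be sketched in Python on plain lists (function names are illustrative, not from the slides); slicing plays the role of the partition step:

```python
def merge(A, B):
    """Merge two sorted lists in O(len(A) + len(B)) time."""
    S = []
    i = j = 0
    while i < len(A) and j < len(B):     # both lists still non-empty
        if A[i] < B[j]:
            S.append(A[i]); i += 1
        else:
            S.append(B[j]); j += 1
    S.extend(A[i:])                      # leftover tail of A (possibly empty)
    S.extend(B[j:])                      # leftover tail of B (possibly empty)
    return S

def merge_sort(S):
    if len(S) <= 1:                      # base case: size 0 or 1
        return S
    mid = len(S) // 2
    return merge(merge_sort(S[:mid]), merge_sort(S[mid:]))
```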
Merge-sort: Example
[Figure: merge-sort of "a n e x a m p l e": the divide phase splits the list down to single letters, the conquer phase merges the pieces back together into "a a e e l m n p x". The same execution is shown as a tree on the next slide.]
Merge-Sort Tree
An execution of merge-sort can be displayed in a binary tree:
I each node represents a recursive call and stores:
  I the unsorted sequence before execution, and its partition
  I the sorted sequence after execution
I leaves are calls on subsequences of size 0 or 1
a n e x | a m p l e ↦ a a e e l m n p x
  a n | e x ↦ a e n x
    a | n ↦ a n
      a ↦ a    n ↦ n
    e | x ↦ e x
      e ↦ e    x ↦ x
  a m | p l e ↦ a e l m p
    a | m ↦ a m
      a ↦ a    m ↦ m
    p | l e ↦ e l p
      p ↦ p    l | e ↦ e l
        l ↦ l    e ↦ e
Merge-Sort: Example Execution
7 1 2 9 | 6 5 3 8 ↦ 1 2 3 5 6 7 8 9
  7 1 | 2 9 ↦ 1 2 7 9
    7 | 1 ↦ 1 7
      7 ↦ 7    1 ↦ 1
    2 | 9 ↦ 2 9
      2 ↦ 2    9 ↦ 9
  6 5 | 3 8 ↦ 3 5 6 8
    6 | 5 ↦ 5 6
      6 ↦ 6    5 ↦ 5
    3 | 8 ↦ 3 8
      3 ↦ 3    8 ↦ 8
Finished merge-sort tree.
Merge-Sort: Running Time
The height h of the merge-sort tree is O(log2 n):I each recursive call splits the sequence in half
The work all nodes together at depth i is O(n):I partitioning, and merging of 2i sequences of n/2i
I 2i+1 ≤ n recursive calls
10 n
21 n/2
2ii n/2i
. . .. . . . . .
depth nodes size
Thus the worst-case running time is O(n · log2 n).
Quick-Sort
Quick-sort of a list S with n elements works as follows:
I Divide: pick a random element x (the pivot) from S and split S into:
  I L, the elements less than x
  I E, the elements equal to x
  I G, the elements greater than x
7 2 1 9 6 5 3 8   (pivot 5)
I Recur: recursively sort L and G
2 1 3   5   7 9 6 8
 L      E      G
I Conquer: join L, E, and G
1 2 3 5 6 7 8 9
Quick-Sort: The Partitioning
The partitioning runs in O(n) time:
I we traverse S and compare every element y with x
I depending on the comparison we insert y into L, E or G
Algorithm partition(S, p):
  Input: a list S of n elements, and the position p of the pivot
  Output: lists L, E, G of the elements less than, equal to, or greater than the pivot
  L, E, G = empty lists
  x = S.elementAtRank(p)
  while ¬S.isEmpty() do
    y = S.remove(S.first())
    if y < x then L.insertLast(y)
    if y == x then E.insertLast(y)
    if y > x then G.insertLast(y)
  done
  return L, E, G
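The three-way partition and the recursion can be sketched in a few lines of Python (illustrative names; the pivot is chosen at random, matching the randomized analysis):

```python
import random

def quick_sort(S):
    if len(S) <= 1:                      # base case: size 0 or 1
        return S
    x = random.choice(S)                 # random pivot
    L = [y for y in S if y < x]          # elements less than the pivot
    E = [y for y in S if y == x]         # elements equal to the pivot
    G = [y for y in S if y > x]          # elements greater than the pivot
    return quick_sort(L) + E + quick_sort(G)
```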
Quick-Sort Tree
An execution of quick-sort can be displayed in a binary tree:
I each node represents a recursive call and stores:
  I the unsorted sequence before execution, and its pivot
  I the sorted sequence after execution
I leaves are calls on subsequences of size 0 or 1
1 6 2 9 4 0 ↦ 0 1 2 4 6 9
  1 2 0 ↦ 0 1 2
    0 ↦ 0    2 ↦ 2
  6 9 ↦ 6 9
    9 ↦ 9
Quick-Sort: Example
8 2 9 3 1 5 7 6 4 ↦ 1 2 3 4 5 6 7 8 9
  2 3 1 5 4 ↦ 1 2 3 4 5
    2 3 5 4 ↦ 2 3 4 5
      2 ↦ 2    5 4 ↦ 4 5
        4 ↦ 4
  8 9 7 ↦ 7 8 9
    7 ↦ 7    9 ↦ 9
Quick-Sort: Worst-Case Running Time
The worst-case running time occurs when:
I the pivot is always the minimal or maximal element
I then one of L and G has size n − 1, the other size 0
Then the running time is O(n²):
n + (n − 1) + (n − 2) + . . . + 1 ∈ O(n²)
[Figure: the degenerate quick-sort tree, a path with input sizes n, n − 1, n − 2, . . . , 1 and an empty subproblem at every level.]
Quick-Sort: Average Running Time
Consider a recursive call on a list of size m:
I Good call: both L and G have size less than (3/4) · m
I Bad call: one of L and G has size greater than (3/4) · m
Example (m = 16): pivots of rank 5, . . . , 12 give good calls; pivots of rank 1, . . . , 4 or 13, . . . , 16 give bad calls.
A good call has probability 1/2:
I half of the possible pivots give rise to good calls
Quick-Sort: Average Running Time
For a node at depth i, we expect (on average) that:
I i/2 of its ancestors are good calls
I the size of its sequence is ≤ (3/4)^(i/2) · n
As a consequence:
I for a node at depth 2 · log4/3 n the expected input size is 1
I the expected height of the quick-sort tree is O(log n)
[Figure: quick-sort tree of expected height O(log n); the total input size, and hence the work, is O(n) per depth.]
The amount of work at depth i is O(n).
Thus the expected (average) running time is O(n · log n).
In-Place Quick-Sort
Quick-sort can be implemented in-place (but then it is not stable):
Algorithm inPlaceQuickSort(A, l, r):
  Input: list A, indices l and r
  Output: list A where the elements from index l to r are sorted
  if l ≥ r then return
  p = A[r]                          (take the rightmost element as pivot)
  l′ = l and r′ = r
  while l′ ≤ r′ do
    while l′ ≤ r′ and A[l′] ≤ p do l′ = l′ + 1    (find element > p)
    while l′ ≤ r′ and A[r′] ≥ p do r′ = r′ − 1    (find element < p)
    if l′ < r′ then swap(A[l′], A[r′])            (swap element < p with element > p)
  done
  swap(A[r], A[l′])                 (put the pivot into the right place)
  inPlaceQuickSort(A, l, l′ − 1)    (sort the left part)
  inPlaceQuickSort(A, l′ + 1, r)    (sort the right part)
Considered in-place although the recursion needs O(log n) space.
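A Python sketch of the in-place scheme above (illustrative names). One small deviation from the pseudocode: the right scan starts at r − 1 so that the pivot cell itself is never scanned, which keeps the final pivot swap in bounds even for already-sorted input:

```python
def in_place_quick_sort(A, l=0, r=None):
    if r is None:
        r = len(A) - 1
    if l >= r:
        return A
    p = A[r]                                  # rightmost element as pivot
    lo, hi = l, r - 1                         # scan stops before the pivot cell
    while lo <= hi:
        while lo <= hi and A[lo] <= p:        # find an element > pivot
            lo += 1
        while lo <= hi and A[hi] >= p:        # find an element < pivot
            hi -= 1
        if lo < hi:
            A[lo], A[hi] = A[hi], A[lo]       # swap the misplaced pair
    A[r], A[lo] = A[lo], A[r]                 # put pivot into its final place
    in_place_quick_sort(A, l, lo - 1)         # sort left part
    in_place_quick_sort(A, lo + 1, r)         # sort right part
    return A
```

On [5, 8, 3, 7, 1, 6] the first call pivots on 6 and produces [5, 1, 3, 6, 8, 7], matching the trace on the next slide.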
In-Place Quick-Sort: Example
5 8 3 7 1 6
5 1 3 7 8 6
5 1 3 6 8 7
1 5 3 6 8 7
1 3 5 6 8 7
1 3 5 6 8 7
1 3 5 6 8 7
1 3 5 6 7 8
1 3 5 6 7 8
(legend: unsorted part, sorted part, pivot)
Sorting: Lower Bound
Many sorting algorithms are comparison-based:
I they sort by comparing pairs of objects
I Examples: selection-sort, insertion-sort, bubble-sort, heap-sort, merge-sort, quick-sort, . . .
No comparison-based sorting algorithm can be faster than
Ω(n · log n)
time (worst-case).
We will prove this lower bound on the next slides. . .
Sorting: Lower Bound (Decision Tree)
We will only count comparisons (sufficient for a lower bound):
I Assume the input is a permutation of the numbers 1, 2, . . . , n.
I Every execution corresponds to a path in the decision tree:
[Figure: a binary decision tree; each internal node is a comparison xa < xb? with a yes-branch and a no-branch.]
I The algorithm itself may not have this tree structure, but this is the maximal information gained by the algorithm.
Sorting: Lower Bound (Leaves)
Every leaf corresponds to exactly one input permutation:
I applying the same swapping steps to two different input permutations, e.g. . . . ,6,. . . ,7,. . . and . . . ,7,. . . ,6,. . . , yields different results (not both results can be sorted)
Sorting: Lower Bound (Height of the Decision Tree)
The height of the tree is a lower bound on the running time:
I There are n! = n · (n − 1) · · · 1 permutations of 1, 2, . . . , n.
I Thus the height of the tree is at least log2(n!).
[Figure: the decision tree again, annotated with its n! leaves and height log2(n!).]
Sorting: Lower Bound
Hence any comparison-based sorting algorithm takes at least
log2(n!) ≥ log2((n/2)^(n/2)) = (n/2) · log2(n/2) ∈ Ω(n · log n)
time in the worst case.
Summary of Comparison-Based Sorting Algorithms
Algorithm       Time            Notes
selection-sort  O(n²)           slow (but good for small lists); in-place, stable
insertion-sort  O(n²)           good for online sorting and nearly sorted lists
bubble-sort     O(n²)
heap-sort       O(n · log2 n)   in-place, not stable, fast; good for large inputs (1K – 1M)
merge-sort      O(n · log2 n)   fast, stable, usually not in-place; sequential data access; good for large inputs (> 1M)
quick-sort      O(n · log2 n)   in-place, randomized, not stable; fastest, good for huge inputs
                (expected)

Quick-sort usually performs fastest, although its worst case is O(n²).
Sorting: Comparison of Runtime
Algorithm       25,000 sorted   100,000 sorted   25,000 not sorted   100,000 not sorted
selection-sort  1.1             19.4             1.1                 19.5
insertion-sort  0               0                1.1                 19.6
bubble-sort     0               0                5.5                 89.8

Algorithm       5 million sorted   20 million sorted   5 million not sorted   20 million not sorted
insertion-sort  0.03               0.13                timeout                timeout
heap-sort       3.6                15.6                8.3                    42.2
merge-sort      2.5                10.5                3.7                    16.1
quick-sort      0.5                2.2                 2.0                    8.7
I Source: Gumm, Sommer Einführung in die Informatik.
Bucket-Sort
Let S be a list of n key-element items with keys in [0, N − 1].
Bucket-sort uses the keys as indices into an auxiliary array B:
I the elements of B are lists, so-called buckets
I Phase 1:
  I empty S by moving each item (k, e) into its bucket B[k]
I Phase 2:
  I for k = 0, . . . , N − 1 move the items of B[k] to the end of S
Performance:
I phase 1 takes O(n) time
I phase 2 takes O(n + N) time
Thus bucket-sort is O(n + N).
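The two phases can be sketched in Python on (key, element) pairs (illustrative names); note that the result is stable because items enter and leave each bucket in order:

```python
def bucket_sort(S, N):
    """Sort (key, element) pairs with keys in [0, N-1] in O(n + N) time."""
    B = [[] for _ in range(N)]        # one bucket per possible key
    for k, e in S:                    # phase 1: fill the buckets
        B[k].append((k, e))
    out = []
    for bucket in B:                  # phase 2: concatenate the buckets
        out.extend(bucket)
    return out
```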
Bucket-Sort: Example
I key range [0, 9]
7,d  1,c  3,a  7,g  3,b  7,e
I Phase 1: filling the buckets
B:  0: ∅   1: (1,c)   2: ∅   3: (3,a) (3,b)   4–6: ∅   7: (7,d) (7,g) (7,e)   8–9: ∅
I Phase 2: emptying the buckets into the list
1,c  3,a  3,b  7,d  7,g  7,e
Bucket-Sort: Properties and Extensions
The keys are used as indices for an array, thus:
I keys should be numbers from [0, N − 1]
I no external comparator
Bucket-sort is a stable sorting algorithm.
Extensions:
I can be extended to an arbitrary (fixed) finite set of keys D
  (e.g. the names of the 50 U.S. states)
I sort D and compute the rank rankOf(k) of each element
I put item (k, e) into bucket B[rankOf(k)]
Bucket-sort runs in O(n + N) time:
I very efficient if the keys come from a small interval [0, N − 1]
  (or, in the extended version, from a small set D)
Lexicographic Order
A d-tuple is a sequence of d keys (k1, k2, . . . , kd):
I ki is called the i-th dimension of the tuple
Example: (2, 5, 1) as a point in 3-dimensional space
The lexicographic order on d-tuples is recursively defined:
(x1, x2, . . . , xd) < (y1, y2, . . . , yd)
  ⟺ x1 < y1 ∨ (x1 = y1 ∧ (x2, . . . , xd) < (y2, . . . , yd))
That is, the tuples are first compared by dimension 1, then 2, . . .
Lexicographic-Sort
Lexicographic-sort sorts a list of d-tuples in lexicographic order:
I Let Ci be the comparator comparing tuples by their i-th dimension.
I Let stableSort be a stable sorting algorithm.
Lexicographic-sort executes stableSort d times, thus:
I let T(n) be the running time of stableSort
I then lexicographic-sort runs in O(d · T(n))
Algorithm lexicographicSort(S):
  Input: a list S of d-tuples
  Output: the list S sorted in lexicographic order
  for i = d downto 1 do
    stableSort(S, Ci)
  done
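Python's built-in `sorted` is guaranteed stable, so it can play the role of stableSort here (a sketch; the function name is illustrative):

```python
def lexicographic_sort(S):
    """Sort a list of d-tuples lexicographically by d stable passes."""
    d = len(S[0])
    for i in range(d - 1, -1, -1):            # dimensions d, d-1, ..., 1
        S = sorted(S, key=lambda t: t[i])     # stable sort by dimension i
    return S
```

Running it on the example of the next slide reproduces the same passes: sorting last by dimension 1 leaves ties ordered by the earlier (stable) passes on dimensions 2 and 3.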
Lexicographic-Sort: Example
(7,4,6) (5,1,5) (2,0,6) (5,1,4) (2,1,4)
(5,1,4) (2,1,4) (5,1,5) (7,4,6) (2,0,6) dimension 3
(2,0,6) (5,1,4) (2,1,4) (5,1,5) (7,4,6) dimension 2
(2,0,6) (2,1,4) (5,1,4) (5,1,5) (7,4,6) dimension 1
Number representations
We can write numbers in different numeral systems, e.g.:
I 43₁₀, that is, 43 in the decimal system (base 10)
I 101011₂, that is, 43 in the binary system (base 2)
I 1121₃, that is, 43 represented in base 3
For every base b ≥ 2 and every number m there exist unique digits 0 ≤ d0, . . . , dl < b such that:
m = dl · b^l + d(l−1) · b^(l−1) + . . . + d1 · b^1 + d0 · b^0
and, if l > 0, then dl ≠ 0.
Example
43 = 43₁₀ = 4 · 10^1 + 3 · 10^0
   = 101011₂ = 1 · 2^5 + 0 · 2^4 + 1 · 2^3 + 0 · 2^2 + 1 · 2^1 + 1 · 2^0
   = 1121₃ = 1 · 3^3 + 1 · 3^2 + 2 · 3^1 + 1 · 3^0
Radix-Sort
Radix-sort is a specialization of lexicographic-sort:
I it uses bucket-sort as the stable sorting algorithm
I it is applicable if the tuples consist of integers from [0, N − 1]
I it runs in O(d · (n + N)) time
Sorting integers of fixed bit-length d in linear time:
I consider a list of n d-bit integers x(d−1) x(d−2) . . . x0 (base 2)
I thus each integer is a d-tuple (x(d−1), x(d−2), . . . , x0)
I apply radix-sort with N = 2
I the runtime is O(d · n)
For example, we can sort 32-bit integers in linear time.
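The bit-wise scheme can be sketched as one stable two-bucket pass per bit, least significant bit first (illustrative names):

```python
def radix_sort_bits(xs, d):
    """Sort non-negative d-bit integers in O(d * n) time."""
    for bit in range(d):                      # bit 0 = least significant
        buckets = [[], []]
        for x in xs:
            buckets[(x >> bit) & 1].append(x) # stable split by current bit
        xs = buckets[0] + buckets[1]          # 0-bucket before 1-bucket
    return xs
```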
Example
We sort the following list of 4-bit integers:
1001 0010 1101 0001 1110    (input)
0010 1110 1001 1101 0001    (after the bucket pass on bit 0)
1001 1101 0001 0010 1110    (after the bucket pass on bit 1)
1001 0001 0010 1101 1110    (after the bucket pass on bit 2)
0001 0010 1001 1101 1110    (after the bucket pass on bit 3)
Exercise C-4.14
Suppose we are given a sequence S of n elements, each of which is an integer from [0, n² − 1]. Describe a simple method for sorting S in O(n) time.
I Each number from [0, n² − 1] can be represented by a two-digit number in the number system with base n:
  (n − 1) · n + (n − 1) = n² − 1
I Converting each element into base n is O(1) (O(n) for the whole list).
I Then use radix-sort to sort in O(2 · n), that is, O(n) time.
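The exercise's method can be sketched with two stable bucket passes, one per base-n digit (low digit first); the function name is an assumption:

```python
def sort_squares_range(S):
    """Sort n integers from [0, n^2 - 1] in O(n) time via 2-digit radix-sort."""
    n = len(S)
    for digit in (lambda x: x % n, lambda x: x // n):  # low digit, then high
        buckets = [[] for _ in range(n)]
        for x in S:
            buckets[digit(x)].append(x)                # stable bucket pass
        S = [x for b in buckets for x in b]
    return S
```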
The Selection Problem
The selection problem:
I given an integer k and a list x1, . . . , xn of n elements
I find the k-th smallest element in the list
Example: the 3rd smallest element of the following list is 6
7 4 9 6 2
An O(n · log n) solution:
I sort the list (O(n · log n))
I pick the k-th element of the sorted list (O(1))
2 4 6 7 9
Can we find the k-th smallest element faster?
Quick-Select
Quick-select of the k-th smallest element of the list S:
I based on the prune-and-search paradigm
I Prune: pick a random element x (the pivot) from S and split S into:
  I L, the elements < x;  E, the elements == x;  G, the elements > x
7 2 1 9 6 5 3 8   (pivot 5)
2 1 3   5   7 9 6 8
 L      E      G
I partitioning into L, E and G works precisely as for quick-sort
I Search:
  I if k ≤ |L| then return quickSelect(k, L)
  I if |L| < k ≤ |L| + |E| then return x
  I if k > |L| + |E| then return quickSelect(k − |L| − |E|, G)
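The prune-and-search steps above transcribe directly to Python (illustrative names; k counts from 1, as on the slides):

```python
import random

def quick_select(k, S):
    """Return the k-th smallest element of S (1 <= k <= len(S))."""
    x = random.choice(S)                     # random pivot
    L = [y for y in S if y < x]
    E = [y for y in S if y == x]
    G = [y for y in S if y > x]
    if k <= len(L):                          # answer lies in L
        return quick_select(k, L)
    if k <= len(L) + len(E):                 # answer is the pivot itself
        return x
    return quick_select(k - len(L) - len(E), G)   # answer lies in G
```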
Quick-Select Visualization
Quick-select can be displayed by a sequence of nodes:
I each node represents a recursive call and stores: k, the sequence, and the pivot element
k = 5, S = (7 4 9 3 2 6 5 1 8)
k = 2, S = (7 4 9 6 5 8)
k = 2, S = (7 4 6 5)
k = 1, S = (7 6 5)
found 5
Quick-Select: Running Time
The worst-case running time is O(n²):
I if the pivot is always the minimal or maximal element
The expected running time is O(n) (compare with quick-sort):
I with probability 1/2 the recursive call is good: size at most (3/4) · n
I T(n) ≤ b · a · n + T((3/4) · n), where
  I a is the number of time steps for partitioning per element
  I b is the expected number of calls until a good call
  I b = 2 (the average number of coin tosses until heads shows up)
I Thus:
  T(n) ≤ 2 · a · n + T((3/4) · n)
       ≤ 2 · a · n + 2 · a · (3/4) · n + 2 · a · (3/4)² · n + . . .
       = 8 · a · n ∈ O(n)    (geometric series)
Quick-Select: Median of Medians
We can do selection in O(n) worst-case time.
Idea: recursively use select itself to find a good pivot:
I divide S into n/5 sets of 5 elements
I find the median of each set (the baby medians)
I recursively use select to find the median of the medians
[Figure: the elements arranged in n/5 sorted columns of 5; the baby medians form the middle row. For at least half of the columns, at least 3 of their elements are ≤ the median of medians, and symmetrically for ≥.]
The minimal size of L and G is 0.3 · n.
Quick-Select: Median of Medians
We know:
I The minimal size of L and G is 0.3 · n.
I Thus the maximal size of L and G is 0.7 · n.
Let b ∈ N such that:
I partitioning a list of size n takes at most b · n time,
I finding the baby medians takes at most b · n time,
I the base case n ≤ 1 takes at most b time.
We derive a recurrence equation for the time complexity:
T(n) = b                                      if n ≤ 1
T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n    if n > 1
We will see how to solve such recurrence equations. . .
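A Python sketch of deterministic selection with the median-of-medians pivot (illustrative names; k counts from 1). The recursive call on the list of baby medians corresponds to the T(0.2 · n) term of the recurrence:

```python
def select(k, S):
    """Return the k-th smallest element of S in O(n) worst-case time."""
    if len(S) <= 5:
        return sorted(S)[k - 1]              # tiny base case: sort directly
    groups = [S[i:i + 5] for i in range(0, len(S), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]   # baby medians
    x = select((len(medians) + 1) // 2, medians)         # median of medians
    L = [y for y in S if y < x]
    E = [y for y in S if y == x]
    G = [y for y in S if y > x]
    if k <= len(L):
        return select(k, L)
    if k <= len(L) + len(E):
        return x
    return select(k - len(L) - len(E), G)
```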
Fundamental Techniques
We have considered algorithms for solving special problems.
Now we consider a few fundamental techniques:
I Divide-and-Conquer
I The Greedy Method
I Dynamic Programming
Divide-and-Conquer
Divide-and-Conquer is a general algorithm design paradigm:
I Divide: divide the input S into k ≥ 2 disjoint subsets S1, . . . , Sk
I Recur: recursively solve the subproblems S1, . . . , Sk
I Conquer: combine the solutions for S1, . . . , Sk to a solution for S
(the base cases of the recursion are problems of size 0 or 1)
We have already seen examples:
I merge-sort
I quick-sort
I bottom-up heap construction
Focus now: analysing time complexity by recurrence equations.
Methods for Solving Recurrence Equation
We consider four methods for solving recurrence equations:
I Iterative substitution method,
I Recursion tree method,
I Guess-and-test method, and
I Master method.
Iterative Substitution
The iterative substitution technique works as follows:
I iteratively apply the recurrence equation to itself
I hope to find a pattern
Example
T(n) = 1               if n ≤ 1
T(n) = T(n − 1) + 2    if n > 1
We start with T(n) and apply the recursive case:
T(n) = T(n − 1) + 2
     = T(n − 2) + 4
     = . . .
     = T(n − k) + 2 · k
For k = n − 1 we reach the base case T(n − k) = T(1) = 1, thus:
T(n) = 1 + 2 · (n − 1) = 2 · n − 1
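The closed form can be checked numerically against the recurrence, a quick sanity sketch:

```python
def T(n):
    """The recurrence T(1) = 1, T(n) = T(n-1) + 2, evaluated directly."""
    return 1 if n <= 1 else T(n - 1) + 2

# the closed form 2n - 1 agrees with the recurrence on a range of inputs
check = all(T(n) == 2 * n - 1 for n in range(1, 60))
```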
Merge-Sort Review
Merge-sort of a list S with n elements works as follows:
I Divide: divide S into two lists S1, S2 of ≈ n/2 elements
I Recur: recursively sort S1, S2
I Conquer: merge S1 and S2 into a sorted version of S
Let b ∈ N such that:
I merging two lists of size n/2 takes at most b · n time, and
I the base case n ≤ 1 takes at most b time
We obtain the following recurrence equation for merge-sort:
T(n) = b                     if n ≤ 1
T(n) = 2 · T(n/2) + b · n    if n > 1
We search for a closed solution of the equation, that is:
I T(n) = . . . where T(n) does not occur on the right side
Example: Merge-Sort
T(n) = b                     if n ≤ 1
T(n) = 2 · T(n/2) + b · n    if n > 1
We assume that n is a power of 2: n = 2^k (that is, k = log2 n)
I allowed since we are interested in the asymptotic behaviour
We start with T(n) and apply the recursive case:
T(n) = 2 · T(n/2) + b · n
     = 2 · (2 · T(n/2²) + b · (n/2)) + b · n
     = 2² · T(n/2²) + 2 · b · n
     = 2³ · T(n/2³) + 3 · b · n
     = . . .
     = 2^k · T(n/2^k) + k · b · n
     = n · b + (log2 n) · b · n
Thus T(n) = b · (n + n · log2 n) ∈ O(n · log2 n).
The Recursion Tree Method
The recursion tree method is a visual approach:
I draw the recursion tree and hope to find a pattern
T(n) = b                     if n ≤ 2
T(n) = 3 · T(n/3) + b · n    if n > 2
For a node with input size k, the work at this node is b · k.

depth  nodes  size
0      1      n
1      3      n/3
i      3^i    n/3^i

Thus the work at depth i is 3^i · b · (n/3^i) = b · n.
The height of the tree is log3 n. Thus T(n) is O(n · log3 n).
Guess-and-Test Method
The guess-and-test method works as follows:
I we guess a solution (or an upper bound)
I we prove that the guess is correct by induction
Example
T(n) = 1                 if n ≤ 1
T(n) = T(n/2) + 2 · n    if n > 1
Guess: T(n) ≤ 2 · n
I for n = 1 it holds: T(n) = 1 ≤ 2 · 1
I for n > 1 we have:
  T(n) = T(n/2) + 2 · n ≤ 2 · (n/2) + 2 · n = 3 · n
Wrong guess: we cannot make 3 · n smaller than or equal to 2 · n.
Example, continued
T(n) = 1                 if n ≤ 1
T(n) = T(n/2) + 2 · n    if n > 1
New guess: T(n) ≤ 4 · n
I for n = 1 it holds: T(n) = 1 ≤ 4 · 1
I for n > 1 we have:
  T(n) = T(n/2) + 2 · n
       ≤ 4 · (n/2) + 2 · n    (by induction hypothesis)
       = 4 · n
This time the guess was good: 4 · n ≤ 4 · n. Thus T(n) ≤ 4 · n.
Example: Quick-Select with Median of Median
T(n) = b                                      if n ≤ 1
T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n    if n > 1
Guess: T(n) ≤ 20 · b · n
I for n = 1 it holds: T(n) = b ≤ 20 · b · 1
I for n > 1 we have:
  T(n) = T(0.7 · n) + T(0.2 · n) + 2 · b · n
       ≤ 0.7 · 20 · b · n + 0.2 · 20 · b · n + 2 · b · n    (by IH)
       = 18 · b · n + 2 · b · n
       = 20 · b · n
Thus the guess was good.
This shows that quick-select with median of medians is O(n).
Master method
Many divide-and-conquer recurrence equations have the form:
T(n) = c                     if n < d
T(n) = a · T(n/b) + f(n)     if n ≥ d
Theorem (The Master Theorem)
1. if f(n) is O(n^(log_b a − ε)), then T(n) is Θ(n^(log_b a))
2. if f(n) is Θ(n^(log_b a) · log^k n), then T(n) is Θ(n^(log_b a) · log^(k+1) n)
3. if f(n) is Ω(n^(log_b a + ε)), then T(n) is Θ(f(n)), provided a · f(n/b) ≤ δ · f(n) for some δ < 1.
Master Method: Example 1
Example
T (n) = 4 · T (n/2) + n
Solution: log_b a = 2, thus case 1 says T(n) = Θ(n²).
Master Method: Example 2
Example
T (n) = 2 · T (n/2) + n · log n
Solution: log_b a = 1, thus case 2 says T(n) = Θ(n · log² n).
Master Method: Example 3
Example
T (n) = T (n/3) + n · log n
Solution: log_b a = 0, thus case 3 says T(n) = Θ(n · log n).
Master Method: Example 4
Example
T (n) = 8 · T (n/2) + n2
Solution: log_b a = 3, thus case 1 says T(n) = Θ(n³).
Master Method: Example 5
Example
T (n) = 9 · T (n/3) + n3
Solution: log_b a = 2, thus case 3 says T(n) = Θ(n³).
Master Method: Example 6
Example
T (n) = T (n/2) + 1 binary search
Solution: log_b a = 0, thus case 2 says T(n) = Θ(log n).
Master Method: Example 7
Example
T(n) = 2 · T(n/2) + log n    (heap construction)
Solution: log_b a = 1, thus case 1 says T(n) = Θ(n).
Integer Multiplication
Algorithm to multiply two n-bit integers A and B:
I Divide: split A, B into n/2 higher-order and n/2 lower-order bits
A = Ah · 2^(n/2) + Aℓ
B = Bh · 2^(n/2) + Bℓ
We can define A · B as follows:
A · B = (Ah · 2^(n/2) + Aℓ) · (Bh · 2^(n/2) + Bℓ)
      = Ah · Bh · 2^n + Ah · Bℓ · 2^(n/2) + Aℓ · Bh · 2^(n/2) + Aℓ · Bℓ
So T(n) = 4 · T(n/2) + n, which implies T(n) ∈ O(n²).
Can we do better? . . .
Integer Multiplication, continued
Algorithm to multiply two n-bit integers A and B:
I Divide: split A, B into n/2 higher-order and n/2 lower-order bits
A = Ah · 2^(n/2) + Aℓ
B = Bh · 2^(n/2) + Bℓ
We use a different way to multiply the parts:
A · B = Ah · Bh · 2^n + (Ah · Bℓ + Aℓ · Bh) · 2^(n/2) + Aℓ · Bℓ
      = Ah · Bh · 2^n
        + [(Ah − Aℓ) · (Bℓ − Bh) + Ah · Bh + Aℓ · Bℓ] · 2^(n/2)
        + Aℓ · Bℓ
Here T(n) = 3 · T(n/2) + n, hence T(n) ∈ O(n^(log2 3)) = O(n^1.585).
(note that we need to calculate Ah · Bh and Aℓ · Bℓ only once)
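A Python sketch of this (Karatsuba-style) decomposition on non-negative integers. The function name and base-case threshold are assumptions; signs of the difference product are handled separately so the recursion only sees non-negative arguments:

```python
def karatsuba(A, B):
    """Multiply non-negative integers with three recursive multiplications."""
    if A < 16 or B < 16:
        return A * B                          # small base case: direct product
    half = max(A.bit_length(), B.bit_length()) // 2
    Ah, Al = A >> half, A & ((1 << half) - 1) # split into high/low parts
    Bh, Bl = B >> half, B & ((1 << half) - 1)
    hh = karatsuba(Ah, Bh)                    # Ah * Bh
    ll = karatsuba(Al, Bl)                    # Al * Bl
    neg = (Ah < Al) != (Bl < Bh)              # sign of (Ah-Al)*(Bl-Bh)
    mid = karatsuba(abs(Ah - Al), abs(Bl - Bh))
    if neg:
        mid = -mid
    mid += hh + ll                            # now mid = Ah*Bl + Al*Bh
    return (hh << (2 * half)) + (mid << half) + ll
```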
The Greedy Method
An optimization problem consists of
I a set of configurations (different choices), and
I an objective function (a score assigned to configurations).
We search for a configuration that maximizes (or minimizes) the objective function.
The greedy method for solving optimization problems:
I tries to find the global optimum (or come close to it) by iteratively selecting the locally optimal choice.
That is, in every step we choose the best option currently possible.
Making Change
Problem: return coins for a given amount of money.
I configurations: coins returned and money left to return
I objective function: minimize the number of coins returned
Greedy solution: always return the largest coin possible.
Example
The coins are valued 8, 4, 1. We give change for 18:
I we pick 8, 10 left to pay
I we pick 8, 2 left to pay
I we pick 1, 1 left to pay
I we pick 1, 0 left to pay
Thus the greedy solution is 8, 8, 1, 1.
Making Change, continued
Example
The coins are valued 4, 3, 1. We give change for 6:
I we pick 4, 2 left to pay
I we pick 1, 1 left to pay
I we pick 1, 0 left to pay
Thus the greedy solution is 4, 1, 1.
But the optimal solution would be 3, 3:
I choosing the locally best option does not always lead to the global optimum
The above example does not have the greedy-choice property:
A problem has the greedy-choice property if, beginning from the start configuration, the sequence of locally optimal choices leads to a globally optimal solution.
The Fractional Knapsack Problem
Given: a set S of n items, where each item i has:
I bi = a positive benefit
I wi = a positive weight
Fractional Knapsack Problem
I choose fractions xi ≤ wi of the items with maximal total benefit:
  Σ_{i∈S} bi · (xi / wi)    (objective to maximize)
I with total weight at most W:
  Σ_{i∈S} xi ≤ W    (constraint)
Example
You found a treasure with:
I 50 kg jewels, value 1 million Euros
I 1 kg chewing gum, value 20 Euros
I 5 kg diamonds, value 5 million Euros
I 10 kg gold, value 500,000 Euros
Your backpack can carry only 20 kg!
What do you take with you?
I Of course the items with the highest value per weight:
  I 5 kg diamonds (value/kg = 1 million Euros)
  I 10 kg gold (value/kg = 0.05 million Euros)
  I 5 kg jewels (value/kg = 0.02 million Euros)
Your backpack is filled with the maximum value of 5.6 million Euros.
The Fractional Knapsack Algorithm
Greedy choice:
I keep taking the item with the highest benefit-to-weight ratio bi/wi
I run time: O(n · log n) (for sorting by this ratio)
fractionalKnapsack(S, b, w, W):
  for each i ∈ S do
    xi = 0
    vi = bi/wi
  sort S such that the elements are descending w.r.t. vi
  while W > 0 and ¬S.isEmpty() do
    i = S.remove(S.first())
    xi = min(wi, W)
    W = W − xi
  done
The Fractional Knapsack Algorithm
Correctness: suppose there were a better solution A.
I Then A did not always choose the highest vj.
I Thus there exist items i, j in A such that:
  xi > 0,  vi < vj,  xj < wj
Let
  a = min(xi, wj − xj)
But then we could replace amount a of item i by item j, which would increase the total benefit.
Thus no solution can be better than the greedy choice.
The fractional knapsack problem has the greedy-choice property!
Task Scheduling
Given: a set T of n tasks, each having:
I a start time si
I a finish time fi (where si < fi)
Goal: perform all tasks using a minimal number of machines
[Figure: tasks drawn as intervals on a timeline 1–9, stacked onto machines 1–3.]
Task Scheduling Algorithm
Greedy choice:
I keep taking the task with the smallest start time
I assign the task to a free machine if possible
  (if all machines are busy, take a new machine)
taskSchedule(T, s, f):
  m = 0    (number of machines)
  sort T such that the elements are ascending w.r.t. si
  while ¬T.isEmpty() do
    i = T.remove(T.first())
    if a machine j ≤ m has time for task i then
      schedule i on machine j
    else
      m = m + 1
      schedule i on machine m
  done
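The greedy scheduler can be sketched in Python by tracking, per machine, the finish time of its last assigned task (illustrative names):

```python
def schedule(tasks):
    """tasks: (start, finish) pairs; returns the number of machines used."""
    machines = []                          # finish time of each machine's last task
    for s, f in sorted(tasks):             # tasks in order of start time
        for j, busy_until in enumerate(machines):
            if busy_until <= s:            # machine j is free at time s
                machines[j] = f
                break
        else:
            machines.append(f)             # all machines busy: open a new one
    return len(machines)
```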
Example
Given: a set T of tasks:
[1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8]
(ordered by starting time)
[Figure: these seven tasks scheduled greedily on three machines along a timeline 1–9.]
Task Scheduling Algorithm
Correctness: suppose there were a better schedule.
I Assume:
  I the greedy algorithm uses m machines,
  I there exists a schedule using fewer than m machines.
I Let i be the first task scheduled on machine m.
I Then on each of the machines 1, . . . , m − 1 there runs a task with start time ≤ si and finish time > si.
I All these tasks conflict with i and with each other.
Hence there cannot be a schedule with fewer than m machines.
The Greedy Method: Applications
Further applications of greedy algorithms:
I string compression (construction of Huffman codes)
I shortest path in graphs
I minimal spanning trees
I . . .
Dynamic Programming
Dynamic programming is an algorithm design paradigm:
I used for optimization problems
I solves complex problems by splitting them into smaller parts
I optimal solutions of subproblems
are combined into a globally optimal solution
Sounds a bit like divide-and-conquer, but:
I not necessarily recursive
I subproblems are allowed to overlap
I a suitable definition of the subproblems can be chosen
This is still rather abstract, so we consider examples. . .
Maximum Subarray Problem
Given: array A of integers (can be negative).
Goal: find the maximal sum A[i ] + . . . + A[i + j ] of a subarray.

Algorithm maxSubarray(A, n):
  Input: array A containing n integers
  Output: maximum subarray sum
  max = 0
  for left = 0 to n − 1 do
    sum = 0
    for right = left to n − 1 do
      sum = sum + A[right]
      if sum > max then max = sum
    done
  done
  return max

The naive algorithm is O(n2).
Maximum Subarray Algorithm
How to split into subproblems?
I let B[r ] be the maximum sum of a subarray ending at rank r
Let B[0] = max(A[0], 0).
We can compute B[r ] from B[r − 1] as follows:
B[r ] = max(0, B[r − 1] + A[r ])
That is, the maximal subarray ending at r is either:
I the maximal subarray ending at r − 1 plus the element A[r ], or
I the empty subarray ending at r .
The maximum subarray sum is the maximum of the B[i ]’s.
This gives rise to a linear-time algorithm. . .
Maximum Subarray Algorithm
Kadane’s algorithm for the maximum subarray problem:
Algorithm maxSubarray(A, n):
  Input: array A containing n integers
  Output: maximum subarray sum
  B = new array of length n
  B[0] = max(A[0], 0)
  max = B[0]
  for r = 1 to n − 1 do
    B[r ] = max(0, B[r − 1] + A[r ])
    max = max(max, B[r ])
  done
  return max
This algorithm computes the maximal subarray sum in O(n).
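The same algorithm in Python; since B[r ] only depends on B[r − 1], the array B need not be materialized and a running value suffices (a sketch, not the course's reference code):

```python
def max_subarray(A):
    """Kadane's algorithm: maximum subarray sum in O(n); the empty subarray counts as 0."""
    best = current = 0
    for x in A:
        current = max(0, current + x)   # B[r] = max(0, B[r-1] + A[r])
        best = max(best, current)
    return best

print(max_subarray([2, -4, 3, -1, 4, -9, 5]))   # 6, from the subarray [3, -1, 4]
```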
Making Change
Problem: return coins for a given amount of money.
I configurations: coins returned and money left to return
I objective function: minimize the number of coins returned
Let C be the set of coins, e.g. C = {1, 3, 4}.
How to split into subproblems?
I let B[w ] be the best solution for making change for w
(that is, the minimal number of coins for making w)
Let B[0] = 0. Compute B[w ] from B[0], . . . , B[w − 1] as follows:
B[w ] = 1 + min { B[w − c] | c ∈ C s.t. c ≤ w }
That is, for the minimal number of coins for w we need:
I a coin c from C plus the minimal number of coins for w − c
Example
Let C = {1, 3, 4}; we make change for 6.
Let B[0] = 0.
I B[1] = 1 + min { B[0] (for c = 1) } = 1
I B[2] = 1 + min { B[1] (for c = 1) } = 2
I B[3] = 1 + min { B[2] (for c = 1), B[0] (for c = 3) } = 1
I B[4] = 1 + min { B[3] (c = 1), B[1] (c = 3), B[0] (c = 4) } = 1
I B[5] = 1 + min { B[4] (c = 1), B[2] (c = 3), B[1] (c = 4) } = 2
I B[6] = 1 + min { B[5] (c = 1), B[3] (c = 3), B[2] (c = 4) } = 2
Thus we need 2 coins to make change for 6.
If we want to know which coins are needed to make change for
w , we have to remember which coin was added in each step.
(for example: coin 3 was chosen for B[6], and 3 for B[3])
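The recurrence translates directly to Python; an extra `used` array remembers the coin added in each step, so the actual coins can be recovered afterwards (names are mine):

```python
def min_coins(coins, amount):
    """Returns (minimal number of coins, one optimal coin list) for `amount`."""
    INF = float("inf")
    B = [0] + [INF] * amount           # B[w] = minimal number of coins for w
    used = [0] * (amount + 1)          # which coin was added for B[w]
    for w in range(1, amount + 1):
        for c in coins:
            if c <= w and B[w - c] + 1 < B[w]:
                B[w] = B[w - c] + 1
                used[w] = c
    # walk back through `used` to recover the coins themselves
    result, w = [], amount
    while w > 0 and B[amount] != INF:
        result.append(used[w])
        w -= used[w]
    return B[amount], result

print(min_coins([1, 3, 4], 6))   # (2, [3, 3])
```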
The 0/1 Knapsack Problem
Given: a set S of n items, where each item i has:
I bi = a positive benefit
I wi = a positive weight
0/1 Knapsack Problem
I choose items T ⊆ S with maximal total benefit:
∑i∈T bi (objective to maximize)
I with weight at most W :
∑i∈T wi ≤ W (constraint)
Thus now each item is either accepted or rejected entirely.
Example
You found a treasure with:
I a 7kg gold bar,
I a 5kg gold bar, and
I a 4kg gold bar.
The value of each gold bar coincides with its weight.
Your backpack can carry only 10kg!
We cannot split the gold bars. What do you take with you?
I The greedy approach would take the 7kg gold bar (value 7).
I However, better is: 5kg + 4kg with total weight 9kg (value 9).
The 0/1 Knapsack Algorithm
How to split into subproblems?
I number the items from 1, . . . ,n
I let Sk be the set of items 1, . . . ,k
I let B[k ,w ] be the best selection from Sk with weight ≤ w
Thus the B[k ,w ] are the solutions of subproblems.
We compute B[k ,w ] from B[k − 1,0], . . . , B[k − 1,w ] as follows:
B[k ,w ] = B[k − 1,w ]                                 if wk > w
B[k ,w ] = max(B[k − 1,w ], B[k − 1,w − wk ] + bk )    otherwise
That is, the best subset of Sk with weight ≤ w is either:
I the best subset of Sk−1 with weight ≤ w , or
I the best subset of Sk−1 with weight ≤ w − wk plus item k
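The 0/1 recurrence as a Python table computation (a sketch; items are (benefit, weight) pairs, an encoding of my own):

```python
def knapsack_01(items, W):
    """items: list of (benefit, weight); returns B[n, W], the best total benefit."""
    n = len(items)
    # B[k][w] = best benefit choosing among items 1..k with weight limit w
    B = [[0] * (W + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        bk, wk = items[k - 1]
        for w in range(W + 1):
            if wk > w:
                B[k][w] = B[k - 1][w]          # item k does not fit
            else:
                B[k][w] = max(B[k - 1][w], B[k - 1][w - wk] + bk)
    return B[n][W]

# Gold-bar example: values equal the weights 7, 5, 4; capacity 10 -> benefit 9.
print(knapsack_01([(7, 7), (5, 5), (4, 4)], 10))
```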
Example
We have:
I item 1 with weight 3 and benefit 9 (b1/w1 = 3)
I item 2 with weight 2 and benefit 5 (b2/w2 = 2.5)
I item 3 with weight 2 and benefit 5 (b3/w3 = 2.5)
Maximal total weight is W = 4!
We run the algorithm:
I B[0,0] = 0, B[0,1] = 0, B[0,2] = 0, B[0,3] = 0, B[0,4] = 0
I B[1,0] = 0, B[1,1] = 0, B[1,2] = 0, B[1,3] = 9, B[1,4] = 9
I B[2,0] = 0, B[2,1] = 0, B[2,2] = 5, B[2,3] = 9, B[2,4] = 9
I B[3,0] = 0, B[3,1] = 0, B[3,2] = 5, B[3,3] = 9, B[3,4] = 10
Thus the best benefit for weight 4 is B[3,4] = 10.
Review Matrix Multiplication
Let A, B be matrices:
I let A have dimension d × e
I let B have dimension e × f
Then C = A · B has dimension d × f .
The (naive) algorithm for computing C = A · B:
C[r , c] = ∑i=0,...,e−1 A[r , i ] · B[i , c]
takes O(d · e · f ) time.
(we multiply every row of A with every column of B)
Matrix Chain-Products
Matrix chain-product:
I Compute A0 · A1 · · ·An−1.
I Let Ai have dimension di × di+1.
Problem: how to parenthesize?
Example:
I B is 3 × 100 (height × width)
I C is 100 × 5
I D is 5 × 5
Then:
I (B · C) · D takes 1500 + 75 = 1575 operations
I B · (C · D) takes 2500 + 1500 = 4000 operations
Attempt 1: Enumeration Approach
Idea of the enumeration (brute-force) approach:
I Try all possible ways to parenthesize A0 · · ·An−1.
I Calculate the number of operations for each of them.
I Pick the best solution.
Running time:
I There are as many ways to parenthesize as there are binary
trees with n leaves (the Catalan numbers).
I This number grows exponentially, almost 4^n.
Thus the algorithm is terribly slow.
Attempt 2: Greedy Approach
Idea: always select the product that uses the fewest operations.
Example where the greedy approach fails:
I A is 2 × 1
I B is 1 × 2
I C is 2 × 3
Then:
I A · B takes 4 operations
I B · C takes 6 operations
Thus the greedy method picks (A · B) · C. However:
I (A · B) · C takes 4 + 12 = 16 operations
I A · (B · C) takes 6 + 6 = 12 operations
Thus the greedy method does not yield the optimal result.
Solution: Dynamic Programming
What are the subproblems?
I the best parenthesization of Ai · Ai+1 · · ·Aj .
Let N[i , j ] be the number of operations for this subproblem.
The global optimum can be defined from optimal subproblems:
I Recall that Ai has dimension di × di+1.
I Thus we can define:
N[i , j ] = min { N[i , k ] + N[k + 1, j ] + di · dk+1 · dj+1 | i ≤ k < j }
Note that the subproblems can overlap:
I e.g. N[1,4] and N[2,6] overlap
Solution: Dynamic Programming Algorithm
Matrix chain-products with running time O(n3):
I We start with N[i , i ] = 0 (nothing to multiply there).
I Then for s = 2,3, . . . we compute the problems of size s.
(that is, the problems N[i , i + s − 1])

matrixChain(d0, d1, . . . , dn):
  for i = 0 to n − 1 do N[i, i] = 0 done
  for s = 2 to n do
    for i = 0 to n − s do
      j = i + s − 1
      N[i, j] = +∞
      for k = i to j − 1 do
        ops = N[i, k] + N[k + 1, j] + di · dk+1 · dj+1
        N[i, j] = min(N[i, j], ops)
      done
    done
  done
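The same O(n3) computation in Python; the list `d` holds the n + 1 dimensions d0, . . . , dn (a sketch following the pseudocode):

```python
def matrix_chain(d):
    """d: dimensions d[0..n], where A_i is d[i] x d[i+1]. Returns N[0, n-1]."""
    n = len(d) - 1                      # number of matrices
    N = [[0] * n for _ in range(n)]
    for s in range(2, n + 1):           # subproblem size
        for i in range(0, n - s + 1):
            j = i + s - 1
            N[i][j] = min(N[i][k] + N[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                          for k in range(i, j))
    return N[0][n - 1]

print(matrix_chain([2, 1, 2, 4, 3]))   # 26
```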
Example
Let d0 = 2, d1 = 1, d2 = 2, d3 = 4, d4 = 3.
I A0 is 2 × 1, A1 is 1 × 2, A2 is 2 × 4, A3 is 4 × 3.
We run the algorithm:
I N[0,0] = 0, N[1,1] = 0, N[2,2] = 0, N[3,3] = 0
I N[0,1] = 4, N[1,2] = 8, N[2,3] = 24
I N[0,2] = min(0 + 8 + 8, 4 + 0 + 16) = 16,
N[1,3] = min(0 + 24 + 6, 8 + 0 + 12) = 20
I N[0,3] = min(0 + 20 + 6, 4 + 24 + 12, 16 + 0 + 24) = 26
Thus the optimal solution is: A0 · ((A1 · A2) · A3).
Graphs
(figure: an example graph with vertices U, V , W , X , Y , Z and edges a–j)
Graphs
A graph is a pair (V ,E) where:
I V is a set of nodes, called vertices
I E is a set of pairs of nodes, called edges
Both the nodes and the edges can store elements (labels).
(figure: a directed graph with vertices U, X , Z and edges b, h, i , j)
Example:
I Nodes: U, X , Z
I Edges: (U,X ), (X ,Z ), (Z ,X ), (Z ,Z )
Edge Types
Directed edge:
I ordered pair of vertices (u, v)
I the first vertex u is the origin
I the second vertex v is the destination
Directed graph: all edges are directed.
(example: a directed edge from Amsterdam to London, labelled “flight A001”)
Undirected edge:
I unordered pair of vertices (u, v)
Undirected graph: all edges are undirected.
(example: an undirected edge between Amsterdam and London, labelled “357.52 km”)
Applications of Graphs
I Electronic circuits.
I Transportation networks:
I street networks (Google maps)
I flight networks
I Computer networks.
I . . .
Graphs: Terminology
I Start- and end-point:
I U is the start-point of a
I V is the end-point of a
I Edges:
I a is an outgoing edge of U
I a is an incoming edge of V
I Degree of a vertex
(number of its edges):
I The degree deg(X ) of X is 5
I Self-loops:
I j is a self-loop
(figure: the example graph with vertices U, V , W , X , Y , Z and edges a–j)
Graphs: Terminology (continued)
I Path:
I sequence of alternating
vertices and edges
I begins and ends with a vertex
I each edge is preceded by
its start-point and followed
by its end-point
I e.g. U c W e X g Y f W d V
I Cycle:
I a path whose first and last
vertex are equal
(figure: the example graph with vertices U, V , W , X , Y , Z and edges a–j)
A path is simple if all vertices are distinct (except first & last).
Implementing Graphs
Vertex stores:
I element
Edge stores:
I element
I start-point and end-point
Edge List Structure
The graph is stored as a list of vertices and a list of edges.
Adjacency List Structure
The graph is stored as a list of vertices. Each vertex stores:
I element
I list of (outgoing and incoming) edges
Implementing Graphs: Performance
Assume that we have a graph with n vertices, and m edges.
                     Edge List   Adjacency List
outgoingEdges(v)     O(m)        O(deg(v))
areAdjacent(v,w)     O(m)        O(min(deg(v),deg(w)))
insertVertex(o)      O(1)        O(1)
insertEdge(v,w,o)    O(1)        O(1)
removeVertex(v)      O(m)        O(deg(v))
removeEdge(e)        O(1)        O(1)
Weighted Graphs
(figure: the example graph with edge weights 5, 1, 2, 3, 6, 3, 1, 2, 5, 3)
In a weighted graph each edge has an associated number
called its weight.
The weight can represent distances, costs, . . .
Shortest Path Problem
Given: a weighted graph and vertices u, v
I find the shortest path between u and v
I shortest means ‘minimal total weight’
Example: the shortest path from U to Z has weight 8
(figure: the weighted example graph)
Dijkstra’s Algorithm
The distance of a vertex v from a vertex s:
I minimal weight of a path from s to v
Dijkstra’s algorithm computes distances (shortest paths):
I from a given start vertex s to all other vertices
I assumption: the edge weights are non-negative
Dijkstra’s algorithm:
I For every vertex v we store the estimated distance d(v).
(initially d(s) = 0 and d(v) = ∞ for all other vertices v)
I We start with the set of vertices S = {s} and in every step:
I add to S the vertex v outside of S with smallest d(v)
I update the distances of the direct neighbours n of v:
d(n) = min(d(n), d(v) + weight of the edge from v to n)
Dijkstra’s Algorithm: Example
Example: shortest paths from U
(figure: the weighted example graph; each node is labelled with its
current distance estimate d(v))
I First we add U to S and update the distances of its
neighbours: d(V ) = 5, d(W ) = 1.
I Now W is the closest node outside S: we add W to S and
update the distances of its neighbours: d(V ) = 3, d(X ) = 7.
I Now V is the closest node outside S: we add V to S and
update the distances of its neighbours: d(X ) = 6.
I Now X is the closest node outside S: we add X to S and
update the distances of its neighbours: d(Y ) = 7, d(Z ) = 8.
I Now Y , and finally Z , are added; no distances change.
We obtain a tree of all shortest paths from U to all other nodes.
(e.g. the shortest path from U to Z has length 8)
Dijkstra’s Algorithm
Assume that we have a graph with n vertices, and m edges.
I A heap-based priority queue stores the vertices outside of S:
I key: distance
I element: vertex
I Each vertex v stores its distance d(v) and a link to its heap node.
Algorithm on the following slide. . .
Dijkstra’s Algorithm
DijkstraDistances(G, s):
  Q = new heap-based priority queue
  (set all distances to ∞ except for s with distance 0)
  for each v ∈ G.vertices() do
    if v == s then v.setDistance(0) else v.setDistance(∞)
    h = Q.insert(v.getDistance(), v)   (returns the heap node)
    v.setHeapNode(h)
  done
  while ¬Q.isEmpty() do
    v = Q.removeMin()
    (update the distances of all neighbours)
    for each e ∈ v.outgoingEdges() do
      w = e.endPoint()
      d = v.getDistance() + e.getWeight()
      if d < w.getDistance() then
        w.setDistance(d)
        Q.replaceKey(w.getHeapNode(), d)
    done
  done
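A Python sketch using the standard heapq module. heapq has no replaceKey operation, so a common workaround is lazy deletion: outdated heap entries are simply skipped when popped. The adjacency-list encoding and the small test graph are my own, not the lecture's:

```python
import heapq

def dijkstra(graph, s):
    """graph: {v: [(neighbour, weight), ...]} with non-negative weights.
    Returns a dict of distances from s."""
    dist = {s: 0}
    queue = [(0, s)]
    finished = set()
    while queue:
        d, v = heapq.heappop(queue)
        if v in finished:
            continue                    # stale entry, v already processed
        finished.add(v)
        for w, weight in graph.get(v, []):
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd            # relax the edge (v, w)
                heapq.heappush(queue, (nd, w))
    return dist

# A small graph in the spirit of the lecture example:
g = {"U": [("V", 5), ("W", 1)], "W": [("V", 2), ("X", 6)],
     "V": [("X", 3)], "X": [("Z", 2)], "Z": []}
print(dijkstra(g, "U"))
```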
Dijkstra’s Algorithm: Performance
Assume that we have a graph with n vertices, and m edges.
I The first loop is executed n times:
I n times inserting into the priority queue (O(log n))
I The second loop is executed n times:
I n times removing from the priority queue (O(log n))
I The third loop is executed once for every edge (m times):
I m times updating a key in the priority queue (O(log n))
In total we have: (n · log n) + (n · log n) + (m · log n).
I O((n + m) · log n)
(if we use the adjacency list structure)
Dijkstra’s algorithm is O(m · log n) if the graph is connected
(then m ≥ n − 1).
Why Dijkstra’s Algorithm Works
Dijkstra’s algorithm is based on the greedy method:
I it adds the vertices in order of increasing distance
Suppose it would not work:
I Let x be the closest vertex with a wrongly assigned distance.
I Let y be the previous node on the shortest path from s to x.
I Then y is closer to s, and has the correct distance:
I thus y has been processed before x, and
I then x must have been assigned the correct distance.
Hence there cannot be a wrong vertex. The algorithm works!
Why It Doesn’t Work For Negative Weights
Example: shortest path from X
(figure: vertices X , Y , Z , V with edges X → Y of weight 7,
X → Z of weight 5, Y → Z of weight −3, and Z → V of weight 1;
Dijkstra processes X , Z , V , Y and ends with d(Y ) = 7,
d(Z ) = 4, d(V ) = 6)
We have processed all nodes, but V has the wrong distance!
(the shortest path X → Y → Z → V has weight 7 − 3 + 1 = 5)
The problem with negative weights is:
I The nodes on a shortest path do not always have
increasing distance.
Bellman-Ford Algorithm
Bellman-Ford algorithm:
I works with negative weights
I after iteration i , all shortest paths using at most i edges are found

BellmanFord(G, s):
  for each v ∈ G.vertices() do
    if v == s then v.setDistance(0) else v.setDistance(∞)
  for i = 1 to n − 1 do
    for each e ∈ G.edges() do
      v = e.startPoint()
      w = e.endPoint()
      d = v.getDistance() + e.getWeight()
      if d < w.getDistance() then
        w.setDistance(d)
  done
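A direct Python translation over an edge list; the vertex numbering and the tuple encoding are mine. The test graph is a negative-weight situation of the kind where Dijkstra would fail:

```python
def bellman_ford(n, edges, s):
    """edges: list of (u, v, weight) for vertices 0..n-1; returns distances from s."""
    INF = float("inf")
    dist = [INF] * n
    dist[s] = 0
    for _ in range(n - 1):              # n-1 rounds of relaxing every edge
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# X=0, Y=1, Z=2, V=3 with X->Y (7), X->Z (5), Y->Z (-3), Z->V (1):
print(bellman_ford(4, [(0, 1, 7), (0, 2, 5), (1, 2, -3), (2, 3, 1)], 0))
```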
Example
The nodes are labelled with the values d(v).
(figure: a six-node weighted graph with edge weights 5, 6, 7, −2, 4,
1, 2, 2, −4; each round of relaxations refines the distance labels,
starting from d(s) = 0 and ∞ elsewhere, until they stabilize)
All-Pair Shortest Paths
Find the distance between every pair of vertices:
I n calls to Dijkstra’s algorithm take O(n · m · log n) time
I n calls to Bellman-Ford take O(n2 · m) time
We can do it in O(n3) as follows (assumes the nodes are 1, . . . ,n):

AllPair(G):
  for each vertex pair (i, j) do
    if i != j and (i, j) is an edge in G then
      D0[i, j] = weight of edge (i, j)
    else if i == j then D0[i, j] = 0
    else D0[i, j] = ∞
  for k = 1 to n do
    for i = 1 to n do
      for j = 1 to n do
        Dk[i, j] = min(Dk−1[i, j], Dk−1[i, k] + Dk−1[k, j])
  return Dn
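This is the Floyd-Warshall algorithm. In Python one matrix can be updated in place, since row k and column k do not change during round k (the dict-of-edges encoding is mine):

```python
def all_pairs(n, edges):
    """edges: {(i, j): weight} for vertices 0..n-1; returns the distance matrix."""
    INF = float("inf")
    D = [[0 if i == j else edges.get((i, j), INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):                  # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# The four-vertex example below, renumbered 0..3:
D = all_pairs(4, {(0, 1): 1, (0, 2): 5, (1, 3): 2, (3, 0): 3, (3, 2): 1})
print(D[0])   # [0, 1, 4, 3]
```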
Example
(figure: a directed graph on the vertices 1, 2, 3, 4 with edges
1 → 2 (weight 1), 1 → 3 (weight 5), 2 → 4 (weight 2),
4 → 1 (weight 3), 4 → 3 (weight 1))

D0:
     1  2  3  4
 1   0  1  5  ∞
 2   ∞  0  ∞  2
 3   ∞  ∞  0  ∞
 4   3  ∞  1  0

k = 1:
     1  2  3  4
 1   0  1  5  ∞
 2   ∞  0  ∞  2
 3   ∞  ∞  0  ∞
 4   3  4  1  0

k = 2:
     1  2  3  4
 1   0  1  5  3
 2   ∞  0  ∞  2
 3   ∞  ∞  0  ∞
 4   3  4  1  0

k = 3: (unchanged)

k = 4:
     1  2  3  4
 1   0  1  4  3
 2   5  0  3  2
 3   ∞  ∞  0  ∞
 4   3  4  1  0
Minimum Spanning Trees
Spanning tree T of a weighted graph G:
I T contains all nodes and a subset of the edges of G
I T is a tree
A spanning tree is minimal if its total edge weight is minimal.
Application:
I Communication networks.
I Transportation networks.
(figure: a weighted graph on the vertices U, V , W , X , Y , Z , P
with edge weights 1–10)
Cycle Property
Cycle property:
I Let T be a minimal spanning tree of a weighted graph G.
I Let e be an edge of G that is not in T .
I Let C be the cycle formed by e and T .
Then for every edge f of C we have weight(f ) ≤ weight(e).
(figure: the weighted example graph; the edge e of weight 8
closes a cycle C with tree edges of weights 4, 1 and 6)
Proof:
I assume weight(f ) > weight(e)
I replacing f by e in T would yield
a better spanning tree
Example: 8 ≥ 4, 1, 6.
Partition Property
Partition property:
I Consider a partition of the vertices into A and B.
I Let e be a minimum-weight edge between A and B.
Then there is a minimum spanning tree containing e.
(figure: the weighted example graph partitioned into A and B)
Proof: let T be an MST
I if e ∉ T then it must create a
cycle C with T ; let f be an
edge of C between A and B
I by the cycle property
weight(f ) ≤ weight(e)
I thus weight(f ) = weight(e)
I replacing f with e yields an MST
Prim-Jarnik’s Algorithm
Similar to Dijkstra’s algorithm:
I we start with a node s, S = {s}
I build the MST while stepwise extending the set S
But something is changed:
I nodes do not store the distance to s, but instead
I nodes store the distance to the cloud S
Prim-Jarnik’s algorithm: pick a node s
I For every vertex v we store the distance d(v) to S.
(initially d(s) = 0 and d(v) = ∞ for all other vertices v)
I We start with the set of vertices S = {s} and in every step:
I add to S the vertex v outside of S with smallest d(v)
I update the distances of the direct neighbours n of v:
d(n) = min(d(n), weight of the edge from v to n)
(whenever d(n) is changed we do n.setParent(v))
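A Python sketch with heapq and lazy deletion, analogous to the Dijkstra sketch; the adjacency-list encoding and the small triangle graph are mine, and the graph must be undirected (each edge listed in both directions):

```python
import heapq

def prim(graph, s):
    """graph: {v: [(neighbour, weight), ...]}, undirected.
    Returns the total weight of a minimum spanning tree."""
    total = 0
    cloud = {s}
    heap = [(w, v) for v, w in graph[s]]    # candidate edges out of the cloud
    heapq.heapify(heap)
    while heap:
        weight, v = heapq.heappop(heap)
        if v in cloud:
            continue                        # stale entry, v already in the cloud
        cloud.add(v)
        total += weight                     # cheapest edge connecting v to the cloud
        for u, w in graph[v]:
            if u not in cloud:
                heapq.heappush(heap, (w, u))
    return total

g = {"A": [("B", 1), ("C", 3)],
     "B": [("A", 1), ("C", 2)],
     "C": [("A", 3), ("B", 2)]}
print(prim(g, "A"))   # 3 (edges A-B and B-C)
```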
Example
We compute the minimal spanning tree starting from U:
(figure: the example graph on U, V , W , X , Y , Z with edge
weights 1, 3, 4, 2, 2, 1, 3, 5; each node is labelled with its current
distance d(v) to the cloud S)
I Starting with d(U) = 0 and d(v) = ∞ elsewhere, the nodes
are added one by one in order of their distance to S; after each
step the neighbours’ labels drop to the weight of the cheapest
edge connecting them to S.
Prim-Jarnik’s algorithm is O(m · log n).
(same analysis as Dijkstra’s algorithm)
Strings
A string is a sequence of characters. Let S be a string of size m:
I A substring of S is a string of the form S[i . . . j].
I A prefix of S is a string of the form S[0 . . . i].
I A suffix of S is a string of the form S[i . . . (m − 1)].
The alphabet Σ is the set of possible characters.
Example
Let S = “acaabca”:
I “abc” is a substring of S
I “acaa” is a prefix of S
I “ca” is a suffix of S
Pattern Matching
The pattern matching problem:
I given: strings T (text) and P (pattern)
I task: find a substring of T equal to P
Example: find P = “ake” in T = “Hey there, wake up!”.
Applications:
I text editors
I search engines
I biological research
I . . .
Brute-Force Algorithm
The brute-force algorithm:
I compares P with the text T for each possible shift of P
Let n be the size of T and m the size of P.

bruteForceMatch(T ,P):
  for pos = 0 to n − m do
    match = true
    for i = 0 to m − 1 do
      if P[i] != T [pos + i] then
        match = false
        break the for-i-loop
    if match then return pos   (match at position pos)
  return −1   (no match)
Worst case complexity: O(n · m)
I Example for worst-case: T = aaa . . . aah, P = aaah
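The brute-force matcher in Python (the function name is mine):

```python
def brute_force_match(T, P):
    """Returns the first position of P in T, or -1 if there is no match."""
    n, m = len(T), len(P)
    for pos in range(n - m + 1):
        # compare P against T at shift pos, left to right
        if all(P[i] == T[pos + i] for i in range(m)):
            return pos
    return -1

print(brute_force_match("Hey there, wake up!", "ake"))   # 12
```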
Boyer-Moore Heuristics
Boyer-Moore’s pattern matching algorithm:
I Looking-glass heuristic:
I compare P with a substring of T backwards
I Character-jump heuristic: if a mismatch occurs at T [i] = c
I if P does not contain c, shift P to align P[0] with T [i + 1]
I else, shift P to align the last occurrence of c in P with T [i]
Example: we search the pattern P = “rithm” in
a p a t t e r n  m a t c h i n g  a l g o r i t h m
(figure: the pattern “rithm” is compared backwards and shifted
right after each mismatch; after 11 comparisons its occurrence at
the end of the text is found)
Last-Occurrence Function
The last-occurrence function L(c):
I L(c) maps letters c ∈ Σ to integers, where:
I L(c) is the largest index i such that P[i] = c, or
I L(c) is −1 if no such index exists.
Example:
I Σ = {a, b, c, d}
I P = “abacab”

 c     a  b  c  d
 L(c)  4  5  3  −1

The last-occurrence function can be computed in O(m + s) time:
I where m is the size of P and s the size of Σ
I L(c) can be represented as an array where the numerical
codes of the letters c ∈ Σ are the indices
Boyer-Moore Algorithm
BoyerMooreMatch(T ,P, Σ):
  L = lastOccurrenceFunction(P, Σ)
  pos = 0
  while pos ≤ n − m do
    match = true
    for i = m − 1 downto 0 do
      if P[i] != T [pos + i] then
        match = false
        (align last occurrence, but move at least 1)
        pos = pos + max(1, i − L(T [pos + i]))
        break the for-i-loop
    done
    if match then return pos   (match at position pos)
  done
  return −1   (no match)
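A Python sketch of the pseudocode; a dict built in one left-to-right pass over P plays the role of the last-occurrence array (names are mine):

```python
def boyer_moore_match(T, P):
    """Boyer-Moore with the looking-glass and character-jump heuristics.
    Returns the first match position, or -1."""
    n, m = len(T), len(P)
    L = {c: i for i, c in enumerate(P)}     # last-occurrence function
    pos = 0
    while pos <= n - m:
        i = m - 1
        while i >= 0 and P[i] == T[pos + i]:
            i -= 1                          # looking-glass: compare backwards
        if i < 0:
            return pos                      # all m characters matched
        # character jump: align the last occurrence of T[pos+i], move at least 1
        pos += max(1, i - L.get(T[pos + i], -1))
    return -1

print(boyer_moore_match("abacaabcadabacab", "abacab"))   # 10
```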
Example
Example:
I Σ = {a, b, c, d}
I P = “abacab”

 c     a  b  c  d
 L(c)  4  5  3  −1

a b a c a a b c a d a b a c a b a a b b
(figure: the pattern is compared backwards and jumped using L;
after 12 comparisons the occurrence of “abacab” starting at
position 10 is found)
Boyer-Moore: Analysis
The worst-case time complexity of Boyer-Moore is O(n · m + s).
I Example of the worst case: T = “aaa. . . a” and P = “baaa”
The worst case:
I may occur in images or DNA sequences
I is unlikely in English text
Boyer-Moore’s algorithm is
I significantly faster than brute force on English text
Knuth-Morris-Pratt (KMP) Algorithm
Knuth-Morris-Pratt’s algorithm:
I compares the pattern left-to-right
I shifts the pattern more intelligently than the brute-force algorithm
When a mismatch occurs: what is the most we can shift the
pattern to avoid redundant comparisons?
I Answer: the largest prefix of P[0 . . . i] that is a suffix of P[1 . . . i]
(where i is the last compared position)
(figure: after a mismatch, the pattern “abaaba” is shifted so that
its prefix lines up with the equal suffix of the matched part; there
is no need to compare these characters again, and the comparison
resumes at the mismatch position)
KMP Failure (Shift) Function
The Knuth-Morris-Pratt failure function F (i):
I the algorithm preprocesses the pattern to compute F (i)
I F (i) is the size of the largest prefix of P[0 . . . i] that is also
a suffix of P[1 . . . i]
Example: P = “abaaba”

 i     0  1  2  3  4  5
 P[i]  a  b  a  a  b  a
 F (i) 0  0  1  1  2  3

(figure: after a mismatch at position i , the pattern is shifted by
i − F (i − 1) positions, so that the comparison can resume at the
mismatch position)
Knuth-Morris-Pratt Algorithm
KMPMatch(T ,P):
  F = failureFunction(P)
  pos = 0
  i = 0
  while pos ≤ n − m do
    if P[i] != T [pos + i] then
      if i > 0 then
        pos = pos + i − F (i − 1)
        i = F (i − 1)
      else
        pos = pos + 1
    else
      i = i + 1
      if i == m then return pos   (match at position pos)
  done
  return −1   (no match)
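Both the failure function and the matcher in Python; the matcher below scans the text with a single index rather than an explicit shift, which is the same O(n + m) idea in a slightly different shape (names are mine):

```python
def failure_function(P):
    """F[i] = size of the largest prefix of P[0..i] that is a suffix of P[1..i]."""
    F = [0] * len(P)
    j = 0
    for i in range(1, len(P)):
        while j > 0 and P[i] != P[j]:
            j = F[j - 1]                # fall back along the failure links
        if P[i] == P[j]:
            j += 1
        F[i] = j
    return F

def kmp_match(T, P):
    """Returns the first position of P in T, or -1, in O(n + m) time."""
    F = failure_function(P)
    j = 0                               # number of pattern characters matched
    for i, c in enumerate(T):
        while j > 0 and c != P[j]:
            j = F[j - 1]                # shift the pattern using F
        if c == P[j]:
            j += 1
        if j == len(P):
            return i - len(P) + 1
    return -1

print(failure_function("abaaba"))                        # [0, 0, 1, 1, 2, 3]
print(kmp_match("abaabcababaabaabaabb", "abaaba"))       # 8
```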
Example
a b a a b c a b a b a a b a a b a a b b
(figure: the pattern “abaaba” is compared left to right and shifted
using F after each mismatch; after 17 comparisons the occurrence
starting at position 8 is found)

 i     0  1  2  3  4  5
 F (i) 0  0  1  1  2  3
Knuth-Morris-Pratt: Analysis
Knuth-Morris-Pratt’s algorithm runs in O(n + m) time:
I The failure function can be computed in O(m) time.
I In each iteration of the while-loop:
I either i increases by 1, or
I pos increases by at least the amount by which i decreases.
I Thus the while-loop is executed at most 2n times.
Binary Character Encoding
Given: alphabet A1,A2, . . . ,An.
I find binary codes c(Ai) for the letters Ai
Encode the letters as binary numbers of bit length ⌈log2 n⌉:
I e.g. alphabet A, B, C
I c(A) = 00, c(B) = 01, c(C) = 11
I ACBA = 00110100
More efficient encodings:
I c(A) = 0, c(B) = 10, c(C) = 11
I ACBA = 011100
The coding should be unambiguous. Bad example:
I Let c(A) = 10, c(B) = 01, and c(C) = 0.
I Then both BC and CA yield the code 010.
The code must be prefix-free: no c(Ai) is a prefix of a c(Aj)!
Huffman code
Given:
I Alphabet A1,A2, . . . ,An.
I Probabilities 0 ≤ p(Ai) ≤ 1 for every letter Ai .
Problem:
I Find binary codes c(Ai) for the letters Ai .
I The expected code length, that is,
∑1≤i≤n p(Ai) · |c(Ai)|
should be minimal.
Huffman Algorithm
Huffman Algorithm:
I Create for every letter Ai a tree consisting only of Ai .
I Search for the two trees with the lowest sums of probabilities:
I merge these two trees, with a new node as the root.
I Repeat the last step until only one tree is left.
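A Python sketch with a heap of trees; ties between equal probabilities are broken by a counter, so the exact codes may differ from a hand-built tree, but the expected code length is the same (names are mine):

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: {letter: probability}; returns {letter: binary code}."""
    tick = count()                      # tie-breaker so heap tuples always compare
    heap = [(p, next(tick), letter) for letter, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap) # the two trees with lowest probability
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (t1, t2)))  # merge them
    codes = {}
    def assign(tree, code):
        if isinstance(tree, tuple):
            assign(tree[0], code + "0") # left child: bit 0
            assign(tree[1], code + "1") # right child: bit 1
        else:
            codes[tree] = code
    assign(heap[0][2], "")
    return codes

codes = huffman({"A": .15, "B": .2, "C": .15, "D": .4, "E": .1})
print(sorted(len(codes[x]) for x in "ABCDE"))   # [1, 3, 3, 3, 3]
```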
Example:
p(A) = .15, p(B) = .2, p(C) = .15, p(D) = .4, p(E) = .1
(figure: first the trees for C and E are merged (.25), then A and
B (.35), then these two trees (.6), and finally the result is merged
with D)
The resulting codes are:
c(A) = 100, c(B) = 101, c(C) = 110, c(D) = 0, c(E) = 111
average code length: 3 · 0.6 + 1 · 0.4 = 2.2