11/2/2006 Lecture7 gac1 1
The Growth of Functions
• This lecture will introduce some tools and notation necessary to study algorithm efficiency.
• The lecture will cover
  – Asymptotic behaviour
  – Big-O notation
  – Simplifying big-O expressions
  – Big-O of sums and products
  – Big-Omega and Big-Theta notation
Asymptotic Behaviour
• Definition
  – An algorithm is a finite set of precise instructions for performing a computation or for solving a problem.
• Generally, the problem to be solved will have certain parameters
  – the algorithm needs to work for all valid parameters.
  – e.g. an algorithm for multiplying two matrices should work on any two matrices.
• An important question when comparing two algorithms for the same problem: which one is faster?
  – this will have many possible answers, and may depend on
    • the language in which it is written.
    • the machine used to run the algorithm.
    • the size of the problem instance, e.g. is this a 4x4 matrix or a 1024x1024 matrix?
Asymptotic Behaviour
• Asymptotic analysis is a method of abstracting away from these details. This is achieved by
  – examining the run-time as the problem instance increases in size.
  – ignoring constants of proportionality
    • if the run-time doubles when going from a 2x2 matrix to a 3x3 matrix, that is more interesting than it going from 0.01 sec to 0.02 sec on a particular machine.
• Example
  – You have implemented two division algorithms for n-bit numbers.
  – One takes 100n^2 + 17n + 4 microseconds. The other takes n^3 microseconds.
  – For some values of n, n^3 < 100n^2 + 17n + 4 (e.g. n = 5). But for large enough n, n^3 will be bigger.
  – We will concentrate on this "big picture".
Big-O Notation
• Definition
  – Let f and g be functions from the set of integers or the set of reals to the set of reals. The function f(x) is big-O of g(x) iff
    ∃c∈R+ ∃k∈R+ ∀x ( (x > k) → ( |f(x)| ≤ c|g(x)| ) )
• Notation
  – We will write "f(x) is O(g(x))" when f(x) is big-O of g(x).
• Intuition
  – The constant k formalizes our concept of "big enough" problem instances.
  – The constant c formalizes our wish to ignore constants of proportionality.
Examples
• Example 1
  – [Figure: a plot of |f(x)| against x, with the line cx lying above |f(x)| for all x > k, illustrating that f(x) is O(x).]
• Example 2
  – We will show that f(x) = x^2 + 5x is O(x^2).
  – x^2 > 5x whenever x > 5. So for x > 5, f(x) ≤ 2x^2.
  – So using k = 5, c = 2, we have f(x) is O(x^2).
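The witnesses in Example 2 can be spot-checked numerically. The sketch below (a finite-range sanity check, not a proof) verifies |f(x)| ≤ c|g(x)| for sampled x > k:

```python
# Spot-check (not a proof): |f(x)| <= c*|g(x)| for sampled x > k,
# with f(x) = x^2 + 5x, g(x) = x^2, and the witnesses c = 2, k = 5.
def f(x):
    return x**2 + 5*x

def g(x):
    return x**2

c, k = 2, 5
assert all(abs(f(x)) <= c * abs(g(x)) for x in range(k + 1, 10000))
print("witnesses c = 2, k = 5 hold for all sampled x")
```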
Simplifying big-O Expressions
• Theorem
  – Let f(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_1 x + a_0, where ∀i (a_i ∈ R). Then f(x) is O(x^n).
• Proof
  – For this proof, we will assume the "triangle inequality" ∀x∀y ( |x+y| ≤ |x| + |y| ).
  – Using this property, for x > 1,
    |f(x)| ≤ |a_n| x^n + |a_{n-1}| x^{n-1} + … + |a_1| x + |a_0|
           = x^n ( |a_n| + |a_{n-1}|/x + … + |a_1|/x^{n-1} + |a_0|/x^n )
           ≤ x^n ( |a_n| + |a_{n-1}| + … + |a_1| + |a_0| )
  – So using c = |a_n| + |a_{n-1}| + … + |a_1| + |a_0|, k = 1, and g(x) = x^n gives ∀x ( (x > k) → ( |f(x)| ≤ c|g(x)| ) ), completing the proof.
Big-O of Sums
• It is often necessary to characterise the growth of a sum of functions.
  – Let f(x) denote the run-time of a Delphi function fun1, and let g(x) denote the run-time of another Delphi function fun2. What is known about the run-time of a main program that first calls fun1 and then calls fun2?
• Theorem
  – Let f_1(x) be O(g_1(x)) and f_2(x) be O(g_2(x)). Then f_1(x) + f_2(x) is O( max( |g_1(x)|, |g_2(x)| ) ).
• Proof
  – There exist k_1, k_2, c_1, c_2 such that
    |f_1(x)| ≤ c_1|g_1(x)| for all x > k_1, and
    |f_2(x)| ≤ c_2|g_2(x)| for all x > k_2.
Big-O of Sums
  – By the triangle inequality, for all x > max(k_1, k_2),
    |f_1(x) + f_2(x)| ≤ |f_1(x)| + |f_2(x)|
                      ≤ c_1|g_1(x)| + c_2|g_2(x)|
                      ≤ c_1 max( |g_1(x)|, |g_2(x)| ) + c_2 max( |g_1(x)|, |g_2(x)| )
                      = (c_1 + c_2) max( |g_1(x)|, |g_2(x)| )
  – Using c = c_1 + c_2 and k = max(k_1, k_2), we have that f_1(x) + f_2(x) is O( max( |g_1(x)|, |g_2(x)| ) ).
• Example
  – Given that n! is O(n^n), provide a big-O expression for n! + 34n^10. You may assume n is non-negative.
  – n! is O(n^n) and 34n^10 is O(n^10). So n! + 34n^10 is O( max( |n^n|, |n^10| ) ) = O( max(n^n, n^10) ) = O(n^n).
Big-O of Products
• A similar theorem exists for the big-O of a product of functions.
• Theorem
  – Let f_1(x) be O(g_1(x)) and f_2(x) be O(g_2(x)). Then f_1(x)f_2(x) is O( g_1(x)g_2(x) ).
• Proof
  – There exist k_1, k_2, c_1, c_2 such that
    |f_1(x)| ≤ c_1|g_1(x)| for all x > k_1, and
    |f_2(x)| ≤ c_2|g_2(x)| for all x > k_2.
  – So for x > max(k_1, k_2), |f_1(x)f_2(x)| ≤ c_1 c_2 |g_1(x)| |g_2(x)| = c_1 c_2 |g_1(x)g_2(x)|.
  – Using c = c_1 c_2 and k = max(k_1, k_2), we have that f_1(x)f_2(x) is O( g_1(x)g_2(x) ).
Big-Omega Notation
• Big-O notation provides a useful abstraction for upper bounds on function growth, but has some problems.
  – x^2 is O(x^2), but x^2 is also O(x^3), or even O(2^x).
  – Providing a lower bound on function growth would be helpful, but is often much harder.
• Definition
  – Let f and g be functions from the set of integers or the set of reals to the set of reals. The function f(x) is big-Omega of g(x) iff
    ∃c∈R+ ∃k∈R+ ∀x ( (x > k) → ( |f(x)| ≥ c|g(x)| ) ).
• Notation
  – We will write "f(x) is Ω(g(x))" when f(x) is big-Omega of g(x).
Big-Omega Notation
• Example
  – Show that x^2 – 6x is Ω(x^2).
  – x^2 – 6x = x^2(1 – 6/x). For x ≥ 7, (1 – 6/x) is positive, so
    |x^2 – 6x| = |x^2|(1 – 6/x) ≥ |x^2|(1 – 6/7) = (1/7)|x^2| for x ≥ 7.
  – So choosing k = 7 and c = 1/7, we have that x^2 – 6x is Ω(x^2).
Big-Omega Notation
• Theorem
  – f(x) is Ω(g(x)) iff g(x) is O(f(x)).
• Proof
  – If f(x) is Ω(g(x)), then we have |f(x)| ≥ c_1|g(x)| for x > k_1. Since c_1 is positive, |g(x)| ≤ (1/c_1)|f(x)|. Using c = 1/c_1 and k = k_1 gives g(x) is O(f(x)).
  – If g(x) is O(f(x)), then we have |g(x)| ≤ c_2|f(x)| for x > k_2. Since c_2 is positive, |f(x)| ≥ (1/c_2)|g(x)|. Using c = 1/c_2 and k = k_2 gives f(x) is Ω(g(x)).
Big-Theta Notation
• Definition
  – Let f and g be functions from the set of integers or the set of reals to the set of reals. The function f(x) is big-Theta of g(x) iff f(x) is O(g(x)) and f(x) is Ω(g(x)).
• Notation
  – We will write "f(x) is Θ(g(x))", or "f(x) is order g(x)", when f(x) is big-Theta of g(x).
• Example
  – We recently proved that x^2 – 6x is Ω(x^2). Also, from our theorem on big-O of polynomials, x^2 – 6x is O(x^2). Thus x^2 – 6x is Θ(x^2).
  – Note that while x^2 – 6x is O(x^3), it is not Θ(x^3).
  – Note that while x^2 – 6x is Ω(x), it is not Θ(x).
Big-Theta in Action
• Big-Theta is a tight bound on the function
  – it sandwiches the function f(x) between two scaled versions of the same function g(x).
  – note that in the above example, we could not sandwich the function between scaled versions of g(x) = x, for example.
• [Figure: plot against x of x^2, x^2 – 6x, and x^2/7, showing x^2 – 6x sandwiched between x^2/7 and x^2 for large x.]
Test Your Knowledge
• Which of these functions is (i) O(x), (ii) Ω(x), (iii) Θ(x)?
  (a) f(x) = 10
  (b) f(x) = x^2 + x + 1
  (c) f(x) = 3x + 7
Summary
• This lecture has covered
  – Asymptotic behaviour
  – Big-O notation
  – Simplifying big-O expressions
  – Big-O of sums and products
  – Big-Omega and Big-Theta notation
• The next lecture will use these concepts to examine the complexity of algorithms.
11/2/2006 Lecture8 gac1 1
The Complexity of Algorithms
• This lecture will apply our analysis of the growth of functions to help us analyse algorithms.
• The lecture will cover
  – Time complexity and space complexity
  – Worst-case and average-case complexity
  – Common complexity classes
  – The time complexity of some common algorithms
    • Linear search
    • Binary search
    • Bubble sort
Worst-Case Time Complexity
• Algorithms may be compared by
  – how much time they take to execute (time complexity)
  – how much memory they require (space complexity)
• We will concentrate on time complexity in this lecture.
• Often, an algorithm will take much longer on one input than on another, even when both inputs are the same size.
• Example
  – Consider the following algorithm, linear search, which looks for a particular element value in an array.
Linear Search
• The algorithm works by comparing x to each element of a[ ] in turn. The first one to match is taken as the result.
• The algorithm terminates with i equal to the index of the matching element, so a[i] = x
  – (assuming that there is at least one such element).

procedure linear_search( integer x, integer a[n] )
begin
1.  i := 1
2.  while( i ≤ n and x ≠ a[ i ] )
3.    i := i + 1
end
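As a concrete sketch, here is the same algorithm in Python (0-based indexing rather than the slides' 1-based arrays):

```python
# A direct Python transcription of the linear_search pseudocode,
# using 0-based indices.
def linear_search(x, a):
    """Return the index of the first element equal to x.

    Mirrors lines 1-3 of the pseudocode: one initialisation,
    then a comparison and an increment per non-matching element.
    """
    i = 0
    while i < len(a) and x != a[i]:
        i += 1
    return i  # assumes at least one matching element, as the slide does

print(linear_search(2, [1, 5, 2]))  # → 2
```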
Worst/Average-Case Complexity
• Given the array a[1] = 1, a[2] = 5, a[3] = 2, a call linear_search( 1, a ) will clearly take less time than a call linear_search( 2, a ).
• Often, we are most interested in the worst-case complexity, e.g.
  – what is the worst possible time taken by linear_search, as a function of n, over all the possible values of x and a[ ]?
• Sometimes, we are also interested in the average-case complexity, e.g.
  – what is the average time taken by linear_search, as a function of n, over all the possible values of x and a[ ]?
  – typically, average-case complexity is very difficult to analyse, and also requires information on the joint probability distribution of the inputs.
  – we will restrict ourselves to worst-case complexity analysis.
Linear Search
• We will now find an upper bound for the worst-case time complexity of linear_search.
  – Line 1 executes once.
  – Line 2 (the comparisons) executes at most n times.
  – Line 3 executes at most n – 1 times.
• Thus we have an execution time of O(1) + O(n) + O(n–1) = O(n).
• Big-O notation has helped us here, by allowing us to ignore that
  – we don't know the absolute amount of time taken for an assignment, an array indexing (i.e. finding a[i] from a and i), or an addition;
  – we don't even know the ratios of how long these operations take.
• All we needed to assume was that each of these times is a constant (independent of n).
Binary Search
• Binary search is another searching algorithm, which can be used on an array which is known to be sorted
  – we will assume it is sorted in ascending order.

procedure binary_search( integer x, integer a[n] )
begin
1.  i := 1    { we know that x is somewhere between }
2.  j := n    { the i'th and the j'th element of a, inclusive }
3.  while( i < j )
    begin
4.    m := floor( (i+j)/2 )
5.    if x > a[ m ] then i := m + 1 else j := m
    end
end
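A Python transcription of this pseudocode (0-based indices, so the returned position is one less than under the slides' 1-based convention):

```python
# Python version of the binary_search pseudocode, 0-based indexing.
def binary_search(x, a):
    """Locate x in sorted list a, mirroring the slides' variant.

    i and j bracket the indices that could still hold x; each
    iteration halves the bracket, so the loop runs O(log n) times.
    """
    i, j = 0, len(a) - 1
    while i < j:
        m = (i + j) // 2      # floor of the midpoint
        if x > a[m]:
            i = m + 1         # x can only lie above m
        else:
            j = m             # x lies at or below m
    return i                  # i == j: the only candidate left

print(binary_search(7, [1, 3, 5, 7, 9, 11]))  # → 3
```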
Binary Search
• How this algorithm works:
  – Variables i and j record the lower and upper array indices which could contain x, respectively.
  – In each iteration, the index half-way between i and j is found (m). If "half-way" is not an integer, it is rounded down (floor function).
  – If x is bigger than element a[m], then x can only be above m in the array, since the array is sorted
    • in this case, we don't need to search from 1 to m, and so i := m + 1.
  – If not, x can only be between 1 and m in the array
    • in this case, we don't need to search from m+1 to n, and so j := m.
  – The algorithm terminates when i = j, i.e. there is only one possible element left to search.
Binary Search
• We will now find an upper bound for the worst-case time complexity of binary_search.
  – For simplicity, we will analyse the algorithm for the case where n is a power of 2, say n = 2^k.
  – Originally i = 1 and j = 2^k, so m = 2^{k-1}. After the first iteration, we have either i = 2^{k-1} + 1 and j = 2^k, or i = 1 and j = 2^{k-1}. In either case, we have a total of 2^{k-1} elements to search.
  – After the second iteration, we will have 2^{k-2} elements, and so on until we have one element left.
  – So line 3 executes exactly k + 1 times.
  – Lines 4 and 5 therefore execute k times, and lines 1 and 2 execute once.
  – The total complexity is O(1) + O(k+1) + O(k) = O(k) = O(log_2 n).
Bubble Sort
• Bubble sort is an algorithm to sort an array in ascending order.

procedure bubble_sort( real a[n] )
begin
  for i := 1 to n – 1
    for j := 1 to n – i
      if a[ j ] > a[ j + 1 ] then swap a[ j ] and a[ j + 1 ]
end

• The procedure
  – makes repeated passes over the array a[ ].
  – during each iteration, it repeatedly swaps neighbouring elements that are not in the right order.
  – the first iteration ensures that the largest element in the array ends up in position a[n].
  – the second iteration ensures that the second largest element ends up in position a[n–1], and so on.
  – after n–1 iterations, the array will be sorted.
Bubble Sort
• An example execution (each column shows the array after the pair-wise swaps of one iteration of the outer loop):

  start       1st iteration   2nd iteration   3rd iteration
  a[1] = 1    a[1] = 1        a[1] = 1        a[1] = 0
  a[2] = 5    a[2] = 3        a[2] = 0        a[2] = 1
  a[3] = 3    a[3] = 0        a[3] = 3        a[3] = 3
  a[4] = 0    a[4] = 5        a[4] = 5        a[4] = 5
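The same procedure in Python (0-based indices, so the inner comparison runs over j = 0 … n–i–2):

```python
# Python transcription of the bubble_sort pseudocode, 0-based indexing.
def bubble_sort(a):
    """Sort list a in place, mirroring the slides' pseudocode."""
    n = len(a)
    for i in range(n - 1):          # n - 1 passes over the array
        for j in range(n - 1 - i):  # pass i leaves the last i slots sorted
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

print(bubble_sort([1, 5, 3, 0]))  # → [0, 1, 3, 5], as in the worked example
```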
A Quick Lemma
• To analyse the complexity of bubble sort, we need the following lemma.
• Lemma
  – S(n) = Σ_{i=1}^{n–1} i = n(n–1)/2, for n > 1.
• Proof
  – We will prove by induction. It is clearly true for n = 2. Assume true for n, and we will demonstrate it is true for n+1:
    S(n+1) = Σ_{i=1}^{n} i = S(n) + n = n(n–1)/2 + n = n(n+1)/2.
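The lemma is easy to spot-check numerically for small n:

```python
# Spot-check of the lemma S(n) = sum_{i=1}^{n-1} i = n(n-1)/2.
for n in range(2, 50):
    assert sum(range(1, n)) == n * (n - 1) // 2
print("lemma holds for n = 2..49")
```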
Bubble Sort
• To analyse the worst-case complexity of bubble sort
  – notice that there will be n–1 iterations of the outer loop
  – for iteration i of the outer loop, there will be n–i iterations of the inner loop
  – each iteration of the inner loop is O(1)
  – the total number of iterations of the inner loop is thus
    Σ_{i=1}^{n–1} (n–i) = n(n–1) – Σ_{i=1}^{n–1} i = n(n–1) – n(n–1)/2 = n(n–1)/2
• Each of these is O(1), so the total complexity is O(n^2).
Common Complexity Classes
• Some big-O complexity cases arise commonly, and so have been given special names.

  Big-O Notation    Name
  O(1)              Constant complexity
  O(log n)          Logarithmic complexity
  O(n)              Linear complexity
  O(n^b)            Polynomial complexity
  O(b^n) (b > 1)    Exponential complexity
Test Your Knowledge
• An algorithm for n by n matrix multiplication is shown below. Give a big-O estimate of:
  (i) The number of times line A executes
  (ii) The number of times line B executes
  (iii) The time complexity of the algorithm

procedure matrix_mult( real a[n][n], b[n][n], c[n][n] )
begin
  for i := 1 to n
    for j := 1 to n
Line A:   c[ i ][ j ] := 0
      for k := 1 to n
Line B:     c[ i ][ j ] := c[ i ][ j ] + a[ i ][ k ]*b[ k ][ j ]
end
Summary
• This lecture has covered
  – Time complexity and space complexity
  – Worst-case and average-case complexity
  – Common complexity classes
  – The time complexity of some common algorithms
    • Linear search
    • Binary search
    • Bubble sort
• However, we are not yet able to analyse recursive algorithms.
• The next lecture will introduce recurrence relations, which will enable us to do so.
11/2/2006 Lecture9 gac1 1
Recurrence Relations
• This lecture will cover
  – What is a recurrence relation?
  – The substitution method.
  – Solution of some linear homogeneous recurrence relations: degree 1 and degree 2.
  – Solution of some linear non-homogeneous recurrence relations: degree 1 and degree 2.
What is a Recurrence Relation?
• Definition
  – A sequence is a function from a subset of the set of integers (usually N or Z+) to a set S. We use the notation a_n for the image of n.
• Definition
  – A recurrence relation for the sequence {a_n} is an equation that expresses a_n in terms of one or more of the previous terms of the sequence (i.e. a_0, a_1, …, and a_{n–1}), for all n ≥ n_0, where n_0 ∈ N.
• Example
  – a_n = a_{n–1} + a_{n–2} for n ≥ 2 is a recurrence relation.
Solving Recurrence Relations
• Often, we are interested in finding a general formula for the value of a_n that does not depend on the previous values.
• Example
  – Let a_0 = 500 denote my initial financial investment, in £.
  – After n years in the bank at 10% interest, the total value of my investment will be given by the recurrence relation a_n = 1.1a_{n–1}, for n ≥ 1.
  – We can solve this recurrence relation to obtain a_n = 500(1.1)^n, which does not depend on previous values of the sequence.
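The closed form can be checked against the recurrence directly (math.isclose absorbs floating-point rounding):

```python
import math

# Check that the closed form 500 * 1.1**n agrees with iterating
# the recurrence a_n = 1.1 * a_{n-1}, a_0 = 500.
a = 500.0
for n in range(1, 31):
    a = 1.1 * a
    assert math.isclose(a, 500 * 1.1**n)
print("closed form matches the recurrence for n = 1..30")
```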
The Substitution Method
• The substitution method for recurrence solution is essentially a "guessing" method.
  – Guess the form of a solution, and try to prove it holds using induction.
• Example
  – a_n = 2a_{⌊n/2⌋} + n, for n > 1. (Note the notation ⌊x⌋ for floor(x).)
  – We may have seen "similar" recurrences before, and be able to guess a solution a_n = O(n log_2 n).
  – Assume this is true for ⌊n/2⌋. Then
    a_n ≤ 2c⌊n/2⌋ log_2 ⌊n/2⌋ + n for large enough n.
The Substitution Method
  – so a_n ≤ cn log_2 (n/2) + n = cn log_2 n – cn + n ≤ cn log_2 n for c ≥ 1.
  – Thus a_n = O(n log_2 n).
  – (A detail: note that we required c ≥ 1 to make this proof work, but this is not a problem, as c was the constant from the big-O notation. The big-O approximation will still hold for any c' > c, so we can choose a c' large enough to ensure c' ≥ 1. It would have caused us a problem if the requirement had been c ≤ 1.)
• Of course substitution has a major disadvantage
  – you need to guess the answer!
  – (how many of you would have guessed a_n = O(n log_2 n)?)
• We need to turn our attention to methods that don't require guesswork.
Linear Homogeneous Recurrences
• Definition
  – A linear homogeneous recurrence relation of degree k with constant coefficients is a recurrence relation of the form a_n = c_1 a_{n–1} + c_2 a_{n–2} + … + c_k a_{n–k}, where c_1, c_2, …, c_k are real numbers, and c_k ≠ 0.
• Linear homogeneous recurrence relations with constant coefficients are important – there are general methods to find their solution.
• Example
  – Our bank interest example a_n = 1.1a_{n–1} is a linear homogeneous recurrence relation of degree 1 with constant coefficients.
Degree 1 Linear Homogeneous
• Indeed, our bank interest example suggests a simple general theorem for the degree 1 case.
• Theorem
  – a_n = αc^n is a solution to the recurrence relation a_n = ca_{n–1}. The value of α can be found from the initial condition a_0.
• Proof
  – We can prove this with the substitution method. Assume true for n – 1. We will show it holds for n:
    ca_{n–1} = cαc^{n–1} = αc^n = a_n.
  – It only remains to determine α. Substituting n = 0 gives α = a_0.
Degree 2 Linear Homogeneous
• We will also prove a theorem dealing with the degree 2 case.
• Theorem
  – Let c_1 and c_2 be real numbers. Suppose r^2 – c_1 r – c_2 = 0 has two distinct roots r_1 and r_2. Then a_n = α_1 r_1^n + α_2 r_2^n is a solution to the recurrence relation a_n = c_1 a_{n–1} + c_2 a_{n–2}. The values of α_1 and α_2 can be found from the initial conditions a_0 and a_1.
• Proof
  – Assume true for n–1 and n–2. We will show it to hold for n.
  – c_1 a_{n–1} + c_2 a_{n–2} = c_1(α_1 r_1^{n–1} + α_2 r_2^{n–1}) + c_2(α_1 r_1^{n–2} + α_2 r_2^{n–2})
                              = α_1 r_1^{n–2}(c_1 r_1 + c_2) + α_2 r_2^{n–2}(c_1 r_2 + c_2).
  – But c_1 r_1 + c_2 = r_1^2 and c_1 r_2 + c_2 = r_2^2.
  – So c_1 a_{n–1} + c_2 a_{n–2} = α_1 r_1^{n–2} r_1^2 + α_2 r_2^{n–2} r_2^2 = α_1 r_1^n + α_2 r_2^n = a_n.
Degree 2 Linear Homogeneous
  – It remains to determine α_1 and α_2.
  – Substituting n = 0 and n = 1 into a_n = α_1 r_1^n + α_2 r_2^n gives
    α_1 + α_2 = a_0 and α_1 r_1 + α_2 r_2 = a_1.
  – So a_1 = α_1 r_1 + (a_0 – α_1)r_2.
  – So for r_1 ≠ r_2, α_1 = (a_1 – a_0 r_2)/(r_1 – r_2).
  – And also α_2 = a_0 – (a_1 – a_0 r_2)/(r_1 – r_2) = (a_0 r_1 – a_1)/(r_1 – r_2).
• Example
  – Obtain a solution to the recurrence a_n = a_{n–1} + a_{n–2}, with a_0 = 1, a_1 = 1.
    • note that this is the famous Fibonacci sequence.
Degree 2 Linear Homogeneous
  – First, we note c_1 = 1, c_2 = 1, so we form the equation r^2 – r – 1 = 0. This has solutions r = (1 ± √5)/2.
  – These are two distinct solutions, r_1 = (1 + √5)/2 and r_2 = (1 – √5)/2, so we can apply the previous theorem.
  – Following through the arithmetic leads to
    α_1 = (5 + √5)/10, α_2 = (5 – √5)/10.
  – Thus the solution is given by
    a_n = ((5 + √5)/10) ((1 + √5)/2)^n + ((5 – √5)/10) ((1 – √5)/2)^n.
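The closed form can be checked numerically against the recurrence itself:

```python
import math

# Check the closed form against iterating a_n = a_{n-1} + a_{n-2},
# with a_0 = a_1 = 1 (rounding absorbs floating-point error).
s5 = math.sqrt(5)

def closed(n):
    return ((5 + s5) / 10) * ((1 + s5) / 2)**n + ((5 - s5) / 10) * ((1 - s5) / 2)**n

seq = [1, 1]
for n in range(2, 30):
    seq.append(seq[-1] + seq[-2])
assert all(round(closed(n)) == seq[n] for n in range(30))
print("closed form reproduces the sequence for n = 0..29")
```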
Linear Non-homogeneous Recurrences
• Definition
  – A linear non-homogeneous recurrence relation of degree k with constant coefficients is a recurrence relation of the form a_n = c_1 a_{n–1} + c_2 a_{n–2} + … + c_k a_{n–k} + F(n), where c_1, c_2, …, c_k are real numbers, and c_k ≠ 0.
• There are many results on these equations, but we will only look at the particularly important case when F(n) is a constant, F(n) = k.
• Firstly, let's consider the degree 1 case.
• Theorem
  – Let c be a real number with c ≠ 1. Then a_n = αc^n + k/(1 – c) is a solution to the recurrence relation a_n = ca_{n–1} + k. The value of α can be found from the initial condition a_0.
Non-Homogeneous Recurrences
• Proof
  – Assume true for n–1. We will show it holds for n.
    ca_{n–1} + k = c(αc^{n–1} + k/(1 – c)) + k
                 = αc^n + k( 1 + c/(1 – c) )
                 = αc^n + k/(1 – c) = a_n.
  – It remains to determine the value of α from the initial condition a_0. Substituting n = 0, we obtain a_0 = α + k/(1 – c), so α = a_0 – k/(1 – c).
• We will also consider the special case when c = 1
  – this case is just an "arithmetic progression".
• Theorem
  – a_n = α + nk is a solution to the recurrence relation a_n = a_{n–1} + k. The value of α can be found from the initial condition a_0.
Non-Homogeneous Recurrences
• Proof
  – Assume true for n–1. We will show it holds for n.
    a_{n–1} + k = α + (n – 1)k + k = α + nk = a_n.
  – It remains to determine the value of α from the initial condition a_0. Substituting n = 0, we obtain α = a_0.
• We can now proceed to look at the degree 2 case.
• Theorem
  – Let c_1 and c_2 be real numbers with c_1 + c_2 ≠ 1. Suppose r^2 – c_1 r – c_2 = 0 has two distinct roots r_1 and r_2. Then a_n = α_1 r_1^n + α_2 r_2^n – k/(c_1 + c_2 – 1) is a solution to the recurrence relation a_n = c_1 a_{n–1} + c_2 a_{n–2} + k. The values of α_1 and α_2 can be found from the initial conditions a_0 and a_1.
Linear Non-homogeneous Recurrences
• Proof
  – Again, we will assume this to be true for n–1 and n–2, and demonstrate it to be true for n.
    c_1 a_{n–1} + c_2 a_{n–2} + k
      = c_1(α_1 r_1^{n–1} + α_2 r_2^{n–1} – k/(c_1 + c_2 – 1)) + c_2(α_1 r_1^{n–2} + α_2 r_2^{n–2} – k/(c_1 + c_2 – 1)) + k
      = α_1 r_1^{n–2}(c_1 r_1 + c_2) + α_2 r_2^{n–2}(c_1 r_2 + c_2) + k – k(c_1 + c_2)/(c_1 + c_2 – 1)
      = α_1 r_1^n + α_2 r_2^n – k/(c_1 + c_2 – 1)
      = a_n.
  – It remains only to determine the values of α_1 and α_2 from the initial conditions a_0 and a_1.
Linear Non-Homogeneous Recurrences
  – For simplicity, let us define p = –k/(c_1 + c_2 – 1).
  – Substituting n = 0 and n = 1 gives
    a_0 = α_1 + α_2 + p and
    a_1 = α_1 r_1 + α_2 r_2 + p.
  – So we have
    a_1 = α_1 r_1 + (a_0 – α_1 – p)r_2 + p, i.e.
    α_1(r_1 – r_2) = a_1 – a_0 r_2 + p(r_2 – 1).
  – Thus
    α_1 = (a_1 – a_0 r_2 + p(r_2 – 1))/(r_1 – r_2), since r_1 ≠ r_2, and
    α_2 = a_0 – α_1 – p.
Non-Homogeneous Example
• We will try to find a solution for the recurrence a_n = a_{n–1} + 2a_{n–2} + 1, with a_0 = 1, a_1 = 1.
• First, we note c_1 = 1, c_2 = 2, so we form the equation r^2 – r – 2 = 0.
  – This has solutions r_1 = 2 and r_2 = –1.
• The solutions are distinct, and c_1 + c_2 ≠ 1, so the theorem will help us find a solution. We obtain
  a_n = α_1 2^n + α_2 (–1)^n – ½.
• The initial conditions give
  α_1 = (a_1 – a_0 r_2 – ½(r_2 – 1))/(r_1 – r_2) = 1
  α_2 = a_0 – α_1 + ½ = ½.
• The complete solution is then
  a_n = 2^n + ½(–1)^n – ½.
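A quick numerical check of this solution against the recurrence (the term ½(–1)^n – ½ is 0 for even n and –1 for odd n, so the closed form stays in integers):

```python
# Check a_n = 2**n + ((-1)**n - 1)//2 against iterating
# a_n = a_{n-1} + 2*a_{n-2} + 1 with a_0 = a_1 = 1.
seq = [1, 1]
for n in range(2, 25):
    seq.append(seq[-1] + 2 * seq[-2] + 1)
assert all(seq[n] == 2**n + ((-1)**n - 1) // 2 for n in range(25))
print("closed form matches the recurrence for n = 0..24")
```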
Test Your Knowledge
• Which of these are linear homogeneous recurrence relations with constant coefficients?
  (a) a_n = 3a_{n–2}.
  (b) a_n = a_{n–1}^2.
  (c) a_n = a_{n–1} + 2a_{n–3}.
  (d) a_n = a_{⌊n/2⌋} + 1.
• Which of these linear recurrence relations with constant coefficients can be solved using the techniques in this lecture?
  (a) a_n = 2a_{n–1} – 2a_{n–2}.
  (b) a_n = 4a_{n–1} + 4a_{n–2}.
  (c) a_n = 3a_{n–1} + 2a_{n–2} + 1.
  (d) a_n = 2a_{n–1} – 2a_{n–2} + 1.
Summary
• This lecture has covered
  – What is a recurrence relation?
  – The substitution method.
  – Solution of some linear homogeneous recurrence relations: degree 1 and degree 2.
  – Solution of some linear non-homogeneous recurrence relations: degree 1 and degree 2.
• We have ploughed through some [fairly unpleasant?] algebra, but it will be worth it for the next lecture
  – we can use these relations to analyse recursive algorithms.
11/2/2006 Lecture10 gac1 1
Analysing Recursive Algorithms I
• In this lecture, we will apply recurrence relations to the analysis of execution time for recursive algorithms.
• The lecture will cover
  – How to analyse recursive execution time
  – Recursive and iterative factorial algorithms
  – Recursive and iterative Fibonacci algorithms
Recursive Algorithms
• Definition
  – An algorithm is called recursive if it solves a problem by reducing it to one or more instances of the same problem with smaller input.
• We will start by looking at a recursive algorithm to calculate the factorial.

function factorial( n : non-negative integer )
begin
  if n = 0 then
    result := 1
  else
    result := n * factorial( n – 1 )
end
Execution Time Analysis
• To analyse the run-time of this algorithm, we note that
  – all inputs require a comparison (n = 0).
  – if n = 0, no calculation is required – just a store operation.
  – if n ≠ 0, a multiplication is required, together with a call to factorial(n–1).
• Let us denote by a_n the total number of multiplication operations performed by factorial(n).
• We have a_0 = 0 and a_n = 1 + a_{n–1}, for n > 0.
  – A recurrence relation!
Execution Time Analysis
• We may apply the theorems of the last lecture to solve this recurrence.
  – it is a degree 1 linear non-homogeneous recurrence relation with constant coefficients
• The recurrence is of the form a_n = a_{n–1} + k, so the solution has the form a_n = α + nk.
  – k = 1, α = a_0 = 0
  – So a_n = n: n multiplications are performed by a call to factorial(n).
  – The run time will be O(n).
Recursive Fibonacci
• For a different type of example, consider the following algorithm for calculating the Fibonacci sequence
  – recall that fib(1) = 1, fib(2) = 1, and fib(n) = fib(n–1) + fib(n–2) for n > 2.

function fib( n : positive integer )
begin
  if n <= 2 then
    result := 1
  else
    result := fib( n – 1 ) + fib( n – 2 )
end
Recursive Fibonacci
• We may wish to determine the number of addition operations required by this algorithm.
  – Let us denote this a_n.
• An addition is only performed when n > 2. In this case
  – fib(n–1) is called, so we must include any additions performed by this.
  – fib(n–2) is called, so we must include any additions performed by this.
  – an extra addition is performed to add fib(n–1) and fib(n–2).
• We obtain a_n = 1 + a_{n–1} + a_{n–2}, for n > 2, with initial conditions a_1 = 0, a_2 = 0.
Recursive Fibonacci
• This is a linear non-homogeneous recurrence relation of degree 2 with constant coefficients.
  – Compare with a_n = c_1 a_{n–1} + c_2 a_{n–2} + k.
  – We have c_1 = 1, c_2 = 1, k = 1.
• We can use the result of the previous lecture if
  – c_1 + c_2 ≠ 1 (O.K.)
  – r^2 – c_1 r – c_2 = 0 has two distinct roots (O.K.)
• The two roots of this quadratic are r_1 = (1 + √5)/2 and r_2 = (1 – √5)/2, and the solution to the recurrence has the form
  a_n = α_1 r_1^n + α_2 r_2^n – k/(c_1 + c_2 – 1)
      = α_1 r_1^n + α_2 r_2^n – 1.
Recursive Fibonacci
• Substituting n = 1 and n = 2 gives
  α_1 r_1 + α_2 r_2 – 1 = 0 (*)
  α_1 r_1^2 + α_2 r_2^2 – 1 = 0 (**)
• Multiplying (*) by r_1 and subtracting from (**) gives:
  α_2 r_2^2 – 1 – α_2 r_2 r_1 + r_1 = 0, i.e.
  α_2 = (1 – r_1)/(r_2^2 – r_2 r_1) = –1/√5 and α_1 = 1/√5 (after some work!)
• The total number of additions performed by a call to fib(n) is therefore:
  a_n = (1/√5) ((1 + √5)/2)^n – (1/√5) ((1 – √5)/2)^n – 1
• The total execution time is exponential in n.
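The closed form above is Binet's formula for fib(n), minus 1, so a_n = fib(n) – 1. An instrumented version of the recursive algorithm confirms this:

```python
# Instrumented recursive fib: returns (value, addition count), so the
# count obeys a_n = 1 + a_{n-1} + a_{n-2} with a_1 = a_2 = 0.
def fib(n):
    if n <= 2:
        return 1, 0                     # base cases: no additions
    v1, c1 = fib(n - 1)
    v2, c2 = fib(n - 2)
    return v1 + v2, c1 + c2 + 1         # one extra addition at this level

# The closed form simplifies to a_n = fib(n) - 1:
for n in range(1, 20):
    value, adds = fib(n)
    assert adds == value - 1
print("fib(n) performs fib(n) - 1 additions for n = 1..19")
```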
Iterative Factorial
• We can compare the recursive versions of these algorithms to their iterative equivalents.
• An iterative factorial is shown below.

function factorial( n : non-negative integer )
begin
  result := 1
  for i = 1 to n
    result := result * i
end

• This algorithm performs n multiplications, and has an execution time of O(n).
  – the same as the recursive version.
Iterative Factorial
• Note that this doesn't mean the two versions have the same run time.
  – The constant of proportionality in the O(n) hides the details.
  – In practice, function calls (stack manipulation, etc.) often have a large overhead.
  – But the result is still valuable: for large enough n, the execution times of the two ways of calculating factorial differ by at most a constant multiplicative factor.
Iterative Fibonacci
• An iterative version of the Fibonacci generator is shown below.

function fib( n : positive integer )
begin
  x := 0
  result := 1
  for i = 1 to (n – 1)
  begin
    z := x + result
    x := result
    result := z
  end
end
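A Python transcription, with an addition counter confirming that exactly n – 1 additions are performed:

```python
# Python version of the iterative fib pseudocode, counting additions.
def fib_iter(n):
    x, result, adds = 0, 1, 0
    for _ in range(n - 1):      # the loop body runs n - 1 times
        z = x + result          # the only addition in the loop
        adds += 1
        x, result = result, z
    return result, adds

for n in range(1, 20):
    result, adds = fib_iter(n)
    assert adds == n - 1
print(fib_iter(10))  # → (55, 9)
```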
Iterative Fibonacci
• Addition is only performed once in the loop, so a total of (n–1) additions are performed.
• Apart from the initialization of x, result, and i (which is O(1)), the remaining operations take O(n) time.
  – the execution time is linear in n.
• This compares very favourably with the recursive version: linear vs. exponential run-time!
• We can draw a valuable lesson from this example
  – recursion often makes the algorithm neater and easier to read,
  – but it must be used with caution, as exponential run-times can result.
Test Your Knowledge
• The algorithm below operates on a portion a[low to high] of an array a. When called, low <= high.
• Write a recurrence relation, expressing the number of addition operations in the algorithm below in terms of the number of elements n = high – low + 1.

function atest( a : integer array, low : integer, high : integer )
begin
  if low = high
    result := a[low];
  else
  begin
    temp1 := atest( a, low+1, high );
    temp2 := atest( a, low, high-1 );
    result := temp1 + temp2;
  end
end
Summary
• This lecture has covered
  – How to analyse recursive execution time
  – Recursive and iterative factorial algorithms
  – Recursive and iterative Fibonacci algorithms
• We can now analyse some recursive algorithms.
• There is an important class of recursive algorithms, known as divide-and-conquer algorithms, that we can't yet analyse.
• For this, we need more recurrence relations.
11/2/2006 Lecture11 gac1 1
Divide-and-Conquer Recurrences
• This lecture will introduce divide-and-conquer recurrence relations
  – used to analyse divide-and-conquer recursive algorithms
• The lecture will cover
  – divide-and-conquer recurrences
  – a theorem for a common form of divide-and-conquer recurrence
  – the Master Theorem
Divide-and-Conquer Recurrences
• A divide-and-conquer algorithm breaks up a large problem into several proportionally smaller versions of the same problem.
  – this approach is often used for sorting, for example – detailed examples in the next lecture.
• Analysing divide-and-conquer algorithms results in divide-and-conquer recurrences.
• Definition
  – A divide-and-conquer recurrence relation is an equation of the form f(n) = a f(n/b) + g(n).
The Growth of D&C Recurrences
• For simplicity, let us consider how this recurrence grows when n = b^k for some k ∈ Z+.
• In this case, we can write
  f(n) = a f(n/b) + g(n)
       = a^2 f(n/b^2) + a g(n/b) + g(n)
       = a^3 f(n/b^3) + a^2 g(n/b^2) + a g(n/b) + g(n)
       = …
       = a^k f(1) + Σ_{j=0}^{k–1} a^j g(n/b^j)
• We will use this expansion to analyse special cases of the general D&C recurrence relation.
A Common Special Case
• A common special case arises when g(n) is constant.
• Theorem
  – Let a ≥ 1 be a real number, b > 1 be an integer, and c > 0 be a real number.
  – Let f be an increasing function that satisfies the recurrence relation f(n) = a f(n/b) + c whenever n is divisible by b.
  – Then if a > 1, f(n) is O(n^{log_b a}).
  – Alternatively, if a = 1, f(n) is O(log n).
• Quick test:
  – Why do I need to write the base of the log in the first case, but not in the second?
Special Case: Proof
• To prove this, we will consider two cases separately
  – where n = b^k for some k ∈ Z+, and otherwise.
• First, let us consider our expansion of the recurrence, when g(n) is the constant c:
  f(n) = a^k f(1) + Σ_{j=0}^{k–1} a^j g(n/b^j)
       = a^k f(1) + c Σ_{j=0}^{k–1} a^j
Special Case: Proof
• For the case where a = 1, a^j = 1. So in this case, we have
  f(n) = f(1) + ck = f(1) + c log_b n
  – this is the sum of an O(1) function and an O(log n) function, and is therefore an O(log n) function.
• Let us now consider what happens if n is not an integer power of b.
  – In this case, we can sandwich n between two integer powers of b, i.e. b^k < n < b^{k+1}.
  – f(n) is an increasing function, so f(n) ≤ f(b^{k+1}) = f(1) + c(k+1) = (f(1) + c) + ck.
Special Case: Proof– But f(n) ≤ (f(1) + c) + ck ≤ (f(1) + c) + c logb n– The RHS is still the sum of an O(1) and an O(log n)
function, and so the LHS is also O(log n).• Summary so far:
– We have shown that for a = 1, f(n) is O(log n).– We now need to examine the case a > 1.
• Let us now consider the general expansion

f(n) = a^k f(1) + c Σ_{j=0}^{k-1} a^j

– the second term is a geometric progression
Special Case: Proof– Using the standard formula for the sum of k terms of a G.P. [can you remember it?], we obtain

f(n) = a^k f(1) + c (a^k − 1)/(a − 1)
     = [f(1) + c/(a − 1)] a^k − c/(a − 1)

– but a^k = a^(log_b n) = n^(log_b a)
– so f(n) = [f(1) + c/(a − 1)] n^(log_b a) − c/(a − 1)
– this is the sum of an O(1) function and an O(n^(log_b a)) function, and is therefore an O(n^(log_b a)) function.
• We finally have to consider the case where a > 1 and n is not an integer power of b.
Special Case: Proof– Again, we will use the fact that we can sandwich n between two such powers: b^k < n < b^(k+1).
– Since f(n) is increasing, we obtain

f(n) ≤ f(b^(k+1)) = [f(1) + c/(a − 1)] a^(k+1) − c/(a − 1)
     = a [f(1) + c/(a − 1)] a^k − c/(a − 1)
     ≤ a [f(1) + c/(a − 1)] n^(log_b a) − c/(a − 1)

– Once again, this is the sum of an O(1) function and an O(n^(log_b a)) function, and is therefore an O(n^(log_b a)) function.
Examples• Example 1
– Find a big-O expression for an increasing function satisfying f(n) = 5f(n/2) + 3.
– Compare with f(n) = a f(n/b) + g(n). This is a divide-and-conquer recurrence relation with:
• constant g(n) = c = 3.• a = 5 ≥ 1.• b = 2 > 1.
– We can therefore apply the theorem.
– f(n) is O(n^(log_b a)) = O(n^(log_2 5)), or O(n^2.33).
– Note that log_2 5 = 2.32 to 2 decimal places. But for a big-O expression, we have used O(n^2.33).
• remember: big-O is an upper bound!
Examples• A plot of the function f(n) is shown below (*).
[Figure: f(n) plotted against n for n = 1 to 20, together with 2.5n^2 and the bound 1.5n^2.33.]
(*) Initial conditions: f(1) = 0.24, f(3) = 12.93, f(5) = 42.52, f(7) = 93.13, f(9) = 167.26, f(11) = 266.96, f(13) = 393.99, f(15) = 549.91, f(17) = 736.11, f(19) = 954.88
Examples• Example 2
– Find a big-O expression for an increasing function satisfying f(n) = f(n/2) + 16.
– Compare with f(n) = a f(n/b) + g(n). This is a divide-and-conquer recurrence relation with:
• constant g(n) = c = 16.• a = 1 ≥ 1.• b = 2 > 1.
– We can therefore apply the theorem.– f(n) is O(log n).
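As a quick sanity check (an illustrative sketch, not part of the lecture), evaluating this recurrence for n = 2^k confirms the closed form f(n) = f(1) + 16k = f(1) + 16 log_2 n predicted by the a = 1 case:

```python
# Sketch: evaluate f(n) = f(n/2) + 16 for n = 2^k and confirm that
# f(n) = f(1) + 16*k, i.e. logarithmic growth, matching the a = 1 case.
def f(n, f1=0):
    return f1 if n == 1 else f(n // 2, f1) + 16

for k in range(1, 12):
    assert f(2**k) == 16 * k
```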
Examples• A plot of the function f(n) is shown below (*).
[Figure: f(n) plotted against n for n = 1 to 20, together with the bound 16.5 log_2 n.]
(*) Initial conditions: f(1) = 0, f(3) = 25.44, f(5) = 37.12, f(7) = 44.96, f(9) = 50.72, f(11) = 55.36, f(13) = 59.2, f(15) = 62.56, f(17) = 65.44, f(19) = 68.
The Master Theorem• We can now turn our attention to cases when g(n) may be
non-constant.• There is a famous generalization of the previous theorem for
cases where g(n) = cn^d, for c > 0 and d ≥ 0.– This general case includes the previous theorem (d = 0).
• This generalization is known as The Master Theorem– It has this title because it is very useful in algorithm analysis.– The proof of the Master theorem is long; we will not prove it. – For those interested, Cormen, et al., have a proof [not examinable].
The Master Theorem• Theorem
– Let a ≥ 1 be a real number, b > 1 be an integer, c > 0 be a real number, and d ≥ 0 be a real number.
– Let f be an increasing function that satisfies the recurrence relation f(n) = a f(n/b) + cn^d, whenever n = b^k, where k is a positive integer.
– If a < b^d, then f(n) is O(n^d).
– Alternatively, if a = b^d, then f(n) is O(n^d log n).
– Finally, if a > b^d, then f(n) is O(n^(log_b a)).
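The three cases can be packaged into a small helper. This sketch is illustrative only (the function name is ours, not the lecture's):

```python
import math

# Sketch: report the Master Theorem bound for f(n) = a f(n/b) + c n^d.
def master_bound(a, b, d):
    if a < b**d:
        return f"O(n^{d})"
    if a == b**d:
        return f"O(n^{d} log n)"
    return f"O(n^{round(math.log(a, b), 2)})"   # O(n^(log_b a))

# The three cases, e.g. with b = 2, d = 2 (so b^d = 4):
assert master_bound(3, 2, 2) == "O(n^2)"        # a < b^d
assert master_bound(4, 2, 2) == "O(n^2 log n)"  # a = b^d
assert master_bound(8, 2, 2) == "O(n^3.0)"      # a > b^d
```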
Example• Example
– Find a big-O expression for an increasing function f(n) satisfying f(n) = 8 f(n/2) + n^2.
– Compare with f(n) = a f(n/b) + g(n). This is a divide-and-conquer recurrence relation with:
• g(n) = n^2. This has the form g(n) = cn^d, with c = 1 > 0 and d = 2 ≥ 0.
• a = 8 ≥ 1.• b = 2 > 1.
– We can therefore apply the Master Theorem.
– We have a > b^d (8 > 4), so f(n) is O(n^(log_2 8)) = O(n^3).
Example• A plot of the function f(n) is shown below (*).
[Figure: f(n) plotted against n for n = 1 to 20, together with the bound 0.1n^3.]
(*) Initial conditions: f(1) = 1, f(3) = 10, f(5) = 25, f(7) = 60, f(9) = 100, f(11) = 150, f(13) = 220, f(15) = 280, f(17) = 380, f(19) = 460.
Test Your Knowledge• Provide a big-O estimate for f(n) satisfying each of
the following recurrence relations, given that f(n) is an increasing function.
f(n) = f(n/4) + 1
f(n) = 5f(n/4) + 1
f(n) = f(n/4) + 3n
f(n) = 5f(n/4) + 3n
f(n) = 4f(n/4) + 3n
Summary• This lecture has covered
– divide-and-conquer recurrences– a theorem for a common form of divide-and-conquer
recurrence– the Master Theorem
• In the next lecture, we will put these tools to work analysing divide-and-conquer algorithms.
11/2/2006 Lecture12 gac1 1
Analysing Recursive Algorithms II• In this lecture, we will analyse the worst-case
execution time of divide-and-conquer algorithms.
• The lecture will cover– obtaining divide-and-conquer recurrences– example algorithms
• binary search• merge sort
Obtaining Recurrences• Divide-and-conquer recurrence relations occur
when analysing the time-complexity of divide-and-conquer algorithms.
• A divide-and-conquer recurrence relation f(n) = a f(n/b) + g(n) results when the algorithm, operating on a problem of size n (a multiple of b),
1. decides how to split up the input;
2. splits the input into smaller segments, each of size n/b, recursively operating on a of those segments;
3. combines the resulting outputs to make the overall output.
– steps 1+3 combined take time g(n).
Binary Search• Binary Search is a classic example
– we looked at an iterative version in Lecture 8– a slightly different recursive version is shown below: here the aim is
to report success iff the value x is found in the (sorted) array a[ ].

procedure binary_search( integer x, integer a[n] )
begin
1. if n = 1 then
     if a[1] = x then
       report success
     else
       report failure
   else
2. if x < a[ ⎣n/2⎦ ] then
     binary_search( x, a[1 to ⎣n/2⎦] )
   else
     binary_search( x, a[⎣n/2⎦ + 1 to n] )
end
Binary Search• Algorithm explanation
– if the array is of length one, then we only need to look in one place to find x, i.e. a[1].
– otherwise, we find the approximate mid-point, ⎣n/2⎦.
– the array is sorted, so if x is less than the value at the mid-point, look in the first-half; otherwise, look in the second-half.
• When n is a multiple of 2, the algorithm– splits the input into smaller segments, each of size no
more than n/2; recursively operates on one of these• which one is decided by line 2
– sounds like a divide-and-conquer recurrence!
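As an illustration, the recursion can be written directly in a language like Python. This is a 0-indexed sketch of the pseudocode, not the lecture's own code:

```python
# Sketch of the recursive binary search (0-indexed; a must be sorted).
def binary_search(x, a):
    if len(a) == 1:
        return a[0] == x            # success iff the single element is x
    h = len(a) // 2                 # the approximate mid-point
    if x < a[h]:
        return binary_search(x, a[:h])   # x can only be in the first half
    return binary_search(x, a[h:])       # otherwise, look in the second half

assert binary_search(9, [1, 4, 9, 10]) is True
assert binary_search(7, [1, 4, 9, 10]) is False
```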
Binary Search• We can find a big-O estimate for the number of
comparison operations.• The function g(n)?
– g(n) here models the “overhead” in the execution of line 1 and line 2.
– each one performs a comparison, so g(n) = 2.• The overall recurrence relation
– Since the input is split into segments of size n/2, b = 2.– Since only one of these is operated on, a = 1.– We have f(n) = f(n/2) + 2.
Binary Search• Solution of this recurrence comes from the Master
Theorem (or our special case).– Compare with f(n) = a f(n/b) + cn^d.– We have a = 1, b = 2, c = 2, d = 0.– Since a = b^d, f(n) is O(n^d log n) = O(log n).– Binary search is logarithmic-time.
Merge Sort• Merge sort is a famous algorithm for sorting arrays.
– it has better asymptotic performance than bubble sort (we will prove this).
• The basic idea– start with an (unsorted) array of n numbers.– divide the array in half; sort each half separately (recursively).– combine the two sorted halves into a sorted whole.
[Figure: merge sort on [4,10,9,1] — split: [4,10,9,1] → [4,10], [9,1] → [4], [10], [9], [1]; merge: → [4,10], [1,9] → [1,4,9,10].]
Merge Sort
• The algorithm is shown in pseudo-code below.• For this algorithm to be efficient, we need an
efficient implementation of merge().
procedure merge_sort( real a[n] )
begin
  if n = 1 then
    result := a
  else
    L1 := merge_sort( a[ 1 to ⎣n/2⎦ ] )
    L2 := merge_sort( a[ ⎣n/2⎦+1 to n ] )
    result := merge( L1, L2 )
end
Merge Procedure• The merge procedure can take advantage of the
fact that its inputs are sorted arrays.

procedure merge( real L1[p], L2[q] )
begin
  count1 := 1
  count2 := 1
  countM := 1
  do
1.   if count2 > q or L1[count1] < L2[count2] then begin
       result[countM] := L1[count1]
       count1 := count1 + 1
     end else begin
       result[countM] := L2[count2]
       count2 := count2 + 1
     end
     countM := countM + 1
2. while countM <= p + q
end
Merge Procedure• Algorithm explanation
– at each iteration of the while loop, it fills a single element of result[ ].– the source of this element depends on whether the current element
of L1 or L2 is smaller – the smaller is taken.– the counters keep track of the current element being processed on
each of the three lists.
[Figure: merging L1 = [4, 10] with L2 = [1, 9]. Over successive iterations, result[ ] fills as [?,?,?,?] → [1,?,?,?] → [1,4,?,?] → [1,4,9,?] → [1,4,9,10], with count1, count2 and countM advancing as each element is taken.]
Merge Procedure• Algorithm analysis
– before we can estimate the execution time of merge_sort, we need to find the execution time of merge
– we will count the number of comparisons• the loop iterates p+q times• there are at most 2(p+q) comparisons on line 1• the comparison on line 2 is performed p+q times
– the total number of comparisons is therefore at most 3(p+q).
– merge is called with p = ⎣n/2⎦, q = n - ⎣n/2⎦.– So the total number of comparisons is at most 3n.
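The comparison count can be checked empirically. The sketch below (illustrative, not from the lecture) instruments a merge with the same accounting — at most 2 comparisons on line 1 plus the line-2 loop test per iteration — and checks the 3(p+q) bound:

```python
# Sketch: count comparisons in merge using the lecture's accounting
# (up to 2 on line 1, plus 1 loop test on line 2, per iteration)
# and check the total against the 3(p+q) bound.
def merge_counted(L1, L2):
    p, q = len(L1), len(L2)
    comparisons = 0
    result, i, j = [], 0, 0
    while len(result) < p + q:        # do ... while countM <= p+q
        comparisons += 2              # at most 2 comparisons on line 1
        if j >= q or (i < p and L1[i] < L2[j]):
            result.append(L1[i]); i += 1
        else:
            result.append(L2[j]); j += 1
        comparisons += 1              # the while test on line 2
    return result, comparisons

result, c = merge_counted([4, 10], [1, 9])
assert result == [1, 4, 9, 10]
assert c <= 3 * (2 + 2)               # total is at most 3(p+q)
```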
Merge Sort• We can now return to the analysis of merge_sort
– when n is even, merge_sort
1. performs a comparison (n = 1).
2. performs two other merge_sorts on problems of size n/2.
3. calls merge, which performs at most 3n comparisons.
– writing a recurrence for the number of comparisons gives f(n) = 2f(n/2) + 3n + 1.• this doesn’t fit the form solved by the Master Theorem.
– But we can combine steps 1 and 3, and say that merge_sort
1. performs two other merge_sorts on problems of size n/2.
2. performs some additional housekeeping which takes at most 4n comparisons.
Merge Sort– Writing this as a recurrence relation, f(n) = 2f(n/2) + 3n
• compare with f(n) = a f(n/b) + cn^d, to get a = 2, b = 2, c = 4, d = 1.
• we have a = b^d, so f(n) is O(n^d log n) = O(n log n), from the Master Theorem.
– Note that the simplification we made (3n + 1 ≤ 4n) has not affected the tightness of the bound in this case
• f(n) = 2f(n/2) + 3n gives exactly the same bound.
• Conclusion: merge_sort performs O(n log n) comparisons, but bubble_sort performs n(n-1)/2.– merge_sort is asymptotically faster.
Test Your Knowledge• For the code below, give a big-O estimate of
– the number of additions– the number of comparisons (including the for-loop comparisons)– the execution time
procedure atest( real a[n] )
begin
  if n = 1
    result := a[1]
  else begin
    temp := 0
    stride := ⎣n/3⎦
    for i := 0 to 2
      temp := temp + atest( a[ i*stride+1 to (i+1)*stride ] )
    result := temp
  end
end
Summary• This lecture has covered
– obtaining divide-and-conquer recurrences– example algorithms
• binary search• merge sort
• This was the last lecture on recurrence relations.
• In the next lecture we will look at the ideas of computability theory.