COSC 3100 Fundamentals of Analysis
of Algorithm EfficiencyInstructor: Tanvir
Outline (Ch 2)• Input size, Orders of growth, Worst-case, Best-case, Average-case (Sec 2.1)• Asymptotic notations and Basic
efficiency classes (Sec 2.2)• Analysis of Nonrecursive Algorithms (Sec
2.3)• Analysis of Recursive Algoithms (Sec
2.4)• Example: Computing n-th Fibonacci
Number (Sec 2.5)• Empirical Analysis, Algorithm
Visualization (Sec 2.6, 2.7)
Analysis of Algoithms• An investigation of an algorithm’s
efficiency with respect to two resources:– Running time (time efficiency or time
complexity)– Memory space (space efficiency or
space complexity)
Analysis Framework• Measuring an input’s size• Measuring Running Time• Orders of Growth (of algorithm’s
efficiency function)• Worst-case, Best-case, and
Average-case efficiency
Measuring Input Size• Efficiency is defined as a function
of input size• Input size depends on the problem
– Example 1: what is the input size for sorting n numbers?
– Example 2: what is the input size for evaluating ⋯
– Example 3: what is the input size for multiplying two n×n matrices?
Units of measuring running time
• Measure running time in milliseconds, seconds, etc.– Depends on which computer
• Count the number of times each operation is executed– Difficult and unnecessary
• Count the number of times an algorithm’s “basic operation” is executed
Measure running time in terms of # of basic
operations• Basic operation: the operation
that contributes the most to the total running time of an algorithm
• Usually the most time consuming operation in the algorithm’s innermost loop
Input size and basic operation examples
Problem Measure of input size
Basic operation
Search for a key in a list of n items
# of items in the list
Key comparison
Add two n×n matrices
Dimensions of the matrices, n
Addition
Polynomial evaluation
Order of the polynomial
Multiplication
Theoretical Analysis of Time Efficiency
• Count the number of times the algorithm’s basic operation is executed on inputs of size n: C(n)
T(n) ≈ cop × C(n)
Execution time for basic operation
# of times basic op. is executedRunning time
Input size Ignore cop,Focus on orders of growth
If C(n) = , how much longer it runs if the input size is doubled?
Orders of Growth• Why do we care about the order of
growth of an algorithm’s efficiency function, i.e., the total number of basic operations? Euclid’
sConsecutive
Integer Checking
gcd(60, 24) 3 13gcd(31415, 14142) 10 14142gcd(218922995834555169026, 135301852344706746049)
97 > 1020
How fast efficiency function grows as n gets larger and larger…
Orders of Growth (contd.)n lgn n n×lg
nn2 n3 2n n!
10 3.3 10 3.3×10 102 103 103 3.6×106
102 6.6 102 6.6×102 104 106 1.3×1030
9.3×10157
103 10 103 10×103 106 109
104 13 104 13×104 108 1012
105 17 105 17×105 1010 1015
106 20 106 20×106 1012 1018
Orders of Growth (contd.)
• Plots of growth…• Consider only the leading term• Ignore the constant coefficients
Worst, Best, Average Cases
• Efficiency depends on input size n• For some algorithms, efficiency
depends on the type of input• Example: Sequential Search
– Given a list of n elements and a search key k, find if k is in the list
– Scan list, compare elements with k until either found a match (success), or list is exhausted (failure)
Sequential Search Algorithm
ALGORITHM SequentialSearch(A[0..n-1], k)//Input: A[0..n-1] and k//Output: Index of first match or -1 if no match is //found
i <- 0while i < n and A[i] ≠ k do
i <- i+1if i < n
return i //A[i] = kelse
return -1
Different cases• Worst case efficiency
– Efficiency (# of times the basic op. will be executed) for the worst case input of size n
– Runs longest among all possible inputs of size n
• Best case efficiency– Runs fastest among all possible inputs of
size n• Average case efficiency
– Efficiency for a typical/random input of size n
– NOT the average of worst and best cases– How do we find average case efficiency?
Average Case of Sequential Search
• Two assumptions:– Probability of successful search is p
(0 ≤ p ≤ 1)– Search key can be at any index with
equal probability (uniform distribution)
Cavg(n) = Expected # of comparisons = Expected # of comparisons for success + Expected # of comparisons if k is not in the list
Summary of Analysis Framework
• Time and space efficiencies are functions of input size
• Time efficiency is # of times basic operation is executed
• Space efficiency is # of extra memory units consumed
• Efficiencies of some algorithms depend on type of input: requiring worst, best, average case analysis
• Focus is on order of growth of running time (or extra memory units) as input size goes to infinity
Asymptotic Growth Rate• 3 notations used to compare
orders of growth of an algorithm’s basic operation count– O(g(n)): Set of functions that grow no
faster than g(n)– Ω(g(n)): Set of functions that grow at
least as fast as g(n)– Θ(g(n)): Set of functions that grow at
the same rate as g(n)
O(big oh)-Notation
Doesn’tmatter
nn0
c × g(n)t(n)
t(n) є O(g(n))
O-Notation (contd.)• Definition: A function t(n) is said
to be in O(g(n)), denoted t(n) є O(g(n)), if t(n) is bounded above by some positive constant multiple of g(n) for sufficiently large n.
• If we can find +ve constants c and n0 such that:
t(n) ≤ c × g(n) for all n ≥ n0
O-Notation (contd.)
• Is 100n+5 є O(n2) ?
• Is 2n+1 є O(2n) ? Is 22n є O(2n) ?
• Is n(n-1) є O(n2) ?
Ω(big omega)-Notation
Doesn’tmatter
n0
t(n)c × g(n)
n
t(n) є Ω(g(n))
Ω-Notation (contd.)• Definition: A function t(n) is said
to be in Ω(g(n)) denoted t(n) є Ω(g(n)), if t(n) is bounded below by some positive constant multiple of g(n) for all sufficiently large n.
• If we can find +ve constants c and n0 such that
t(n) ≥ c × g(n) for all n ≥ n0
Ω-Notation (contd.)
• Is n3 є Ω(n2) ?
• Is 100n+5 є Ω(n2) ?
• Is n(n-1) є Ω(n2) ?
Θ(big theta)-Notation
Doesn’tmatter
n0n
t(n) є Θ(g(n))
c1 × g(n)
c2 × g(n)t(n)
Θ-Notation (contd.)• Definition: A function t(n) is said to
be in Θ(g(n)) denoted t(n) є Θ(g(n)), if t(n) is bounded both above and below by some positive constant multiples of g(n) for all sufficiently large n.
• If we can find +ve constants c1, c2, and n0 such that
c2×g(n) ≤ t(n) ≤ c1×g(n) for all n ≥ n0
Θ-Notation (contd.)• Is n(n-1) є Θ(n2) ?
• Is n2+sin(n) є Θ(n2) ?
• Is an2+bn+c є Θ(n2) for a > 0?
• Is (n+a)b Θ(nb) for b > 0 ?
O, Θ, and Ω
g(n)
(≥) Ω(g(n)): functions that grow at least as fast as g(n)
(=) Θ(g(n)): functions that grow at the same rate as g(n)
(≤) O(g(n)): functions that grow no faster that g(n)
Theorem • If t1(n) є O(g1(n)) and t2(n) є O(g2(n)),
then t1(n) + t2(n) є O(maxg1(n),g2(n))
• Analogous assertions are true for Ω and Θ notations.
• Implication: if sorting makes no more than n2 comparisons and then binary search makes no more than log2n comparisons, then efficiency is O(maxn2, log2n) = O(n2)
Theorem (contd.)• If t1(n) є O(g1(n)) and t2(n) є
O(g2(n)), then t1(n) + t2(n) є O(maxg1(n),g2(n))
t1(n) ≤ c1g1(n) for n ≥ n01 and t2(n) ≤ c2g2(n) for n ≥ n02
t1(n) + t2(n) ≤ c1g1(n) + c2g2(n) for n ≥ max n01, n02 t1(n) + t2(n) ≤ maxc1, c2 g1(n) + maxc1, c2 g2(n) for n ≥ max n01, n02 t1(n) + t2(n) ≤ 2maxc1, c2 maxg1(n), g2(n) , c
for n ≥ maxn01, n02n0
Using Limits for Comparing Orders of
Growth
• =0 implies that t(n) grows slower than g(n)
c implies that t(n) grows at the same order as g(n)
∞ implies that t(n) grows faster than g(n)
1. First two cases (0 and c) means t(n) є O(g(n))
2. Last two cases (c and ∞) means t(n) є Ω(g(n))
3. The second case (c) means t(n) є Θ(g(n))
Taking Limits• L’Hôpital’s rule: if limits of both
t(n) and g(n) are ∞ as n goes to ∞,
• Stirling’s formula: For large n, we have n! ≈
Stirling’s Formula• n! = n (n-1) (n-2) … 1• ln(n!) =
• Trapezoidal rule: • Add to both sides,• • Exponentiate both sides, n! ≈ C nne-nn1/2
= C
n! ≈
…
1 2 3 4 n
Taking Limits (contd.)• Compare orders of growth of 22n
and 3n
• Compare orders of growth of lgn and
• Compare orders of growth of n! and 2n
Summary of How to Establish Order of Growth of Basic Operation Count
• Method 1: Using Limits• Method 2: Using Theorem• Method 3: Using definitions of O-,
Ω-, and Θ-notations.
Basic Efficiency ClassesClass Name Comments
1 constant May be in best caseslgn logarithmic Halving problem size
at each iterationn linear Scan a list of size n
n×lgn linearithmic Divide and conquer algorithms, e.g.,
mergesortn2 quadratic Two embedded loops,
e.g., selection sortn3 cubic Three embedded
loops, e.g., matrix multiplication
2n exponential All subsets of n-elements set
n! factorial All permutations of an n-elements set
Important Summation Formulas
Analysis of Nonrecursive Algorithms
ALGORITHM MaxElement(A[0..n-1])//Determines largest elementmaxval <- A[0]for i <- 1 to n-1 do
if A[i] > maxvalmaxval <- A[i]
return maxval
Input size: nBasic operation: > or <-
C(n) = є Θ(n)
General Plan for Analysis of Nonrecursive
algorithms• Decide on a parameter indicating an
input’s size• Identify the basic operation• Does C(n) depends only on n or does it
also depend on type of input?• Set up sum expressing the # of times
basic operation is executed.• Find closed form for the sum or at least
establish its order of growth
Analysis of Nonrecursive (contd.)
ALGORITHM UniqueElements(A[0..n-1])//Determines whether all elements are //distinctfor i <- 0 to n-2 do
for j <- i+1 to n-1 doif A[i] = A[j]
return falsereturn true
Input size: nBasic operation: A[i] = A[j]Does C(n) depend on type of input?
UniqueElements (contd.)for i <- 0 to n-2 do
for j <- i+1 to n-1 doif A[i] = A[j]
return falsereturn true
Cworst(n) =
Why Cworst(n) is better thansaying Cworst(n)
Analysis of Nonrecursive (contd.)
ALGORITHM MatrixMultiplication(A[0..n-1, 0..n-1], B[0..n-1, 0..n-1])//Output: C = ABfor i <- 0 to n-1 do
for j <- 0 to n-1 doC[i, j] = 0.0for k <- 0 to n-1 do
C[i, j] = C[i, j] + A[i, k]×B[k, j]return C
Input size: nBasic operation: ×M(n) =
T(n) ≈ cmM(n)+caA(n) = cmn3+can3
= (cm+ca)n3
Analysis of Nonrecursive (contd.)
ALGORITHM Binary(n)// Output: # of digits in binary // representation of ncount <- 1while n > 1 do
count <- count+1n <-
return count
Input size: nBasic operation: C(n) = ?
Each iteration halves nLet m be such that 2m ≤ n < 2m+1
Take lg: m ≤ lg(n) < m+1, som = and m+1 = +1 є Θ(lg(n))
Analysis of Nonrecursive (contd.)
ALGORITHM Mystery(n)S <- 0for i <- 1 to n do
S <- S + i×ireturn S What does this algorithm compute?
What is the basic operation?
How many times is the basic operation executed?
What’s its efficiency class?Can you improve it further, orCan you prove that no improvementis possible?
Analysis of Nonrecursive (contd.)
ALGORITHM Secret(A[0..n-1])r <- A[0]s <- A[0]for i <- 1 to n-1 do
if A[i] < rr <- A[i]
if A[i] > ss <- A[i]
return s-r
What does this algorithm compute?
What is the basic operation?
How many times is the basic operation executed?
What’s its efficiency class?
Can you improve it further, Or Could you prove that no improvement is possible?
Analysis of Recursive Algorithms
ALGORITHM F(n)// Output: n!if n = 0
return 1else
return n×F(n-1)
Input size: nBasic operation: ×
M(n) = M(n-1) + 1 for n > 0
to compute F(n-1) to multiply
n and F(n-1)
Analysis of Recursive (contd.)
M(n) = M(n-1) + 1M(0) = 0
M(n) = M(n-1) + 1 = M(n-2)+ 1 + 1 = … …= M(n-n) + 1 + … + 1 = 0 + n = n
We could compute it nonrecursively, saves function call overhead
Recurrence relation
n 1’s
By Backward substitution
General Plan for Recursive
• Decide on input size parameter• Identify the basic operation• Does C(n) depends also on input
type?• Set up a recurrence relation• Solve the recurrence or, at least
establish the order of growth of its solution
Analysis of Recursive (contd.): Tower of Hanoi
Tower of Hanoi (contd.)
M(n-1)
1
M(n-1)
M(n) = M(n-1) + 1 + M(n-1)M(1) = 1
Tower of Hanoi (contd.)ALGORITHM ToH(n, s, m, d)if n = 1
print “move from s to d”else
ToH(n-1, s, d, m)print “move from s to d”ToH(n-1, m, s, d)
M(1) = 1
M(n-1)
M(1)
M(n-1)
Tower of Hanoi (contd.)M(n) = 2M(n-1) + 1 for n > 1M(1) = 1M(n) = 2M(n-1) + 1 [Backward substitution…] = 2[ 2M(n-2)+1] + 1 = 22M(n-2)+2+1 = 22[2M(n-3)+1]+2+1 = 23M(n-3)+ 22+2+1 … … … = 2n-1M(n-(n-1))+2n-2+2n-3+……+22+2+1 = 2n-1+ 2n-2+2n-3+……+22+2+1 = = 2n-1
M(n) є Θ(2n)Be careful! Recursive algorithm may look simpleBut might easily be exponential in complexity.
Tower of Hanoi (contd.)• Recursion tree (# of function calls)
n
n-1 n-1
n-2 n-2 n-2 n-2
… … …2
112
11
211
211
C(n) = = 2n-1
Analysis of Recursive (contd.)
ALGORITHM BinRec(n)//Output: # of binary digits in n’s //binary representationif n = 1
return 1else
return BinRec()+1
Input size: n = 2k
Basic operation: +
A(n) = A(2k) = A(2k-1) + 1 = [A(2k-2) + 1] + 1 = … = A(2k-k)+k = A(1) + k = k = lg(n)
A(n) є Θ(lg(n))
Analysis of Recursive (contd.)
• Maximum how many slices of pizza can a person obtain by making n straight cuts with a pizza knife?
L0 = 1 L1 = 2 L2 = 4
L3 = 7
1, 2, 4, 7, …1 2 3
Ln = Ln-1 + n for n > 0L0 = 1
difference
Bonus problem in HW2
EXAMPLE: nth Fibonacci Number
• 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …F(n) = F(n-1) + F(n-2) for n > 1F(0) = 0, F(1) = 1Let’s try backward substitution…F(n) = F(n-2)+F(n-3)+F(n-3)+F(n-4)
= F(n-2)+2F(n-3)+F(n-4)= F(n-3)+3F(n-4)+3F(n-5)+F(n-
6) …
nth Fibonacci (contd.)ax(n)+bx(n-1)+cx(n-2) = 0Homogeneous linear second-order recurrence with constant coefficientsGuess: x(n) = rn
arn+brn-1+crn-2 = 0ar2+br+c = 0 THEOREM: If r1, r2 are two real distinct roots of characteristic equation then x(n) = αr1
n+βr2n where α, β are arbitrary constants
Characteristic equation
nth Fibonacci (contd.)F(n)-F(n-1)-F(n-2) = 0r2-r-1 = 0 => r1,2 = F(n) = α+βF(0) = α+β= 0 => α+β=0F(1) = α+β= 1 F(n) = =
where and = -
nth Fibonacci (contd.)• F(n) = where ≈ 1.61803 and ≈ -
0.61803• F(n) є Θ(φn) ALGORITHM F(n)if n ≤ 1
return nelse
return F(n-1)+F(n-2)
Basic op: +
A(n) = A(n-1)+A(n-2)+1A(0) = 0, A(1) = 0
[A(n)+1] –[A(n-1)+1]-[A(n-2)+1]=0Let B(n) = A(n)+1
B(n) –B(n-1)-B(n-2)=0B(0)=1, B(1)=1
Notice, B(n) = F(n+1)A(n) = B(n)-1=F(n+1)-1= -1
A(n) є Θ(φn)
nth Fibonacci (contd.)F(5)
F(4) F(3)
F(2) F(1)
F(0)F(1)
F(2)F(3)
F(0)F(1)F(1)F(2)
F(0)F(1)Only n-1 additions, Θ(n)!
ALGORITHM Fib(n)F[0] <- 0F[1] <- 1for i <- 2 to n do
F[i] <- F[i-1]+F[i-2]return F[n]
ALGORITHM Fib(n)f <- 0fnext <- 1for i <- 2 to n do
tmp <- fnextfnext <- fnext+ff <- tmp
return fnext
Empirical Analysis• Mathematical Analysis
– Advantages• Machine and input independence
– Disadvantages• Average case analysis is hard
• Empirical Analysis– Advantages
• Applicable to any algorithm– Disadvantages
• Machine and input dependency
Empirical Analysis (contd.)
• General Plan– Understand Experiment’s purpose:
• What is the efficiency class?• Compare two algorithms for same problem
– Efficiency metric: Operation count vs. time unit
– Characteristic of input sample (range, size, etc.)
– Write a program– Generate sample inputs– Run on sample inputs and record data– Analyze data
Empirical Analysis (contd.)
• Insert counters into the program to count C(n)
• Or, time the program, (tfinish-tstart), in Java System.currentTimeMillis()– Typically not very accurate, may vary– Remedy: make several
measurements and take mean or median
– May give 0 time, remedy: run in a loop and take mean
Empirical Analysis (contd.)
• Increase sample size systematically, like 1000, 2000, 3000, …, 10000– Impact is easier to analyze
• Better is to generate random sizes within desired range– Because, odd sizes may affect
• Pseudorandom number generators– In Java Math.random() gives uniform
random number in [0, 1)– If you need number in [min, max]:
• min + (int)( Math.random() * ( (max-min) + 1 ) )
Empirical Analysis (contd.)
• Pseudorandom in [min, max]– Math.random() * (max-min) gives [0, max-
min)– min + Math.random() * (max-min) gives
[min, max)– (int)[Math.random() * (max-min+1)] gives
[0, max-min]– min+ (int)[Math.random() * (max-min+1)]
gives [min, max]• To get [5, 10]
– (int)Math.random()*(10-5+1) gives [0, 10-5] or [0, 5]
– 5+ (int)Math.random()*(10-5+1) gives [5, 10]
Empirical Analysis (contd.)
• Profiling: find out most time-consuming portion, Eclipse and Netbeans have it
• Tabulate data
• If M(n)/g(n) converges to a +ve constant, we know M(n) є Θ(g(n))
• Or look at M(2n)/M(n); e.g., if M(n) є Θ(lgn) then M(2n)/M(n) will be low
n M(n) g(n) M(n)/g(n)
Empirical Analysis (contd.)
• Scatterplot
n
count/time
n
count/time
n
count/timelgn
n
nlgn or n2
M(n) ≈ can
lgM(n) ≈ nlga + lgc
Algorithm Visualization (contd.)
• Sequence of still pictures or animation
• Sorting Out Sorting on YouTube– Nice animation of 9 sorting
algorithms
Summary• Time and Space Efficiency• C(n): Count of # of times the basic operation
is executed for input of size n• C(n) may depend on type of input and then
we need worst, average, and best case analysis
• Usually order of growth (O, Ω, Θ) is all that matters: logarithmic, linear, linearithmic, quadratic, cubic, and exponential
• Input size, basic op., worst case?, sum or recurrence, solve it…
• We run on computers for empirical analysis
Homework 2• DUE ON FEBRUARY 07 (TUESDAY)
IN CLASS