Date post: 21-Dec-2015

Fall 2006 CSC311: Data Structures

Chapter 4: Analysis Tools

Objectives
– Experimental analysis of algorithms and its limitations
– Theoretical analysis of algorithms
– Pseudocode description of algorithms
– Big-Oh notation
– Seven functions
– Proof techniques

Analysis of Algorithms

Input → Algorithm → Output

An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.

Running Time

Most algorithms transform input objects into output objects.
The running time of an algorithm typically grows with the input size.
Average-case time is often difficult to determine.
We focus on the worst-case running time.
– Easier to analyze
– Crucial to applications such as games, finance and robotics

[Figure: running time versus input size (1000 to 4000) for the best case, average case, and worst case]

Experimental Studies

Write a program implementing the algorithm
Run the program with inputs of varying size and composition
Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
Plot the results

[Figure: time (ms) versus input size]
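The steps above can be sketched in Java roughly as follows. This is a minimal sketch, not a rigorous benchmark: the class name `TimingExperiment`, the workload `sum`, and the chosen input sizes are assumptions made for illustration, and JIT warm-up or garbage collection can distort the measured times.

```java
public class TimingExperiment {
    // Written to after each run so the timed work has an observable effect.
    static volatile long sink;

    // Hypothetical workload for illustration: sums the elements of an array.
    public static long sum(int[] data) {
        long total = 0;
        for (int value : data) {
            total += value;
        }
        return total;
    }

    // Returns the wall-clock time, in milliseconds, of `repetitions` calls
    // to sum on an input of size n.
    public static long timeMillis(int n, int repetitions) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i;
        }
        long start = System.currentTimeMillis();
        for (int r = 0; r < repetitions; r++) {
            sink = sum(data);
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Double the input size each round and print one (size, time) pair per line.
        for (int n = 1_000_000; n <= 8_000_000; n *= 2) {
            System.out.println(n + "\t" + timeMillis(n, 20) + " ms");
        }
    }
}
```

The printed pairs can then be plotted, with input size on the horizontal axis and time on the vertical axis.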

Limitations of Experiments

It is necessary to implement the algorithm, which may be difficult
Results may not be indicative of the running time on other inputs not included in the experiment.
In order to compare two algorithms, the same hardware and software environments must be used

Theoretical Analysis

Uses a high-level description of the algorithm instead of an implementation

Characterizes running time as a function of the input size, $n$.

Takes into account all possible inputs

Allows us to evaluate the speed of an algorithm independent of the hardware/software environment

Pseudocode

High-level description of an algorithm
More structured than English prose
Less detailed than a program
Preferred notation for describing algorithms
Hides program design issues

Example: find max element of an array

Algorithm arrayMax(A, n)
  Input array A of n integers
  Output maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
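For comparison, here is one possible Java rendering of the arrayMax pseudocode. The class name `ArrayMax` and the `main` method are added for illustration, and the sketch assumes n ≥ 1.

```java
public class ArrayMax {
    /** Returns the maximum of the first n elements of A (assumes n >= 1). */
    public static int arrayMax(int[] A, int n) {
        int currentMax = A[0];
        for (int i = 1; i <= n - 1; i++) {
            if (A[i] > currentMax) {
                currentMax = A[i];
            }
        }
        return currentMax;
    }

    public static void main(String[] args) {
        int[] A = {3, 9, 2, 7};
        System.out.println(arrayMax(A, A.length)); // prints 9
    }
}
```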

Pseudocode Details

Control flow
– if … then … [else …]
– while … do …
– repeat … until …
– for … do …
– Indentation replaces braces

Method declaration
  Algorithm method(arg [, arg…])
    Input …
    Output …

Method call
  var.method(arg [, arg…])

Return value
  return expression

Expressions
  ← Assignment (like = in Java)
  = Equality testing (like == in Java)
  $n^2$ Superscripts and other mathematical formatting allowed

The Random Access Machine (RAM) Model

A CPU

A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character

[Figure: memory cells numbered 0, 1, 2, …]

Memory cells are numbered, and accessing any cell in memory takes unit time.

Seven Important Functions

Seven functions that often appear in algorithm analysis:
– Constant ≈ $1$
– Logarithmic ≈ $\log n$
– Linear ≈ $n$
– N-Log-N ≈ $n \log n$
– Quadratic ≈ $n^2$
– Cubic ≈ $n^3$
– Exponential ≈ $2^n$

In a log-log chart, the slope of the line corresponds to the growth rate of the function

[Figure: log-log chart of $T(n)$ versus $n$ showing the linear, quadratic, and cubic functions]
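One way to see why the slope matches the growth rate, assuming a pure power function (the constants $a$ and $c$ are introduced here only for illustration):

```latex
\[
T(n) = a\,n^{c}
\quad\Longrightarrow\quad
\log T(n) = \log a + c \log n
\]
```

On log-log axes this is a straight line with slope $c$, so a linear function plots with slope 1, a quadratic with slope 2, and a cubic with slope 3.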

Primitive Operations

Basic computations performed by an algorithm

Identifiable in pseudocode

Largely independent from the programming language

Exact definition not important (we will see why later)

Assumed to take a constant amount of time in the RAM model

Examples:
– Evaluating an expression
– Assigning a value to a variable
– Indexing into an array
– Calling a method
– Returning from a method

Counting Primitive Operations

By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

Algorithm arrayMax(A, n)              # operations
  currentMax ← A[0]                   2
  for i ← 1 to n − 1 do               2n
    if A[i] > currentMax then         2(n − 1)
      currentMax ← A[i]               2(n − 1)
    { increment counter i }           2(n − 1)
  return currentMax                   1
                            Total     8n − 2

Estimating Running Time

Algorithm arrayMax executes $8n - 2$ primitive operations in the worst case. Define:
$a$ = Time taken by the fastest primitive operation
$b$ = Time taken by the slowest primitive operation

Let $T(n)$ be the worst-case time of arrayMax. Then

$a(8n - 2) \le T(n) \le b(8n - 2)$

Hence, the running time $T(n)$ is bounded by two linear functions

Growth Rate of Running Time

Changing the hardware/software environment
– Affects $T(n)$ by a constant factor, but
– Does not alter the growth rate of $T(n)$

The linear growth rate of the running time $T(n)$ is an intrinsic property of algorithm arrayMax

Constant Factors

The growth rate is not affected by
– constant factors or
– lower-order terms

Examples
– $10^2 n + 10^5$ is a linear function
– $10^5 n^2 + 10^8 n$ is a quadratic function

[Figure: log-log chart of $T(n)$ versus $n$ showing two linear and two quadratic functions]

Big-Oh Notation

Given functions $f(n)$ and $g(n)$, we say that $f(n)$ is $O(g(n))$ if there are positive constants $c$ and $n_0$ such that

$f(n) \le c\,g(n)$ for $n \ge n_0$

Example: $2n + 10$ is $O(n)$
– $2n + 10 \le cn$
– $(c - 2)\,n \ge 10$
– $n \ge 10/(c - 2)$
– Pick $c = 3$ and $n_0 = 10$

[Figure: log-log chart of $3n$, $2n + 10$, and $n$ for $n$ from 1 to 1,000]
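The choice c = 3 and n₀ = 10 can be sanity-checked numerically. The sketch below is only an illustration: the class `BigOhCheck` and the method `holdsOnRange` are hypothetical names, and a finite check can expose a wrong pair of constants but does not prove the bound for all n ≥ n₀.

```java
import java.util.function.LongUnaryOperator;

public class BigOhCheck {
    /**
     * Returns true if f(n) <= c * g(n) for every integer n with n0 <= n <= limit.
     * This is a finite check, not a proof for all n >= n0.
     */
    public static boolean holdsOnRange(LongUnaryOperator f, LongUnaryOperator g,
                                       long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f.applyAsLong(n) > c * g.applyAsLong(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LongUnaryOperator f = n -> 2 * n + 10;
        LongUnaryOperator g = n -> n;
        System.out.println(holdsOnRange(f, g, 3, 10, 1_000_000)); // prints true
        System.out.println(holdsOnRange(f, g, 3, 9, 1_000_000));  // prints false: 28 > 27 at n = 9
    }
}
```

The second call shows that n₀ = 10 is the smallest threshold that works with c = 3.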

Big-Oh Example

Example: the function $n^2$ is not $O(n)$
– $n^2 \le cn$
– $n \le c$
– The above inequality cannot be satisfied since $c$ must be a constant

[Figure: log-log chart of $n^2$, $100n$, $10n$, and $n$ for $n$ from 1 to 1,000]

More Big-Oh Examples

$7n - 2$
– $7n - 2$ is $O(n)$
– need $c > 0$ and $n_0 \ge 1$ such that $7n - 2 \le c \cdot n$ for $n \ge n_0$
– this is true for $c = 7$ and $n_0 = 1$

$3n^3 + 20n^2 + 5$
– $3n^3 + 20n^2 + 5$ is $O(n^3)$
– need $c > 0$ and $n_0 \ge 1$ such that $3n^3 + 20n^2 + 5 \le c \cdot n^3$ for $n \ge n_0$
– this is true for $c = 4$ and $n_0 = 21$

$3 \log n + 5$
– $3 \log n + 5$ is $O(\log n)$
– need $c > 0$ and $n_0 \ge 1$ such that $3 \log n + 5 \le c \cdot \log n$ for $n \ge n_0$
– this is true for $c = 8$ and $n_0 = 2$

Big-Oh and Growth Rate

The big-Oh notation gives an upper bound on the growth rate of a function

The statement “$f(n)$ is $O(g(n))$” means that the growth rate of $f(n)$ is no more than the growth rate of $g(n)$

We can use the big-Oh notation to rank functions according to their growth rate

                      f(n) is O(g(n))    g(n) is O(f(n))
  g(n) grows more     Yes                No
  f(n) grows more     No                 Yes
  Same growth         Yes                Yes

Big-Oh Rules

If $f(n)$ is a polynomial of degree $d$, then $f(n)$ is $O(n^d)$, i.e.,

1. Drop lower-order terms

2. Drop constant factors

Use the smallest possible class of functions
– Say “$2n$ is $O(n)$” instead of “$2n$ is $O(n^2)$”

Use the simplest expression of the class
– Say “$3n + 5$ is $O(n)$” instead of “$3n + 5$ is $O(3n)$”

Asymptotic Algorithm Analysis

The asymptotic analysis of an algorithm determines the running time in big-Oh notation

To perform the asymptotic analysis
– We find the worst-case number of primitive operations executed as a function of the input size
– We express this function with big-Oh notation

Example:
– We determine that algorithm arrayMax executes at most $8n - 2$ primitive operations
– We say that algorithm arrayMax “runs in $O(n)$ time”

Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations

Computing Prefix Averages

We further illustrate asymptotic analysis with two algorithms for prefix averages

The $i$-th prefix average of an array $X$ is the average of the first $(i + 1)$ elements of $X$:

$A[i] = (X[0] + X[1] + \dots + X[i])/(i + 1)$

Computing the array $A$ of prefix averages of another array $X$ has applications to financial analysis

[Figure: chart of an example array X and its prefix averages A at positions 1 to 7]

Prefix Averages (Quadratic)

The following algorithm computes prefix averages in quadratic time by applying the definition

Algorithm prefixAverages1(X, n)
  Input array X of n integers
  Output array A of prefix averages of X      #operations
  A ← new array of n integers                 n
  for i ← 0 to n − 1 do                       n
    s ← X[0]                                  n
    for j ← 1 to i do                         1 + 2 + … + (n − 1)
      s ← s + X[j]                            1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                        n
  return A                                    1
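A possible Java version of prefixAverages1 is sketched below. The class name `PrefixAverages1` is an assumption, and the result is stored in a `double[]` rather than the array of integers named in the pseudocode so that fractional averages are not truncated.

```java
import java.util.Arrays;

public class PrefixAverages1 {
    /** Quadratic-time prefix averages: each prefix sum is recomputed from scratch. */
    public static double[] prefixAverages1(int[] X, int n) {
        double[] A = new double[n];
        for (int i = 0; i <= n - 1; i++) {
            double s = X[0];
            for (int j = 1; j <= i; j++) { // runs i times for the i-th prefix
                s = s + X[j];
            }
            A[i] = s / (i + 1);
        }
        return A;
    }

    public static void main(String[] args) {
        int[] X = {10, 20, 30};
        System.out.println(Arrays.toString(prefixAverages1(X, X.length))); // prints [10.0, 15.0, 20.0]
    }
}
```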

Arithmetic Progression

The running time of prefixAverages1 is $O(1 + 2 + \dots + n)$

The sum of the first $n$ integers is $n(n + 1)/2$
– There is a simple visual proof of this fact

Thus, algorithm prefixAverages1 runs in $O(n^2)$ time

[Figure: bar chart illustrating the sum $1 + 2 + \dots + n$]
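The formula can also be checked algebraically by pairing each term $k$ with $n + 1 - k$. This is one standard argument and is not necessarily the visual proof the slide refers to:

```latex
\[
2\sum_{k=1}^{n} k
= \sum_{k=1}^{n} \bigl(k + (n + 1 - k)\bigr)
= n(n+1)
\quad\Longrightarrow\quad
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
\]
```

Since $n(n+1)/2 = n^2/2 + n/2$, dropping the lower-order term and the constant factor gives $O(n^2)$.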

Prefix Averages (Linear)

The following algorithm computes prefix averages in linear time by keeping a running sum

Algorithm prefixAverages2(X, n)
  Input array X of n integers
  Output array A of prefix averages of X      #operations
  A ← new array of n integers                 n
  s ← 0                                       1
  for i ← 0 to n − 1 do                       n
    s ← s + X[i]                              n
    A[i] ← s / (i + 1)                        n
  return A                                    1

Algorithm prefixAverages2 runs in $O(n)$ time
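The running-sum idea might look like this in Java. As before, the class name `PrefixAverages2` and the use of `double[]` for the output are assumptions made for the sketch.

```java
import java.util.Arrays;

public class PrefixAverages2 {
    /** Linear-time prefix averages: one pass that keeps a running sum. */
    public static double[] prefixAverages2(int[] X, int n) {
        double[] A = new double[n];
        double s = 0;
        for (int i = 0; i <= n - 1; i++) {
            s = s + X[i];       // s now holds X[0] + ... + X[i]
            A[i] = s / (i + 1);
        }
        return A;
    }

    public static void main(String[] args) {
        int[] X = {10, 20, 30};
        System.out.println(Arrays.toString(prefixAverages2(X, X.length))); // prints [10.0, 15.0, 20.0]
    }
}
```

Each element of X is read exactly once, which is where the linear bound comes from.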

Math You Need to Review

Summations

Logarithms
  $\log_b(xy) = \log_b x + \log_b y$
  $\log_b(x/y) = \log_b x - \log_b y$
  $\log_b x^a = a \log_b x$
  $\log_b a = \log_x a / \log_x b$

Exponentials:
  $a^{(b+c)} = a^b a^c$
  $a^{bc} = (a^b)^c$
  $a^b / a^c = a^{(b-c)}$
  $b = a^{\log_a b}$
  $b^c = a^{c \log_a b}$

Proof Techniques

– By counterexample
– Contrapositive
  Ex. Let a and b be integers. If ab is even, then a is even or b is even.
  Proof: Consider the contrapositive: if a is odd and b is odd, then ab is odd. Writing a = 2j + 1 and b = 2k + 1 gives ab = 2(2jk + j + k) + 1, which is odd.
– Contradiction
  Ex. Let a and b be integers. If ab is odd, then a is odd and b is odd.
  Proof: Assume the opposite of the “then” part, that is, a is even or b is even. Then ab is even, which contradicts the assumption that ab is odd.
– Induction
  Two formats
  Base case
  Induction case
– Loop invariants

Relatives of Big-Oh

big-Omega
– $f(n)$ is $\Omega(g(n))$ if there is a constant $c > 0$ and an integer constant $n_0 \ge 1$ such that $f(n) \ge c \cdot g(n)$ for $n \ge n_0$

big-Theta
– $f(n)$ is $\Theta(g(n))$ if there are constants $c' > 0$ and $c'' > 0$ and an integer constant $n_0 \ge 1$ such that $c' \cdot g(n) \le f(n) \le c'' \cdot g(n)$ for $n \ge n_0$

Intuition for Asymptotic Notation

Big-Oh
– $f(n)$ is $O(g(n))$ if $f(n)$ is asymptotically less than or equal to $g(n)$

big-Omega
– $f(n)$ is $\Omega(g(n))$ if $f(n)$ is asymptotically greater than or equal to $g(n)$

big-Theta
– $f(n)$ is $\Theta(g(n))$ if $f(n)$ is asymptotically equal to $g(n)$

Example Uses of the Relatives of Big-Oh

$5n^2$ is $\Omega(n^2)$
– $f(n)$ is $\Omega(g(n))$ if there is a constant $c > 0$ and an integer constant $n_0 \ge 1$ such that $f(n) \ge c \cdot g(n)$ for $n \ge n_0$
– let $c = 5$ and $n_0 = 1$

$5n^2$ is $\Omega(n)$
– $f(n)$ is $\Omega(g(n))$ if there is a constant $c > 0$ and an integer constant $n_0 \ge 1$ such that $f(n) \ge c \cdot g(n)$ for $n \ge n_0$
– let $c = 1$ and $n_0 = 1$

$5n^2$ is $\Theta(n^2)$
– $f(n)$ is $\Theta(g(n))$ if it is $\Omega(g(n))$ and $O(g(n))$. We have already seen the former; for the latter, recall that $f(n)$ is $O(g(n))$ if there is a constant $c > 0$ and an integer constant $n_0 \ge 1$ such that $f(n) \le c \cdot g(n)$ for $n \ge n_0$
– let $c = 5$ and $n_0 = 1$

