
Post on 12-May-2020


3137 Data Structures and Algorithms in C++

Lecture 7

July 26, 2006

Shlomo Hershkop

Announcements

will do review later today

will take questions at end

please make sure to submit/plan hw

semester is going to end in 2 weeks from today

Outline

Sorting: quick sort

Disjoint DS

Review for midterm

Reading: Chapter 7.7-7.8, 8-8.3

Quick sort

fastest known sort in practice

Average: O(N log N)

Worst: O(N^2)

Quicksort

if one element: return
else:
    pick a pivot from the list
    split the list around the pivot
    return quicksort(left) + pivot + quicksort(right)
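The outline above can be sketched almost line for line in C++. This is only an illustration: it copies into new vectors instead of sorting in place, and it uses the first element as the pivot purely for brevity (both are assumptions, not the course's reference code).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the recursive outline: split around a pivot, sort each side,
// and concatenate. The first element is the pivot only to keep this short.
std::vector<int> quicksort(const std::vector<int>& list) {
    if (list.size() <= 1) return list;            // zero or one element: done

    const int pivot = list.front();               // pick a pivot from the list
    std::vector<int> left, right;
    for (std::size_t k = 1; k < list.size(); ++k) {
        // split the list around the pivot; ties go to the right side
        (list[k] < pivot ? left : right).push_back(list[k]);
    }

    std::vector<int> result = quicksort(left);    // quicksort(left)
    result.push_back(pivot);                      // + pivot
    const std::vector<int> sortedRight = quicksort(right);
    result.insert(result.end(), sortedRight.begin(), sortedRight.end());
    assert(result.size() == list.size());         // nothing lost or duplicated
    return result;                                // + quicksort(right)
}
```

Calling `quicksort` on a vector holding 3, 1, 4, 1, 5 returns a new sorted vector and leaves the input untouched.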

Lets do an example

issues

How does worst case happen ?

how to pick the pivot ??

Pivot #1

use the first element of the list

pro/cons ?

an already sorted list will always take O(N^2) time
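To see the quadratic behavior concretely, here is a small sketch that counts element-versus-pivot comparisons when the first element is always the pivot. The helper name `countComparisons` is made up for this example. On an already sorted list of N distinct items every split leaves one side empty, so the count comes out to N(N-1)/2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts comparisons against the pivot made by a quicksort that always
// uses the first element as its pivot (illustrative helper only).
long long countComparisons(const std::vector<int>& list) {
    if (list.size() <= 1) return 0;
    const int pivot = list.front();
    std::vector<int> left, right;
    for (std::size_t k = 1; k < list.size(); ++k)
        (list[k] < pivot ? left : right).push_back(list[k]);
    assert(left.size() + right.size() + 1 == list.size());
    // N - 1 comparisons for this split, plus the work in both halves
    return static_cast<long long>(list.size() - 1)
         + countComparisons(left) + countComparisons(right);
}
```

For a sorted list of 100 distinct values this returns 4950, which is 100 * 99 / 2; a well-balanced 7-element input such as 4, 2, 6, 1, 3, 5, 7 needs only 10.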


Pivot #2

choose a random element for the pivot

pro/cons ?

great expected performance, whatever order the input arrives in

generating random numbers can be expensive

Pivot #3

Choose the median value from the list

pro/cons ?

hmmm don’t you need a sorted list to get the median?

actually there is a linear algorithm for this ☺

will be doing it on homework

Pivot #4

Median of 3

since #3 isn't cheap, can grab 3 elements and take their median

can even pick the 3 at random if you don’t mind the cost
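A minimal sketch of median-of-3, assuming the three samples are the first, middle, and last elements of the range (a common choice, but the slide does not say which three to grab). The function name `medianOfThree` is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Orders a[lo], a[mid], a[hi] in place and returns the index of their
// median, which ends up at mid.
std::size_t medianOfThree(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi < a.size());
    const std::size_t mid = lo + (hi - lo) / 2;
    if (a[mid] < a[lo])  std::swap(a[mid], a[lo]);   // now a[lo] <= a[mid]
    if (a[hi]  < a[lo])  std::swap(a[hi],  a[lo]);   // a[lo] is the smallest
    if (a[hi]  < a[mid]) std::swap(a[hi],  a[mid]);  // a[hi] is the largest
    return mid;
}
```

A side effect worth noting: after the call the smallest of the three sits at `lo` and the largest at `hi`, so a partition loop that follows can use them as sentinels.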

coding

ok so enough theory, how do you code all this ??

arrays are much cheaper than linked lists

lots of tricks to keep things cheap
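Two such tricks, sketched under assumptions: sort the array in place instead of building new lists, and hand small subarrays to insertion sort below a cutoff. The cutoff value of 10 and the middle-element pivot are choices made for this example, not values given in the lecture.

```cpp
#include <cassert>
#include <utility>
#include <vector>

const int CUTOFF = 10;  // assumed value; the best setting is found by measuring

// Insertion sort on a[lo..hi], used for small subarrays.
void insertionSort(std::vector<int>& a, int lo, int hi) {
    for (int p = lo + 1; p <= hi; ++p) {
        const int tmp = a[p];
        int j = p;
        for (; j > lo && tmp < a[j - 1]; --j) a[j] = a[j - 1];
        a[j] = tmp;
    }
}

// In-place quicksort on a[lo..hi]; ranges of CUTOFF or fewer items
// (including empty ranges) go to insertion sort.
void quickSortWithCutoff(std::vector<int>& a, int lo, int hi) {
    assert(0 <= lo && hi < static_cast<int>(a.size()));
    if (hi - lo + 1 <= CUTOFF) {
        insertionSort(a, lo, hi);
        return;
    }
    const int pivot = a[lo + (hi - lo) / 2];    // middle element, for brevity
    int i = lo, j = hi;
    while (i <= j) {
        while (a[i] < pivot) ++i;               // skip items already on the left
        while (pivot < a[j]) --j;               // skip items already on the right
        if (i <= j) std::swap(a[i++], a[j--]);  // exchange the out-of-place pair
    }
    quickSortWithCutoff(a, lo, j);
    quickSortWithCutoff(a, i, hi);
}

// Convenience wrapper for sorting a whole vector.
void quickSortWithCutoff(std::vector<int>& a) {
    quickSortWithCutoff(a, 0, static_cast<int>(a.size()) - 1);
}
```

No extra vectors are allocated during the sort, which is where most of the savings over the list-based outline comes from.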


understanding

Ok, any idea of how to maximize cutoff point ?

how to analyze quicksort ??

Analysis

so how to analyze quick sort

think how we did mergesort analysis

Quick sort

i = size of left partition

C1 = time to choose pivot

C2 = per-element cost of partitioning the set (partitioning N items takes C2 N)

what do you get?

$$T(N) = C_1 + C_2 N + T(i) + T(N - i - 1)$$

what would be the worst case runtime?

can you solve this with the methods from Monday?

Telescope

$$i = 0: \quad T(N) = T(0) + T(N-1) + cN$$

(T(0) is a constant, so drop it)

$$T(N) = T(N-1) + cN$$

$$T(N-1) = T(N-2) + c(N-1)$$

$$\ldots$$

$$T(2) = T(1) + 2c$$

add

$$T(N) = T(1) + c\sum_{i=2}^{N} i = T(1) + c\left(\frac{N(N+1)}{2} - 1\right)$$

what is the big O runtime ??

this was pretty clear before the analysis ☺

it's slow-sort ☺

what is the best case ?

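i = N/2

$$T(N) = 2T(N/2) + cN$$

For the average case, consider every possible input: each left partition size from 0 to N-1 is equally likely (when left is small, right is large), so the average value of T(i) is

$$T(i) = \frac{1}{N}\sum_{j=0}^{N-1} T(j)$$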

Telescope This!

$$T(N) = \frac{2}{N}\left[\sum_{j=0}^{N-1} T(j)\right] + cN$$

telescope

multiply by N:

$$N\,T(N) = 2\left[\sum_{j=0}^{N-1} T(j)\right] + cN^2$$

telescope:

$$(N-1)\,T(N-1) = 2\left[\sum_{j=0}^{N-2} T(j)\right] + c(N-1)^2$$

subtract:

$$N\,T(N) - (N-1)\,T(N-1) = 2T(N-1) + 2cN - c$$

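Bottom line

Dropping the insignificant -c term and dividing by N(N+1) gives T(N)/(N+1) = T(N-1)/N + 2c/(N+1), which telescopes to

$$\frac{T(N)}{N+1} = \frac{T(1)}{2} + 2c\sum_{i=3}^{N+1}\frac{1}{i}$$

simplifies

$$\frac{T(N)}{N+1} = O(\log N)$$

$$T(N) = O(N \log N)$$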

Wrapping up sorting

consider three elements

a,b,c

If we want to sort these three, what possible orderings can there be ?

comparisons

if we compare a & b

left if a<b

right if b<a

what are subtrees ?

Decision Tree

a decision tree portrays the comparisons made by some algorithm

can imagine a tree for quick sort

we can use it to represent any comparison-based sorting algorithm

so let's discuss some tree ideas, and apply them back to sorting routines

Lemma 1

A binary tree of depth d has at most 2^d leaves

can you prove this ??

Lemma 2

A binary tree with L leaves must have a depth of at least ⌈log(L)⌉ (log base 2)

proof should be obvious

Lemma 3

Any sorting algorithm that uses only comparisons requires at least ⌈log(N!)⌉ comparisons in the worst case

since there are N! possible orderings, the decision tree needs at least N! leaves

Theorem

any sorting algorithm that only uses comparisons requires Ω(N log N) comparisons

can you show this ?

log(N!) = log(N * (N-1) * (N-2) * (N-3) * … * 1)

log of a product = sum of the logs

log(N!) = log(N) + log(N-1) + log(N-2) + … + log(1)

(drop the last N/2 terms)

≥ log(N) + … + log(N/2)

≥ (N/2) log(N/2)

= Ω(N log N)
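A quick numeric check of the bound, as a sketch: `std::lgamma(n + 1)` returns ln(n!), so dividing by ln 2 gives log2(n!) without ever computing the huge factorial itself. The function name `log2Factorial` is made up for this example.

```cpp
#include <cassert>
#include <cmath>

// log2(n!) computed through the log-gamma function: lgamma(n + 1) = ln(n!).
double log2Factorial(int n) {
    assert(n >= 0);
    return std::lgamma(n + 1.0) / std::log(2.0);
}
```

For any N you try, the result should land between (N/2) log(N/2) and N log N, which is exactly the sandwich used in the argument above.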


so how to get past the N log N barrier ??

more information

if you have extra information you can break the N log N barrier

knowing the range, we can use other sorts

bucket sort
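A minimal bucket sort sketch, assuming the extra information is that every key is an integer in the range 0 to `maxValue` (the parameter name is an assumption for this example). It never compares two elements with each other, which is why the comparison lower bound does not apply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bucket sort for integers known to lie in [0, maxValue].
// Runs in O(N + maxValue) time using one counter per possible key.
std::vector<int> bucketSort(const std::vector<int>& input, int maxValue) {
    std::vector<std::size_t> count(static_cast<std::size_t>(maxValue) + 1, 0);
    for (int x : input) {
        assert(0 <= x && x <= maxValue);      // the "extra information"
        ++count[static_cast<std::size_t>(x)];
    }
    std::vector<int> sorted;
    sorted.reserve(input.size());
    for (int v = 0; v <= maxValue; ++v)       // emit each key count[v] times
        sorted.insert(sorted.end(), count[static_cast<std::size_t>(v)], v);
    return sorted;
}
```

The trade-off is memory: the counter array grows with the key range, so this only pays off when the range is not much larger than N.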

External Sorts

say you can only handle 100 items at a time

need to sort 1000

general case: often the items to be sorted will not all fit into memory

Strategy: any ideas ?
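The slide leaves the strategy open. One common answer, assumed here rather than taken from the lecture, is to sort memory-sized runs one at a time and then merge the sorted runs. The sketch below simulates this in memory: each inner vector stands in for a run that would live on disk, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Phase 1: cut the input into runs of at most runSize items and sort each
// run on its own (a run is a chunk small enough to fit in memory).
std::vector<std::vector<int>> makeSortedRuns(const std::vector<int>& data,
                                             std::size_t runSize) {
    assert(runSize > 0);
    std::vector<std::vector<int>> runs;
    for (std::size_t start = 0; start < data.size(); start += runSize) {
        const std::size_t end = std::min(start + runSize, data.size());
        runs.emplace_back(data.begin() + static_cast<std::ptrdiff_t>(start),
                          data.begin() + static_cast<std::ptrdiff_t>(end));
        std::sort(runs.back().begin(), runs.back().end());
    }
    return runs;
}

// Phase 2: k-way merge that keeps only one item per run in a min-heap.
std::vector<int> mergeRuns(const std::vector<std::vector<int>>& runs) {
    using Entry = std::pair<int, std::size_t>;        // (value, run index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<std::size_t> next(runs.size(), 0);    // read position per run
    for (std::size_t r = 0; r < runs.size(); ++r)
        if (!runs[r].empty()) heap.push({runs[r][0], r});

    std::vector<int> out;
    while (!heap.empty()) {
        const Entry e = heap.top();
        heap.pop();
        out.push_back(e.first);
        const std::size_t r = e.second;
        if (++next[r] < runs[r].size()) heap.push({runs[r][next[r]], r});
    }
    return out;
}
```

With 1000 items and a run size of 100, phase 1 produces 10 sorted runs and phase 2 needs only 10 items in the heap at any moment.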

switch gears

let's switch gears for a second

Question: I want to be able to give you a bunch of items and have you say which set each item belongs to (given this information)

should be able to add items to sets

should be able to look up an item’s set

I want to be able to do it quickly

Example: search results: want to be able to tell you what broad categories the search results can be divided into


any implementation ideas ?

runtime ?

equivalence

A relation R is defined on a set S if for every pair (a,b) of elements of S we can answer aRb as true or false

Equivalence relation:

1. reflexive: aRa for any a

2. symmetric: if aRb then bRa

3. transitive: if aRb and bRc then aRc

one implementation idea:

use a large matrix and mark each pair for which the relationship exists

let's do a quick example of a bunch of joins using a matrix

Equivalence class

an equivalence class of an element a in S is the subset of S that contains all elements related to a

what are the equivalence classes in the previous example?

online vs offline

some DS operations can be online, some offline

Offline: get all information, and then can process

Online: need to deal with each piece of information before continuing

Example: paper exam vs oral exam

Equivalence Class ADT

Union(a,b)

merges 2 equivalence classes

Find(a)

retrieves the equivalence class containing a

Analyzing Equivalence classes

when doing the analysis here we will be interested in a series of M operations

let's go to implementation

any ideas ?

Arrays

use an array and store the name of the class at each position

example of union

what is the running time

find?

union?

running times

so find we can do in O(1)

merge, worst could be O(N)

for M merges on N items: O(NM)

at most N - 1 merges are possible, so M can’t exceed N

so O(N^2)
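The array idea can be sketched as follows. The class name `EquivalenceClasses` and the method name `unionSets` are made up for this example (`union` itself is a reserved word in C++). Elements are assumed to be numbered 0 to N-1, and each starts in a class of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array implementation: name[x] stores the class name of element x.
class EquivalenceClasses {
public:
    explicit EquivalenceClasses(std::size_t n) : name(n) {
        for (std::size_t x = 0; x < n; ++x) name[x] = x;  // each item alone
    }

    // O(1): just read the stored class name.
    std::size_t find(std::size_t a) const {
        assert(a < name.size());
        return name[a];
    }

    // O(N): relabel every member of b's class with a's class name.
    void unionSets(std::size_t a, std::size_t b) {
        const std::size_t keep = find(a);
        const std::size_t drop = find(b);
        if (keep == drop) return;                         // already together
        for (std::size_t& label : name)
            if (label == drop) label = keep;
    }

private:
    std::vector<std::size_t> name;
};
```

The full scan inside `unionSets` is what makes a merge cost O(N) no matter how small the two classes are.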

linked lists

each member will be in its own list

merge ?

find ?

will come back to this next week

Review time

Let's start a general review

make sure you are familiar with everything covered so far

address some of the questions seen

runtimes

Runtimes are a rough way of judging algorithms

what is Big-O

why classes of functions ?

Recursion

understanding recursion

tail recursion

when to use

when not to use

what is a DS ?

what are ADTs ?

LIST

what is the list ADT ?

operations ?

runtimes ?

Lists

arrays

linked lists

queues

what is the Queue ADT ?

operations ?

runtimes ?

Priority queue ADT ?

operations ?

runtimes ?

how does the number of links affect the linked list class ?

single vs double linked

what if we added a mid point link

what if we added a bunch of others

Trees

what is the Tree ADT ?

operations ?

runtimes ?

Trees

complete trees
binary trees
BST
Balanced BST
    AVL
    Red-black
lazy deletion
tree traversal algorithms
expression trees
B+ trees
Huffman trees

heaps

what are heaps ?

what DS is being implemented ?

hashtable

what is the hash table ADT ?

operations ?

runtimes ?

what issues need to be dealt with during operations ?

why are they not an issue with Lists ?

Extendible hashing

sorting

basic sorts
    bubble
    insertion
    selection
    random

better ones
    heap sort

better yet
    mergesort
    quicksort

Sample Exam

let's do some sample questions together

Sample 1

An algorithm takes 0.5 ms for input of size 100. How long will it take for input of size 500 if the runtime is of the following (assume low order terms are negligible). Show work.

a) Linear

b) O(N log N)

c) Quadratic

d) Cubic
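One way to set up the work, as a sketch: scale the 0.5 ms by the ratio f(500)/f(100) for each growth rate f. The base of the logarithm cancels in the ratio, so any base gives the same answer for part b.

```latex
\begin{align*}
\text{a) Linear: } & 0.5 \times \frac{500}{100} = 2.5 \text{ ms} \\
\text{b) } N \log N\text{: } & 0.5 \times \frac{500 \log 500}{100 \log 100}
    \approx 0.5 \times 5 \times 1.349 \approx 3.37 \text{ ms} \\
\text{c) Quadratic: } & 0.5 \times \left(\frac{500}{100}\right)^{2} = 12.5 \text{ ms} \\
\text{d) Cubic: } & 0.5 \times \left(\frac{500}{100}\right)^{3} = 62.5 \text{ ms}
\end{align*}
```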

Sample 2

Programs A and B are analyzed and found to have worst-case running times no greater than 150 N log N and N^2, respectively. Answer the following if possible:

a) Which program has the better guarantee on the running time for large values of N (N > 10,000)?

b) Which program has the better guarantee on the running time for small values of N (N < 100)?

Sample 3

Suppose we want to add an operation FINDKth to our toolbox of binary tree operations. This operation returns the kth smallest item in the tree. Assume all items are distinct (no repeats). Explain how you would modify the binary tree structure to support FINDKth. This operation must also run in O(log N) average time, without sacrificing the time bound on any other operation currently in the tree.
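One possible approach, sketched under the assumption that the tree is a binary search tree: store in every node the size of its subtree and use it to steer the search. The `Node` layout and the names `sizeOf` and `findKth` are hypothetical, and keeping `size` up to date during insert and remove is left out of the sketch.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int item = 0;
    Node* left = nullptr;
    Node* right = nullptr;
    std::size_t size = 1;  // nodes in this subtree; insert/remove must maintain it
};

// Size of a possibly empty subtree.
std::size_t sizeOf(const Node* t) { return t ? t->size : 0; }

// Returns the node holding the kth smallest item (k starts at 1),
// or nullptr if the tree has fewer than k items.
const Node* findKth(const Node* t, std::size_t k) {
    assert(k >= 1);
    while (t) {
        const std::size_t leftSize = sizeOf(t->left);
        if (k == leftSize + 1) return t;          // exactly leftSize items are smaller
        if (k <= leftSize) {
            t = t->left;                          // answer is in the left subtree
        } else {
            k -= leftSize + 1;                    // skip left subtree and this node
            t = t->right;
        }
    }
    return nullptr;
}
```

The search follows a single root-to-leaf path, so its cost is proportional to the depth of the tree, and updating `size` along the insertion or deletion path adds only constant work per node visited.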

Sample 4

Show the result of inserting the keys 1011-1101, 0000-0010, 1001-1011, 1011-1110, 0111-1111, 0101-0001, 1001-0110, 0000-1011, 1100-1111, 1001-1110, 1101-1011, 0010-1011, 0110-0001, 1111-0000, 01110-1111 into an initially empty B-tree structure with M=3, L=3. These are 8-character keys; the dash makes them easier to read.

any other questions ??

Next

do all the reading

the exam is open brain/notes/book, closed general internet

will post when test is ready

please email/aim but will only answer up to test time

Good luck