
Datastructuren - Data Structures
Source: liacs.leidenuniv.nl/~hoogeboomhj/dat/ohp/dat-present-3.pdf

Date post: 17-Jul-2020
Datastructuren (Data Structures)

Fenia Aivaloglou, Hendrik Jan Hoogeboom

Informatica – LIACS, Universiteit Leiden

Fall 2019

Table of Contents

3 Binary Search Trees

    Introduction
    BST use cases
    Constructing BSTs
    Analysis of trees
    ADT Set and Dictionary

Introduction

binary search tree (BST) [1]

[Figure: a node with key K; its left subtree holds the keys < K, its right subtree the keys > K]

Definition

A binary search tree is a binary tree such that for each node:

all nodes in its left subtree have smaller values, and

all nodes in its right subtree have larger values

[1] Dutch abbreviation: BZB; see the Algoritmiek course
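The definition can be turned into a small checker. The sketch below is illustrative only: the `Node` struct with an `int` key and the name `isBST` are assumptions, not taken from the slides. It passes down the open interval in which the keys of a subtree must lie, so the condition is enforced for all nodes of a subtree and not only for direct children.

```cpp
// Hypothetical minimal node type for illustration (not from the slides).
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Checks the BST definition: for each node, all keys in its left subtree
// are smaller and all keys in its right subtree are larger.
// lo and hi point to the exclusive bounds inherited from the ancestors;
// nullptr means "no bound on this side".
bool isBST(const Node* t, const int* lo = nullptr, const int* hi = nullptr) {
    if (t == nullptr) return true;                       // empty tree is a BST
    if (lo != nullptr && !(*lo < t->key)) return false;  // key must exceed lo
    if (hi != nullptr && !(t->key < *hi)) return false;  // key must be below hi
    return isBST(t->left, lo, &t->key) && isBST(t->right, &t->key, hi);
}
```

A call such as `isBST(root)` starts without bounds; each step to the left tightens the upper bound and each step to the right tightens the lower bound.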

comparables

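[Figure: three example key sets of different comparable types]

names: chico, harpo, groucho, gummo, marx, zeppo

numbers: 4, 5, 11, 18, 25, 30

dates: 11.6.1509, 28.5.1533, 30.5.1536, 6.1.1540, 28.7.1540, 12.7.1543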

binary search tree (BST)

worst-case search complexity (an unsuccessful search):

linear tree: $O(n)$

optimal tree: $O(\log_2 n)$ (complete tree)

average-case behaviour: see later

BST with 31 most common English words

[Figure: BST containing the words the, to, this, with, was, you, which, of, and, that, on, or, a, in, I, is, it, not, for, as, his, are, be, he, at, but, from, have, her, by, had; "the" is the root]

top five frequencies indicated: the 15568, of 9767, and 7638, to 5739, a 5074

Inserted in BST by decreasing order of frequency

Successful search of BST requires 4.042 comparisons (on avg.)

balanced BST

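[Figure: BST over the same 31 words, in alphabetical order: a, and, are, as, at, be, but, by, for, from, had, have, he, her, his, I, in, is, it, not, of, on, or, that, the, this, to, was, which, with, you; frequencies indicated: a 5074, and 7638, of 9767, the 15568, to 5739]

Perfectly balanced BST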

Successful search requires 4.393 comparisons (on avg.)

optimal BST

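[Figure: optimal BST over the same 31 words, level by level from the root]

level 1: of (9767)

level 2: for, the (15568)

level 3: and (7638), in, that, to (5739)

level 4: a (5074), be, he, it, on, this, with

level 5: as, by, had, his, is, not, or, was, you

level 6: are, at, but, from, have, her, I, which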

Optimal tree taking frequencies into account

Successful search requires 3.437 comparisons (on avg.)

source: Knuth TAoCP Vol.3 (Sorting and Searching)

BST use cases

search value

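    // Returns true if x occurs in the subtree rooted at t.
    bool contains( const Comparable & x, Node *t ) const {
        if( t == nullptr )
            return false;                      // empty subtree: not found
        else if( x < t->element )
            return contains( x, t->left );     // x can only be on the left
        else if( t->element < x )
            return contains( x, t->right );    // x can only be on the right
        else
            return true;                       // found
    }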

call with: contains(v,root);
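Because both recursive calls are the last action of the function, the same search can also be written as a loop. The following is a minimal sketch assuming `int` keys; the `Node` struct and the name `containsIter` are hypothetical, with member names borrowed from the slide.

```cpp
// Hypothetical node type; member names follow the slide (element, left, right).
struct Node {
    int element;
    Node* left;
    Node* right;
};

// Iterative search: walk down from the root, going left or right
// depending on the comparison, until the key is found or the walk
// reaches an empty subtree.
bool containsIter(int x, const Node* t) {
    while (t != nullptr) {
        if (x < t->element)
            t = t->left;
        else if (t->element < x)
            t = t->right;
        else
            return true;   // found
    }
    return false;          // unsuccessful search
}
```

Like the recursive version, it only uses `<`, so the key type needs nothing beyond a strict ordering.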

find min/max value

    // Recursive: the minimum is the leftmost node.
    BinaryNode * findMin( BinaryNode *t ) const {
        if( t == nullptr )
            return nullptr;
        if( t->left == nullptr )
            return t;
        return findMin( t->left );
    }

    // Iterative: the maximum is the rightmost node.
    BinaryNode * findMax( BinaryNode *t ) const {
        if( t != nullptr )
            while( t->right != nullptr )
                t = t->right;
        return t;
    }

call with: findMin(root); and findMax(root);

inorder is sorted

[Figure: BST with keys 8, 11, 15, 20, 26, 33, 34, 42, 51, 57, 61; each node is annotated with its inorder rank, 1 to 11]

inorder: 8 11 15 20 26 33 34 42 51 57 61
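A short sketch of the traversal behind this observation, assuming `int` keys and a hypothetical `Node` struct (neither is taken from the slides): visiting the left subtree, then the node, then the right subtree emits the keys in increasing order.

```cpp
#include <vector>

// Hypothetical minimal node type for illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Appends the keys of the subtree rooted at t to out in inorder:
// left subtree first, then the node itself, then the right subtree.
void inorder(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;
    inorder(t->left, out);
    out.push_back(t->key);
    inorder(t->right, out);
}
```

For a valid BST the resulting vector is sorted, which also gives a simple way to test whether a tree satisfies the BST property.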

find k-th element

Augment each node with the size of its subtree

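[Figure: BST in which each node stores the size of its subtree (key: size): 5: 1, 10: 3, 14: 1, 20: 6, 26: 1, 30: 2, 35: 11, 39: 1, 45: 4, 51: 2, 56: 1]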

Let r = left->size + 1 (an empty left subtree has size 0)

If k = r: stop, this node holds the k-th item

If k < r: search for the k-th item in the left subtree

If k > r: search for the (k − r)-th item in the right subtree
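The three cases above can be sketched as a loop. This is an illustrative version under stated assumptions: the `Node` layout with a `size` field, the helper `sizeOf`, and the name `selectKth` are hypothetical, and keys are `int`.

```cpp
#include <cstddef>

// Hypothetical node type augmented with the size of its subtree.
struct Node {
    int key;
    std::size_t size;   // number of nodes in the subtree rooted here
    Node* left;
    Node* right;
};

// Size of a possibly empty subtree.
std::size_t sizeOf(const Node* t) { return t == nullptr ? 0 : t->size; }

// Returns the node holding the k-th smallest key (k = 1 is the minimum),
// or nullptr if k is out of range.
const Node* selectKth(const Node* t, std::size_t k) {
    while (t != nullptr) {
        std::size_t r = sizeOf(t->left) + 1;   // rank of t within its subtree
        if (k == r) return t;
        if (k < r) {
            t = t->left;
        } else {
            k -= r;          // skip the left subtree and t itself
            t = t->right;
        }
    }
    return nullptr;
}
```

The loop follows a single root-to-node path, so its cost is bounded by the height of the tree, provided the `size` fields are kept up to date on insertion and deletion.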

counting items in [12, 52]

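[Figure: BST with keys 3, 6, 9, ..., 57, 60. Nodes inside the range that are counted individually are marked X (12, 18, 27, 42, 48, 51); four subtrees are counted at once through their sizes (1, 2, 4, 1). Total: 6 + 8 = 14 items.]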
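One way to count with stored subtree sizes is sketched below. It is a rank-based variant and an assumption on my part, not necessarily the exact bookkeeping of the figure: it follows one search path for each end of the interval and adds whole subtrees through their `size` fields. The `Node` layout and all function names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical node type augmented with the size of its subtree.
struct Node {
    int key;
    std::size_t size;
    Node* left;
    Node* right;
};

std::size_t sizeOf(const Node* t) { return t == nullptr ? 0 : t->size; }

// Number of keys strictly smaller than x.
std::size_t countLess(const Node* t, int x) {
    std::size_t count = 0;
    while (t != nullptr) {
        if (t->key < x) {
            count += sizeOf(t->left) + 1;  // t and its whole left subtree are < x
            t = t->right;
        } else {
            t = t->left;
        }
    }
    return count;
}

// Number of keys smaller than or equal to x.
std::size_t countLessEq(const Node* t, int x) {
    std::size_t count = 0;
    while (t != nullptr) {
        if (!(x < t->key)) {               // t->key <= x
            count += sizeOf(t->left) + 1;
            t = t->right;
        } else {
            t = t->left;
        }
    }
    return count;
}

// Number of keys in the closed interval [lo, hi].
std::size_t countInRange(const Node* t, int lo, int hi) {
    if (hi < lo) return 0;
    return countLessEq(t, hi) - countLess(t, lo);
}
```

Both helper loops visit one node per level, so the count costs time proportional to the height of the tree rather than to the number of items in the range.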

Constructing BSTs

insertion (implementation)

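    // Inserts el into the subtree whose root pointer is p.
    // p is passed by reference so that an empty subtree can be replaced
    // by the newly created node.
    template<class T>
    void Node<T>::insert(const T& el, Node<T> * & p) {
        if( p == nullptr ) {
            p = new Node{el, nullptr, nullptr};
        } else if (el < p->data) {
            insert(el, p->left);
        } else if (el > p->data) {
            insert(el, p->right);
        } else {
            ; // Duplicate; do nothing
        }
    }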

call with: insert(el,root);

deletion “by copying”

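[Figure: the deletion cases. A node × with one empty subtree Λ is removed by linking its parent f directly to the remaining subtree T1. A node × with two subtrees T1 and T2 first receives the value of a node p that has an empty subtree Λ; the node that held p is then removed as in the first case.]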

deletion (implementation)

    void remove( const Comparable & x, Node * & t ) {
        if( t == nullptr ) return;                        // x not present
        if( x < t->element ) remove( x, t->left );
        else if( t->element < x ) remove( x, t->right );
        else if( t->left != nullptr && t->right != nullptr ) {
            // two children: copy the predecessor, then delete it on the left
            Node *pred = findMax( t->left );
            t->element = pred->element;
            remove( t->element, t->left );
        }
        else {
            // at most one child: bypass the node and free it
            Node *oldNode = t;
            if( t->left != nullptr ) t = t->left;
            else t = t->right;
            delete oldNode;
        }
    }

call with: remove(el,root);

Analysis of trees

counting trees

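[Figure: a binary tree with root i, a left subtree counted by $B_{i-1}$ and a right subtree counted by $B_{n-i}$]

Unlabeled n-node binary trees:

$B_n = \sum_{i=1}^{n} B_{i-1} \cdot B_{n-i}$ with $B_0 = 1$

nth Catalan number: $B_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!} \sim \frac{4^n}{n^{3/2}\sqrt{\pi}}$

This is also the number of BSTs with given values: there is a unique way to store the values in a given [unlabeled] tree.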
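The recurrence can be evaluated bottom-up, with the sum taken over the position i = 1, ..., n of the root so that the subtrees have i − 1 and n − i nodes. The function name `countTrees` is an assumption for illustration, and the 64-bit arithmetic is exact only while the result fits (values of n in the low thirties are safe).

```cpp
#include <cstdint>
#include <vector>

// Number of unlabeled binary trees with n nodes, computed with
// B_m = sum over i = 1..m of B_{i-1} * B_{m-i}, starting from B_0 = 1.
std::uint64_t countTrees(unsigned n) {
    std::vector<std::uint64_t> B(n + 1, 0);
    B[0] = 1;
    for (unsigned m = 1; m <= n; ++m)
        for (unsigned i = 1; i <= m; ++i)   // i = position of the root
            B[m] += B[i - 1] * B[m - i];
    return B[n];
}
```

For example, `countTrees(4)` gives 14, the number of 4-node BSTs.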

internal path length

[Figure: 6-node binary tree with each node labeled by its path length: 0; 1, 1; 2, 2, 2]

ipl = 0 + 1 + 1 + 2 + 2 + 2 = 8

Path length of a node: # edges from root to node

Definition (Internal path length)

ipl = sum of the path lengths of all nodes

Avg # comparisons in successful search: $\frac{\mathrm{ipl}}{n} + 1$
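A minimal sketch of how ipl and the resulting average could be computed for a given tree. The key-less `Node` struct and the function names are assumptions for illustration; only the shape of the tree matters here.

```cpp
#include <cstddef>

// Hypothetical node type; keys are irrelevant for path lengths.
struct Node {
    Node* left;
    Node* right;
};

// Sum of the path lengths of all nodes in the subtree rooted at t,
// where t itself has the given path length (depth).
std::size_t ipl(const Node* t, std::size_t depth = 0) {
    if (t == nullptr) return 0;
    return depth + ipl(t->left, depth + 1) + ipl(t->right, depth + 1);
}

// Number of nodes in the subtree rooted at t.
std::size_t countNodes(const Node* t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// Average number of nodes inspected in a successful search: ipl/n + 1.
double avgSuccessful(const Node* t) {
    std::size_t n = countNodes(t);
    return n == 0 ? 0.0 : static_cast<double>(ipl(t)) / n + 1.0;
}
```

For the 6-node example with ipl = 8 this gives 8/6 + 1, roughly 2.33 comparisons.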

external path length

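[Figure: the same 6-node tree (path lengths 0; 1, 1; 2, 2, 2), extended with 7 external leaves at path lengths 3, 3, 3, 3, 2, 3, 3]

E = 3 + 3 + 3 + 3 + 2 + 3 + 3 = 20

Definition (External path length)

E = sum of all path lengths to the 'extended' leaves

Avg # comparisons in unsuccessful search: $\frac{E}{n+1}$ (n + 1 leaves)

Relation to ipl: E = ipl + 2n (proof: induction)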
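The relation E = ipl + 2n can be checked numerically on any tree. In this sketch every empty child pointer plays the role of an 'extended' leaf; the `Node` struct and the function names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical node type; keys are irrelevant for path lengths.
struct Node {
    Node* left;
    Node* right;
};

// External path length: each empty subtree (nullptr) is an extended leaf
// and contributes its own depth.
std::size_t epl(const Node* t, std::size_t depth = 0) {
    if (t == nullptr) return depth;
    return epl(t->left, depth + 1) + epl(t->right, depth + 1);
}

// Internal path length: sum of the depths of the real nodes.
std::size_t ipl(const Node* t, std::size_t depth = 0) {
    if (t == nullptr) return 0;
    return depth + ipl(t->left, depth + 1) + ipl(t->right, depth + 1);
}

// Number of real nodes.
std::size_t countNodes(const Node* t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// True if E = ipl + 2n holds for the tree rooted at t.
bool relationHolds(const Node* t) {
    return epl(t) == ipl(t) + 2 * countNodes(t);
}
```

On the 6-node example, `epl` returns 20 and `ipl` returns 8, matching 20 = 8 + 2 · 6.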

path length extremal trees

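optimal (balanced):

h levels: $n = 2^h - 1$ nodes, $h = \lg(n+1)$

[Figure: complete tree with nodes labeled by path length: 0; 1, 1; 2, 2, 2, 2]

$\mathrm{ipl} = \sum_{i=0}^{h-1} i \cdot 2^i$, $E = 2^h \cdot h$

$\Rightarrow \mathrm{ipl} = (n+1)\lg(n+1) - 2n$

$\mathrm{avg} = \frac{n+1}{n}\lg(n+1) - 1$

worst case (linear):

[Figure: linear tree with nodes labeled by path length: 0, 1, 2, ..., 6]

$\mathrm{ipl} = \sum_{i=0}^{n-1} i = \frac{n(n-1)}{2}$

$E = \mathrm{ipl} + 2n = \frac{n(n+3)}{2}$

$\mathrm{avg} = \frac{n+1}{2}$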
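The step to the closed form for the complete tree can be filled in with the relation $E = \mathrm{ipl} + 2n$ and $2^h = n + 1$. The second line cross-checks the result against the sum, using a standard identity for $\sum i \cdot 2^i$ that is stated here without proof.

```latex
\begin{align*}
\mathrm{ipl} &= E - 2n = h \cdot 2^h - 2n = (n+1)\lg(n+1) - 2n, \\
\sum_{i=0}^{h-1} i \cdot 2^i &= (h-2)\,2^h + 2
  = (n+1)\bigl(\lg(n+1) - 2\bigr) + 2 = (n+1)\lg(n+1) - 2n, \\
\mathrm{avg} &= \frac{\mathrm{ipl}}{n} + 1 = \frac{n+1}{n}\lg(n+1) - 1.
\end{align*}
```

As a check, h = 3 gives n = 7 and ipl = 0 + 2 + 8 = 10, which equals 8 · 3 − 14.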

average tree

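intuition: more balance ⇒ more permutations yield that tree

example: 4-node BSTs

[Figure: the 7 BSTs on keys 1 to 4 with root 1 or 2, each listed with the insertion orders (permutations) that produce it]

root 1, chain 1-2-3-4: 1234, ipl = 6

root 1, chain 1-2-4-3: 1243, ipl = 6

root 1, then 3 with children 2 and 4: 1324, 1342, ipl = 5

root 1, chain 1-4-2-3: 1423, ipl = 6

root 1, chain 1-4-3-2: 1432, ipl = 6

root 2 with children 1 and 3, 4 below 3: 2134, 2314, 2341, ipl = 4

root 2 with children 1 and 4, 3 below 4: 2143, 2413, 2431, ipl = 4

14 BSTs (7 symmetric to above)

4! = 24 permutations

average ipl: $\frac{1}{24}(12 \times 4 + 4 \times 5 + 8 \times 6) = \frac{116}{24} = \frac{29}{6}$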
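The total of 116 can be reproduced by brute force: insert every permutation of 1, ..., n into an empty BST and add up the path lengths. This is a sketch for small n only, since the running time grows with n!; the function names and the array-based tree representation are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Internal path length of the BST obtained by inserting the (distinct) keys
// of perm one by one into an empty tree. The tree is kept in index-based
// arrays; -1 marks an empty child and node 0 is the root.
std::size_t iplOfInsertionOrder(const std::vector<int>& perm) {
    std::vector<int> key, left, right;
    std::size_t total = 0;
    for (int x : perm) {
        std::size_t depth = 0;
        int cur = key.empty() ? -1 : 0;
        int parent = -1;
        bool wentLeft = false;
        while (cur != -1) {                 // walk down to an empty position
            parent = cur;
            wentLeft = x < key[cur];
            cur = wentLeft ? left[cur] : right[cur];
            ++depth;
        }
        int id = static_cast<int>(key.size());
        key.push_back(x);
        left.push_back(-1);
        right.push_back(-1);
        if (parent != -1) (wentLeft ? left[parent] : right[parent]) = id;
        total += depth;                     // path length of the new node
    }
    return total;
}

// Sum of ipl over all n! insertion orders of the keys 1..n.
std::size_t totalIplOverPermutations(int n) {
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 1);
    std::size_t sum = 0;
    do {
        sum += iplOfInsertionOrder(perm);
    } while (std::next_permutation(perm.begin(), perm.end()));
    return sum;
}
```

Dividing `totalIplOverPermutations(4)` by 24 gives the average 29/6 from the slide.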

average ipl BST

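$I_n$: average internal path length of a BST with n nodes

insert a permutation of 1, ..., n into a BST ⇒ tree structure

we average over permutations

[Figure: BST with root 5, left subtree on the keys 1 to 4 (2; 1, 4; 3) and right subtree on the keys 6, 7. The permutation determines the left & right subtrees: after the root 5, the subsequence 2 4 1 3 builds the left subtree and the subsequence 6 7 builds the right subtree.]

any k can be root = first element, each with probability 1/n; the n − 1 other nodes each lie one level deeper than in their own subtree:

$I_n = (n-1) + \frac{1}{n}\sum_{k=1}^{n}(I_{k-1} + I_{n-k})$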

telescope!

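$I_n$: average internal path length, n nodes

so $I_n = (n-1) + 2(I_0 + I_1 + \cdots + I_{n-1})/n$

also $I_{n-1} = (n-2) + 2(I_0 + I_1 + \cdots + I_{n-2})/(n-1)$

subtract: $n I_n - (n-1) I_{n-1} = 2n - 2 + 2 I_{n-1}$

thus $n I_n = (n+1) I_{n-1} + 2n - 2$

$\frac{I_n}{n+1} = \frac{I_{n-1}}{n} + \frac{2}{n+1} - \frac{2}{n(n+1)}$

$\frac{I_{n-1}}{n} = \frac{I_{n-2}}{n-1} + \frac{2}{n} - \frac{2}{(n-1)n}$

...

$\frac{I_1}{2} = \frac{I_0}{1} + \frac{2}{2} - \frac{2}{1 \cdot 2}$

$\frac{I_n}{n+1} = \frac{I_0}{1} + O(\ln n) - \frac{2n}{n+1}$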
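The recurrence is easy to evaluate numerically. The sketch below also compares it with the exact form $I_n = 2(n+1)H_n - 4n$, where $H_n$ is the n-th harmonic number; this exact form is my own working-out of the telescoped sum with $I_0 = 0$ and is not stated on the slide, which only gives the $O(\ln n)$ term. The function names are hypothetical.

```cpp
#include <vector>

// Average internal path length I_n, computed directly from
// I_n = (n - 1) + 2 (I_0 + ... + I_{n-1}) / n with I_0 = 0.
double avgIplRecurrence(unsigned n) {
    std::vector<double> I(n + 1, 0.0);
    double prefix = 0.0;                  // I_0 + ... + I_{m-1}
    for (unsigned m = 1; m <= n; ++m) {
        I[m] = (m - 1) + 2.0 * prefix / m;
        prefix += I[m];
    }
    return I[n];
}

// Exact form obtained by summing the telescoping equations (assumption
// stated above): I_n = 2 (n + 1) H_n - 4 n.
double avgIplClosedForm(unsigned n) {
    double H = 0.0;                       // harmonic number H_n
    for (unsigned k = 1; k <= n; ++k) H += 1.0 / k;
    return 2.0 * (n + 1) * H - 4.0 * n;
}
```

For n = 4 both functions give 29/6, in agreement with the average over the 24 permutations of four keys.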

ADT Set and Dictionary

ADT Set

Initialize: construct an empty set.

IsEmpty: check whether the set is empty (∅, contains no elements).

Size: return the number of elements, the cardinality of the set.

IsElement(a): return whether a given object from the domain belongs to the set, a ∈ A.

Insert(a): add an element to the set (if it is not present), A ∪ {a}.

Delete(a): remove an element from the set (if it is present), A \ {a}.

Efficient implementation of ADT Set possible with BST
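For comparison, the C++ standard library's `std::set` offers the same operations. The standard does not prescribe the underlying data structure, but common implementations use a balanced BST (typically a red-black tree). The wrapper below is a sketch; the class name `IntSet` and its method names are hypothetical and simply mirror the ADT operations listed above.

```cpp
#include <cstddef>
#include <set>

// Hypothetical thin wrapper that names the ADT Set operations
// and forwards them to std::set<int>.
class IntSet {
public:
    IntSet() {}                                              // Initialize
    bool isEmpty() const { return items.empty(); }           // IsEmpty
    std::size_t size() const { return items.size(); }        // Size
    bool isElement(int a) const { return items.count(a) != 0; }  // IsElement(a)
    void insert(int a) { items.insert(a); }   // Insert(a): no effect if present
    void remove(int a) { items.erase(a); }    // Delete(a): no effect if absent
private:
    std::set<int> items;
};
```

Inserting a value twice leaves the size unchanged, matching the "if it is not present" condition of Insert(a).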

end.

