Post on 21-Dec-2015


Chapter 10 Search Structures

Instructors: C. Y. Tang and J. S. Roger Jang

All the material is integrated from the textbook "Fundamentals of Data Structures in C", with some supplements from the slides of Prof. Hsin-Hsi Chen (NTU).

Balance Matters

Binary search trees can be degenerate.

If you insert keys in sorted order using the insertion algorithm introduced before, you'll obtain a degenerate BST: every node has only one child, so the tree is effectively a linked list.

Searching takes O(n) time in such cases.
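The degenerate case can be observed directly. The following is a minimal Python sketch, not the textbook's code; the names BSTNode, bst_insert, bst_height, and build are hypothetical and chosen only for this illustration. It inserts keys with a plain, unbalanced BST insertion and measures the resulting height (counted in nodes).

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    """Plain (unbalanced) BST insertion, done iteratively."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                break
            node = node.left
        else:
            if node.right is None:
                node.right = BSTNode(key)
                break
            node = node.right
    return root

def bst_height(root):
    """Height counted in nodes, computed level by level (no recursion)."""
    height = 0
    level = [root] if root is not None else []
    while level:
        height += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return height

def build(keys):
    """Insert the keys one by one into an initially empty BST."""
    root = None
    for k in keys:
        root = bst_insert(root, k)
    return root

print(bst_height(build(range(1, 101))))        # sorted input: height 100
print(bst_height(build([4, 2, 6, 1, 3, 5, 7])))  # balanced order: height 3
```

With sorted input every key goes to the right of the previous one, so the height equals the number of keys and a search may have to visit all of them.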

Balanced Binary Search Trees

There are binary search trees that guarantee balance.

Balance factor of a node: (height of left subtree) – (height of right subtree)

An AVL tree is a binary search tree in which every node has a balance factor of -1, 0, or 1.
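The balance-factor definition can be checked mechanically. The sketch below is an illustration under stated assumptions: a tree is represented as None or a tuple (left, key, right), and the function names height, balance_factor, and is_avl_balanced are hypothetical. It checks only the balance condition, not the search-tree ordering of the keys.

```python
def height(node):
    """Height in nodes; node is None or a tuple (left, key, right)."""
    if node is None:
        return 0
    left, _, right = node
    return 1 + max(height(left), height(right))

def balance_factor(node):
    """(height of left subtree) - (height of right subtree)."""
    left, _, right = node
    return height(left) - height(right)

def is_avl_balanced(node):
    """True if every node has a balance factor of -1, 0, or 1."""
    if node is None:
        return True
    left, _, right = node
    return (abs(balance_factor(node)) <= 1
            and is_avl_balanced(left)
            and is_avl_balanced(right))

balanced = ((None, 1, None), 2, (None, 3, None))
chain = (None, 1, (None, 2, (None, 3, None)))
print(balance_factor(balanced), is_avl_balanced(balanced))  # 0 True
print(balance_factor(chain), is_avl_balanced(chain))        # -2 False
```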

AVL Trees

Balance is maintained by the insertion and deletion algorithms. Both take O(log n) time.

For example, if an insertion causes an imbalance, then some rotation is performed. For details please refer to your textbook.
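The slides defer the rotation details to the textbook. As a rough illustration only, the sketch below shows the simplest of the AVL rebalancing steps, a single right rotation, for the case where the left subtree of the left child has grown too tall. The tuple representation (left, key, right) and the name rotate_right are assumptions made for this example; the remaining rotation cases are not shown.

```python
def rotate_right(node):
    """Single right rotation on a (left, key, right) tuple tree.

    The left child becomes the new root of the subtree, and the old root
    becomes its right child. In-order key order is preserved.
    """
    (left_left, left_key, left_right), key, right = node
    return (left_left, left_key, (left_right, key, right))

# 3 has balance factor 2: its left subtree (2 -> 1) is two levels taller.
unbalanced = (((None, 1, None), 2, None), 3, None)
print(rotate_right(unbalanced))
# ((None, 1, None), 2, (None, 3, None))
```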

Comparing Some Structures

2-3 Trees

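Each internal node has a degree of 2 or 3. A 2-node holds one key and a 3-node holds two keys.

(Figure: keys less than 40 are in the left subtree; keys greater than 40 are in the right subtree.)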

External nodes are at the same level.

2-3 Trees

The number of elements is between 2^h - 1 and 3^h - 1, where h is the height of the tree.

So if there are n elements, the height is between log_3(n+1) and log_2(n+1).

Hence to search in a 2-3 tree, you need O(log n) time.
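As a quick numeric check of these bounds, the following Python sketch computes the element range for a given height and the height range for a given number of elements. The function names element_bounds and height_bounds are hypothetical, and integer arithmetic is used instead of floating-point logarithms to avoid rounding issues.

```python
def element_bounds(h):
    """Fewest and most elements in a 2-3 tree of height h.

    All 2-nodes give 2^h - 1 elements; all 3-nodes give 3^h - 1 elements.
    """
    return 2 ** h - 1, 3 ** h - 1

def height_bounds(n):
    """Range of heights h consistent with 2^h - 1 <= n <= 3^h - 1 (n >= 1)."""
    lowest = 0
    while 3 ** lowest - 1 < n:          # smallest h with 3^h - 1 >= n
        lowest += 1
    highest = 0
    while 2 ** (highest + 1) - 1 <= n:  # largest h with 2^h - 1 <= n
        highest += 1
    return lowest, highest

print(element_bounds(3))    # (7, 26)
print(height_bounds(8))     # (2, 3)
print(height_bounds(1000))  # (7, 9)
```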

Search in A 2-3 Tree

The search algorithm is similar to that for a BST. At each node with keys k1 | k2 (suppose we are searching for x):

Go to the left child if x < k1

Go to the middle child if x > k1 and x < k2

Go to the right child if x > k2
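The branching rule above can be sketched in Python as follows. This is an illustration, not the textbook's code: the class name Node23 and the list-based layout (keys holds one or two sorted keys, children is empty for a leaf) are assumptions made for this example.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                # one or two keys, sorted
        self.children = list(children or [])  # empty list for a leaf

def search23(node, x):
    """Return the node containing x, or None if x is not in the tree."""
    while node is not None:
        if x in node.keys:
            return node
        if not node.children:
            return None                       # reached a leaf without finding x
        # Count the keys smaller than x: 0 -> left, 1 -> middle, 2 -> right.
        i = 0
        while i < len(node.keys) and x > node.keys[i]:
            i += 1
        node = node.children[i]
    return None

# A small tree: root 40, left leaf 10 | 20, right leaf 80.
left, right = Node23([10, 20]), Node23([80])
root = Node23([40], [left, right])
print(search23(root, 20) is left)   # True
print(search23(root, 30))           # None
```

For a 2-node only the first comparison applies, so the same loop covers both node types.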

Insertion Into A 2-3 Tree

To insert 70: First we search for 70. It is not in the tree, and the leaf node encountered during the search is node C. Since there is only one key there, we directly insert 70 into node C.

Now we want to insert 30. The leaf we encounter is B.

B is full, so we must create a new node D. The keys involved in this operation are 10 and 20 (the elements in B) and 30 (the element to be inserted).

Largest (30): put into D. Smallest (10): remain in B. Median (20): insert into A, the parent of B.

Add a link from A to D.


Now we want to insert 60. We encounter leaf node C when we search for 60.

Node C is full, so we: Create node E to hold max{70, 80, 60} = 80. min{70,80,60} = 60 will remain in C. The median, 70, will be inserted into A.

But A is also full, so a new node F will be created. F has children C (which holds 60) and E (which holds 80).

(Figure: A = 20 | 40 has children B = 10, D = 30, C = 60, and the new node E = 80; 70 is to be inserted into A.)

But A is also full, so:

Create node F to hold max{20, 40, 70} = 70. F has children C and E.

min{20, 40, 70} = 20 will remain in A.

med{20, 40, 70} = 40 should be inserted into the parent of A. But A has no parent, so create G to hold 40. G has children A and F.


Split A 3-Node

Inserting y into a 3-node B causes a split.

(Figure: before the split, B = x | z is a child of A and has children C, D, and E; F is a node that does not have a parent yet. After the split, B holds min and a new node G holds max.)

min, max, and med are the minimum, maximum, and median of {x, y, z}, respectively.

med will be inserted into A.

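Split

Observe that this pattern repeats.

(Figure: before the split, p = x | z has children ch1(p), ch2(p), and ch3(p), and q is the node waiting to be linked into p. After the split, p holds min and a new node holds max. The new node becomes the next q, parent(p) becomes the next p, and med becomes the next y.)

q is initialized to be null. At that time p is a leaf.

The position to insert the link to q depends on the situation.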

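Split

Split is simpler when p is the root.

(Figure: before the split, the root p = x | z has children ch1(p), ch2(p), and ch3(p), and q is the node waiting to be linked into p. After the split, p holds min, a new node holds max, and a new root holds med with these two nodes as its children.)

The position to insert the link to q depends on the situation.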

Insertion Algorithm

We are to insert key y in tree t.

First, search for y in t. When you visit each node, push it onto a stack to facilitate finding parents later.

Assume that y is not in t (otherwise we need not insert).

Let p be the leaf node we encountered in the search. So, if we pop a node from the above stack, we'll obtain the parent of p (assume that p itself is not pushed onto the stack).

Insertion Algorithm

Initialize q to be null. If p is a 2-node, then simply insert y into p.

Put q immediately to the right of y. That is, if w is originally in p, then we have two cases:

If w < y: p becomes w | y, with child links nil, nil, q = nil.

If y < w: p becomes y | w, with child links nil, q = nil, nil.

And we're done!

Insertion Algorithm

If p is a 3-node, then split p.

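(Figure: before the split, the leaf p = x | z has child links nil, nil, nil, and q = nil. After the split, p holds min and a new node holds max, both with nil child links. The new node becomes the next q, parent(p) becomes the next p, and med becomes the next y.)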

Then, let p = parent(p), q be the new node holding max, and y = med. We’ll now consider the insertion of the new y into the new p.

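Insertion Algorithm

In the remaining process, if p is a 2-node, then simply insert y into p, and update the links as follows (a and b are the original children of p):

If w < y: p becomes w | y, with child links a, b, q.

If y < w: p becomes y | w, with child links a, q, b.

And we're done!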

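Insertion Algorithm

If p is a 3-node, then split. Then we'll continue to insert the new y into the new p.

(Figure: before the split, p = x | z has children ch1(p), ch2(p), and ch3(p), and q is the node waiting to be linked into p. After the split, p holds min and a new node holds max. The new node becomes the next q, parent(p) becomes the next p, and med becomes the next y.)

The position to insert the link to q depends on the situation.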

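Insertion Algorithm

If p (a 3-node) is the root, then the split is done in the manner stated before. We're done after this.

(Figure: before the split, the root p = x | z has children ch1(p), ch2(p), and ch3(p), and q is the node waiting to be linked into p. After the split, p holds min, a new node holds max, and a new root holds med with these two nodes as its children.)

The position to insert the link to q depends on the situation.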
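The insertion steps described in these slides can be put together as one loop. The following Python sketch is one possible reading of the algorithm, not the textbook's implementation. The class Node23, the function insert23, and the list-based node layout are assumptions made for this example. The variables p, q, and y play the roles they have in the slides, and the stack holds the internal nodes visited during the search.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                # one or two keys, sorted
        self.children = list(children or [])  # empty list for a leaf

def insert23(root, y):
    """Insert key y and return the (possibly new) root."""
    if root is None:
        return Node23([y])
    # Search for y, pushing every visited internal node onto a stack.
    stack, p = [], root
    while True:
        if y in p.keys:
            return root                       # y already present: nothing to do
        if not p.children:
            break                             # p is the leaf reached by the search
        stack.append(p)
        p = p.children[sum(1 for k in p.keys if y > k)]
    q = None                                  # node to link immediately right of y
    while True:
        i = sum(1 for k in p.keys if y > k)   # position of y among p's keys
        p.keys.insert(i, y)
        if p.children:
            p.children.insert(i + 1, q)       # q goes immediately to the right of y
        if len(p.keys) <= 2:
            return root                       # p was a 2-node: done
        # p was a 3-node: split it around the median.
        smallest, med, largest = p.keys
        q = Node23([largest], p.children[2:]) # new node holds max
        p.keys, p.children = [smallest], p.children[:2]
        y = med                               # med is inserted one level up
        if not stack:
            return Node23([med], [p, q])      # p was the root: create a new root
        p = stack.pop()

# The example from the slides: A = 40, B = 10 | 20, C = 80.
root = Node23([40], [Node23([10, 20]), Node23([80])])
for key in (70, 30, 60):
    root = insert23(root, key)
print(root.keys, [c.keys for c in root.children])
# [40] [[20], [70]]
```

Each pass through the second loop handles one level of the tree, which matches the O(1) work per level assumed in the complexity analysis.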

Correctness of Insertion

Note that all keys in part B, including y and the keys in q, lie between u and v.

Because we followed the middle link of parent(p) when we did the search in the example below, the input key (to be inserted) falls between u and v.

Besides the (input) key to insert, all keys in B were originally there and fall between u and v.

(Figure: parent(p) = u | v has subtrees A, B, and C. p = ? | ? is the root of the middle subtree B, with children ch1(p), ch2(p), and ch3(p) and the pending node q; y is to be inserted into p.)

Correctness of Insertion

So the global relationship is OK. As to the local relationship among the keys, the insertion actions clearly maintain this property.

(Figure: parent(p) = u | v; its child p = w | y has child links ch1(p), ch2(p), and q.)

Correctness of Insertion

You should use induction as well as these observations to give a more rigorous proof.

p and q are always 2-3 trees after each iteration.

Time Complexity of Insertion

At each level, the algorithm takes O(1) time.

There are O(log n) levels. So insertion takes O(log n) time.

Deletion From A 2-3 Tree

Deletion of any element can be transformed into deletion of a leaf element.

To delete 50, we replace 50 by 60 or 20. Then delete correspondingly the leaf element 60 or 20. 60 is the leftmost leaf element in the right subtree of 50. 20 is the rightmost leaf element in the left subtree of 50.

(Figure: a tree with root 20 | 80 and leaves 10, 60 | 70, and 90 | 95.)

Use the algorithm presented later to delete 20 in the leaf.

Deletion From A 2-3 Tree

Delete 70 (in C). This case is straightforward, as the resulting C is non-empty.

Deletion From A 2-3 Tree

Delete 90 (in D). This is also simple; a shift of 95 in D suffices.

Deletion From A 2-3 Tree

Delete 60 (in C). C becomes empty. The left sibling of C is a 3-node. Hence (rotation): Move 50 from A to C. Move 20 from B to A.

Deletion From A 2-3 Tree

Delete 95 (from D): D becomes empty. Its left sibling C is a 2-node, hence (combine): Move 80 from A to C. Delete D.

Deletion From A 2-3 Tree

Delete 50 (in C). Simply shift.

Deletion From A 2-3 Tree

Delete 10 (in B): B becomes empty. The right sibling of B is a 2-node, hence (combine): Move 20 from A to B. Move 80 from C to B.

The parent A, which is also the root, is empty. Hence simply let B be the new root.

Rotation and Combine

When a deletion in node p leaves p empty, then:

Let r be the parent of p.

If p is the left child of r, then let q be the right sibling of p. Otherwise, let q be the left sibling of p.

If q is a 3-node, then perform a rotation.

If q is a 2-node, then perform a combine.
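The choice of sibling and repair step can be written as a small function. This sketch follows the worked deletion examples earlier in the slides, where a 3-node sibling leads to a rotation and a 2-node sibling leads to a combine. The function name choose_repair and the representation of r's children as a list of key counts are assumptions made only for this illustration.

```python
def choose_repair(child_sizes, i):
    """Pick the sibling q and the repair step for an empty child p.

    child_sizes[k] is the number of keys in the k-th child of r,
    and i is the index of the empty child p.
    Returns (index of q, "rotation" or "combine").
    """
    q = 1 if i == 0 else i - 1          # right sibling only when p is the left child
    action = "rotation" if child_sizes[q] == 2 else "combine"
    return q, action

# Deleting 60 from C: the children of A hold 2, 0, and 1 keys.
print(choose_repair([2, 0, 1], 1))   # (0, 'rotation')
# Deleting 95 from D: the children of A hold 1, 1, and 0 keys.
print(choose_repair([1, 1, 0], 2))   # (1, 'combine')
```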

Rotation

If p is the left child of r:

("?" means don't care.) Observe the correctness.

Rotation

If p is the middle child of r.

Rotation

If p is the right child of r.

Combine

If p is the left child of r:

Case 1: If r is a 2-node. r becomes empty, so we set p to be r, and continue to consider whether to rotate or combine the new p. If r is the root, then instead let the combined node p become the new root.

Combine

If p is the left child of r. Case 2: If r is a 3-node.

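Combine

If p is the middle child of r:

Case 1: If r is a 2-node. Continue to handle the empty r as before.

(Figure: before, r = w has left child q = y with subtrees a and b, and p is empty with subtree c. After, a single node y | w has subtrees a, b, and c, and r is empty.)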

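Combine

If p is the middle child of r:

Case 2: If r is a 3-node.

(Figure: before, r = w | x has left child q = y with subtrees a and b, middle child p, which is empty, with subtree c, and right child d. After, r = x has a left child y | w with subtrees a, b, and c, and still has child d.)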

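Combine

If p is the right child of r:

(Figure: before, r = w | x has left child a, middle child q = y with subtrees b and c, and right child p, which is empty, with subtree d. After, r = w has left child a and a second child y | x with subtrees b, c, and d.)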

Correctness of Deletion

Observe that, if a combine results in a new empty node, then that node must have the following appearance: an empty node r with exactly one child link (r with one tail).

r will become p in the next iteration. In the (left-hand side) pictures we've seen, p has the above appearance, so the same rotation and combine steps are applicable.

Correctness of Deletion

We begin with p being a leaf. At that time, the children of p are all null. So rotation/combine as illustrated in the previous figures are also applicable.

Correctness of other parts should be clear.

Time Complexity of Deletion

At each level: O(1) time. Rotation/combine need O(1) time.

#levels: O(log n). Total: O(log n) time.
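The deletion procedure described in these slides can be sketched end to end as follows. This is an illustrative reading under stated assumptions, not the textbook's code: Node23, rebalance, and delete23 are hypothetical names, nodes store their keys and children in Python lists, and an internal key is always replaced by the leftmost leaf element of the subtree to its right.

```python
class Node23:
    def __init__(self, keys, children=None):
        self.keys = list(keys)                # zero keys only while being repaired
        self.children = list(children or [])  # empty list for a leaf

def rebalance(r, i):
    """Repair the empty child p = r.children[i] by rotation or combine."""
    p = r.children[i]
    if i == 0:                                # p is the left child: q is its right sibling
        q = r.children[1]
        if len(q.keys) == 2:                  # rotation
            p.keys.append(r.keys[0])
            r.keys[0] = q.keys.pop(0)
            if q.children:
                p.children.append(q.children.pop(0))
        else:                                 # combine into p, drop q
            p.keys = [r.keys.pop(0), q.keys[0]]
            p.children += q.children
            r.children.pop(1)
    else:                                     # otherwise q is the left sibling
        q = r.children[i - 1]
        if len(q.keys) == 2:                  # rotation
            p.keys.append(r.keys[i - 1])
            r.keys[i - 1] = q.keys.pop()
            if q.children:
                p.children.insert(0, q.children.pop())
        else:                                 # combine into q, drop p
            q.keys.append(r.keys.pop(i - 1))
            q.children += p.children
            r.children.pop(i)

def delete23(root, x):
    """Delete key x and return the (possibly new) root."""
    path, node = [], root                     # path holds (parent, child index) pairs
    while node is not None and x not in node.keys:
        if not node.children:
            return root                       # x is not in the tree
        i = sum(1 for k in node.keys if x > k)
        path.append((node, i))
        node = node.children[i]
    if node is None:
        return None
    j = node.keys.index(x)
    if node.children:                         # internal key: swap in a leaf element
        path.append((node, j + 1))
        leaf = node.children[j + 1]
        while leaf.children:
            path.append((leaf, 0))
            leaf = leaf.children[0]
        node.keys[j] = leaf.keys[0]
        node, j = leaf, 0
    node.keys.pop(j)
    p = node
    while not p.keys:                         # p is empty: repair, moving upward
        if not path:                          # the root is empty
            return p.children[0] if p.children else None
        r, i = path.pop()
        rebalance(r, i)
        p = r
    return root

# The deletion sequence from the slides.
root = Node23([50, 80], [Node23([10, 20]), Node23([60, 70]), Node23([90, 95])])
for key in (70, 90, 60, 95, 50, 10):
    root = delete23(root, key)
print(root.keys, root.children)   # [20, 80] []
```

The repair loop visits each level at most once, in line with the O(log n) bound above.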