CSE 326: Data Structures
Splay Trees

James Fogarty
Autumn 2007

Lecture 10

AVL Trees Revisited

• Balance condition: left and right subtrees of every node have heights differing by at most 1
– Strong enough: worst-case depth is O(log n)
– Easy to maintain: one single or double rotation

• Guaranteed O(log n) running time for
– Find?
– Insert?
– Delete?
– buildTree? Θ(n log n)
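The balance condition can be checked directly. A minimal sketch, assuming a hypothetical `Node` class with `left` and `right` fields (the names are not from the lecture) and the common convention that an empty tree has height -1:

```python
class Node:
    """Hypothetical BST node; field names are assumptions for illustration."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def check_avl(node):
    """Return (height, balanced) for the subtree rooted at node."""
    if node is None:
        return -1, True                      # empty tree: height -1 by convention
    lh, l_ok = check_avl(node.left)
    rh, r_ok = check_avl(node.right)
    balanced = l_ok and r_ok and abs(lh - rh) <= 1
    return 1 + max(lh, rh), balanced


balanced_tree = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))   # heights differ by 2 at the root
```

A real AVL tree stores the height in each node instead of recomputing it, which is the "extra info" the later slides refer to.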

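Single and Double Rotations

[Figure: an AVL single rotation (nodes a, b; subtrees X, Y, Z) and a double rotation (nodes a, b, c; subtrees W, X, Y, Z), with subtree heights labeled h and h-1.]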
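The two rotations can be sketched as follows. This is an illustrative version without parent pointers or height bookkeeping; the function names `rotate_right`, `rotate_left`, and `double_rotate_left_right` are assumptions, not names from the lecture.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(a):
    """Single rotation: lift a's left child b above a; return the new subtree root."""
    b = a.left
    a.left, b.right = b.right, a
    return b


def rotate_left(a):
    """Mirror image of rotate_right."""
    b = a.right
    a.right, b.left = b.left, a
    return b


def double_rotate_left_right(a):
    """Double rotation for the left-right case, built from two single rotations."""
    a.left = rotate_left(a.left)
    return rotate_right(a)


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


# Left-left case: the chain 3 -> 2 -> 1 is fixed by one single rotation.
single = rotate_right(Node(3, Node(2, Node(1))))
# Left-right case: 3 -> 1 -> 2 needs the double rotation.
double = double_rotate_left_right(Node(3, Node(1, None, Node(2))))
```

Both results have 2 at the root with 1 and 3 as children, and the in-order key sequence is unchanged.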


AVL Trees Revisited

• What extra info did we maintain in each node?

• Where were rotations performed?

• How did we locate this node?

Other Possibilities?

• Could use different balance conditions, different ways to maintain balance, different guarantees on running time, …

• Why aren’t AVL trees perfect?
Extra info, complex logic to detect imbalance, recursive bottom-up implementation

• Many other balanced BST data structures
– Red-Black trees
– AA trees
– Splay Trees
– 2-3 Trees
– B-Trees
– …

Splay Trees

• Blind adjusting version of AVL trees
– Why worry about balances? Just rotate anyway!

• Amortized time per operation is O(log n)
• Worst-case time per operation is O(n)
– But guaranteed to happen rarely

Insert/Find always rotate node to the root!

SAT/GRE analogy question: AVL is to Splay trees as ___________ is to __________

Leftist heap : Skew heap

Recall: Amortized Complexity

If a sequence of M operations takes O(M f(n)) time, we say the amortized runtime is O(f(n)).

Amortized complexity is a worst-case guarantee over sequences of operations.

• Worst-case time per operation can still be large, say O(n)

• Worst-case time for any sequence of M operations is O(M f(n))

Average time per operation for any sequence is O(f(n))

Recall: Amortized Complexity

• Is amortized guarantee any weaker than worst case?
Yes, it is only for sequences

• Is amortized guarantee any stronger than average case?
Yes, guarantees no bad sequences

• Is average case guarantee good enough in practice?
No, adversarial input, bad day, …

• Is amortized guarantee good enough in practice?
Yes, again, no bad sequences

The Splay Tree Idea

[Figure: a BST containing the keys 17, 10, 9, 2, 5, and 3, illustrating a deep access.]

If you’re forced to make a really deep access:

Since you’re down there anyway, fix up a lot of deep nodes!

Find/Insert in Splay Trees

1. Find or insert a node k
2. Splay k to the root using: zig-zag, zig-zig, or plain old zig rotation

Why could this be good??

1. Helps the new root, k
o Great if k is accessed again

2. And helps many others!
o Great if many others on the path are accessed

Splaying node k to the root: Need to be careful!

One option (that we won’t use) is to repeatedly use AVL single rotation until k becomes the root (see Section 4.5.1 for details):

[Figure: a tree with nodes p, q, r, s, k and subtrees A through F, shown before and after rotating k upward with a single rotation.]

Splaying node k to the root: Need to be careful!

What’s bad about this process?

[Figure: the same single-rotation example, with nodes p, q, r, s, k and subtrees A through F.]

r is pushed almost as low as k was
Bad seq: find(k), find(r), find(…), …
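The bad sequence can be reproduced with a small experiment. The sketch below is an assumption-laden illustration (the names `rotate_up`, `right_chain`, and `naive_find` are hypothetical): it builds a chain of n right children and then accesses the keys from deepest to shallowest, moving each accessed node to the root by single rotations only.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                      # x is a left child: rotate right
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                # x is a right child: rotate left
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:                    # reattach x where p used to hang
        if g.left is p:
            g.left = x
        else:
            g.right = x


def right_chain(n):
    """Build the chain 1 -> 2 -> ... -> n of right children; return all nodes."""
    nodes = [Node(k) for k in range(1, n + 1)]
    for a, b in zip(nodes, nodes[1:]):
        a.right, b.parent = b, a
    return nodes


def depth(x):
    d = 0
    while x.parent is not None:
        x, d = x.parent, d + 1
    return d


def naive_find(x):
    """Move x to the root by single rotations only; return nodes on the access path."""
    cost = depth(x) + 1
    while x.parent is not None:
        rotate_up(x)
    return cost


n = 100
nodes = right_chain(n)
# Access the keys n, n-1, ..., 1: each access leaves the next key deep again.
costs = [naive_find(x) for x in reversed(nodes)]
total = sum(costs)
```

With this strategy the second access costs as much as the first, and the whole sequence of n accesses touches on the order of n² nodes, so the amortized cost per access is linear rather than logarithmic.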

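Splay: Zig-Zag*

[Figure: zig-zag. Before: g is the grandparent, p its child, and k the inside grandchild, with subtrees W, X, Y, Z. After: k is the root of the subtree, with g and p as its children.]

*Just like an… AVL double rotation

Helps those in blue
Hurts those in red

Which nodes improve depth?
k and its original children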
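A zig-zag step can be sketched as two rotations of k. The code below is illustrative only: the `Node` class with parent pointers and the names `rotate_up` and `zig_zag` are assumptions, and the integer keys stand in for the nodes g, p, k and their subtrees.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def zig_zag(k):
    """k is an inside grandchild: rotate k above p, then k above g."""
    rotate_up(k)
    rotate_up(k)


# g = 10 with right child p = 30; k = 20 is p's left child (the zig-zag shape).
k = Node(20, Node(15), Node(25))
p = Node(30, k, Node(35))
g = Node(10, Node(5), p)
zig_zag(k)
```

Afterwards k is the subtree root with g and p as its children, and k's original children (15 and 25) have each moved up one level.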

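Splay: Zig-Zig*

[Figure: zig-zig. Before: g is the grandparent, p its child, and k the outside grandchild, with subtrees W, X, Y, Z. After: k is the root of the subtree, p is its child, and g is p’s child.]

*Is this just two AVL single rotations in a row?
Not quite – we rotate g and p, then p and k

Why does this help?
Same number of nodes helped as hurt. But later rotations help the whole subtree.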
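The order of the two rotations is what distinguishes zig-zig from rotating k upward twice. A small sketch under assumed names (`Node`, `rotate_up`, `build` are hypothetical) that applies both orders to the same left-leaning shape and compares the results:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def build():
    """g = 30, p = 20 (left child of g), k = 10 (left child of p), plus leaf subtrees."""
    k = Node(10, Node(5), Node(15))
    p = Node(20, k, Node(25))
    g = Node(30, p, Node(35))
    return g, p, k


# Zig-zig: rotate p above g first, then k above p.
g, p, k = build()
rotate_up(p)
rotate_up(k)
zig_zig_spine = [k.key, k.right.key, k.right.right.key]

# Two plain single rotations of k instead.
g2, p2, k2 = build()
rotate_up(k2)
rotate_up(k2)
naive = (k2.right.key, k2.right.left.key)
```

In the zig-zig result the old path g, p, k is reversed into k, p, g. In the naive result g sits directly under k and p ends up below g, so the nodes on the access path stay deep.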

Special Case for Root: Zig

[Figure: zig. Before: p is the root and k is its child, with subtrees X, Y, Z. After: k is the root and p is its child.]

Relative depth of p, Y, Z?
Down 1 level

Relative depth of everyone else?
Much better

Why not drop zig-zig and just zig all the way?
Zig only helps one child!
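Putting the three cases together gives the bottom-up splay loop. This is a minimal sketch assuming nodes with parent pointers; `splay` here also records which case it applied at each step, which is an addition for illustration and not part of the lecture.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Splay x to the root; return the list of cases applied."""
    steps = []
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:                              # parent is the root
            rotate_up(x)
            steps.append("zig")
        elif (g.left is p) == (p.left is x):       # same direction twice
            rotate_up(p)
            rotate_up(x)
            steps.append("zig-zig")
        else:                                      # opposite directions
            rotate_up(x)
            rotate_up(x)
            steps.append("zig-zag")
    return steps


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


n35 = Node(35)
root = Node(50, Node(30, Node(10), Node(40, n35)), Node(70))
steps = splay(n35)
```

Here 35 starts at depth 3, so the splay needs one zig-zag followed by one zig, and the in-order key sequence is the same before and after.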

Splaying Example: Find(6)

[Figure: Find(6) on the chain 1, 2, 3, 4, 5, 6, with 6 deepest. A zig-zig moves 6 above 5 and 4.]

Think of the starting tree as if created by inserting 6, 5, 4, 3, 2, 1 – each took constant time – a LOT of savings so far to amortize those bad accesses over

Still Splaying 6

[Figure: a second zig-zig moves 6 above 3 and 2, leaving 1 as the root with 6 as its child.]

Finally…

[Figure: a final zig makes 6 the root, with 1 as its child.]
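The Find(6) trace can be replayed in code. The sketch below assumes the starting tree is the chain of right children 1 through 6 (the shape produced by inserting 6, 5, 4, 3, 2, 1 into a splay tree); the helper names are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Splay x to the root; return the list of cases applied."""
    steps = []
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)
            steps.append("zig")
        elif (g.left is p) == (p.left is x):
            rotate_up(p)
            rotate_up(x)
            steps.append("zig-zig")
        else:
            rotate_up(x)
            rotate_up(x)
            steps.append("zig-zag")
    return steps


def shape(n):
    """Nested (key, left, right) tuples describing the tree."""
    return None if n is None else (n.key, shape(n.left), shape(n.right))


# Chain 1 -> 2 -> ... -> 6 of right children.
nodes = [Node(k) for k in range(1, 7)]
for a, b in zip(nodes, nodes[1:]):
    a.right, b.parent = b, a

steps = splay(nodes[5])          # Find(6): splay the deepest node
result = shape(nodes[5])
```

The recorded steps are zig-zig, zig-zig, zig, and the final tree has 6 at the root, 1 as its left child, 3 below 1, then 2 and 5 as the children of 3, and 4 below 5.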

Another Splay: Find(4)

[Figure: Find(4) on the tree rooted at 6; a zig-zag moves 4 above 5 and 3.]

Example Splayed Out

[Figure: a second zig-zag makes 4 the root, with 1 and 6 as its children.]
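Find(4) can be replayed the same way. This sketch builds the tree left by the earlier Find(6) by hand (6 at the root, 1 as its left child, 3 below 1, 2 and 5 below 3, 4 below 5) and splays 4; the helper names are assumptions.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Splay x to the root; return the list of cases applied."""
    steps = []
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)
            steps.append("zig")
        elif (g.left is p) == (p.left is x):
            rotate_up(p)
            rotate_up(x)
            steps.append("zig-zig")
        else:
            rotate_up(x)
            rotate_up(x)
            steps.append("zig-zag")
    return steps


def shape(n):
    """Nested (key, left, right) tuples describing the tree."""
    return None if n is None else (n.key, shape(n.left), shape(n.right))


def height(n):
    return -1 if n is None else 1 + max(height(n.left), height(n.right))


n4 = Node(4)
root = Node(6, Node(1, None, Node(3, Node(2), Node(5, n4))))
height_before = height(root)
steps = splay(n4)                # Find(4)
result = shape(n4)
height_after = height(n4)
```

Two zig-zag steps bring 4 to the root, and the height drops from 4 to 3, so the tree is noticeably bushier than the chain the example began with.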

But Wait…

What happened here?

Didn’t two find operations take linear time instead of logarithmic?

What about the amortized O(log n) guarantee?

That still holds, though we must take into account the previous steps used to create this tree. A chain like the example we started with arises, for instance, from inserting 6, 5, 4, 3, 2, 1, each in constant time, and those cheap operations offset the expensive finds.


Why Splaying Helps

• If a node n on the access path is at depth d before the splay, it’s at about depth d/2 after the splay

• Overall, nodes which are low on the access path tend to move closer to the root

• Splaying gets amortized O(log n) performance. (Maybe not now, but soon, and for the rest of the operations.)
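The halving effect can be measured on a worst-case input. The sketch below (hypothetical helper names, same bottom-up splay as in the zig, zig-zig, zig-zag slides) builds a chain of 64 right children, splays the deepest node, and compares every node's depth before and after.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                     # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                     # zig-zig
            rotate_up(x)
        else:
            rotate_up(x)                     # zig-zag
            rotate_up(x)


def depth(x):
    d = 0
    while x.parent is not None:
        x, d = x.parent, d + 1
    return d


n = 64
nodes = [Node(k) for k in range(1, n + 1)]
for a, b in zip(nodes, nodes[1:]):           # chain 1 -> 2 -> ... -> n
    a.right, b.parent = b, a

before = {x.key: depth(x) for x in nodes}
splay(nodes[-1])
after = {x.key: depth(x) for x in nodes}
height_before, height_after = max(before.values()), max(after.values())
```

On this input the height falls from 63 to 33, and every node on the access path ends up at no more than half its old depth plus a small constant, which is the sense of "about d/2".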

Practical Benefit of Splaying

• No heights to maintain, no imbalance to check for
– Less storage per node, easier to code

• Data accessed once is often soon accessed again
– Splaying does implicit caching by bringing it to the root

Splay Operations: Find

• Find the node in normal BST manner
• Splay the node to the root
– if node not found, splay what would have been its parent

What if we didn’t splay?

Amortized guarantee fails!
Bad sequence: find(leaf k), find(k), find(k), …
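A sketch of find under these rules, including the miss case that splays the last node visited. The function name `find` and its `(new_root, found)` return value are assumptions made for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                     # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                     # zig-zig
            rotate_up(x)
        else:
            rotate_up(x)                     # zig-zag
            rotate_up(x)


def find(root, key):
    """Return (new_root, found); splay the hit, or the last node visited on a miss."""
    node, last = root, None
    while node is not None and node.key != key:
        last = node
        node = node.left if key < node.key else node.right
    target = node if node is not None else last
    if target is None:                       # empty tree
        return None, False
    splay(target)
    return target, node is not None


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
root1, hit = find(root, 5)        # present: 5 becomes the root
root2, found8 = find(root1, 8)    # absent: 7, the would-be parent, becomes the root
```

Because the root changes on every call, the caller has to keep the returned root; a class wrapper holding the root would hide that detail.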

Splay Operations: Insert

• Insert the node in normal BST manner
• Splay the node to the root

What if we didn’t splay?

Amortized guarantee fails!
Bad sequence: insert(k), find(k), find(k), …
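Insert follows the same pattern: a plain BST insert followed by a splay of the new node. In this sketch the name `insert` and the choice to splay an existing node on a duplicate key are assumptions, since the slide does not say how duplicates are handled.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                     # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                     # zig-zig
            rotate_up(x)
        else:
            rotate_up(x)                     # zig-zag
            rotate_up(x)


def insert(root, key):
    """Insert key as in a plain BST, then splay the new node; return the new root."""
    parent, node = None, root
    while node is not None:
        if key == node.key:                  # duplicate (assumed policy): splay it
            splay(node)
            return node
        parent, node = node, (node.left if key < node.key else node.right)
    new = Node(key)
    new.parent = parent
    if parent is not None:
        if key < parent.key:
            parent.left = new
        else:
            parent.right = new
    splay(new)
    return new


root = None
for key in [6, 5, 4, 3, 2, 1]:
    root = insert(root, key)

# Walk the right spine, recording each key and whether its left child is empty.
spine = []
node = root
while node is not None:
    spine.append((node.key, node.left is None))
    node = node.right
```

Inserting the keys in decreasing order needs only one zig per insert and produces the chain 1, 2, ..., 6 of right children, the starting tree of the Find(6) example.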

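Splay Operations: Remove

[Figure: find(k) splays k to the root, with left subtree L (keys < k) and right subtree R (keys > k); deleting k leaves the two trees L and R.]

Now what?

Everything else splayed, so we’d better do that for remove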

Join

Join(L, R): given two trees such that (stuff in L) < (stuff in R), merge them:

Splay on the maximum element in L, then attach R

[Figure: splaying the max of L makes it the root of L with no right child; R is then attached as its right subtree.]

Similar to BST delete – find max = find element with no right child

Does this work to join any two trees?
No, need L < R
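Join can be sketched in a few lines on top of splay. The name `join` and the convention that either argument may be an empty tree (`None`) are assumptions for this illustration; the precondition that every key in L is smaller than every key in R is the caller's responsibility.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                     # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                     # zig-zig
            rotate_up(x)
        else:
            rotate_up(x)                     # zig-zag
            rotate_up(x)


def join(left, right):
    """Merge two trees, assuming every key in left < every key in right."""
    if left is None:
        return right
    m = left
    while m.right is not None:               # max = node with no right child
        m = m.right
    splay(m)                                 # m is now the root of left
    m.right = right                          # its right slot is free, so attach
    if right is not None:
        right.parent = m
    return m


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


L = Node(2, Node(1), Node(3))
R = Node(6, Node(5), Node(7))
root = join(L, R)
```

After the splay the maximum of L has no right child, which is why R can be attached without any further restructuring.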

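Delete Example

[Figure: Delete(4) on a tree with keys 1, 2, 4, 6, 7, 9 and root 6. find(4) splays 4 to the root; removing 4 leaves a left tree with keys 1, 2 and a right tree with keys 6, 7, 9; Find max splays 2 to the root of the left tree; attaching the right tree gives the final tree rooted at 2.]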
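The whole delete can be sketched by combining find, split at the root, and join. The starting tree below is an assumed reading of the slide's figure (6 at the root, 1 and 9 as its children, 4 as the right child of 1, 2 as the left child of 4, 7 as the left child of 9), and the names `delete`, `join`, and `shape` are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Single rotation that lifts x above its parent, preserving BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                     # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                     # zig-zig
            rotate_up(x)
        else:
            rotate_up(x)                     # zig-zag
            rotate_up(x)
    return x


def join(left, right):
    """Merge two trees, assuming every key in left < every key in right."""
    if left is None:
        return right
    m = left
    while m.right is not None:
        m = m.right
    splay(m)
    m.right = right
    if right is not None:
        right.parent = m
    return m


def delete(root, key):
    """Splay key to the root, drop it, and join its two subtrees; return the new root."""
    node, last = root, None
    while node is not None and node.key != key:
        last = node
        node = node.left if key < node.key else node.right
    if node is None:                         # miss: still splay the last node visited
        return splay(last) if last is not None else None
    splay(node)
    left, right = node.left, node.right
    for sub in (left, right):                # detach both subtrees from the old root
        if sub is not None:
            sub.parent = None
    return join(left, right)


def shape(n):
    return None if n is None else (n.key, shape(n.left), shape(n.right))


root = Node(6, Node(1, None, Node(4, Node(2))), Node(9, Node(7)))
root = delete(root, 4)
result = shape(root)
```

The result has 2 at the root with 1 on its left and the subtree 6, 9, 7 on its right, matching the last tree of the figure under the assumed starting shape.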

Splay Tree Summary

• All operations are in amortized O(log n) time

• Splaying can be done top-down; this may be better because:
– only one pass
– no recursion or parent pointers necessary
– we didn’t cover top-down in class

• Splay trees are very effective search trees
– Relatively simple
– No extra fields required
– Excellent locality properties: frequently accessed keys are cheap to find

What happens to nodes that never get accessed? (tend to drop to the bottom)

Like what? Skew heaps! (don’t need to wait)

Splay E

[Figure: exercise tree with nodes A through I; splay E.]

Splay E

[Figure: the tree with nodes A through I at later stages of splaying E.]