
Index tuning-- B+tree. overview © Dennis Shasha, Philippe Bonnet 2001 B+-Tree Locking Tree...

Date post: 29-Dec-2015

Index tuning: B+tree

overview

© Dennis Shasha, Philippe Bonnet 2001

B+-Tree Locking

• Tree Traversal
  – Update, Read
  – Insert, Delete

• phantom problem: need for range locking

• ARIES KVL (implemented in DB2)
• Tree Traversal (next page)
• Lock on tuples
• Lock on key values
• Range locking:
  – Next key lock
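The next-key idea behind range locking can be illustrated with a toy index. This is a minimal sketch, not the ARIES KVL protocol itself: the class `NextKeyLockIndex` and its methods are hypothetical names, only shared locks taken by range scans are modelled, and a blocked insert simply returns `False` instead of waiting.

```python
import bisect

END_OF_INDEX = float("inf")  # sentinel "key" that follows the largest real key


class NextKeyLockIndex:
    """Toy sorted index with next-key locks (illustrative only)."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.shared = {}  # key -> set of transaction ids holding a shared lock

    def _next_key(self, k):
        # Smallest existing key strictly greater than k, or the sentinel.
        i = bisect.bisect_right(self.keys, k)
        return self.keys[i] if i < len(self.keys) else END_OF_INDEX

    def range_scan(self, txn, lo, hi):
        # Lock every key found in [lo, hi] plus the next key above hi,
        # so the gaps inside the range are protected as well.
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        found = self.keys[i:j]
        for k in found + [self._next_key(hi)]:
            self.shared.setdefault(k, set()).add(txn)
        return found

    def try_insert(self, txn, k):
        # An insert must be compatible with locks on the next key above k.
        holders = self.shared.get(self._next_key(k), set()) - {txn}
        if holders:
            return False  # would create a phantom for a concurrent scan
        bisect.insort(self.keys, k)
        return True


index = NextKeyLockIndex([5, 12, 18, 25, 30])
print(index.range_scan("T1", 10, 20))   # T1 reads the range [10, 20]
print(index.try_insert("T2", 15))       # blocked: 18 is locked by T1
print(index.try_insert("T2", 27))       # allowed: 30 is not locked
```

Because the lock sits on an existing key rather than on the exact range boundaries, some inserts just outside the scanned range (22 in this example) are blocked too; that is the price of locking key values instead of predicates.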

B+-Tree Locking

[Figure: tree with nodes A, B, C, D, E and F; three of the nodes carry the label "T1 lock".]

Bulk Loading of a B+ Tree

• If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow.

• Bulk Loading can be done much more efficiently.

• Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page.

Bulk Loading (Contd.)

• Add <low key value on page, pointer to page> to the root page

Bulk Loading (Contd.)

• Split the root and create a new root page.

Bulk Loading (Contd.)

• Index entries for leaf pages always entered into rightmost index page just above leaf level. When this fills up, it splits. (Split may go up right-most path to the root.)

• Much faster than repeated inserts, especially when one considers locking!
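The bulk-loading steps can be sketched in a few lines. This is a simplified variant, assuming in-memory pages and hypothetical names (`Page`, `bulk_load`, `search`): instead of adding entries to the rightmost index page and splitting it when it fills, it builds each index level in one pass over the level below. The last page of a level may end up underfull.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Page:
    low_key: object                                # smallest key reachable from this page
    keys: list                                     # data keys (leaf) or children's low keys (index)
    children: list = field(default_factory=list)   # empty for leaf pages


def chunks(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]


def bulk_load(keys, leaf_capacity=4, fanout=4):
    """Build a B+ tree bottom-up from keys; return (root, height)."""
    data = sorted(keys)                            # step 1: sort all data entries
    if not data:
        raise ValueError("nothing to load")
    # Step 2: pack the sorted entries into leaf pages.
    level = [Page(low_key=c[0], keys=c) for c in chunks(data, leaf_capacity)]
    height = 1
    # Step 3: add <low key, pointer> entries to index pages, one level at a time.
    while len(level) > 1:
        level = [
            Page(low_key=group[0].low_key,
                 keys=[p.low_key for p in group],
                 children=group)
            for group in chunks(level, fanout)
        ]
        height += 1
    return level[0], height


def search(root, key):
    """Descend from the root to a leaf and report whether key is present."""
    page = root
    while page.children:
        # Follow the last child whose low key is <= key.
        idx = max(bisect.bisect_right(page.keys, key) - 1, 0)
        page = page.children[idx]
    return key in page.keys


root, height = bulk_load(range(1, 101))
print(height, search(root, 57), search(root, 101))
```

Every page is written once and no page is ever split, which is where the saving over repeated inserts comes from in this sketch.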

Comparison: B-trees vs. static indexed sequential file

Ref #1: Held & Stonebraker, “B-Trees Re-examined”, CACM, Feb. 1978

Ref. #1 claims:

- Concurrency control harder in B-Trees

- B-tree consumes more space

For their comparison:

block = 512 bytes
key = pointer = 4 bytes
4 data records per block

Example: 1 block static index

127 keys

(127+1) × 4 = 512 bytes

-> pointers in index implicit! up to 127 blocks

[Figure: one index block holding keys k1, k2, k3, ... above one data block holding records k1, k2, k3.]

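Example: 1 block B-tree

63 keys

63 × (4+4) + 8 = 512 bytes

-> pointers needed in B-tree because index is not contiguous; up to 63 blocks

[Figure: one index block holding keys k1, k2, ..., k63 above one data block holding records k1, k2, k3; a field labelled "next" is also shown.]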

Size comparison Ref. #1

        Static Index                       B-tree
# data blocks            height    # data blocks              height
2 -> 127                 2         2 -> 63                    2
128 -> 16,129            3         64 -> 3,969                3
16,130 -> 2,048,383      4         3,970 -> 250,047           4
                                   250,048 -> 15,752,961      5
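The fan-outs and block ranges in this comparison follow from the stated sizes (512-byte block, 4-byte key and pointer). A small sketch, assuming that one 4-byte slot per static index block and 8 bytes per B-tree block are overhead, as the formulas (127+1) × 4 and 63 × (4+4) + 8 suggest; the function names are illustrative. Each range ends at a power of the fan-out, for example 63² = 3,969.

```python
BLOCK = 512   # bytes per block
KEY = 4       # bytes per key
PTR = 4       # bytes per pointer


def static_fanout(block=BLOCK, key=KEY):
    # Keys only; pointers are implicit. One key-sized slot is assumed reserved.
    return block // key - 1


def btree_fanout(block=BLOCK, key=KEY, ptr=PTR, overhead=8):
    # Each entry needs a key and an explicit pointer; 8 bytes assumed overhead.
    return (block - overhead) // (key + ptr)


def height_ranges(fanout, max_height):
    """Return (low, high, height) rows: data-block counts served at each height."""
    rows = []
    low = 2
    for height in range(2, max_height + 1):
        high = fanout ** (height - 1)
        rows.append((low, high, height))
        low = high + 1
    return rows


print(static_fanout(), btree_fanout())
for row in height_ranges(static_fanout(), 4):
    print("static", row)
for row in height_ranges(btree_fanout(), 5):
    print("b-tree", row)
```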

Ref. #1 analysis claims

• For an 8,000 block file,
  – after 32,000 inserts
  – after 16,000 lookups
  static index saves enough accesses to allow for reorganization

Ref. #1 conclusion: Static index better!!

Ref #2: M. Stonebraker, “Retrospective on a database system,” TODS, June 1980

Ref. #2 conclusion: B-trees better!!

• DBA does not know when to reorganize
• DBA does not know how full to load pages of new index

• Buffering
  – B-tree: has fixed buffer requirements
  – Static index: must read several overflow blocks to be efficient (large & variable size buffers needed for this)

• Speaking of buffering… Is LRU a good policy for B+tree buffers? Of course not!

Should try to keep root in memory at all times (and perhaps some nodes from second level)
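One way to picture this policy is a buffer pool that reserves frames for the root (and any other chosen pages) and applies LRU only to the rest. This is a minimal sketch under that assumption; `PinnedLRUBuffer`, `read_page` and the page ids are hypothetical, and real buffer managers track far more state.

```python
from collections import OrderedDict


class PinnedLRUBuffer:
    """Buffer pool: pinned pages are never evicted, the rest follow LRU."""

    def __init__(self, capacity, pinned_ids):
        self.pinned_ids = set(pinned_ids)
        if len(self.pinned_ids) >= capacity:
            raise ValueError("capacity must leave room for unpinned pages")
        self.lru_frames = capacity - len(self.pinned_ids)
        self.pinned = {}            # page id -> page, kept for good once read
        self.lru = OrderedDict()    # page id -> page, least recently used first
        self.disk_reads = 0

    def get(self, page_id, read_page):
        if page_id in self.pinned:
            return self.pinned[page_id]
        if page_id in self.lru:
            self.lru.move_to_end(page_id)       # mark as most recently used
            return self.lru[page_id]
        page = read_page(page_id)               # miss: fetch from disk
        self.disk_reads += 1
        if page_id in self.pinned_ids:
            self.pinned[page_id] = page
        else:
            if len(self.lru) >= self.lru_frames:
                self.lru.popitem(last=False)    # evict least recently used page
            self.lru[page_id] = page
        return page


pool = PinnedLRUBuffer(capacity=3, pinned_ids={"root"})
fake_disk = lambda page_id: f"<contents of {page_id}>"
for page_id in ["root", "leaf1", "leaf2", "leaf3", "root", "leaf1"]:
    pool.get(page_id, fake_disk)
print(pool.disk_reads, list(pool.lru))
```

Every lookup touches the root, so under plain LRU it would usually stay resident anyway; pinning guarantees it regardless of how large a leaf scan sweeps through the pool.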

Interesting problem:

For B+tree, how large should n be?

n is number of keys / node

Sample assumptions:

(1) Time to read node from disk is (S + Tn) msec.

(2) Once block in memory, use binary search to locate key: (a + b log2 n) msec, for some constants a, b. Assume a << S.

(3) Assume B+tree is full, i.e., # nodes to examine is log_n N, where N = # records.

Can get: f(n) = time to find a record

[Figure: plot of f(n) against n, with n_opt marked on the n axis.]
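Multiplying the number of nodes examined by the cost per node gives one plausible closed form for f(n). This is a reconstruction from the three sample assumptions, not a formula quoted from the slides:

```latex
f(n) = \log_n N \,\bigl(S + Tn + a + b\log_2 n\bigr)
     = \ln N \left( \frac{S + a + Tn}{\ln n} + \frac{b}{\ln 2} \right)
```

In the second form the binary-search term contributes a constant that does not depend on n, so under these assumptions only S, a and T influence where the minimum lies.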

Find n_opt by setting f'(n) = 0

Answer is n_opt = “few hundred”

(see homework for details)

What happens to n_opt as

• Disk gets faster?

• CPU gets faster?
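These questions can be explored numerically. The sketch below assumes the cost model f(n) = log_n N · (S + Tn + a + b·log2 n) built from the sample assumptions; setting its derivative to zero reduces to n(ln n − 1) = (S + a)/T, which is solved by bisection. The constants in the example call are hypothetical, chosen only so the result lands in the "few hundred" range.

```python
import math


def lookup_time(n, N, S, T, a, b):
    """f(n): msec to find a record with n keys per node and N records."""
    levels = math.log(N) / math.log(n)                   # log_n N nodes examined
    return levels * (S + T * n + a + b * math.log2(n))


def optimal_fanout(S, T, a):
    """Solve n * (ln n - 1) = (S + a) / T, the condition f'(n) = 0."""
    target = (S + a) / T
    lo, hi = math.e, 1e9    # left-hand side is 0 at n = e and increases after it
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * (math.log(mid) - 1) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical constants: S = 70 msec, T = 0.05 msec per key, a = 0.1 msec.
print(round(optimal_fanout(S=70, T=0.05, a=0.1)))
# Try a smaller S, or a different T or a, to see how the optimum moves.
print(round(optimal_fanout(S=10, T=0.05, a=0.1)))
```

Changing the arguments to `optimal_fanout` and re-running is a quick way to check an answer to the two questions against this model.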

