8/8/2019 Database design and Implementation 05.btree
Database Systems Implementation, Bongki Moon 1
Introduction to Indexing
B+-Tree Indexes
Ramakrishnan & Gehrke: Chap. 8.2, 10.3-10.8
Overview
Index classification
  Primary/Secondary, Clustered/Non-clustered
  Sparse/Dense, Single-attribute/Composite
B+-Tree
  Structure, Search etc.
  Insert and Delete algorithms
  Duplicate keys, variable-length keys, bulk loading
Basics
To speed up selections on the search key fields.
Any subset of the fields of a relation can be the search key for an index on the relation.
Typically stored as a separate file (or relation).
Typically much smaller than base relations.
Index Classification
Primary vs. secondary: If the search key contains the primary key, the index is called a primary index.
  A primary index value is unique (e.g., SSN).
  A secondary index allows duplicate key values (e.g., Names, GPA).
Unique index: The search key is or contains a candidate key.
Index Classification
Clustered vs. Non-clustered: If the order of data records is the same as, or "close to", the order of keys, the index is called a clustered index.
How many clustered and non-clustered indexes can a relation have?
[Figure: CLUSTERED vs. UNCLUSTERED index. In both panels, index entries in the index file direct the search for data entries, and the data entries point to data records in the data file.]
Index Classification
Dense vs. Sparse: If there is a 1-to-1 mapping between key values (in the index) and data records, then the index is dense.
  Can a clustered index be dense or sparse?
  Can a non-clustered index be dense or sparse?
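[Figure: a data file sorted by Name, with a sparse index on Name and a dense index on Age]
Data file: Ashby, 25, 3000; Basu, 33, 4003; Bristow, 30, 2007; Cass, 50, 5004; Daniels, 22, 6003; Jones, 40, 6003; Smith, 44, 3000; Tracy, 44, 5004
Sparse index on Name: Ashby, Cass, Smith
Dense index on Age: 22, 25, 30, 33, 40, 44, 44, 50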
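The dense/sparse distinction in the figure can be sketched in a few lines of Python. This is only an illustration: the page size of three records, the `(age, page, slot)` entry format, and the function names are assumptions made for the sketch, not part of the slides.

```python
from bisect import bisect_right

# Records from the figure, stored sorted by name.
records = [
    ("Ashby", 25, 3000), ("Basu", 33, 4003), ("Bristow", 30, 2007),
    ("Cass", 50, 5004), ("Daniels", 22, 6003), ("Jones", 40, 6003),
    ("Smith", 44, 3000), ("Tracy", 44, 5004),
]
PAGE_SIZE = 3  # assumption: three records per data page
pages = [records[i:i + PAGE_SIZE] for i in range(0, len(records), PAGE_SIZE)]

# Sparse index on name: one entry (the first key on the page) per data page.
sparse_name = [page[0][0] for page in pages]

# Dense index on age: one (age, page_no, slot) entry per record, sorted by age.
dense_age = sorted(
    (rec[1], p, s) for p, page in enumerate(pages) for s, rec in enumerate(page)
)

def lookup_by_name(name):
    """Use the sparse index to pick one page, then scan that page."""
    p = bisect_right(sparse_name, name) - 1
    if p < 0:
        return None
    return next((rec for rec in pages[p] if rec[0] == name), None)

def lookup_by_age(age):
    """Follow every matching dense-index entry to its record."""
    return [pages[p][s] for a, p, s in dense_age if a == age]
```

The sparse index works only because the data file is sorted on the same field (Name); the index on Age must be dense because the records are not ordered by age.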
Benefits and Costs
The cost of retrieving data records through an index varies greatly with the type of index.
  (E.g.) For an exact-match query with a unique/non-unique, clustered/non-clustered, dense/sparse index, how many pages need to be retrieved from the index and the base relation?
  (E.g.) How about a range query?
Some indexes are more costly than others.
  To build or maintain a clustered index, the records in the base relation must be sorted and kept in sorted order.
  For dynamic data, it is costly to maintain the sorted order.
To reduce the cost of dynamic insertions:
  Keep some free space on each page of the base relation for future insertions.
  Use overflow pages (with links) for further insertions. (Thus, the order of data records is "close to", but not identical to, the sort order.)
Ex) Cost of an indexed scan: clustered vs. non-clustered. Which is better?
Composite Search Keys
Search on a combination of columns.
  Equality query: age = 20 and sal = 75
  Range query: age = 20; or age = 20 and sal > 10
Keys in the index should be in sorted order by search key to support range queries.
  But how, for multiple columns?
  Certain queries may benefit from a particular order. (E.g.) age-then-sal, sal-then-age
Asymmetric vs. Symmetric
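[Figure: Examples of composite key indexes using lexicographic order]
Data records sorted by name (name, age, sal): bob, 12, 10; cal, 11, 80; joe, 12, 20; sue, 13, 75
Data entries in index sorted by <age, sal>: 11,80; 12,10; 12,20; 13,75
Data entries in index sorted by <sal, age>: 10,12; 20,12; 75,13; 80,11
Data entries sorted by <age>: 11; 12; 12; 13
Data entries sorted by <sal>: 10; 20; 75; 80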
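A small Python sketch can make the asymmetry of lexicographic order concrete. It uses the four records from the figure; the helper name `prefix_range` is hypothetical, and sorted lists of tuples stand in for the index data entries.

```python
from bisect import bisect_left, bisect_right

# (name, age, sal) rows from the figure.
rows = [("bob", 12, 10), ("cal", 11, 80), ("joe", 12, 20), ("sue", 13, 75)]

# Data entries of a composite index are key tuples kept in lexicographic order.
age_sal = sorted((age, sal) for _, age, sal in rows)
sal_age = sorted((sal, age) for _, age, sal in rows)

def prefix_range(index, first):
    """Return all entries whose leading column equals `first`.

    Because of lexicographic order they form one contiguous run,
    which two binary searches can delimit.
    """
    lo = bisect_left(index, (first,))
    hi = bisect_right(index, (first, float("inf")))
    return index[lo:hi]
```

A condition on age alone is a contiguous range in the <age, sal> index, so `prefix_range(age_sal, 12)` suffices. In the <sal, age> index the same condition matches entries scattered through the whole list, which is why the column order matters.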
B+-Tree: Introduction
Most widely used.
Supports both range and equality searches efficiently.
Dynamic index
  Adjusts gracefully under insertions and deletions.
  Keeps tree height-balanced.
The cost of exact-match/insertion/deletion is $\log_F N$.
  F is the fanout, and N is the number of leaf nodes.
B+-Tree: Structural Characteristics
Root node has at least two children.
Minimum 50% occupancy (except for root). For a B+-tree of order d,
  An internal node has $d/2 \le m \le d$ children.
  A leaf node has $d/2 \le m \le d-1$ pointers to data pages.
[Figure: index entries in the upper levels (direct search) above the data entries at the leaf level (the "sequence set")]
B+-Tree: Node Structures
Internal nodes
  N : the number of valid entries.
  K_i : key values ($K_1 < K_2 < \dots < K_{d-1}$).
  P_i : tree pointers to internal or leaf nodes.
Leaf nodes
  P_pred, P_next : pointers to neighboring leaf nodes.
  P_i : data pointers to a record or a block.
Internal node layout: N | P_1 | K_1 | P_2 | K_2 | ... | P_{d-1} | K_{d-1} | P_d
Leaf node layout: N | P_pred | K_1 | P_1 | K_2 | P_2 | ... | K_{d-1} | P_{d-1} | P_next
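The two node layouts map naturally onto two small record types. The following is a minimal in-memory sketch, not an on-disk page format; the class and field names (`InternalNode`, `LeafNode`, `pred_leaf`, `next_leaf`) and the helpers are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class InternalNode:
    keys: List[Any] = field(default_factory=list)      # K_1 < K_2 < ...
    children: List[Any] = field(default_factory=list)  # P_1 ..., one more than keys

    @property
    def n(self) -> int:
        """N: the number of valid key entries in the node."""
        return len(self.keys)

@dataclass
class LeafNode:
    keys: List[Any] = field(default_factory=list)      # K_i
    pointers: List[Any] = field(default_factory=list)  # P_i: record or block ids
    # Sibling links are excluded from repr/eq so linked leaves do not recurse.
    pred_leaf: Optional["LeafNode"] = field(default=None, repr=False, compare=False)
    next_leaf: Optional["LeafNode"] = field(default=None, repr=False, compare=False)

def link_leaves(leaves):
    """Chain leaves left to right through P_pred / P_next."""
    for left, right in zip(leaves, leaves[1:]):
        left.next_leaf, right.pred_leaf = right, left
    return leaves

def scan_from(leaf):
    """Walk the leaf chain, yielding keys in sorted order."""
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.next_leaf
```

The sibling links are what make range scans cheap: once the first qualifying leaf is found, the scan follows `next_leaf` without returning to the internal nodes.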
Example B+-Tree (of order 5)
Search begins at root, and key comparisons direct it to a leaf.
(E.g.) Search for 5, 15, or all data entries > 24 ...
[Figure: example B+-tree]
Root: 13 | 17 | 24 | 30
Leaves: [2* 3* 5* 7*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
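The three example searches can be traced with a short Python sketch over the two-level example tree. Plain lists stand in for the root and the leaf pages; the function names are hypothetical.

```python
from bisect import bisect_right

# The example tree: one root node over five leaves.
root_keys = [13, 17, 24, 30]
leaves = [[2, 3, 5, 7], [14, 16], [19, 20, 22], [24, 27, 29], [33, 34, 38, 39]]

def find_leaf(key):
    """Index of the leaf that may hold `key`.

    Child i covers keys k with root_keys[i-1] <= k < root_keys[i].
    """
    return bisect_right(root_keys, key)

def search(key):
    """Exact-match search: descend to one leaf, then look inside it."""
    return key in leaves[find_leaf(key)]

def entries_greater_than(low):
    """Range search: find the starting leaf, then scan leaves left to right."""
    start = find_leaf(low)
    return [k for leaf in leaves[start:] for k in leaf if k > low]
```

Searching for 5 lands in the first leaf and succeeds; searching for 15 lands in the leaf [14, 16] and fails; the range query starts at the leaf holding 24 and walks right.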
Inserting a Key into a B+-Tree
Find the correct leaf node L.
Put a key entry into L.
  If L has enough space, done!
  Else, must split L (into L and a new node L2).
    Redistribute entries evenly, copy up the middle key.
    Insert a new index entry pointing to L2 into the parent of L.
Node splitting can happen recursively.
  To split a non-leaf node, redistribute entries evenly, but push up the middle key. (Contrast with leaf splits.)
Splits grow the tree wider or taller.
  Splitting the root node makes the tree one level taller.
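The difference between copying up and pushing up is easiest to see side by side. The sketch below is a simplified illustration on plain Python lists; the function names and the choice of the split point (`len // 2`) are assumptions.

```python
def split_leaf(entries):
    """Split an overfull leaf; the middle key is COPIED up.

    Returns (left_entries, separator, right_entries). The separator
    is also the first entry of the right leaf.
    """
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right[0], right

def split_internal(keys, children):
    """Split an overfull internal node; the middle key is PUSHED up.

    Returns ((left_keys, left_children), separator, (right_keys, right_children)).
    The separator appears in neither half.
    """
    mid = len(keys) // 2
    separator = keys[mid]
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, separator, right
```

A leaf must keep every data entry, so its separator stays in the leaf as well as going to the parent. An internal key only directs the search, so it can move up without being duplicated.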
Inserting 8* into Example B+-Tree
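[Figure: inserting 8* into the example tree]
Before:
  Root: 13 | 17 | 24 | 30
  Leaves: [2* 3* 5* 7*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
Leaf split: [2* 3* 5* 7*] plus 8* becomes [2* 3*] and [5* 7* 8*].
  Note that 5 is copied up and continues to appear in the leaf.
Index split: [5 13 17 24 30] becomes [5 13] and [24 30].
  Note that 17 is pushed up and appears once in the index.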
Example B+-Tree After Inserting 8*
Notice that the root was split, leading to an increase in height.
In this example, we can avoid the split by re-distributing entries; however, this is usually not done in practice.
[Figure: tree after inserting 8*]
Root: 17
Internal nodes: [5 13] [24 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
Deleting a Key from a B+-Tree
Start at the root, find a leaf L where the entry belongs.
Remove the entry.
  If L is at least half-full, done!
  If L has fewer than d/2 entries (pointers),
    Try to re-distribute, borrowing from a sibling (adjacent node with the same parent as L).
    If re-distribution fails, merge L and a sibling.
If a merge occurred, must delete an entry (pointing to L or the sibling) from the parent of L.
Merge could propagate to the root, decreasing the height.
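The leaf-level part of this procedure can be sketched as follows. It is a deliberately narrow illustration: it considers only the right sibling, works on plain Python lists, and takes the minimum occupancy as a parameter; the function name and return convention are assumptions.

```python
def fix_leaf_underflow(leaf, right_sibling, min_entries):
    """Repair `leaf` after a deletion, using its right sibling.

    Returns (action, new_separator):
      ("ok", None)            leaf is still at least half-full
      ("redistribute", key)   one entry was borrowed; `key` is the new
                              middle key to copy up into the parent
      ("merge", None)         sibling was merged into leaf; the parent
                              must delete the entry for the sibling
    """
    if len(leaf) >= min_entries:
        return "ok", None
    if len(right_sibling) > min_entries:
        leaf.append(right_sibling.pop(0))   # borrow the smallest entry
        return "redistribute", right_sibling[0]
    leaf.extend(right_sibling)              # sibling cannot spare an entry
    right_sibling.clear()
    return "merge", None
```

With a minimum of two entries per leaf, the leaf [22] next to [24, 27, 29] borrows 24 and the new separator is 27. Deleting 24 afterwards leaves [22] next to [27, 29], which cannot lend an entry, so the two leaves merge into [22, 27, 29].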
Example: Deleting 19* and 20*
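Deleting 20* is done with re-distribution (by moving key 24).
  The key 24 is removed from the parent.
  The new middle key 27 is copied up.
[Figure: tree before and after deleting 19* and 20*]
Before:
  Root: 17
  Internal nodes: [5 13] [24 30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
After:
  Root: 17
  Internal nodes: [5 13] [27 30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]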
... And Then Deleting 24*
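Two leaf nodes merge: Key 27 is removed from the parent.
[Figure: tree before and after the leaf merge]
Before:
  Root: 17
  Internal nodes: [5 13] [27 30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
After:
  Root: 17
  Internal nodes: [5 13] [30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]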
Example B+-Tree After Deleting 24*
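Two internal nodes merge: Key 17 is pulled down from the root.
[Figure: tree before and after the internal-node merge]
Before:
  Root: 17
  Internal nodes: [5 13] [30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]
After:
  Root: 5 | 13 | 17 | 30
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [22* 27* 29*] [33* 34* 38* 39*]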
Example: Deleting 24*
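A different scenario: Deleting 24* causes two leaf nodes to merge. This in turn causes an underflow in the internal node.
[Figure: tree before and after deleting 24*]
Before:
  Root: 22
  Internal nodes: [5 13 17 20] [27 30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
After:
  Root: 22
  Internal nodes: [5 13 17 20] [30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 27* 29*] [33* 34* 38* 39*]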
Example of Non-leaf Re-distribution
Entries are re-distributed by pushing them through the entry in the parent node.
  Pull down 22 from the root, and push up 20 to the root. We can stop here.
  It's OK to do it once more: pull down 20 from the root, and push up 17 to the root.
[Figure: tree before and after non-leaf re-distribution]
Before:
  Root: 22
  Internal nodes: [5 13 17 20] [30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 27* 29*] [33* 34* 38* 39*]
After:
  Root: 17
  Internal nodes: [5 13] [20 22 30]
  Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [17* 18*] [20* 21*] [22* 27* 29*] [33* 34* 38* 39*]
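One step of this re-distribution can be written as a rotation through the parent. The sketch below assumes each internal node is a `(keys, children)` pair of Python lists and that entries move from the left node to the right one; the function name `rotate_right` is hypothetical.

```python
def rotate_right(parent_keys, sep_idx, left, right):
    """Move one entry from internal node `left` to `right` via the parent.

    `left` and `right` are (keys, children) pairs of lists, separated in
    the parent by parent_keys[sep_idx]. All three are modified in place.
    """
    left_keys, left_children = left
    right_keys, right_children = right
    # Pull the separator down into the right node.
    right_keys.insert(0, parent_keys[sep_idx])
    # The left node's last subtree now belongs to the right node.
    right_children.insert(0, left_children.pop())
    # Push the left node's last key up as the new separator.
    parent_keys[sep_idx] = left_keys.pop()
```

Applied once to the example (root [22], left node [5 13 17 20], right node [30]), this pulls down 22 and pushes up 20. Applied a second time, it pulls down 20 and pushes up 17, giving the more balanced final tree.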
Summary of B+-tree Operations
Insert
  Split a leaf node; copy up the middle key.
  Split an internal node; push up the middle key.
Delete
  Merge leaf nodes; remove the middle key from the parent.
  Merge internal nodes; pull down the middle key from the parent.
  Redistribute leaf nodes; remove the middle key from the parent, and copy up a new middle key.
  Redistribute internal nodes; pull down the middle key from the parent to one child node, and push up a new middle key from the other child node.
Other Issues of B+-tree
1. Duplicate Key Values
2. Prefix Key Compression
Variable-length Keys such as strings
3. Bulk-Loading
4. Choosing an Optimal Node Size
Duplicate Key Values
Three alternatives for leaf page layout:
  1. Multiple (key, pointer) pairs
  2. One key and multiple pointers
     variable-length records, variable fanout.
  3. Another level of indirection
     additional overhead for dereferencing and disk accesses.
Page overflows: several leaf pages may contain entries with the same key value if the 1st or 2nd layout option is selected.
  The 3rd layout option may be a better choice for this.
How about the AM Layer of the MiniRel project?
Variable-Length Keys
Longer keys may reduce the fan-out and make the index taller. (The taller the tree, the more disk accesses.)
Prefix-key compression is done to increase fan-out.
  Key values in index entries only "direct traffic"; can often compress them.
  (E.g.) If we have adjacent index entries with search key values Dannon Yogurt, David Smith and Devarakonda Murthy, we can abbreviate them to Dan, Dav and Dev.
  What if there is a data entry Davey Jones? (Then, we can only compress David Smith to Davi.)
  Insert/delete must be suitably modified.
What if the keys (or strings) are longer than a page?
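One way to state the compression rule precisely: an index key must stay greater than every key in the subtree to its left. The sketch below picks the shortest prefix that satisfies this; the function name and the plain string comparison are assumptions, not a description of any particular system.

```python
def shortest_separator(left_max, right_min):
    """Shortest prefix s of `right_min` such that left_max < s <= right_min.

    `left_max` is the largest key in the left subtree and `right_min`
    the smallest key in the right subtree.
    """
    for length in range(1, len(right_min) + 1):
        prefix = right_min[:length]
        if prefix > left_max:
            return prefix
    return right_min  # no proper prefix separates the two keys
```

With Dannon Yogurt as the largest key on the left, David Smith compresses to "Dav". If Davey Jones is present on the left, the result is "Davi", as in the slide's example. This rule can produce separators even shorter than the uniform three-letter abbreviations.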
Bulk Loading of a B+-Tree
For a large collection of records, building a B+-tree by repeatedly inserting records is very slow.
  It also does not give sequential storage of leaves.
Bulk-Loading:
  Start with an empty root node and sorted entries.
  Insert one leaf node at a time into the right-most slot of the root or parent node.
[Figure: sorted pages of data entries, not yet in the B+-tree, below an empty root]
Data entries: 3* 4* 6* 9* 10* 11* 12* 13* 20* 22* 23* 31* 35* 36* 38* 41* 44*
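The sketch below builds the index levels over already-sorted leaf pages. Note that it is a simplification: it builds each level in one pass, bottom-up, rather than inserting one leaf at a time into the right-most slot as the slide describes. The function name, the `(keys, child_positions)` node format, and the two-entries-per-page grouping in the usage note are all assumptions.

```python
def bulk_load(leaf_pages, fanout):
    """Build index levels bottom-up over sorted leaf pages.

    Returns a list of levels, lowest index level first and root level last.
    Each internal node is (keys, child_positions), where child positions
    index into the level below (or into `leaf_pages` for the lowest level).
    """
    levels = []
    low_keys = [page[0] for page in leaf_pages]  # smallest key under each child
    while len(low_keys) > 1:
        level, next_low = [], []
        for start in range(0, len(low_keys), fanout):
            group = list(range(start, min(start + fanout, len(low_keys))))
            keys = [low_keys[i] for i in group[1:]]  # separators between children
            level.append((keys, group))
            next_low.append(low_keys[start])
        levels.append(level)
        low_keys = next_low
    return levels
```

For example, grouping the data entries above two per page gives nine leaf pages, and `bulk_load(pages, 3)` yields three index nodes under one root. The sketch does not enforce minimum occupancy: when the number of children is not a multiple of the fanout, the last node on a level can end up under-full.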
B+-Trees in Practice
Typical order: 200. Typical fill-factor: 67%.
  Block: 8 KB, key: 36 bytes, pointer: 4 bytes.
  Average fanout = 133
Typical capacities:
  Height 3: $133^3$ = 2,352,637 records
  Height 4: $133^4$ = 312,900,721 records
Can often hold top levels in buffer pool:
  Level 1 = 1 page = 8 KBytes
  Level 2 = 133 pages ≈ 1 MByte
  Level 3 = 17,689 pages ≈ 138 MBytes
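The arithmetic behind these figures is short enough to check directly. The sketch below uses the block, key, and pointer sizes from the slide and takes the average fanout of 133 as given; the function names are hypothetical, and 1 MByte is taken as 2^20 bytes.

```python
PAGE_BYTES = 8 * 1024
ENTRY_BYTES = 36 + 4   # one key plus one pointer

# Entries that fit in one block: close to the quoted order of 200.
max_entries = PAGE_BYTES // ENTRY_BYTES

# Average fanout from the slide (about 200 entries at a 67% fill factor).
FANOUT = 133

def records_at_height(height):
    """Data entries reachable through a tree of the given height."""
    return FANOUT ** height

def pages_at_level(level):
    """Number of pages on a level, counting the root as level 1."""
    return FANOUT ** (level - 1)

def level_megabytes(level):
    """Buffer-pool space needed to hold one whole level."""
    return pages_at_level(level) * PAGE_BYTES / 2**20
```

Because each level is 133 times larger than the one above it, caching the top two or three levels costs little memory and removes most of the page reads of a lookup.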
Optimal Size of B-tree Nodes
Problem
  Small pages make the B-tree taller and inefficient.
  Large pages increase transfer time.
Idea: Find a page size that maximizes the benefit-to-cost ratio.
  Benefit = $\log_2 F$, because $F$ (fanout) $\propto$ page size and height $\propto 1/\log_2 F$.
  AccessCost = disk_latency + PageSize/TransferRate (ignoring cache effects).
From Table 6 in the Gray & Graefe "5 Minute Rule" paper, 8/16/32 KBytes are near optimal when one entry is 20 bytes long.
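The benefit-to-cost ratio can be evaluated numerically for a range of page sizes. In the sketch below the 20-byte entry size comes from the slide, while the disk latency (5 ms) and transfer rate (10 MB/s) are illustrative assumptions, not values taken from the Gray & Graefe table.

```python
import math

def benefit_cost_ratio(page_bytes, entry_bytes=20,
                       latency_s=0.005, transfer_bytes_per_s=10e6):
    """Benefit (log2 of fanout) divided by the cost of one page access.

    latency_s and transfer_bytes_per_s are assumed device parameters.
    """
    fanout = page_bytes / entry_bytes
    benefit = math.log2(fanout)
    access_cost = latency_s + page_bytes / transfer_bytes_per_s
    return benefit / access_cost

# Candidate page sizes from 2 KB to 64 KB.
sizes = [2**k for k in range(11, 17)]
best = max(sizes, key=benefit_cost_ratio)
```

With these assumed device parameters the ratio peaks at 8 KB among the sizes tried and stays fairly flat between 4 KB and 32 KB. A faster transfer rate or a higher latency shifts the peak toward larger pages.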
Summary
B+-tree is a dynamic structure.
  Inserts/deletes leave the tree height-balanced; $\log_F N$ cost.
  Adjusts to growth gracefully.
  High fanout (F) means depth is rarely more than 3 or 4.
  Typically, 67% occupancy on average.
  We will discuss B-tree locking soon.
Most widely used index in database management systems because of its versatility. One of the most optimized components of a DBMS.
Discussions
  B-Tree vs. B+-Tree
  BST vs. B+-Tree