Data Organization Btrees
11.2Database System Concepts
Data organization and retrieval
File organization can improve data retrieval time.
Example query: SELECT * FROM depositors WHERE bname = 'Downtown'
Setup: 100 blocks, 200 recs/block; the query returns 150 records.
Heap file (unordered): (Mianus, A215), (Perry, A218), (Downtown, A101), ...
  Searching a heap: must search all blocks (100 block accesses)
OR
Ordered file (sorted on bname): (Brighton, A217), (Downtown, A101), (Downtown, A110), ...
  Searching an ordered file:
  1. Binary search for the 1st tuple in the answer: ceil(log2 100) = 7 block accesses
  2. Scan the blocks holding the answer: no more than 2
  Total <= 9 block accesses
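The arithmetic above can be checked with a short sketch. The function names are hypothetical, and the scan bound assumes the matching records are stored contiguously in the ordered file.

```python
import math

def heap_scan_cost(num_blocks):
    # A heap file has no order, so every block must be read.
    return num_blocks

def ordered_file_cost(num_blocks, recs_per_block, result_recs):
    # Binary search over the blocks to find the first matching tuple.
    search = math.ceil(math.log2(num_blocks))
    # A contiguous run of k records touches at most
    # ceil((k - 1) / recs_per_block) + 1 blocks.
    scan = math.ceil((result_recs - 1) / recs_per_block) + 1
    return search + scan

print(heap_scan_cost(100))               # heap file
print(ordered_file_cost(100, 200, 150))  # ordered file
```

With the slide's numbers (100 blocks, 200 records per block, 150 result records) the two costs are 100 and 9 block accesses.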
Data organization and retrieval
But... a file can only be ordered on one search key:
Ordered file (bname): (Brighton, A217), (Downtown, A101), (Downtown, A110), ...
Ex.: SELECT * FROM depositors WHERE acct_no = 'A110'
Requires a linear scan (100 block accesses)
Solution: Indexes! Auxiliary data structures over relations that can improve the search time
A simple index
Relation depositors (ordered on bname):
  (Brighton, A217, 700)
  (Downtown, A101, 500)
  (Downtown, A110, 600)
  (Mianus, A215, 700)
  (Perry, A102, 400)
  ...
Index file: index of depositors on acct_no, with entries A101, A102, A110, A215, A217, ...
Index records: <search key value, pointer (block, offset or slot #)>
To answer a query for “acct_no = A110” we:
1. Do a binary search on the index file, searching for A110
2. “Chase” the pointer of the index record
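The two lookup steps can be sketched as follows. This is a minimal in-memory illustration, not a storage engine: the block size of two records and the names build_index and lookup are assumptions made for the example.

```python
from bisect import bisect_left

# Data file: a list of blocks; each block is a list of records (ordered on bname).
data_blocks = [
    [("Brighton", "A217", 700), ("Downtown", "A101", 500)],
    [("Downtown", "A110", 600), ("Mianus", "A215", 700)],
    [("Perry", "A102", 400)],
]

def build_index(blocks):
    # One index record per data record: (search key value, (block, slot)).
    entries = [(rec[1], (b, s))
               for b, block in enumerate(blocks)
               for s, rec in enumerate(block)]
    entries.sort()  # the index file is ordered on acct_no
    return entries

def lookup(index, blocks, acct_no):
    keys = [k for k, _ in index]
    i = bisect_left(keys, acct_no)          # step 1: binary search on the index
    if i == len(keys) or keys[i] != acct_no:
        return None
    block_no, slot = index[i][1]            # step 2: "chase" the pointer
    return blocks[block_no][slot]

index = build_index(data_blocks)
print(lookup(index, data_blocks, "A110"))
```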
Index Choices
1. Primary: index search key = physical (sort) order search key
   vs. Secondary: all other indexes
   Q: how many primary indexes per relation?
2. Dense: index entry for every search key value
   vs. Sparse: some search key values not in the index
3. Single-level vs. multilevel (index on the indexes)
Measuring ‘goodness’
On what basis do we compare different indices?
1. Access type: what type of queries can be answered: selection queries (ssn = 123)? range queries (100 <= ssn <= 200)?
2. Access time: what is the cost of evaluating queries, measured in # of block accesses
3. Maintenance overhead: cost of insertion / deletion? (also in # of block accesses)
4. Space overhead: # of blocks needed to store the index, relative to the real data
Indexing
Primary (or clustering) index on SSN
STUDENT
  Ssn  Name   Address
  123  smith  main str
  234  jones  forbes ave
  345  smith  forbes ave
  …    …      …
Index entries: 123, 234, 345, 456, 567
Indexing
Primary/sparse index on ssn (primary key)
Index entries: 123 (covers ssn >= 123), 456 (covers ssn >= 456), …
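A sparse primary index lookup can be sketched like this: find the last index entry whose key is <= the search value, then scan that block. The block contents and the name sparse_lookup are assumptions for illustration; the file is assumed to be sorted on ssn.

```python
from bisect import bisect_right

# Data file sorted on ssn, split into blocks (block boundaries are assumed).
blocks = [
    [(123, "smith", "main str"), (234, "jones", "forbes ave"), (345, "tomson", "main str")],
    [(456, "stevens", "forbes ave"), (567, "smith", "forbes ave")],
]

# Sparse index: one entry per block, holding the first ssn in that block.
sparse_index = [(block[0][0], i) for i, block in enumerate(blocks)]

def sparse_lookup(ssn):
    keys = [k for k, _ in sparse_index]
    pos = bisect_right(keys, ssn) - 1   # last index entry with key <= ssn
    if pos < 0:
        return None                     # smaller than every indexed key
    block_no = sparse_index[pos][1]
    for rec in blocks[block_no]:        # lookup + scan inside the block
        if rec[0] == ssn:
            return rec
    return None

print(sparse_lookup(345))
```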
Indexing
Secondary (or nonclustering) index: duplicates may exist
• Can have many secondary indices
• but only one primary index
Address index over STUDENT:
STUDENT
  Ssn  Name     Address
  123  smith    main str
  234  jones    forbes ave
  345  tomson   main str
  456  stevens  forbes ave
  567  smith    forbes ave
Indexing
Secondary index: typically with ‘postings lists’, if not on a candidate key.
Postings lists: one per Address value (forbes ave, main str), each listing the matching STUDENT records.
Indexing
Secondary / dense index
Secondary on a candidate key: no duplicates, no need for posting lists
STUDENT (not ordered on Ssn)
  Ssn  Name     Address
  345  tomson   main str
  234  jones    forbes ave
  123  smith    main str
  567  smith    forbes ave
  456  stevens  forbes ave
Index entries: 123, 234, 345, 456, 567
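A postings list for the Address index can be sketched as a mapping from each address value to the slots of the matching records. The slot numbers stand in for record pointers, and build_postings is a hypothetical helper name.

```python
from collections import defaultdict

students = [
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (345, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (567, "smith", "forbes ave"),
]

def build_postings(records, field):
    # Secondary index on a non-candidate key: each key value maps to a
    # postings list of record slots, because duplicates may exist.
    postings = defaultdict(list)
    for slot, rec in enumerate(records):
        postings[rec[field]].append(slot)
    return dict(postings)

address_index = build_postings(students, 2)
for slot in address_index["forbes ave"]:
    print(students[slot])
```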
Primary vs Secondary
1. Access type:
   Primary: SELECTION, RANGE
   Secondary: SELECTION, RANGE, but the index must point to posting lists (if not on a candidate key)
2. Access time: primary is faster than secondary for range queries (no list access, all results clustered together)
3. Maintenance overhead: primary has greater overhead (must alter index + file)
4. Space overhead: secondary has more (posting lists)
Dense vs Sparse
1. Access type: both: selection, range (if primary)
2. Access time:
   Dense: requires lookup for the 1st result
   Sparse: requires lookup + scan for the 1st result
3. Maintenance overhead:
   Dense: must change index entries
   Sparse: may not have to change index entries
4. Space overhead:
   Dense: 1 entry per search key value
   Sparse: <= 1 entry per block
Summary
             Dense   Sparse
  Primary    rare    usual
  Secondary  usual
• All combinations are possible
• at most one sparse/clustering index
• as many dense indices as desired
• usually: one primary index (probably sparse) and a few secondary indices (nonclustering)
• secondary / sparse: which keys to use? Hot items?
ISAM
What if the index is too large to search in memory?
2nd-level sparse index on the values of the 1st level
STUDENT
  Ssn  Name     Address
  123  smith    main str
  234  jones    forbes ave
  345  tomson   main str
  456  stevens  forbes ave
  567  smith    forbes ave
1st-level index: 123 (>= 123), 456 (>= 456), …; each entry points to a block
2nd-level index: 123, 3,423, …
ISAM observations
What about insertions/deletions?
E.g., insert (124; peterson; fifth ave.) into the STUDENT file with the same two-level index: the new record goes into overflows.
Problems?
• overflow chains may become very long, thus:
  • shut down & reorganize
  • start with ~80% utilization
So far
… indices (like ISAM) suffer in the presence of frequent updates
Alternative indexing structure: B-trees
B-trees
Most successful family of index schemes (B-trees, B+ trees, B* trees)
Can be used for primary/secondary, clustering/nonclustering index.
Balanced “n-way” search trees
B-trees
E.g., B-tree of order 3:
  root: [6 | 9]
  children: [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
  each key carries a pointer to its record
• Key values appear once.
• Record pointers accompany keys.
• For simplicity, we will not show records and record pointers.
B-tree Nodes
Node layout: p1, v1, p2, v2, …, vn-1, pn
  p1 leads to keys v < v1
  p2 leads to keys v1 ≤ v < v2
  …
  pn leads to keys vn-1 < v
Key values are ordered
MAXIMUM: n pointer values
MINIMUM: ceil(n/2) pointer values (exception: root’s minimum = 2)
Properties
“Block aware” nodes: each node -> disk page
O(log_B(N)) for everything! (ins/del/search)
N is the number of records
B is the branching factor (= number of pointers)
typically, if B = 50 to 100, then 2-3 levels
utilization >= 50%, guaranteed; on average 69%
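As a rough check of the "2-3 levels" figure, the sketch below counts how many levels are needed before a tree with branching factor B can reach N records. It assumes completely full nodes, so it is an optimistic estimate; levels_needed is a hypothetical name.

```python
def levels_needed(num_records, branching_factor):
    # Integer arithmetic avoids floating-point surprises in log_B(N).
    levels, reachable = 0, 1
    while reachable < num_records:
        reachable *= branching_factor   # one more level multiplies the fan-out
        levels += 1
    return levels

for b in (50, 100):
    print(b, levels_needed(10_000, b))
```

For 10,000 records this gives 3 levels at B = 50 and 2 levels at B = 100.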
Queries
Algorithm for exact match query? (e.g., ssn = 8? ssn = 7?)
Tree: root [6 | 9]; children [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
For ssn = 7: at the root, 7 lies between 6 and 9, so follow the middle pointer to the node [7].
Height of tree = H (= # disk accesses)
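A minimal sketch of the exact-match search, counting one access per node visited. The Node class and the search function are illustrative assumptions; record pointers are omitted, as on the slides.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted key values
        self.children = children or []    # empty list for a leaf

def search(root, key):
    """Return (found, nodes_visited)."""
    node, visited = root, 0
    while node is not None:
        visited += 1
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited          # keys appear once, at any level
        # Otherwise descend into the subtree whose range contains the key.
        node = node.children[i] if node.children else None
    return False, visited

# The order-3 tree from the slides.
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(t0, 7))
print(search(t0, 8))
```

The number of nodes visited never exceeds the height H of the tree.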
Queries
What about range queries? (e.g., 5 < salary < 8)
Proximity / nearest-neighbor searches? (e.g., salary ~ 8)
Tree: root [6 | 9]; children [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
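One possible way to answer a range query on a B-tree is an in-order traversal that skips subtrees lying entirely outside the range. This is a sketch under the same simplified Node layout as assumed before, not the slides' own algorithm.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []    # empty list for a leaf

def range_query(node, lo, hi, out=None):
    """Collect, in sorted order, all keys k with lo < k < hi."""
    if out is None:
        out = []
    for i, k in enumerate(node.keys):
        # The subtree left of k holds keys < k; visit it only if it can exceed lo.
        if node.children and lo < k:
            range_query(node.children[i], lo, hi, out)
        if lo < k < hi:
            out.append(k)
        if k >= hi:
            return out                    # everything further right is too large
    if node.children:
        range_query(node.children[-1], lo, hi, out)
    return out

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(range_query(t0, 5, 8))
```

Note that the traversal moves up and down between levels, because keys in the range can sit in both leaf and nonleaf nodes.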
How Do You Maintain B-trees?
Must insert/delete keys in the tree such that the B-tree rules are obeyed.
Do this on every insert/delete.
Incur a little bit of overhead on each update, but avoid the problem of catastrophic reorganization (à la ISAM).
B-trees: Insertion
Insert in leaf, if room exists
On overflow (no more room):
  Split: create a new internal node
  Redistribute keys so that the B-tree properties are preserved
  Push the middle key up (recursively)
B-trees
Easy case: Tree T0; insert ‘8’
Tree T0: root [6 | 9]; children [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
The node [7] has room, so 8 goes there: children become [1 3], [7 8], [13].
B-trees
Hard case: Tree T0; insert ‘2’
1. 2 belongs in the node [1 3], which becomes [1 2 3]: overflow.
2. Split it into [1] and [3], and push the middle key (2) up.
3. The root becomes [2 | 6 | 9]: overflow. Split it into [2] and [9], and push the middle key (6) up into a new root.
Final state: root [6]; second level [2] and [9]; leaves [1] and [3] under [2], [7] and [13] under [9].
B-trees: Insertion
Q: What if there are two middles? (e.g., order 4)
A: either one is fine
B-trees: Insertion
Insert in leaf; on overflow, push the middle up (recursively: ‘propagate split’)
Split: preserves all B-tree properties (!!)
Notice how it grows: height increases when the root overflows & splits
Automatic, incremental reorganization (contrast with ISAM!)
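The insert-and-propagate-split procedure can be sketched as below for order 3. The Node class, the helper _insert, and the choice of the upper middle when there are two middles are assumptions of this sketch; record pointers are omitted.

```python
from bisect import bisect_left

ORDER = 3  # max pointers per node, so at most ORDER - 1 keys

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []    # empty list for a leaf

def _insert(node, key):
    """Insert below node; return (middle_key, new_right_node) if node split."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None                       # key values appear once
    if node.children:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        mid, right = split                # child split: absorb its middle key
        node.keys.insert(i, mid)
        node.children.insert(i + 1, right)
    else:
        node.keys.insert(i, key)          # insert in leaf
    if len(node.keys) <= ORDER - 1:
        return None                       # room exists, done
    m = len(node.keys) // 2               # overflow: split, push the middle up
    mid = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return mid, right

def insert(root, key):
    split = _insert(root, key)
    if split is None:
        return root
    mid, right = split                    # root split: the tree grows by one level
    return Node([mid], [root, right])

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
root = insert(t0, 2)
print(root.keys, [c.keys for c in root.children])
```

Inserting 2 into T0 reproduces the hard case: the leaf split pushes 2 up, the root overflows, and 6 becomes the new root.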
Overview
Primary / secondary indices
Multilevel (ISAM)
B-trees: definition, search, insertion, deletion
B+ trees
Hashing
Deletion
Rough outline of algorithm:
Delete key; on underflow, may need to merge
In practice, some implementers just allow underflows to happen…
B-trees: Deletion
Easiest case: Tree T0; delete ‘3’
Tree T0: root [6 | 9]; children [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
After deleting 3: root [6 | 9]; children [1] (< 6), [7] (> 6, < 9), [13] (> 9)
B-trees: Deletion
Case 1: delete a key at a leaf; no underflow
Case 2: delete a nonleaf key; no underflow
Case 3: delete a leaf key; underflow, and ‘rich sibling’
Case 4: delete a leaf key; underflow, and ‘poor sibling’
B-trees: Deletion
Case 1: delete a key at a leaf; no underflow (delete 3 from T0, the easiest case above)
B-trees: Deletion
Case 2: delete a key at a nonleaf; no underflow (e.g., delete 6 from T0)
Delete & promote:
1. Remove 6 from the root of T0.
2. Promote 3 from the node [1 3] into the vacated position.
FINAL TREE: root [3 | 9]; children [1] (< 3), [7] (> 3, < 9), [13] (> 9)
Q: How to promote?
A: pick the largest key from the left subtree (or the smallest from the right subtree)
B-trees: Deletion
Case 3: underflow & ‘rich sibling’ (delete 7 from T0)
‘rich’ = can give a key without underflowing
‘borrowing’ a key: THROUGH the PARENT!
Delete & borrow:
1. Deleting 7 empties the middle node: underflow. Its sibling [1 3] is rich.
2. Moving 3 straight from the sibling into the empty node: NO!! (3 would sit on the wrong side of the parent key 6.)
3. Borrow through the parent: 6 moves down into the empty node and 3 moves up to take its place.
FINAL TREE: root [3 | 9]; children [1] (< 3), [6] (> 3, < 9), [13] (> 9)
B-trees: Deletion
Case 4: underflow & ‘poor sibling’ (delete 13 from T0)
• Merge, by pulling a key from the parent
• Exact reversal from insertion: ‘split and push up’ vs. ‘merge and pull down’
1. Deleting 13 empties the rightmost node: underflow. Its sibling [7] is poor.
2. A: merge with the ‘poor’ sibling: pull 9 down from the parent and merge it with [7].
FINAL TREE: root [6]; children [1 3] (< 6), [7 9] (> 6)
Q: What if the parent underflows?
A: repeat recursively
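Cases 1, 3 and 4 can be sketched for a leaf directly under its parent, as below. This is a simplified illustration for order 3: the names are hypothetical, and the recursive repair of an underflowing parent is deliberately left out.

```python
import math

ORDER = 3
MIN_KEYS = math.ceil(ORDER / 2) - 1   # minimum keys in a non-root node (1 here)

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def delete_leaf_key(parent, i, key):
    """Delete key from the leaf parent.children[i]; fix an underflow locally."""
    leaf = parent.children[i]
    leaf.keys.remove(key)
    if len(leaf.keys) >= MIN_KEYS:
        return                                      # case 1: no underflow
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None
    if left is not None and len(left.keys) > MIN_KEYS:
        # Case 3, rich left sibling: borrow THROUGH the parent.
        leaf.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
    elif right is not None and len(right.keys) > MIN_KEYS:
        # Case 3, rich right sibling.
        leaf.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
    elif left is not None:
        # Case 4, poor sibling: pull a key down from the parent and merge.
        left.keys += [parent.keys.pop(i - 1)] + leaf.keys
        parent.children.pop(i)
    else:
        leaf.keys += [parent.keys.pop(i)] + right.keys
        parent.children.pop(i + 1)
    # Not handled in this sketch: the parent itself may now underflow.

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
delete_leaf_key(t0, 1, 7)
print(t0.keys, [c.keys for c in t0.children])
```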
B-trees in practice
In practice, the tree (root [6 | 9]; children [1 3], [7], [13]) points into a FILE of records (Ssn, …, …) stored in no particular key order, e.g., 3, 7, 6, 9, 1.
B-trees in practice
In practice, the formats are:
  leaf nodes: (v1, rp1, v2, rp2, …, vn, rpn)
  nonleaf nodes: (p1, v1, rp1, p2, v2, rp2, …)
Overview
Primary / secondary indices
Multilevel (ISAM)
B-trees
B+ trees
Hashing
B+ trees: Motivation
B-tree: print keys in sorted order:
Tree: root [6 | 9]; children [1 3] (< 6), [7] (> 6, < 9), [13] (> 9)
The B-tree needs backtracking (down to [1 3], back up for 6, down to [7], back up for 9, down to [13]). How to avoid it?
Solution: B+ trees
Facilitate sequential ops
String all leaf nodes together
AND
replicate keys from nonleaf nodes, to make sure every key appears at the leaf level
B+ trees
B+ tree of order 3:
  root (internal node): [6 | 9]
  leaf nodes: [3 4] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9)
  Data file: leaf entries point to records, e.g., (3, Joe, 23), (3, Bob, 23), (4, John, 23), …
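Because the leaves are strung together and hold every key, a range scan needs one descent and then only follows leaf links. The sketch below assumes hypothetical Leaf and Internal classes with a next pointer on each leaf; record pointers are omitted.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None                  # link to the next leaf

class Internal:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def find_leaf(node, key):
    # Child i holds keys k with keys[i-1] <= k < keys[i].
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node

def range_scan(root, lo, hi):
    """Return all keys k with lo <= k <= hi, in sorted order."""
    out = []
    leaf = find_leaf(root, lo)            # single descent
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next                  # no backtracking through the parent
    return out

# The order-3 B+ tree from the slide.
l1, l2, l3 = Leaf([3, 4]), Leaf([6, 7]), Leaf([9, 13])
l1.next, l2.next = l2, l3
root = Internal([6, 9], [l1, l2, l3])
print(range_scan(root, 4, 9))
```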
B+ tree insertion
INSERTION OF KEY ’K’
  insert search-key value into ’L’ such that the keys are in order;
  if (’L’ overflows) {
    split ’L’;
    insert (i.e., COPY) smallest search-key value of the new node into parent node ’P’;
    if (’P’ overflows) {
      repeat the B-tree split procedure recursively;
      /* Notice: the B-TREE split; NOT the B+ tree */
    }
  }
B+ tree insertion (cont’d)
ATTENTION:
A split at the LEAF level is handled by COPYING the middle key up;
a split at a higher level is handled by PUSHING the middle key up.
Remember: leaf nodes must be complete (all keys); interior nodes need not be complete.
B+ trees: insertion
E.g., insert ‘8’
Initial tree: root [6 | 9]; leaves [1 3] (< 6), [6 7] (≥ 6, < 9), [9 13] (≥ 9)
1. 8 belongs in the leaf [6 7], which becomes [6 7 8]: overflow.
2. COPY the middle (= 7) upstairs and split: the leaf becomes [6] and [7 8]. 7 and 8 remain in leaves since all keys are present there.
3. The root becomes [6 | 7 | 9]: nonleaf overflow, so just PUSH the middle (7) up.
FINAL TREE: root [7]; internal nodes [6] (< 7) and [9] (≥ 7); leaves [1 3] (< 6) and [6] (≥ 6) under [6]; leaves [7 8] (< 9) and [9 13] (≥ 9) under [9]
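The copy-versus-push distinction can be sketched as follows for order 3. The Leaf and Internal classes, the limit of ORDER - 1 keys per leaf, and the split position are assumptions chosen to match the example above.

```python
from bisect import bisect_right, insort

ORDER = 3  # max pointers per internal node; leaves hold at most ORDER - 1 keys (assumed)

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None

class Internal:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def _insert(node, key):
    """Return (separator, new_right_node) if node split, else None."""
    if isinstance(node, Leaf):
        insort(node.keys, key)
        if len(node.keys) <= ORDER - 1:
            return None
        m = len(node.keys) // 2
        right = Leaf(node.keys[m:])       # the middle key STAYS in the new leaf
        node.keys = node.keys[:m]
        right.next, node.next = node.next, right
        return right.keys[0], right       # leaf split: COPY the key up
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= ORDER - 1:
        return None
    m = len(node.keys) // 2
    sep = node.keys[m]                    # nonleaf split: PUSH the middle up
    right = Internal(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return sep, right

def insert(root, key):
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    return Internal([sep], [root, right])

l1, l2, l3 = Leaf([1, 3]), Leaf([6, 7]), Leaf([9, 13])
l1.next, l2.next = l2, l3
root = insert(Internal([6, 9], [l1, l2, l3]), 8)
print(root.keys, [c.keys for c in root.children])
```

After inserting 8, the key 7 appears both in the new root and in the leaf [7 8], while the pushed key is removed from the internal node it came from.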