Data Organization - B-trees - Brown University

Date post: 06-Mar-2018
Data Organization - B-trees
Transcript
Page 1: Data Organization ­ B­trees - Brown University · PDF fileDatabase System Concepts 11.2 Data organization and retrieval File organization can improve data retrieval time SELECT *

Data Organization - B-trees


11.2Database System Concepts

Data organization and retrieval

File organization can improve data retrieval time.

SELECT *
FROM depositors
WHERE bname = "Downtown"

Setup: 100 blocks, 200 records/block; the query returns 150 records.

Heap file:
Mianus     A-215
Perry      A-218
Downtown   A-101
...

Ordered file:
Brighton   A-217
Downtown   A-101
Downtown   A-110
...

Searching a heap: must search all blocks (100 block accesses).

OR

Searching an ordered file:
1. Binary search for the 1st tuple in the answer: ⌈log2 100⌉ = 7 block accesses
2. Scan the blocks holding the answer: no more than 2
Total <= 9 block accesses
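The arithmetic on this slide can be checked with a short sketch. The function name `ordered_file_cost` and the formula for how many blocks a run of matching records can straddle are illustrative assumptions, not part of the slides.

```python
import math

def ordered_file_cost(num_blocks, recs_per_block, result_recs):
    """Worst-case block accesses for an equality search on an ordered file."""
    # Step 1: binary search over the blocks for the first matching tuple.
    search = math.ceil(math.log2(num_blocks))
    # Step 2: a contiguous run of k records can straddle at most
    # ceil((k - 1) / recs_per_block) + 1 blocks.
    scan = math.ceil((result_recs - 1) / recs_per_block) + 1
    return search + scan

heap_cost = 100                                   # a heap must read every block
ordered_cost = ordered_file_cost(100, 200, 150)   # 7 + 2 = 9
```

With the slide's numbers the ordered file needs at most 9 block accesses against 100 for the heap.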

Data organization and retrieval

But... a file can only be ordered on one search key:

Ordered file (bname):
Brighton   A-217
Downtown   A-101
Downtown   A-110
...

Ex.  SELECT *
     FROM depositors
     WHERE acct_no = "A-110"

Requires a linear scan (100 block accesses).

Solution: Indexes! Auxiliary data structures over relations that can improve the search time.

A simple index

Data file:
Brighton   A-217   700
Downtown   A-101   500
Downtown   A-110   600
Mianus     A-215   700
Perry      A-102   400
...

Index file (index of depositors on acct_no):
A-101
A-102
A-110
A-215
A-217
...

Index records: <search key value, pointer (block, offset or slot#)>

To answer a query for "acct_no = A-110" we:
1. Do a binary search on the index file, searching for A-110
2. "Chase" the pointer of the index record
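The two lookup steps can be sketched as follows. The (block, slot) addresses and the names `data_file`, `index` and `lookup` are made up for illustration; only the records themselves come from the slide.

```python
from bisect import bisect_left

# Data file from the slide, kept in bname order.
# The (block, slot) addresses are hypothetical.
data_file = {
    (0, 0): ("Brighton", "A-217", 700),
    (0, 1): ("Downtown", "A-101", 500),
    (1, 0): ("Downtown", "A-110", 600),
    (1, 1): ("Mianus", "A-215", 700),
    (2, 0): ("Perry", "A-102", 400),
}

# Index on acct_no: one <search key value, pointer> record per tuple, sorted by key.
index = sorted((rec[1], addr) for addr, rec in data_file.items())
keys = [k for k, _ in index]

def lookup(acct_no):
    i = bisect_left(keys, acct_no)            # step 1: binary search on the index
    if i < len(keys) and keys[i] == acct_no:
        return data_file[index[i][1]]         # step 2: chase the pointer
    return None
```

Calling `lookup("A-110")` returns the Downtown record without scanning the data file.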

Index Choices

1. Primary: index search key = physical (sort) order search key
   vs Secondary: all other indexes
   Q: how many primary indexes per relation?

2. Dense: index entry for every search key value
   vs Sparse: some search key values not in the index

3. Single-level vs Multi-level (index on the indexes)

Measuring 'goodness'

On what basis do we compare different indices?

1. Access type: what type of queries can be answered: selection queries (ssn = 123)? range queries (100 <= ssn <= 200)?

2. Access time: the cost of evaluating queries, measured in # of block accesses

3. Maintenance overhead: cost of insertion / deletion (also in # of block accesses)

4. Space overhead: # of blocks needed to store the index, relative to the real data

Indexing

Primary (or clustering) index on SSN

STUDENT
Ssn   Name    Address
123   smith   main str
234   jones   forbes ave
345   smith   forbes ave
...   ...     ...

Index entries: 123, 234, 345, 456, 567

Indexing

Primary/sparse index on ssn (primary key)

Index entries: 123 (covers ssn >= 123), 456 (covers ssn >= 456)
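A sparse index lookup can be sketched like this, assuming one index entry per data block (the block's first ssn). The block contents and the name `lookup` are illustrative assumptions.

```python
from bisect import bisect_right

# Data blocks sorted on ssn; the split into blocks is hypothetical.
blocks = [
    [(123, "smith"), (234, "jones"), (345, "tomson")],
    [(456, "stevens"), (567, "smith")],
]

# Sparse index: only the first ssn of each block appears.
sparse_keys = [block[0][0] for block in blocks]    # [123, 456]

def lookup(ssn):
    b = bisect_right(sparse_keys, ssn) - 1   # last index entry with key <= ssn
    if b < 0:
        return None                          # smaller than every indexed key
    for key, name in blocks[b]:              # scan inside the chosen block
        if key == ssn:
            return name
    return None
```

The extra scan inside the block is the price of the smaller index; it only works because the file is sorted on the indexed key.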

Indexing

Secondary (or non-clustering) index: duplicates may exist

STUDENT
Ssn   Name      Address
123   smith     main str
234   jones     forbes ave
345   tomson    main str
456   stevens   forbes ave
567   smith     forbes ave

[Figure: Address-index pointing into the STUDENT file]

• Can have many secondary indices
• but only one primary index

Indexing

Secondary index: typically with 'postings lists', if not on a candidate key value.

STUDENT
Ssn   Name      Address
123   smith     main str
234   jones     forbes ave
345   tomson    main str
456   stevens   forbes ave
567   smith     forbes ave

Index entries: forbes ave, main str (each points to a postings list of the matching records)
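A minimal sketch of a secondary index with postings lists, using the STUDENT rows from the slide. Using list positions as record pointers and the name `by_address` are assumptions made for illustration.

```python
from collections import defaultdict

students = [
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (345, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (567, "smith", "forbes ave"),
]

# One index entry per distinct address; each entry owns a postings list
# of record pointers (here simply positions in the list above).
postings = defaultdict(list)
for slot, (_ssn, _name, address) in enumerate(students):
    postings[address].append(slot)

def by_address(address):
    return [students[slot] for slot in postings.get(address, [])]
```

Because Address is not a candidate key, a single index entry such as "forbes ave" has to lead to several records.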

Indexing

Secondary / dense index

Secondary on a candidate key: no duplicates, no need for postings lists

Ssn   Name      Address
345   tomson    main str
234   jones     forbes ave
123   smith     main str
567   smith     forbes ave
456   stevens   forbes ave

Index entries: 123, 234, 345, 456, 567

Primary vs Secondary

1. Access type:
   Primary: SELECTION, RANGE
   Secondary: SELECTION, RANGE, but the index must point to postings lists (if not on a candidate key)

2. Access time: Primary is faster than secondary for range queries (no list access, all results clustered together)

3. Maintenance overhead: Primary has greater overhead (must alter index + file)

4. Space overhead: Secondary has more (postings lists)

Dense vs Sparse

1. Access type: both: selection, range (if primary)

2. Access time:
   Dense: requires lookup for 1st result
   Sparse: requires lookup + scan for 1st result

3. Maintenance overhead:
   Dense: must change index entries
   Sparse: may not have to change index entries

4. Space overhead:
   Dense: 1 entry per search key value
   Sparse: < 1 entry per block

Summary

              Dense    Sparse
Primary       rare     usual
Secondary     usual

• All combinations are possible
• at most one sparse/clustering index
• as many dense indices as desired
• usually: one primary index (probably sparse) and a few secondary indices (non-clustering)
• secondary / sparse: Which keys to use? Hot items?

ISAM

What if the index is too large to search in memory?

2nd level sparse index on the values of the 1st level

STUDENT
Ssn   Name      Address
123   smith     main str
234   jones     forbes ave
345   tomson    main str
456   stevens   forbes ave
567   smith     forbes ave

[Figure: STUDENT file split into blocks; 1st-level sparse index with entries 123 (>=123) and 456 (>=456); 2nd-level index above it with entries 123 and 3,423]

ISAM - observations

What about insertions/deletions?

E.g., insert (124; peterson; fifth ave.): the block is full, so the record overflows.

Problems?

• overflow chains may become very long. What to do?
• shut down & reorganize
• start with ~80% utilization

[Figure: the two-level ISAM index over the STUDENT file from the previous slide, with an overflow block holding the new record]
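The overflow problem can be made concrete with a small sketch. It assumes a static one-level sparse index, fixed-capacity primary blocks and one overflow chain per primary block; the class `IsamFile` and its methods are hypothetical, and reads of the index itself are not counted.

```python
from bisect import bisect_right

class IsamFile:
    """Static sparse index over fixed-capacity blocks, with overflow chains."""

    def __init__(self, sorted_keys, capacity):
        self.capacity = capacity
        self.primary = [list(sorted_keys[i:i + capacity])
                        for i in range(0, len(sorted_keys), capacity)]
        self.index = [blk[0] for blk in self.primary]   # never changes after the build
        self.overflow = [[] for _ in self.primary]

    def _block(self, key):
        return max(bisect_right(self.index, key) - 1, 0)

    def insert(self, key):
        b = self._block(key)
        if len(self.primary[b]) < self.capacity:
            self.primary[b].append(key)
            self.primary[b].sort()
        else:
            self.overflow[b].append(key)                # the chain grows, the index does not

    def blocks_read(self, key):
        """Worst-case data blocks read: the primary block plus its whole chain."""
        b = self._block(key)
        chain_blocks = -(-len(self.overflow[b]) // self.capacity)   # ceiling division
        return 1 + chain_blocks

f = IsamFile([123, 234, 345, 456, 567], capacity=3)
f.insert(124)            # block [123, 234, 345] is full, so 124 overflows
```

Every further insert into the same key range lengthens the chain, which is why the slides suggest starting below full utilization and reorganizing periodically.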

So far

... indices (like ISAM) suffer in the presence of frequent updates

Alternative indexing structure: B-trees

B-trees

Most successful family of index schemes (B-trees, B+-trees, B*-trees)

Can be used for primary/secondary, clustering/non-clustering index.

Balanced "n-way" search trees

B-trees

E.g., B-tree of order 3:

[Figure: root with keys 6 and 9; three leaves: (1, 3) for keys < 6, (7) for keys > 6 and < 9, (13) for keys > 9; keys point to records]

• Key values appear once.
• Record pointers accompany keys.
• For simplicity, we will not show records and record pointers.

B-tree Nodes

Keys:      v1  v2  ...  v(n-1)
Pointers:  p1  ...  pn
Subtrees:  v < v1 | v1 ≤ v < v2 | ... | v(n-1) < v

Key values are ordered

MAXIMUM: n pointer values
MINIMUM: ⌈n/2⌉ pointer values
(Exception: root's minimum = 2)

Properties

"Block aware" nodes: each node -> disk page

O(log_B N) for everything! (ins/del/search)
   N is the number of records
   B is the branching factor (= number of pointers)

Typically, if B = 50 to 100, then 2 - 3 levels

Utilization >= 50% guaranteed; on average 69%
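To see where "2 - 3 levels" comes from, here is a rough estimate that assumes every node is completely full, which is optimistic: half-full nodes give a somewhat deeper tree. The function name `levels` is illustrative.

```python
def levels(num_records, branching):
    """Smallest h with branching**h >= num_records (full-fanout estimate)."""
    h = 1
    while branching ** h < num_records:
        h += 1
    return h
```

For example, `levels(1_000_000, 100)` gives 3, so a million records are reachable in about three block accesses under this assumption.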

Queries

Algorithm for exact match query? (e.g., ssn = 8? ssn = 7?)

[Figure: the order-3 B-tree with root (6, 9) and leaves (1, 3), (7), (13), with the search path shown step by step from the root down]

Height of tree = H (= # disk accesses)
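One way to write the exact-match search is sketched below on the example tree. The `Node` class and the `(found, visited)` return value are assumptions for illustration; `visited` stands in for the number of disk accesses.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []    # empty for a leaf

def search(node, key):
    """Return (found, nodes_visited) for an exact-match query."""
    visited = 0
    while node is not None:
        visited += 1                      # one node = one disk page
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the subtree between keys[i-1] and keys[i], if any.
        node = node.children[i] if node.children else None
    return False, visited

# The example tree: root (6, 9) with leaves (1, 3), (7), (13).
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
```

A search never visits more than one node per level, so its cost is bounded by the height H.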

Queries

What about range queries? (e.g., 5 < salary < 8)

Proximity / nearest neighbor searches? (e.g., salary ~ 8)

[Figure: the order-3 B-tree with root (6, 9) and leaves (1, 3), (7), (13)]
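A range query on a plain B-tree can be answered by a pruned in-order traversal. The sketch below is one possible formulation, not necessarily the one intended by the slides; `Node` and `range_query` are illustrative names, and the bounds are strict to match the 5 < salary < 8 example.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []    # empty for a leaf

def range_query(node, lo, hi, out=None):
    """Collect all keys k with lo < k < hi, in sorted order."""
    if out is None:
        out = []
    for i, k in enumerate(node.keys):
        if node.children and lo < k:      # subtree left of k may hold keys > lo
            range_query(node.children[i], lo, hi, out)
        if lo < k < hi:
            out.append(k)
        if k >= hi:                       # everything further right is too large
            return out
    if node.children:
        range_query(node.children[-1], lo, hi, out)
    return out

t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
```

The traversal has to move up and down between levels because keys live in both internal nodes and leaves.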

How Do You Maintain B-trees?

Must insert/delete keys in the tree such that the B-tree rules are obeyed.

Do this on every insert/delete.

Incur a little bit of overhead on each update, but avoid the problem of catastrophic re-organization (a la ISAM).

B-trees: Insertion

Insert in leaf, if room exists

On overflow (no more room):
• Split: create a new internal node
• Redistribute keys such that the B-tree properties are preserved
• Push middle key up (recursively)

B-trees

Easy case: Tree T0; insert '8'

[Figure: T0 = root (6, 9) with leaves (1, 3), (7), (13). The leaf for keys > 6 and < 9 has room, so it becomes (7, 8)]

B-trees

Hard case: Tree T0; insert '2'

[Figure, step by step:
 1. 2 belongs in leaf (1, 3), which becomes (1, 2, 3): overflow.
 2. Split the leaf into (1) and (3); push the middle key 2 up.
 3. The root becomes (2, 6, 9): overflow. Split it into (2) and (9); push the middle key 6 up.
 Final state: root (6); internal nodes (2) and (9); leaves (1), (3), (7), (13)]

B-trees - insertion

Q: What if there are two middles? (e.g., order 4)
A: either one is fine

B-trees: Insertion

Insert in leaf; on overflow, push middle up (recursively: 'propagate split')

Split: preserves all B-tree properties (!!)

Notice how it grows: height increases when root overflows & splits

Automatic, incremental re-organization (contrast with ISAM!)
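The insertion procedure can be sketched as follows for a B-tree of order 3. The names `Node`, `insert` and `dump` are illustrative, keys are assumed to be unique, and record pointers are left out, as in the slides.

```python
from bisect import bisect_left, insort

ORDER = 3   # max pointers per node, so at most ORDER - 1 keys

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []           # empty for a leaf

def _insert(node, key):
    """Insert below node; return (middle_key, new_right_node) if node split."""
    if not node.children:
        insort(node.keys, key)                   # insert in leaf
    else:
        i = bisect_left(node.keys, key)
        split = _insert(node.children[i], key)
        if split:                                # child split: absorb the pushed-up key
            mid, right = split
            node.keys.insert(i, mid)
            node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:
        return None                              # there was room
    m = len(node.keys) // 2                      # overflow: split around the middle key
    mid = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys, node.children = node.keys[:m], node.children[:m + 1]
    return mid, right                            # push middle up

def insert(root, key):
    split = _insert(root, key)
    if split:                                    # root split: height grows by one
        mid, right = split
        return Node([mid], [root, right])
    return root

def dump(node):
    """Nested (keys, children) view, handy for inspecting the shape."""
    return (tuple(node.keys), [dump(c) for c in node.children])

def make_t0():
    return Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])

easy = insert(make_t0(), 8)    # 8 fits into leaf (7)
hard = insert(make_t0(), 2)    # leaf and root both split
```

Inserting 2 into T0 splits the leaf and then the root, which is the only way the tree gains a level.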

Overview

• Primary / Secondary indices
• Multilevel (ISAM)
• B-trees: definition, search, insertion, deletion
• B+-trees
• Hashing

Deletion

Rough outline of algorithm: delete key; on underflow, may need to merge

In practice, some implementers just allow underflows to happen...

B-trees - Deletion

Easiest case: Tree T0; delete '3'

[Figure: T0 = root (6, 9) with leaves (1, 3), (7), (13). After the deletion the left leaf is (1); nothing else changes]

B-trees - Deletion

Case 1: delete a key at a leaf - no underflow
Case 2: delete non-leaf key - no underflow
Case 3: delete leaf-key; underflow, and 'rich sibling'
Case 4: delete leaf-key; underflow, and 'poor sibling'

B-trees - Deletion

Case 1: delete a key at a leaf - no underflow (delete 3 from T0)

[Figure: T0 = root (6, 9) with leaves (1, 3) for keys < 6, (7) for keys > 6 and < 9, (13) for keys > 9]

B-trees - Deletion

Case 2: delete a key at a non-leaf - no underflow (e.g., delete 6 from T0)

Delete & promote

[Figure, step by step: 6 is removed from the root of T0; 3 is promoted from the left leaf (1, 3) into the root.
 FINAL TREE: root (3, 9) with leaves (1) for keys < 3, (7) for keys > 3 and < 9, (13) for keys > 9]

Q: How to promote?
A: pick the largest key from the left sub-tree (or the smallest from the right sub-tree)


B-trees - Deletion

Case 3: underflow & 'rich sibling' (delete 7 from T0)

Delete & borrow
• 'rich' = can give a key, without underflowing
• 'borrowing' a key: THROUGH the PARENT!

[Figure, step by step: deleting 7 empties the middle leaf of T0; the left sibling (1, 3) is rich.
 Moving 3 straight into the empty leaf is marked NO!!
 Instead, borrow through the parent: 6 moves down into the empty leaf and 3 moves up to replace it.
 FINAL TREE: root (3, 9) with leaves (1) for keys < 3, (6) for keys > 3 and < 9, (13) for keys > 9]


B-trees - Deletion

Case 4: underflow & 'poor sibling' (delete 13 from T0)

• Merge, by pulling a key from the parent
• Exact reversal from insertion: 'split and push up' vs. 'merge and pull down'

[Figure, step by step: deleting 13 empties the right leaf of T0; its sibling (7) is poor.
 A: merge with the 'poor' sibling, pulling 9 down from the parent.
 FINAL TREE: root (6) with leaves (1, 3) for keys < 6 and (7, 9) for keys > 6]

Case 4: underflow & 'poor sibling': 'pull key from parent, and merge'

Q: What if the parent underflows?
A: repeat recursively
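Cases 1, 3 and 4 can be sketched for a leaf whose parent is given. This is a simplified illustration: the name `delete_from_leaf` is hypothetical, the left sibling is preferred when there is one, and the recursive repair of an underflowing parent is deliberately left out.

```python
MIN_KEYS = 1   # order 3: at least ceil(3/2) = 2 pointers, i.e. 1 key per non-root node

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def delete_from_leaf(parent, i, key):
    """Delete key from leaf parent.children[i] and repair a leaf underflow."""
    leaf = parent.children[i]
    leaf.keys.remove(key)
    if len(leaf.keys) >= MIN_KEYS:
        return                                   # case 1: no underflow
    j = i - 1 if i > 0 else i + 1                # sibling to work with
    sib = parent.children[j]
    sep = min(i, j)                              # parent key between the two leaves
    if len(sib.keys) > MIN_KEYS:                 # case 3: rich sibling
        if j < i:                                # borrow THROUGH the parent
            leaf.keys.insert(0, parent.keys[sep])
            parent.keys[sep] = sib.keys.pop()
        else:
            leaf.keys.append(parent.keys[sep])
            parent.keys[sep] = sib.keys.pop(0)
    else:                                        # case 4: poor sibling
        left, right = (sib, leaf) if j < i else (leaf, sib)
        left.keys += [parent.keys.pop(sep)] + right.keys   # merge and pull down
        del parent.children[sep + 1]             # the right node disappears

def shape(parent):
    return parent.keys, [c.keys for c in parent.children]

def make_t0():
    return Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])

rich = make_t0()
delete_from_leaf(rich, 1, 7)     # case 3 on T0
poor = make_t0()
delete_from_leaf(poor, 2, 13)    # case 4 on T0
```

Both results match the final trees on the slides: root (3, 9) over (1), (6), (13) for the borrow, and root (6) over (1, 3), (7, 9) for the merge.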

B-trees in practice

In practice:

[Figure: tree T0 (root (6, 9); leaves (1, 3), (7), (13)) with record pointers leading into the data FILE (columns Ssn, ...), whose records appear in the order 3, 7, 6, 9, 1]

In practice, the formats are:
• leaf nodes: (v1, rp1, v2, rp2, ..., vn, rpn)
• non-leaf nodes: (p1, v1, rp1, p2, v2, rp2, ...)

Overview

• Primary / secondary indices
• Multilevel (ISAM)
• B-trees
• B+-trees
• Hashing

B+ trees - Motivation

B-tree: print keys in sorted order:

[Figure: tree T0 = root (6, 9) with leaves (1, 3), (7), (13)]

B-tree needs back-tracking. How to avoid it?

Solution: B+-trees

Facilitate sequential ops

String all leaf nodes together

AND

replicate keys from non-leaf nodes, to make sure every key appears at the leaf level
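With the leaves strung together, a sequential scan never has to climb back up the tree. The sketch below assumes a singly linked chain of leaves and starts at the leftmost leaf; a real lookup would first descend the tree to the starting leaf. `Leaf`, `link` and `scan` are illustrative names, and the bounds are inclusive here.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next = None          # pointer to the next leaf in key order

def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]

def scan(first_leaf, lo, hi):
    """Return all keys k with lo <= k <= hi by walking the leaf chain only."""
    out = []
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out        # sorted order: nothing further can match
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out

first = link([Leaf([3, 4]), Leaf([6, 7]), Leaf([9, 13])])
```

This works only because every key is replicated at the leaf level, so the chain alone holds the complete sorted key set.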

B+-trees

B+-tree of order 3:

[Figure: root (internal node) with keys 6 and 9; leaf nodes (3, 4) for keys < 6, (6, 7) for keys ≥ 6 and < 9, (9, 13) for keys ≥ 9; the leaves point into the Data File, e.g. records (3, Joe, 23), (3, Bob, 23), (4, John, 23), ...]

B+ tree insertion

INSERTION OF KEY 'K'
  find the leaf node 'L' where 'K' belongs;
  insert search-key value to 'L' such that the keys are in order;
  if ('L' overflows) {
      split 'L';
      insert (i.e., COPY) smallest search-key value
          of new node to parent node 'P';
      if ('P' overflows) {
          repeat the B-tree split procedure recursively;
          /* Notice: the B-TREE split; NOT the B+-tree */
      }
  }

B+-tree insertion - cont'd

ATTENTION:

A split at the LEAF level is handled by COPYING the middle key up;

a split at a higher level is handled by PUSHING the middle key up.

Remember:
• Leaf nodes must be complete: all keys
• Interior nodes need not be complete

B+ trees - insertion

E.g., insert '8'

[Figure, step by step:
 Start: root (6, 9) with leaves (1, 3) for keys < 6, (6, 7) for keys ≥ 6 and < 9, (9, 13) for keys ≥ 9.
 1. 8 belongs in leaf (6, 7), which becomes (6, 7, 8): overflow.
 2. Split the leaf into (6) and (7, 8) and COPY the middle key (= 7) upstairs. 7 and 8 remain in the leaves, since all keys are present there.
 3. The root becomes (6, 7, 9): non-leaf overflow, so just PUSH the middle key 7 up.
 FINAL TREE: root (7); internal nodes (6) for keys < 7 and (9) for keys ≥ 7; leaves (1, 3), (6), (7, 8), (9, 13)]
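The difference between COPY at the leaf level and PUSH at higher levels can be sketched as follows for order 3. The names `Node`, `insert` and `dump` are illustrative, keys are assumed to be unique, and the sibling pointers between leaves are omitted to keep the sketch short.

```python
from bisect import bisect_right, insort

ORDER = 3   # max pointers per node, so at most ORDER - 1 keys

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []           # empty for a leaf

def _insert(node, key):
    """Insert below node; return (separator, new_right_node) if node split."""
    if not node.children:                        # leaf level
        insort(node.keys, key)
        if len(node.keys) < ORDER:
            return None
        m = len(node.keys) // 2
        right = Node(node.keys[m:])              # the middle key stays in a leaf ...
        node.keys = node.keys[:m]
        return right.keys[0], right              # ... and a COPY of it goes up
    i = bisect_right(node.keys, key)             # keys >= separator go to the right
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:
        return None
    m = len(node.keys) // 2                      # non-leaf overflow: plain B-tree split
    sep = node.keys[m]                           # PUSHED up, kept in neither half
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys, node.children = node.keys[:m], node.children[:m + 1]
    return sep, right

def insert(root, key):
    split = _insert(root, key)
    if split:                                    # root split: height grows by one
        sep, right = split
        return Node([sep], [root, right])
    return root

def dump(node):
    return (tuple(node.keys), [dump(c) for c in node.children])

start = Node([6, 9], [Node([1, 3]), Node([6, 7]), Node([9, 13])])
final = insert(start, 8)
```

Inserting 8 into the starting tree reproduces the final tree of the walkthrough: 7 appears both in the root and in a leaf, while the pushed-up root key is not repeated in the internal nodes below it.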

