
CS4432: Database Systems II, Lecture #6, Professor Elke A. Rundensteiner

Date post: 20-Dec-2015

CS4432: Database Systems II

Lecture #6

Professor Elke A. Rundensteiner


PROJECT 1

• Project-1 teams assigned (my.wpi)

• If any issues, let us know sooner rather than later …

• Must state each member’s contributions …

Storage Layout: How to lay out data on disk (Chapter 3)

Placing records into blocks

[Figure: a file drawn as a sequence of blocks ...]

• Assume fixed-length blocks

• Assume a single file (for now)

Options for storing records in blocks:

(1) Separating records
(2) Spanned vs. unspanned
(3) Mixed record types – clustering
(4) Split records
(5) Sequencing
(6) Indirection

(4) Split records

• Fixed part in one block

• Variable part in another block

Typically for hybrid format

[Figure: a block with fixed recs. and a block with variable recs.; labels: R1 (a), R1 (b), R2 (a), R2 (b), R2 (c)]
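A minimal Python sketch of the split-record idea, under assumptions that are not in the slides: the fixed part holds a record id plus an (offset, length) pointer into a second block that stores the variable part. The names `split_record` and `read_record` and the layout `FIXED_FMT` are hypothetical.

```python
import struct

# Assumed fixed-part layout: record id (int32), offset and length of the
# variable part inside the other block (uint16 each). 8 bytes in total.
FIXED_FMT = "<iHH"

def split_record(rec_id, text, var_block):
    """Append the variable part to var_block and return the packed fixed part."""
    data = text.encode("utf-8")
    offset = len(var_block)
    var_block.extend(data)
    return struct.pack(FIXED_FMT, rec_id, offset, len(data))

def read_record(fixed, var_block):
    """Reassemble a record from its fixed part and the variable-part block."""
    rec_id, offset, length = struct.unpack(FIXED_FMT, fixed)
    return rec_id, bytes(var_block[offset:offset + length]).decode("utf-8")
```

Every fixed part has the same size, so the block of fixed parts can be scanned by position, while the variable parts can differ in length.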

(5) Sequencing

• Ordering records in file (and block) by some key value
  – Sequential file (sequenced file)

• Why sequencing?
  – Typically to make it possible to efficiently read records in order

Sequencing Options

(a) Next record physically contiguous

[Figure: R1 immediately followed by Next (R1) ...]

(b) Linked

[Figure: R1 with a pointer to Next (R1)]

What about INSERT/DELETE?
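One way to picture option (b) and its effect on INSERT is the following Python sketch. It assumes records sit in numbered slots and each record stores the slot number of its successor in key order; `Rec`, `insert_linked`, and `scan` are hypothetical names.

```python
class Rec:
    def __init__(self, key, nxt=None):
        self.key = key
        self.nxt = nxt  # slot number of the next record in key order, or None

def insert_linked(slots, head, key):
    """Store the new record in the next free slot, relink, and return the head slot."""
    slots.append(Rec(key))
    new = len(slots) - 1
    if head is None or key < slots[head].key:
        slots[new].nxt = head
        return new
    cur = head
    while slots[cur].nxt is not None and slots[slots[cur].nxt].key <= key:
        cur = slots[cur].nxt
    slots[new].nxt = slots[cur].nxt
    slots[cur].nxt = new
    return head

def scan(slots, head):
    """Read keys in sequence by following the links."""
    out = []
    while head is not None:
        out.append(slots[head].key)
        head = slots[head].nxt
    return out
```

An insert changes only pointers and never moves an existing record, but a sequential scan may then jump between slots instead of reading contiguous space.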

Sequencing Options

(c) Overflow area

[Figure: header; records in sequence R1, R2, R3, R4, R5; overflow area holding R2.1, R1.3, R4.7]

(6) Indirection Addressing

• How does one refer to records?

[Figure: a pointer to record Rx]

• Problem: Records can be on disk or in (virtual) memory. Need a common address, but they have different physical locations.

Many options, ranging from Physical to Indirect

Purely Physical Addressing

E.g., Record Address (ID) =
    Device ID
    Cylinder #
    Track #
    Block #
    Offset in block

Device ID, Cylinder #, Track #, and Block # together form the Block ID.
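As an illustration, a purely physical record address can be packed into a few bytes. The field widths below are assumptions chosen for this sketch, and `PhysAddr`, `encode`, `decode`, and `block_id` are hypothetical names.

```python
import struct
from collections import namedtuple

PhysAddr = namedtuple("PhysAddr", "device cylinder track block offset")

# Assumed widths: device 1 byte, cylinder 2, track 1, block 2, offset 2.
FMT = "<BHBHH"

def encode(addr):
    """Pack a physical address into 8 bytes."""
    return struct.pack(FMT, *addr)

def decode(raw):
    """Unpack 8 bytes back into a physical address."""
    return PhysAddr(*struct.unpack(FMT, raw))

def block_id(addr):
    """The block ID is everything except the offset within the block."""
    return tuple(addr[:4])
```

Following such an address needs no lookup, but every stored copy of it becomes wrong if the record moves.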

Fully Indirect Addressing

Solution: Record ID (Oracle: ROWID) as global address, maintain a map table.

[Figure: Map Table with columns Rec ID and Physical addr.; record ID r maps to address a]
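A minimal sketch of a map table in Python, assuming record IDs are handed out by a counter and physical addresses are opaque values. `MapTable` and its methods are hypothetical names, not any real system's API.

```python
class MapTable:
    """Hypothetical map table: record ID -> physical address."""

    def __init__(self):
        self._addr = {}
        self._next_id = 1

    def register(self, phys_addr):
        """Assign a new record ID to a record stored at phys_addr."""
        rec_id = self._next_id
        self._next_id += 1
        self._addr[rec_id] = phys_addr
        return rec_id

    def lookup(self, rec_id):
        """The extra step paid on every access (cost of indirection)."""
        return self._addr[rec_id]

    def move(self, rec_id, new_addr):
        """Relocate a record: only the map entry changes, not the pointers to it."""
        self._addr[rec_id] = new_addr
```

All references use the record ID, so moving a record means updating one map entry.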

Tradeoff

Flexibility to move records (for deletions, insertions) vs. cost of indirection (lookup)

[Figure: spectrum from Physical to Indirect]

What to do: options in between?

Ex. #1: Indirection in block

[Figure: a block consisting of a block header, free space, and records R1, R2, R3, R4]
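One common way to realize indirection inside a block is a slot table in the block header. The Python sketch below assumes that design; the slide itself only shows a header, free space, and records. For simplicity the header is kept outside the byte array, and `SlottedBlock` is a hypothetical name.

```python
class SlottedBlock:
    def __init__(self, size=64):
        self.data = bytearray(size)
        self.slots = []       # header: slot number -> (offset, length) or None
        self.free_end = size  # records are placed from the end of the block

    def insert(self, payload):
        if len(payload) > self.free_end:
            raise ValueError("block full")
        self.free_end -= len(payload)
        self.data[self.free_end:self.free_end + len(payload)] = payload
        self.slots.append((self.free_end, len(payload)))
        return len(self.slots) - 1  # the slot number is the record's address

    def read(self, slot):
        off, n = self.slots[slot]
        return bytes(self.data[off:off + n])

    def delete(self, slot):
        self.slots[slot] = None

    def compact(self):
        """Slide live records together; slot numbers stay valid."""
        end = len(self.data)
        live = sorted(((s[0], s[1], i) for i, s in enumerate(self.slots)
                       if s is not None), reverse=True)
        for off, n, i in live:
            chunk = bytes(self.data[off:off + n])
            end -= n
            self.data[end:end + n] = chunk
            self.slots[i] = (end, n)
        self.free_end = end
```

Outside references name only the block and slot number, so records can move within the block without invalidating them.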

Ex. #2: Use logical block #'s understood by file system instead of direct disk access

Rec ID = File ID, Block #, Record # or Offset

File System Map: (File ID, Block #) → Physical Block ID
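A small Python sketch of resolving such a record ID, assuming the file system map is a dictionary from (file ID, logical block #) to a physical block ID. The map contents are made up for illustration, and `RecID` and `resolve` are hypothetical names.

```python
from typing import NamedTuple

class RecID(NamedTuple):
    file_id: int
    block_no: int  # logical block number within the file
    slot: int      # record number or offset within the block

# Made-up file system map: (file ID, logical block #) -> physical block ID.
fs_map = {(7, 0): 5120, (7, 1): 9216}

def resolve(rid, fs_map):
    """Return (physical block ID, slot) for a record ID."""
    return fs_map[(rid.file_id, rid.block_no)], rid.slot
```

The DBMS stores only logical block numbers, so the file system may relocate a block without any record ID changing.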

Recap: Storing records in blocks

(1) Separating records
(2) Spanned vs. unspanned
(3) Mixed record types – clustering
(4) Split records
(5) Sequencing
(6) Indirection

Other Topics in Chapter 3

(1) Insertion/Deletion
(2) Buffer Management
(3) Comparison of Schemes

Deletion

[Figure: a block containing record Rx]

Options:

(a) Delete and immediately reclaim space

(b) Mark deleted
  – May need chain of deleted records (for re-use)
  – Need a way to mark:
    • special characters
    • delete field
    • in map
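Option (b) with a chain of deleted records might look like the following Python sketch. It assumes fixed-length slots and uses a tag on each slot as the "delete field"; `SlotFile` is a hypothetical name.

```python
class SlotFile:
    def __init__(self):
        # Each slot is ("live", value) or ("free", next_free_slot).
        self.slots = []
        self.free_head = None  # head of the chain of deleted slots

    def insert(self, value):
        if self.free_head is not None:
            slot = self.free_head  # reuse a deleted slot
            self.free_head = self.slots[slot][1]
        else:
            slot = len(self.slots)  # otherwise append at end of file
            self.slots.append(None)
        self.slots[slot] = ("live", value)
        return slot

    def delete(self, slot):
        if self.slots[slot][0] != "live":
            raise KeyError(slot)
        # Mark deleted and push the slot onto the free chain.
        self.slots[slot] = ("free", self.free_head)
        self.free_head = slot
```

Nothing moves on delete; the cost is the space held by deleted slots until an insert reuses them.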

As usual, many tradeoffs...

• How expensive is it to move a valid record to free space for immediate reclaim?

• How much space is wasted?
  – e.g., deleted records, delete fields, free space chains, ...

Concern with deletions

Dangling pointers

[Figure: a pointer to the old location of R1, now marked "?"]

Note: If pointers point to physical locations (rather than ROWIDs), storing new data in deleted block corrupts data.


Solution #1: Do not worry

Solution #2: Tombstones

E.g., Leave "MARK" in map or old location

• Physical IDs

[Figure: a block with a tombstone at the old location of a deleted record. The tombstone's space is never re-used; the remaining space can be re-used.]

Solution #2: Tombstones

E.g., Leave "MARK" in map or old location

• Logical IDs

[Figure: map with columns ID and LOC; the entry for ID 7788 holds the mark]

Never reuse ID 7788 nor space in map...
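For logical IDs, the tombstone idea can be sketched in Python as follows. The sketch assumes IDs come from a counter that never goes backwards; `TombstoneMap` and `TOMBSTONE` are hypothetical names.

```python
TOMBSTONE = object()  # the "MARK" left in the map

class TombstoneMap:
    def __init__(self):
        self._loc = {}
        self._next_id = 1

    def add(self, loc):
        rec_id = self._next_id  # IDs are never reused
        self._next_id += 1
        self._loc[rec_id] = loc
        return rec_id

    def delete(self, rec_id):
        # Keep the map entry and overwrite it with the tombstone.
        self._loc[rec_id] = TOMBSTONE

    def lookup(self, rec_id):
        loc = self._loc[rec_id]
        return None if loc is TOMBSTONE else loc
```

A stale pointer to a deleted record is then detected at lookup, even if the record's old storage location now holds a different record.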

Solution #3 (?):

• Place record ID within every record
• When you follow a pointer, check if it leads to correct record

[Figure: a pointer labeled "to 3-77" leading to a record with rec-id: 3-77]

Does this work??? If space reused, won't new record have same ID?
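The check, and the concern raised about it, can be illustrated with a short Python sketch. Records are modeled as dictionaries with a `rec_id` field, and `follow` is a hypothetical helper.

```python
def follow(expected_id, record_at_location):
    """Return the record only if its embedded ID matches what the pointer expects."""
    if record_at_location is not None and record_at_location["rec_id"] == expected_id:
        return record_at_location
    return None

# The record originally at the pointed-to location was deleted and the
# space was reused by a new record.
fresh = {"rec_id": "3-78", "data": "new"}   # new record given a fresh ID
reused = {"rec_id": "3-77", "data": "new"}  # new record that got the old ID

stale_detected = follow("3-77", fresh) is None   # the check catches the stale pointer
stale_missed = follow("3-77", reused) is reused  # the check is fooled
```

The check helps only if a new record in the reused space is guaranteed a different ID.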

Insert

Easy case: records fixed length / not in sequence
  – Insert new record at end of file, or in deleted slot

A little harder: records are variable size
  – Not as easy; may not be able to reuse space – fragmentation

Hard case: records in sequence
  – If free space "close by", not too bad...
  – Or, use overflow idea...
  – Or worst case, reorganize file...

Interesting problems:

• How much free space to leave in each block, track, cylinder?

• How often do I reorganize file + overflow?

[Figure: free space]

Buffer Management

• DB features needed
• Why LRU may be bad
• Pinned blocks
• Forced output
• Double buffering
• Swizzling

Read Textbook!

in Notes03

Pointer Swizzling

[Figure: Memory and Disk side by side; block 1 and block 2 appear on both sides, with Rec A shown on each]

Issue: If records (objects) contain pointers to other objects, translate locations when loading objects into memory.

One Option:

Solution: Insert fields that represent pointers into map table. Translate pointers as needed.

Translation Table:
DB Addr    Mem Addr
Rec-A      Rec-A-inMem

Another Option:

In-memory pointers need a "type" bit

[Figure: a block in memory (M) holding one pointer to disk and one pointer to memory]
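A sketch of on-demand swizzling with a type bit, in Python. It assumes a translation table kept as a dictionary from database address to the in-memory object, and a caller-supplied `load` function that reads a record from disk. `Ptr`, `deref`, and `unswizzle` are hypothetical names.

```python
class Ptr:
    def __init__(self, db_addr):
        self.in_memory = False  # the "type" bit: False = disk, True = memory
        self.target = db_addr   # DB address, or the object itself once swizzled

def deref(ptr, table, load):
    """Follow a pointer, swizzling it on first use."""
    if not ptr.in_memory:
        obj = table.get(ptr.target)
        if obj is None:
            obj = load(ptr.target)  # bring the record into memory
            table[ptr.target] = obj
        ptr.target, ptr.in_memory = obj, True
    return ptr.target

def unswizzle(ptr, table):
    """Turn a memory pointer back into a DB address before writing to disk."""
    if ptr.in_memory:
        for db_addr, obj in table.items():
            if obj is ptr.target:
                ptr.target, ptr.in_memory = db_addr, False
                return
        raise KeyError("object not in translation table")
```

After the first dereference the pointer leads directly to the in-memory object, so later accesses skip the table lookup. The price is the unswizzle step before the record goes back to disk.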

Swizzling Options

• Automatic
• On-demand
• No swizzling / program control

Swizzling Issues

• Must 'unswizzle'
• Updating/writing of records

Comparison

• There are 1,000,001 ways to organize my data on disk…

Which is right for me?


Issues:

Flexibility Space Utilization

Complexity Performance

To evaluate a given strategy, compute the following parameters:

-> space used for expected data, on average

-> expected time to:
   - fetch record given key
   - fetch record with next key
   - insert/delete/update record
   - read complete file
   - reorganize file (maybe sort)

-> usage patterns / workload:
   - how many/which user queries/updates

Example: What about Project 1?

Design decisions for CAPE queue storage system?
– What data types?
– Variable/Fixed length records?
– Fixed/Variable format?
– Record IDs?
– Sequencing? What's special here?
– Deletions?
– Insertions?
– Headers/Meta-data for Queues?
– Buffering/Disk Pushing?
– Reorganize files and Overflow?
– Defragmentation?
– …

TOMORROW

Chapter 4 (Chapter 13 in 'complete book')

How to find a record quickly, given a key

