Storage and File Organization
Wherein we start looking inside the database
Application
Logical Data Model
Physical Implementation
SQL
Query Plan
JSON
Materialized Results
Correctness
Performance
Memory Hierarchy
Access Patterns
https://en.wikipedia.org/wiki/Memory_hierarchy#/media/File:ComputerMemoryHierarchy.svg
http://csappbook.blogspot.com/2017/05/a-gallery-of-memory-mountains.html
Storing and Organizing Fixed Length Tuples
Delete record 3
Wasted Space
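A minimal sketch of the fixed-length case, assuming a hypothetical three-field record packed with Python's struct module. The FixedFile class, its field widths, and the free list are illustrative choices, not taken from the slides. Deleting a record leaves a hole in the file; remembering the freed slot lets a later insert reuse it instead of wasting the space.

```python
import struct

# Hypothetical layout: 5-byte id, 20-byte name, 8-byte salary (float64), no padding.
RECORD = struct.Struct("<5s20sd")

class FixedFile:
    """Fixed-length records in one bytearray, with a free list of deleted slots."""

    def __init__(self):
        self.data = bytearray()
        self.free = []                      # slot numbers released by delete()

    def insert(self, rid, name, salary):
        packed = RECORD.pack(rid.encode(), name.encode(), salary)
        if self.free:                       # reuse a hole instead of growing the file
            slot = self.free.pop()
            start = slot * RECORD.size
            self.data[start:start + RECORD.size] = packed
        else:
            slot = len(self.data) // RECORD.size
            self.data += packed
        return slot

    def delete(self, slot):
        # The bytes stay where they are: wasted space until the slot is reused.
        self.free.append(slot)

    def read(self, slot):
        rid, name, salary = RECORD.unpack_from(self.data, slot * RECORD.size)
        return rid.rstrip(b"\0").decode(), name.rstrip(b"\0").decode(), salary
```

Because every record has the same size, slot i always starts at byte i * RECORD.size, so no per-record directory is needed.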
Storing and Organizing Variable Length Tuples
Fixed Size
4 fields, 0 = not null, 1 = null
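One way to picture the variable-length layout is a fixed-size header followed by the variable-length data. The sketch below is an assumption-laden simplification: every field is treated as a string, the null bitmap is a single byte (0 = not null, 1 = null), and each field gets one (offset, length) pair of 16-bit integers. The names pack_tuple and unpack_tuple are hypothetical.

```python
import struct

def pack_tuple(fields):
    """Encode str-or-None fields as: null bitmap, (offset, length) pairs, data."""
    n = len(fields)
    assert n <= 8, "this sketch uses a single-byte null bitmap"
    header_size = 1 + 4 * n                 # bitmap + n pairs of two uint16
    bitmap, pairs, body = 0, [], b""
    for i, value in enumerate(fields):
        if value is None:
            bitmap |= 1 << i                # 1 = null, 0 = not null
            pairs.append((0, 0))
        else:
            raw = value.encode()
            pairs.append((header_size + len(body), len(raw)))
            body += raw
    header = struct.pack("<B", bitmap)
    for offset, length in pairs:
        header += struct.pack("<HH", offset, length)
    return header + body

def unpack_tuple(buf, n):
    """Decode a tuple of n fields written by pack_tuple."""
    bitmap = buf[0]
    fields = []
    for i in range(n):
        offset, length = struct.unpack_from("<HH", buf, 1 + 4 * i)
        if bitmap & (1 << i):
            fields.append(None)
        else:
            fields.append(buf[offset:offset + length].decode())
    return fields
```

The header has the same size for every tuple of the relation, so any attribute can be located without scanning the ones before it.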
Access one block at a time, typically 4096 bytes
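Block-at-a-time access can be sketched as a seek to a multiple of the block size followed by a fixed-size read or write. The helper names read_block and write_block are hypothetical, and an in-memory io.BytesIO stands in for the database file.

```python
import io

BLOCK_SIZE = 4096  # typical block size, as on the slide

def read_block(f, n):
    """Return block n of an open binary file (empty bytes past end of file)."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)

def write_block(f, n, data):
    """Overwrite block n; callers must supply exactly one block of bytes."""
    assert len(data) == BLOCK_SIZE
    f.seek(n * BLOCK_SIZE)
    f.write(data)

# Stand-in for a database file holding three zeroed blocks.
db = io.BytesIO(bytes(3 * BLOCK_SIZE))
write_block(db, 1, b"x" * BLOCK_SIZE)
```

The same two functions work unchanged on a real file opened with open(path, "r+b").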
Database File … …
Order of tuples in a file: Sequential Order
Can easily find key
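When tuples are kept in key order, a lookup can use binary search instead of a full scan. The sketch below assumes the tuples are already sorted on their first field and sit in a Python list; find_by_key is a hypothetical helper name.

```python
def find_by_key(tuples, key):
    """Binary search over tuples sorted by their first field; None if absent."""
    lo, hi = 0, len(tuples)
    while lo < hi:
        mid = (lo + hi) // 2
        if tuples[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(tuples) and tuples[lo][0] == key:
        return tuples[lo]
    return None

# Sample relation sorted by integer ID.
instructors = [
    (1212, "Wu", "Finance", 90000),
    (10101, "Srinivasan", "Comp. Sci.", 65000),
    (15151, "Mozart", "Music", 40000),
]
```

The cost of this ordering is paid on insert: keeping the file sorted means finding the right position rather than appending.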
Order of tuples in a file: Multitable Clustering
Order of tuples in a file: Heap File
Database File … …
Array of 3-bit values (each entry ranges from 0 to 7; an entry v means at least v/8 of the block is free)
6/8 = 75% free
Has space? Insert!
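A heap-file insert guided by a free-space map might look like the sketch below. It assumes one 3-bit entry per block, where an entry v is read as "at least v/8 of the block is free"; the function names and the 4096-byte block size are illustrative assumptions.

```python
BITS = 3
LEVELS = 1 << BITS      # 8 distinct values per entry: 0..7

def free_entry(free_bytes, block_size=4096):
    """Map entry for a block: floor(8 * free fraction), capped at 7."""
    return min(LEVELS - 1, free_bytes * LEVELS // block_size)

def find_block(fsm, needed_bytes, block_size=4096):
    """Index of the first block whose entry guarantees enough room, else None."""
    for i, v in enumerate(fsm):
        if v * block_size // LEVELS >= needed_bytes:
            return i
    return None
```

With these assumptions, a block with 3072 free bytes gets entry 6, matching the 6/8 = 75% reading. The map is deliberately coarse: it may skip a block that would have fit the tuple, but it never points at a block that is too full.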
Storing Metadata? Use what you have! Relations!
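SQLite is one real system that follows this idea: its schema lives in an ordinary relation named sqlite_master, which can be queried like any other table. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the instructor table is only a sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    "id TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary REAL)"
)

# The catalog is itself a relation: one row per table, index, view, or trigger.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()
```

Here catalog contains a row for the instructor table, plus a row for the index SQLite creates automatically to enforce the primary key.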
Computation Happens Here
Buffer Management
Buffer
Database File … …
RAM
Disk
read(Database File, Block n)
read(Database File, Block m)
Operation 1 accesses block
pin!
1
Operation 2 accesses block
pin!
2
read(Database File, Block x)
…Full!
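The pin and "Full!" steps can be sketched as a small buffer pool. This is a simplified illustration, not a definitive design: it assumes least-recently-used replacement among unpinned blocks, ignores dirty-page write-back, and takes a caller-supplied read_block function. The class name BufferPool and all of its methods are hypothetical.

```python
from collections import OrderedDict

class BufferPool:
    """Fixed number of in-memory frames; pinned blocks cannot be evicted."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block        # callable: block number -> bytes
        self.frames = OrderedDict()         # block number -> [data, pin_count]

    def pin(self, n):
        """Bring block n into the buffer if needed and mark it in use."""
        if n in self.frames:
            self.frames.move_to_end(n)      # most recently used goes last
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[n] = [self.read_block(n), 0]
        self.frames[n][1] += 1
        return self.frames[n][0]

    def unpin(self, n):
        """Signal that one operation is done with block n."""
        assert self.frames[n][1] > 0
        self.frames[n][1] -= 1

    def _evict(self):
        # Least recently used block with no pins; a real system would first
        # write the block back to disk if it had been modified.
        victim = next(
            (n for n, (_, pins) in self.frames.items() if pins == 0), None
        )
        if victim is None:
            raise RuntimeError("buffer full: every block is pinned")
        del self.frames[victim]
```

The pin count is what protects a block while operations are using it: a block accessed by two operations has a count of 2 and becomes evictable only after both have called unpin.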
A Modern Alternative
10101 Srinivasan Comp. Sci. 65000
1212 Wu Finance 90000
15151 Mozart Music 40000
Row Storage
• Inserts and deletes for each tuple are fast
• Scans on a single attribute are slow
Column Storage
• Scans on a few attributes are fast
• Compression
• CPU cache performance
• Need to reconstruct the tuple
• Decompression cost
• Inserting a tuple is random access
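The trade-off can be seen by holding the same sample relation in both layouts. This is an in-memory sketch using plain Python lists, with hypothetical attribute names; real column stores add compression and on-disk encoding on top.

```python
ATTRS = ("id", "name", "dept_name", "salary")

# Row storage: each tuple is stored contiguously.
rows = [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("1212", "Wu", "Finance", 90000),
    ("15151", "Mozart", "Music", 40000),
]

# Column storage: one contiguous sequence per attribute.
columns = {name: [r[i] for r in rows] for i, name in enumerate(ATTRS)}

def total_salary_rows(rows):
    """Row layout: the scan touches every tuple to read one attribute."""
    return sum(r[3] for r in rows)

def total_salary_columns(columns):
    """Column layout: the scan reads only the salary column."""
    return sum(columns["salary"])

def reconstruct(columns, i):
    """Column layout: rebuilding tuple i needs one lookup per column."""
    return tuple(columns[name][i] for name in ATTRS)

def insert_columns(columns, tup):
    """Column layout: one tuple becomes one write per column."""
    for name, value in zip(ATTRS, tup):
        columns[name].append(value)
```

The single-attribute scan reads a third of the data in the column layout, while reconstruct and insert_columns show where the extra work lands.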
On disk:
• ORC or Parquet format
• Popular for big data and distributed settings
Partition of the relation
In main memory:
• No buffer manager
• Possible due to:
  • heavy compression
  • larger (TBs) main memory
Partition of the relation
Questions?