Storage and File Organization - cs.brown.edu · Storage and File Organization Wherein we start...

Post on 03-May-2020


Storage and File Organization

Wherein we start looking inside the database

Application

Logical Data Model

Physical Implementation

SQL

Query Plan

JSON

Materialized Results

Correctness

Performance

Performance: Memory Hierarchy

Access Patterns

https://en.wikipedia.org/wiki/Memory_hierarchy#/media/File:ComputerMemoryHierarchy.svg

http://csappbook.blogspot.com/2017/05/a-gallery-of-memory-mountains.html

Storing and Organizing Fixed Length Tuples

Delete record 3

Wasted Space
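One common way to avoid the wasted space left by a delete is to remember freed slots and reuse them on the next insert. The sketch below is a minimal illustration, not the lecture's implementation: `RECORD_SIZE`, `FixedLengthFile`, and the in-memory `bytearray` standing in for the file are all assumptions, and slots are numbered from 0.

```python
RECORD_SIZE = 16  # hypothetical fixed record length in bytes


class FixedLengthFile:
    """In-memory stand-in for a file of fixed-length records."""

    def __init__(self):
        self.data = bytearray()
        self.free_slots = []  # slot numbers released by deletes

    def insert(self, payload: bytes) -> int:
        if len(payload) > RECORD_SIZE:
            raise ValueError("record too large")
        record = payload.ljust(RECORD_SIZE, b"\x00")  # pad to the fixed size
        if self.free_slots:
            # Reuse a hole left by an earlier delete.
            slot = self.free_slots.pop()
            start = slot * RECORD_SIZE
            self.data[start:start + RECORD_SIZE] = record
        else:
            # No holes: append at the end of the file.
            slot = len(self.data) // RECORD_SIZE
            self.data.extend(record)
        return slot

    def delete(self, slot: int) -> None:
        # The bytes stay in place; the slot is only marked as reusable.
        self.free_slots.append(slot)

    def read(self, slot: int) -> bytes:
        # Record i always starts at byte i * RECORD_SIZE.
        start = slot * RECORD_SIZE
        return bytes(self.data[start:start + RECORD_SIZE])
```

Because every record has the same length, record i is found by arithmetic alone, and a deleted slot can be filled by any later record without moving its neighbors.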

Storing and Organizing Variable Length Tuples

Fixed Size

4 fields, 0 = not null, 1 = null
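A variable-length record can be sketched as a fixed-size header followed by the variable-length bytes. The layout below is an assumption made for illustration: three string fields described by (offset, length) pairs, one fixed-size integer field, and a one-byte null bitmap in which bit i is 1 when field i is null. The names `HEADER`, `encode`, and `decode` are hypothetical.

```python
import struct

# Fixed-size part: 3 x (offset, length) as uint16, one int32, one bitmap byte.
HEADER = struct.Struct("<HHHHHHiB")


def encode(id_, name, dept, salary):
    """Pack four fields (any may be None) into one variable-length record."""
    bitmap = 0
    pairs = []
    body = b""
    offset = HEADER.size  # variable data starts right after the header
    for i, value in enumerate((id_, name, dept)):
        if value is None:
            bitmap |= 1 << i  # 1 = null
            pairs += [0, 0]
        else:
            raw = value.encode("utf-8")
            pairs += [offset, len(raw)]
            body += raw
            offset += len(raw)
    if salary is None:
        bitmap |= 1 << 3
    return HEADER.pack(*pairs, salary or 0, bitmap) + body


def decode(record):
    """Inverse of encode: return the four fields as a tuple."""
    *pairs, salary, bitmap = HEADER.unpack_from(record)
    values = []
    for i in range(3):
        if bitmap & (1 << i):
            values.append(None)
        else:
            off, length = pairs[2 * i], pairs[2 * i + 1]
            values.append(record[off:off + length].decode("utf-8"))
    values.append(None if bitmap & (1 << 3) else salary)
    return tuple(values)
```

The header has the same size for every record, so any field can be located without scanning the fields before it.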

Access one block at a time - typically

4096 bytes
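Block-at-a-time access amounts to seeking to a multiple of the block size and transferring exactly one block. This is a small sketch under the assumption of a 4096-byte block and an ordinary seekable binary file object; `read_block` and `write_block` are hypothetical helper names.

```python
BLOCK_SIZE = 4096  # bytes per block, as on the slide


def read_block(f, block_no):
    """Return block number block_no from a seekable binary file object."""
    f.seek(block_no * BLOCK_SIZE)
    data = f.read(BLOCK_SIZE)
    if len(data) != BLOCK_SIZE:
        raise EOFError("block %d is past the end of the file" % block_no)
    return data


def write_block(f, block_no, data):
    """Overwrite block number block_no with exactly one block of data."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block")
    f.seek(block_no * BLOCK_SIZE)
    f.write(data)
```

Any binary file opened with `open(path, "r+b")` or an `io.BytesIO` object should work as `f` here.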

Database File … …

Order of tuples in a file: Sequential Order

Can easily find key
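When tuples are kept in key order, a lookup can use binary search instead of a full scan. The sketch below assumes the keys are already available as a sorted Python list; `find` is a hypothetical helper built on the standard `bisect` module.

```python
import bisect


def find(sorted_keys, key):
    """Return the position of key in sorted_keys, or None if it is absent."""
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return None
```

The price of this ordering is paid on insert and delete, which have to keep the file sorted.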

Order of tuples in a file: Multitable Clustering

Order of tuples in a file: Heap File

Database File … …

Array of 3-bit values (each entry is 0 to 7, read as eighths of a block)

6/8 = 75%

Has space? Insert!
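The free-space check for a heap file can be sketched as a lookup in a small array with one entry per block. The code assumes each entry is a value from 0 to 7 meaning "at least that many eighths of the block are free"; `to_entry` and `find_block` are hypothetical names, and the 4096-byte block size is taken from the earlier slide.

```python
import math

BLOCK_SIZE = 4096  # bytes per block


def to_entry(free_bytes):
    """Quantize a block's free bytes to a 3-bit value (eighths, rounded down)."""
    return min(7, free_bytes * 8 // BLOCK_SIZE)


def find_block(free_map, record_size):
    """Return the first block whose entry guarantees room for the record."""
    needed = math.ceil(8 * record_size / BLOCK_SIZE)  # eighths required
    for block_no, eighths in enumerate(free_map):
        if eighths >= needed:
            return block_no
    return None  # no block is known to have room; the file must grow
```

A block with 3072 free bytes gets the entry 6, which corresponds to the 6/8 = 75% on the slide. Because entries are rounded down, the map may occasionally skip a block that would have fit, but it should not point to one that does not.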

Storing Metadata? Use what you have! Relations!

https://en.wikipedia.org/wiki/Memory_hierarchy#/media/File:ComputerMemoryHierarchy.svg

Computation Happens Here

Buffer Management

Buffer

Database File … …

RAMDisk

read(Database File, Block n)

read(Database File, Block m)

Operation 1 accesses block

pin!

1

Operation 2 accesses block

pin!

2

read(Database File, Block x)

…Full!
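The pin counts and the "Full!" case can be sketched as a small buffer pool that only evicts blocks nobody has pinned. This is a minimal illustration under several assumptions: least-recently-used replacement is chosen here for simplicity, dirty blocks and write-back are ignored, and `BufferPool`, `pin`, and `unpin` are hypothetical names.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed number of in-memory frames in front of a block reader."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # callable: block_no -> bytes
        self.frames = OrderedDict()   # block_no -> [data, pin_count], LRU first

    def pin(self, block_no):
        """Return the block's data and mark it as in use."""
        if block_no in self.frames:
            self.frames.move_to_end(block_no)  # now most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_no] = [self.read_block(block_no), 0]
        frame = self.frames[block_no]
        frame[1] += 1
        return frame[0]

    def unpin(self, block_no):
        """Signal that one user of the block is done with it."""
        frame = self.frames[block_no]
        if frame[1] == 0:
            raise ValueError("block is not pinned")
        frame[1] -= 1

    def _evict(self):
        # Pick the least recently used block with no pins.
        victim = next((b for b, f in self.frames.items() if f[1] == 0), None)
        if victim is None:
            raise RuntimeError("buffer full: every block is pinned")
        del self.frames[victim]
```

With two operations pinning the same block, its pin count is 2, and the block becomes a candidate for eviction only after both have called `unpin`.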

A Modern Alternative

10101 Srinivasan Comp. Sci. 65000

12121 Wu Finance 90000

15151 Mozart Music 40000

10101 Srinivasan Comp. Sci. 65000

12121 Wu Finance 90000

15151 Mozart Music 40000

Row Storage

• Inserts and deletes for each tuple are fast
• Scans on a single attribute are slow

Column Storage

• Scans on a few attributes are fast
• Compression
• CPU cache performance

• Need to reconstruct the tuple
• Decompression cost
• Inserting a tuple is random access
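The difference between the two layouts can be sketched with plain Python lists. This is only an illustration of the access patterns, not of any real storage engine: the column names and the helpers `reconstruct` and `insert` are assumptions, and the three tuples are the instructor rows from the example.

```python
NAMES = ("id", "name", "dept", "salary")

# Row storage: each tuple is stored contiguously.
rows = [
    (10101, "Srinivasan", "Comp. Sci.", 65000),
    (12121, "Wu", "Finance", 90000),
    (15151, "Mozart", "Music", 40000),
]

# Column storage: one list per attribute, aligned by position.
columns = {name: [row[i] for row in rows] for i, name in enumerate(NAMES)}

# A scan on one attribute touches every whole tuple in the row layout ...
total_from_rows = sum(row[3] for row in rows)
# ... but only a single list in the column layout.
total_from_columns = sum(columns["salary"])


def reconstruct(columns, i):
    """Rebuild tuple i by visiting every column (the cost of column storage)."""
    return tuple(columns[name][i] for name in NAMES)


def insert(columns, row):
    """Inserting one tuple writes to every column, i.e. to scattered places."""
    for name, value in zip(NAMES, row):
        columns[name].append(value)
```

Both totals come out the same; what changes is how much unrelated data has to be read to compute them.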

On disk:

• ORC or Parquet format
• Popular for big data and distributed settings

Partition of the relation

In main memory:

• No buffer manager
• Possible due to:
  • heavy compression
  • larger (TBs) main memory

Partition of the relation

Questions?