File System Performance

CSE451 Andrew Whitaker

Ways to Improve Performance

Access the disk less
  Caching!

Be smarter about accessing the disk
  Turn small operations into large operations
  Turn scattered operations into sequential operations

Technique #1: Caching

Memory is MUCH faster than disk
So, cache whatever we can in memory
  File buffers
  i-nodes
  Directory entries (name => i-node)

Caching reads is a no-brainer
Caching writes is more interesting…

Caching Writes

Two options
  Synchronous: data is immediately written out to disk
    AKA: write-through
  Asynchronous: disk writes are delayed
    AKA: write-back

Programmer’s perspective: what does it mean when the “write” system call returns?
  With asynchronous writes, the data has not necessarily hit the disk
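
The two policies can be compared with a toy model. This is only a sketch: the Disk and BufferCache classes and the demo function are made up for illustration and do not correspond to any real kernel interface. It counts how many block writes reach the disk under each policy.

```python
class Disk:
    """Toy disk that counts how many block writes it receives."""

    def __init__(self):
        self.blocks = {}
        self.writes = 0

    def write_block(self, block_no, data):
        self.blocks[block_no] = data
        self.writes += 1


class BufferCache:
    """Toy buffer cache supporting both write policies (hypothetical)."""

    def __init__(self, disk, write_through):
        self.disk = disk
        self.write_through = write_through
        self.cache = {}     # block number -> latest data
        self.dirty = set()  # blocks changed in memory but not yet on disk

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_through:
            # Synchronous: the data reaches the disk before write() returns.
            self.disk.write_block(block_no, data)
        else:
            # Asynchronous: only remember that the block must be written later.
            self.dirty.add(block_no)

    def flush(self):
        # Push every dirty block to disk, in ascending block order.
        for block_no in sorted(self.dirty):
            self.disk.write_block(block_no, self.cache[block_no])
        self.dirty.clear()


def demo():
    """Return (disk writes before flush, disk writes after flush) per policy."""
    results = {}
    for name, policy in (("write-through", True), ("write-back", False)):
        disk = Disk()
        cache = BufferCache(disk, write_through=policy)
        for i in range(5):
            cache.write(7, f"version {i}")  # five updates to the same block
        before_flush = disk.writes
        cache.flush()
        results[name] = (before_flush, disk.writes)
    return results
```

In this model, anything still in the dirty set when the machine crashes is lost, which is the risk the write-back policy accepts.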

Why Use Asynchronous Writes?

Allows us to batch up multiple writes to the same block

Allows for better overlap of CPU and I/O
  CPU does not stall waiting for the disk

Allows the disk scheduler to make better decisions
  Application: write(a); write(b); write(c);
  Disk: write(b); write(a); write(c);

Most data updates in UNIX systems use asynchronous writes by default
  Programmer can override: fsync(fd);
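
A minimal sketch of the fsync override, using Python's os module as a thin wrapper over the write and fsync system calls. The helper name durable_write is made up for this example. Note that fsync asks the kernel to push the data to the device; whether the drive's own write cache has also been flushed depends on the system.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path and fsync before returning (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may still sit in the buffer cache.
        os.fsync(fd)
    finally:
        os.close(fd)


def example():
    # Usage: write a small file into a scratch directory and read it back.
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "example.txt")
        durable_write(path, b"hello, disk")
        with open(path, "rb") as f:
            return f.read()
```

The cost is that the caller now stalls for the disk on every call, which is exactly what asynchronous writes were meant to avoid.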

Problems with Asynchronous Writes

File system state can be lost during a crash
  Missing blocks, missing files, missing directories, storage leaks, etc.

For this reason, meta-data updates tend to be done synchronously
  File/directory creation or deletion

Consistency Problems

Problems still arise, even with synchronous meta-data updates

For example, file creation must modify an i-node and a directory entry
  Initialize the i-node
  Record the <fileName, i-node> mapping in the directory

Disks do not support atomic multi-block operations
  A crash can land between the two updates
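
The window between the two updates can be illustrated with a toy in-memory file system. Everything here (ToyFS, CrashError, the crash_after_inode flag) is invented for the sketch; a real crash is of course not an exception.

```python
class CrashError(Exception):
    """Stands in for a power failure in this toy model."""


class ToyFS:
    def __init__(self):
        self.inodes = {}     # i-node number -> metadata
        self.directory = {}  # file name -> i-node number
        self.next_ino = 1

    def create(self, name, crash_after_inode=False):
        ino = self.next_ino
        self.next_ino += 1
        # Step 1: initialize the i-node.
        self.inodes[ino] = {"links": 1, "blocks": []}
        if crash_after_inode:
            raise CrashError("power lost between the two updates")
        # Step 2: record the <fileName, i-node> mapping in the directory.
        self.directory[name] = ino
        return ino

    def orphaned_inodes(self):
        """I-nodes that are allocated but not named by any directory entry."""
        referenced = set(self.directory.values())
        return sorted(ino for ino in self.inodes if ino not in referenced)


def demo():
    fs = ToyFS()
    fs.create("a.txt")
    try:
        fs.create("b.txt", crash_after_inode=True)
    except CrashError:
        pass
    return fs
```

With this ordering, a crash leaks an i-node but never leaves a directory entry pointing at an uninitialized i-node. Doing the steps in the opposite order would risk the second, worse outcome.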

Dealing with Consistency Problems

Always keep the disk in a “safe” state
Run a recovery program (like fsck) on startup

i-check: File Consistency

Is each block on exactly one list?
  Create a bit vector with as many entries as there are blocks
  Follow the free list and each i-node block list
  When a block is encountered, examine its bit
    If the bit was 0, set it to 1
    If the bit was already 1
      • if the block is both in a file and on the free list, remove it from the free list and cross your fingers
      • if the block is in two files, call support!
  If there are any 0’s left at the end, put those blocks on the free list
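
The i-check pass above can be sketched as follows. The function name, argument shapes, and the choice to walk the files before the free list are assumptions made for the example, not a description of how fsck is actually implemented.

```python
def i_check(num_blocks, free_list, inode_blocks):
    """Return (repaired free list, blocks claimed by more than one file).

    free_list:    list of block numbers currently on the free list
    inode_blocks: dict mapping i-node number -> list of its block numbers
    """
    seen = [False] * num_blocks  # the bit vector, one entry per block
    in_two_files = set()

    # Walk every i-node's block list first (assumed order).
    for blocks in inode_blocks.values():
        for block in blocks:
            if seen[block]:
                in_two_files.add(block)  # "call support!"
            seen[block] = True

    # Walk the free list; a set bit means the block is already in a file
    # (or appeared earlier on the free list), so drop it from the free list.
    repaired = []
    for block in free_list:
        if seen[block]:
            continue
        seen[block] = True
        repaired.append(block)

    # Any block never encountered was leaked: put it on the free list.
    for block in range(num_blocks):
        if not seen[block]:
            repaired.append(block)

    return repaired, sorted(in_two_files)
```

For example, with 8 blocks, a free list of [5, 6, 2], and files {1: [0, 1, 2], 2: [3, 1]}, block 2 is removed from the free list, the leaked blocks 4 and 7 are added to it, and block 1 is reported as belonging to two files.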

d-check: Directory Consistency

Do the directories form a tree?
  Cycles are bad!

Does the link count of each file (i-node) equal the number of directory links to it?
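
Both d-check questions can be sketched in a few lines. The data layout is an assumption for the example: directories maps each directory i-node to its entries (without "." and ".."), and link_counts holds the stored link count of regular files only.

```python
def d_check(directories, link_counts, root):
    """Return (is_tree, bad_counts).

    directories: dict dir i-node -> dict of entry name -> child i-node
    link_counts: dict file i-node -> link count stored in the i-node
    bad_counts:  dict i-node -> (stored count, directory links actually found)
    """
    # Count the directory links that point at each i-node.
    refs = {}
    for entries in directories.values():
        for child in entries.values():
            refs[child] = refs.get(child, 0) + 1

    bad_counts = {
        ino: (stored, refs.get(ino, 0))
        for ino, stored in link_counts.items()
        if stored != refs.get(ino, 0)
    }

    # Tree check: starting at the root, every directory must be reached
    # exactly once. Reaching one twice means a cycle or a shared subdirectory.
    visited = set()
    is_tree = True
    stack = [root]
    while stack:
        current = stack.pop()
        if current in visited:
            is_tree = False
            continue
        visited.add(current)
        for child in directories[current].values():
            if child in directories:
                stack.append(child)

    # Directories that cannot be reached from the root also break the tree.
    is_tree = is_tree and visited == set(directories)
    return is_tree, bad_counts
```

A real checker would also have to decide how to repair what it finds, for example by overwriting the stored link count with the observed one.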

Technique #2: Better Data Layout

Recall basic file system structure:
  Meta-data: i-nodes, free block list
  Data: file data, directory data

[Figure: disk layout with a Metadata region followed by a Data region]

Note: i-nodes are far from the data blocks they describe

Cylinder groups
  Basic idea: group commonly accessed data and meta-data together
    This reduces seeks

Details:
  Disk is partitioned into groups of cylinders
  Data blocks from a file are all placed in the same cylinder group
  Files in same directory are placed in the same cylinder group
  i-node for file placed in same cylinder group as file’s data
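
The placement rules above can be sketched as a toy allocator. The class, its method names, and the two tie-breaking policies (new directories go to the emptiest group, files spill to the emptiest group when their directory's group is full) are assumptions for illustration only.

```python
class CylinderGroupAllocator:
    """Toy placement policy: keep a directory's files in one cylinder group."""

    def __init__(self, num_groups, blocks_per_group):
        self.free = [blocks_per_group] * num_groups  # free blocks per group
        self.dir_group = {}   # directory -> cylinder group
        self.file_group = {}  # (directory, file name) -> cylinder group

    def _emptiest_group(self):
        # Ties go to the lowest-numbered group.
        return max(range(len(self.free)), key=lambda g: self.free[g])

    def make_directory(self, directory):
        # Assumed policy: spread directories across groups.
        self.dir_group[directory] = self._emptiest_group()
        return self.dir_group[directory]

    def create_file(self, directory, name, num_blocks):
        # Preferred: the same group as the directory, so the i-node,
        # the data blocks, and sibling files are all close together.
        group = self.dir_group[directory]
        if self.free[group] < num_blocks:
            # Fallback when the preferred group is full: locality is lost.
            group = self._emptiest_group()
            if self.free[group] < num_blocks:
                raise OSError("no cylinder group has enough free blocks")
        self.free[group] -= num_blocks
        self.file_group[(directory, name)] = group
        return group
```

The fallback branch is where the scheme degrades: once a group fills up, files from one directory end up scattered across groups and the seeks return.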

Cylinder Group Analysis

+ Reduces or eliminates seeks for some common access patterns

- Does not address rotational delay
- Performance is workload dependent
- Performance degrades if cylinders become full
  - Partial solution: pro-actively reserve space

Log-structured File System

Let’s assume all reads are cached
  An iffy assumption, but let’s suspend disbelief

Q: How can we turn all writes into large, sequential writes?

Insight: this is possible if the location of data on disk can change

A Conventional File System

Files live at a fixed location
So, file system writes must use seeks
For example:
  Write to Christine.txt
  Write to Andrew.txt
  Write to Colin.txt

[Figure: disk with files at scattered fixed locations: Veneta.txt, Joel.txt, Colin.txt, Matt.txt, Andrew.txt, Nolan.txt, Bishop.txt, Christine.txt]

Log-structured File System

Use the disk as an append-only log
  All writes go at the end of the log
The location of a file changes over time
Old data is not over-written
  Until the file system becomes full

[Figure: log holding Christine.txt, Andrew.txt, Colin.txt, then Christine.txt again; arrow labeled "Log growth"]

LFS Details

Everything gets written to the log
  File data, i-nodes, directories

LFS tries to buffer many small writes into large segments
  Typically 512 KB or 1 MB
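
Segment buffering can be sketched as follows. SegmentWriter is a made-up name, and a Python list stands in for the disk; the point is only that many small writes turn into a few large ones.

```python
class SegmentWriter:
    """Toy buffer that turns small writes into full-segment writes."""

    def __init__(self, segment_size=512 * 1024):
        self.segment_size = segment_size
        self.buffer = bytearray()    # small writes accumulate here
        self.segments_written = []   # stands in for large sequential disk writes

    def write(self, data):
        self.buffer += data
        # Emit one large write each time a whole segment has accumulated.
        while len(self.buffer) >= self.segment_size:
            self.segments_written.append(bytes(self.buffer[:self.segment_size]))
            del self.buffer[:self.segment_size]
```

With a 1024-byte segment size, one hundred 100-byte writes produce nine segment writes, and 784 bytes remain buffered in memory. That remainder is still exposed to a crash, as with any asynchronous write.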

How Can This Possibly Work?

Q: If nothing lives at a fixed location, how do we find “the data”?

A: Add a layer of indirection: An i-node map
  Maps from i-node number to current location
  The map resides at a fixed location on disk
    NOT in the log!
  The map is cached in memory for performance
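
The indirection can be sketched with a toy log. ToyLFS and its fields are invented for the example; the log is a Python list and the i-node map is a dictionary kept outside it.

```python
class ToyLFS:
    """Toy log-structured store: an append-only log plus an i-node map."""

    def __init__(self):
        self.log = []   # append-only list of (i-node number, data) records
        self.imap = {}  # i-node number -> index of its newest record in the log

    def write(self, ino, data):
        # Never overwrite in place: append, then repoint the map.
        self.log.append((ino, data))
        self.imap[ino] = len(self.log) - 1

    def read(self, ino):
        # One lookup in the map finds the current location.
        _, data = self.log[self.imap[ino]]
        return data


def demo():
    fs = ToyLFS()
    fs.write(1, "christine v1")
    fs.write(2, "andrew")
    fs.write(3, "colin")
    fs.write(1, "christine v2")  # the file's location changes
    return fs
```

After demo(), reading i-node 1 returns the newer copy at the end of the log, while the old copy is still sitting at the start of the log as dead data.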

What Happens When the Disk Gets Full?

Partial solution: disk is managed in segments, which are threaded on disk
  Basically, a linked-list

But, this re-introduces seeks!

Segment Cleaner

Goal: make scattered segments contiguous again

Approach:
  Read a segment
  Write live data to the end of the log
  Presto: The segment is now clean

This is very expensive
  Each live byte is read and written
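
The cleaning approach above can be sketched as a single function. The representation is an assumption for the example: segments is a list of lists of (i-node number, data) records, imap maps each i-node number to the (segment, slot) of its newest copy, and the victim segment is assumed not to be the segment at the tail of the log.

```python
def clean_segment(segments, imap, victim, capacity):
    """Copy live records out of segments[victim]; return how many were copied."""
    copied = 0
    for slot, (ino, data) in enumerate(segments[victim]):
        if imap.get(ino) != (victim, slot):
            continue  # dead: the map points at a newer copy elsewhere
        if len(segments[-1]) >= capacity:
            segments.append([])  # tail segment is full, start a fresh one
        # Live data is rewritten at the end of the log and the map repointed.
        segments[-1].append((ino, data))
        imap[ino] = (len(segments) - 1, len(segments[-1]) - 1)
        copied += 1
    segments[victim] = []  # presto: the whole segment is free again
    return copied
```

The return value is the cost: every live record in the victim is read once and written once, so cleaning a mostly live segment does a lot of I/O to recover very little space.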

LFS Analysis

For reads, LFS and a traditional FS are largely equivalent

LFS has better performance for small writes and meta-data operations

The LFS cleaner has a large impact on performance
  How important is this?

LFS in Practice

LFS is implemented, but not widely used
Reasons?
  Assumptions about read behavior were not valid
    Reads have not gone away
  Performance improvements were not sufficient to offset increased complexity, higher variability

LFS comeback?
  See Jim Gray’s article

