Extensible File Systems, Yong Yao, CS614, May 1st, 2001.



Problems of file systems

Evolution of filing services is slow

Many innovations have been proposed, but few have become widely available

Why?

File systems are large and difficult to implement

Key part of an operating system

Interact with other core services

No well-defined interface for introducing new services easily

Interfaces vary from system to system

Benefits of an extensible interface

Compare with the way additional services are obtained at the user level

More services become available, at a far faster rate

Motivation

Modularity

Sun Network File System

A system for accessing remote files across LANs

Goal: Allow some degree of sharing among a set of independent file systems.

NFS Architecture

Three major layers:

Unix file-system interface

VFS layer

NFS protocol layer

But if the interface is evolving:

Compatibility problems arise – a change to the interface requires changes to every existing filing service

If the interface is kept static:

It is difficult to provide new services cleanly

Desirable characteristics

Extensibility: filing must be robust to both internal and external change – change management

Stacking: add new functionality to existing services – modularity

Coherence: data is consistent across multiple layers – successful design

Stacking and Extensibility

A conflict between the two characteristics

Stacking: a layer is bounded (above and below) by the same interface

Extensibility: a layer is robust to change, so the interface it shares with its neighbors must be allowed to grow

Stackable Design

Decompose a complex filing service into several layers

Each layer can be developed independently

Decomposition

Make individual components as reusable as possible

Each encompasses a single abstraction

Example: disk partition, file service, directory service

Possible division

Physical storage management

Directory services

Compression and decompression

Encryption and decryption

Cache management

Remote-access services

Replication

Layer substitution

Support the evolution and replacement of layers

Symmetric interface

Construct complex filing services from a number of simple layers

Interfaces are identical above and below a layer

Compare with shell programming:

The pipe mechanism provides a simple byte-stream interface between programs

ls -l | wc
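The symmetric interface can be illustrated with a minimal sketch. It assumes a toy interface of just read and write; the class names (MemoryStore, Compress, XorCrypt) are hypothetical and are not taken from the slides. Because every layer offers the same interface it consumes, layers compose in any order, much like programs in a pipeline.

```python
import zlib


class MemoryStore:
    """Bottom layer: keeps file contents in a dict (a stand-in for a disk)."""

    def __init__(self):
        self.files = {}

    def write(self, name, data):
        self.files[name] = data

    def read(self, name):
        return self.files[name]


class Compress:
    """Compression layer: same read/write interface above and below."""

    def __init__(self, lower):
        self.lower = lower

    def write(self, name, data):
        self.lower.write(name, zlib.compress(data))

    def read(self, name):
        return zlib.decompress(self.lower.read(name))


class XorCrypt:
    """Toy 'encryption' layer (XOR with one byte; illustrative only, not secure)."""

    def __init__(self, lower, key=0x5A):
        self.lower = lower
        self.key = key

    def write(self, name, data):
        self.lower.write(name, bytes(b ^ self.key for b in data))

    def read(self, name):
        return bytes(b ^ self.key for b in self.lower.read(name))


# Stack the layers: encryption on top of compression on top of storage.
store = MemoryStore()
stack = XorCrypt(Compress(store))
stack.write("notes.txt", b"hello " * 10)
print(stack.read("notes.txt") == b"hello " * 10)
```

Swapping the order to Compress(XorCrypt(store)) also works, which is the point of keeping the interface identical on both sides of a layer.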

Bypass routine

The interface is evolving, and any layer can add new operations

NFS: a routine for each operation

Default routine: pass unknown operations to a lower layer

Must handle a variety of arguments

Metadata
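A bypass routine might be sketched as follows. This is only an illustration in Python, assuming that attribute lookup can stand in for the kernel's operation dispatch; Layer, LoggingLayer, BaseFS and set_acl are hypothetical names. The layer implements only the operations it cares about and forwards everything else, including operations invented after it was written, to the layer below.

```python
class BaseFS:
    """Bottom layer with two classic operations and one 'new' one."""

    def __init__(self):
        self.files = {}

    def read(self, name):
        return self.files.get(name, b"")

    def write(self, name, data):
        self.files[name] = data

    def set_acl(self, name, acl):
        # A newer operation that upper layers were never told about.
        return f"acl for {name} set to {acl}"


class Layer:
    """Generic layer whose default routine bypasses unknown operations."""

    def __init__(self, lower):
        self.lower = lower

    def __getattr__(self, op):
        # Python calls __getattr__ only when normal lookup fails, that is,
        # when this layer does not implement the operation itself.
        def bypass(*args, **kwargs):
            return getattr(self.lower, op)(*args, **kwargs)

        return bypass


class LoggingLayer(Layer):
    """Overrides write only; all other operations fall through."""

    def __init__(self, lower):
        super().__init__(lower)
        self.log = []

    def write(self, name, data):
        self.log.append(("write", name))
        self.lower.write(name, data)


stack = LoggingLayer(Layer(BaseFS()))
stack.write("a", b"x")
print(stack.read("a"), stack.set_acl("a", "rw"))
```

Passing *args and **kwargs through unchanged is what lets one routine cope with the variety of arguments that different operations take.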

Nonlinear Stacking

Layers need not form a strict linear stack

Fan-in

Fan-out

Address-space independence

Layers can execute in different address spaces or even on different machines

Distributed file system

Transport Layer

A stackable layer that transfers operations from one address space to another

NFS: only a fixed set of operations, not extensible

Extend NFS to bypass new operations
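A transport layer that forwards arbitrary operations can be sketched roughly as below. The assumptions are loud ones: JSON is used as the wire format, the "network" is a plain function call, and all names (TransportLayer, serve, RemoteStore, checksum) are hypothetical. The point is only that the transport marshals the operation name and arguments generically, so a new operation needs no change to the transport.

```python
import json


class RemoteStore:
    """Layer living in the 'other' address space."""

    def __init__(self):
        self.files = {}

    def write(self, name, text):
        self.files[name] = text
        return len(text)

    def read(self, name):
        return self.files[name]

    def checksum(self, name):
        # A newer operation the transport was never specifically taught.
        return sum(self.files[name].encode()) % 256


def serve(target, request_bytes):
    """Server side: unmarshal the request, dispatch it, marshal the reply."""
    request = json.loads(request_bytes)
    method = getattr(target, request["op"])
    result = method(*request["args"])
    return json.dumps({"result": result}).encode()


class TransportLayer:
    """Client side: turns any operation into a message and sends it."""

    def __init__(self, send):
        self.send = send  # function taking request bytes, returning reply bytes

    def __getattr__(self, op):
        def call(*args):
            request = json.dumps({"op": op, "args": list(args)}).encode()
            return json.loads(self.send(request))["result"]

        return call


remote = RemoteStore()
client = TransportLayer(lambda request: serve(remote, request))
client.write("f", "abc")
print(client.read("f"), client.checksum("f"))
```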

Example: replication layer of Ficus

Ficus: a distributed file system developed at UCLA

Provides a large-scale replication service

Logical layer: presents a single-copy, highly available file

Physical layer: implements the concept of a file replica
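The split between a logical and a physical layer is an instance of fan-out, and can be sketched as below. This is not Ficus's actual replication protocol; it is a deliberately simplified illustration that assumes version numbers and a "newest reachable copy wins" read rule, with hypothetical names Replica and LogicalLayer.

```python
class Replica:
    """Physical layer: one stored copy of each file, possibly unreachable."""

    def __init__(self):
        self.data = {}  # name -> (version, contents)
        self.up = True


class LogicalLayer:
    """Logical layer: presents one file, fans out to the reachable replicas."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, name, contents):
        live = [r for r in self.replicas if r.up]
        if not live:
            raise IOError("no replica reachable")
        newest = max(r.data.get(name, (0, None))[0] for r in live)
        for r in live:
            r.data[name] = (newest + 1, contents)

    def read(self, name):
        copies = [r.data[name] for r in self.replicas if r.up and name in r.data]
        if not copies:
            raise FileNotFoundError(name)
        # Pick the copy with the highest version among reachable replicas.
        return max(copies, key=lambda copy: copy[0])[1]


a, b = Replica(), Replica()
f = LogicalLayer([a, b])
f.write("x", "v1")
a.up = False          # one replica becomes unreachable
f.write("x", "v2")    # the file stays writable
a.up = True
print(f.read("x"))
```

A replica that missed an update holds a stale copy until something brings it up to date; the sketch leaves that reconciliation step out.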

Interposition

Layers are interposed between existing users of the stack and the old stack top

Useful when operations must happen at run time

Implementation

Existing File-System Interfaces

Vnode: individual files

Vfs: subtree

Mounting

Delayed binding

Interface Extensibility

Vnode interface: fixes the formal definition of all operations before kernel compilation

UCLA interface: keeps all interface definitions open until execution, then constructs the interface dynamically

Each file system provides a list of all its operations

Take the union of these operations

Customized to each file system
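The three steps above can be sketched as follows, under the assumption that each file system declares its operations as a name-to-function table. build_interface and default_bypass are hypothetical names, and a real kernel would build operation vectors in C rather than Python lists.

```python
def default_bypass(*args):
    """Placeholder for the default routine used for unimplemented operations."""
    return "bypassed"


def build_interface(declared):
    """declared maps a file-system name to its {operation name: function} table.

    Returns (slot, vectors): slot gives each operation an index in the union
    of all declared operations, and vectors holds one operation vector per
    file system, customized with that file system's own routines.
    """
    all_ops = sorted(set().union(*(ops.keys() for ops in declared.values())))
    slot = {op: index for index, op in enumerate(all_ops)}
    vectors = {
        fs: [ops.get(op, default_bypass) for op in all_ops]
        for fs, ops in declared.items()
    }
    return slot, vectors


ufs = {"read": lambda: "ufs-read", "write": lambda: "ufs-write"}
crypt = {"read": lambda: "crypt-read", "set_key": lambda: "crypt-set_key"}
slot, vectors = build_interface({"ufs": ufs, "crypt": crypt})

# ufs never declared set_key, so its vector holds the default routine there.
print(vectors["ufs"][slot["set_key"]]())
```

Since the union is computed at start-up, adding a file system with a new operation enlarges every vector without recompiling the others.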

Stack Creation

Mounting is used for layer construction

Instantiate a new UFS from /layer/ufs/crypt.raw

Create an encryption layer (/usr/data) on top of the lower layer (/layer/ufs/crypt.raw)

Stack Data Caching

Individual stack layers cache data to improve performance

A cache manager coordinates page caching to keep the cached data consistent

Page-naming policy: [stack identifier, file offset]
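One way to picture the cache manager is the sketch below. It assumes a single-owner rule (at most one layer caches a given page at a time), which is an illustrative policy rather than something the slides specify; LayerCache and CacheManager are hypothetical names. Pages are named by the pair (stack identifier, file offset), so every layer in a stack refers to the same page by the same name.

```python
class LayerCache:
    """Per-layer page cache, keyed by (stack_id, offset)."""

    def __init__(self, name):
        self.name = name
        self.pages = {}


class CacheManager:
    """Tracks which layer currently caches each named page."""

    def __init__(self):
        self.owner = {}  # (stack_id, offset) -> LayerCache

    def acquire(self, layer, stack_id, offset, data):
        page = (stack_id, offset)
        previous = self.owner.get(page)
        if previous is not None and previous is not layer:
            # Another layer holds this page: drop its copy so that only one
            # cached version of the page exists in the stack.
            previous.pages.pop(page, None)
        self.owner[page] = layer
        layer.pages[page] = data
        return page


manager = CacheManager()
crypt_cache, ufs_cache = LayerCache("crypt"), LayerCache("ufs")
manager.acquire(crypt_cache, 7, 0, b"plaintext page")
manager.acquire(ufs_cache, 7, 0, b"ciphertext page")
print(sorted(crypt_cache.pages), sorted(ufs_cache.pages))
```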

Performance

Layer overhead compared to monolithic file system

Facilitate file-system development

Compatibility problems

Layer Performance

Interface performance

Compare a kernel supporting the UCLA interface with a standard kernel

Multiple-layer performance

Null layers: forward all operations to the next layer of the stack

Object-Orientation and Stacking

Strong parallels exist between object-oriented design and stacking:

Data encapsulation

Late binding vs. run-time stack configuration

Inheritance vs. bypass routine

Conclusion

Focus: Improve the file-system development process

Stacking: build new services on old system

Extensible interface: add new services without invalidating existing work

The Echo Distributed File System

Andrew D. Birrell, Andy Hisgen, Chuck Jerian, Timothy Mann, and Garret Swart

Motivation

Virtues of centralized systems:

Easy sharing of data

Centralized administration

Simple failure model

Virtues of distributed systems:

Fault tolerance

Scalability

Properties

Global naming

Global access

Global security

Fault tolerance

Global scale

Global naming

Each file has a name that has the same meaning to every user.

Counterexample, NFS: the same directory can have different names on different machines

unix2: /usr/shared/dir1

unix1: /usr/local/dir1

[Figure: directory trees of unix1 and unix2; dir1 appears as /usr/local/dir1 on unix1 and as /usr/shared/dir1 on unix2]

Echo name space

Single global root

Arc is independent of the node

Volume: a subtree of the name space

Junction: a leaf in one volume that points to a further volume

Volume classes

Global root volume: DNS provides world-wide service

Name service volume: Echo name server

Echo filestore volume: stores most of the files and directories

Example: Resolve /-/com/dec/src/x/y/p/q
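Resolution across volumes can be sketched as a walk that hops to a new volume whenever it reaches a junction. The sketch models volumes as nested dictionaries, and the places where the example path crosses from one volume to the next (after src and after y) are assumptions made for illustration, not taken from the slides. Junction and resolve are hypothetical names.

```python
class Junction:
    """A leaf in one volume that points to the root of a further volume."""

    def __init__(self, volume):
        self.volume = volume


def resolve(global_root, path):
    """Walk path from the global root, following junctions between volumes.

    Returns (object found, list of components at which a junction was crossed).
    """
    parts = path.split("/")
    if parts[:2] != ["", "-"]:
        raise ValueError("expected a path starting at the global root /-/")
    node, crossed = global_root, []
    for part in parts[2:]:
        node = node[part]
        if isinstance(node, Junction):
            crossed.append(part)
            node = node.volume
    return node, crossed


# Hypothetical layout: global root volume -> name service volume -> filestore volume.
filestore = {"p": {"q": "file contents"}}
name_service = {"x": {"y": Junction(filestore)}}
global_root = {"com": {"dec": {"src": Junction(name_service)}}}

print(resolve(global_root, "/-/com/dec/src/x/y/p/q"))
```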

Why two kinds of volumes?

Trade-off between consistency and availability

High level: changes quite rarely, so availability is more important

Low level: changes rapidly, so consistency matters more

Use different algorithms

Global Access

Goal: clients see the same data

Echo file servers use a replication scheme with tight consistency

Key point: reduce the load on servers

Write-behind cache policy with token management
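The interplay of write-behind caching and tokens can be sketched as below. This is a simplified illustration, not Echo's actual protocol: it assumes one exclusive token per file covering both reads and writes, and the names Server, Client, acquire and recall are hypothetical. A client holding the token writes only to its own cache; the server recalls the token, forcing a flush, when another client asks for the same file.

```python
class Server:
    def __init__(self):
        self.files = {}
        self.token_holder = {}  # file name -> Client holding the token

    def acquire(self, client, name):
        holder = self.token_holder.get(name)
        if holder is not None and holder is not client:
            holder.recall(name)  # current holder flushes and drops its copy
        self.token_holder[name] = client
        return self.files.get(name, "")


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()

    def _ensure_token(self, name):
        # In this sketch, having the file cached implies holding its token.
        if name not in self.cache:
            self.cache[name] = self.server.acquire(self, name)

    def write(self, name, data):
        self._ensure_token(name)
        self.cache[name] = data
        self.dirty.add(name)  # write-behind: the server is not updated yet

    def read(self, name):
        self._ensure_token(name)
        return self.cache[name]

    def recall(self, name):
        if name in self.dirty:
            self.server.files[name] = self.cache[name]
            self.dirty.discard(name)
        self.cache.pop(name, None)


server = Server()
a, b = Client(server), Client(server)
a.write("f", "v1")   # stays in a's cache; no server write yet
print(b.read("f"))   # token recall makes a flush, so b sees a's data
```

Repeated writes by the same client cost the server nothing until someone else needs the file, which is how the scheme reduces server load while clients still see the same data.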

Fault Tolerance

Replicate data storage to improve reliability – two copies are enough

Replicate storage access paths (name service volumes) to improve availability

Elect a primary by a majority voting scheme
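Majority voting can be sketched in a few lines. The sketch assumes each reachable replica casts one vote and that a candidate must win a strict majority of all replicas, reachable or not; elect_primary and the replica names are hypothetical, and the details of Echo's own election are not shown on the slides.

```python
from collections import Counter


def elect_primary(votes, replica_count):
    """votes maps each reachable voter to the candidate it votes for.

    Returns the candidate backed by a strict majority of all replicas,
    or None if nobody reaches a majority (for example during a partition).
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > replica_count // 2 else None


# Three replicas, one unreachable: the remaining two can still elect a primary.
print(elect_primary({"s1": "s1", "s2": "s1"}, replica_count=3))
# A lone replica cannot, which prevents two primaries in a partitioned network.
print(elect_primary({"s3": "s3"}, replica_count=3))
```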

Overview

Performance was better than that of most NFS implementations

Global naming was satisfactory

Reasonable availability