
Module 4: Distributed File & Name Systems


Part 1: Distributed File Systems


File Systems

File system: the operating system's interface to disk storage

File system attributes (metadata):
File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Owner
File type
Access control list


Operations on Unix File System

filedes = open(name, mode): opens an existing file with the given name.
filedes = creat(name, mode): creates a new file with the given name.
Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.

status = close(filedes): closes the open file filedes.

count = read(filedes, buffer, n): transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n): transfers n bytes to the file referenced by filedes from buffer.
Both operations deliver the number of bytes actually transferred and advance the read-write pointer.

pos = lseek(filedes, offset, whence): moves the read-write pointer to offset (relative or absolute, depending on whence).

status = unlink(name): removes the file name from the directory structure. If the file has no other names, it is deleted.

status = link(name1, name2): adds a new name (name2) for a file (name1).

status = stat(name, buffer): gets the file attributes for file name into buffer.
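For concreteness, the same operations can be exercised from Python's os module, which wraps these Unix system calls; the file names below are placeholders for illustration.

```python
import os

# creat(name, mode) is equivalent to open with O_CREAT | O_WRONLY | O_TRUNC.
fd = os.open("example.dat", os.O_RDWR | os.O_CREAT, 0o644)

os.write(fd, b"hello world")        # advances the read-write pointer
os.lseek(fd, 0, os.SEEK_SET)        # move the pointer back to the start (whence = absolute)
data = os.read(fd, 5)               # reads b"hello"

info = os.stat("example.dat")       # file attributes (length, timestamps, owner, ...)
print(info.st_size, info.st_mtime)

os.link("example.dat", "alias.dat") # add a second name for the same file
os.close(fd)
os.unlink("alias.dat")              # remove one name; the file persists under the other
os.unlink("example.dat")            # last name removed, so the file is deleted
```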


Distributed File System

A file system emulating non-distributed file system behaviour on a physically distributed set of files, usually within an intranet.

Requirements

Transparency
Access transparency: hide the distributed nature of the file system by providing a single service interface for local and distributed files
programs written for a non-distributed file system should work without major adjustments on a distributed file system
Location transparency: uniform, location-independent name space


Requirements (Cont)

Transparency
Mobility transparency: file specifications remain invariant if a file is physically moved to a different location within the DFS
Performance transparency: a load increase within normal bounds should allow a client to continue to receive satisfactory performance
Scaling transparency: expansion by incremental growth

Allow concurrent access
Allow file replication
Tolerate hardware and operating system heterogeneity

Security
Access control
User authentication


Requirements (Cont)

Fault tolerance: continue to provide correct service in the presence of communication or server faults
at-most-once semantics for file operations
at-least-once semantics for idempotent file operations
replication (stateless, so that servers can be restarted after failure)

Consistency: one-copy update semantics
all clients see the contents of a file identically, as if only one copy of the file existed
if caching is used: after an update operation, no program can observe a discrepancy between data in the cache and stored data

Efficiency
latency of file accesses
scalability (e.g., with an increase in the number of concurrent users)


Architecture

Flat file service
performs file operations
uses "unique file identifiers" (UFIDs) to refer to files

Flat file service interface
RPC-based interface for performing file operations
not normally used by application-level programs

[Figure: file service architecture. Application programs and the client module run on the client computer; the flat file service and the directory service run on the server computer.]


Architecture (Cont)

Directory service: maps UFIDs to "text" file names, and vice versa

Client module: provides the API for file operations available to the application program

[Figure: the same architecture as before, with the client module on the client computer and the flat file service and directory service on the server computer.]


Architecture (Cont): Flat File Service Interface

Comparison with Unix:
every operation can be performed immediately; Unix maintains a file pointer, and reads and writes start at the file pointer location
advantages: fault tolerance
with the exception of Create, all operations are idempotent
can be implemented as a stateless, replicated server
(a sketch of this interface follows below)

Read(FileId, i, n) -> Data, throws BadPosition: if 1 ≤ i ≤ Length(File), reads a sequence of up to n items from the file starting at item i and returns it in Data.
Write(FileId, i, Data), throws BadPosition: if 1 ≤ i ≤ Length(File)+1, writes a sequence of Data to the file, starting at item i, extending the file if necessary.
Create() -> FileId: creates a new file of length 0 and delivers a UFID for it.
Delete(FileId): removes the file from the file store.
GetAttributes(FileId) -> Attr: returns the file attributes for the file.
SetAttributes(FileId, Attr): sets the file attributes (only those attributes that clients are permitted to update, not those maintained by the file service).
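A minimal sketch of why a flat file service of this kind can be stateless and, except for Create, idempotent: the client supplies the position with every Read and Write, so the server keeps no per-client file pointer and repeating a request leaves the same result. The class and its in-memory storage are assumptions for illustration, not the actual service implementation.

```python
import uuid

class FlatFileService:
    """Toy stateless flat file service; items are bytes, positions are 1-based."""

    def __init__(self):
        self.store = {}                      # UFID -> bytearray

    def create(self):
        ufid = uuid.uuid4().hex              # stand-in for a UFID
        self.store[ufid] = bytearray()
        return ufid

    def read(self, ufid, i, n):
        data = self.store[ufid]
        if not 1 <= i <= len(data):
            raise ValueError("BadPosition")
        return bytes(data[i - 1:i - 1 + n])  # idempotent: same i, n -> same result

    def write(self, ufid, i, items):
        data = self.store[ufid]
        if not 1 <= i <= len(data) + 1:
            raise ValueError("BadPosition")
        end = i - 1 + len(items)
        data.extend(b"\x00" * max(0, end - len(data)))  # extend the file if necessary
        data[i - 1:end] = items              # repeating the same write changes nothing

    def delete(self, ufid):
        del self.store[ufid]
```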


Architecture (Cont): Access Control

performed at the server in a DFS, since requests are usually transmitted via unprotected RPC calls
mechanisms:
access check when mapping a file name to a UFID, returning a cryptographic "capability" to the requester, who uses it for subsequent requests
access check at the server with every file system operation

Hierarchical File System
files organized in trees, referenced by pathname + filename

File Groups
groups of files that can be moved between servers; a file cannot change its group membership
in Unix: a filesystem
identification must be unique in the network: constructed from the IP address of the creating host and the date of creation


Sun Network File System

[Figure: NFS architecture. On the client computer, application programs issue system calls to the UNIX kernel; a virtual file system layer dispatches local requests to the UNIX file system (or other file systems) and remote requests to the NFS client. The NFS client communicates with the NFS server on the server computer via the NFS protocol, and the NFS server accesses the server's UNIX file system through its own virtual file system layer.]


Architecture of NFS version 3

Access transparency
no distinction between local and remote files
the virtual file system keeps track of locally and remotely available file systems

File identifiers: file handles, consisting of (sketched below)
a filesystem identifier (unique number allocated at creation time)
an i-node number
an i-node generation number (because i-node numbers are reused)
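A small sketch of the three components that make up an NFS file handle as described above; the field names are illustrative rather than the on-the-wire NFS encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    fsid: int        # filesystem identifier, allocated when the filesystem is created
    inode: int       # i-node number within that filesystem
    generation: int  # i-node generation number, bumped when the i-node is reused

# Two handles with the same fsid and inode but different generation numbers
# refer to different files: the old handle becomes stale once the i-node is reused.
old = FileHandle(fsid=7, inode=1234, generation=1)
new = FileHandle(fsid=7, inode=1234, generation=2)
assert old != new
```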

Selected NFS operations - I

lookup(dirfh, name) -> fh, attr: returns the file handle and attributes for the file name in the directory dirfh.
create(dirfh, name, attr) -> newfh, attr: creates a new file name in directory dirfh with attributes attr and returns the new file handle and attributes.
remove(dirfh, name) -> status: removes file name from directory dirfh.
getattr(fh) -> attr: returns the file attributes of file fh (similar to the UNIX stat system call).
setattr(fh, attr) -> attr: sets the attributes (mode, user id, group id, size, access time and modify time) of a file. Setting the size to 0 truncates the file.


Selected NFS operations - II

read(fh, offset, count) -> attr, data: returns up to count bytes of data from a file starting at offset. Also returns the latest attributes of the file.
write(fh, offset, count, data) -> attr: writes count bytes of data to a file starting at offset. Returns the attributes of the file after the write has taken place.
rename(dirfh, name, todirfh, toname) -> status: changes the name of file name in directory dirfh to toname in directory todirfh.
link(newdirfh, newname, dirfh, name) -> status: creates an entry newname in the directory newdirfh which refers to the file name in the directory dirfh.
symlink(newdirfh, newname, string) -> status: creates an entry newname in the directory newdirfh of type symbolic link with the value string. The server does not interpret the string but makes a symbolic link file to hold it.
readlink(fh) -> string: returns the string that is associated with the symbolic link file identified by fh.
mkdir(dirfh, name, attr) -> newfh, attr: creates a new directory name with attributes attr and returns the new file handle and attributes.
rmdir(dirfh, name) -> status: removes the empty directory name from the parent directory dirfh. Fails if the directory is not empty.


Selected NFS operations - III

readdir(dirfh, cookie, count) -> entries: returns up to count bytes of directory entries from the directory dirfh. Each entry contains a file name, a file handle, and an opaque pointer to the next directory entry, called a cookie. The cookie is used in subsequent readdir calls to start reading from the following entry. If the value of cookie is 0, reads from the first entry in the directory.
statfs(fh) -> fsstats: returns file system information (such as block size, number of free blocks and so on) for the file system containing the file fh.


Access Control / Authentication

NFS requests are transmitted via Remote Procedure Calls (RPCs)
clients send authentication information (user / group IDs)
checked against access permissions in the file attributes

Potential security loophole
any client may address RPC requests to the server providing another client's identification information
led to the introduction of security mechanisms in NFS:
DES encryption of user identification information
Kerberos authentication


Mounting of File Systems

Making remote file systems available to a local client, specifying the remote host name and pathname

Mount protocol (RPC-based)
returns the file handle for the directory name given in the request
the location (IP address and port number) and file handle are passed to the virtual file system and the NFS client

Hard-mounted (mostly used in practice)
the user-level process is suspended until the operation is completed
the application may not terminate gracefully in failure situations

Soft-mounted
an error message is returned by the NFS client module to the user-level process after a small number of retries


Mounting Example

[Figure: mounting example. The client's local tree (root, with vmunix, usr, ...) has two remote mounts: /usr/students is the sub-tree /export/people on Server 1 (containing big, jon, bob), and /usr/staff is the sub-tree /nfs/users on Server 2 (containing jim, ann, jane, joe).]

Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.


Caching in Sun NFS

Server caching
disk caching as in non-networked file systems
read operations: unproblematic
write operations: consistency problems

write-through caching
store updated data in the cache and on disk before sending the reply to the client
relatively inefficient if frequent write operations occur

commit operation
caching only in cache memory
write back to disk only when a commit operation for the file is received

Caching in both server and client is indispensable to achieve the necessary performance.


Caching in Sun NFS (Cont): Client caching

caching of the results of read, write, getattr, lookup and readdir operations
potential inconsistency: the data cached in a client may not be identical to the same data stored on the server
a timestamp-based scheme is used to poll the server about the freshness of a data object (presumes synchronized global time, e.g., through NTP)

Tc: the time the cache entry was last validated
Tm_client / Tm_server: the time when the block was last modified at the server, as recorded by the client / the server
t: the freshness interval


Caching in Sun NFS (Cont): Client caching

Freshness condition: (T - Tc < t) or (Tm_client = Tm_server)

if (T - Tc < t), which can be determined without server access, the entry is presumed to be valid
if not (T - Tc < t), then Tm_server needs to be obtained by a getattr call
if Tm_client = Tm_server, the data is presumed valid; otherwise obtain the data from the server and update the cached file
(a sketch of this check follows below)

Note: the scheme does not guarantee consistency, since recent updates may be invisible; one-copy update semantics is only approximated.
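A sketch of the validity check just described, assuming a simple per-block cache entry; getattr_from_server and fetch_from_server stand in for the corresponding NFS calls.

```python
import time

class CacheEntry:
    def __init__(self, data, tm_client, tc):
        self.data = data            # cached block contents
        self.tm_client = tm_client  # last modification time at the server, as recorded by the client
        self.tc = tc                # time this entry was last validated

def read_cached(entry, t, getattr_from_server, fetch_from_server):
    """Return valid cached data, revalidating or refetching when necessary."""
    now = time.time()
    if now - entry.tc < t:                      # (T - Tc < t): no server contact needed
        return entry.data
    tm_server = getattr_from_server()           # otherwise ask the server (getattr call)
    if entry.tm_client == tm_server:            # unchanged at the server: still valid
        entry.tc = now
        return entry.data
    entry.data = fetch_from_server()            # changed: refetch and update the entry
    entry.tm_client = tm_server
    entry.tc = now
    return entry.data
```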


Caching in Sun NFS (Cont): Client caching

performance factor: how to reduce server traffic, in particular for getattr
on receipt of Tm_server, update all Tm_client values related to data objects derived from the same file
piggyback current attribute values on the results of every file operation
adaptive algorithm for t:
t too short: many server requests
t too large: increased chance of inconsistencies
typical values: 3 to 30 secs for files, 30 to 60 secs for directories
in Solaris, t is adjusted according to the frequency of file updates


Caching in Sun NFS (Cont): Client caching - write operations

mark the modified cache page as "dirty" and schedule the page to be flushed to the server (asynchronously)
the flush happens when the file is closed, when sync is issued, or when an asynchronous block input-output (bio) daemon is used and active
on read: read-ahead (when a read occurs, the bio daemon requests the next file block from the server)
on write: the bio daemon sends the block asynchronously to the server

bio daemons are a performance measure, reducing the probability that the client blocks waiting for
read operations to return, or
write operations to be committed at the server
(a sketch of the dirty-page bookkeeping follows below)
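A sketch of the dirty-page bookkeeping implied above; send_to_server is a placeholder for the asynchronous write that a bio daemon, or the flush on close/sync, would issue.

```python
class CachedFile:
    def __init__(self, send_to_server):
        self.pages = {}            # page number -> bytes
        self.dirty = set()         # pages modified locally but not yet sent to the server
        self.send_to_server = send_to_server

    def write_page(self, pageno, data):
        self.pages[pageno] = data
        self.dirty.add(pageno)     # mark "dirty"; flushing happens later, asynchronously

    def flush(self):
        """Called on close of the file, on sync, or by a bio daemon."""
        for pageno in sorted(self.dirty):
            self.send_to_server(pageno, self.pages[pageno])
        self.dirty.clear()
```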


Andrew File System (AFS)

Started as a joint effort of Carnegie Mellon University and IBM
Today the basis for DCE/DFS, the distributed file system included in the Open Software Foundation's Distributed Computing Environment

Some UNIX file system usage observations, as pertaining to caching:
infrequently updated shared files and local user files remain valid for long periods of time (the latter because they are updated on their owners' workstations)
allocate a large local disk cache, e.g., 100 MByte, that can provide a large enough working set for all files of one user, such that a file is still in this cache when it is used next


Andrew File System (AFS)

Some UNIX file system usage observations, as pertaining to caching (continued)

Assumptions about typical file accesses (based on empirical evidence):
usually small files, less than 10 Kbytes
reads much more common than writes (approx. 6:1)
usually sequential access; random access not frequently found
user locality: most files are used by only one user
burstiness of file references: once a file has been used, it will be used again in the near future with high probability

Design decisions for AFS:
whole-file serving: the entire contents of directories and files are transferred from server to client (AFS-3: in chunks of 64 Kbytes)
whole-file caching: when a file is transferred to a client, it is stored on that client's local disk


AFS Architecture: Venus, Network and Vice

[Figure: AFS architecture. Each workstation runs user programs and a Venus process on top of the UNIX kernel; servers run Vice processes on top of the UNIX kernel; workstations and servers are connected by the network.]


File Name Space

[Figure: the file name space seen by clients of AFS. The local name space (/ with tmp, bin, vmunix, ...) is combined with the shared space under /cmu; local entries such as bin are symbolic links into the shared space.]


AFS System Call Interception: handling by Venus

[Figure: on the workstation, a user program issues UNIX file system calls to the UNIX kernel; non-local file operations are passed to Venus, which uses the local disk as its cache and the local UNIX file system for storage.]


Implementation of System Calls: callbacks and callback promises

open(FileName, mode):
UNIX kernel: if FileName refers to a file in the shared file space, pass the request to Venus; open the local file and return the file descriptor to the application.
Venus: check the list of files in the local cache. If the file is not present, or there is no valid callback promise, send a request for the file to the Vice server that is the custodian of the volume containing the file. Place the copy of the file in the local file system, enter its local name in the local cache list and return the local name to UNIX.
Vice: transfer a copy of the file and a callback promise to the workstation; log the callback promise.


Implementation of System Calls: callbacks and callback promises (cont.)

read(FileDescriptor, Buffer, length):
UNIX kernel: perform a normal UNIX read operation on the local copy.

write(FileDescriptor, Buffer, length):
UNIX kernel: perform a normal UNIX write operation on the local copy.

close(FileDescriptor):
UNIX kernel: close the local copy and notify Venus that the file has been closed.
Venus: if the local copy has been changed, send a copy to the Vice server that is the custodian of the file.
Vice: replace the file contents and send a callback to all other clients holding callback promises on the file.


Callback Mechanism

Ensures that cached copies of files are updated when another client performs a close operation on that file.

Callback promise
a token stored with the cached file
status: valid or cancelled

When the server receives a request to update a file (e.g., following a close), it sends a callback to all Venus processes to which it has sent a callback promise
an RPC from the server to the Venus process
the Venus process sets the callback promise for its local copy to cancelled

Venus handling an open (sketched below)
check whether the local copy of the file has a valid callback promise
if cancelled, a fresh copy must be fetched from the Vice server
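A sketch of how Venus might combine callbacks and callback promises on open, under the assumption of a simple in-memory cache table; fetch_from_vice is a placeholder for the RPC to the Vice server.

```python
VALID, CANCELLED = "valid", "cancelled"

cache = {}   # file id -> {"data": bytes, "promise": VALID or CANCELLED}

def handle_callback(fid):
    """Invoked via RPC from the Vice server after another client closes an update."""
    if fid in cache:
        cache[fid]["promise"] = CANCELLED

def venus_open(fid, fetch_from_vice):
    entry = cache.get(fid)
    if entry is None or entry["promise"] == CANCELLED:
        # No local copy, or its callback promise was cancelled: fetch a fresh copy.
        data = fetch_from_vice(fid)            # the server also logs a new callback promise
        cache[fid] = {"data": data, "promise": VALID}
    return cache[fid]["data"]
```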


Callback Mechanism

Restart of a workstation after failure
retain as many locally cached files as possible, but callbacks may have been missed
Venus sends a cache validation request to the Vice server
it contains the file modification timestamp
if the timestamp is current, the server sends valid and the callback promise is reinstated with status valid
if the timestamp is not current, the server sends cancelled

Problem: communication link failures
a callback must be renewed with the above protocol before a new open if a time T has elapsed since the file was cached or the callback promise was last validated


Callback Mechanism

Scalability
the AFS callback mechanism scales well with an increasing number of users
communication occurs only when a file has been updated; in the NFS timestamp approach, for each open
since the majority of files are not accessed concurrently, and reads are more frequent than writes, the callback mechanism performs better


File Update Semantics

To ensure strict one-copy update semantics, a modification of a cached file would have to be propagated to any other client caching this file before any client can access the file: rather inefficient.
The callback mechanism is an approximation of one-copy semantics.

Guarantees of currency for files in AFS (version 1):
after a successful open: latest(F, S)
the current value of file F at client C is the same as the value at server S
after a failed open/close: failure(S)
the open/close was not performed at the server
after a successful close: updated(F, S)
the client's value of F has been successfully propagated to S


File Update Semantics in AFSv3

Vice keeps callback state information about Venus clients: which clients have received callback promises for which files
these lists are retained over server failures
when a callback message is lost due to a communication link failure, an old version of a file may be opened after it has been updated by another client
limited by the time T after which a client validates its callback promise (typically, T = 10 minutes)

Currency guarantees
after a successful open: latest(F, S, 0)
the copy of F seen by the client is no more than 0 seconds out of date
or (lostCallback(S, T) and inCache(F) and latest(F, S, T))
a callback message has been lost in the last T time units, the file F was in the cache before the open was attempted, and the copy is no more than T time units out of date


Cache Consistency & Concurrency Control

AFS does not control concurrent updates of files; this is left up to the application
a deliberate decision not to support distributed database techniques, due to the overhead this causes

Cache consistency only on open and close operations
once a file is opened, modifications of the file are possible without knowledge of other processes' operations on the file
any close replaces the current version on the server
all but the update resulting from the last close operation processed at the server will be lost, without warning
application programs on the same workstation share the same cached copy of the file, and hence get standard UNIX block-by-block update semantics

Although the update semantics are not identical to a local UNIX file system, they are sufficiently close that it works well in practice.


Enhancements: Spritely NFS

Goal: achieve precise one-copy update semantics
abolishes the stateless nature of NFS -> vulnerability in case of server crashes
introduces open and close operations

open must be invoked when an application wishes to access a file on a server; parameters:
mode: read, write, or read/write
the number of local processes that currently have the file open for read and for write

close
updated counts of processes

The server records the counts in an open files table, together with the IP address and port number of the client.


Enhancements: Spritely NFS (Cont)

When the server receives an open, it checks the open files table for other clients that have the same file open (sketched below)

if the open specifies write:
the request fails if any other client has the file open for writing
otherwise, other reading clients are instructed to invalidate their local cache copies

if the open specifies read:
the server sends a callback to any writing clients, forcing them to switch their caching strategy to write-through
causes all other reading clients to read from the server and stop caching
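A sketch of the server-side check on open in the spirit of Spritely NFS, assuming a simple open-files table; invalidate_cache and force_write_through are placeholders for the callbacks the server would send.

```python
class OpenFilesTable:
    def __init__(self):
        self.writers = {}   # path -> set of client addresses with the file open for writing
        self.readers = {}   # path -> set of client addresses with the file open for reading

    def open(self, path, client, mode, invalidate_cache, force_write_through):
        if "w" in mode:
            if self.writers.get(path, set()) - {client}:
                raise PermissionError("file already open for writing elsewhere")
            for reader in self.readers.get(path, set()) - {client}:
                invalidate_cache(reader, path)          # readers must drop cached copies
            self.writers.setdefault(path, set()).add(client)
        if "r" in mode:
            for writer in self.writers.get(path, set()) - {client}:
                force_write_through(writer, path)       # writers switch to write-through
            self.readers.setdefault(path, set()).add(client)
```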


Enhancements: WebNFS

Access to files in WANs by direct interaction with remote NFS servers
permits partial file accesses
HTTP or FTP would require entire files to be transmitted, or special software at the server end to provide only the data needed

Access to "published" files through a public file handle
for access via a path name on the server, lookup requests are used

Reading a (portion of) a file requires
a TCP connection to the server
a lookup RPC
a read RPC


Enhancements: NFS version 4

similarly aimed at use in WANs
usage of callbacks and leases
recovery from server faults through transparent moving of file systems from one server to another
usage of proxy servers to increase scalability


Part 2: Distributed Name Systems


Outline

Names
Name and Directory Services
Name Spaces
Name Resolution
Navigation
Caching & Replication
Case Studies: Domain Name System, X.500 Directory Service


Name & Directory Services

Names
identification of objects
resource sharing: Internet domain names
communication: the domain name is part of an email address
how much information about an object is in a name?
pure names: uninterpreted bit patterns
non-pure names: contain information about the object, e.g., its location

Name services
entries of the form <name, attributes>, where attributes are typically network addresses
type of lookup query: name -> attribute values (also called name resolution)

Directory services
<name, attributes> entries
types of lookup queries: name -> attribute values, and attribute values -> names


Requirements on Name Services

Use of unique naming conventions
enables sharing
it is often not easily predictable which services will eventually share resources

Scalability
naming directories tend to grow very fast, in particular in the Internet

Consistency
short- and mid-term inconsistencies are tolerable
long term: the system should converge towards a consistent state

Performance and availability
speed and availability of lookup operations
name services are at the heart of many distributed applications

Adaptability to change
organizations frequently change their structure during their lifetime

Fault isolation
the system should tolerate failure of some of its servers


Name Spaces

The set of all valid names to be used in a certain context, e.g., all valid URLs in the WWW; can be described using a generative grammar (e.g., a BNF for URLs)

Internal structure
a flat set of numeric or symbolic identifiers
a hierarchy representing position (e.g., the UNIX file system)
a hierarchy representing organizational structure (e.g., Internet domains)

Potentially infinite
holds only for hierarchic name spaces
flat name spaces have a finite size induced by the maximum name length

Aliases
in general, allow a convenient name to be substituted for a more complicated one

Naming domain
a name space for which there exists a single administrative authority for assigning names within it


Name & Directory Services

Partitioning
no single name server can hold all name and attribute entries for the entire network, in particular in the Internet
name server data is partitioned according to domain

Replication
a domain usually has more than one name server
availability and performance are enhanced

Caching
servers may cache name resolutions performed on other servers
avoids repeatedly contacting the same name server to look up identical names
client lookup software may equally cache the results of previous requests


Name Resolution

The translation of a name into the related primitive attribute; often an iterative process
the name service returns the attributes if the resolution can be performed in its naming context
the name service refers the query to another context if the name cannot be resolved in its own context
deal with cyclic alias references, if present
abort resolution after a pre-defined number of attempts if no result is obtained


Navigation

Accessing naming data from more than one name server in order to resolve a name.

Iterative navigation (used in DNS)
the client contacts one name server (NS)
the NS either resolves the name or suggests another name server to contact
resolution continues until the name is resolved or found to be unbound

[Figure: iterative navigation. The client queries name servers NS1, NS2 and NS3 in turn (steps 1 to 3).]


Navigation (Cont)

Non-recursive, server-controlled
the contacted server contacts its peers if it cannot resolve the name itself, either by multicast or iteratively by direct contact

[Figure: non-recursive server-controlled navigation. The client queries NS1 (step 1), which itself contacts NS2 and NS3 (steps 2 to 4).]


Navigation (Cont)

Recursive, server-controlled
if the name cannot be resolved, the server contacts a superior server responsible for a larger prefix of the name space
applied recursively until the name is resolved
can be used when clients and low-level servers are not entitled to contact high-level servers directly

[Figure: recursive server-controlled navigation. The client queries NS1 (step 1); NS1 queries NS2, which queries NS3, and the results are passed back along the chain (steps 2 to 5).]


Name & Directory Services

Domain Name System (DNS): the name service used across the Internet
Global Name Service (GNS): developed at DEC
X.500: ITU-standardized directory service
Jini: discovery service used in spontaneous networking; contains a directory service component
LDAP: directory service; lightweight implementation of X.500; often used in intranets; Microsoft Active Directory Service provides an X.500 interface


DNS: Domain Name System

People have many identifiers: SSN, name, passport number.
Internet hosts and routers have:
an IP address (32 bit), used for addressing datagrams
a "name", e.g., bbcr.uwaterloo.ca, used by humans
Q: how to map between IP addresses and names?

Domain Name System:
a distributed database implemented in a hierarchy of many name servers
an application-layer protocol that lets hosts, routers and name servers communicate to resolve names (address/name translation)
note: a core Internet function implemented as an application-layer protocol, with the complexity at the network's "edge"


DNS name servers

No server has all name-to-IP address mappings.

Local name servers
each ISP or company has a local (default) name server
a host's DNS query first goes to its local name server

Authoritative name server
for a host: stores that host's IP address and name
can perform name/address translation for that host's name

Why not centralize DNS?
single point of failure
traffic volume
distant centralized database
maintenance
It doesn't scale!


DNS: Root name servers

Contacted by a local name server that cannot resolve a name.

The root name server:
contacts an authoritative name server if the name mapping is not known
gets the mapping
returns the mapping to the local name server

There are about a dozen root name servers worldwide.


Simple DNS example

Host comm.utoronto.ca wants the IP address of bbcr.cs.uwaterloo.ca.

1. It contacts its local DNS server, dns.utoronto.ca.
2. dns.utoronto.ca contacts the root name server, if necessary.
3. The root name server contacts the authoritative name server, dns.uwaterloo.ca, if necessary.

[Figure: the requesting host comm.utoronto.ca, its local name server dns.utoronto.ca, the root name server, and the authoritative name server dns.uwaterloo.ca, with the query and reply messages numbered 1 to 6.]


DNS example

The root name server:
may not know the authoritative name server
may know an intermediate name server: who to contact to find the authoritative name server

[Figure: the requesting host comm.utoronto.ca asks its local name server dns.utoronto.ca for bbcr.cs.uwaterloo.ca; the query passes through the root name server, the intermediate name server dns.uwaterloo.ca, and the authoritative name server dns.cs.uwaterloo.ca (messages 1 to 8).]


DNS: iterated queries

Recursive query:
puts the burden of name resolution on the contacted name server
heavy load?

Iterated query:
the contacted server replies with the name of a server to contact: "I don't know this name, but ask this server"
(a toy sketch of iterated resolution follows below)

[Figure: iterated query. The local name server dns.utoronto.ca queries the root name server, the intermediate name server dns.uwaterloo.ca, and the authoritative name server dns.cs.uwaterloo.ca in turn on behalf of the requesting host comm.utoronto.ca (messages 1 to 8).]
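A toy sketch of iterated resolution: each name server is modelled as a table that either answers authoritatively or refers the resolver to a better server. Real DNS exchanges resource records over UDP; the server tables and the address below are purely illustrative.

```python
# Each toy server maps a name either to ("A", address) or to ("REFER", next_server).
root = {"uwaterloo.ca": ("REFER", "dns.uwaterloo.ca")}
uwaterloo = {"cs.uwaterloo.ca": ("REFER", "dns.cs.uwaterloo.ca")}
cs = {"bbcr.cs.uwaterloo.ca": ("A", "129.97.0.1")}   # illustrative address only

SERVERS = {"root": root, "dns.uwaterloo.ca": uwaterloo, "dns.cs.uwaterloo.ca": cs}

def iterated_lookup(name, server="root"):
    """Local name server behaviour: follow referrals until an address is found."""
    while True:
        table = SERVERS[server]
        for suffix, (rtype, value) in table.items():
            if name == suffix or name.endswith("." + suffix):
                if rtype == "A" and name == suffix:
                    return value          # authoritative answer
                server = value            # "I don't know this name, but ask this server"
                break
        else:
            raise LookupError(f"{name} is unbound")

print(iterated_lookup("bbcr.cs.uwaterloo.ca"))   # 129.97.0.1
```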


DNS records

DNS is a distributed database storing resource records (RR).
RR format: (name, value, type, ttl)

Type=A: name is a hostname; value is its IP address
Type=NS: name is a domain (e.g., foo.com); value is the hostname of an authoritative name server for this domain
Type=CNAME: name is an alias for some "canonical" (the real) name; value is the canonical name
Type=MX: value is the hostname of the mail server associated with name


DNS protocol, messages

The DNS protocol consists of query and reply messages, both with the same message format.

Message header
identification: a 16-bit number for the query; the reply to a query uses the same number
flags:
query or reply
recursion desired
recursion available
reply is authoritative


DNS protocol, messages (cont.)

Message body sections:
name and type fields for a query
RRs in response to the query
records for authoritative servers
additional "helpful" info that may be used


DNS Database Structure

The DNS database is partitioned over a set of interconnected servers (partitioning); DNS servers are responsible for a domain
locality of requests: most DNS queries are served by servers in the local domain
additionally, a DNS server stores other domain names and the responsible DNS servers

DNS naming data is organized in zones:
the attributes for names in a domain, minus those of its sub-domains (e.g., uwaterloo.ca minus cs.uwaterloo.ca)
the names and addresses of two name servers that provide authoritative naming information (replication)
authoritative: reasonably up to date
the names and addresses of servers that hold authoritative information for sub-domains


DNS Database Structure (Cont)

DNS clients
handle requests to servers, usually by a UDP-based request/reply protocol
may be configured to contact alternate servers if the primary choice is unavailable
specify the type of navigation to be used by the server
iterative and recursive strategies are all allowed

Note: root server data is replicated to about a dozen secondary servers
still, each one of them has to process up to 1000 queries per second

Domains = naming domains
top-level / organizational domains: com, edu, gov, mil, int, net, org, ca, us, de, ...


DNS Caching

Once (any) name server learns a mapping, it caches the mapping (sketched below)
the data will be marked as non-authoritative
the DNS database may contain inconsistent and stale data
inconsistency is of no consequence until the inconsistent address data is used
DNS does not specify how to detect and remedy staleness of address data
update/notify mechanisms under design by the IETF
RFC 2136
http://www.ietf.org/html.charters/dnsind-charter.html
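A sketch of how a caching name server might honour the ttl field of a resource record and mark cached answers as non-authoritative; the cache layout is an assumption for illustration.

```python
import time

class DnsCache:
    def __init__(self):
        self.entries = {}   # (name, record type) -> (value, expiry time)

    def insert(self, name, value, rtype, ttl):
        self.entries[(name, rtype)] = (value, time.time() + ttl)

    def lookup(self, name, rtype):
        """Return (value, authoritative) for a fresh entry, or None on miss/expiry."""
        hit = self.entries.get((name, rtype))
        if hit is None:
            return None
        value, expires_at = hit
        if time.time() >= expires_at:       # stale: drop it and force a fresh query
            del self.entries[(name, rtype)]
            return None
        return value, False                 # cached answers are never authoritative

cache = DnsCache()
cache.insert("foo.com", "ns1.foo.com", "NS", ttl=3600)
print(cache.lookup("foo.com", "NS"))        # ('ns1.foo.com', False)
```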


Discovery Services

Directory services that allow clients to query the available services in a spontaneous networking environment
e.g., which are the available color printers?
services enter their data using registration interfaces
structure usually rather flat, since the scope is limited to a (wireless) LAN

JINI (http://www.sun.com/jini/)
Java-based discovery service
clients and servers run JVMs
communication via Java RMI
dynamic loading of code


JINI Discovery-related Services

Lookup service
holds information regarding available services

Query of the lookup service by a Jini client
match the request
download the object providing the service from the lookup service

Registration of a Jini client or Jini service with a lookup service (a generic multicast sketch follows below)
send a message to a well-known IP multicast address, identical for all Jini instances
limit the multicast to the LAN using the time-to-live attribute
leases that need to be renewed periodically are used for registering Jini services
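A generic sketch of a multicast discovery request of the kind described above, written in Python for consistency with the other examples rather than in Java; the group address, port and payload are placeholders, not the actual Jini discovery protocol.

```python
import socket

GROUP, PORT = "239.1.2.3", 4446      # placeholder multicast group and port
REQUEST = b"lookup-service? group=finance"

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
# Keep the multicast within the local LAN by limiting the time-to-live to 1 hop.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
sock.sendto(REQUEST, (GROUP, PORT))

# A lookup service listening on the group would reply with its own address,
# after which the client contacts it directly (unicast) to query or register.
sock.settimeout(2.0)
try:
    reply, addr = sock.recvfrom(4096)
    print("lookup service at", addr, reply)
except socket.timeout:
    print("no lookup service responded")
```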


JINI Discovery Scenario

A new client wishes to print on a printer belonging to the finance group.

[Figure: two lookup services (serving the admin group, and the admin and finance groups) and two printing services (admin group and finance group) are attached to the network, together with a corporate info service and clients. Steps: 1. the client multicasts "'finance' lookup service?"; 2. the finance lookup service replies "Here I am: ..."; 3. the client requests printing from that lookup service; 4. the client uses the printing service.]


X.500 Directory Service

Geared towards satisfying descriptive queries
provides attributes of other users and system resources

Architecture
directory user agents (DUA)
directory service agents (DSA)

[Figure: several DUAs connect to a network of interconnected DSAs.]


X.500 Name Tree

Directory Information Tree (DIT)
every node of the tree stores extensive information
Directory Information Base (DIB)
the DIT plus the node information

[Figure: example DIT. Below the X.500 Service (root) are countries such as France, Great Britain and Greece; Great Britain contains organizations such as BT Plc and the University of Gormenghast; the university contains organizational units such as the Department of Computer Science, the Computing Service and the Engineering Department; the department contains the organizational units Departmental Staff and Research Students and the application process ely; Departmental Staff contains persons such as Alice Flintstone, Pat King, James Healey and Janet Papworth.]


Example X.500 DIB Entry

info: Alice Flintstone, Departmental Staff, Department of Computer Science, University of Gormenghast, GB
commonName: Alice.L.Flintstone; Alice.Flintstone; Alice Flintstone; A. Flintstone
surname: Flintstone
telephoneNumber: +44 986 33 4604
uid: alf
mail: [email protected], [email protected]
roomNumber: Z42
userClass: Research Fellow


X.500 Directory Accesses

read
an absolute or relative name is provided
the DSA navigates the tree and returns the requested attributes

search (a sketch follows below)
provided: a base name (the starting point for the search in the tree) and a filter expression (a boolean condition on directory attributes)
returned: the list of DIT node names for which the filter evaluates to true
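A sketch of a search over a toy DIB held in memory, with a base name and a filter; the attribute names follow the example entry above, while the entries themselves and the Python-predicate filter are illustrative stand-ins for real X.500/LDAP filter expressions.

```python
# Toy DIB: distinguished name -> attribute dictionary.
DIB = {
    "cn=Alice Flintstone,ou=Departmental Staff,o=University of Gormenghast,c=GB": {
        "surname": "Flintstone", "userClass": "Research Fellow"},
    "cn=Pat King,ou=Departmental Staff,o=University of Gormenghast,c=GB": {
        "surname": "King", "userClass": "Research Student"},
}

def search(base, predicate):
    """Return the names of all entries under `base` whose attributes satisfy `predicate`."""
    return [name for name, attrs in DIB.items()
            if name.endswith(base) and predicate(attrs)]

# All research fellows in the University of Gormenghast subtree:
print(search("o=University of Gormenghast,c=GB",
             lambda a: a.get("userClass") == "Research Fellow"))
```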

Lightweight Directory Access Protocol (LDAP)
a lightweight version of X.500; access to DSAs through TCP/IP; a simpler API; textual encoding in place of ASN.1 encoding

Practical usage of X.500 / LDAP
LDAP is currently widely used for intranets and is seeing adoption in the Internet

