Post on 18-Jan-2016
File Management
Chapter 12
Files and File systems
The file system provides the resource abstractions typically associated with secondary storage. It permits users to create data collections, called files, with the following properties:
Long-term existence: Files are stored on disk or other secondary storage and do not disappear when a user logs off.
Sharable between processes: Files have names and can have associated access permissions that permit controlled sharing.
Structure: A file can have an internal structure that is convenient for a particular application. Files can also be organized in a hierarchical structure.
The file system also provides a collection of functions that can be performed on files:
Create: A new file is defined and positioned within the structure of files.
Delete: A file is removed from the file structure and destroyed.
Open: An existing file is declared to be “opened” by a process, allowing the process to perform functions on the file.
Close: The file is closed with respect to a process, so that the process may no longer perform functions on the file.
Read: A process reads all or a portion of the data in a file.
Write: A process updates a file, either by adding new data or by changing existing values.
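As a rough illustration only, the six operations map onto ordinary Python calls roughly as follows. The function name and the file name example.txt are made up for this sketch, and note that Python's open combines Create and Open in a single call.

```python
import os
import tempfile

def demo_file_operations():
    """Walk through create, open, write, close, read, and delete on one file."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "example.txt")  # hypothetical file name

        # Create + Open: mode "x" defines a new file and fails if it already exists.
        with open(path, "x") as f:
            f.write("first line\n")      # Write: add new data
        # Close: leaving the "with" block closes the file for this process.

        with open(path, "a") as f:       # Open again, this time to append
            f.write("second line\n")

        with open(path, "r") as f:       # Open for reading
            contents = f.read()          # Read: all of the data in the file

        os.remove(path)                  # Delete: remove the file from the structure
        still_exists = os.path.exists(path)
    return contents, still_exists
```

Calling demo_file_operations() returns the text that was written and a flag showing that the file no longer exists after deletion.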
File structure
Terms commonly used when discussing files:
Field
Basic element of data
An individual field contains a single value, e.g. an employee’s name
It is characterized by its length and data type
Can be fixed or variable length, depending on the file design
Can contain subfields
Record
Collection of related fields
Can be treated as a unit by an application program
Example: an employee record has fields such as name, social security number, and date hired
Can be fixed or variable length
File
A collection of similar records
Treated as a single entity by users and applications, and may be referenced by name
May be created and deleted
Access control can be applied to the file
Database
Collection of related data
Its essential aspects are that the relationships among elements of data are explicit and that the database is designed for use by a number of different applications
May contain all of the information related to an organization
Consists of one or more types of files
Operations that must be supported in order to use files:
Retrieve_All: Retrieve all the records of a file. Required for an application that must process all of the information in the file at one time, e.g. an application that produces a summary of the information in the file. This operation is often equated with the term sequential processing, since all records are accessed in sequence.
Retrieve_One: Retrieve just a single record. Interactive, transaction-oriented applications need this operation.
Retrieve_Next: Retrieve the record that is “next” in some logical sequence to the most recently retrieved record. Needed by interactive applications such as filling in forms or performing a search.
Retrieve_Previous: Retrieve the record that is “previous” to the currently accessed record.
Insert_One: Insert a new record into the file.
Delete_One: Delete an existing record.
Update_One: Retrieve a record, update one or more of its fields, and rewrite the updated record back into the file.
Retrieve_Few: Retrieve a number of records.
The nature of the operations that are most commonly performed on a file will influence the way the file is organized.
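To make the record-level operations concrete, here is a minimal in-memory sketch. The class RecordFile, its method names, and the choice of a dictionary keyed by an integer are all assumptions made for illustration; a real file system would keep the records on secondary storage.

```python
class RecordFile:
    """Toy in-memory file of records kept in key order (illustrative only)."""

    def __init__(self):
        self._records = {}      # key -> record (a dict of fields)
        self._current = None    # key of the most recently retrieved record

    def insert_one(self, key, record):
        self._records[key] = dict(record)

    def delete_one(self, key):
        del self._records[key]

    def retrieve_all(self):
        # Sequential processing: every record, in key sequence.
        return [self._records[k] for k in sorted(self._records)]

    def retrieve_one(self, key):
        self._current = key
        return self._records[key]

    def retrieve_next(self):
        # The record that follows the most recently retrieved one in key order.
        later = [k for k in sorted(self._records)
                 if self._current is None or k > self._current]
        if not later:
            return None
        self._current = later[0]
        return self._records[self._current]

    def retrieve_few(self, predicate):
        # All records that satisfy some selection criterion.
        return [r for r in self.retrieve_all() if predicate(r)]

    def update_one(self, key, **changes):
        # Retrieve, change one or more fields, and keep the updated record.
        record = self.retrieve_one(key)
        record.update(changes)
        return record
```

Retrieve_Previous would mirror retrieve_next with the comparison reversed. The sketch sorts the keys on every call, which is simple but slow; how the file is organized decides which of these operations can be made cheap.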
File structure
File Management Systems (FMS)
A set of system software that provides services to users and applications in the use of files
Users and applications may access files through the FMS
Objectives:
To meet the data management needs and requirements of the user, which include storage of data and the ability to perform the required operations
To guarantee, to the extent possible, that the data in the file are valid
To optimize performance, both from the system point of view in terms of overall throughput and from the user’s point of view in terms of response time
To provide I/O support for a variety of storage device types
To minimize or eliminate the potential for lost or destroyed data
To provide a standardized set of I/O interface routines to user processes
To provide I/O support for multiple users
For the first objective, meeting user requirements, the requirements depend on the variety of applications and the environment in which the computer system will be used. For an interactive, general-purpose system, the following constitute a minimal set of requirements:
Each user should be able to create, delete, read, write, and modify files
Each user may have controlled access to other users’ files
Each user may control what types of access are allowed to the user’s files
Each user should be able to restructure the user’s files in a form appropriate to the problem
Each user should be able to move data between files
Each user should be able to back up and recover the user’s files in case of damage
Each user should be able to access the user’s files by using symbolic names
To understand file management, it helps to look at the software organization. Figure 12.1 shows the file system software architecture.
Device drivers
Lowest level
Communicate directly with peripheral devices
Responsible for starting I/O operations on a device and processing the completion of an I/O request
Examples: disk and tape drivers
Part of the OS
File System Architecture
Basic file system, or physical I/O
Primary interface with the environment outside of the computer system
Deals with blocks of data that are exchanged with disk or tape
Concerned with the placement of those blocks on secondary storage and with their buffering in main memory
Part of the OS
Basic I/O supervisor
Responsible for all file I/O initiation and termination
Maintains control structures that deal with device I/O, scheduling, and file status
Part of the OS
Logical I/O
Enables users and applications to access records
Deals with file records
Provides a general-purpose record I/O capability and maintains basic data about files
Access method
Level closest to the user
Provides a standard interface between applications and the file system and devices that hold the data
Different access methods reflect different file structures and different ways of accessing and processing the data
File Management Functions
Another way of viewing the functions of a file system is shown in Figure 12.2
Users and application programs interact with the file system by means of commands for creating and deleting files and for performing operations on files
Before performing any operation, the file system must identify and locate the selected file
This requires a directory that describes the location of all files, plus their attributes
On a shared system, user access control is enforced: only authorized users are allowed to access files
The basic operations that may be performed on a file are performed at the record level
Files are viewed as having some structure that organizes the records, e.g. a sequential structure in which employee records are stored alphabetically by last name
Thus, to translate user commands into specific file manipulation commands, the access method appropriate to this file structure must be employed
I/O is done on a block basis: the records of a file must be blocked for output and unblocked after input
To support block I/O of files, secondary storage must be managed, which involves allocating files to free blocks and managing free storage so that the available blocks are known
File Organization and access
File organization refers to the logical structuring of the records as determined by the way in which they are accessed.
Criteria to consider when choosing a file organization:
Short access time
Ease of update
Economy of storage
Simple maintenance
Reliability
The focus here is on five organizations:
The pile
The sequential file
The indexed sequential file
The indexed file
The direct/hashed file
The pile
Least complicated form of file organization
Data are collected in the order in which they arrive
Each record consists of one burst of data
Purpose: simply to accumulate the mass of data and save it
Records may have different fields, or similar fields in different orders
Each field should be self-describing, including a field name as well as a value
The length of each field must be implicitly indicated by delimiters
The pile has no structure, so record access is by exhaustive search: to find a record that contains a particular field with a particular value, each record in the pile must be examined until the record is found or the whole file has been searched
Pile files are encountered when data are collected and stored prior to processing, or when data are not easy to organize
Uses space well when the stored data vary in size and structure
Perfectly adequate for exhaustive searches and easy to update
Not suitable for most applications
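A small sketch of exhaustive search over a pile, assuming each self-describing record is modelled as a Python dictionary of field names and values. The sample records and the function name find_in_pile are invented for illustration.

```python
def find_in_pile(pile, field, value):
    """Exhaustive search: examine records in arrival order until one matches.

    Returns the matching record (or None) and the number of records examined.
    """
    examined = 0
    for record in pile:
        examined += 1
        if record.get(field) == value:
            return record, examined
    return None, examined

# Records arrive in any order and need not share the same fields.
pile = [
    {"name": "Ali", "dept": "Sales"},
    {"dept": "IT", "name": "Bala", "phone": "555-0100"},
    {"name": "Chen", "hired": "2015-03-01"},
]
```

A failed search always examines every record, which is why the pile is a poor fit for most applications.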
[Figure: Pile]
The sequential file
Most common form of file structure
A fixed format is used for records
All records are of the same length, consisting of the same number of fixed-length fields in a particular order
The first field in each record is referred to as the key field
The key field uniquely identifies the record
Typically used in batch applications
Easily stored on tape as well as disk
Poor performance for interactive applications that involve queries
New records are placed in a log file or transaction file
A batch update is performed to merge the log file with the master file
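The batch update can be sketched as a sort of the log followed by a single merge pass. This assumes both files are lists of (key, record) pairs, that the master file is already in key order, and that the log holds only new keys; the function name batch_update is hypothetical.

```python
import heapq

def batch_update(master, log):
    """Merge a transaction log into a master file of (key, record) pairs.

    master must already be sorted by key; log may be in arrival order.
    Returns the new master file, sorted by key.
    """
    sorted_log = sorted(log, key=lambda pair: pair[0])
    # heapq.merge reads both sorted inputs once, front to back,
    # which matches how a tape-based merge would proceed.
    return list(heapq.merge(master, sorted_log, key=lambda pair: pair[0]))
```

For example, merging a log holding keys 5 and 2 into a master holding keys 1, 4, and 7 yields a master with keys 1, 2, 4, 5, 7.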
[Figure: Sequential File]
The Indexed Sequential File
Maintains the key characteristic of the sequential file: records are organized in sequence based on a key field
Adds two features:
An index to the file, to support random access
An overflow file
The index provides a lookup capability to quickly reach the vicinity of a desired record
The overflow file is similar to the log file used with a sequential file, but is integrated so that a record in the overflow file is located by following a pointer from its predecessor record
Comparison of sequential and indexed sequential files
Example: a file contains 1 million records
On average, 500,000 accesses are required to find a record in a sequential file
If an index contains 1,000 entries, each entry covers 1,000 records of the main file. It takes on average 500 accesses to find the key in the index, followed by 500 accesses in the main file, so the average drops to 1,000 accesses
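The arithmetic in this comparison can be checked with a short calculation. It assumes, as the example does, that both the index and the main file are searched linearly and that the index entries are spread evenly over the file; the function names are invented for this sketch.

```python
def avg_sequential_accesses(n_records):
    # On average, half of the file is scanned before the record is found.
    return n_records / 2

def avg_indexed_sequential_accesses(n_records, n_index_entries):
    # Half of the index is scanned, then half of the group of records
    # that the chosen index entry covers.
    records_per_entry = n_records / n_index_entries
    return n_index_entries / 2 + records_per_entry / 2

sequential = avg_sequential_accesses(1_000_000)               # 500000.0
indexed = avg_indexed_sequential_accesses(1_000_000, 1_000)   # 1000.0
```

With these numbers the single-level index reduces the average search by a factor of 500.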
[Figure: Indexed Sequential File]
The Indexed File
Uses multiple indexes for different key fields
May contain an exhaustive index, with one entry for every record in the main file
May contain a partial index, with entries only for records where the field of interest exists
When a new record is added to the main file, all of the index files must be updated
Used in applications where timeliness of information is critical, e.g. airline reservation systems and inventory control systems
[Figure: Indexed File]
The Direct or Hashed File
Directly accesses a block at a known address
A key field is required for each record
Makes use of a hashing function on the key value
Often used when very rapid access is required, where fixed-length records are used, and where records are always accessed one at a time
Examples: directories, pricing tables
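A minimal sketch of direct access through hashing. The hash function used here (key modulo the number of blocks), the integer keys, and the class name HashedFile are assumptions for illustration; the slides do not say which hashing function is used.

```python
def block_address(key, n_blocks):
    """Hypothetical hash function: map an integer key to a block number."""
    return key % n_blocks

class HashedFile:
    def __init__(self, n_blocks):
        # Each block is a small list of (key, record) pairs.
        self.blocks = [[] for _ in range(n_blocks)]

    def insert(self, key, record):
        address = block_address(key, len(self.blocks))
        self.blocks[address].append((key, record))

    def lookup(self, key):
        # Only the one block named by the hash is examined,
        # no matter how many records the file holds.
        address = block_address(key, len(self.blocks))
        for stored_key, record in self.blocks[address]:
            if stored_key == key:
                return record
        return None
```

Keys 42 and 50 both hash to block 2 when there are 8 blocks, so the sketch simply stores colliding records in the same block; a real implementation would also need an overflow strategy for full blocks.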
File Directories
Contains information about files
Attributes
Location
Ownership
The directory is itself a file, owned by the operating system
Provides a mapping between file names and the files themselves
Simple Structure for a Directory
List of entries, one for each file
Sequential file with the name of the file serving as the key
Provides no help in organizing the files
Forces the user to be careful not to use the same name for two different files
Two-level Scheme for a Directory
One directory for each user, plus a master directory
The master directory contains an entry for each user, providing address and access control information
Each user directory is a simple list of the files of that user
Still provides no help in structuring collections of files
Hierarchical, or Tree-Structured Directory
Master directory with user directories underneath it
Each user directory may have subdirectories and files as entries
Files can be located by following a path from the root, or master, directory down various branches; this path is the pathname for the file
Several files can have the same file name as long as they have unique pathnames
The current directory is the working directory
Files are referenced relative to the working directory
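Pathname lookup in a tree-structured directory can be sketched with nested dictionaries. The "/" separator, the user names, and the file names are assumptions for this example; directories are modelled as dictionaries and files as plain strings.

```python
def resolve(root, working_dir, pathname):
    """Return the entry that pathname refers to in a nested-dict directory tree.

    A pathname starting with "/" is followed from the root (master) directory;
    any other pathname is taken relative to the working directory.
    """
    if pathname.startswith("/"):
        parts = pathname.strip("/").split("/")
    else:
        parts = working_dir.strip("/").split("/") + pathname.split("/")
    node = root
    for name in parts:
        if name:                  # skip empty components such as the root itself
            node = node[name]     # follow one branch down the tree
    return node

root = {
    "alice": {"notes.txt": "alice's notes", "src": {"main.c": "alice's code"}},
    "bob": {"notes.txt": "bob's notes"},
}
```

Both users own a file called notes.txt, yet /alice/notes.txt and /bob/notes.txt are distinct pathnames, so there is no clash. With /alice/src as the working directory, the short name main.c is enough to reach the file.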
File Sharing
In a multiuser system, files can be shared among users
Two issues arise:
Access rights
Management of simultaneous access
Access Rights
None: The user may not know of the existence of the file, and is not allowed to read the user directory that includes the file
Knowledge: The user can only determine that the file exists and who its owner is
Execution: The user can load and execute a program but cannot copy it
Reading: The user can read the file for any purpose, including copying and execution
Appending: The user can add data to the file but cannot modify or delete any of the file’s contents
Updating: The user can modify, delete, and add to the file’s data. This includes creating the file, rewriting it, and removing all or part of the data
Changing protection: The user can change the access rights granted to other users
Deletion: The user can delete the file
Owner: Has all of the rights listed above, and may grant rights to others using the following classes of users:
Specific user
User groups
All (for public files)
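One way to model these rights is as an ordered scale. This sketch assumes that each right implies all of the rights listed before it, which the slides do not state explicitly, and that a user receives the most permissive right granted through any class. The names Access, rights_for, and the "user:"/"group:"/"all" key format are invented for illustration.

```python
from enum import IntEnum

class Access(IntEnum):
    # Listed in the same order as above; a higher value is assumed
    # to imply every lower right.
    NONE = 0
    KNOWLEDGE = 1
    EXECUTION = 2
    READING = 3
    APPENDING = 4
    UPDATING = 5
    CHANGING_PROTECTION = 6
    DELETION = 7

def is_allowed(granted, requested):
    return granted >= requested

def rights_for(user, groups, acl):
    """acl maps "user:<name>", "group:<name>", or "all" to an Access level."""
    candidates = [acl.get("all", Access.NONE),
                  acl.get("user:" + user, Access.NONE)]
    candidates += [acl.get("group:" + g, Access.NONE) for g in groups]
    return max(candidates)   # assumption: the most permissive grant wins
```

Under these assumptions a user granted Reading may also execute the file, while a user granted Appending may not update it.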
Simultaneous Access
A user may lock the entire file when it is to be updated
A user may lock individual records during the update
Mutual exclusion and deadlock are issues for shared access
Record Blocking
Records are the logical unit of access of a structured file
Blocks are the unit of I/O with secondary storage
For I/O to be performed, records must be organized as blocks
Several issues to consider:
Should blocks be of fixed or variable length? On most systems, blocks are of fixed length
What should the relative size of a block be, compared to the average record size?
The larger the block, the more records are passed in one I/O operation, which is an advantage when the file is processed sequentially
With random access, larger blocks result in the unnecessary transfer of unused records
Given the size of a block, there are three methods of blocking that can be used:
Fixed blocking
Variable-length spanned blocking
Variable-length unspanned blocking
Fixed blocking: Fixed-length records are used, and an integral number of records is stored in each block. There may be unused space at the end of each block (internal fragmentation)
[Figure: Fixed Blocking]
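For fixed blocking, the number of records per block and the internal fragmentation follow from integer division. The block and record sizes used here are arbitrary example values, and the function name is hypothetical.

```python
def fixed_blocking(block_size, record_size):
    """Return (records per block, unused bytes at the end of each block)."""
    records_per_block = block_size // record_size   # only whole records fit
    wasted_bytes = block_size - records_per_block * record_size
    return records_per_block, wasted_bytes
```

With 512-byte blocks and 100-byte records, five records fit and 12 bytes per block are lost to internal fragmentation; with 4096-byte blocks and 128-byte records nothing is wasted.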
Variable-length spanned blocking: Variable-length records are used and are packed into blocks with no unused space. Some records must therefore span two blocks, with the continuation indicated by a pointer to the successor block
[Figure: Variable Blocking: Spanned]
Variable-length unspanned blocking: Variable-length records are used, but spanning is not employed. There is wasted space in most blocks because the remainder of a block cannot be used if the next record is larger than the remaining unused space
[Figure: Variable Blocking: Unspanned]
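The difference between the two variable-length methods can be seen by counting blocks. This sketch ignores the space taken by the continuation pointers and by any record-length fields, which is a simplifying assumption; the function names and record sizes are made up.

```python
def blocks_unspanned(record_sizes, block_size):
    """Blocks needed when a record may never cross a block boundary."""
    blocks, free = 0, 0
    for size in record_sizes:
        if size > block_size:
            raise ValueError("record larger than a block cannot be stored unspanned")
        if size > free:
            # The remainder of the current block is left unused.
            blocks += 1
            free = block_size
        free -= size
    return blocks

def blocks_spanned(record_sizes, block_size):
    """Blocks needed when records are packed end to end across blocks."""
    total = sum(record_sizes)
    return -(-total // block_size)   # ceiling division
```

For records of 60, 50, 70, and 40 bytes in 100-byte blocks, unspanned blocking needs four blocks while spanned blocking needs three, at the cost of some records requiring two I/O operations.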
Secondary Storage Management
Space must be allocated to files
The system must keep track of the space available for allocation
Preallocation vs Dynamic Allocation
Preallocation
Requires the maximum size of the file to be declared at the time of creation
It is difficult to reliably estimate the maximum potential size of a file
Users tend to overestimate file size so as not to run out of space
Dynamic allocation
Allocates space to a file in portions as needed
Portion size
The size of the portion allocated to a file
Contiguity of space increases performance
A large number of small portions increases the size of the tables needed to manage the allocation information
Fixed-size portions simplify the reallocation of space
Variable-size or small fixed-size portions minimize the waste of unused storage due to overallocation
Alternatives:
Variable, large contiguous portions: better performance; the variable size avoids waste and the file allocation tables are small, but space is hard to reuse
Blocks: small fixed portions provide greater flexibility
Methods of File Allocation
Contiguous allocation
A single set of contiguous blocks is allocated to a file at the time of creation
Only a single entry per file is needed in the file allocation table: the starting block and the length of the file
External fragmentation will occur, making it necessary to perform compaction
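A sketch of contiguous allocation over a list of free/used flags. First fit is used to pick the run of blocks, which is an assumption, since the slides do not name a placement strategy; the function name and the ten-block disk are invented for the example.

```python
def allocate_contiguous(free_map, length):
    """First fit: claim `length` consecutive free blocks.

    free_map is a list of booleans, True meaning the block is free.
    Returns the starting block number, or None if no run is long enough.
    """
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                for b in range(run_start, run_start + length):
                    free_map[b] = False
                return run_start
        else:
            run_len = 0
    return None

def release(free_map, start, length):
    for b in range(start, start + length):
        free_map[b] = True
```

If files of 3, 4, and 3 blocks fill a ten-block disk and the first and last are then deleted, six blocks are free but a four-block request still fails, because the free space is split into two runs of three. That is the external fragmentation that compaction removes.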
Chained allocation
Allocation is on the basis of individual blocks
Each block contains a pointer to the next block in the chain
Only a single entry per file is needed in the file allocation table: the starting block and the length of the file
No external fragmentation
Best for sequential files
No accommodation of the principle of locality
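Chained allocation can be sketched as a linked list of blocks. The disk is modelled as a dictionary from block number to a (data, next block) pair; the block numbers, file name, and function name are made up for the example.

```python
def read_chain(disk, start, length):
    """Follow the next-block pointers from `start` for `length` blocks."""
    data, block = [], start
    for _ in range(length):
        payload, next_block = disk[block]
        data.append(payload)
        block = next_block        # the pointer stored inside the block itself
    return data

# Blocks may sit anywhere on the disk; only the pointers give their order.
chained_disk = {9: ("part1", 2), 2: ("part2", 14), 14: ("part3", None)}
chained_fat = {"report": (9, 3)}  # file name -> (starting block, length)
```

Reading the file in order is natural, which is why the method suits sequential files, but reaching the last block requires visiting every earlier block first.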
Indexed allocation
The file allocation table contains a separate one-level index for each file
The index has one entry for each portion allocated to the file
The file allocation table contains the block number of the index
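Indexed allocation can be sketched in the same style, assuming block-sized portions. The file allocation table points at an index block, and the index block lists the data blocks of the file; the block numbers, file name, and function name are hypothetical.

```python
def read_block(disk, fat, name, n):
    """Return the n-th data block of a file using its index block."""
    index_block = fat[name]       # the FAT holds the block number of the index
    index = disk[index_block]     # the index has one entry per allocated block
    return disk[index[n]]

indexed_disk = {24: [1, 8, 3], 1: "part1", 8: "part2", 3: "part3"}
indexed_fat = {"report": 24}
```

Any block of the file can be reached with two lookups, so both sequential and direct access are supported without following a chain.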