Lecture 9: File Systems and Linux Virtual File System
(Homework #4 Task 1 is included at slide no. 16)
Nov. 13, 2015
Kyu Ho Park
Files
File: a logical unit of information created by a process. Files are managed by the OS, which determines how they are structured, named, accessed, used, protected, and implemented.
File system: the part of the OS that deals with files.
Disk
Disk: it can be viewed as a linear sequence of fixed-size blocks that supports two operations: read block k and write block k.
Issues to be solved:
How do you find information?
How do you keep one user from reading another user's data?
How do you know which blocks are free?
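The disk abstraction can be sketched in a few lines of Python. This is a toy in-memory model, not a real driver; the class name BlockDevice, its method names, and the 512-byte block size are assumptions made for illustration.

```python
class BlockDevice:
    """Toy disk: a linear sequence of fixed-size blocks (hypothetical helper)."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        # Every block starts out filled with zero bytes.
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read_block(self, k):
        """Operation 1: read block k."""
        return self.blocks[k]

    def write_block(self, k, data):
        """Operation 2: write block k."""
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every block keeps the same size.
        self.blocks[k] = data.ljust(self.block_size, b"\0")
```

Everything a file system does (naming, protection, free-space tracking) has to be built on top of these two operations.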
File Systems
Two design problems:
How the file system should look to the user: define a file and its attributes, the operations allowed on a file, and the directory structure for organizing files.
Which algorithms and data structures map the logical file system onto the physical disk.
File System Overview [On Disk]
Boot block: contains the information the system needs to boot an operating system.
Volume control block: contains volume (or partition) details, such as the number of blocks in the partition, the block size, the free-block count and free-block pointers, and the free-FCB count and FCB pointers. In UNIX, it is called a superblock.
File System Overview
A directory structure per file system: in UNIX, it includes file names and their associated inode numbers.
A per-file FCB (file control block): in UNIX, it is called an i-node.
In-memory
An in-memory mount table: contains information about each mounted volume.
An in-memory directory-structure cache: holds the directory information of recently accessed directories.
System-wide open-file table: contains a copy of the FCB of each open file.
Per-process open-file table: contains a pointer to the appropriate entry in the system-wide open-file table.
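The split between the two open-file tables can be observed from user space. The sketch below assumes a POSIX system, where the current file offset is kept in the system-wide entry: a descriptor created with dup() points to the same entry and shares the offset, while a second open() of the same file gets its own entry. The function name is hypothetical.

```python
import os
import tempfile


def offsets_after_read():
    """Return (offset seen via dup'ed fd, offset seen via separately opened fd)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)

        dup_fd = os.dup(fd)                    # same system-wide entry as fd
        other_fd = os.open(path, os.O_RDONLY)  # new system-wide entry

        os.read(fd, 4)                         # moves the shared offset to 4

        shared = os.lseek(dup_fd, 0, os.SEEK_CUR)
        independent = os.lseek(other_fd, 0, os.SEEK_CUR)

        os.close(dup_fd)
        os.close(other_fd)
        return shared, independent
    finally:
        os.close(fd)
        os.unlink(path)
```

The dup'ed descriptor reports offset 4 even though it was never read from, while the separately opened descriptor is still at offset 0.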
In-Memory File System Structures
[Figure: read(index) indexes the per-process open-file table (user space), which points to the system-wide open-file table (kernel memory), which leads to the file-control block and the data blocks (secondary storage).]
Page Cache
A page cache caches pages rather than disk blocks, using virtual memory techniques.
Memory-mapped I/O uses the page cache.
Routine I/O through the file system uses the buffer (disk) cache.
This leads to the following figure.
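I/O Without a Unified Buffer Cache
[Figure: memory-mapped I/O goes through the page cache and then the buffer cache; I/O using read() and write() goes to the buffer cache directly; the buffer cache sits on top of the file system.]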
Unified Buffer Cache
A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O.
I/O Using a Unified Buffer Cache
[Figure: both memory-mapped I/O and I/O using read() and write() go through a single buffer cache, which sits on top of the file system.]
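The two I/O paths can be tried side by side. The sketch below writes through a memory mapping and then reads the same bytes back with an ordinary read(). It only shows that both paths see the same file data; it does not reveal how the kernel caches it. The function name is hypothetical, and a system with shared read/write mappings (such as Linux) is assumed.

```python
import mmap
import os
import tempfile


def write_via_mmap_then_read():
    """Modify a file through mmap, then read it back with read()."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello world")
        # Length 0 maps the whole file; the mapping is shared and writable.
        with mmap.mmap(fd, 0) as mapping:
            mapping[0:5] = b"HELLO"   # memory-mapped I/O path
            mapping.flush()           # push the change to the file
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 11)        # ordinary read() path
    finally:
        os.close(fd)
        os.unlink(path)
```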
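File Structure
Three kinds of files: (a) byte sequence, (b) record sequence, (c) tree.
[Figure: (a) a sequence of 1-byte units; (b) a sequence of fixed-size records; (c) a tree of records keyed by animal names such as Ant, Fox, Pig, Cat, Cow, Dog, Goat, Lion, Owl, Pony, Rat, Worm, Hen, Ibis, and Lamb.]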
File Access
Sequential access: read all bytes/records from the beginning; cannot jump around, but can rewind or back up; convenient when the medium was magnetic tape.
Random access: bytes/records can be read in any order; essential for database systems. A read can either move the file marker (seek) and then read, or read and then move the file marker.
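Random access with "seek, then read" might look like the following. The fixed 8-byte record size and the function names are assumptions chosen for this sketch.

```python
import os
import tempfile

RECORD_SIZE = 8  # assumed fixed record length for this sketch


def read_record(fd, n):
    """Move the file marker to record n (seek), then read it."""
    os.lseek(fd, n * RECORD_SIZE, os.SEEK_SET)
    return os.read(fd, RECORD_SIZE)


def demo_random_access():
    """Write five records sequentially, then read record 3 and record 0."""
    fd, path = tempfile.mkstemp()
    try:
        for i in range(5):
            os.write(fd, f"rec{i:05d}".encode())  # e.g. b"rec00003"
        third = read_record(fd, 3)
        first = read_record(fd, 0)   # jumping backwards is allowed
        return third, first
    finally:
        os.close(fd)
        os.unlink(path)
```

A purely sequential device would have to rewind and read records 0 to 2 before reaching record 3.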
File Operations
1. open, create, close, read, write, lseek, unlink, remove
2. umask, access, chmod, fchmod, chown, fchown, link, rename, symlink, readlink, stat, fstat
3. mkdir, rmdir, opendir, closedir, readdir, chdir, getcwd
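A few of these operations are shown below through Python's os module, which wraps the underlying system calls on a POSIX system. This is only a sketch: the file names are made up, and os.listdir stands in for the opendir/readdir/closedir sequence.

```python
import os
import stat
import tempfile


def demo_file_ops():
    """Exercise create, write, chmod, stat, rename, readdir, unlink, rmdir."""
    workdir = tempfile.mkdtemp()                    # mkdir
    old = os.path.join(workdir, "a.txt")
    new = os.path.join(workdir, "b.txt")

    fd = os.open(old, os.O_CREAT | os.O_WRONLY, 0o644)  # open with create
    os.write(fd, b"data")                           # write
    os.close(fd)                                    # close

    os.chmod(old, 0o600)                            # chmod
    mode = stat.S_IMODE(os.stat(old).st_mode)       # stat: permission bits

    os.rename(old, new)                             # rename
    names = os.listdir(workdir)                     # directory listing
    size = os.stat(new).st_size                     # stat: file size

    os.unlink(new)                                  # unlink
    os.rmdir(workdir)                               # rmdir
    return mode, names, size
```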
Homework #4 Task 1
Explain the red-colored functions among the file operations shown in the previous slide and write your own program using each of them.
Submit the execution results of each function as screen captures from your computer.
Due: Nov. 17, 23:59. Submit your report to Joo Kyung Ro <[email protected]>.
Directories: Single-Level Directory Systems
A single-level directory system containing 4 files owned by 3 different people, A, B, and C.
[Figure: the root directory holds four files, owned by A, B, C, and C.]
Two-Level Directory Systems
Letters indicate the owners of the directories and files.
[Figure: the root directory contains one user directory each for A, B, and C; each user directory contains that user's files.]
Hierarchical Directory Systems
[Figure: a hierarchical directory system. The root directory contains user directories A, B, and C; each user directory contains user files and user subdirectories, which in turn contain more files.]
Path Names
[Figure: a UNIX directory tree. The root directory contains bin, etc, lib, user, and tmp; user contains the directories ast and jim; /user/jim is the path name of jim's directory.]
File System Implementation
[Figure: a possible file system layout. The entire disk starts with the MBR and the partition table, followed by the disk partitions; each partition holds a boot block, super block, free space management, i-nodes, root directory, and files and directories.]
File System Layout
1. Sector 0 holds the MBR (Master Boot Record), which is used to boot the computer.
2. The partition table gives the starting and ending address of each partition.
3. When the computer is booted, the BIOS reads in and executes the MBR.
4. The first task of the MBR program is to locate the active partition, read in its first block (the boot block), and execute it.
5. The program in the boot block loads the OS contained in that partition.
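Step 4, locating the active partition, can be sketched as follows. The layout used here is the classic PC MBR format, which the slide does not spell out: four 16-byte entries starting at byte 446, a status byte of 0x80 for the active partition, and the signature 0x55 0xAA in the last two bytes. Each entry stores a starting sector and a sector count. The function names and the sample values are hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # partition table starts here in the classic MBR
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80
# status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), start LBA, sector count
ENTRY_FORMAT = "<B3xB3xII"


def find_active_partition(mbr):
    """Return (index, type, start_lba, num_sectors) of the active partition, or None."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, lba, count = struct.unpack(
            ENTRY_FORMAT, mbr[start:start + ENTRY_SIZE])
        if status == ACTIVE_FLAG:
            return i, ptype, lba, count
    return None


def build_sample_mbr():
    """Build a synthetic MBR whose second entry is marked active (test data only)."""
    mbr = bytearray(SECTOR_SIZE)
    entry = struct.pack(ENTRY_FORMAT, ACTIVE_FLAG, 0x83, 2048, 204800)
    mbr[TABLE_OFFSET + ENTRY_SIZE:TABLE_OFFSET + 2 * ENTRY_SIZE] = entry
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)
```

A real MBR program would next read the sector at the returned starting address (the boot block) and jump to it.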
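Bootloader
[Figure: after power-on, code in ROM (flash) performs CPU initialization, sets up the registers, memory, and clock, and copies the bootloader code to the RAM area; the bootloader copy in RAM then starts the kernel. The memory map shows the ROM (flash) area with the bootloader code, the RAM area with the bootloader copy and the kernel start, and the I/O address area.]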
Implementing Files: Contiguous Allocation
(a) Contiguous allocation of disk space for 7 files.
(b) State of the disk after files D and E have been removed.
Linked List Allocation
Storing a file as a linked list of disk blocks.
[Figure: File A consists of file blocks 0 to 4, stored in physical blocks 4, 7, 2, 10, and 12. File B consists of file blocks 0 to 3, stored in physical blocks 6, 3, 11, and 14. Each block points to the next one, and the last block of each file holds a null pointer (0).]
File Allocation Table (FAT)
Linked list allocation using a file allocation table in RAM.
Physical block   Next block
0
1
2                10
3                11
4                7     (File A starts here)
5
6                3     (File B starts here)
7                2
8
9
10               12
11               14
12               -1
13
14               -1
15
Blank entries are unused blocks; -1 marks the last block of a file.
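Following a chain through the table is a simple loop. The dictionary below copies the table entries from the slide; the function name is hypothetical.

```python
# Table entries from the slide: physical block -> next physical block.
FAT = {2: 10, 3: 11, 4: 7, 6: 3, 7: 2, 10: 12, 11: 14, 12: -1, 14: -1}
END_OF_FILE = -1


def file_blocks(fat, start):
    """Return the physical blocks of a file, in order, given its first block."""
    blocks = []
    block = start
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]   # one in-memory lookup per block, no disk access
    return blocks
```

File A (starting at block 4) resolves to 4, 7, 2, 10, 12 and file B (starting at block 6) to 6, 3, 11, 14, matching the linked-list figure. Because the table is in RAM, reaching the n-th block of a file needs no disk reads, unlike the plain linked list.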
i-node
[Figure: an example i-node. It holds the file attributes, the addresses of disk blocks 0 through 7, and the address of a block of pointers, that is, a disk block containing additional disk addresses.]
Implementing Directories
1. When a file is opened, the OS uses the path name supplied by the user to locate the directory entry.
2. The directory entry provides the information needed to find the disk blocks.
3. Depending on the system, this information may be the disk address of the entire file (contiguous allocation), the number of the first block (linked-list allocation), or the number of the i-node.
Directory
The main function of the directory system is to map the ASCII name of a file onto the information needed to locate its data.
Attributes of a file: the file's owner, creation time, modification time, and so on.
Implementing Directories (1)
(a) A simple directory with fixed-size entries; the disk addresses and attributes are stored in the directory entry.
(b) A directory in which each entry just refers to an i-node.
[Figure: both directories list the entries games, mail, news, and work. In (a) each entry carries its attributes; in (b) each entry points to a data structure containing the attributes.]
Implementing Directories (2)
Two ways of handling long file names in a directory: (a) in-line, (b) in a heap.
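Shared Files
[Figure: a file system containing a shared file. The root directory contains user directories A, B, and C; one of C's files also appears in one of B's directories (marked "?"), which makes it a shared file.]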
Hard Link and Soft Link
Hard link
A file name included in a directory is called a file hard link (or simply a link).
The same file may have several links, and therefore several file names.
ln file1 file2
Limitations:
It is not possible to create hard links to directories.
Links can be created only among files in the same file system.
Hard Link and Soft Link
Soft link (also called symbolic link)
Introduced to overcome the limitations of hard links.
Symbolic links are short files that contain an arbitrary pathname of another file.
ln -s file1 file2
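The difference between the two kinds of link shows up when the original name is removed. The sketch below assumes a POSIX system; the file names and the function name are made up.

```python
import os
import tempfile


def demo_links():
    """Create a hard link and a soft link to the same file and compare them."""
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "file1")
    hard = os.path.join(workdir, "hard")
    soft = os.path.join(workdir, "soft")
    with open(target, "w") as f:
        f.write("x")

    os.link(target, hard)      # like: ln file1 hard
    os.symlink(target, soft)   # like: ln -s file1 soft

    result = {
        # A hard link is another name for the same i-node.
        "same_inode": os.stat(target).st_ino == os.stat(hard).st_ino,
        "link_count": os.stat(target).st_nlink,
        # A soft link is a separate short file holding a pathname.
        "soft_is_link": os.path.islink(soft),
        "soft_points_to_target": os.readlink(soft) == target,
    }

    os.unlink(target)          # remove the original name
    result["hard_survives"] = os.path.exists(hard)
    result["soft_dangles"] = not os.path.exists(soft)

    os.unlink(hard)
    os.unlink(soft)
    os.rmdir(workdir)
    return result
```

After the original name is unlinked, the data is still reachable through the hard link, while the soft link is left pointing at a pathname that no longer exists.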
File Types
Regular file
Directory
Symbolic link
Block device file
Character device file
Pipe and named pipe (FIFO)
Socket
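A program can tell these types apart from the type bits in a file's mode. The sketch below uses the predicates in Python's stat module; the function names and file names are hypothetical, and only three of the seven types are exercised in the demo.

```python
import os
import stat
import tempfile


def file_type(path):
    """Classify a path by the type bits of st_mode (lstat does not follow links)."""
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISBLK, "block device file"),
        (stat.S_ISCHR, "character device file"),
        (stat.S_ISFIFO, "FIFO"),
        (stat.S_ISSOCK, "socket"),
    ]
    for predicate, name in checks:
        if predicate(mode):
            return name
    return "unknown"


def demo_file_types():
    """Create a regular file, a directory, and a symbolic link and classify them."""
    workdir = tempfile.mkdtemp()
    regular = os.path.join(workdir, "plain.txt")
    link = os.path.join(workdir, "link")
    with open(regular, "w") as f:
        f.write("x")
    os.symlink(regular, link)
    result = [file_type(regular), file_type(workdir), file_type(link)]
    os.unlink(link)
    os.unlink(regular)
    os.rmdir(workdir)
    return result
```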
Disk Block Size and Disk I/O Speed
[Figure: data rate and disk space efficiency as functions of block size. The dark line (left-hand scale) gives the data rate of a disk; the dotted line (right-hand scale) gives the disk space efficiency. All files are 2 KB.]
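Linux File System
Filesystem Layout
[Figure: the disk /dev/hda is divided into the partitions /dev/hda1, /dev/hda2, and /dev/hda3. An Ext2 file system on a partition consists of a boot block followed by block group 0, block group 1, ..., block group n. Each block group contains a super block, group descriptor, block bitmap, inode bitmap, inode table, and data blocks.]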
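ext2 inode
[Figure: an ext2_inode holds fields such as i_mode, i_blocks, i_links_count, i_uid, i_gid, and the time stamps i_atime, i_ctime, and i_mtime, plus 12 direct block pointers and 3 indirect block pointers (single, double, and triple). The mode field combines a 4-bit type (S_IFSOCK, S_IFLNK, S_IFREG, S_IFBLK, S_IFDIR, S_IFCHR, S_IFIFO), the U, G, and S bits (set-user-ID, set-group-ID, sticky), and the rwx permission bits for owner, group, and others. Each indirect block holds entries 0 to 1023.]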
Size of a file
Size of a block: 4 KB
Direct blocks: 12 x 4K = 48K
Indirect blocks: 1 x 1K x 4K = 4M
Double indirect blocks: 1 x 1K x 1K x 4K = 4G
Triple indirect blocks: 1 x 1K x 1K x 1K x 4K = 4T
Maximum size of a file = 48K + 4M + 4G + 4T, about 4 TB (4,004,004,048,000 bytes if K is counted as 1000; 4,402,345,721,856 bytes with K = 1024)
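The same arithmetic can be written as a small function. The 4-byte pointer size is an assumption; it is what makes a 4 KB block hold the 1K (1024) pointers used in the calculation. The function name is hypothetical.

```python
def max_file_size(block_size=4096, pointer_size=4, direct=12):
    """Bytes addressable through direct, single, double, and triple indirect blocks."""
    pointers_per_block = block_size // pointer_size   # 1024 for the values above
    data_blocks = (
        direct                        # 12 direct blocks
        + pointers_per_block          # single indirect
        + pointers_per_block ** 2     # double indirect
        + pointers_per_block ** 3     # triple indirect
    )
    return data_blocks * block_size
```

With the default values this gives 4,402,345,721,856 bytes. This is only the limit implied by the pointer structure; other fields of a real file system may impose a lower limit.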
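inode, file and directory
[Figure: the root directory / has inode 5, whose data is in disk block 10. That block holds the directory entries "." (inode 5), ".." (parent), my_file.c (inode 15), and my_dir (inode 25). Inode 15 (my_file.c) points to data blocks 11, 12, 13, ...; inode 25 (my_dir) points to block 21. Each inode also stores i_mode, time stamps, and other attributes.]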
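Path name lookup over these structures can be sketched with the inode and block numbers from the figure. The contents of block 21 and the ".." entry of the root are not given on the slide and are assumed here; the table and function names are hypothetical.

```python
# Inode table: inode number -> type and data blocks (numbers from the figure).
INODES = {
    5:  {"type": "dir",  "blocks": [10]},
    15: {"type": "file", "blocks": [11, 12, 13]},
    25: {"type": "dir",  "blocks": [21]},
}
# Directory blocks: block number -> {name: inode number}.
DIR_BLOCKS = {
    10: {".": 5, "..": 5, "my_file.c": 15, "my_dir": 25},  # ".." of root assumed to be root
    21: {".": 25, "..": 5},                                # assumed contents
}
ROOT_INODE = 5


def lookup(path):
    """Resolve an absolute path to an inode number, one component at a time."""
    inode = ROOT_INODE
    for name in path.strip("/").split("/"):
        if not name:
            continue                     # path was "/" or had a doubled slash
        node = INODES[inode]
        if node["type"] != "dir":
            raise NotADirectoryError(path)
        for block in node["blocks"]:     # scan the directory's data blocks
            if name in DIR_BLOCKS[block]:
                inode = DIR_BLOCKS[block][name]
                break
        else:
            raise FileNotFoundError(path)
    return inode
```

Each path component costs one directory search: the directory's inode gives its data blocks, and the matching entry gives the next inode.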
Virtual File Systems
Virtual File Systems (VFS) provide an object-oriented way of implementing file systems.
VFS allows the same system call interface (the API) to be used for different types of file systems.
The API is written against the VFS interface rather than any specific type of file system.
VFS
Common file model: it consists of 4 object types.
Superblock object: information about a mounted filesystem.
Inode object: information about a specific file.
File object: information about the interaction between an open file and a process.
Dentry object: information about the linking of a directory entry with the corresponding file.
Object: a software construct that defines both a data structure and the methods that operate on it
VFS objects and processes
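Without Virtual Layer
[Figure: without a virtual layer, the user application (user level) must call file-system-specific functions directly, for example fat_file_write(test.c, ...) and ext2_create(my_file.c, ...). At kernel level, the msdos file system (generic_file_read, fat_file_write, fat_truncate, generic_read_dir, ...) holds test.c and the Ext2 file system (generic_file_read, generic_file_write, ext2_truncate, ext2_readdir, ext2_create, ...) holds my_file.c; the two file systems reside on the disk partitions /dev/hda2 and /dev/hda3.]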
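Virtual File System
[Figure: with the virtual file system layer, the user application (user level) uses generic calls such as write(test.c, ...) and create(my_file.c, ...). At kernel level, each call is dispatched to the file system that holds the file: msdos (generic_file_read, fat_file_write, fat_truncate, generic_read_dir, ...) for test.c, or Ext2 (generic_file_read, generic_file_write, ext2_truncate, ext2_readdir, ext2_create, ...) for my_file.c; the two file systems reside on the disk partitions /dev/hda2 and /dev/hda3.]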
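The idea of the virtual layer can be sketched as a dispatch through per-file-system operation tables. This is a toy model in Python, not kernel code; the class names, mount points, and returned strings are all hypothetical.

```python
class Ext2Ops:
    """Stand-in for the Ext2 operation table (hypothetical)."""

    def write(self, name, data):
        return f"ext2 wrote {len(data)} bytes to {name}"


class MsdosOps:
    """Stand-in for the msdos operation table (hypothetical)."""

    def write(self, name, data):
        return f"msdos wrote {len(data)} bytes to {name}"


# Assumed mount table: mount point -> operations of the mounted file system.
MOUNT_TABLE = {"/mnt/linux": Ext2Ops(), "/mnt/dos": MsdosOps()}


def vfs_write(path, data):
    """Generic write: pick the file system from the path, then call its write."""
    for mount_point, ops in MOUNT_TABLE.items():
        prefix = mount_point + "/"
        if path.startswith(prefix):
            return ops.write(path[len(prefix):], data)
    raise FileNotFoundError(path)
```

The caller uses the same vfs_write for both files and never names a file-system-specific function; supporting a new file system only requires a new operations class and a mount table entry.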
operations in VFS
[Figure: the open path through the kernel layers. System call layer: sys_open() (fs/open.c) calls get_unused_fd_flags(), do_file_open(), and fd_install(fd, f). VFS layer: file_open() (fs/namei.c) initializes the struct file and calls file->f_op->open(). Specific file layer: fifo_open(), blkdev_open(), chrdev_open(), sock_no_open().]
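task_struct and VFS Objects
[Figure: a task_struct (fields ..., files, fs, ...) points to a files_struct (include/linux/fdtable.h) holding the descriptor array fd[0], fd[1], fd[2], fd[3]. A descriptor points to a file object (include/linux/fs.h; fields f_dentry, f_pos, f_op), which points to a dentry (include/linux/dcache.h; fields d_inode, d_op), which points to an inode (include/linux/fs.h; fields i_sb, i_op), which points to a super_block (include/linux/fs.h; fields s_bdev, s_op) for the ext2 file system on /dev/hda2 on the disk.]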
New Trend: Storage Class Memory
System Evolution
CPU-RAM-Disk
CPU-RAM-SSD-Disk
CPU-RAM-SCM-Disk
Hierarchy of Latency (Freitas & Wilcke, IBM J. Res. & Dev., 2008)
Level           Latency (CPU cycles)
Disk            10E7-10E8
SCM             10E3
DRAM            10E2
L2, L3 cache    10-100
L1              1
Emerging NVRAM Technology [1]
Phase Change Memory (PCRAM or PRAM)
STT-RAM (MRAM)
Memristor (RRAM)
[1] Taciano Perez, Cesar A. F. De Rose, "Non-Volatile Memory: Emerging Technologies and Their Impacts on Memory Systems", Technical Report (Pontificia Universidade)
New Memory Architectures (NVRAMOS 2009)
An Empirical Study using NVRAM
Performance/energy tradeoffs on NVRAM
Operating system support for NVRAM
Green data center with NVRAM
[Figure: four system configurations. RAM-Flash: CPU, RAM, and NOR/NAND flash. RAM-SCM: CPU, RAM, and SCM. SCM-Flash: CPU, SCM, and NOR/NAND flash. SCM-Only: CPU and SCM. The figure marks which configuration gives the best performance and which gives the best power efficiency.]
New Memory Architectures (NVRAMOS 2009)
I/O-bound job: SCM is best
Memory-bound job: DRAM is best
CPU-bound job: little impact
SCM-Only System (NVRAMOS 2009)
SCM has great potential to reduce energy consumption.
Using SCM as memory causes performance degradation.
[Figure: SCM-Only configuration, with the CPU attached directly to SCM.]
Operating System Support for SCM (NVRAMOS 2009)
Unified file objects and memory objects
Eliminate redundant I/O accesses
mmap-like operations for all memory/disk data
[Figure: with a disk, a file is reached through the buffer cache, so the same data is accessed twice on an update; with SCM, file and memory are unified.]
Adaptive Context Switch (NVRAMOS 2009)
Context switch on block devices
Fast block devices with SCM: no need to context switch, which takes 100 us; keep using the whole schedule quantum
5~10% performance improvement
Shared object accesses in multiple threads cause performance degradation/malfunctions
[Figure: CPU, RAM, and SCM; an I/O access takes less than 1 ms.]