File System Implementation
CSCI 444/544 Operating Systems
Fall 2008
Agenda
• File system layout
• Free-space management
• Directory implementation
• File caching
File System Layout
Overall question: how to organize files on disk?
• This is really just a data structure issue
– What data structure is the right one to use to store a file on disk?
– Usage patterns matter
• Many issues in OS boil down to data structure and algorithms
– Disk scheduling is similar to the traveling salesman problem
File System Usage Patterns
• 80% of file accesses are reads
• Most programs that use a file sequentially access the whole file
– Spatial locality
– Pre-fetching
• Most files are small, although most bytes are taken up by large files
File Allocation
How do we lay out the blocks of a file on disk?
Many different approaches
• Contiguous
• Linked list
• Indexed
Implications
• Large files should be allocated sequentially
• Files in same directory should be allocated near each other
• Data should be allocated near its meta-data
Design Metrics
• Amount of fragmentation (internal and external)?
• Ability to grow file over time?
• Seek cost for sequential accesses?
• Speed to find data blocks for random accesses?
• Wasted space for pointers to data blocks?
Contiguous Allocation
Allocate each file to contiguous blocks on disk
• Meta-data: Starting block and size of file
• OS allocates by finding sufficient free space
• Example: IBM OS/360
Advantages
• Little overhead for meta-data
• Excellent performance for sequential accesses
• Simple to calculate random addresses
Drawbacks
• Horrible external fragmentation (requires periodic compaction)
• May not be able to grow file without moving it
A A A B B B B C C C
Contiguous Allocation of Disk Space
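The "simple to calculate random addresses" point can be sketched in a few lines. This is only an illustration, not code from any real file system: the names `ContiguousFile` and `physical_block` and the 1 KB block size are assumptions made for the example.

```python
from dataclasses import dataclass

BLOCK_SIZE = 1024  # assumed block size in bytes (illustrative only)


@dataclass
class ContiguousFile:
    start_block: int  # meta-data: first disk block of the file
    num_blocks: int   # meta-data: size of the file in blocks


def physical_block(f: ContiguousFile, byte_offset: int) -> int:
    """Map a byte offset within the file to a disk block number."""
    logical = byte_offset // BLOCK_SIZE
    if byte_offset < 0 or logical >= f.num_blocks:
        raise ValueError("offset outside file")
    # One addition: no pointers to follow, no index to consult.
    return f.start_block + logical
```

For a file starting at block 3 with 4 blocks, byte offset 2048 falls in logical block 2, so it maps to disk block 5.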
Extent-Based Allocation
Allocate multiple contiguous regions (extents) per file
• Meta-data: Small array (2-6) designating each extent
– Each entry: starting block and size
Improves contiguous allocation
• File can grow over time (until run out of extents)
• Helps with external fragmentation
Advantages
• Limited overhead for meta-data
• Very good performance for sequential accesses
• Simple to calculate random addresses
Disadvantages (small number of extents):
• External fragmentation can still be a problem
• Not able to grow file when run out of extents
D A A A B B B B C C C B B D D
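A rough sketch of address calculation with extents, assuming each extent is stored as a (starting block, length) pair as described above. The function name `extent_lookup` is hypothetical.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (starting disk block, length in blocks)


def extent_lookup(extents: List[Extent], logical: int) -> int:
    """Return the disk block that holds logical block `logical` of a file."""
    if logical < 0:
        raise IndexError("negative block index")
    for start, length in extents:
        if logical < length:
            return start + logical  # falls inside this extent
        logical -= length           # skip this extent and keep looking
    raise IndexError("logical block beyond end of file")
```

If the row of letters above is read as disk blocks 0 through 14 (an assumption about the figure), file B would have extents (4, 4) and (11, 2), and `extent_lookup([(4, 4), (11, 2)], 4)` gives block 11. The loop touches at most a handful of entries, which is why random access stays cheap.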
Linked Allocation
Allocate linked-list of fixed-sized blocks
• Meta-data: Location of first block of file
– Each block also contains pointer (first word) to next block
Advantages
• No external fragmentation
• Files can be easily grown, with no limit
Disadvantages
• Random access is extremely slow
• Unreliable: what if you lose one block in chain?
D A A A B B B B C C C B B D D D D B
Linked Allocation of Disk Space
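Why random access is slow can be seen in a small simulation. The "disk" here is just a Python dict and `read_nth_block` is a hypothetical helper; the point is only that reaching logical block n costs n + 1 block reads.

```python
from typing import Dict, Optional, Tuple

# Simulated disk: block number -> (pointer to next block or None, payload)
Disk = Dict[int, Tuple[Optional[int], bytes]]


def read_nth_block(disk: Disk, first_block: int, n: int) -> Tuple[bytes, int]:
    """Follow the chain to logical block n; return (payload, blocks read)."""
    if n < 0:
        raise IndexError("negative block index")
    block: Optional[int] = first_block
    payload = b""
    reads = 0
    for _ in range(n + 1):
        if block is None:
            raise IndexError("chain ended before block n")
        next_block, payload = disk[block]  # one disk read per hop
        reads += 1
        block = next_block
    return payload, reads
```

With a three-block chain 4 -> 7 -> 2, reading logical block 2 takes three reads. A missing or corrupted block anywhere in the chain also cuts off everything after it, which is the reliability concern.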
File-Allocation Table (FAT)
Variation of linked allocation
• Keep linked-list information for all files in one on-disk table (the FAT)
• Meta-data: Location of first block of file
– And the FAT itself
Comparison to linked allocation
• Same basic advantages and disadvantages
• Full block available for data (next pointers live in the FAT, not in the blocks)
• Optimization: keep the FAT cached in main memory
– Greatly improves random accesses
D A A A B B B B C C C B B D D D D B
File-Allocation Table
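A minimal sketch of walking a FAT held in memory. The end-of-chain marker `EOF = -1` and the function name `fat_chain` are assumptions for the example; real FAT variants reserve specific entry values for this.

```python
from typing import List

EOF = -1  # assumed end-of-chain marker (illustrative, not a real FAT value)


def fat_chain(fat: List[int], first_block: int) -> List[int]:
    """List a file's disk blocks, in order, by walking the in-memory FAT."""
    blocks: List[int] = []
    block = first_block
    while block != EOF:
        if len(blocks) > len(fat):
            raise ValueError("cycle in FAT chain")
        blocks.append(block)
        block = fat[block]  # table lookup in memory, no disk read
    return blocks
```

The chain is the same as in linked allocation, but every hop is a memory access rather than a disk read, so finding logical block n (`fat_chain(fat, first)[n]`) needs no disk I/O until the data block itself is read.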
Indexed Allocation
Brings all pointers together into the index block.
Logical view: index table
• each file has an index table (i-node in UNIX)
– a collection of pointers to file’s blocks
• only need to load index tables (i-nodes) into memory when you open files
Example of Indexed Allocation
Indexed Allocation
Advantages
• No external fragmentation
• Files can be easily grown, with no limit
• Supports random access
Disadvantages
• Large overhead for meta-data:
– Wastes space for unneeded pointers (most files are small!)
Multi-Level Indexed Files
Variation of Indexed Allocation
• Dynamically allocate hierarchy of pointers to blocks as needed
• Meta-data: Small number of pointers allocated statically
– Additional pointers to blocks of pointers
• Examples: UNIX file systems
[Figure labels: indirect, double indirect, triple indirect]
Comparison to Indexed Allocation
• Advantage: does not waste space for unneeded pointers
– Still fast access for small files
– Can also grow to a very large size
• Disadvantage: need to read indirect blocks of pointers to calculate addresses (extra disk read)
– Keep indirect blocks cached in main memory
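The address calculation can be sketched as below. The defaults (10 direct pointers, 256 pointers per indirect block) are assumptions chosen for illustration, and `classify_block` is a hypothetical name, not a UNIX kernel routine.

```python
from typing import List, Tuple


def classify_block(logical: int, n_direct: int = 10,
                   ptrs_per_block: int = 256) -> Tuple[int, List[int]]:
    """Return (level, indices) for a logical block number.

    level 0 = direct pointer, 1 = single indirect, 2 = double, 3 = triple.
    indices lists the slot to follow at each step, outermost first.
    """
    if logical < 0:
        raise ValueError("negative block number")
    if logical < n_direct:
        return 0, [logical]
    rem = logical - n_direct
    span = ptrs_per_block  # blocks reachable through this level
    for level in (1, 2, 3):
        if rem < span:
            idx = []
            for _ in range(level):
                idx.append(rem % ptrs_per_block)
                rem //= ptrs_per_block
            return level, idx[::-1]
        rem -= span
        span *= ptrs_per_block
    raise ValueError("block number beyond triple indirect range")
```

Under these assumptions, logical block 9 is direct, block 10 is slot 0 of the single indirect block, and block 266 is the first block reached through the double indirect pointer. The level equals the number of pointer blocks that must be read (from disk or cache) before the data block.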
The UNIX I-node
i-nodes
Attributes:
• File type, size
• Owner, group, permissions (r/w/x)
• Times: creation, last access, last modified
• Reference count
Block Addresses
• Direct
• Indirect
File Attributes
Address of block 0
Address of block 1
…
Address of block N
Single Indirect
Double Indirect
Triple Indirect
i-nodes
Assume: N=10, 1KB blocks, 4 byte entries
• Direct: 10 KB
• Single indirect: 256 KB
• Double indirect: 64 MB
• Triple indirect: 16 GB!
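These figures follow from 1 KB / 4 bytes = 256 pointers per block. A quick check of the arithmetic, with `inode_capacity` as a hypothetical helper name:

```python
from typing import Dict


def inode_capacity(n_direct: int = 10, block_size: int = 1024,
                   ptr_size: int = 4) -> Dict[str, int]:
    """Bytes addressable through each kind of i-node pointer."""
    ptrs = block_size // ptr_size  # pointers that fit in one block
    return {
        "direct": n_direct * block_size,
        "single": ptrs * block_size,
        "double": ptrs ** 2 * block_size,
        "triple": ptrs ** 3 * block_size,
    }


caps = inode_capacity()
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3
print(caps["direct"] // KB, "KB")  # 10 KB
print(caps["single"] // KB, "KB")  # 256 KB
print(caps["double"] // MB, "MB")  # 64 MB
print(caps["triple"] // GB, "GB")  # 16 GB
```

Each extra level of indirection multiplies the reachable size by 256 under these assumptions, which is why three levels are enough for very large files.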
File System Layout
A possible file system layout
File System Layout
Partitions: independent file systems
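MBR (Master Boot Record): used to boot the computer; its code locates the active partition and runs that partition's boot block
Boot block: first block of the partition; executed first to load the OS stored there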
Superblock: Info about the file system
• Contains all the key parameters about the file system: # of files, # of blocks, # of free blocks, etc.
Free-space Management
Must keep track of blocks that are free
• Bitmap (bit vector)
• Linked list
• Grouping
– The first free block stores the addresses of n free blocks
– The first n-1 of these blocks are actually free
– The last block contains the addresses of another n free blocks, and so on.
Bitmap
Bit vector (n blocks), one bit per block, numbered 0 1 2 … n-1
bit[i] = 0: block[i] free
bit[i] = 1: block[i] occupied
Block number calculation (first free block)
(number of bits per word) * (number of all-1s words skipped) + offset of first 0 bit
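A sketch of the first-free-block search, using the convention stated above (0 = free, 1 = occupied), so it skips fully occupied (all-1s) words and then looks for the first 0 bit. The 32-bit word size, the choice that bit 0 is the least significant bit, and the name `first_free_block` are assumptions.

```python
from typing import List, Optional

WORD_BITS = 32                # assumed machine word size
FULL = (1 << WORD_BITS) - 1   # a word in which every block is occupied


def first_free_block(words: List[int], n_blocks: int) -> Optional[int]:
    """Return the lowest-numbered free block, or None if the disk is full."""
    for w_index, word in enumerate(words):
        if word == FULL:
            continue  # whole word occupied: skip it with one comparison
        inverted = ~word & FULL  # free blocks now show up as 1 bits
        # Isolate the lowest set bit, then convert it to a bit position.
        offset = (inverted & -inverted).bit_length() - 1
        block = w_index * WORD_BITS + offset
        return block if block < n_blocks else None
    return None
```

With two full words followed by a word whose low three bits are set, the result is 2 * 32 + 3 = 67. Checking a whole word at a time is what makes the bitmap search cheap when most of the disk is in use.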
Linked Free Space List on Disk
Implementing Directories (1)
(a) A simple directory
• fixed size entries
• disk addresses and attributes in directory entry
(b) Directory in which each entry just refers to an i-node
Implementing Directories (2)
Two ways of handling long file names in directory
• (a) In-line
• (b) In a heap
File Caching
File system has lots of data structures on disk
• Meta-data: bitmap of free blocks, directories, i-nodes, indirect blocks
• Data blocks
File caches speed access to all these types of data
• Changing disk I/O to memory access
• Response time can improve by as much as 1,000,000x
• Write-through or write-back?
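The write-through versus write-back question can be made concrete with a toy cache. This is a sketch under simplifying assumptions (a dict stands in for the disk, the cache never evicts, and the class name `BlockCache` is made up), not how any particular OS buffer cache is written.

```python
from typing import Dict, Set


class BlockCache:
    """Toy block cache that counts simulated disk operations."""

    def __init__(self, disk: Dict[int, bytes], write_back: bool) -> None:
        self.disk = disk              # simulated disk: block number -> data
        self.write_back = write_back  # False means write-through
        self.cache: Dict[int, bytes] = {}
        self.dirty: Set[int] = set()  # blocks changed but not yet on disk
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, block: int) -> bytes:
        if block not in self.cache:   # miss: go to disk once
            self.cache[block] = self.disk[block]
            self.disk_reads += 1
        return self.cache[block]

    def write(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        if self.write_back:
            self.dirty.add(block)     # defer the disk write
        else:
            self.disk[block] = data   # write-through: disk updated now
            self.disk_writes += 1

    def flush(self) -> None:
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
            self.disk_writes += 1
        self.dirty.clear()
```

Three writes to the same block cost three disk writes with write-through, but only one (at flush time) with write-back. The price of write-back is that the disk copy is stale until the flush, so a crash in between loses the updates.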