file system implementation csci 444/544 operating systems fall 2008

Post on 20-Dec-2015


File System Implementation

CSCI 444/544 Operating Systems

Fall 2008

Agenda

• File system layout

• Free-space management

• Directory implementation

• File caching

File System Layout

Overall question: how to organize files on disk?

• This is really just a data structure issue
– What data structure is the right one to use to store a file on disk?
– Usage patterns matter

• Many issues in OS boil down to data structure and algorithms

– Disk scheduling is similar to traveling salesman problem

File System Usage Patterns

• 80% of file accesses are reads
• Most programs that read a file sequentially access the whole file
– Spatial locality
– Pre-fetching

• Most files are small, although most bytes are taken up by large files

File Allocation

How do we lay out the blocks of a file on disk?

Many different approaches
• Contiguous
• Linked list
• Indexed

Implications
• Large files should be allocated sequentially
• Files in the same directory should be allocated near each other
• Data should be allocated near its meta-data

Design Metrics

• Amount of fragmentation (internal and external)?
• Ability to grow file over time?
• Seek cost for sequential accesses?
• Speed to find data blocks for random accesses?
• Wasted space for pointers to data blocks?

Contiguous Allocation

Allocate each file to contiguous blocks on disk
• Meta-data: starting block and size of file
• OS allocates by finding sufficient free space
• Example: IBM OS/360

Advantages
• Little overhead for meta-data
• Excellent performance for sequential accesses
• Simple to calculate random addresses

Drawbacks
• Horrible external fragmentation (requires periodic compaction)
• May not be able to grow file without moving it

A A A B B B B C C C

Contiguous Allocation of Disk Space
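A minimal sketch of the two points above: the random-address calculation and how external fragmentation blocks an allocation. The function names and the 10-block disk are assumptions made for illustration, not part of the slides.

```python
def physical_block(start, length, logical):
    """Map a logical block number to a disk block for a contiguous file.

    The meta-data is just (start, length), so the lookup is one addition.
    """
    if not 0 <= logical < length:
        raise IndexError("logical block outside file")
    return start + logical


def first_fit(free_map, size):
    """Return the first index where `size` consecutive free blocks begin, or None."""
    run = 0
    for i, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == size:
            return i - size + 1
    return None


# Hypothetical disk of 10 blocks: True means free.
# Four blocks are free in total, but no free run is longer than two,
# so a 3-block file cannot be placed without compaction.
disk = [False, True, True, False, False, True, False, True, False, False]

print(physical_block(start=3, length=4, logical=2))
print(first_fit(disk, 2))
print(first_fit(disk, 3))
```

First-fit is only one possible placement policy; it is used here because it is the shortest to write down.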

Extent-Based Allocation

Allocate multiple contiguous regions (extents) per file
• Meta-data: small array (2-6 entries) designating each extent
– Each entry: starting block and size

Improves contiguous allocation
• File can grow over time (until it runs out of extents)
• Helps with external fragmentation

Advantages
• Limited overhead for meta-data
• Very good performance for sequential accesses
• Simple to calculate random addresses

Disadvantages (small number of extents)
• External fragmentation can still be a problem
• Not able to grow file when it runs out of extents

D A A A B B B B C C C B BD D
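As a rough sketch of why random addresses stay simple to calculate with extents, the lookup below walks a short list of (start, length) pairs. The function name and the example extents are hypothetical.

```python
def extent_lookup(extents, logical):
    """Translate a logical block number using a list of (start, length) extents."""
    if logical < 0:
        raise IndexError("negative logical block")
    for start, length in extents:
        if logical < length:
            return start + logical
        logical -= length  # skip this extent and keep looking
    raise IndexError("logical block beyond last extent")


# Hypothetical file with two extents: blocks 4-7 and blocks 11-12.
file_b = [(4, 4), (11, 2)]

print([extent_lookup(file_b, i) for i in range(6)])
```

Because the array holds only a handful of entries, the loop runs at most a few iterations regardless of file size.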

Linked Allocation

Allocate a linked list of fixed-size blocks
• Meta-data: location of first block of file
– Each block also contains a pointer (first word) to the next block

Advantages
• No external fragmentation
• Files can be easily grown, with no limit

Disadvantages
• Random access is extremely slow
• Unreliable: what if you lose one block in the chain?

D A A A B B B B C C C B BD D D DB

Linked Allocation of Disk Space
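To make the random-access cost concrete, here is a small sketch that follows the next-pointers and counts how many blocks must be read. The dictionary standing in for the disk and the chain 4, 7, 2, 9 are assumptions for illustration.

```python
def read_linked(disk, first, logical):
    """Follow next-pointers from `first`.

    `disk` maps a block number to the next-pointer stored in that block
    (None marks the end of the file). Returns (block number, blocks read).
    """
    block, reads = first, 1  # the first block must always be read
    for _ in range(logical):
        block = disk[block]  # the pointer lives inside the block just read
        if block is None:
            raise IndexError("logical block beyond end of file")
        reads += 1
    return block, reads


# Hypothetical chain: 4 -> 7 -> 2 -> 9 -> end
disk = {4: 7, 7: 2, 2: 9, 9: None}

print(read_linked(disk, first=4, logical=0))
print(read_linked(disk, first=4, logical=3))
```

Reaching logical block k costs k + 1 block reads, which is why random access degrades linearly with the offset.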

File-Allocation Table (FAT)

Variation of linked allocation
• Keep linked-list information for all files in an on-disk file-allocation table (FAT)
• Meta-data: location of first block of file
– And the FAT itself

Comparison to linked allocation
• Same basic advantages and disadvantages
• Full block size available for data (pointers live in the FAT, not in the blocks)
• Optimization: keep the FAT cached in main memory
– Greatly improves random accesses

D A A A B B B B C C C B BD D D DB

File-Allocation Table
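A minimal sketch of a FAT held in memory: the chain is followed by indexing a list, so no data block has to be read to find the k-th block. The marker values `EOF` and `FREE` and the table contents are assumptions for this example, not the on-disk encoding of any real FAT variant.

```python
EOF = -1   # assumed end-of-chain marker
FREE = -2  # assumed marker for an unallocated block


def fat_chain(fat, first):
    """Return the list of blocks of a file by following the FAT from `first`."""
    blocks = []
    block = first
    while block != EOF:
        if fat[block] == FREE or len(blocks) > len(fat):
            raise ValueError("corrupt chain")
        blocks.append(block)
        block = fat[block]
    return blocks


# Hypothetical 10-entry FAT holding one file: 4 -> 7 -> 2 -> 9
fat = [FREE] * 10
fat[4], fat[7], fat[2], fat[9] = 7, 2, 9, EOF

chain = fat_chain(fat, first=4)
print(chain)      # blocks of the file in logical order
print(chain[3])   # disk block holding logical block 3
```

The traversal is still linear in the offset, but each step is a memory access rather than a disk read.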

Indexed Allocation

Brings all pointers together into the index block.

[Figure: logical view of the index table]

• Each file has an index table (i-node in UNIX)
– A collection of pointers to the file's blocks
• Index tables (i-nodes) need to be loaded into memory only when files are opened

Example of Indexed Allocation

Indexed Allocation

Advantages
• No external fragmentation
• Files can be easily grown, with no limit
• Supports random access

Disadvantages
• Large overhead for meta-data
– Wastes space for unneeded pointers: most files are small!

Multi-Level Indexed Files

Variation of indexed allocation
• Dynamically allocate hierarchy of pointers to blocks as needed
• Meta-data: small number of pointers allocated statically
– Additional pointers to blocks of pointers
• Examples: UNIX file systems

[Figure: i-node with single indirect, double indirect, and triple indirect blocks]

Comparison to Indexed Allocation

• Advantage: does not waste space for unneeded pointers
– Still fast access for small files
– Can also grow to a very large size
• Disadvantage: need to read indirect blocks of pointers to calculate addresses (extra disk reads)
– Keep indirect blocks cached in main memory
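The address calculation can be sketched as a function that says which pointer level a logical block falls under and which slot to use in each pointer block. The defaults (10 direct pointers, 256 pointers per indirect block) and the name `locate` are assumptions chosen for illustration.

```python
def locate(logical, n_direct=10, ptrs=256):
    """Return (level, indices) for a logical block number.

    `indices` lists the slot to follow at each step; for the indirect
    levels its length equals the number of pointer blocks that must be
    read before the data block itself.
    """
    if logical < 0:
        raise IndexError("negative logical block")
    if logical < n_direct:
        return ("direct", [logical])
    logical -= n_direct
    if logical < ptrs:
        return ("single", [logical])
    logical -= ptrs
    if logical < ptrs ** 2:
        return ("double", [logical // ptrs, logical % ptrs])
    logical -= ptrs ** 2
    if logical < ptrs ** 3:
        return ("triple", [logical // ptrs ** 2,
                           (logical // ptrs) % ptrs,
                           logical % ptrs])
    raise IndexError("file too large for this i-node layout")


print(locate(9))
print(locate(10))
print(locate(266))
print(locate(10 + 256 + 256 ** 2))
```

Small files never leave the direct range, which is why they stay fast; only large offsets pay for the extra pointer-block reads.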

The UNIX I-node


i-nodes

Attributes:
• File type, size
• Owner, group, permissions (r/w/x)
• Times: creation, last access, last modified
• Reference count

Block addresses:
• Direct
• Indirect

[Figure: i-node layout: file attributes, addresses of blocks 0 through N, then single, double, and triple indirect pointers]

i-nodes

Assume: N=10 direct pointers, 1 KB blocks, 4-byte entries (so each pointer block holds 1024/4 = 256 entries)

• Direct: 10 KB
• Single indirect: 256 KB
• Double indirect: 64 MB
• Triple indirect: 16 GB!


File System Layout

A possible file system layout

File System Layout

Partitions: independent file systems

MBR (Master Boot Record): run when the computer boots; locates the active partition and loads its boot block

Boot block: first block of the partition, executed to load the operating system

Superblock: info about the file system
• Contains all the key parameters about the file system: # of files, # of blocks, # of free blocks, etc.

Free-space Management

Must keep track of blocks that are free
• Bitmap (bit vector)
• Linked list
• Grouping
– The first free block stores the addresses of n free blocks
– The first n-1 of these blocks are actually free
– The last block contains the addresses of another n free blocks, and so on
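Grouping can be sketched as a chain of index blocks, each listing n addresses whose last entry points to the next index block. The dictionary layout, the choice n = 3, and the convention that the final index block stores an empty list are assumptions for this example.

```python
def free_blocks(disk, head):
    """Yield every free block reachable from `head` under grouping.

    `disk` maps an index block to the list of addresses stored in it.
    The index blocks themselves are free blocks too.
    """
    index = head
    while index is not None:
        yield index
        addrs = disk[index]
        if not addrs:           # assumed end marker: empty address list
            break
        yield from addrs[:-1]   # the first n-1 addresses are plain free blocks
        index = addrs[-1]       # the last address is the next index block


# Hypothetical layout with n = 3 addresses per index block.
disk = {2: [5, 8, 11], 11: [3, 6, 14], 14: []}

print(list(free_blocks(disk, head=2)))
```

Compared with a plain linked free list, one read of an index block yields several free addresses at once.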

Bitmap

Bit vector (n blocks), one bit per block: 0 1 2 ... n-1

bit[i] = 0: block[i] free
bit[i] = 1: block[i] occupied

Block number of the first free block

(number of bits per word) * (number of all-1s words skipped) + offset of first 0 bit
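A minimal sketch of the first-free-block search, assuming the convention stated above (0 = free, 1 = occupied), so the scan skips words whose bits are all 1 and then finds the first 0 bit. The 8-bit word size and the least-significant-bit-first numbering are assumptions chosen to keep the example readable.

```python
WORD_BITS = 8  # assumption: tiny words so the example is easy to follow


def first_free_block(words, word_bits=WORD_BITS):
    """Return the number of the first free block, or None if the disk is full.

    Bit k of word w (counting from the least significant bit) describes
    block w * word_bits + k; a 0 bit means the block is free.
    """
    full = (1 << word_bits) - 1
    for skipped, word in enumerate(words):
        if word != full:
            offset = 0
            while (word >> offset) & 1:  # step past occupied blocks
                offset += 1
            return word_bits * skipped + offset
    return None


bitmap = [0xFF, 0xFF, 0b11110111]  # blocks 0-18 occupied, block 19 free
print(first_free_block(bitmap))
```

Comparing a whole word against the all-ones value lets the scan skip many occupied blocks per step.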

Linked Free Space List on Disk

Implementing Directories (1)

(a) A simple directory: fixed-size entries, with disk addresses and attributes stored in the directory entry

(b) Directory in which each entry just refers to an i-node
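Variant (b) can be sketched as directories that map names to i-node numbers, with attributes and block addresses kept in the i-node. The i-node table, the names in it, and the root i-node number 2 are all hypothetical.

```python
# Hypothetical i-node table: directories hold only name -> i-node number.
inodes = {
    2: {"type": "dir", "entries": {"usr": 5}},
    5: {"type": "dir", "entries": {"notes.txt": 9}},
    9: {"type": "file", "size": 1200, "blocks": [40, 41]},
}


def resolve(path, root=2):
    """Walk `path` one component at a time and return the final i-node number."""
    ino = root
    for name in path.strip("/").split("/"):
        if not name:
            continue  # empty component, e.g. the path "/"
        node = inodes[ino]
        if node["type"] != "dir":
            raise NotADirectoryError(path)
        if name not in node["entries"]:
            raise FileNotFoundError(path)
        ino = node["entries"][name]
    return ino


ino = resolve("/usr/notes.txt")
print(ino, inodes[ino]["blocks"])
```

Keeping attributes in the i-node rather than in the directory entry keeps directory entries small and lets several names refer to the same file.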

Implementing Directories (2)

Two ways of handling long file names in a directory
• (a) In-line
• (b) In a heap

File Caching

File system has lots of data structures on disk
• Meta-data: bitmap of free blocks, directories, i-nodes, indirect blocks
• Data blocks

File caches speed access to all these types of data
• Turn disk I/O into memory access
• Response time can improve by 1,000,000x
• Write-through or write-back?
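The write-through versus write-back question can be explored with a small LRU block cache that counts disk operations. The class name, the LRU eviction policy, and the dictionary standing in for the disk are assumptions of this sketch, not a description of any particular OS.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny LRU block cache with a write-through or write-back policy."""

    def __init__(self, disk, capacity, write_back):
        self.disk = disk              # dict: block number -> data
        self.capacity = capacity
        self.write_back = write_back
        self.cache = OrderedDict()    # block -> (data, dirty), oldest first
        self.disk_reads = 0
        self.disk_writes = 0

    def _write_disk(self, block, data):
        self.disk[block] = data
        self.disk_writes += 1

    def _insert(self, block, data, dirty):
        self.cache[block] = (data, dirty)
        self.cache.move_to_end(block)
        if len(self.cache) > self.capacity:
            old, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:             # write-back: flush on eviction
                self._write_disk(old, old_data)

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)
            return self.cache[block][0]
        self.disk_reads += 1
        data = self.disk[block]
        self._insert(block, data, dirty=False)
        return data

    def write(self, block, data):
        if self.write_back:
            self._insert(block, data, dirty=True)   # disk updated later
        else:
            self._write_disk(block, data)           # disk updated now
            self._insert(block, data, dirty=False)

    def flush(self):
        for block, (data, dirty) in list(self.cache.items()):
            if dirty:
                self._write_disk(block, data)
                self.cache[block] = (data, False)


wb = BlockCache({0: "a", 1: "b"}, capacity=2, write_back=True)
wt = BlockCache({0: "a", 1: "b"}, capacity=2, write_back=False)
for value in ("x", "y", "z"):
    wb.write(0, value)
    wt.write(0, value)
wb.flush()
print(wb.disk_writes, wt.disk_writes)
```

Write-back absorbs repeated writes to the same block but risks losing dirty data on a crash before the flush; write-through pays a disk write every time and keeps the disk current.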
