cs 135: file systems - hmc computer science structure vfs interface vfs interface functions the list...
Post on 19-Apr-2018
223 Views
Preview:
TRANSCRIPT
CS 135: File SystemsGeneral Filesystem Design
1 / 22
Promises
Promises Made by Disks (etc.)
1. I am a linear array of blocks2. You can access any block fairly quickly3. You can read or write any block independently of any other4. If you give me bits, I will keep them and give them back later
2 / 22
Promises
Promises Made by Filesystems
1. I am a structured collection of data2. My indexing is more complex than just numbers3. You can read and write on the block or byte level4. You can find the data you gave me5. I will give you back the bits you wrote
3 / 22
Kernel Structure VFS Interface
Virtual File System Layer
User
User Program
VFS Switch
Kernel
ReiserFS Ext3 NFS
NFS Server
VFS Switch
4 / 22
Kernel Structure VFS Interface
VFS Stacking
User
User Program
VFS Switch
Kernel
Encrypt
Compress
Ext3
5 / 22
Kernel Structure VFS Interface
VFS Interface Functions
The list is long and the interface is complex. Here are a few samplefunctions:
lookup Find directory entrygetattr Return file’s attributes; roughly Unix statmkdir Create a directorycreate Create a file (empty)rename Works on files and directoriesopen Open a file (possibly creating it)read Read bytes
Important: Don’t have to implement all operations!
6 / 22
Kernel Structure VFS Interface
VFS Interface Functions
The list is long and the interface is complex. Here are a few samplefunctions:
lookup Find directory entrygetattr Return file’s attributes; roughly Unix statmkdir Create a directorycreate Create a file (empty)rename Works on files and directoriesopen Open a file (possibly creating it)read Read bytes
Important: Don’t have to implement all operations!
6 / 22
Kernel Structure FUSE
FUSE Structure
FUSE (Filesystem in USEr space) works sort of like NFS:
User
User Program
VFS Switch
Kernel
FUSEkernel support
FUSE client(same machine)
(via special FUSE protocol)
7 / 22
Kernel Structure FUSE
FUSE Clients
I Must implement a minimum protocol subsetI What happens internally is hugely flexible
I Serve requests from internal memoryI Serve them programmatically (e.g, reads return
√writes)
I Feed them on to some other filesystem, local or remoteI Implement own filesystem on local disk or inside a local file
I Samples widely available:hellofs Programmatic “hello, world”sshfs Remote access via ssh
Yacufs Makes ogg look like mp3, etc.Bloggerfs Your blog is a filesystem!
rsbep ECC for your filesunpackfs Look inside tar/zip/gzip archives
8 / 22
Kernel Structure FUSE
The FUSE Interface (1)
FUSE is somewhat like VFS, but can be stateless. Full list ofoperations (2 slides):
*getattr Get file attributes/properties*readdir Read directory entries
*open Open file*read Read byteswrite Write bytesmkdir Make directoryrmdir Remove directorymknod Make device node or FIFO
readlink Read symlink destinationsymlink Create symbolic link
link Create hard linkunlink Remove link or filerename Rename directory or file
9 / 22
Kernel Structure FUSE
The FUSE Interface (2)
FUSE operation list, continued:truncate Delete tail of fileaccess Check access permissionschmod Change permissionschown Change ownership
utimens Update access/modify timesstatfs Get filesystem statistics
release Done with file (kind of like close)fsync Flush file data to stable storage
getxattr Get extended attributessetxattr Set extended attributeslistxattr List extended attributesremovexattr Remove extended attributes
10 / 22
Kernel Structure FUSE
A Minimal FUSE Filesystem
The “hello, world” example filesystem:
getattr If path is “/” or “/hello”, return canned result; else failreaddir Return three canned results: “.”, “..”, “hello”
open Fail unless path is “/hello” and open is for readread If path is “/hello” and read is within string, return bytes
requested. Otherwise fail.
97 lines of well-formatted (but uncommented) code!
Oh, and you can do it in Python or Perl. . .
11 / 22
Kernel Structure FUSE
A Minimal FUSE Filesystem
The “hello, world” example filesystem:
getattr If path is “/” or “/hello”, return canned result; else failreaddir Return three canned results: “.”, “..”, “hello”
open Fail unless path is “/hello” and open is for readread If path is “/hello” and read is within string, return bytes
requested. Otherwise fail.
97 lines of well-formatted (but uncommented) code!Oh, and you can do it in Python or Perl. . .
11 / 22
Kernel Structure FUSE
What Is FUSE Good For?
I Quick filesystem developmentI Filesystems that need user-level servicesI Extending existing filesystemsI Trying out radical ideas (e.g., SQL interface to filesystem)
12 / 22
Kernel Structure FUSE
What is FUSE Bad At?
I Performance is necessarily worse than in-kernel systemsI Don’t use when top performance neededI Don’t use if you need performance measurements on your cool
new idea
13 / 22
Filesystem Design Basics
What a Filesystem Must Provide
I Unix has had big effect on filesystem designI To succeed today, must support the POSIX interface:
I Named files (buckets of bits)I Hierarchical directory treesI Long file namesI Ownership and permissions
I Many ways to accomplish this goalI Today we’ll look at single-disk filesystems
14 / 22
Filesystem Design Basics
Disk Partitioning
I For various bad reasons, disks are logically divided into partitionsI Table inside cylinder 0 tells OS where boundaries areI OS makes it look like multiple disks to higher levelsI Early computers had no BIOS, so booting just read block 0 (“boot
block”) of disk 0I Block 0 had enough code to find rest of kernel & read it inI Even today, block 0 is reserved for boot block (Master Boot
Record)I Original scheme had partition table inside MBRI Partition contents are up to filesystem
15 / 22
Filesystem Design Structure
Basic Filesystem Structure
I Any (single-disk) filesystem can be divided into five parts:1. “Superblock” at well-known location2. Free list(s) of unallocated space & data structures3. Directories that tell where to find other files and directories4. “Root directory” findable from superblock5. Metadata telling how to find directory & file contents, and possibly
other information about them
16 / 22
Filesystem Design Structure
The Superblock
I Must be findable when FS is first accessed (“mounted”)I Only practical approach: have well-known location (e.g., block 2)I Everything is up to designer, but usually has:
I “Magic number” for identificationI Checksum for validityI Size of FS (redundant but necessary)I Location of root directoryI Location of metadata (or first metadata)I Parameters of disk and of FS structure (e.g., blocks per cylinder,
how things are spread across disk)I Location of free listI Bookkeeping data (e.g., date last mounted or checked)
17 / 22
Filesystem Design Structure
The Free List
I Usually one of simplest data structures in filesystemI Popular approaches:
I Special file holding all free blocksI Linked list of blocksI “Chunky” list of blocksI BitmapI List of extents (contiguous groups of blocks identified by start &
length)I B-tree or fancier structure
18 / 22
Filesystem Design Structure
Directories
I Requirement: associate name with something usable for locatingfile’s attributes & data
I Simplest approach: array of structures, each of which has name,attributes, pointer to where data is found
I Problems:I Makes directories big; skipping unwanted entries is expensiveI Puts “how to find” information far from file & makes it hard to findI Can’t support hard links & certain other nice features
I Better: associative map of pairs (name, id-number) whereid-number tells where to find rest of metadata about file
I From Unix, traditionally referred to as i-node numberI Inodes (from “index nodes”) can be array or complex structureI Every directory must also have “.” and “..” (or equivalent)
19 / 22
Filesystem Design Structure
The Root Directory
I This part is easy: on any sensible FS it’s identical to any otherexcept for being easily findable
I “..” must be special, since you can’t go up from rootI Exception: if filesystem is mounted under a subdirectory, going up
makes senseI OS hacks that case specially
20 / 22
Filesystem Design Structure
Metadata About Files and Directories
I Most is just a struct of useful informationI Under Unix, almost precisely what stat(2) returnsI Type, permissions, owner, group, size in bytes, three timestamps
I Fun part is “how to find the data itself”I Desirable properties of a good scheme:
I Cheap for small files, which are commonI Supports very large filesI Efficient random access to large filesI Lets OS know when blocks are contiguous (i.e., cheap to read
sequentially)I Easy to return blocks to free list
I Can’t be list of blocks, since inode usually fixed sizeI Various schemes; for example, could give root of B-tree, or first of
linked list of block numbersI Can be useful to use extents & try to have sequences of blocksI Can use hybrid scheme where first few blocks listed in inode,
remainder found elsewhere
21 / 22
Filesystem Design Final Thoughts
Final Thoughts
I Optimal design depends on workloadI Read vs. write frequencyI Sequential vs. random accessI File-size distributionI Long- vs. short-lived filesI Proportion of files to directoriesI Directory sizeI . . .
I There is no perfect filesystem!
22 / 22
Filesystem Design Final Thoughts
Final Thoughts
I Optimal design depends on workloadI Read vs. write frequencyI Sequential vs. random accessI File-size distributionI Long- vs. short-lived filesI Proportion of files to directoriesI Directory sizeI . . .
I There is no perfect filesystem!
22 / 22
top related