Auxiliary Storage Management: Chapter 10
File attributes:
o Name
o Type (often defined by extension such as .doc, .c, .java, .txt, .gif, .jpg, etc.)
o Location, or at least where it begins
o Size
o Protection (access rights)
o Time and date (expiration, last access/update/modification, i.e. changing a file's data or characteristics)
Note the ls -l and chmod commands. Program iostat.c shows how to display many of a file's attributes.
Often an attribute specifies the program that created the file. This allows double clicking on a file and having it opened by the correct program.
Linux touch command:
Ex. touch -t 0412250606 filename sets the file's last access and modification times to Christmas of 2004 at 6 minutes after 6:00 am (the format is YYMMDDhhmm).
Can see results via ls -l (modification time) or ls -lu (access time). ls -lc shows the status-change time, which touch sets to the current time.
File operations: creating, writing, reading, repositioning (like a seek), deleting, truncating (remove contents, leave attributes intact), appending.
Modes for opening a file in C:
o r for read
o r+ for reading and writing (positioned at beginning of file, which must exist or an error occurs)
o w for writing (creates a new file, or truncates an existing one)
o w+ for reading and writing (file created if it does not exist, truncated if it does)
o a for append (positioned at end of file)
o a+ for reading and writing (writes always go to the end of the file)
See man fopen.
Information associated w/ an open file (stuff in FILE *):
o disk location
o # procs which opened file
o current record position, etc.
File Types
Figure 10.1 on page 410.
Extensions do not require the file to be of a particular type; they just help with organization.
NOTE: A directory is also a file type.
File locking
Shared locks for reading; exclusive locks for writing.
Try to open the same file on two different machines.
Access methods - Section 10.2
Sequential access
o Information processed in order. Reads in order; writes by appending.
Direct access
o Information stored by record/block. Access by specifying record/block #.
o Demo program fseek.c uses the fseek function. Not strictly direct access, but something other than sequential.
B+ tree indexes
o Complex collection of indexes stored as a B+ tree structure.
o Bottom of the tree has the records.
o Allows a search based on key value.
Hashed storage
o A hash function transforms a key value into a location. Record is stored there.
o To retrieve a record based on key value, apply the hash function to it and look in the resulting location.
o There are issues dealing with good hash functions and what happens if two keys "hash" to the same location.
Both of these are more appropriate for a data structures course.
Directory structures - Section 10.3
Single level (linear structure)
o Some old computers did not support directories.
Tree structured (most, if not all, systems today)
o System-wide file name consists of the path to the current directory (command pwd) followed by the file name inside the subdirectory.
o NOTE: May be implemented using subdirectories or graphic folder icons. They really are the same.
[Figure: partial Linux directory structure]
Acyclic-graph subdirectories
A file may exist in two separate subdirectories.
o Shortcuts (Windows): create a shortcut and store it in a different folder.
o Aliases (Macintosh).
o May be implemented by links (Linux).
[Figure: subdir1 and subdir2 both point to one shared file]
Two ways to create links in Linux:
Hard link
Ex. ln /home/shayw/452/storage/test.txt hlink
Soft link (or symbolic link)
Ex. ln -s /home/shayw/452/storage/test.txt slink
We'll describe the difference later.
Mounting - Section 10.4
A file system must be mounted before it can be used.
o Example: one file system (say on a CD-ROM) is merged into another.
o Allows a file system to be spread over multiple devices (see Linux man mount).
o Mount point: location where the file system will be attached.
[Figure: a new file system attached to an existing file system at the mount point]
File sharing Section 10.5 - skip
Section 10.6 Protection
Access lists
Linux ls -l command output, field by field:
o file type: - or d or l
o owner permissions: rwx
o group permissions: rwx
o world permissions: rwx
o then links, owner, size, date, filename
e.g. octal 777 means rwx for all.
754 means rwx (7) for owner, rx (5) for group, r (4) for world (some use universe, which I guess is a little broader in scope).
Windows has something similar. Right click on a file (or a shared folder) and select Properties.
Chapter 11: File Systems
Layers (top to bottom): App -> logical file system -> file-org. module -> basic file system -> I/O control -> devices
App: Like a read or write in any language.
Logical file system:
o How is the directory organized? Does the file exist? Does the user have access? Read access? Write access? Etc.
o Where is the logical location of the file (logical block 0 through N)?
File-org module:
o How is the file allocated to disk?
o Determine the physical address on disk (disk, track, and sector) from the logical block.
o See also [http://www.ntfs.com/hard-disk-basics.htm]
o Seek time: time to move head from one track to another
o Rotational delay (latency): time for proper sector to rotate under the head
o Transfer time: time to transfer bits in a sector
Basic file system:
o Generates simple commands to the driver, ex. read surface i, track j, sector k.
I/O control:
o Consists of device drivers and interrupt handlers.
o Issues low level hardware specific commands to a controller.
o Can test status of controller or an operation. Usually done by writing/testing certain bit patterns in a controller register.
Devices: as previously described.
File system: collection of FCBs (file control blocks), each of which describes a file. An FCB is called an inode in Linux terminology.
o Contains: permissions, dates, ownership, size, location of blocks.
Boot control block:
o Contains code necessary to boot an OS from the volume or partition.
Volume control block:
o Contains #blocks, size of blocks, #free blocks (called a superblock in Unix file systems; NTFS keeps this information in the master file table, MFT).
Open a file (What does open do?)
o Search the system-wide open-file table to see if the file is already open. If so, a process open-file table entry is created, pointing to the system-wide entry (Fig 11.3).
o Otherwise, search the directory for the file name. Does the file exist? Do permissions allow access? Where is it?
o Copy the FCB into the system-wide open-file table. The table also records how many processes have the file open.
o Create an entry in the process open-file table and have it point to the system-wide open-file table entry.
o Open returns a pointer to the process open-file table entry (file descriptor or file handle).
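The two-table arrangement can be modeled in a few lines of Python. This is only a toy sketch, not how any real kernel stores its tables: the class names, the dictionary standing in for the on-disk directory, and the "allowed" field are all assumptions made for illustration. As a simplification it checks permissions on every open, including when the file is already open.

```python
from dataclasses import dataclass

@dataclass
class SystemEntry:
    fcb: dict            # in-memory copy of the file control block
    open_count: int = 0  # how many process entries point here

class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # name -> FCB (stands in for the disk directory)
        self.system_table = {}       # system-wide open-file table
        self.process_tables = {}     # pid -> list of references to SystemEntry

    def open(self, pid, name, user):
        entry = self.system_table.get(name)
        if entry is None:
            # Not open yet: search the directory and copy the FCB.
            fcb = self.directory.get(name)
            if fcb is None:
                raise FileNotFoundError(name)
            entry = SystemEntry(fcb=dict(fcb))
        if user not in entry.fcb["allowed"]:
            raise PermissionError(name)
        self.system_table[name] = entry
        entry.open_count += 1
        # The per-process entry just points at the system-wide entry.
        table = self.process_tables.setdefault(pid, [])
        table.append(entry)
        return len(table) - 1        # index plays the role of a file descriptor

tables = OpenFileTables({"notes.txt": {"allowed": {"alice", "bob"}, "blocks": [3, 4]}})
fd_a = tables.open(1, "notes.txt", "alice")
fd_b = tables.open(2, "notes.txt", "bob")
print(fd_a, fd_b, tables.system_table["notes.txt"].open_count)
```

Both processes get their own descriptor, but the two descriptors lead to the same system-wide entry, whose count is now 2.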
Raw disk: no file system (swap space, some databases, RAID systems).
A disk may have multiple partitions, each with its own file system.
o Old Windows 98: when a disk grew beyond a particular size, it had to be split into different drive letters to use the entire disk space.
A partition may be spread over multiple disks.
Section 11.4 Allocation methods:
Contiguous allocation:
o Stored in consecutive disk blocks.
o Problem w/ external fragmentation (no contiguous space big enough). See Figure 11.5.
Linked allocation:
o Stored as a linked list of disk blocks.
o Initial sectors optimized to reduce seek times; many additions/deletions scatter a file's blocks across the disk, which increases seek times. Need to defragment drive. (This is different from internal fragmentation, which is unused space inside a block.)
o Mainly useful for sequential access files.
FAT (File Allocation Table) system
Originated with MS-DOS and was used through Windows 95 and 98.
Directory (list of entries on each disk). Each directory entry includes:
o File name (DOS originally had 8 bytes for this)
o File extension
o Attribute vector (bits indicating read only, system file, hidden file, directory)
o Time and date of creation & last update
o File size
o Location of 1st FAT entry (also 1st cluster number or disk block)
[Figure: the directory entry holds first cluster 23; each FAT entry holds the number of the file's next cluster, and -1 marks the end]
FAT entry number:  23  24  25  26  27  28  29  30  31  32  33  34  35  36
Contents:          34      -1          31          25          28
This file is allocated to clusters 23, 34, 28, 31, and 25.
See also figure 11.7.
NOTE: lots of disk head movements unless the FAT is cached.
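Following the chain in the figure can be sketched in Python. The FAT is modeled here as a dictionary holding only the entries used in the figure; the function name and the cycle check are additions for illustration.

```python
END_OF_CHAIN = -1  # marker used in the figure for the last cluster

def cluster_chain(fat, first_cluster):
    """Return the clusters of a file, in order, starting from the directory entry."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in chain:
            raise ValueError("cycle in FAT chain")  # a corrupted table
        chain.append(cluster)
        cluster = fat[cluster]  # each FAT entry names the next cluster
    return chain

# FAT entries from the figure: entry number -> contents
fat = {23: 34, 25: -1, 28: 31, 31: 25, 34: 28}
print(cluster_chain(fat, 23))  # [23, 34, 28, 31, 25]
```

Every step of the loop is one FAT lookup, which is why the table needs to be cached to avoid a disk access per cluster.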
Indexed Allocation
First block is an index block: an array of indexes, each pointing to another block (of data).
Essentially a two-level hierarchy. Can expand to multiple levels. Figure 11.8.
The Unix inode (a directory entry associates a filename with an inode)
[Figure: inode layout]
o mode
o owner
o timestamps
o size, block count
o direct blocks: up to 12 pointers to data blocks
o single indirect: pointer to a block of pointers to data blocks
o double indirect: pointer to a block of pointers to indirect blocks
o triple indirect: pointer to a block of pointers to second indirect blocks
See also figure 11.9
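How far the inode scheme reaches depends on the block size and the size of a block pointer, neither of which the figure gives. The Python sketch below assumes 4096-byte blocks and 4-byte pointers purely for illustration; only the 12 direct pointers come from the figure.

```python
def pointers_per_block(block_size, pointer_size):
    return block_size // pointer_size

def max_file_blocks(block_size, pointer_size, direct=12):
    """Data blocks reachable through direct, single, double, and triple indirect pointers."""
    n = pointers_per_block(block_size, pointer_size)
    return direct + n + n**2 + n**3

def level_for_block(logical_block, block_size, pointer_size, direct=12):
    """Which part of the inode leads to a given logical block of the file."""
    n = pointers_per_block(block_size, pointer_size)
    levels = [("direct", direct), ("single indirect", n),
              ("double indirect", n**2), ("triple indirect", n**3)]
    for name, count in levels:
        if logical_block < count:
            return name
        logical_block -= count
    raise ValueError("block number beyond maximum file size")

# Assumed sizes, not taken from the notes.
BLOCK, POINTER = 4096, 4
print(max_file_blocks(BLOCK, POINTER) * BLOCK, "bytes maximum")
print(level_for_block(11, BLOCK, POINTER), "/", level_for_block(12, BLOCK, POINTER))
```

Small files need only the direct pointers, so they are reached with no extra disk reads beyond the inode itself.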
A disk directory is a list of FCBs, each locating one inode.
fid = fopen("filename", ...)
[Figure: file descriptor table -> system file table -> inode]
File descriptor table: one entry for each open file. Entries 0, 1, and 2 are stdin, stdout, and stderr.
System file table: one entry for each open file system-wide. Contains: location of inode, current position in the file, mode (rwx), # of fds (file descriptors) pointing to it.
Inode: contains file attributes.
When inodes correspond to a directory
o Directories are files. Data blocks contain a collection of (entry, inode#) pairs.
o In Linux, can open directories and read through them much as you would a file. See the library functions opendir and readdir.
o Each entry is the name of a file in that directory. Also contains entries for "." and "..".
o Program directory.c demonstrates how this works.
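The same idea can be tried from Python, which is used here instead of C only to keep the sketch short; this is not the directory.c program mentioned above. The standard function os.scandir wraps the directory-reading calls and, unlike readdir, leaves out the "." and ".." entries.

```python
import os

def list_directory(path):
    """Return (name, inode number) pairs for the entries in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)

# Show the (entry, inode#) pairs stored in the current directory.
for name, inode in list_directory("."):
    print(inode, name)
```

The output resembles what ls -i prints for the same directory.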
Linux hard links:
o Two directory entries that refer to the same inode.
o The inode keeps track of how many directory entries reference it.
o Removing the original file just removes the directory entry but leaves the inode (and the data) intact.
o A hard link continues to point to data after the original file is deleted.
o Takes up only a directory slot.
Soft links:
o A file with a separate inode. Data is just the pathname of the actual file.
o An original can be removed and re-created and the slink behaves accordingly.
o Takes up a directory slot, an inode, and a data block.
o Takes up more space than a hard link despite what the command ls -l shows.
Example:
o Create a test file /home/shayw/452/storage/test.c.
o Enter directory /home/shayw/452/memory and type
  ln /home/shayw/452/storage/test.c hlink
  and
  ln -s /home/shayw/452/storage/test.c slink
  The first creates a hard link; the second a soft or symbolic link.
o Do ls -l and note the results.
o Change the test.c file and display the hlink and slink files.
o Do ls -i in each subdirectory. Note that hlink has the same inode number as the original file.
o Now remove the test.c file. Then display the contents of hlink and slink from the respective directories. The hard link still shows the data. The soft link still exists but is broken, since the path it names is gone.
o Recreate the test.c file. The hard link is unchanged; the soft link reflects the new contents.
o Do ls -i again. The inode numbers are all different.
NOTE: Can set up links to subdirectories also. Makes it appear that one subdirectory can exist within two separate parent directories.
NOTE: must be a soft link.
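The experiment can also be scripted. The Python sketch below repeats the steps in a temporary directory rather than under /home/shayw, using the standard calls os.link and os.symlink. It assumes a POSIX system such as Linux; the function name and file contents are made up.

```python
import os
import tempfile

def link_demo():
    results = {}
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "test.c")
        hlink = os.path.join(d, "hlink")
        slink = os.path.join(d, "slink")
        with open(original, "w") as f:
            f.write("version 1\n")
        os.link(original, hlink)       # like: ln test.c hlink
        os.symlink(original, slink)    # like: ln -s test.c slink
        results["same_inode"] = os.stat(original).st_ino == os.stat(hlink).st_ino
        results["link_count"] = os.stat(original).st_nlink

        os.remove(original)            # removes only the directory entry
        with open(hlink) as f:
            results["hlink_after_remove"] = f.read()
        # The symbolic link file is still there, but its target is gone.
        results["slink_dangling"] = os.path.islink(slink) and not os.path.exists(slink)

        with open(original, "w") as f:  # re-create: a new file with a new inode
            f.write("version 2\n")
        with open(hlink) as f:
            results["hlink_after_recreate"] = f.read()
        with open(slink) as f:
            results["slink_after_recreate"] = f.read()
    return results

for key, value in link_demo().items():
    print(key, repr(value))
```

The hard link keeps the old data because it names the inode; the soft link follows the path, so it sees whatever file currently has that name.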
Other allocation methods.
o Keyed file (VSAM): defined by a hierarchy similar to 2-3 trees or B+ trees.
o Hash strategy: combination of open/closed hashing.
o Packing: storing logical records inside of physical records (blocks).
o Tapes: physical vs. logical records. Inter-record gaps.
Windows
o Create a small text file using, say, Notepad. Right click on it and select Properties.
o NOTE: Size and Size on Disk differ by potentially a LARGE amount. Why? The minimum space a file occupies is 4K bytes (size of cluster).
o Can also do ls -s in Linux to show the number of blocks for a file.
Section 11.5 Free space management
Bit map or bit vector: sequence of bits, each associated with a block. 1 means free; 0 means allocated.
o Simple approach: finding the first free block is easy. Find the first 1.
o Could be a LOT of bits for anything but small disks, and most disks are large anyway.
o Needs to be in memory for quick handling of output requests that need new space.
o A 40GB disk (small) w/ 1K blocks would require 40 million bits or 5 MB of storage.
Linked list of free blocks. Use first free block when more space is needed (Figure 11.10).
Grouping: First free block has addresses of n more blocks. The last of those contains n more addresses, and so on. Can find multiple free blocks more quickly.
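The bit-map arithmetic and the "find the first 1" search can be sketched in Python. To match the round numbers in the notes, the sketch treats 40GB as 40 * 10^9 bytes and a 1K block as 1000 bytes. The bit ordering (block 0 is the leftmost bit of byte 0) is an assumption of this sketch.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Bytes needed for a bit map with one bit per block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8           # round up to whole bytes

def first_free(bitmap):
    """Index of the first block whose bit is 1 (free), or None if the disk is full."""
    for byte_index, byte in enumerate(bitmap):
        if byte:                        # skip bytes that are all zeros quickly
            bit = 8 - byte.bit_length() # position of the leftmost 1 in this byte
            return byte_index * 8 + bit
    return None

print(bitmap_bytes(40 * 10**9, 1000))          # 5000000 bytes, i.e. 5 MB
print(first_free(bytes([0b00000000, 0b00100000])))  # 10
```

Scanning a byte (or a word) at a time is what makes the search cheap: whole groups of allocated blocks are skipped with one comparison.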
Performance
Caching used to speed up performance.
o First read causes a physical transfer from disk to OS cache.
o Subsequent "reads" do not cause a physical I/O if data can be retrieved from cache.
[Figure: data path from disk through controller cache and OS cache to user memory]
Recovery & consistency checking
See help chkdsk. Run chkdsk C: from the command prompt.
Linux fsck command. See man fsck.
Both can be used to fix some errors and check for consistency. Inconsistencies can occur as a result of a crash or problem during a file write.
Example, fsck will:
o Create two tables, each containing a counter for each physical disk block.
o Read through each inode; access each block from the inode; increment the counter for that block in the first table. Tracks which blocks are part of some file.
o Read through the free list (or bit-map vector); increment the counter in the second table for each block that is found. Tracks which blocks are not part of some file.
o Consistent means: each block has a 0 in one table and a 1 in the other.
Possible problems/responses:
o A block has a 0 count in both tables. Missing block. Usually add it to the free list.
o A block has a count > 1 in the second table, meaning it appears twice in the free list (can't happen with a bit-map vector). Adjust the list to remove the redundant entry.
o A block has a count > 1 in the first table, meaning it appears twice (or more) in a file or in two or more files; or the block has a non-0 count in both tables. Copy data into a free block and adjust the file system accordingly. Probably notify user or administrator.
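The two-table check can be sketched in Python. This is a simplified model, not the real fsck: files are given directly as lists of block numbers instead of being read from inodes, and the function name and message strings are made up.

```python
def check_blocks(num_blocks, files, free_list):
    """Return {block: problem description} for every inconsistent block."""
    in_use = [0] * num_blocks   # first table: times a block appears in some file
    free = [0] * num_blocks     # second table: times a block appears in the free list
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    problems = {}
    for b in range(num_blocks):
        u, f = in_use[b], free[b]
        if u == 0 and f == 0:
            problems[b] = "missing: add to free list"
        elif u == 0 and f > 1:
            problems[b] = "duplicated in free list: remove extra entry"
        elif u > 1 or (u >= 1 and f >= 1):
            problems[b] = "used more than once, or used and free: copy to a free block"
        # otherwise exactly one table has a 1 and the other a 0: consistent
    return problems

files = {"a": [0, 1], "b": [1, 2]}   # block 1 is claimed by two files
free_list = [3, 3, 2]                # block 3 listed twice, block 2 is also in use
for block, problem in check_blocks(6, files, free_list).items():
    print(block, problem)
```

Blocks 4 and 5 appear in neither table, so they are reported as missing.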
File directory consistency
o Table of counters, one for each file.
o Start at root directory and do a recursive search of the hierarchy (file system).
o Inspect each directory. For each i-node in a directory, increment the counter for that file. Recall a file may appear in > 1 directory due to hard links.
o When done, the checker knows how many directory entries refer to each file.
o Inodes contain a link count, set at 1 when the file is created and incremented whenever a hard link is created. These counts must agree.
o If the link count is higher, all directory entries for the file could be deleted and the inode would not be removed. Set link count in inode to proper value.
o If the link count is lower, an inode is removed when its count goes to 0 but there could still be a reference to it from another directory entry. This is bad. Again, adjust the link count.
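A small Python sketch of the link-count comparison follows. It assumes the recursive walk has already been done, so the directories are handed in as a plain dictionary; that layout and the function name are illustrative choices only.

```python
from collections import Counter

def check_link_counts(directories, inode_link_counts):
    """Compare stored link counts with the number of directory entries found.

    directories: directory path -> {entry name: inode number}
    inode_link_counts: inode number -> link count stored in the inode
    Returns inode -> (stored count, counted entries) for every mismatch.
    """
    found = Counter()
    for entries in directories.values():
        for inode in entries.values():
            found[inode] += 1
    return {inode: (stored, found[inode])
            for inode, stored in inode_link_counts.items()
            if stored != found[inode]}

directories = {"/": {"a": 10, "b": 11}, "/sub": {"a2": 10}}  # inode 10 has two names
stored = {10: 1, 11: 1}                                      # but its inode says 1
print(check_link_counts(directories, stored))  # {10: (1, 2)}
```

Here the stored count is lower than the real one, which is the dangerous case: removing one name would free an inode that another entry still references.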
o Bizarre protection levels. What if a file has a protection of 007?
o User and group have no access but world has rwx access?
o Linux allows this.
Backups: possible schedule
o Full backup (all files).
o Backup files modified since previous day (incremental).
o Repeat above N times.
o Repeat above starting with full backup.
Chapter 12: Disk Structure
Recall the disk structure (Figure 12.1).
Head crash: head makes contact with disk surface and scrapes off the magnetic material.
Mapping logical sector number to physical location. Easy in theory; hard in practice.
o Bad sectors can be removed, leaving fewer sectors in some tracks.
Disk rotation speed: constant angular velocity (CAV) or constant linear velocity (CLV).
o With CLV, bit density is constant. More sectors on outer tracks (maybe 40% more). Angular velocity varies. Used in CD/DVD tech.
o With CAV, angular velocity is constant. Bit density changes. Number of sectors the same on all tracks. Can underutilize the outer tracks.
Magnetic tapes.
Section 12.4: Disk Scheduling
Given a number of reads that are pending, which are processed first?
FCFS (first come, first served)
o Process in order received.
o Head moves to tracks for each read; potentially large average seek time (Figure 12.4).
SSTF (shortest seek time first)
o Process the request for the track closest to the track over which the head is currently positioned.
o Maximizes efficiency, but is biased against files on extreme tracks (Figure 12.5). Starvation possible.
SCAN (elevator algorithm)
o Scans from one end of the disk to the other. Processes requests as the head passes by a track (Figure 12.6).
o Scans one way; when done, scans the other way. Just like an elevator going up, then down.
C-SCAN (circular scan)
o Scans one way only, then returns to the beginning (Figure 12.7).
o Like an elevator that, after the top floor, free falls to the bottom.
o More consistent treatment of requests. Waits vary less.
C-LOOK
o Like C-SCAN except it only goes as far as the last request, then returns to the pending request nearest the other end of the disk.
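The service orders can be compared with a short Python sketch. The request queue and starting track below are illustrative values, not taken from these notes. scan_order gives only the order in which requests are served with the head moving toward higher tracks first, so the movement computed from it leaves out the extra travel to the edge of the disk that a strict SCAN makes.

```python
def total_movement(start, order):
    """Total number of tracks the head crosses when serving requests in this order."""
    moved, position = 0, start
    for track in order:
        moved += abs(track - position)
        position = track
    return moved

def fcfs(start, requests):
    return list(requests)

def sstf(start, requests):
    pending, order, position = list(requests), [], start
    while pending:
        nearest = min(pending, key=lambda t: abs(t - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order

def scan_order(start, requests):
    up = sorted(t for t in requests if t >= start)
    down = sorted((t for t in requests if t < start), reverse=True)
    return up + down            # go up, then come back down

def c_look(start, requests):
    up = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    return up + wrapped         # go up, jump back, go up again

queue, head = [98, 183, 37, 122, 14, 124, 65, 67], 53
for policy in (fcfs, sstf, scan_order, c_look):
    order = policy(head, queue)
    print(policy.__name__, order, total_movement(head, order))
```

For this queue FCFS moves the head 640 tracks and SSTF 236, which shows how much the order of service matters.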
Disk Management
Format: divide disk into sectors and create an empty directory.
o Each sector has: header, data, trailer.
o Extra space holds error-detecting data used to check for transfer errors (Error Correcting Codes, ECC).
Boot block:
o On power up the computer needs a small program to run.
o ROM contains a boot loader which loads a boot program stored on a boot disk (system disk).
o Not all disks have a boot program.
Bad blocks: defective blocks.
o FAT contains a special entry designating a block as bad.
o Utilities (such as chkdsk) can look for and mark other bad blocks.
Swap-space management
o Disk space used to implement virtual memory.
o May be substantial if RAM is small; less if RAM is larger.
o Multiple swap spaces placed on multiple disks (if avail) to even out I/O load.
o Can be part of the normal file system: easier, but takes longer to sort through file lists. Can add to fragmentation problems. NT uses a paging file stored on disk.
o Often uses a separate disk partition.
RAID Systems: Redundant Array of Independent (Inexpensive) Disks.
Striping:
o Split bits from a byte across multiple disks (bit-level striping).
o Block-level striping spreads consecutive blocks from a file across multiple disks.
o Can load balance responses to requests and reduce response time by concurrent accesses.
[http://www.redhat.com/docs/manuals/linux/RHL-6.2-Manual/ref-guide/ch-raid.html]
RAID Level 0: Block-level striping with no redundancy. Higher performance; suited to cases where data loss or recovery time is less critical.
RAID Level 1: Disk mirroring: complete redundancy (for fault tolerance). High reliability & fast recovery.
RAID Level 2: Error detection or correction: extra bits added to each byte. Bit striping with extra disks used for the extra bits.
RAID Level 3: Like level 2 but uses just one parity bit. Can be used to correct errors if a controller detects that a disk sector is bad. More common than level 2.
RAID Level 4: Block-level striping with one disk used for parity blocks. If one disk fails, can reconstruct using the parity block disk. Disks operate independently of each other, unlike levels 2 and 3 in which drives are synchronized. Extra disk used for parity, but can create bottlenecks.
RAID Level 5: Like level 4 but parity blocks distributed over all disks. Good for large volumes of data. Some refs say this is the most common.
RAID Level 6: Like level 5 but uses error correction to account for multiple disk failures.
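Parity-based reconstruction, as used in levels 4 and 5, can be illustrated with XOR in Python. This is a minimal sketch with made-up block contents and function names: the parity block is the byte-by-byte XOR of the data blocks, and XOR-ing the surviving blocks with the parity block gives back the lost one.

```python
from functools import reduce

def parity_block(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity block."""
    return parity_block(surviving_blocks + [parity])

data = [b"disk", b"fail", b"test"]      # one block from each of three data disks
parity = parity_block(data)             # stored on the parity disk
rebuilt = reconstruct([data[0], data[2]], parity)   # pretend the second disk died
print(rebuilt)  # b'fail'
```

This works for one failed disk only; handling two failures takes the extra error-correction information of level 6.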
Revisiting FAT
16-bit cluster numbers: 2^16 = 64K clusters.
Originally, one cluster = one 512-byte sector, for a limit of 64K * 0.5K = 32MB.
Larger disks => need to cluster sectors.
Cluster more sectors, up to 128 sectors per cluster, which yielded a 64K * 128 * 0.5K = 4GB limit.
Disks larger than the limit had to be split into more than one partition. That's why large disks under Windows 95 had to be partitioned into distinct logical disks.
FAT32 uses a 4KB cluster size (on drives up to 8 GB) and 32-bit cluster numbers; it first shipped with Windows 95 OSR2 and then Windows 98.

Disk Size        Cluster Size
0 - 32MB         1/2KB
33MB - 64MB      1KB
65MB - 128MB     2KB
129MB - 256MB    4KB
257MB - 512MB    8KB
513MB - 1GB      16KB
1.1GB - 2GB      32KB
2.1GB - 4GB      64KB
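The limits above, and the cost of large clusters, can be checked with a little arithmetic in Python. The function names are made up for this sketch; the 512-byte sector comes from the notes.

```python
SECTOR_BYTES = 512

def max_volume_bytes(cluster_number_bits, sectors_per_cluster):
    """Largest volume addressable with the given cluster number width."""
    return (2 ** cluster_number_bits) * sectors_per_cluster * SECTOR_BYTES

def slack_bytes(file_size, cluster_bytes):
    """Space allocated to a file but not used by it (the last cluster is rarely full)."""
    clusters = -(-file_size // cluster_bytes)   # ceiling division
    return clusters * cluster_bytes - file_size

MB, GB = 2 ** 20, 2 ** 30
print(max_volume_bytes(16, 1) // MB, "MB")     # 32 MB with one sector per cluster
print(max_volume_bytes(16, 128) // GB, "GB")   # 4 GB with 128 sectors per cluster
print(slack_bytes(1000, 64 * 1024), "bytes wasted by a 1000-byte file in a 64KB cluster")
```

Bigger clusters raise the volume limit but waste more space per file, which is the trade-off the table reflects.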
FAT32 (from www.microsoft.com)
FAT32 provides the following enhancements over previous implementations of the FAT file system:
o Supports drives up to 2 terabytes in size.
o Uses space more efficiently. FAT32 uses smaller clusters (that is, 4KB clusters for drives up to 8 GB in size), resulting in 10 to 15 percent more efficient use of disk space relative to large FAT16 drives. NOTE: When looking at a file using View->Details, file size may show up as 1KB, but right clicking and choosing Properties reveals that a minimum of 4KB is used.
o More robust. FAT32 has the ability to relocate the root directory and use the backup copy of the FAT instead of the default copy. In addition, the boot record on FAT32 drives has been expanded to include a backup of critical data structures. This means that FAT32 drives are less susceptible to a single point of failure than existing FAT16 volumes.
o More flexible. The root directory on a FAT32 drive is now an ordinary cluster chain, so it can be located anywhere on the drive. For this reason, the previous limitations on the number of root directory entries no longer exist. In addition, FAT mirroring can be disabled, allowing a copy of the FAT other than the first one to be active. These features allow for dynamic resizing of FAT32 partitions.
NTFS (New Technology File System)
Ref: Modern Operating Systems by A.S. Tanenbaum
[http://support.microsoft.com/default.aspx?scid=kb;en-us;Q100108]
[http://www.pcguide.com/ref/hdd/file/ntfs/archMFT-c.html]
[http://www.ntfs.com] [http://en.wikipedia.org/wiki/NTFS]
o File names: up to 255 characters and in Unicode.
o Full path names: up to 32,767 characters.
o Allocates by clusters, each of which is 2^p sectors. Cluster size is 4K for disks > 2GB.
o Uses logical cluster numbers as disk addresses.
o File updates are treated as transactions (all or nothing). Allows NTFS to reconstruct disk volumes in the event of failures. Similar to database processing.
All updates performed inside transactions:
o Write a log record that contains redo and undo information.
o Make the change to the file system data structure.
o Write a commit record to the log.
If there is a crash, the system can use the log file to restore data integrity. Write a checkpoint record to the file every 5 seconds. Ensures only metadata is correct, not necessarily user data.
A file is NOT just a byte stream, but a structured object consisting of attributes. Each attribute is a byte stream. Streams can include: file name, file data (one or more streams), object ID.
Similar to an extension of the Macintosh files, which have a data and resource fork (streams) and can be used to store icons, images, menu definitions, even some application code.
Perhaps most obvious is thumbnail images of jpeg files that can appear in Windows Explorer.
File system structure: Master File Table (MFT): each file is described by 1 or more records in an array called the MFT.
Each MFT entry contains:
o Name (Unicode), timestamps, & other attributes. If an attribute is short, it is placed in the MFT; if not, it goes elsewhere with a pointer (in the MFT record) to it.
o List of block addresses. May use more than 1 MFT entry if there are a lot of blocks.
o Location of additional MFT entries, if needed.
MFT is a file and can be anywhere on the disk.
o Entry 0 specifies the location of MFT blocks. Boot block locates the first MFT entry.
o Entry 1 is a duplicate of 0.
o Next 14 entries refer to metadata files. Examples:
  - Log file to track changes such as adding a new directory or removing one, changes to file attributes, and virtually everything except changes to user data.
  - A volume file to track information about the volume (disk): size, label, version.
  - Attribute definitions: attributes used in MFT entries for regular files (later).
  - Root directory.
  - Free space bitmap to keep track of free space. The bitmap itself is a file.
  - List of all bad blocks.
  - Bootstrap loader file.
  - Case mapping definitions (A-Z; a-z may or may not be case sensitive). Also can define rules for other languages.
More on an MFT entry. MFT entry contains a sequence of up to attribute header:attribute value pairs. The value may reside in the MFT entry or reside on a separate block. Attributes include: File name, timestamps, attribute bits, access control
list (who has access) ID (similar to a Linux inode) Information on directory structures. May be a simple
list if a small directory or a B+ tree if a large directory.
Storage allocation:
o Try to use consecutive blocks, if possible.
o Blocks are described by a sequence of records, each describing a sequence of logically contiguous blocks.
o Suppose a file is stored in blocks 20-23, 64-65, and 80-82. The MFT entry contains (header specifies how many blocks):
    Header: 0 9    Disk blocks (start, length): 20 4   64 2   80 3
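The run list in this example can be expanded into individual block numbers with a few lines of Python. The sketch assumes the header's first number is the logical block where the record starts and the second is the total block count; the notes only say the header specifies how many blocks, so the meaning of the leading 0 is an assumption.

```python
def expand_runs(record):
    """Map logical block numbers to disk blocks.

    record: [first logical block, block count, start1, length1, start2, length2, ...]
    """
    first_logical, total = record[0], record[1]
    disk_blocks = []
    for start, length in zip(record[2::2], record[3::2]):
        disk_blocks.extend(range(start, start + length))   # one run of consecutive blocks
    if len(disk_blocks) != total:
        raise ValueError("header block count does not match the runs")
    return {first_logical + i: block for i, block in enumerate(disk_blocks)}

mapping = expand_runs([0, 9, 20, 4, 64, 2, 80, 3])
print(mapping)
```

Three (start, length) pairs describe nine blocks, which is why a file with few runs needs very little room in its MFT entry.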
Reparse points:
o Used for symbolic links or for mounting file systems.
o NTFS tags a file as a reparse point & associates a block of data with it.
o If the file is encountered during a file search, the operation fails, returning the block of data.
o The block refers to an alternate path name, which can refer to a symbolic link or a mounted file system.
Fragmented files require more header entries. The storage allocation example above contains three runs of blocks.
If there are too many runs to fit in one MFT entry, the MFT entry will also contain the numbers of MFT entries containing the other runs. They are called extension records.
A "hole" exists when there are two separate runs. A small file (with many holes) could require more MFT space than a large file with few or no holes.
NTFS can use various compression schemes (transparent to the user) to reduce the space needed for this.
File lookup procedure. Look for /foo/bar/dir1/testfile.txt.
o Entry 5 in the MFT corresponds to the root directory. Look for foo there.
o Previous step returns (if successful) the MFT index for foo. Locate that entry and look for bar.
o Returns the MFT entry for bar. Locate that entry and look for dir1.
o Returns the MFT entry for dir1. Locate that entry and look for testfile.txt.
o If any of the previous steps is not successful -> file not found.
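The lookup steps can be sketched in Python with a toy MFT. Each entry is modeled as a dictionary whose "children" field maps names to MFT indexes. That layout and all entry numbers other than 5 (the root directory) are made up for illustration; a real MFT entry stores its directory as a list or B+ tree, as described earlier.

```python
ROOT_ENTRY = 5   # MFT entry for the root directory

def lookup(mft, path):
    """Return the MFT index for path, or None if any component is missing."""
    index = ROOT_ENTRY
    for name in (part for part in path.split("/") if part):
        children = mft[index].get("children", {})
        if name not in children:
            return None          # file not found
        index = children[name]   # MFT index of the next component
    return index

# Hypothetical MFT contents.
mft = {
    5: {"children": {"foo": 40}},
    40: {"children": {"bar": 41}},
    41: {"children": {"dir1": 57}},
    57: {"children": {"testfile.txt": 90}},
    90: {},
}
print(lookup(mft, "/foo/bar/dir1/testfile.txt"))  # 90
print(lookup(mft, "/foo/nothere/testfile.txt"))   # None
```

Each path component costs one MFT lookup, so deep paths take more steps to resolve.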
Journaling: NTFS keeps a list of all the change records for directories and files in a special file.
Can mark directories to be encrypted, which results in all included files being encrypted. This is not actually managed by NTFS but by a driver called the Encrypting File System (EFS), or by an encryption facility in Vista called BitLocker.