Network File System
Phil Segel, Muhammad Kamran Arain, Ala F. Alnawaiseh
Group 3
Outline
– Introduction
– Network File System (NFS)
– Windows Distributed File System (DFS)
Introduction
A Distributed File System (DFS) is a file system that supports sharing of files and resources, in the form of persistent storage, over a network.
The first file servers were developed in the 1970s.
Sun’s Network File System (NFS) became the first widely used distributed file system after its introduction in 1985.
Clients and servers
A file server provides file services to clients.
A client interface for a file service is formed by a set of primitive file operations:
– Creating a file
– Deleting a file
– Reading from a file
– Writing to a file
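The four primitive operations can be sketched as a tiny in-memory service. This is an illustration only: the class name `InMemoryFileService` and its method signatures are assumptions, not part of any real file server.

```python
class InMemoryFileService:
    """Toy file service exposing the four primitive operations (hypothetical API)."""

    def __init__(self):
        self._files = {}  # file name -> bytearray holding the contents

    def create(self, name):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = bytearray()

    def delete(self, name):
        del self._files[name]  # raises KeyError if the file does not exist

    def read(self, name, offset, count):
        return bytes(self._files[name][offset:offset + count])

    def write(self, name, offset, data):
        buf = self._files[name]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap before offset
        buf[offset:offset + len(data)] = data


service = InMemoryFileService()
service.create("notes.txt")
service.write("notes.txt", 0, b"hello")
print(service.read("notes.txt", 0, 5))
```

In a real DFS each of these calls would travel over the network to the server instead of touching a local dictionary.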
Distribution
A DFS is a file system whose clients, servers, and storage devices are dispersed among the machines of a Distributed System or intranet.
Accordingly, service activity has to be carried out across the network, and instead of a single centralized data repository, the system has multiple and independent storage devices.
The distinctive features of a DFS are the multiplicity and autonomy of clients and servers in the system.
Transparency
Ideally, a DFS should appear to its clients to be a conventional, centralized file system.
The multiplicity and dispersion of its servers and storage devices should be made invisible.
Performance
The most important performance measurement of a DFS is the amount of time needed to satisfy service requests.
– In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time.
– In a DFS, however, a remote access has the additional overhead attributed to the distributed structure.
Concurrent File Updates
A DFS should provide for multiple client processes on multiple machines not just accessing but also updating the same files.
Concurrency control or locking may be either built into the file system or provided by an add-on protocol.
Distributed Data Store
A Distributed Data Store is a network in which a user stores his or her information on a number of peer network nodes.
Most peer-to-peer networks do not have distributed data stores, in that a user's data is only available when that user's node is on the network.
Distributed Data-Store Networks
– FreeNet
– MNet
– Andrew File System (AFS)
– NNTP
– BitTorrent
– The Mnesia Database
– GNUnet
– Secure File System (SFS)
– Global File System (GFS)
– The Chord Project
– SVK – Distributed Version Control
– Groove shared workspace, used for DoHyki
Windows Distributed File Systems
What is the purpose of Windows DFS?
To unite files on different computers into a single namespace
To make it easy to build a single, hierarchical view of multiple file servers and file server shares on your network
To display files in a single directory structure regardless of what server the files are on
Comparison
Windows Distributed File Systems do for servers what a file system does for a hard disk.
DFS File Protocols
Not limited to a single protocol
Regardless of client used, can support mapping of:
– Servers
– Shares
– Files
Supports these provided that the client supports the native server and share
History
The UNC (Universal Naming Convention) was required to specify the physical server and share to access file information
– i.e. \\Server\share\path\filename
Could be used directly by drive mapping
– i.e. X:\path\filename
As networks continue to grow, mapping shares individually scales poorly
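To make the UNC form concrete, the sketch below splits a UNC name into its server, share, and remaining path using Python's standard `pathlib.PureWindowsPath`. The helper name `split_unc` is an assumption for illustration; it expects a UNC path and raises `ValueError` for a drive-letter path.

```python
from pathlib import PureWindowsPath


def split_unc(path):
    """Split a UNC path into (server, share, remainder)."""
    p = PureWindowsPath(path)
    # For a UNC path, p.drive has the form \\server\share
    server, share = p.drive.lstrip("\\").split("\\", 1)
    remainder = "\\".join(p.parts[1:])  # parts[0] is the \\server\share\ anchor
    return server, share, remainder


print(split_unc(r"\\Server\share\path\filename"))
```

Every client that embeds the server and share like this must be updated when the data moves, which is the scaling problem noted above.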
Solution to historical problems
Windows DFS solves these problems by linking physical storage into logical representation.
Permits shares to be hierarchically connected to other Windows shares
Makes the physical location of data transparent to users and applications
DFS Features and Benefits
Feature: Custom hierarchical view of shared network resources
– Description: By linking shares together, administrators can create a single hierarchical volume that behaves as though it were one giant hard drive. Individual users can create their own Dfs volumes, which in turn can be incorporated by other Dfs volumes. These are called inter-Dfs links.
– Benefit: Provides a simplified view of network shares that can be customized by the administrator.
DFS Features and Benefits (Continued)
Feature: Flexible volume administration
– Description: Individual shares participating in the Dfs volume can be taken offline without affecting the remaining portion of the volume name space.
– Benefit: Allows administrators to manage physical network shares, independent of their logical representation to users.
DFS Features and Benefits (Continued)
Feature: Graphical administration tool
– Description: Each Dfs root is administered with an easy-to-use graphical administration tool that permits browsing, configuration of volumes, alternates, and inter-Dfs links, as well as administration of remote Dfs roots.
– Benefit: Requires little training, reducing the need for trained, full-time server administrators.
DFS Features and Benefits (Continued)
Feature: Higher data availability
– Description: Multiple copies of read-only shares can be mounted under the same logical Dfs name to provide alternate locations for accessing data. If one of the copies becomes unavailable, an alternate is automatically selected.
– Benefit: Important business data is always available, even if a server, disk drive, or file occasionally fails.
DFS Features and Benefits (Continued)
Feature: Load balancing
– Description: Multiple copies of read-only shares on separate disk drives or servers can be mounted under the same logical Dfs name, thus permitting limited load balancing between drives or servers. As users request files from the Dfs volume, they are transparently referred to one of the network shares comprising the Dfs volume.
– Benefit: Automatically distributes file access across multiple disk drives or servers to balance loads and improve response time during peak usage periods.
DFS Features and Benefits (Continued)
Feature: Name transparency
– Description: End users navigate the logical name space without consideration to the physical locations of their data. Physical data can be relocated to any server and the logical Dfs name space can be reconfigured so that the end user's perspective of the Dfs name space is unaffected (that is, it is transparent to users that their data has changed location).
– Benefit: Increased administrative flexibility. Administrators can move network shares between servers or disk drives without affecting users’ ability to access the data.
DFS Features and Benefits (Continued)
Feature: Integration with Windows NT security model
– Description: No additional administrative or security issues. Any user who connects to a Dfs volume is only permitted to access files for which he or she has appropriate rights on that share.
– Benefit: Uses the existing Windows NT security model for easy administration and secure access.
DFS Features and Benefits (Continued)
Feature: Dfs client integrated into Windows NT Workstation 4.0, available for Windows 95 and Windows 98
– Description: The Dfs Windows NT Workstation client has been incorporated into Windows NT Workstation 4.0. This integration with the SMB redirector allows the extra Dfs features to be fully pageable and does not affect memory needs or standard client access performance.
– Benefit: Dfs functionality requires no additional resources on client systems.
DFS Features and Benefits (Continued)
Feature: Intelligent client caching
– Description: A Dfs volume can potentially connect hundreds or thousands of published shares. The client software makes no assumptions about what portion of Dfs published information a user might access. As a result, the first access of a published directory caches certain information locally. The next time a client accesses that portion of the Dfs name space, the cached referral is accessed, rather than obtaining a new referral.
– Benefit: Allows high-performance access to complex hierarchies of network volumes.
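The caching behaviour described above can be sketched as a small referral cache. This is a minimal illustration, not the Dfs client's actual code: `ReferralCache`, `fetch_referral`, and the use of a per-referral time to live are assumptions made for the example.

```python
import time


class ReferralCache:
    """Caches referrals per Dfs path until their time to live expires (hypothetical)."""

    def __init__(self, fetch_referral, clock=time.monotonic):
        self._fetch = fetch_referral  # callable: dfs_path -> (targets, ttl_seconds)
        self._clock = clock
        self._entries = {}            # dfs_path -> (targets, expiry_time)

    def lookup(self, dfs_path):
        cached = self._entries.get(dfs_path)
        if cached is not None and cached[1] > self._clock():
            return cached[0]          # cache hit: no new referral is requested
        targets, ttl = self._fetch(dfs_path)
        self._entries[dfs_path] = (targets, self._clock() + ttl)
        return targets
```

Only the first access to a given part of the name space pays for a referral; later accesses are served locally until the entry expires.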
DFS Features and Benefits (Continued)
Feature: Windows 95 and Windows 98 Client
– Description: Dfs includes a service to permit Windows 95 and Windows 98 users to navigate the Dfs name space. With the current release of Dfs, Windows clients can only access non-SMB volumes through a server-based gateway (for example, Microsoft Gateway Services for NetWare, which is included with Windows NT Server).
– Benefit: Extends Dfs benefits to Windows 95 and Windows 98 users.
DFS Features and Benefits (Continued)
Feature: Interoperates with other network file systems
– Description: Any volume that is accessible through a redirector on Windows NT Workstation can participate in the Dfs name space. This can be through either client redirectors or server-based gateway technology.
– Benefit: Administrators can create a single hierarchy incorporating heterogeneous network file systems.
Administration
DFS provides tools to add and remove shares as necessary
Administration (Continued)
Easy to replace servers since each node in the Dfs is assigned a logical name that points to a file share
Can point a particular share to a new node while the current node is being replaced
User View
Maps just like a regular Windows drive
Load Balancing
If a volume is unavailable, Dfs hands off the request to an alternate volume, if one is available
Example:
– If 300 users require access to one volume, Dfs can split the users among copies on 2 or more servers to balance the load
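The 300-user example can be simulated with a short sketch in which each client picks one alternate at random. The server names and the use of a seeded `random.Random` are assumptions for illustration; the split is approximate, not exact.

```python
import random
from collections import Counter


def pick_alternate(alternates, rng):
    """Each client independently chooses one of the offered alternates."""
    return rng.choice(alternates)


rng = random.Random(0)  # seeded so the simulation is repeatable
alternates = [r"\\ServerA\Docs", r"\\ServerB\Docs"]  # hypothetical copies of one volume
load = Counter(pick_alternate(alternates, rng) for _ in range(300))
print(load)
```

Because clients choose independently, the load is only roughly even, which matches the "limited load balancing" wording used for this feature.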
Name Transparency
Eliminates the need for end users to know where the information is physically stored
Eases updating to accommodate additional storage
Example:
– Users do not need to know the location of physical storage, so it can be swapped out behind the scenes to accommodate additional storage
Technical Overview of DFS
DFS Root
– Serves as a starting point and host to other shares
Post-Junction Junctions
This is a junction that has child junctions
Inter-Dfs Links
– Can join separate Dfs volumes together
– Example: Organizations having their own Dfs, and then one large Dfs to encompass the smaller Dfs's
Post-Junction Junctions (Continued)
Midlevel Junctions
– Planned for future versions of Dfs
– Unlimited hierarchical junctioning
– Reduces points of failure
– Does not require inter-Dfs links
– Minimizes the number of referrals to deeply nested paths
– Maintained by the Root
Example
UNC Name | Maps to | Description
\\Server\Public | \\Server\Public | Root of the organization's Dfs
\\Server\Public\Intranet | \\IIS\Root | Junction to the intranet launching point
\\Server\Public\Intranet\CorpInfo | \\Marketing\Info\Corporate_HTML | Junction to departmental intranet content
\\Server\Public\Users | \\Server\Public\Users | Collection of home directories
\\Server\Public\Users\Bob | \\Server\Public\Users | Junction from Users to Bob's directory on the corporate development server
\\Server\Public\Users\Bob\Java_Apps | \\Bob1\Data\Java_Apps | Junction point from Bob's development directory to one of Bob's personal workstations
\\Server\Public\Users\Bob\Java_Apps | \\Bob2\Backups\Java_Apps | ALTERNATE Volume: Manually maintained backup of Bob's work
\\Server\Public\Users\Ray | \\Server\Public\Users | Down-level Volume: junction to a non-SMB volume (such as NetWare or NFS)
Example (Continued)
Alternate Volumes
Keeping exact replicas of the same volume for redundancy
Can be mounted to the same point
Limit of 32 alternates for any given junction point
Down-Level Volumes
Legacy support for all older Windows operating systems
Can participate in Dfs but cannot host the Dfs tree
Partition Knowledge Table (PKT)
Maintains knowledge of all of the junction points
Approximately 300 bytes per entry containing:
– Dfs Path
– [Server + Share] (a list)
– Time to Live
Illustration
Resolving Junctions
Resolving logical names into physical names is done by searching the PKT.
Maintained in a tree
Top-down search
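A minimal sketch of junction resolution follows. As a simplification it keeps the PKT in a flat list and picks the longest matching prefix, rather than walking a tree top-down. The `PktEntry` class, the `resolve` function, and the 300-second time to live are assumptions for illustration; the paths come from the example table in these slides.

```python
from dataclasses import dataclass


@dataclass
class PktEntry:
    dfs_path: str      # logical Dfs path of the junction
    targets: list      # list of physical \\server\share targets
    ttl_seconds: int   # time to live (value below is hypothetical)


PKT = [
    PktEntry(r"\\Server\Public", [r"\\Server\Public"], 300),
    PktEntry(r"\\Server\Public\Intranet", [r"\\IIS\Root"], 300),
    PktEntry(r"\\Server\Public\Intranet\CorpInfo",
             [r"\\Marketing\Info\Corporate_HTML"], 300),
    PktEntry(r"\\Server\Public\Users\Bob\Java_Apps",
             [r"\\Bob1\Data\Java_Apps", r"\\Bob2\Backups\Java_Apps"], 300),
]


def resolve(logical_path):
    """Return the candidate physical paths for a logical Dfs path."""
    lowered = logical_path.lower()  # Windows paths are case-insensitive
    best = None
    for entry in PKT:
        prefix = entry.dfs_path.lower()
        if lowered == prefix or lowered.startswith(prefix + "\\"):
            if best is None or len(entry.dfs_path) > len(best.dfs_path):
                best = entry        # keep the deepest matching junction
    if best is None:
        raise LookupError(logical_path)
    rest = logical_path[len(best.dfs_path):]
    return [target + rest for target in best.targets]


print(resolve(r"\\Server\Public\Intranet\CorpInfo\index.html"))
```

A junction with alternates returns more than one physical path, and the client then chooses among them.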
Example
Fail-over between volumes
When alternates are available they are provided to the client during name resolution
The choice among alternates is arbitrary and is made by the client
Fail-over Scenario 1
A client is browsing an alternate volume. The computer hosting the alternate loses power or drops completely from the network for any reason. To fail-over, the client must first detect that the hosting computer is no longer present. How long this takes depends on which protocol the client is using. Many protocols account for slow and loosely connected WAN links, and therefore may have retry counts of up to two minutes before the protocol itself times out. Once that occurs, Dfs immediately selects a new alternate. If none are available from the local cache, the Dfs client consults with the Dfs root to see if the administrator has modified any PKT entries. If no alternates are available at the root, a failure occurs; otherwise, Dfs initiates a fresh alternate selection and session setup.
Fail-over Scenario 2
A client is browsing an alternate volume. The computer hosting the alternate loses a hard disk hosting the alternate, or the share is deactivated. In this situation, the server hosting the alternate is still responding to the client request; the fail-over to a fresh alternate is nearly instantaneous.
Fail-over Scenario 3
A client has open files. The computer hosting the alternate loses power or drops completely from the network for any reason. In this scenario, the same protocol fail-over process described in Scenario #1 occurs, but the application that previously had file locks from the previous alternate must detect the change and establish new locks.
New attempts to open files trigger the same fail-over process described in Scenario #1. Operations on already open files fail with appropriate errors.
Fail-over Scenario 4
A client has open files. The computer hosting the alternate loses a hard disk hosting the alternate, or the share is deactivated. In this scenario, the same very quick fail-over process described in Scenario #2 occurs, but the application that previously had file handles from the previous alternate must detect the change and establish new handles.
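The four scenarios share one control flow: try the alternates already known locally, then consult the Dfs root once, then fail. The sketch below shows that flow only. The names `open_with_failover`, `try_open`, and `refresh_from_root` are assumptions, and protocol timeouts are represented simply by `try_open` raising `OSError`.

```python
def _first_working(targets, try_open):
    """Return (target, handle) for the first target that opens, else None."""
    for target in targets:
        try:
            return target, try_open(target)
        except OSError:
            continue  # host unreachable or share deactivated: try the next one
    return None


def open_with_failover(cached_alternates, try_open, refresh_from_root):
    """Try locally cached alternates first, then ask the Dfs root once."""
    result = _first_working(cached_alternates, try_open)
    if result is None:
        # No cached alternate worked: the root may hold updated PKT entries.
        result = _first_working(refresh_from_root(), try_open)
    if result is None:
        raise OSError("no alternate volume is available")
    return result
```

The sketch does not re-establish locks or handles; as Scenarios 3 and 4 note, that remains the application's job.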
Security
Allows for special handling of security issues at session startup using ACLs.
ACLs are not consistent system-wide; they are maintained on each server share
Administered at each physical share
There is no mechanism to administer system-wide security from the Dfs root
There is no attempt made to keep the ACLs consistent between alternate volumes
Network File System
NFS Architecture (1)
a) The remote access model.
b) The upload/download model.
NFS Architecture (2)
The basic NFS architecture for UNIX systems.
Important Advantage Of NFS
Largely independent of local file systems
In principle it does not matter which OS the client or server uses (Unix or Windows)
The only important issue is that the file systems must be compliant with the file system model offered by NFS
Example: MS-DOS, with its short file names, cannot be used to implement an NFS server in a fully transparent way
File System Model
An incomplete list of file system operations supported by NFS.
Operation v3 v4 Description
Create Yes No Create a regular file V3 only
Create No Yes Create a nonregular file - V4 symbolic links, directories and special files
Link Yes Yes Create a hard link to a file
Symlink Yes No Create a symbolic link to a file
Mkdir Yes No Create a subdirectory in a given directory
Mknod Yes No Create a special file
Rename Yes Yes Change the name of a file
Rmdir Yes No Remove an empty subdirectory from a directory
Open No Yes Open a file – V4 - will create a file if it does not exist
Close No Yes Close a file
Lookup Yes Yes Look up a file by means of a file name
Readdir Yes Yes Read the entries in a directory
Readlink Yes Yes Read the path name stored in a symbolic link
Getattr Yes Yes Read the attribute values for a file
Setattr Yes Yes Set one or more attribute values for a file
Read Yes Yes Read the data contained in a file
Write Yes Yes Write data to a file
File Handles
A reference to a file within a file system
It is independent of the name of the file it refers to
Created by the server that is hosting the file system
Unique with respect to all file systems exported by the server
Created when the file is created
Client is kept ignorant of the actual content of the file handle – it is completely opaque
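The properties listed above can be illustrated with a toy server that hands out opaque handles. This is a sketch under stated assumptions: `ExportedFileSystem` is a made-up class, and the random 16-byte handle is chosen only to make the opacity obvious; it is not how a real NFS server builds its handles.

```python
import os


class ExportedFileSystem:
    """Toy server-side table mapping opaque handles to files (hypothetical)."""

    def __init__(self):
        self._by_handle = {}  # handle -> file contents
        self._names = {}      # file name -> handle

    def create(self, name):
        handle = os.urandom(16)           # created together with the file
        self._by_handle[handle] = bytearray()
        self._names[name] = handle
        return handle                     # the client treats this as opaque bytes

    def rename(self, old, new):
        self._names[new] = self._names.pop(old)  # the handle is untouched

    def lookup(self, name):
        return self._names[name]


fs = ExportedFileSystem()
handle = fs.create("report.txt")
fs.rename("report.txt", "final.txt")
print(fs.lookup("final.txt") == handle)
```

Because the handle does not depend on the name, a client holding it keeps a valid reference across the rename.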
Processes
NFS: Traditional Client/Server system
Version 2 and Version 3
– Server stateless
– Stateless model not always fully implemented
– Very little client info held
Version 4
– Stateless model abandoned
Stateful Approach
Besides file locking and authentication, there is another reason for making the server stateful
NFS 4 is expected to work over WANs
This requires that clients can make efficient use of caches
This, in turn, requires an efficient cache consistency protocol
Server needs to maintain information on files used by clients
For example, the server may associate a lease with each client, promising to give the client exclusive read/write access
Communication
a) Reading data from a file in NFS version 3 - Iterative
b) Reading data using a compound procedure in version 4 - Recursive
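The difference between the two figures is the number of round trips. The sketch below counts them for a lookup followed by a read, once as separate calls and once as a single compound request. `CountingServer`, the `PREV` placeholder, and the operation names are assumptions for illustration, not the NFS wire protocol.

```python
PREV = object()  # placeholder meaning "result of the previous operation"


class CountingServer:
    """Toy server that counts request/reply round trips (hypothetical)."""

    def __init__(self, files):
        self.files = files      # name -> bytes
        self.round_trips = 0

    def _run(self, op, *args):
        if op == "lookup":
            (name,) = args
            if name not in self.files:
                raise FileNotFoundError(name)
            return name         # the name doubles as the handle in this toy
        if op == "read":
            handle, offset, count = args
            return self.files[handle][offset:offset + count]
        raise ValueError(f"unknown operation: {op}")

    def call(self, op, *args):
        """Version 3 style: one operation per request."""
        self.round_trips += 1
        return self._run(op, *args)

    def compound(self, ops):
        """Version 4 style: several operations in one request."""
        self.round_trips += 1
        result = None
        for op, *args in ops:
            args = [result if a is PREV else a for a in args]
            result = self._run(op, *args)  # an exception aborts the remaining ops
        return result


v3 = CountingServer({"f": b"abcdef"})
data = v3.call("read", v3.call("lookup", "f"), 0, 3)
v4 = CountingServer({"f": b"abcdef"})
data4 = v4.compound([("lookup", "f"), ("read", PREV, 0, 3)])
print(v3.round_trips, v4.round_trips)
```

Saving round trips matters most on high-latency links, which fits the expectation that version 4 should work over WANs.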
Naming (1)
Mounting (part of) a remote file system in NFS.
Naming (2)
Mounting nested directories from multiple servers in NFS.
Automounting (1)
A simple automounter for NFS.
Automounting (2)
Using symbolic links with automounting.
File Attributes
Version 3 used fixed set of attributes
– Fully implementing version 3 was difficult on some platforms
Version 4 split attributes into 3 sets:
– Mandatory attributes
– Recommended attributes
– Named attributes
• Named attributes not actually part of NFS protocol
File Attributes
Attribute Description
ACL An access control list associated with the file
FILEHANDLE The server-provided file handle of this file
FILEID A file-system unique identifier for this file
FS_LOCATIONS Locations in the network where this file system may be found
OWNER The character-string name of the file's owner
TIME_ACCESS Time when the file data were last accessed
TIME_MODIFY Time when the file data were last modified
TIME_CREATE Time when the file was created
Attribute Description
TYPE The type of the file (regular, directory, symbolic link)
SIZE The length of the file in bytes
CHANGE Indicator to see if and/or when the file has changed
FSID Server-unique identifier of the file's file system
Semantics of File Sharing (1)
a) On a single processor, when a read follows a write, the value returned by the read is the value just written.
b) In a distributed system with caching, obsolete values may be returned.
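Case (b) can be reproduced with two caching clients that never revalidate what they hold. The classes `FileServer` and `CachingClient` are assumptions made for this illustration, not NFS's actual caching policy.

```python
class FileServer:
    def __init__(self):
        self.files = {}  # name -> bytes


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # local copies, never revalidated in this toy

    def write(self, name, data):
        self.cache[name] = data
        self.server.files[name] = data  # write through to the server

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.files[name]
        return self.cache[name]


server = FileServer()
a, b = CachingClient(server), CachingClient(server)
a.write("f", b"v1")
b.read("f")          # b now caches v1
a.write("f", b"v2")  # the server holds v2
stale = b.read("f")  # b still sees its cached copy
print(stale, server.files["f"])
```

Client b keeps returning the old value until something invalidates its copy, which is why a cache consistency protocol is needed.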
Semantics of File Sharing
• Immutable Files
– No updates are possible
– Simplifies sharing and replication
– Only operations are create and read
• Transaction
– All changes occur atomically
File Locking in NFS (1)
NFS version 4 operations related to file locking.
Operation Description
Lock Creates a lock for a range of bytes
Lockt Test whether a conflicting lock has been granted
Locku Remove a lock from a range of bytes
Renew Renew the lease on a specified lock
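The first three operations in the table can be sketched as a byte-range lock table. This is a simplification under stated assumptions: all locks are exclusive, leases and Renew are left out, unlocking must name the exact range that was locked, and the class and method names are made up for the example.

```python
class ByteRangeLocks:
    """Toy lock table for one file (hypothetical, exclusive locks only)."""

    def __init__(self):
        self._locks = []  # (owner, start, end) with end exclusive

    def lockt(self, owner, start, length):
        """Test only: return a conflicting lock held by someone else, or None."""
        end = start + length
        for held in self._locks:
            h_owner, h_start, h_end = held
            if h_owner != owner and start < h_end and h_start < end:
                return held
        return None

    def lock(self, owner, start, length):
        """Create a lock for a byte range; return False on conflict."""
        if self.lockt(owner, start, length) is not None:
            return False
        self._locks.append((owner, start, start + length))
        return True

    def locku(self, owner, start, length):
        """Remove a lock previously created for exactly this range."""
        self._locks.remove((owner, start, start + length))


table = ByteRangeLocks()
print(table.lock("client-1", 0, 100))   # granted
print(table.lock("client-2", 50, 10))   # overlaps bytes 50..59: refused
print(table.lock("client-2", 100, 10))  # adjacent range: granted
```

Two ranges conflict only when they overlap, so different clients can lock different parts of the same file.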
Client Caching (1)
Client-side caching in NFS.
Client Caching (2)
Using the NFS version 4 callback mechanism to recall file delegation.
RPC Failures
Three situations for handling retransmissions.
a) The request is still in progress
b) The reply has just been returned
c) The reply was sent some time ago, but was lost.
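One common way to tell the three situations apart is a duplicate-request cache keyed by the RPC transaction identifier. The sketch below assumes the following policy, which is not spelled out on the slide: ignore a retransmission while the request is in progress or when the reply was sent very recently, and resend the cached reply otherwise. The class name, the `recent_window` threshold, and the returned action strings are all hypothetical.

```python
import time

IN_PROGRESS = object()


class DuplicateRequestCache:
    """Decides what to do with a retransmitted request (hypothetical policy)."""

    def __init__(self, recent_window=1.0, clock=time.monotonic):
        self._window = recent_window  # seconds; an assumed threshold
        self._clock = clock
        self._seen = {}               # xid -> IN_PROGRESS or (reply, time_sent)

    def start(self, xid):
        self._seen[xid] = IN_PROGRESS

    def finish(self, xid, reply):
        self._seen[xid] = (reply, self._clock())

    def on_retransmission(self, xid):
        state = self._seen.get(xid)
        if state is None:
            return ("execute", None)  # never seen: treat as a new request
        if state is IN_PROGRESS:
            return ("ignore", None)   # (a) still being processed
        reply, sent_at = state
        if self._clock() - sent_at < self._window:
            return ("ignore", None)   # (b) the reply probably crossed the retry
        return ("resend", reply)      # (c) the reply was probably lost
```

The cache lets the server avoid executing a non-idempotent request twice when only the reply went missing.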
Security
The NFS security architecture.
Secure RPCs
Secure RPC in NFS version 4.
Access Control
The classification of operations recognized by NFS with respect to access control.
Operation Description
Read_data Permission to read the data contained in a file
Write_data Permission to modify a file's data
Append_data Permission to append data to a file
Execute Permission to execute a file
List_directory Permission to list the contents of a directory
Add_file Permission to add a new file to a directory
Add_subdirectory Permission to create a subdirectory in a directory
Delete Permission to delete a file
Delete_child Permission to delete a file or directory within a directory
Read_acl Permission to read the ACL
Write_acl Permission to write the ACL
Read_attributes The ability to read the other basic attributes of a file
Write_attributes Permission to change the other basic attributes of a file
Read_named_attrs Permission to read the named attributes of a file
Write_named_attrs Permission to write the named attributes of a file
Write_owner Permission to change the owner
Synchronize Permission to access a file locally at the server with synchronous reads and writes
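A few of the operations in the table can be modelled as permission bits that are checked against a per-user grant. This is a deliberately reduced sketch: real ACLs are lists of entries rather than one mask per user, and the `Access` enum, the `is_allowed` helper, and the user names are assumptions for illustration.

```python
from enum import Flag, auto


class Access(Flag):
    """A subset of the operations from the table, as combinable bits."""
    READ_DATA = auto()
    WRITE_DATA = auto()
    APPEND_DATA = auto()
    EXECUTE = auto()
    DELETE = auto()
    READ_ACL = auto()
    WRITE_ACL = auto()


def is_allowed(acl, user, requested):
    """True only if every requested permission bit is granted to the user."""
    granted = acl.get(user, Access(0))
    return (granted & requested) == requested


acl = {"alice": Access.READ_DATA | Access.WRITE_DATA}  # hypothetical ACL
print(is_allowed(acl, "alice", Access.READ_DATA))
print(is_allowed(acl, "alice", Access.DELETE))
```

Separating Append_data from Write_data, for example, lets a log file be extended by users who may not overwrite it.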
Benchmarking study
– Network File System (NFS)
Outline
– File system architectures
– Performance study design
– Experimental results
NFS Architecture
Client/server system
Single server for files
Performance Study Design
Experimental cluster
– Seven dual-processor Pentium III 1 GHz computers with 1 GB memory
– Dual EIDE disk RAID 0 subsystem in all nodes, measured throughput about 50 MBps
– Myrinet switches, 250 MBps theoretical bandwidth
NFS Parameters
Mount on Node 0 is a local mount
– Optimization for NFS
NFS server can participate or not as a client in the workload
System Software
Red Hat Linux version 7.1
Linux kernel version 2.4.17-rc2
NFS protocol version 3
PVFS version 1.5.3
PVFS kernel version 1.5.3
Myrinet network drivers gm-1.5-pre3b
MPICH version 1.2.1
Clearcache
Clear NFS client and server-side caches
– Unmount NFS directory, shut down NFS
– Restart NFS, remount NFS directories
Experimental Parameters
I/O servers: the NFS server may or may not also participate as a client
Experimental Results
– NFS, LWF and GWF with and without server reading
– PVFS UNIX/POSIX API compared to NFS
– PVFS and NFS, GWF, 1 and 2 clients with/without server participating
NFS, LWF and GWF with and without server reading
PVFS UNIX/POSIX API compared to NFS
PVFS and NFS, GWF, 1 and 2 clients with/without server participating
Conclusions
NFS can take advantage of a local mount
NFS performance is limited by contention at the single server
– Limited to the disk throughput or the network throughput from the server, whichever has the most contention