TRANSCRIPT
Network File Systems
Victoria Krafft
CS 614
10/4/05
General Idea
People move around
Machines may want to share data
Want a system with:
No new interface for applications
No need to copy all the data
No space-consuming version control
Network File Systems
Diagram from http://www.cs.binghamton.edu/~kang/cs552/note11.ppt
A Brief History
Network File System (NFS) developed in 1984
Simple client-server model
Some problems
Andrew File System (AFS) developed in 1985
Better performance
More client-side caching
SFS developed in 1999
NFS can be run on untrusted networks
Lingering Issues
Central server is a major bottleneck
All choices still require lots of bandwidth
LANs getting faster & lower latency
Remote memory faster than local disk
ATM faster with more nodes sending data
Cooperative Caching
Michael D. Dahlin, Randolph Y. Wang, Thomas E. Anderson, and David A. Patterson in 1994
ATM, Myrinet provide faster, low-latency network
This makes remote memory 10-20x faster than disk
Want to get data from memory of other clients rather than server disk
Cooperative Caching
Data can be found in:
1. Local memory
2. Server memory
3. Other client memory
4. Server disk
How should we distribute cache data?
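To make that lookup order concrete, here is a minimal Python sketch (purely illustrative; the caches are plain dictionaries and the server_disk object is a hypothetical stand-in, not anything from the paper):

def read_block(block_id, local_cache, server_cache, client_caches, server_disk):
    # 1. Local memory
    if block_id in local_cache:
        return local_cache[block_id]
    # 2. Server memory
    if block_id in server_cache:
        return server_cache[block_id]
    # 3. Memory of another client -- the cooperative cache
    for peer_cache in client_caches:
        if block_id in peer_cache:
            return peer_cache[block_id]
    # 4. Server disk, roughly 10-20x slower than the remote-memory cases above
    return server_disk.read(block_id)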
Design Decisions
Diagram: decision tree over the design space (private vs. global coop. cache; coordinated cache entries vs. no coordination; static vs. dynamic vs. fixed partition; block location at any client, by hash, or by weighted LRU).
Its leaves are the four algorithms compared: Direct Client Cooperation, Greedy Forwarding, Centrally Coordinated Caching, and N-Chance.
Direct Client Cooperation
Active clients use idle client memory as a backing store
Simple
Don’t get info from other active clients
Greedy Forwarding
Each client manages its local cache greedily
Server stores contents of client caches
Still potentially large amounts of data duplication
No major cost for performance improvements
Centrally Coordinated Caching
Client cache split into two parts – local and global
N-Chance Forwarding
Clients prefer to cache singlets, blocks stored in only one client cache.
Instead of discarding a singlet, set its recirculation count to n and pass it on to a random other client.
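A rough Python sketch of that eviction rule (the recirculation limit, cache structures, and helper names are illustrative assumptions, not the paper's code):

import random

N = 2  # number of chances a singlet gets; the paper treats n as a tunable

class Block:
    def __init__(self, block_id):
        self.id = block_id
        self.recirculation = None  # set only once the block starts recirculating

def evict(local_cache, all_caches, victim):
    # A singlet is a block held by exactly one client cache.
    holders = sum(1 for cache in all_caches if victim.id in {b.id for b in cache})
    if holders == 1:
        if victim.recirculation is None:
            victim.recirculation = N
        if victim.recirculation > 0:
            victim.recirculation -= 1
            peer = random.choice([c for c in all_caches if c is not local_cache])
            peer.add(victim)        # the peer will evict its own LRU block in turn
            local_cache.discard(victim)
            return
    # Duplicated blocks, or singlets out of chances, are simply dropped;
    # the server's disk copy remains authoritative.
    local_cache.discard(victim)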
Sensitivity
Variation in Response Time with Client Cache Size
Variation in Response Time with Network Latency
Simulation results
Average read response time; server load
Simulation results
Slowdown
Results
N-Chance forwarding close to best possible performance
Requires clients to trust each other
Requires fast network
Serverless NFS
Thomas E. Anderson, Michael D. Dahlin, Jeanna M. Neefe, David A. Patterson, Drew S. Roselli, and Randolph Y. Wang in 1995
Eliminates central server
Takes advantage of ATM and Myrinet
Starting points
RAID: Redundancy if nodes leave or fail
LFS: Recovery when system fails
Zebra: Combines LFS and RAID for distributed systems
Multiprocessor Cache Consistency: Invalidating stale cache info
To Eliminate Central Servers
Scalable distributed metadata, which can be reconfigured after a failure
Scalable division into groups for efficient storage
Scalable log cleaning
How it works
Each machine has one or more roles:
1. Client
2. Storage Server
3. Manager
4. Cleaner
Management split among metadata managers
Disks clustered into stripe groups for scalability
Cooperative caching among clients
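A small illustrative sketch of the idea behind splitting management (the map layout and hashing here are assumptions, not the xFS data structures):

class ManagerMap:
    # Globally replicated table mapping a file's index number to the node that
    # manages its metadata; reassigning entries redistributes management after
    # a node joins, leaves, or fails.
    def __init__(self, managers):
        self.managers = managers

    def manager_for(self, file_index):
        return self.managers[file_index % len(self.managers)]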
xFS
xFS is a prototype of the serverless network file system
Lacks a couple of features:
Recovery not completed
Doesn't calculate or distribute new manager or stripe group maps
No distributed cleaner
File Read
File Write
Buffered into segments in local memory
Client commits to storage
Client notifies managers of modified blocks
Managers update index nodes & imaps
Periodically, managers log changes to stable storage
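Putting those steps together, a hedged sketch of the write path might look like this (segment size, class names, and helper calls are all assumptions for illustration):

SEGMENT_SIZE = 512 * 1024   # LFS-style segment size; illustrative value

class Client:
    def __init__(self, stripe_group, manager_map):
        self.segment = []               # blocks buffered in local memory
        self.segment_bytes = 0
        self.stripe_group = stripe_group
        self.manager_map = manager_map

    def write_block(self, file_id, offset, data):
        self.segment.append((file_id, offset, data))
        self.segment_bytes += len(data)
        if self.segment_bytes >= SEGMENT_SIZE:
            self.flush()

    def flush(self):
        # Commit the whole segment to the storage servers in the stripe group.
        segment_id = self.stripe_group.write_segment(self.segment)
        # Notify each file's manager of the modified blocks' new locations.
        for file_id, offset, data in self.segment:
            manager = self.manager_map.manager_for(file_id)
            manager.block_written(file_id, offset, segment_id)
        self.segment, self.segment_bytes = [], 0

class Manager:
    def __init__(self):
        self.imap = {}    # file_id -> {offset: segment_id}, i.e. index node locations
        self.log = []

    def block_written(self, file_id, offset, segment_id):
        # Update index nodes and the imap to point at the new block locations.
        self.imap.setdefault(file_id, {})[offset] = segment_id
        self.log.append((file_id, offset, segment_id))

    def checkpoint(self, stable_storage):
        # Periodically log the accumulated changes to stable storage.
        stable_storage.write(self.log)
        self.log = []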
Distributing File Management
First Writer – management goes to whoever created the file
Cleaning
Segment utilization maintained by segment writer
Segment utilization stored in s-files
Cleaning controlled by stripe group leader
Optimistic Concurrency control resolves cleaning / writing conflicts
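One way to picture the optimistic resolution (purely an illustration, not the xFS cleaner code): the cleaner copies blocks it believes are live without locking writers out, and the manager drops the cleaner's update for any block a client rewrote in the meantime.

def clean_segment(old_segment, new_segment, utilization, manager):
    for block in old_segment.blocks:
        if utilization.is_live(block):              # liveness comes from the s-files
            new_location = new_segment.append(block)
            # Apply the move only if the block still resides in the old segment;
            # if a client wrote it concurrently, the client's version wins.
            manager.move_if_unchanged(block, expected=old_segment.id, new=new_location)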
Recovery
Several steps are O(N²), but can be run in parallel
Steps For Recovery
xFS Performance
Aggregate Bandwidth Writing 10MB files (NFS max with 2 clients; AFS max with 32 clients)
Aggregate Bandwidth Reading 10MB files (NFS max with 2 clients; AFS max with 12 clients)
xFS Performance
Average time to complete the Andrew benchmark, varying the number of simultaneous clients
System Variables
Aggregate Large-Write Bandwidth with Different Storage Server Configurations
Variation in Average Small File Creation Speed with more Managers
Possible Problems
System relies on secure network between machines, and trusted kernels on distributed nodes
Testing done on Myrinet
Low-Bandwidth NFS
Want efficient remote access over slow or wide area networks
File systems better than CVS or copying all the data over
Want close-to-open consistency
LBFS
Large client cache containing user’s working set of files
Don’t send all the data – reconstitute files from previous data, and only send changes
File indexing
Non-overlapping chunks between 2K and 64K
Broken up using 48-byte Rabin fingerprints
Identified by SHA-1 hash, indexing on first 64 bits
Stored in database, recomputed before use to avoid synchronization issues
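A minimal sketch of that chunking scheme (the rolling hash below is a stand-in for the real Rabin fingerprint, and the boundary mask is an assumption about the target chunk size):

import hashlib

MIN_CHUNK = 2 * 1024      # 2K lower bound
MAX_CHUNK = 64 * 1024     # 64K upper bound
WINDOW = 48               # bytes covered by the fingerprint
BOUNDARY_MASK = 0x1FFF    # low 13 bits -> roughly 8K expected chunks (assumed)

def fingerprint(window_bytes):
    # Stand-in for the 48-byte Rabin fingerprint; the real one uses polynomial
    # arithmetic that can be updated incrementally as the window slides.
    return int.from_bytes(hashlib.md5(window_bytes).digest()[:8], "big")

def chunk_file(data):
    chunks, start = [], 0
    for i in range(WINDOW, len(data)):
        at_boundary = (fingerprint(data[i - WINDOW:i]) & BOUNDARY_MASK) == BOUNDARY_MASK
        size = i - start
        if (at_boundary and size >= MIN_CHUNK) or size >= MAX_CHUNK:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])
    # Each chunk is named by its SHA-1 hash; the index keys on the first 64 bits.
    return [(hashlib.sha1(c).digest()[:8], c) for c in chunks]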
Protocol
Based on NFS; added GETHASH, MKTMPFILE, TMPWRITE, CONDWRITE, COMMITTMP
Security infrastructure from SFS
Whole file caching
Retrieve from server on read unless valid copy in cache
Write back to server when file closed
File Reads
File Writes
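A hedged sketch of how the RPCs above fit into the read and write paths (the RPC names come from the slides; the call signatures, helper, and cache structure are assumptions):

def lbfs_write(server, fd, chunks):
    # chunks: list of (sha1_key, data) pairs, e.g. from chunk_file() above.
    tmp = server.MKTMPFILE(fd)                       # server-side temporary file
    offset = 0
    for key, data in chunks:
        # CONDWRITE asks the server to fill the range from its chunk database
        # if it already holds a chunk with this hash.
        if not server.CONDWRITE(tmp, offset, len(data), key):
            server.TMPWRITE(tmp, offset, data)       # only missing chunks cross the wire
        offset += len(data)
    server.COMMITTMP(tmp, fd)                        # atomically install the new contents

def lbfs_read(server, fd, cache):
    # GETHASH returns the hashes of the file's chunks; only chunks absent from
    # the client cache are actually fetched from the server.
    parts = []
    for offset, length, key in server.GETHASH(fd):
        data = cache.get(key)
        if data is None:
            data = server.READ(fd, offset, length)
            cache[key] = data
        parts.append(data)
    return b"".join(parts)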
Implementation
LBFS server accesses the file system as an NFS client
Server creates a trash directory for temporary files
Server inefficient when files are overwritten or truncated, which could be fixed by lower-level access
Client uses the xfs driver
Evaluation
Bandwidth consumption
Much higher bandwidth for first build
Application Performance
Bandwidth and Round Trip Time
Conclusions
New technologies open up new possibilities for network file systems
Cost of increased traffic over Ethernet may cause problems for xFS and cooperative caching.