![Page 1: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/1.jpg)
Distributed FS, Continued
Andy Wang
COP 5611
Advanced Operating Systems
![Page 2: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/2.jpg)
Outline
Replicated file systems
Ficus
Coda
Serverless file systems
![Page 3: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/3.jpg)
Replicated File Systems
NFS provides remote access
AFS provides high quality caching
Why isn’t this enough?
More precisely, when isn’t this enough?
![Page 4: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/4.jpg)
When Do You Need Replication?
For write performance
For reliability
For availability
For mobile computing
For load sharing
Optimistic replication increases these advantages
![Page 5: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/5.jpg)
Some Replicated File Systems
Locus
Ficus
Coda
Rumor
All optimistic: few conservative file replication systems have been built
![Page 6: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/6.jpg)
Ficus
Optimistic file replication based on peer-to-peer model
Built in Unix context
Meant to service large network of workstations
Built using stackable layers
![Page 7: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/7.jpg)
Peer-To-Peer Replication
All replicas are equal
No replicas are masters, or servers
All replicas can provide any service
All replicas can propagate updates to all other replicas
Client/server is the other popular model
![Page 8: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/8.jpg)
Basic Ficus Architecture
Ficus replicates at volume granularity
A given volume can be replicated many times
Performance limitations on scale
Updates propagated as they occur
On a single best-effort basis
Consistency achieved by periodic reconciliation
![Page 9: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/9.jpg)
Stackable Layers in Ficus
Ficus is built out of several stackable layers
Exact composition depends on what generation of system you look at
![Page 10: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/10.jpg)
Ficus Stackable Layers Diagram
[Diagram: layers labeled Select, FLFS, FPFS, Storage, and Transport; the FPFS and Storage pair appears twice]
![Page 11: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/11.jpg)
Ficus Diagram
[Diagram: Site A, Site B, and Site C, with labels 1, 2, and 3]
![Page 12: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/12.jpg)
An Update Occurs
[Diagram: Site A, Site B, and Site C, with labels 1, 2, and 3]
![Page 13: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/13.jpg)
Reconciliation in Ficus
Reconciliation process runs periodically on each Ficus site
For each local volume replica
Reconciliation strategy implies eventual consistency guarantee
Frequency of reconciliation affects how long “eventually” takes
![Page 14: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/14.jpg)
Steps in Reconciliation
1. Get information about the state of a remote replica
2. Get information about the state of the local replica
3. Compare the two sets of information
4. Change local replica to reflect remote changes
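The four steps can be sketched as a one-way pull. This is only an illustration: the dictionary layout, the function name `reconcile`, and the use of a single integer version per file are assumptions made for brevity, not the actual Ficus data structures.

```python
def reconcile(local, remote):
    """One-way pull: update `local` so it reflects changes seen at `remote`.

    Each replica is a dict: file name -> (version, data).
    A single integer version per file is a simplification for this sketch.
    """
    # Step 1: get information about the state of the remote replica.
    remote_state = {name: version for name, (version, _) in remote.items()}
    # Step 2: get information about the state of the local replica.
    local_state = {name: version for name, (version, _) in local.items()}
    # Step 3: compare the two sets of information.
    newer = sorted(name for name, version in remote_state.items()
                   if version > local_state.get(name, 0))
    # Step 4: change the local replica to reflect remote changes.
    for name in newer:
        local[name] = remote[name]
    return newer


site_a = {"report": (2, "updated at A")}
site_c = {"report": (1, "original"), "notes": (1, "only at C")}
pulled = reconcile(site_c, site_a)  # C reconciles with A
```

Note that only the local replica changes; the remote site learns about local changes when it runs its own reconciliation.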
![Page 15: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/15.jpg)
Ficus Reconciliation Diagram
C Reconciles With A
[Diagram: Site A, Site B, and Site C, with labels 1, 2, and 3]
![Page 16: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/16.jpg)
Ficus Reconciliation Diagram, Cont’d
B Reconciles With C
[Diagram: Site A, Site B, and Site C, with labels 1, 2, and 3]
![Page 17: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/17.jpg)
Gossiping and Reconciliation
Reconciliation benefits from the use of gossip
In example just shown, an update originating at A got to B through communications between B and C
So B can get the update without talking to A directly
![Page 18: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/18.jpg)
Benefits of Gossiping
Potentially less communication
Shares load of sending updates
Easier recovery behavior
Handles disconnections nicely
Handles mobile computing nicely
Peer model systems get more benefit than client/server model systems
![Page 19: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/19.jpg)
Reconciliation Topology
Reconciliation in Ficus is pair-wise
In the general case, which pairs of replicas should reconcile?
Reconciling all pairs is unnecessary
Due to gossip
Want to minimize number of recons
But propagate data quickly
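To make the trade-off concrete, here is a small simulation of a ring, under the assumption that each replica pulls once from its predecessor, in ring order. The function name `ring_pass` and the boolean "has the update" model are hypothetical simplifications.

```python
def ring_pass(has_update):
    """Each replica pulls once from its ring predecessor; returns recon count."""
    n = len(has_update)
    recons = 0
    for i in range(1, n + 1):
        src, dst = (i - 1) % n, i % n
        # Gossip: dst learns the update from src, wherever it originated.
        has_update[dst] = has_update[dst] or has_update[src]
        recons += 1
    return recons


replicas = [True, False, False, False, False]  # update originates at replica 0
recons = ring_pass(replicas)
all_pairs = len(replicas) * (len(replicas) - 1) // 2
```

In this sketch one trip around a five-replica ring takes 5 reconciliations and reaches every replica, compared with 10 if all pairs reconciled.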
![Page 20: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/20.jpg)
Ficus Ring Reconciliation Topology
![Page 21: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/21.jpg)
Adaptive Ring Reconciliation Topology
![Page 22: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/22.jpg)
Problems in File Reconciliation
Recognizing updates
Recognizing update conflicts
Handling conflicts
Recognizing name conflicts
Update/remove conflicts
Garbage collection
Ficus has solutions for all these problems
![Page 23: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/23.jpg)
Recognizing Updates in Ficus
Ficus keeps per-file version vectors
Updates detected by version vector comparisons
The data for the later version can then be propagated
Ficus propagates full files
![Page 24: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/24.jpg)
Recognizing Update Conflicts in Ficus
Concurrent update can lead to update conflicts
Version vectors permit detection of update conflicts
Works for n-way conflicts, too
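A version vector comparison can be sketched as follows. The representation (a dict from site name to update count) and the function name `compare` are assumptions for illustration; the slides do not give the Ficus encoding.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dict: site -> number of updates made there)."""
    sites = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(s, 0) > vv_b.get(s, 0) for s in sites)
    b_ahead = any(vv_b.get(s, 0) > vv_a.get(s, 0) for s in sites)
    if a_ahead and b_ahead:
        return "conflict"   # concurrent updates: neither version dominates
    if a_ahead:
        return "a_newer"    # propagate a's data to b
    if b_ahead:
        return "b_newer"    # propagate b's data to a
    return "equal"


print(compare({"A": 2, "B": 1}, {"A": 1, "B": 1}))  # a saw everything b saw
print(compare({"A": 2, "B": 0}, {"A": 1, "B": 1}))  # each saw an update the other missed
```

An n-way conflict shows up as pairwise comparisons that each return `"conflict"`.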
![Page 25: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/25.jpg)
Handling Update Conflicts in Ficus
Ficus uses resolver programs to handle conflicts
Resolvers work on one pair of replicas of one file
System attempts to deduce file type and call proper resolver
If all resolvers fail, notify user
Ficus also blocks access to file
![Page 26: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/26.jpg)
Handling Directory Conflicts in Ficus
Directory updates have very limited semantics
So directory conflicts are easier to deal with
Ficus uses special in-kernel mechanisms to automatically fix most directory conflicts
![Page 27: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/27.jpg)
Directory Conflict Diagram
Replica 1: Earth, Mars, Saturn
Replica 2: Earth, Mars, Sedna
![Page 28: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/28.jpg)
How Did This Directory Get Into This State?
If we could figure out what operations were performed on each side that caused each replica to enter this state, we could produce a merged version
But there are two possibilities
![Page 29: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/29.jpg)
Possibility 1
1. Earth and Mars exist
2. Create Saturn at replica 1
3. Create Sedna at replica 2
Correct result is directory containing Earth, Mars, Saturn, and Sedna
![Page 30: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/30.jpg)
The Create/Delete Ambiguity
This is an example of a general problem with replicated data
Cannot be solved with per-file version vectors
Requires per-entry information
Ficus keeps such information
Must save removed files’ entries for a while
![Page 31: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/31.jpg)
Possibility 2
1. Earth, Mars, and Saturn exist
2. Delete Saturn at replica 2
3. Create Sedna at replica 2
Correct result is directory containing Earth, Mars, and Sedna
And there are other possibilities
![Page 32: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/32.jpg)
Recognizing Name Conflicts in Ficus
Name conflicts occur when two different files are concurrently given same name
Ficus recognizes them with its per-entry directory info
Then what?
Handle similarly to update conflicts
Add disambiguating suffixes to names
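A rough sketch of suffix-based disambiguation, assuming each directory entry pairs a name with a file identifier. The `name.file_id` suffix format and the function name `merge_names` are made up for this example; the slides do not say what suffixes Ficus uses.

```python
def merge_names(entries_1, entries_2):
    """Merge two replicas' entries, given as lists of (name, file_id) pairs."""
    by_name = {}
    for name, file_id in entries_1 + entries_2:
        by_name.setdefault(name, set()).add(file_id)
    merged = {}
    for name, file_ids in by_name.items():
        if len(file_ids) == 1:
            merged[name] = next(iter(file_ids))      # same file on both sides
        else:
            for file_id in sorted(file_ids):         # name conflict
                merged[f"{name}.{file_id}"] = file_id  # hypothetical suffix
    return merged


merged = merge_names([("notes", "f1")], [("notes", "f2"), ("todo", "f3")])
```

Both files survive under distinct names, so the user can decide later which one should keep the original name.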
![Page 33: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/33.jpg)
Internal Representation of Problem Directory
Replica 1: Earth, Mars, Saturn
Replica 2: Earth, Mars, Saturn, Sedna
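A minimal sketch of how per-entry information removes the create/delete ambiguity, assuming each replica records every entry as either live or deleted. The state strings and the function names `merge_dirs` and `visible` are illustrative only, not the Ficus representation.

```python
def merge_dirs(dir_1, dir_2):
    """Merge two directory replicas (dict: name -> 'live' or 'deleted')."""
    merged = {}
    for name in set(dir_1) | set(dir_2):
        states = {dir_1.get(name), dir_2.get(name)}
        # A saved "deleted" entry records that a remove happened; an absent
        # entry only means the replica has not heard of the name yet.
        merged[name] = "deleted" if "deleted" in states else "live"
    return merged


def visible(directory):
    return sorted(name for name, state in directory.items() if state == "live")


replica_1 = {"Earth": "live", "Mars": "live", "Saturn": "live"}
# Possibility 1: replica 2 never had Saturn.
poss_1 = {"Earth": "live", "Mars": "live", "Sedna": "live"}
# Possibility 2: replica 2 deleted Saturn and kept the removed entry.
poss_2 = {"Earth": "live", "Mars": "live", "Saturn": "deleted", "Sedna": "live"}

print(visible(merge_dirs(replica_1, poss_1)))
print(visible(merge_dirs(replica_1, poss_2)))
```

The two possibilities look identical to a user listing the directory, but the saved entry lets reconciliation tell them apart.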
![Page 34: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/34.jpg)
Update/Remove Conflicts
Consider case where file “Saturn” has two replicas
1. Replica 1 receives an update
2. Replica 2 is removed
What should happen?
A matter of systems semantics, basically
![Page 35: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/35.jpg)
Ficus’ No-Lost-Updates Semantics
Ficus handles this problem by defining its semantics to be no-lost-updates
In other words, the update must not disappear
But the remove must happen
Put “Saturn” in the orphanage
Requires temporarily saving removed files
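One way to picture no-lost-updates is the sketch below. It assumes a single integer version per file and that the remover records the last version it saw; both are simplifications, and `apply_remove` is a hypothetical name.

```python
def apply_remove(directory, orphanage, name, version_seen_by_remover):
    """Apply a remote remove to the local replica without losing updates."""
    version, data = directory.pop(name)      # the remove must happen
    if version > version_seen_by_remover:
        # The remover never saw this update, so it must not disappear.
        orphanage[name] = (version, data)


directory = {"Saturn": (2, "updated at replica 1")}
orphanage = {}
apply_remove(directory, orphanage, "Saturn", version_seen_by_remover=1)
```

If the remover had already seen version 2, the file would simply be removed and nothing would go to the orphanage.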
![Page 36: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/36.jpg)
Removals and Hard Links
Unix and Ficus support hard links
Effectively, multiple names for a file
Cannot remove a file’s bits until the last hard link to the file is removed
Tricky in a distributed system
![Page 37: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/37.jpg)
Link Example
[Diagram: Replica 1 and Replica 2 each hold foodir with entries red and blue]
![Page 38: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/38.jpg)
Link Example, Part II
[Diagram: Replica 1 and Replica 2 each hold foodir with entries red and blue; annotation: update blue]
![Page 39: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/39.jpg)
Link Example, Part III
[Diagram: Replica 1 and Replica 2 each hold foodir with entries red and blue, plus bardir; annotations: delete blue; create hard link in bardir to blue]
![Page 40: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/40.jpg)
What Should Happen Here?
Clearly, the link named foodir/blue should disappear
And the link in bardir should appear
But what version of the data should the bardir link point to?
No-lost-updates semantics say it must be the update at replica 1
![Page 41: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/41.jpg)
Garbage Collection in Ficus
Ficus cannot throw away removed things at once
Directory entries
Updated files for no-lost-updates
Non-updated files due to hard links
When can Ficus reclaim the space these use?
![Page 42: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/42.jpg)
When Can I Throw Away My Data
Not until all links to the file disappear
Global information, not local
Moreover, just because I know all links have disappeared doesn’t mean I can throw everything away
Must wait till everyone knows
Requires two trips around the ring
![Page 43: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/43.jpg)
Why Can’t I Forget When I Know There Are No Links
I can throw the data away
I don’t need it, nobody else does either
But I can’t forget that I knew this
Because not everyone knows it
For them to throw their data away, they must learn
So I must remember for their benefit
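The need for two trips around the ring can be seen in a toy simulation. It assumes each replica tracks the set of replicas it knows have seen the removal, and may forget its record only once that set covers everyone; `ring_pass` and `can_forget` are hypothetical names, and this is not the actual Ficus algorithm.

```python
def ring_pass(known):
    """One trip around the ring: each replica pulls from its predecessor."""
    n = len(known)
    for i in range(1, n + 1):
        src, dst = (i - 1) % n, i % n
        if known[src]:
            # dst learns of the removal, and learns who else knows about it.
            known[dst] |= known[src] | {dst}


def can_forget(known):
    """A replica may forget once it knows that every replica knows."""
    return [len(k) == len(known) for k in known]


known = [{0}, set(), set(), set()]   # the removal is first seen at replica 0
ring_pass(known)
after_one_trip = can_forget(known)
ring_pass(known)
after_two_trips = can_forget(known)
```

After the first trip everyone knows about the removal, but replicas 1 and 2 do not yet know that everyone knows; the second trip spreads that fact.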
![Page 44: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/44.jpg)
Coda
A different approach to optimistic replication
Inherits a lot from Andrew
Basically, a client/server solution
Developed at CMU
![Page 45: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/45.jpg)
Coda Replication Model
Files stored permanently at server machines
Client workstations download temporary replicas, not cached copies
Can perform updates without getting token from the server
So concurrent updates possible
![Page 46: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/46.jpg)
Detecting Concurrent Updates
Workstation replicas only reconcile with their server
At recon time, they compare their state of files with server’s state
Detecting any problems
Since workstations don’t gossip, detection is easier than in Ficus
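Because each workstation compares only against its server, detection can be sketched with a single version number per file. The data layout below (a `base` version remembered at download time and a `dirty` flag) and the name `reconcile_with_server` are assumptions for illustration; the slides do not describe the actual Coda mechanism.

```python
def reconcile_with_server(server, client):
    """Push client updates to the server, returning names that conflict.

    server: name -> (version, data)
    client: name -> {"base": version at download, "data": ..., "dirty": bool}
    """
    conflicts = []
    for name, entry in sorted(client.items()):
        if not entry["dirty"]:
            continue
        server_version, _ = server[name]
        if server_version != entry["base"]:
            conflicts.append(name)      # someone else updated concurrently
        else:
            server[name] = (server_version + 1, entry["data"])
            entry["base"], entry["dirty"] = server_version + 1, False
    return conflicts


server = {"a.txt": (3, "server copy"), "b.txt": (5, "changed by someone else")}
client = {
    "a.txt": {"base": 3, "data": "client edit", "dirty": True},
    "b.txt": {"base": 4, "data": "another client edit", "dirty": True},
}
conflicts = reconcile_with_server(server, client)
```

Conflicting files are left untouched on the server so that a resolver can handle them.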
![Page 47: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/47.jpg)
Handling Concurrent Updates
Basic strategy is similar to Ficus’
Resolver programs are called to deal with conflicts
Coda allows resolvers to deal with multiple related conflicts at once
Also has some other refinements to conflict resolution
![Page 48: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/48.jpg)
Server Replication in Coda
Unlike Andrew, writable copies of a file can be stored at multiple servers
Servers have peer-to-peer replication
Servers have strong connectivity, crash infrequently
Thus, Coda uses simpler peer-to-peer algorithms than Ficus must
![Page 49: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/49.jpg)
Why Is Coda Better Than AFS?
Writes don’t lock the file
Writes happen quicker
More local autonomy
Less write traffic on the network
Workstations can be disconnected
Better load sharing among servers
![Page 50: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/50.jpg)
Comparing Coda to Ficus
Coda uses simpler algorithms
Less likely to be bugs
Less likely to be performance problems
Coda doesn’t allow client gossiping
Coda has built-in security
Coda garbage collection simpler
![Page 51: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/51.jpg)
Serverless Network File Systems
New network technologies are much faster, with much higher bandwidth
In some cases, going over the net is quicker than going to local disk
How can we improve file systems by taking advantage of this change?
![Page 52: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/52.jpg)
Fundamental Ideas of Serverless File Systems
Peer workstations providing file service for each other
High degree of location independence
Make use of all machines’ caches
Provide reliability in case of failures
![Page 53: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/53.jpg)
xFS
Serverless file system project at Berkeley
Inherits ideas from several sources
LFS
Zebra (RAID-like ideas)
Multiprocessor cache consistency
Built for Network of Workstations (NOW) environment
![Page 54: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/54.jpg)
What Does a File Server Do?
Stores file data blocks on its disks
Maintains file location information
Maintains cache of data blocks
Manages cache consistency for its clients
![Page 55: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/55.jpg)
xFS Must Provide These Services
In essence, every machine takes on some of the server’s responsibilities
Any data or metadata might be located at any machine
Key challenge is providing, in a distributed system, the same services a centralized server provided
![Page 56: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/56.jpg)
Key xFS Concepts
Metadata manager
Stripe groups for data storage
Cooperative caching
Distributed cleaning processes
![Page 57: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/57.jpg)
How Do I Locate a File in xFS?
I’ve got a file name, but where is it?
Assuming it’s not locally cached
File’s directory converts name to a unique index number
Consult the metadata manager to find out where the file with that index number is stored; the manager map says which manager to ask
![Page 58: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/58.jpg)
The Manager Map
Data structure that allows translation of index numbers to file managers
Not necessarily file locations
Kept by each metadata manager
Globally replicated data structure
Simply says what machine manages the file
![Page 59: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/59.jpg)
Using the Manager Map
Look up index number in local map
Index numbers are clustered, so many fewer entries than files
Send request to responsible manager
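A manager map lookup might look like the sketch below. The group size, the machine names, and the way an index number is mapped to a group are all assumptions chosen for illustration, not the real xFS layout.

```python
GROUP_SIZE = 1024  # hypothetical: index numbers per cluster

# Hypothetical globally replicated map: group number -> managing machine.
manager_map = ["ws0", "ws1", "ws2", "ws0"]


def manager_for(index_number):
    """Return the machine that manages the file with this index number."""
    group = index_number // GROUP_SIZE   # clustering keeps the map small
    return manager_map[group % len(manager_map)]


print(manager_for(5))
print(manager_for(1024))
```

Because every machine holds a copy of the map, this lookup needs no network traffic; only the follow-up request to the manager does.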
![Page 60: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/60.jpg)
What Does the Manager Do?
Manager keeps two types of information
1. imap information
2. caching information
If some other site has the file in its cache, tell requester to go to that site
Always use cache before disk
Even if cache is remote
![Page 61: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/61.jpg)
What if No One Caches the Block?
Metadata manager for this file then must consult its imap
Imap tells which disks store the data block
Files are striped across disks stored on multiple machines
Typically single block is on one disk
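The manager's decision, cache first and disk only as a fallback, can be sketched as follows. The dictionaries `cache_info` and `imap`, the block and machine names, and the function `locate_block` are hypothetical stand-ins for the manager's two kinds of information.

```python
def locate_block(block_id, cache_info, imap):
    """Decide where a requester should fetch a block from."""
    cachers = cache_info.get(block_id)
    if cachers:
        # Always use a cache before disk, even if the cache is remote.
        return ("cache", sorted(cachers)[0])
    # No one caches the block: the imap says which disk stores it.
    return ("disk", imap[block_id])


cache_info = {"blk7": {"ws3", "ws1"}}          # block -> sites caching it
imap = {"blk7": "disk_on_ws2", "blk9": "disk_on_ws4"}

print(locate_block("blk7", cache_info, imap))
print(locate_block("blk9", cache_info, imap))
```

Choosing the first caching site alphabetically is arbitrary here; a real manager could prefer any site that holds the block.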
![Page 62: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/62.jpg)
Writing Data
xFS uses RAID-like methods to store data
RAID sucks for small writes
So xFS avoids small writes
By using LFS-style operations
Batch writes until you have a full stripe’s worth
![Page 63: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/63.jpg)
Stripe Groups
Set of disks that cooperatively store data in RAID fashion
xFS uses single parity disk
Alternative to striping all data across all disks
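The idea of batching writes into full stripes with a single parity block can be sketched like this. Block size, stripe width, and the class name `StripeWriter` are illustrative assumptions; XOR parity is the standard single-parity RAID technique, not a detail taken from the slides.

```python
BLOCK_SIZE = 4    # bytes per block; tiny for illustration
DATA_BLOCKS = 3   # data blocks per stripe, plus one parity block


def xor_blocks(blocks):
    out = bytes(BLOCK_SIZE)
    for block in blocks:
        out = bytes(x ^ y for x, y in zip(out, block))
    return out


class StripeWriter:
    """Buffers writes until a full stripe's worth is available."""

    def __init__(self):
        self.pending = []   # blocks waiting for a full stripe
        self.stripes = []   # each stripe: data blocks followed by parity

    def write(self, block):
        assert len(block) == BLOCK_SIZE
        self.pending.append(block)
        if len(self.pending) == DATA_BLOCKS:
            self.stripes.append(self.pending + [xor_blocks(self.pending)])
            self.pending = []


def recover(stripe, lost_index):
    """Rebuild one lost block by XOR-ing the surviving blocks."""
    return xor_blocks(b for i, b in enumerate(stripe) if i != lost_index)


writer = StripeWriter()
for block in (b"aaaa", b"bbbb", b"cccc", b"dddd"):
    writer.write(block)
```

Writing whole stripes means parity is computed once from data already in memory, avoiding the read-modify-write cycle that makes small RAID writes slow.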
![Page 64: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/64.jpg)
Cooperative Caching
Each site’s cache can service requests from all other sites
Working from assumption that network access is quicker than disk access
Metadata managers used to keep track of where data is cached
So remote cache access takes 3 network hops
![Page 65: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/65.jpg)
Getting a Block from a Remote Cache
[Diagram: a Client (with its Manager Map), a Metadata Server (with Cache Consistency State), and a Caching Site (with its Unix Cache); the block request takes three numbered hops, 1, 2, and 3]
![Page 66: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/66.jpg)
Providing Cache Consistency
Per-block token consistency
To write a block, client requests token from metadata server
Metadata server retrieves token from whoever has it
And invalidates other caches
Writing site keeps token
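A toy model of per-block write tokens, seen from the metadata server's side. The class name `BlockManager` and its bookkeeping are assumptions for illustration; real message passing and failure handling are left out.

```python
class BlockManager:
    """Tracks, per block, who caches it and who holds the write token."""

    def __init__(self):
        self.token_holder = {}   # block -> site holding the write token
        self.cachers = {}        # block -> set of sites caching the block

    def read(self, block, site):
        self.cachers.setdefault(block, set()).add(site)

    def acquire_write(self, block, site):
        """Give `site` the token; return the sites whose copies are invalidated."""
        invalidated = self.cachers.get(block, set()) - {site}
        self.cachers[block] = {site}        # other caches are invalidated
        self.token_holder[block] = site     # writing site keeps the token
        return sorted(invalidated)


manager = BlockManager()
manager.read("b1", "X")
manager.read("b1", "Y")
first = manager.acquire_write("b1", "Z")
second = manager.acquire_write("b1", "X")
```

Since the writer keeps the token, repeated writes by the same site need no further traffic until some other site asks for the block.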
![Page 67: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/67.jpg)
Which Sites Should Manage Which Files?
Could randomly assign equal number of file index groups to each site
Better if the site using a file also manages it
In particular, if most frequent writer manages it
Can reduce network traffic by ~50%
![Page 68: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/68.jpg)
Cleaning Up
File data (and metadata) is stored in log structures spread across machines
A distributed cleaning method is required
Each machine stores info on its usage of stripe groups
Each cleans up its own mess
![Page 69: Distributed FS, Continued Andy Wang COP 5611 Advanced Operating Systems](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bfe01a28abf838cb34e1/html5/thumbnails/69.jpg)
Basic Performance Results
Early results from incomplete system
Can provide up to 10 times the file data bandwidth of a single NFS server
Even better on creating small files
Doesn’t compare xFS to multimachine servers