The Ethernet Approach to Grid Computing
![Page 1: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/1.jpg)
The Ethernet Approach to Grid Computing
Douglas Thain and Miron Livny
Condor Project, University of Wisconsin
http://www.cs.wisc.edu/condor/ftsh
![Page 2: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/2.jpg)
The UW US-CMS Physics Grid
[Diagram: software components, with implementation language in parentheses]
Gridmanager (C++)
Condor-G (C++)
Gatekeeper (C)
DAGMan (C++)
GAHP Server (C++)
Batch Interface (bash)
Impala wrapper (bash)
Actual Job (Fortran)
Jobmanager (C)
Batch System (???)
MOP wrapper (bash)
Submit DAG (perl)
MOP (python)
Impala (bash)
MCRunJob (python)
Wrapper
globus-url-copy (C)
![Page 3: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/3.jpg)
try for 30 minutes
    ...
end

Outline
• Two problems in real systems:
  – Timing is uncontrollable.
  – Failures lack detail.
• A solution:
  – The Ethernet Approach.
• A language and a tool:
  – The Fault Tolerant Shell.
  – Time and failures are explicit.
• Example Applications:
  – Shared Job Queue.
  – Shared Disk Buffer.
  – Shared Data Servers.

Ethernet: Carrier Sense, Collision Detect, Exponential Backoff, Limited Allocation

[Diagram: five clients, two WWW servers each holding a copy of the dataset, and one black hole]
![Page 4: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/4.jpg)
1 - Timing is Uncontrollable
• Consider a distributed file system.
• Suppose that the network is down.
  – “Soft mounted”: failure after one minute.
  – “Hard mounted”: failure never exposed.
• Time is an unknown in nearly every operating system activity:
  – Process invocation.
  – Memory access.
  – Network communications.
![Page 5: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/5.jpg)
2 - Failures Lack Detail
• Consider this trivial program:
    % cp a b
• We would like to distinguish:
  – “success.”
  – “file not found.”
  – “nfs server down, still trying.”
  – “couldn’t find library libc.so.25.”
![Page 6: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/6.jpg)
2 - Failures Lack Detail
• Consider this trivial program:
    % cp a b
• Actual results:
  – “success.” (exit code 0)
  – “file not found.” (exit code 1)
  – “nfs server down, still trying.” (exit code 1)
  – “couldn’t find library libc.so.25.” (exit code 1)
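A minimal Python sketch can make the loss of detail concrete. The three child programs below are hypothetical stand-ins (they are not from the slides) for three unrelated failure causes. Each writes a different message to stderr, yet a caller that looks only at the exit code sees the same value for all of them.

```python
import subprocess
import sys

# Hypothetical stand-ins for three different failure causes. Each child
# prints a distinct message to stderr but exits with the same status, 1.
FAILURES = {
    "file not found": "import sys; sys.stderr.write('no such file\\n'); sys.exit(1)",
    "server down": "import sys; sys.stderr.write('server not responding\\n'); sys.exit(1)",
    "missing library": "import sys; sys.stderr.write('cannot load library\\n'); sys.exit(1)",
}

def exit_codes():
    """Run each stand-in and keep only what a shell caller sees: the exit code."""
    codes = {}
    for cause, program in FAILURES.items():
        result = subprocess.run([sys.executable, "-c", program],
                                capture_output=True, text=True)
        codes[cause] = result.returncode
    return codes

codes = exit_codes()
for cause, code in codes.items():
    print(f"{cause}: exit code {code}")
```

Distinguishing the causes would require parsing the free-form stderr text, which is exactly the kind of fragile coupling the slide warns about.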
![Page 7: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/7.jpg)
Examples Abound!
• TCP connect -> ECONNREFUSED
  – Wrong port number.
  – A loaded service is rejecting connections.
  – The machine has just rebooted, has initialized TCP/IP, but not yet started the service.
• FTP RETR -> code 550
  – “550 File or directory not found.”
  – “550 Erlaubnis hat verweigert.” (German, roughly “permission denied”)
  – “550 Archiveer systeem offline.” (Dutch, roughly “archive system offline”)
  – “550 Fuori di memoria.” (Italian, roughly “out of memory”)
  – “550 File staging in from tape.” (NCSA Unitree)
![Page 8: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/8.jpg)
Not enough information or control.
Real systems have these problems. How can we learn to live with them? “Ethernet Approach,” HPDC 2003
How do we design new systems that avoid these problems? “Error Scope,” HPDC 2002
![Page 9: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/9.jpg)
The Ethernet Approach
Network or Memory or Disk Space or OS Resources
Ethernet Rules: Carrier Sense, Collision Detect, Exponential Backoff, Limited Allocation
No Carrier Sense == Aloha Protocol
![Page 10: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/10.jpg)
The Fault Tolerant Shell
• A tool that encourages the Ethernet approach in system integration.
  – Similar to the Bourne or C shells.
  – Process invocation and repetition are simple.
  – Other elements are possible but ugly.
• Not meant to be general purpose, high performance, or abstractly beautiful.
  – Not OOP, AOP, SOP, GP, etc.
  – Ethernet ideas could be used in such languages.
• Elements:
  – Brittle property, try/catch, timed try, forany/forall.
![Page 11: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/11.jpg)
The Brittle Property
wget http://host/file.tar.gz
gunzip file.tar.gz
tar xvf file.tar
Failure of any step causes an immediate halt of the entire group.
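The brittle property can be sketched in Python as a loop that stops at the first non-zero exit status. This is only an illustration of the idea, not how ftsh is implemented; `run_brittle` and the two stand-in commands are hypothetical names introduced here.

```python
import subprocess
import sys

def run_brittle(commands):
    """Run commands in order; stop at the first non-zero exit.

    Returns (succeeded, number_of_commands_started).
    """
    started = 0
    for argv in commands:
        started += 1
        if subprocess.run(argv).returncode != 0:
            return False, started   # halt the whole group immediately
    return True, started

# Hypothetical stand-ins for the wget / gunzip / tar steps.
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(1)"]

result = run_brittle([ok, bad, ok])
print(result)  # (False, 2): the third command is never started
```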
![Page 12: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/12.jpg)
Untyped Exceptions
try
    wget http://host/file.tar.gz
    gunzip file.tar.gz
    tar xvf file.tar
catch
    echo "Zoiks!"
end
Exceptions have no type!
Failure of this group raises an exception.
![Page 13: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/13.jpg)
Timed Try Statements
try for 30 minutes
    wget http://host/file.tar.gz
    gunzip file.tar.gz
    tar xvf file.tar
end
An exception in the enclosed statements causes a retry, with exponential backoff, for up to 30 minutes.
The enclosed statements will be cancelled after 30 minutes.
Success after n attempts is as good as success after one. (Otherwise, failure.)
![Page 14: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/14.jpg)
Timed Try Statements
• If group completes within time limit.
  – Try block succeeds.
• If group fails within time limit.
  – Automatically retried.
  – Exponentially increasing delay.
  – Random factor to avoid collisions.
• If group runs over time limit.
  – Resources reclaimed, exception thrown.
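The retry rules for a timed try can be sketched in Python as follows. The function name `try_for`, the delay parameters, and the doubling-with-jitter formula are assumptions made for illustration; the slides do not give ftsh's actual constants. Unlike ftsh, this sketch does not cancel an attempt that is still running when the deadline passes; it only stops starting new ones.

```python
import random
import time

class TryTimeout(Exception):
    """Raised when the time budget is exhausted without a success."""

def try_for(seconds, action, base_delay=0.01, max_delay=1.0):
    """Retry action() until it succeeds or the time budget runs out.

    Between attempts, sleep for an exponentially growing delay scaled by a
    random factor so that competing clients do not retry in lockstep.
    """
    deadline = time.monotonic() + seconds
    delay = base_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            return action(), attempts
        except Exception:
            pass  # untyped failure: any exception simply means "try again"
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TryTimeout(f"gave up after {attempts} attempts")
        # Random factor in [1, 2) applied to the current delay.
        time.sleep(min(delay * (1 + random.random()), remaining))
        delay = min(delay * 2, max_delay)

# Usage: a hypothetical action that fails twice and then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

value, attempts = try_for(5, flaky)
print(value, attempts)
```

The caller sees a single outcome, success or `TryTimeout`, no matter how many attempts were needed, which mirrors "success after n is as good as success after one."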
![Page 15: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/15.jpg)
forany and forall
forany host in xxx yyy zzz
    wget http://${host}/file
end
Attempt to make this statement succeed for any random branch.

forall host in xxx yyy zzz
    wget http://${host}/file
end
Attempt to make this statement succeed for all branches simultaneously.
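The two iteration forms can be approximated in Python as below. This is a sketch under assumptions: `forany` and `forall` here are hypothetical helper functions, `fetch` is a made-up stand-in for the `wget` call, and this `forall` waits for every branch to finish rather than cancelling siblings when one fails.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def forany(items, action):
    """Try the items in random order; return the first item for which action succeeds."""
    candidates = list(items)
    random.shuffle(candidates)
    for item in candidates:
        try:
            action(item)
            return item
        except Exception:
            continue  # this branch failed; move on to another one
    raise RuntimeError("forany: every branch failed")

def forall(items, action):
    """Run action on every item concurrently; fail if any branch fails."""
    items = list(items)
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        futures = [pool.submit(action, item) for item in items]
        # result() re-raises the exception of a failed branch.
        return [future.result() for future in futures]

# Hypothetical fetch: only host "yyy" is healthy.
def fetch(host):
    if host != "yyy":
        raise ConnectionError(f"{host} unavailable")
    return f"data from {host}"

chosen = forany(["xxx", "yyy", "zzz"], fetch)
print(chosen)
```

Shuffling the candidates spreads clients across the available hosts, so a single bad host does not become everyone's first choice.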
![Page 16: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/16.jpg)
Example Applications

| Ethernet Property | Job Queue | Disk Buffer | Data Servers | Handled by |
|---|---|---|---|---|
| Collision Detect | failed cmd | failed cmd | failed cmd | ftsh |
| Exp Backoff | “try” backoff | “try” backoff | “try” backoff | ftsh |
| Limited Allocation | “try” timeout | “try” timeout | “try” timeout | ftsh |
| Carrier Sense | File Descriptors | Estimated Free Space | Short Active Probe | coder |
![Page 17: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/17.jpg)
Shared Job Queue
[Diagram: Clients, Condor schedd, Job Queue, Activity Log, Local Filesystem, MatchMaker, CPUs]
Multiple clients connect to a job queue to manipulate jobs (submit, query, remove, etc.). What’s the bottleneck?
![Page 18: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/18.jpg)
Aloha Client
try for 5 minutes
    condor_submit job.file
end
![Page 19: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/19.jpg)
Ethernet Client
try for 5 minutes
    if avail_fds() .lt. 1000
        failure
    end
    condor_submit job.file
end
Measure free file descriptors.
Throw an exception and try again.
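Carrier sense for the job queue amounts to measuring the shared resource before using it. The Python sketch below shows only that gate; the probe functions, the `submit` action, and the `Busy` exception are hypothetical stand-ins, and the threshold of 1000 is taken from the slide. In practice the call would sit inside a retry loop with backoff.

```python
class Busy(Exception):
    """Raised when the shared resource looks too loaded to use right now."""

def submit_with_carrier_sense(avail_fds, submit, threshold=1000):
    """Sense the resource first; only submit if enough descriptors are free."""
    if avail_fds() < threshold:
        raise Busy("too few free file descriptors; back off and retry")
    return submit()

# Hypothetical submit action, for illustration only.
submitted = []

def submit():
    submitted.append("job.file")
    return "submitted"

outcome_loaded = None
try:
    submit_with_carrier_sense(lambda: 200, submit)   # loaded server
except Busy:
    outcome_loaded = "deferred"

outcome_idle = submit_with_carrier_sense(lambda: 5000, submit)  # idle server
print(outcome_loaded, outcome_idle, submitted)
```

The deferred client never touches the queue, so it adds no load at the moment the server can least afford it.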
![Page 20: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/20.jpg)
![Page 21: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/21.jpg)
Shared Disk Buffer
[Diagram: Jobs 8, 9, and 10 write into a buffer on the local file system holding d4.c, d5.c, d6.c, d7.c, d8.i, d9.i, and d10.i; a Data Mover empties it]
Step A: Arbitrate
Step B: Write
Step C: Commit
Step D: Read
Step E: Send
Step F: Delete
Multiple batch jobs share an output buffer. Jobs write output files, and a mover pushes them out.
![Page 22: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/22.jpg)
Aloha Client
try for 30 minutes
    try
        run-job > d$n.i
        mv d$n.i d$n.c
    catch
        rm -f d$n.i
    end
end
Create the file, marked “incomplete.”
Atomically commit the file.
Remove the file on any failure.
![Page 23: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/23.jpg)
Ethernet Client
try for 30 minutes
    if overcommitted()
        failure
    end
    try
        run-job > d$n.i
        mv d$n.i d$n.c
    catch
        rm -f d$n.i
    end
end
Buffer is overcommitted if estimated needs exceed available space.
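The write, commit, and clean-up steps can be sketched in Python as follows. The helper names are hypothetical, and this `overcommitted` is a simplification: it compares one job's estimated output against current free space, whereas the slide's check presumably accounts for the estimated needs of all jobs sharing the buffer.

```python
import os
import shutil
import tempfile

def overcommitted(buffer_dir, estimated_bytes):
    """Carrier sense: would this output exceed the free space in the buffer?"""
    return estimated_bytes > shutil.disk_usage(buffer_dir).free

def write_output(buffer_dir, n, produce, estimated_bytes):
    """Write d<n>.i, then atomically rename it to d<n>.c; clean up on failure."""
    if overcommitted(buffer_dir, estimated_bytes):
        raise OSError("buffer overcommitted; back off and retry")
    incomplete = os.path.join(buffer_dir, f"d{n}.i")
    complete = os.path.join(buffer_dir, f"d{n}.c")
    try:
        with open(incomplete, "wb") as out:
            out.write(produce())
        os.replace(incomplete, complete)   # atomic within one file system
    except Exception:
        if os.path.exists(incomplete):
            os.remove(incomplete)          # never leave a partial file behind
        raise
    return complete

# Usage with a throwaway directory standing in for the shared buffer.
with tempfile.TemporaryDirectory() as buf:
    write_output(buf, 8, lambda: b"result", estimated_bytes=6)
    names = sorted(os.listdir(buf))
print(names)
```

Because the mover only ever reads files with the `.c` suffix, it can never pick up a half-written output.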
![Page 24: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/24.jpg)
![Page 25: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/25.jpg)
Shared Data Servers
[Diagram: five clients, two WWW servers each holding a copy of the dataset, and one black hole]
Black hole: accepts all connections and holds them idle indefinitely.
A healthy but loaded server might also have a high response time.
Each client wants one instance of the data set, but doesn’t care which one. How to deal with delays and failures?
![Page 26: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/26.jpg)
Aloha Client
try for 15 minutes
    forany host in xxx yyy zzz
        try for 1 minute
            wget http://${host}/data
        end
    end
end
![Page 27: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/27.jpg)
Ethernet Client
try for 15 minutes
    forany host in xxx yyy zzz
        try for 5 seconds
            wget http://${host}/tiny
        end
        try for 1 minute
            wget http://${host}/data
        end
    end
end
Test the server by fetching a tiny file.
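The short active probe can be sketched in Python with a small local demo. Everything below the divider is an assumption made for illustration: a tiny HTTP server plays the healthy host, and a listening socket that never answers plays the black hole. The timeouts are shortened so the demo finishes quickly, and `fetch_from_any` is a hypothetical helper, not part of ftsh.

```python
import random
import socket
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Bypass any configured proxy so the loopback demo talks to the servers directly.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def fetch(url, timeout):
    """Fetch a URL, giving up if a socket operation stalls longer than timeout."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()

def fetch_from_any(hosts, probe_timeout=0.5, data_timeout=5.0):
    """Probe each host with a tiny fetch before committing to the real one."""
    candidates = list(hosts)
    random.shuffle(candidates)
    for host in candidates:
        try:
            fetch(f"http://{host}/tiny", probe_timeout)   # carrier sense
            return host, fetch(f"http://{host}/data", data_timeout)
        except OSError:
            continue  # refused, timed out, or HTTP error: try another host
    raise OSError("no host answered")

# --- Local demo with hypothetical servers ---
class Healthy(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok" if self.path == "/tiny" else b"dataset"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

good = HTTPServer(("127.0.0.1", 0), Healthy)
threading.Thread(target=good.serve_forever, daemon=True).start()

# A black hole: the TCP handshake completes, but nobody ever replies.
hole = socket.socket()
hole.bind(("127.0.0.1", 0))
hole.listen(5)

good_host = f"127.0.0.1:{good.server_address[1]}"
hole_host = f"127.0.0.1:{hole.getsockname()[1]}"
host, data = fetch_from_any([hole_host, good_host])
print(host == good_host, data)

good.shutdown()
good.server_close()
hole.close()
```

A client that lands on the black hole loses only the short probe timeout rather than the full data timeout, which is the benefit the slide attributes to the tiny fetch.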
![Page 28: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/28.jpg)
All Clients Blocked on Black Hole
![Page 29: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/29.jpg)
Some Thoughts
• This is a necessary technique for real problems.
  – Timing is uncontrollable; failures lack detail.
  – A simple technique has significant payoff.
• The Ethernet approach is not always ideal.
  – Carefully chosen errnos are powerful.
  – Designing errnos is tricky.
• Requires clients of good will.
  – Some scenarios require external coordination.
  – Admission control for admission control?
• Time and failure are first-class concerns.
  – They should be first-class elements of languages!
  – We get good mileage without complex constructions.
• More info at:
  – http://www.cs.wisc.edu/condor/ftsh
![Page 30: The Ethernet Approach to Grid Computing](https://reader036.vdocument.in/reader036/viewer/2022062520/56815d8a550346895dcb965b/html5/thumbnails/30.jpg)
Computing’s central challenge, “How not to make a mess of it,” has not yet been met.
- Edsger Dijkstra