Open Science Grid: More compute power
Alan De Smet [email protected]
chtc.cs.wisc.edu
CHTC Cores In Use
1,500 (CPU days each day, averaged over one month)
OSG Cores In Use
60,000 (CPU days each day, averaged over one month)
Open Science Grid
CHTC and OSG usage
(CPU days each day)
Challenges Solved
We worry about all of this.
You don’t have to.
› Authentication: X.509 certificates, certificate authorities, VOMS
› Interface: Globus, GridFTP, Grid universe
› Validation: Linux distribution, glibc version, basic libraries
Using OSG
› Before
universe = vanilla
executable = myjob
log = myjob.log
queue
Using OSG
› After
universe = vanilla
executable = myjob
log = myjob.log
+WantGlidein = true
queue
Challenge: Opportunistic
› OSG computers go away without notice
› Solutions
    Condor restarts automatically
    Sub-hour jobs
    Self-checkpointing
    Automated checkpointing
        • Condor’s standard universe
        • DMTCP: http://dmtcp.sourceforge.net/
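
Self-checkpointing means the job saves its own progress so that a restart picks up where it left off. A minimal sketch of the idea follows. The file name, the toy workload (summing squares), and the `stop_after` parameter that simulates an eviction are all assumptions made for illustration, and the sketch assumes the checkpoint file is still available when the job restarts; how it gets there is outside this example.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical file name


def load_checkpoint(path=CHECKPOINT):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "total": 0}


def save_checkpoint(state, path=CHECKPOINT):
    """Write to a temp file, then rename, so a kill mid-write leaves the old checkpoint intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(n_items, path=CHECKPOINT, every=100, stop_after=None):
    """Sum the squares of 0..n_items-1, resuming from the checkpoint if one exists.

    stop_after simulates the machine going away: the function returns None
    after processing that many items, without saving its in-memory state.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_item"] < n_items:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated eviction; work since the last save is lost
        i = state["next_item"]
        state["total"] += i * i
        state["next_item"] = i + 1
        done_this_run += 1
        if state["next_item"] % every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state["total"]
```

Calling `run(1000, stop_after=250)` and then `run(1000)` redoes only the items processed after the last saved checkpoint, not the whole job. The `every` interval trades checkpoint overhead against the amount of work lost on eviction.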
Challenge: Local Software
› Bare-bones Linux systems
› Solution
    Bring everything with you
    CHTC-provided MATLAB and R packages
        • RunDagEnv/mkdag
Challenge: Erratic Failures
› Complex systems fail sometimes
› Solution
    Expect failures and automatically retry
    DAGMan for retries
    DAGMan POST scripts to detect problems
        • RunDagEnv/mkdag
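
A POST script runs after a DAG node's job finishes, and its exit code decides whether DAGMan counts the node as succeeded or failed. The sketch below shows one way such a check could look. The "DONE" marker on the last line of the output, the argument order, and the function names are assumptions for illustration, not how RunDagEnv/mkdag does it.

```python
import os
import sys


def output_looks_good(path, marker="DONE"):
    """True if the file exists, is non-empty, and its last line is the marker."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path) as f:
        lines = f.read().splitlines()
    return bool(lines) and lines[-1].strip() == marker


def main(argv):
    """argv[1]: the job's exit code (DAGMan's $RETURN); argv[2]: the output file to check.

    Returns 0 for success, 1 for a failed job or bad output, 2 for bad usage.
    """
    if len(argv) != 3:
        return 2
    try:
        job_exit = int(argv[1])
    except ValueError:
        return 2
    if job_exit != 0:
        return 1  # the job itself reported failure
    return 0 if output_looks_good(argv[2]) else 1


# When used as a real POST script, end the file with:
#     sys.exit(main(sys.argv))
```

In a DAG file this might be wired up with lines such as `RETRY A 3` and `SCRIPT POST A check_output.py $RETURN A.out`, where the node name and file names are hypothetical. A non-zero exit from the POST script marks the node as failed, which is what triggers the retry.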
Challenge: Bandwidth
› Solutions
    Only send what you need
    Store large, shared files in our web cache
    Read small amounts of data on the fly
        • Condor’s standard universe
        • Parrot: http://www.cse.nd.edu/~ccl/software/parrot/
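
Reading small amounts of data on the fly means the job touches only the bytes it needs instead of pulling in a whole file. A minimal sketch of that access pattern, using ordinary seek and read calls, is below. The fixed-size record layout and the function name are assumptions for illustration; whether the bytes come from a local disk or a remote I/O layer is up to the environment the job runs in.

```python
def read_record(path, index, record_size):
    """Return record number `index` from a file of fixed-size records.

    Only `record_size` bytes are read; the rest of the file is never touched.
    """
    with open(path, "rb") as f:
        f.seek(index * record_size)
        data = f.read(record_size)
    if len(data) != record_size:
        raise IndexError("record %d is past the end of the file" % index)
    return data
```

For a file made of 4-byte records, `read_record(path, 1, 4)` returns bytes 4 through 7 and nothing else.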