Condor and GridShell
How to Execute 1 Million Jobs on the Teragrid
Jeffrey P. Gardner - PSC
Edward Walker - TACC
Miron Livny - U. Wisconsin
Todd Tannenbaum - U. Wisconsin
And many others!
Scientific Motivation
Astronomy is increasingly done using large surveys with 100s of millions of objects.
Analyzing large astronomical datasets frequently means performing the same analysis task on >100,000 objects.
Each object may take several hours of computing.
The amount of computing time required may vary, sometimes dramatically, from object to object.
Solution: PBS?
In theory, PBS should provide the answer: submit 100,000 single-processor PBS jobs.
In practice, this does not work:
Teragrid nodes are multiprocessor: only 1 PBS job per node.
Teragrid machines frequently restrict the number of jobs a single user may run.
Chad might get really mad if I submitted 100,000 PBS jobs!
Solution: mprun?
We could submit a single job that uses many processors.
Now we have a reasonable number of PBS jobs (Chad will now be happy).
Scheduling priority would reflect our actual resource usage.
This still has problems: each job takes a different amount of time to run, so we are using resources inefficiently.
The Real Solution: Condor+GridShell
The real solution is to submit one large PBS job, then use a private scheduler to manage serial work units within each PBS job.
We can even submit large PBS jobs to multiple Teragrid machines, then farm out serial work units as resources become available.
Vocabulary:
JOB: (n) a thing that is submitted via Globus or PBS
WORK UNIT: (n) an independent unit of work (usually serial), such as the analysis of a single astronomical object
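The work-unit idea can be sketched in plain shell. This is illustrative only: the talk's actual private scheduler is Condor. Here `xargs -P` plays the scheduler's role, handing the next work unit to whichever of a fixed number of slots frees up first, so variable per-object runtimes never leave a processor idle (unlike a static mprun-style split of the object list).

```shell
#!/bin/sh
# Illustrative sketch only; the talk's real private scheduler is Condor.
# xargs -P keeps NSLOTS work units running at once and starts the next
# unit the moment any running one finishes.

NSLOTS=4

# Feed 10 object IDs (work units) to at most NSLOTS concurrent analyses;
# the echo stands in for the real per-object analysis program.
seq 1 10 | xargs -n 1 -P "$NSLOTS" sh -c 'echo "object $0 done"'
```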
Condor Overview
Condor was first designed as a CPU cycle harvester for workstations sitting on people’s desks.
Condor is designed to schedule large numbers of jobs across a distributed, heterogeneous and dynamic set of computational resources.
Condor: The User Experience
1. User writes a simple Condor submit script:
# my_job.submit: A simple Condor submit script
Universe   = vanilla
Executable = my_program
Queue
2. User submits the job:
% condor_submit my_job.submit
Submitting job(s).
1 job(s) submitted to cluster 1.
Condor: The User Experience
3. User watches job run:
4. Job completes. User is happy.
% condor_q

-- Submitter: perdita.cs.wisc.edu : <128.105.165.34:1027> :
 ID      OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
 1.0     Jeff     6/16 06:52   0+00:01:21 R  0   0.0  my_program

1 jobs; 0 idle, 1 running, 0 held
%
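The one-line submit file above scales directly to the survey workload: Condor's `$(Process)` macro numbers each queued work unit, so a single submit file can describe all 100,000 analyses. A sketch (the executable name and argument convention are hypothetical):

```
# survey.submit: one Condor work unit per astronomical object
# (sketch; executable name and argument convention are hypothetical)
Universe   = vanilla
Executable = analyze_object
Arguments  = $(Process)          # object index 0 ... 99999
Output     = object_$(Process).out
Error      = object_$(Process).err
Log        = survey.log
Queue 100000
```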
Advantages of Condor
Condor user experience is simple.
Condor is flexible:
  Resources can be any mix of architectures.
  Resources do not need a common filesystem.
  Resources do not need common user accounting.
Condor is dynamic: resources can disappear and reappear.
Condor is fault-tolerant: jobs are automatically migrated to new resources if existing ones become unavailable.
Condor Daemons
condor_startd (runs on execution node): advertises specs and availability of the execution node (ClassAds); starts jobs on the execution node.
condor_schedd (runs on submit node): handles job submission; tracks job status.
condor_collector (runs on central manager): collects system information from execution nodes.
condor_negotiator (runs on central manager): matches schedd jobs to machines.
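Which daemons run on which host is controlled by the `DAEMON_LIST` knob in the Condor configuration. A sketch of the three roles (the host name is hypothetical, and this is not a tested TeraGrid configuration):

```
# condor_config sketch; illustrative only, not a tested configuration.
# CONDOR_HOST names the central manager that all daemons report to.
CONDOR_HOST = central-manager.example.edu    # hypothetical host

# Central manager runs the matchmaking daemons:
#   DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR
# Submission machine runs the job queue:
#   DAEMON_LIST = MASTER, SCHEDD
# Execution machine runs the job starter:
DAEMON_LIST = MASTER, STARTD
```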
Condor Daemon Layout
Central Manager
collector
negotiator
Submission Machine
schedd
Execution Machine
startd
Startd sends system specifications (ClassAds) and system status to Collector
Condor Daemon Layout
Central Manager
collector
negotiator
Submission Machine
schedd
Execution Machine
startd
Schedd sends job info to Negotiator
User submits Condor job
Condor Daemon Layout
Central Manager
collector
negotiator
Submission Machine
schedd
Execution Machine
startd
Negotiator uses information from Collector to match Schedd jobs to available Startds
Condor Daemon Layout
Central Manager
collector
negotiator
Submission Machine
schedd
Execution Machine
startd
Schedd sends job to Startd on assigned execution node
“Personal” Condor on a Teragrid Platform
Condor daemons can be run as a normal user.
Condor’s “GlideIn”™ capability can launch condor_startd’s on nodes from within an LSF or PBS job.
“Personal” Condor on a Teragrid Platform (Condor runs with normal user permissions)
Central Manager
collector
negotiator
Submission Machine
schedd
Execution PE
startd
Execution PE
startd
Execution PE
startd
Submission Machine (could be login node)
Login Node
PBS Job - GlideIn
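What the GlideIn PBS job does on each allocation can be sketched as a batch script like the following. The paths and resource requests are hypothetical; the real GlideIn machinery stages its own binaries and configuration.

```
#!/bin/sh
#PBS -l nodes=16
#PBS -l walltime=12:00:00
# GlideIn sketch (hypothetical paths): point Condor at a personal
# configuration naming the user's own collector, then run
# condor_master in the foreground so PBS tracks its lifetime;
# the master starts the condor_startd for this allocation.
export CONDOR_CONFIG=$HOME/glidein/etc/condor_config
exec $HOME/glidein/sbin/condor_master -f
```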
GridShell Overview
Allows users to interact with distributed grid computing resources from a simple shell-like interface.
Extends TCSH version 6.12 to incorporate grid-enabled features:
  parallel inter-script message-passing and synchronization
  output redirection to remote files
  parametric sweeps
GridShell Examples
Redirecting the standard output of a command to a remote file location using GlobusFTP:
  a.out > gsiftp://tg-login.ncsa.teragrid.org/data
Message passing between 2 parallel tasks:
  if ( $_GRID_TASKID == 0 ) then
    echo "hello" > task_1
  else
    set msg=`cat < task_0`
  endif
Executing 256 instances of a job:
  a.out on 256 procs
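The task-ID pattern above can be imitated in plain sh for illustration. This is a simplification: real GridShell runs the two tasks as separate parallel scripts and provides `$_GRID_TASKID` and the `task_N` channels itself, whereas here both "tasks" run sequentially in one process and a single temp file stands in for the channel.

```shell
#!/bin/sh
# Plain-sh imitation of the GridShell message-passing example
# (illustrative simplification: both tasks run sequentially here,
# and a single temp file stands in for GridShell's task_N channels).
channel=$(mktemp)

run_task() {
    _GRID_TASKID=$1
    if [ "$_GRID_TASKID" -eq 0 ]; then
        echo "hello" > "$channel"        # task 0 sends
    else
        msg=$(cat "$channel")            # task 1 receives
        echo "task 1 got: $msg"
    fi
}

run_task 0
result=$(run_task 1)
echo "$result"                           # prints "task 1 got: hello"
```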
Merging GridShell with Condor
Use GridShell to launch Condor GlideIn jobs at multiple grid sites
All Condor GlideIn jobs report back to a central collector
This converts the entire Teragrid into your own personal Condor pool!
Merging GridShell with Condor
Login Node
Gridshell event monitor
SDSC
PSC
NCSA
User starts GridShell Session at PSC
Merging GridShell with Condor
Login Node
Gridshell event monitor
Login Node
Gridshell event monitor
Login Node
Gridshell event monitor
SDSC
PSC
NCSA
GridShell session starts event monitor on remote login nodes via Globus
Merging GridShell with Condor
Login Node
collector
negotiator
schedd
Gridshell event monitor
Login Node
Gridshell event monitor
Login Node
Gridshell event monitor
SDSC
PSC
NCSA
Local event monitor starts condor daemons on login node
Login Node
collector
negotiator
schedd
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
SDSC
PSC
NCSA
All event monitors submit Condor GlideIn PBS jobs
Login Node
collector
negotiator
schedd
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
SDSC
PSC
NCSA
Condor startd’s tell collector that they have started
Login Node
collector
negotiator
schedd
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
Login Node
Gridshell event monitor
PBS Job
startd
startd
startd
startd
SDSC
PSC
NCSA
Condor schedd distributes independent work units to compute nodes
GridShell in a NutShell
Using GridShell coupled with Condor, one can easily harness the power of the Teragrid to process large numbers of independent work units.
Scheduling can be done dynamically from a central Condor queue to multiple grid sites as clusters of processors become available.
All of this fits into existing Teragrid software.