USING THE HUBZERO PLATFORM TO ENABLE REMOTE COMPUTING ON DIAGRID
HUBBUB 2015 - SEPTEMBER 15TH, 2015
Christopher Thompson
Rosen Center for Advanced Computing
Purdue University
OUTLINE
• Why compute remotely?
• Example Tools
  • SubmitR
  • BLASTer
• Review
HOW TO LOCAL COMPUTE
I’VE GOT AN OLD MACHINE AROUND HERE SOMEWHERE FOR THAT…
Gather Input → Set Params → Run Code → Review Output
Where?
Under desk?
Other users?
Slow!
“I swear it’s in the right directory!” ~ The Users
Now what?
What did I use yesterday?
Did X=4 or X=7?
• Some users are savvy, know about terminals and clusters.
• The standard user experience before the HUB for most, though…
“I think it will be done next Tuesday…”
WHY REMOTE COMPUTE?
DIAGRID REMOTE COMPUTES BECAUSE…
DiaGrid
• Purpose: low-barrier access to (idle) cycles through a HUB
• Started as a frontend to the “DiaGrid” HT Condor Pool
  • ~50,000 cores distributed across partner campuses
• Now handles tools with many job types, destinations
Scalability
• Computation
  • Always want more: resolution, complexity, etc.
• Storage
  • “Big Data”
• Not mutually exclusive!
  • More computation → Bigger data
  • Bigger data → More computation
Users bring remote resource needs
• Ex: XSEDE projects (Awarded X cpu hours at Y cluster)
DiaGrid Partner Institutions
• Purdue University
• University of Notre Dame
• University of Wisconsin
• Indiana State University
• University of Nebraska-Lincoln
• Indiana U. - Purdue U. Fort Wayne
• University of Louisville
• Purdue U. North Central
• Indiana University
• Purdue U. Calumet
HOW TO REMOTE COMPUTE
GET THOSE TOOLS ONLINE!
Luckily, HubZero gives us tool sessions & ‘submit’!
Local: Gather Input → Set Params → Run Code → Review Output
Remote: Upload Input to DiaGrid (Generate?) → Set Params → ‘submit’ Runs Code → Review Output
…and download… and share… and publish… and analyze…
Where? Under desk? Other users? Slow!
“Who cares where!?” ~ The Users
Fast!
“Let the tool figure out where it goes!” ~ The Users
“I swear it’s in the right directory!” ~ The Users
Now what? What did I use yesterday? Did X=4 or X=7?
Save settings for future sessions!
SUBMITR
R SCRIPT EXECUTION
SUBMITR
WHAT IS SUBMITR?
• GUI to upload and run R scripts
• Submit to HPC cluster @ Purdue
• Build argument lists
• Monitor running jobs
• Written in Python
• R used by:
  • Statisticians
  • Everyone!
SUBMITR
WORKFLOW
User writes R script
Upload script
Configure job params
Submit & run job
Download output
SUBMITR
SUBMIT JOB TYPES
• Serial
  • “Traditional” single core scripts
• Parallel
  • Multiple cores, nodes
  • Uses R parallel libraries (Snow)
• Parameter Sweeps
  • Define sweep variables in args
  • List ranges in SubmitR UI
  • ‘submit’ creates all the jobs
  • Ex: Monte Carlo simulations
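To make the parameter sweep idea concrete, here is a minimal Python sketch of how a sweep specification could expand into one argument set per job. It is an illustration only: the helper name expand_sweep and the reading of each bracket group (an inclusive integer range or a comma-separated list) are assumptions, and this is not how ‘submit’ itself is implemented.

```python
import itertools
import re


def expand_sweep(spec):
    """Expand a sweep spec such as "[1-3][a,b]" into argument tuples.

    Hypothetical helper: each [...] group is read either as an inclusive
    integer range "lo-hi" or as a comma-separated list of values.
    """
    groups = re.findall(r"\[([^\]]*)\]", spec)
    axes = []
    for group in groups:
        match = re.fullmatch(r"(-?\d+)-(-?\d+)", group.strip())
        if match:
            lo, hi = int(match.group(1)), int(match.group(2))
            axes.append([str(n) for n in range(lo, hi + 1)])
        else:
            axes.append([value.strip() for value in group.split(",")])
    # One tuple per job: the Cartesian product of all sweep axes.
    return list(itertools.product(*axes))
```

Under these assumptions, a spec with a 100-value range and a 3-value list expands to 300 argument sets, which is why the sweep is defined once in the UI rather than job by job.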
SUBMITR
ARCHITECTURE
[Diagram labels: http://www. · DiaGrid.org · Tool Session · SubmitR · submit · SSH · ?]
SUBMITR
SUBMIT COMMAND
R CMD BATCH -q "--args [1-100][a,b,c]" test1.R
Command layout: submit --debug | venue flags | list of job files (-i) | R command and args
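As a rough illustration of that layout, the Python sketch below assembles such a command string from its parts. The function signature is hypothetical, the venue flags are passed through untouched because their values are site-specific, and using one -i per job file is an assumption rather than something the slides state.

```python
import shlex


def build_submit(venue_flags, job_files, r_script, r_args, debug=True):
    """Assemble a 'submit ...' command string (illustrative only)."""
    parts = ["submit"]
    if debug:
        parts.append("--debug")
    parts.extend(venue_flags)        # caller-supplied, not interpreted here
    for path in job_files:
        parts.extend(["-i", path])   # assumption: one -i per job file
    # The R command and its arguments, in the form shown on the slide.
    parts.extend(["R", "CMD", "BATCH", "-q", "--args " + r_args, r_script])
    # Quote every part so spaces and brackets survive the shell.
    return " ".join(shlex.quote(part) for part in parts)
```

Quoting each part matters here because the command is later run through a shell, where unquoted brackets and spaces would be reinterpreted.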
SUBMITR
HOW TO CALL SUBMIT IN PYTHON
    import subprocess
    import sys
    import threading

    thread = threading.Thread(target=submit_thread)  # See function below
    thread.start()
    ..
    def submit_thread():
        cmd = build_submit()   # Assembles "submit..." string
        cmd = 'exec ' + cmd    # Prevents additional process (enables cleanup)
        # Execute submit, update UI with output while it is running
        ui.set_status('Submitting job...')
        ui.log('Job run started')
        line = '(initial)'     # Holds latest status update from submit
        lines = []             # Accumulates status updates
        try:
            sub = subprocess.Popen(
                cmd, shell=True,
                bufsize=1,                 # "line buffered"
                universal_newlines=True,   # text mode, so each line is a str
                stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                close_fds=True)
            while line:                    # readline() returns '' at end of stream
                line = sub.stdout.readline()
                lines.append(line)
                ui.set_status('Running...' + line)
            returncode = sub.wait()
        except:
            for obj in sys.exc_info():     # Explanatory text available?
                lines.append(str(obj))
        ui.log('Job run ended')
        ui.log('Output from submit:\n' + ''.join(lines) + '(end of output)')
Run the submit command in a function that is used as the body of a thread.
Create the ‘submit’ command as a string.
Send an update to the UI to tell the user the job submission process has started, and set up a string array to hold statuses.
1. Use subprocess.Popen to create a process for the ‘submit’ cmd.
2. Capture stdout & stderr from the process and use them to keep the user updated. Loop and read line by line.
3. When ‘submit’ is done, the stdout stream closes and the loop will exit. Store the process exit code.
Use a try-except block to capture any run errors.
Finally, update the UI again to show the job is done.
BLASTER
GENOME SEQUENCE SEARCHING
BLASTER
WHAT IS BLASTER?
• BLAST
  • Industry-standard bioinformatics tools developed by NCBI
  • Query gene sequence databases for similar sequences
  • Suite of command-line tools
• BLASTer
  • Java-based GUI for running BLAST
  • Manages multiple searches, history of searches
  • Handles different BLAST variants
  • Submits to Purdue HPC resources, hides all execution details from user
  • Maintains an up-to-date copy of the standard NCBI databases, or allows a custom DB uploaded by the user
BLASTER
WORKFLOW
User uploads FASTA input file (& optionally custom DB)
Select search parameters
Submit search job
Monitor search job progress
View/download output
BLASTER
SUBMIT JOB TYPES
• HT Condor (Pegasus)
  • First mode BLASTer supported
  • Input files broken into small pieces of <100 sequences, run in parallel
  • Each piece is a separate job sent to the HT Condor pool via ‘submit’ with Pegasus
  • BLASTer merges the results from each piece at the end
• However…
  • Standard DBs grew over time, some by as much as 2 orders of magnitude!
  • Jobs that ran in 5-10 min now run for hours…
• PBS
  • Now submits to HPC cluster at Purdue with PBS job handling
  • Supports longer runtimes: defined walltimes, no eviction from nodes like HT Condor
• ‘Submit’ allowed this change with very little effort
  • Most details hidden behind ‘submit’ and away from tool code
  • Changed some arguments, tweaked some output/status parsing
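The split-and-merge approach used in the HT Condor mode can be sketched in a few lines of Python. This is a hedged illustration, not BLASTer's actual code: the function names are hypothetical, the 99-sequence default simply reflects the "<100 sequences" figure above, and merging by plain concatenation is an assumption that may not hold for every BLAST output format.

```python
def split_fasta(text, max_per_chunk=99):
    """Split FASTA text into chunks of at most max_per_chunk sequences."""
    records, current = [], []
    for line in text.splitlines():
        # A header line (">...") starts a new record; flush the previous one.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    # Group whole records so no sequence is cut across two chunks.
    return [
        "\n".join(records[i:i + max_per_chunk]) + "\n"
        for i in range(0, len(records), max_per_chunk)
    ]


def merge_outputs(outputs):
    """Concatenate per-chunk outputs in chunk order (simplest possible merge)."""
    return "".join(outputs)
```

Each chunk can then be searched as an independent job, which is what makes the workload a good fit for a high-throughput pool.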
BLASTER
INTERNAL ARCHITECTURE
[Diagram labels: BLASTer · GUI Classes (Main Window, Job History Viewer, Job Config Panel, Job Control Panel) · Job Data Manager (History, Current) · Engine Manager (Condor Engine, PBS Engine, ??? Engine) · Submit · Pegasus · SSH · … · http://www. · numbered steps 1-7]
BLASTER
OVERALL ARCHITECTURE
[Diagram labels: http://www. · DiaGrid.org · Tool Session · BLASTer (GUI, Job Mngr, Engine1, Engine2, Engine3) · submit · Pegasus · Condor · DiaGrid Pool · PBS · HPC Clusters · SSH · ?]
BLASTER
SUBMIT COMMAND
Determines executable used by submit
Input file processed and sent by submit with job
Arguments of the BLAST executable appended to the submit command and sent with the job
BLASTER
HOW TO CALL SUBMIT IN JAVA – STARTING THE SUBMISSION THREAD
[Diagram: Engine Manager with Condor Engine, PBS Engine, ??? Engine]
    public synchronized JobStatus submitJob(Job job) {
        <a lot of error checking code…>

        // create status item for job submission & add to list of active jobs
        JobStatus queuedStatus = new JobStatus(QUEUED);
        job.setStatus(queuedStatus);
        this.jobs.put(jobId, job);

        // create a thread to handle the submission and run it
        SubmitPbsThread submitThread = new SubmitPbsThread(job);
        this.jobThreads.put(jobId, submitThread);
        submitThread.start();

        return queuedStatus;
    }
New Job
• When the user clicks Submit, a new job data object is created and sent to the engine manager, which routes it to the correct “engine” class for the job type.
• A thread is started to actually run the submit command.
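The routing step can be sketched as a small dispatch table. The Python below is an assumption-laden illustration of the idea, not BLASTer's Java code: the class names, the dictionary-based job object, and the returned status strings are all hypothetical.

```python
class Engine:
    """Base class: an engine knows how to submit one kind of job."""
    name = "base"

    def submit_job(self, job):
        raise NotImplementedError


class CondorEngine(Engine):
    name = "condor"

    def submit_job(self, job):
        return "QUEUED via condor: " + job["id"]


class PbsEngine(Engine):
    name = "pbs"

    def submit_job(self, job):
        return "QUEUED via pbs: " + job["id"]


class EngineManager:
    """Routes each job to the engine registered for its job type."""

    def __init__(self, engines):
        self._engines = {engine.name: engine for engine in engines}

    def submit_job(self, job):
        try:
            engine = self._engines[job["type"]]
        except KeyError:
            raise ValueError("no engine for job type %r" % job["type"])
        return engine.submit_job(job)
```

With this shape, supporting a new destination means registering one more engine class; the GUI and job data code do not need to change.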
BLASTER
HOW TO CALL SUBMIT IN JAVA – THE SUBMISSION & MONITORING THREADS
Submit Thread: Parse Parameters → Setup Job Data → Start ‘submit’ Command → Start Monitoring Thread → Wait for ‘submit’ Done → Merge & Prepare Output → End
Monitoring Thread: ‘submit’ Done?
  YES → End
  NO → Scan Job Status → Update Data Structs → SLEEP(10s) → Trigger UI Update → check again
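The monitoring loop can be sketched in Python as a poll-and-sleep cycle. This is a minimal sketch under stated assumptions: the function name and its callback parameters are hypothetical, and the exact order of the steps inside the loop is a guess, since the flowchart does not survive in the transcript.

```python
import subprocess
import sys
import threading


def monitor(process, scan_status, on_update, interval=10.0, stop=None):
    """Poll until the process exits, reporting status once per interval.

    scan_status() returns the current job status; on_update(status) stands
    in for updating data structures and triggering a UI refresh.
    """
    stop = stop or threading.Event()
    while process.poll() is None and not stop.is_set():
        on_update(scan_status())
        stop.wait(interval)   # sleeps, but wakes early if stop is set
    return process.poll()     # exit code, or None if stopped early
```

Waiting on an Event rather than calling a plain sleep lets another thread end the monitor promptly, which plays the same role as interrupting the monitoring thread once the submit process has finished.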
BLASTER
HOW TO CALL SUBMIT IN JAVA – RUNNING THE SUBMIT SHELL SCRIPT
// gather all the user options & system settings needed
String[] jobArgs = parseParameters(job);
…
// run the shell script to call submit
Process p = Runtime.getRuntime().exec(jobArgs);
InputStream stdout = p.getInputStream();
InputStream stderr = p.getErrorStream();
…
// start a thread that will keep an eye on process & update UI
SubmitMonitorThread monitorThread =
new SubmitMonitorThread(stdout, stderr);
monitorThread.start();
…
// block until submit script is done
int exitCode = p.waitFor();
// stop the monitoring thread
monitorThread.interrupt();
jobArgs = /bin/bash blast_submit.sh input.fasta blastx nr 10 3 …
BLASTER
THE SUBMIT COMMAND SHELL SCRIPT
#!/bin/bash
FASTA=$1
PROGRAM=$2
DB=$3
shift 3
ARGUMENTS="$@"
# get full path to latest copy of database
DB=`tail -1 /data/tools/blastgui/versions`'/'${DB}
# split the input file into the appropriate number of chunks
./blast-split-contigs.py ${FASTA}
submit --metrics -p @@seq=globnat:seq* \
    ${PROGRAM} -query @@seq -db ${DB} ${ARGUMENTS} \
    -out splitfile.output 1>>stdout 2>>stderr
Input file of sequences
Specific BLAST executable to run
Name of database user wants to search
BLAST executable arguments from user’s options
How jobArgs maps to the script's positional parameters:
/bin/bash          (interpreter)
$0  blast_submit.sh
$1  input.fasta
$2  blastx
$3  nr
$4  10
$5  3
…
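The same argument handling can be mirrored in Python to show what the script keeps and what it passes through. This is a sketch only: the function name and the returned dictionary keys are hypothetical, and the meaning of the trailing values is not stated in the slides, so they are simply forwarded unchanged.

```python
def parse_script_args(argv):
    """Mimic the script's use of $1..$3 followed by 'shift 3'."""
    if len(argv) < 4:
        raise ValueError("expected: script fasta program db [blast args...]")
    script, fasta, program, db = argv[:4]
    return {
        "script": script,        # $0
        "fasta": fasta,          # $1: input file of sequences
        "program": program,      # $2: specific BLAST executable to run
        "db": db,                # $3: name of database to search
        "arguments": argv[4:],   # "$@" after 'shift 3': remaining BLAST args
    }
```

Keeping the first three values positional and forwarding the rest means new BLAST options can be added in the GUI without touching the script.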
IN REVIEW
REMOTE COMPUTING TIPS
A PATTERN FOR TOOL DEVELOPMENT
Same general process, regardless of language or job type:
1. Build ‘submit’ command arguments from:
   • User selected options in UI
   • Automatically generated (ex: scan uploaded files)
   • Fixed values (ex: executable names, queue names, file paths, etc.)
2. Create a new thread to execute ‘submit’ command
   A. Exec the ‘submit’ command
   B. Wait for ‘submit’ process to finish
3. Monitor status of ‘submit’ thread
   • Collect stdout/stderr while it runs
   • Update UI periodically
   • React to any problems that arise
4. Collect / process / analyze output
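The four steps above can be condensed into one self-contained Python sketch. It is a minimal illustration rather than the code of any DiaGrid tool: the function name, the result dictionary, and the on_status callback are hypothetical, and the command is passed as an argument list so any executable can stand in for ‘submit’.

```python
import subprocess
import sys
import threading


def run_tool_job(args, on_status):
    """Run a command in a worker thread; return (thread, result dict)."""
    result = {"lines": [], "returncode": None, "error": None}

    def worker():
        try:
            proc = subprocess.Popen(
                args,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,   # fold stderr into stdout
                universal_newlines=True,    # decode output to str
            )
            for line in proc.stdout:        # loop ends when the stream closes
                result["lines"].append(line)
                on_status(line.rstrip("\n"))
            result["returncode"] = proc.wait()
        except OSError as exc:              # e.g. executable not found
            result["error"] = str(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread, result
```

A caller builds the argument list (step 1), calls run_tool_job (steps 2 and 3), then joins the thread and reads result["lines"] and result["returncode"] to process the output (step 4).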
MORE DIAGRID TOOLS
SO MANY TOOLS . . .
Tool | Description | Language | Job Types | Destination
CryoEM | Electron microscope image analysis | Python / C++ | HT Condor & PBS | DiaGrid Condor Pool / Hansen (Purdue)
NAMDD | GUI for NAMD, popular molecular dynamics tool | Python | PBS (SSH) | Hansen (Purdue)
GROMAC-SIMUM | GUI for GROMACS, popular molecular dynamics tool | Java | PBS (SSH) | Hansen (Purdue)
Spyder | Python IDE with built-in job submission | Python | HT Condor (Pegasus) | DiaGrid Condor Pool
NLACE | Biomechanical image analysis | Python | PBS (GSISSH) | Gordon (SDSC)
And many more…
ACKNOWLEDGEMENTS
THANKS!
HUBzero team!
• Not possible without ‘submit’ system
• Support from HZ admins invaluable in tool dev
QUESTIONS? ANSWERS!
Also….
Explore, interact, & contribute at: http://diagrid.org