USING THE HUBZERO PLATFORM TO ENABLE REMOTE COMPUTING ON DIAGRID
HUBBUB 2015 - SEPTEMBER 15TH 2015

Christopher Thompson
Rosen Center for Advanced Computing
Purdue University



OUTLINE

• Why compute remotely?

• Example Tools
  • SubmitR
  • BLASTer

• Review

HOW TO LOCAL COMPUTE
I’VE GOT AN OLD MACHINE AROUND HERE SOMEWHERE FOR THAT…

Gather Input → Set Params → Run Code → Review Output

Where?

Under desk?

Other users?

Slow!

“I swear it’s in the right directory!” ~ The Users

Now what?

What did I use yesterday?

Did X=4 or X=7?

• Some users are savvy, know about terminals and clusters.

• The standard user experience before the HUB for most, though…

“I think it will be done next Tuesday…”

WHY REMOTE COMPUTE?
DIAGRID REMOTE COMPUTES BECAUSE…

DiaGrid
• Purpose: low-barrier access to (idle) cycles through a HUB
• Started as a frontend to the “DiaGrid” HTCondor pool
  • ~50,000 cores distributed across partner campuses
• Now handles tools with many job types, destinations

Scalability
• Computation
  • Always want more: resolution, complexity, etc.
• Storage
  • “Big Data”
• Not mutually exclusive!
  • More computation → Bigger data
  • Bigger data → More computation

Users bring remote resource needs
• Ex: XSEDE projects (Awarded X cpu hours at Y cluster)

DiaGrid Partner Institutions
• Purdue University
• University of Notre Dame
• University of Wisconsin
• Indiana State University
• University of Nebraska-Lincoln
• Indiana U. - Purdue U. Fort Wayne
• University of Louisville
• Purdue U. North Central
• Indiana University
• Purdue U. Calumet

HOW TO REMOTE COMPUTE
GET THOSE TOOLS ONLINE!

Luckily, HubZero gives us tool sessions & ‘submit’!

Gather Input → Set Params → Run Code → Review Output

Upload Input to DiaGrid (Generate?) → Set Params → ‘submit’ Runs Code → Review Output

Where?

Under desk?

Other users?

Slow!

and download… and share… and publish… and analyze…

“Who cares where!?”~ The Users

Fast!

“Let the tool figure out where it goes!” ~ The Users

“I swear it’s in the right directory!” ~ The Users

Now what?

Save settings for future sessions!

What did I use yesterday?

Did X=4 or X=7?

SUBMITR
R SCRIPT EXECUTION

SUBMITR
WHAT IS SUBMITR?

• GUI to upload and run R scripts

• Submit to HPC cluster @ Purdue

• Build argument lists

• Monitor running jobs

• Written in Python

• R used by:
  • Statisticians
  • Everyone!

SUBMITR
WORKFLOW

User writes R script

Upload script

Configure job params

Submit & run job

Download output

SUBMITR
SUBMIT JOB TYPES

• Serial
  • “Traditional” single core scripts
• Parallel
  • Multiple cores, nodes
  • Uses R parallel libraries (Snow)
• Parameter Sweeps
  • Define sweep variables in args
  • List ranges in SubmitR UI
  • ‘submit’ creates all the jobs
  • Ex: Monte Carlo simulations

SUBMITR
ARCHITECTURE

[Architecture diagram. Labels: DiaGrid.org, Tool Session, SubmitR, submit, http://www., SSH, ?]

SUBMITR
SUBMIT COMMAND

R CMD BATCH -q "--args [1-100][a,b,c]" test1.R
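To make the sweep notation concrete, here is a minimal Python sketch that expands bracketed terms such as [1-100] and [a,b,c] into every combination of values. The bracket grammar and the helper names (expand_term, expand_sweep) are assumptions made for illustration; SubmitR and 'submit' may parse and enumerate sweeps differently.

```python
import itertools
import re

def expand_term(term):
    """Expand one bracketed term: '1-100' is an integer range, 'a,b,c' is a list."""
    match = re.fullmatch(r"(\d+)-(\d+)", term)
    if match:
        low, high = int(match.group(1)), int(match.group(2))
        return [str(n) for n in range(low, high + 1)]
    return [item.strip() for item in term.split(",")]

def expand_sweep(spec):
    """Return every combination of the bracketed terms in spec, in order."""
    terms = re.findall(r"\[([^\]]*)\]", spec)
    return list(itertools.product(*(expand_term(t) for t in terms)))

combos = expand_sweep("[1-100][a,b,c]")
print(len(combos))   # 300 combinations: 100 values times 3 values
print(combos[0])     # ('1', 'a')
```

Each tuple corresponds to one job in the sweep, which is why a short argument string can fan out into hundreds of jobs.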

submit --debug <venue flags> <list of job files (-i)> <R command and args>
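A tool has to assemble that command from its UI state. The following Python sketch shows one way to do it, quoting each piece so the result is safe to hand to a shell. The function signature, the `--venue` spelling, and the venue name `example-venue` are assumptions for illustration; only `--debug` and `-i` appear on the slide, and SubmitR's real build_submit() may look different.

```python
import shlex

def build_submit(venue, input_files, r_script, r_args, debug=False):
    """Assemble a 'submit' command line as a single shell-safe string."""
    parts = ["submit"]
    if debug:
        parts.append("--debug")
    parts += ["--venue", venue]            # assumed spelling of the venue flag
    for path in input_files:
        parts += ["-i", path]              # one -i per job file
    # The R invocation follows the submit options, as on the slide.
    parts += ["R", "CMD", "BATCH", "-q", "--args " + r_args, r_script]
    return " ".join(shlex.quote(p) for p in parts)

print(build_submit("example-venue", ["data.csv"], "test1.R", "10 a", debug=True))
```

Building a list first and quoting at the end keeps arguments that contain spaces, such as the `--args` string, intact as a single shell word.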

SUBMITR
HOW TO CALL SUBMIT IN PYTHON

import subprocess
import sys
import threading

# ui and build_submit() are defined elsewhere in the tool.

def submit_thread():
    cmd = build_submit()  # Assembles "submit..." string
    cmd = 'exec ' + cmd   # Prevents additional process (enables cleanup)

    # Execute submit, update UI with output while it is running
    ui.set_status('Submitting job...')
    ui.log('Job run started')
    line = '(initial)'  # Holds latest status update from submit
    lines = []          # Accumulates status updates

    try:
        sub = subprocess.Popen(
            cmd,
            shell=True,
            bufsize=1,  # "line buffered"
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            close_fds=True,
            universal_newlines=True)  # Text mode, so each line is a str

        while line:
            line = sub.stdout.readline()
            lines.append(line)
            ui.set_status('Running...' + line)

        returncode = sub.wait()

    except Exception:
        for obj in sys.exc_info():  # Explanatory text available?
            lines.append(str(obj))

    ui.log('Job run ended')
    ui.log('Output from submit:\n' + ''.join(lines) + '(end of output)')

thread = threading.Thread(target=submit_thread)  # See function above
thread.start()

Run submit command in a function that is used as the body of a thread.

Create the ‘submit’ command as a string.

Send an update to the UI to tell the user the job submission process has started, and set up a list to hold status lines.

1. Use subprocess.Popen to create a process for the ‘submit’ cmd.
2. Capture stdout & stderr from the process and use them to keep the user updated. Loop and read line by line.
3. When ‘submit’ is done, the stdout stream closes and the loop exits. Store the process exit code.

Use try-except block to capture any run errors.

Finally, update UI again to show job is done.

BLASTER
GENOME SEQUENCE SEARCHING

BLASTER
WHAT IS BLASTER?

• BLAST
  • Industry-standard bioinformatics tools developed by NCBI
  • Query gene sequence databases for similar sequences
  • Suite of command-line tools
• BLASTer
  • Java-based GUI for running BLAST
  • Manages multiple searches, history of searches
  • Handles different BLAST variants
  • Submits to Purdue HPC resources, hides all execution details from user
  • Maintains an up-to-date copy of standard NCBI databases or allows a custom DB uploaded by the user

BLASTER
WORKFLOW

User uploads FASTA input file (& optionally custom DB)

Select search parameters

Submit search job

Monitor search job progress

View/download output

BLASTER
SUBMIT JOB TYPES

• HTCondor (Pegasus)
  • First mode BLASTer supported
  • Input files broken into small pieces of <100 sequences, run in parallel
  • Each piece is a separate job sent to the HTCondor pool via ‘submit’ with Pegasus
  • BLASTer merges results from each at end
• However…
  • Standard DBs grew over time, some by as much as 2 orders of magnitude in size!
  • Jobs that ran in 5-10 min now run for hours…
• PBS
  • Now submits to HPC cluster at Purdue with PBS job handling
  • Supports longer runtimes: defined walltimes, no eviction from nodes, unlike HTCondor
• ‘submit’ allowed this change with very little effort
  • Most details hidden behind ‘submit’ and away from tool code
  • Changed some arguments, tweaked some output/status parsing
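The splitting step in the HTCondor mode can be sketched in a few lines of Python. This is an illustration only, not the contents of BLASTer's own splitting script; the function name split_fasta and the default of 99 sequences per piece (chosen to satisfy "<100") are assumptions.

```python
def split_fasta(lines, max_sequences=99):
    """Group FASTA lines into chunks holding at most max_sequences records each."""
    chunks, current, count = [], [], 0
    for line in lines:
        if line.startswith(">"):
            # A header line starts a new record; close the chunk if it is full.
            if count == max_sequences:
                chunks.append(current)
                current, count = [], 0
            count += 1
        current.append(line)
    if current:
        chunks.append(current)
    return chunks

# Demo with 250 synthetic records (one header line plus one sequence line each).
records = []
for n in range(250):
    records += [">seq%d" % n, "ACGT"]
chunks = split_fasta(records)
print([sum(1 for l in c if l.startswith(">")) for c in chunks])  # [99, 99, 52]
```

Because records are never split across chunks, each piece can be searched as an independent job and the per-piece results joined back together afterwards.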

BLASTER
INTERNAL ARCHITECTURE

[Internal architecture diagram with numbered callouts 1-7. Labels: BLASTer, GUI Classes, Main Window, Job History Viewer, Job Config Panel, Job Control Panel, Job Data Manager, History, Current, Engine Manager, Condor Engine, PBS Engine, ??? Engine, Submit, Pegasus, SSH, http://www.]

BLASTER
OVERALL ARCHITECTURE

[Overall architecture diagram. Labels: DiaGrid.org, Tool Session, BLASTer, GUI, Job Mngr, Engine1, Engine2, Engine3, http://www., submit, Pegasus, Condor, PBS, SSH, DiaGrid Pool, HPC Clusters, ?]

BLASTER
SUBMIT COMMAND

Determines executable used by submit

Input file processed and sent by submit with job

Arguments of the BLAST executable appended to the submit command and sent with the job

BLASTER
HOW TO CALL SUBMIT IN JAVA – STARTING THE SUBMISSION THREAD

Engine Manager

Condor Engine

PBS Engine

??? Engine

public synchronized JobStatus submitJob(Job job) {

    // <a lot of error checking code…>

    // create status item for job submission & add to list of active jobs
    JobStatus queuedStatus = new JobStatus(QUEUED);
    job.setStatus(queuedStatus);
    this.jobs.put(jobId, job);

    // create a thread to handle the submission and run it
    SubmitPbsThread submitThread = new SubmitPbsThread(job);
    this.jobThreads.put(jobId, submitThread);
    submitThread.start();

    return queuedStatus;
}

New Job

• When user clicks Submit, new job data object created and sent to engine manager which routes it to correct “engine” class for job type.

• A thread is started to actually run the submit command.
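The routing described in these bullets can be sketched as a lookup from job type to engine object. The sketch below is in Python for brevity, although BLASTer itself is written in Java; the class names, the dictionary-based job, and the returned strings are all hypothetical stand-ins.

```python
class Engine:
    """Base class: each engine knows how to submit one job type."""
    job_type = None

    def submit_job(self, job):
        raise NotImplementedError

class CondorEngine(Engine):
    job_type = "condor"

    def submit_job(self, job):
        return "queued on Condor: " + job["name"]

class PbsEngine(Engine):
    job_type = "pbs"

    def submit_job(self, job):
        return "queued on PBS: " + job["name"]

class EngineManager:
    """Routes each job to the engine registered for its type."""

    def __init__(self, engines):
        self._engines = {engine.job_type: engine for engine in engines}

    def submit_job(self, job):
        engine = self._engines.get(job["type"])
        if engine is None:
            raise ValueError("no engine for job type %r" % job["type"])
        return engine.submit_job(job)

manager = EngineManager([CondorEngine(), PbsEngine()])
print(manager.submit_job({"name": "search-1", "type": "pbs"}))
```

Supporting a new destination then means adding one engine class and registering it, which matches the "??? Engine" slot in the diagram.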

BLASTER
HOW TO CALL SUBMIT IN JAVA – THE SUBMISSION & MONITORING THREADS

Submit Thread
1. Parse Parameters
2. Setup Job Data
3. Start ‘submit’ Command
4. Start Monitoring Thread
5. Wait ‘submit’ Done
6. Merge & Prepare Output
7. End

Monitoring Thread (loop)
• ‘submit’ Done? YES → End
• NO → Scan Job Status → Update Data Structs → SLEEP(10s) → Trigger UI Update → check again
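The monitoring loop can be sketched as a small polling function. This Python version is an illustration under stated assumptions: the callback names (is_done, scan_status, update_ui) are hypothetical, the order of steps within one pass is a guess from the flowchart, and the demo runs inline instead of in its own thread so that the result is deterministic.

```python
import time

def monitor(is_done, scan_status, update_ui, interval=10.0):
    """Poll until is_done() returns True; scan and report status on each pass."""
    passes = 0
    while not is_done():
        update_ui(scan_status())   # scan job status, then refresh the UI
        time.sleep(interval)       # the slide shows a 10 second pause
        passes += 1
    return passes

# Demo: pretend 'submit' finishes after three polls; no real process is involved.
answers = iter([False, False, False, True])
updates = []
passes = monitor(lambda: next(answers), lambda: "RUNNING", updates.append, interval=0)
print(passes, updates)
```

In a real tool the function would be the body of the monitoring thread, with is_done checking whether the ‘submit’ process has exited.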

BLASTER
HOW TO CALL SUBMIT IN JAVA – RUNNING THE SUBMIT SHELL SCRIPT

// gather all the user options & system settings needed
String[] jobArgs = parseParameters(job);

// run the shell script to call submit
Process p = Runtime.getRuntime().exec(jobArgs);
InputStream stdout = p.getInputStream();
InputStream stderr = p.getErrorStream();

// start a thread that will keep an eye on process & update UI
SubmitMonitorThread monitorThread =
    new SubmitMonitorThread(stdout, stderr);
monitorThread.start();

// block until submit script is done
int exitCode = p.waitFor();

// stop the monitoring thread
monitorThread.interrupt();

jobArgs = /bin/bash  blast_submit.sh  input.fasta  blastx  nr  103…

BLASTER
THE SUBMIT COMMAND SHELL SCRIPT

#!/bin/bash

FASTA=$1
PROGRAM=$2
DB=$3
shift 3
ARGUMENTS="$@"

# get full path to latest copy of database
DB=`tail -1 /data/tools/blastgui/versions`'/'${DB}

# split the input file into the appropriate number of chunks
./blast-split-contigs.py ${FASTA}

submit --metrics -p @@seq=globnat:seq* \
    ${PROGRAM} -query @@seq -db ${DB} ${ARGUMENTS} \
    -out splitfile.output 1>>stdout 2>>stderr

Input file of sequences

Specific BLAST executable to run

Name of database user wants to search

BLAST executable arguments from user’s options

[Diagram: the jobArgs elements (/bin/bash, blast_submit.sh, input.fasta, blastx, nr, 103…) mapped to the script's positional parameters $0 through $5.]


IN REVIEW

REMOTE COMPUTING TIPS
A PATTERN FOR TOOL DEVELOPMENT

Same general process, regardless of language or job type:

1. Build ‘submit’ command arguments from:
   • User selected options in UI
   • Automatically generated (ex: scan uploaded files)
   • Fixed values (ex: executable names, queue names, file paths, etc)
2. Create a new thread to execute ‘submit’ command
   A. Exec the ‘submit’ command
   B. Wait for ‘submit’ process to finish
3. Monitor status of ‘submit’ thread
   • Collect stdout/stderr while it runs
   • Update UI periodically
   • React to any problems that arise
4. Collect / process / analyze output
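The four steps can be condensed into one reusable helper. The Python sketch below is a generic illustration, not code from any DiaGrid tool: run_command and its callbacks are hypothetical names, and a short Python one-liner stands in for the real ‘submit’ invocation so the example runs without a HUB.

```python
import subprocess
import sys
import threading

def run_command(args, on_line, on_done):
    """Run args in a worker thread, streaming merged stdout/stderr to callbacks."""
    def worker():
        proc = subprocess.Popen(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # fold stderr into the same stream
            universal_newlines=True,    # text mode: lines arrive as str
        )
        lines = []
        for line in proc.stdout:        # ends when the process closes stdout
            lines.append(line)
            on_line(line.rstrip("\n"))  # step 3: report progress as it arrives
        on_done(proc.wait(), lines)     # step 4: hand over exit code and output
    thread = threading.Thread(target=worker)
    thread.start()
    return thread

# Step 1 stand-in: a harmless command in place of a real 'submit' call.
command = [sys.executable, "-c", "print('job started'); print('job finished')"]
statuses, result = [], {}
thread = run_command(command, statuses.append,
                     lambda code, lines: result.update(code=code, count=len(lines)))
thread.join()
print(statuses, result)
```

Passing the command as a list avoids a shell entirely, which sidesteps quoting problems; a GUI tool would also need to marshal the callbacks back onto its UI thread.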

MORE DIAGRID TOOLS
SO MANY TOOLS . . .

Tool | Description | Language | Job Types | Destination
CryoEM | Electron microscope image analysis | Python / C++ | HTCondor & PBS | DiaGrid Condor Pool / Hansen (Purdue)
NAMDD | GUI for NAMD, popular molecular dynamics tool | Python | PBS (SSH) | Hansen (Purdue)
GROMAC-SIMUM | GUI for GROMACS, popular molecular dynamics tool | Java | PBS (SSH) | Hansen (Purdue)
Spyder | Python IDE with built-in job submission | Python | HTCondor (Pegasus) | DiaGrid Condor Pool
NLACE | Biomechanical image analysis | Python | PBS (GSISSH) | Gordon (SDSC)

And many more…

ACKNOWLEDGEMENTS
THANKS!

HUBzero team!
• Not possible without ‘submit’ system
• Support from HZ admins invaluable in tool dev


QUESTIONS? ANSWERS!

Also….

Explore, interact, & contribute at: http://diagrid.org
