Ian C. Smith*
Introduction to research computing using the High Performance Computing facilities and Condor
*Advanced Research Computing
University of Liverpool
Overview
Introduction to research computing
High Performance Computing (HPC)
HPC facilities at Liverpool
High Throughput Computing using Condor
University of Liverpool Condor Service
Some research computing examples using Condor
Next steps
What’s special about research computing ?
Often researchers need to tackle problems which are far too demanding for a typical PC or laptop computer
Programs may take too long to run or …
require too much memory or …
too much storage (disk space) or …
all of these !
Special computer systems and programming methods can help overcome these barriers
Speeding things up
Key to reducing run times is parallelism - splitting large problems into smaller tasks which can be tackled at the same time (i.e. “in parallel” or “concurrently”)
Two main types of parallelism:
data parallelism
functional parallelism (pipelining)
Tasks may be independent or inter-dependent (inter-dependence ultimately limits the speed-up which can be achieved)
Fortunately many problems in medical/bio/life science exhibit data parallelism and tasks can be performed independently
This can lead to very significant speed ups !
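The limit that inter-dependent tasks place on speed-up is commonly estimated with Amdahl's law. The slides do not name it, so treat this as an illustrative assumption: if a fraction p of the run time can be parallelised over n processors, the best possible speed-up is 1 / ((1 - p) + p / n). A minimal shell sketch, where `amdahl_speedup` is a hypothetical helper name:

```shell
#!/bin/sh
# Amdahl's law: speed-up = 1 / ((1 - p) + p / n)
#   p = fraction of the run time that can be parallelised (0..1)
#   n = number of processors working in parallel
# amdahl_speedup is a hypothetical helper name used for illustration only.
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

amdahl_speedup 0.95 8      # 95% parallel work on 8 processors
amdahl_speedup 1.00 8      # fully independent tasks on 8 processors
```

With 95% of the work parallelised the speed-up on 8 processors is just under 6, and it can never exceed 1 / 0.05 = 20 however many processors are added. Fully independent tasks (p = 1) scale with the number of processors, which is why data-parallel problems benefit so much.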
Some sources of parallelism
Analysing patient data from clinical trials
Repeating calculations with different random numbers e.g. bootstrapping and Monte Carlo methods
Dividing sequence data by chromosome
Splitting chromosome sequences into smaller parts
Partitioning large BLAST (or other sequence) databases and/or query files
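As an illustration of partitioning a query file, the following sketch deals the records of a FASTA file round-robin into a fixed number of smaller files. It is one possible approach under stated assumptions, not necessarily how the files in this talk were produced; `split_fasta` and the `PREFIX_<k>.fasta` naming are hypothetical.

```shell
#!/bin/sh
# split_fasta FILE N PREFIX
# Deal the records of a FASTA file round-robin into N smaller files named
# PREFIX_1.fasta ... PREFIX_N.fasta (a hypothetical naming scheme).
# A record is a header line starting with ">" plus the sequence lines after it.
split_fasta() {
    awk -v n="$2" -v prefix="$3" '
        /^>/ { part = (count % n) + 1; count++ }       # header: move on to the next part
        part { print > (prefix "_" part ".fasta") }    # copy the line to the current part
    ' "$1"
}

# Example (not run here): split_fasta query.fasta 8 query
```

Round-robin dealing balances the number of sequences per file rather than the file sizes, which is usually close enough when the sequences are of similar length.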
High Performance Computing (HPC)
Uses powerful special purpose systems called HPC clusters
Contain large numbers of processors acting in parallel
Each processor may contain multiple processing elements (cores) which can also work in parallel
Provide lots of memory and large amounts of fast (parallel) disk storage – ideal for data-intensive applications
Almost all clusters run the UNIX operating system
Typically run parallel programs containing inter-dependent tasks (e.g. finite element analysis codes) but also suitable for biostatistics and bioinformatics applications
HPC cluster hardware (architecture)
[Diagram: a head node and four compute nodes joined through a network switch, together with a parallel filestore. Labelled links: high speed network, standard (Ethernet) network, connection to outside world]
Typical cluster use (1) to (6)
[Diagram sequence showing the head node, compute nodes, network switch and parallel filestore at each step]
(1) login and upload data: input data files are placed on the parallel filestore via the head node
(2) login from outside and submit jobs (programs) from the head node to the compute nodes
(3) compute nodes read input data from the parallel filestore
(4) compute nodes process data in parallel, with task synchronisation only if needed !
(5) compute nodes write output data (results) to the parallel filestore
(6) login and download results: output data files are retrieved from the parallel filestore via the head node
Parallel BLAST example
    login as: ian
    ian@bioinf1's password:
    Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk
    [ian@bioinf1 ~]$ cd /users/ian/chris/perl    #change folder
    [ian@bioinf1 perl]$ ls -lh farisraw_*.fasta    #list files
    -rw-r--r-- 1 ian ph 1.2G Feb 23 12:25 farisraw_1.fasta
    -rw-r--r-- 1 ian ph 1.4G Feb 23 12:25 farisraw_2.fasta
    -rw-r--r-- 1 ian ph 1.2G Feb 23 12:26 farisraw_3.fasta
    -rw-r--r-- 1 ian ph 1.3G Feb 23 12:26 farisraw_4.fasta
    -rw-r--r-- 1 ian ph 1.2G Feb 23 12:27 farisraw_5.fasta
    -rw-r--r-- 1 ian ph 1.3G Feb 23 12:27 farisraw_6.fasta
    -rw-r--r-- 1 ian ph 1.2G Feb 23 12:28 farisraw_7.fasta
    -rw-r--r-- 1 ian ph 1.2G Feb 23 12:24 farisraw_8.fasta
    -rw-r--r-- 1 ian ph 9.6G Feb 23 11:18 farisraw_complete.fasta
    [ian@bioinf1 perl]$
original query file: farisraw_complete.fasta
partial query files: farisraw_1.fasta to farisraw_8.fasta
Parallel BLAST example (job file)
    [ian@bioinf1 perl]$ cat blast.sub    #show file contents
    #!/bin/bash
    #$ -cwd -V
    #$ -o stdout
    #$ -e stderr
    #$ -pe smp 8
    blastn -query farisraw_${SGE_TASK_ID}.fasta \
        -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \
        -out output${SGE_TASK_ID}.txt \
        -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \
        -num_threads 8 \
        -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch gaps qseq sseq pident evalue"
    [ian@bioinf1 perl]$ qsub -t 1-8 blast.sub    #submit jobs
    Your job-array 20164.1-8:1 ("blast.sub") has been submitted
    [ian@bioinf1 perl]$
job file: blast.sub
job options: the lines beginning with #$
BLAST query file: -query farisraw_${SGE_TASK_ID}.fasta
BLAST database: -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb
output file: -out output${SGE_TASK_ID}.txt
${SGE_TASK_ID} takes on values [1..8] when jobs are submitted
-pe smp 8 and -num_threads 8: use all 8 cores on each compute node (in parallel)
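The effect of submitting the job file as an array with task IDs 1 to 8 can be previewed without a scheduler. The sketch below is only a dry run: a loop stands in for the scheduler setting `SGE_TASK_ID`, and `preview_tasks` is a hypothetical helper, not a Grid Engine command.

```shell
#!/bin/sh
# Dry run of the array job: print the query and output file each task would
# use. The scheduler normally sets SGE_TASK_ID; here a loop stands in for it.
# preview_tasks is a hypothetical helper, not a Grid Engine command.
preview_tasks() {
    SGE_TASK_ID=$1
    while [ "$SGE_TASK_ID" -le "$2" ]; do
        echo "task ${SGE_TASK_ID}: farisraw_${SGE_TASK_ID}.fasta -> output${SGE_TASK_ID}.txt"
        SGE_TASK_ID=$((SGE_TASK_ID + 1))
    done
}

preview_tasks 1 8
```

Each task reads its own partial query file and writes its own output file, so the eight tasks never need to communicate and can run on different compute nodes at the same time.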
Parallel BLAST example (job status)
    [ian@bioinf1 perl]$ qstat    #show job status
    job-ID  prior    name       user  state  submit/start at      queue              slots  ja-task-ID
    -----------------------------------------------------------------------------------------------
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      1
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      2
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      3
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      4
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      5
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      6
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      7
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      8
    [ian@bioinf1 perl]$ qstat
    job-ID  prior    name       user  state  submit/start at      queue              slots  ja-task-ID
    -----------------------------------------------------------------------------------------------
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      1
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      2
    20157   0.55500  blast.sub  ian   r      02/26/2015 14:32:49  [email protected]  8      6
    [ian@bioinf1 perl]$ qstat
    [ian@bioinf1 perl]$
state column: r indicates job is running
queue column: name of compute node job is running on
The final qstat lists no jobs because all tasks have finished
Parallel BLAST example (results)
    [ian@bioinf1 perl]$ ls -lh output*.txt    #list output files
    -rw-r--r-- 1 ian ph 45M Feb 26 14:38 output1.txt
    -rw-r--r-- 1 ian ph 22M Feb 26 14:38 output2.txt
    -rw-r--r-- 1 ian ph 59M Feb 26 14:38 output3.txt
    -rw-r--r-- 1 ian ph 20M Feb 26 14:38 output4.txt
    -rw-r--r-- 1 ian ph 28M Feb 26 14:38 output5.txt
    -rw-r--r-- 1 ian ph 13M Feb 26 14:38 output6.txt
    -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output7.txt
    -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output8.txt
    [ian@bioinf1 perl]$ cat output*.txt > output_complete.txt
    [ian@bioinf1 perl]$ ls -lh output_complete.txt
    -rw-r--r-- 1 ian ph 499M Feb 26 14:44 output_complete.txt
    [ian@bioinf1 perl]$
partial output files: output1.txt to output8.txt
combine partial results files: cat output*.txt > output_complete.txt
combined results: output_complete.txt
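Combining partial results with a wildcard works for eight files, but shell wildcards sort names as text, so with ten or more parts output10.txt would be placed before output2.txt, and the pattern output*.txt also matches output_complete.txt if one is left over from an earlier run. A minimal sketch that concatenates by task number instead; `combine_outputs` is a hypothetical helper name:

```shell
#!/bin/sh
# combine_outputs FIRST LAST DEST
# Concatenate output<FIRST>.txt ... output<LAST>.txt in numeric order into DEST.
# Stops with a non-zero status if a partial file is missing, so an incomplete
# job array is noticed instead of silently producing a short result.
combine_outputs() {
    id=$1
    : > "$3" || return 1                     # create or empty the destination
    while [ "$id" -le "$2" ]; do
        if [ ! -f "output${id}.txt" ]; then
            echo "missing output${id}.txt" >&2
            return 1
        fi
        cat "output${id}.txt" >> "$3"
        id=$((id + 1))
    done
}

# Example (not run here): combine_outputs 1 8 output_complete.txt
```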
Some HPC clusters available at Liverpool
bioinf1
System bought by the Institute of Translational Medicine for use in biomedical research about 5 years ago
9 compute nodes each with 8 cores and 32 GB of memory (one node has 128 GB of memory)
76 TB of main (parallel) storage
chadwick
Main CSD HPC cluster for research use
118 nodes each with 16 cores and 64 GB memory (one node has 2 TB of memory)
Total of 135 TB of main (parallel) storage
Fast (40 Gb/s) internal network
High Throughput Computing (HTC) using Condor
No dedicated hardware - uses ordinary classroom PCs to run jobs when they would otherwise be idle (usually evenings and weekends)
Jobs may be interrupted by users logging into Condor PCs – works best for short running jobs (10-20 minutes ideally)
Only suitable for applications which use independent tasks (inter-dependent tasks need to use HPC)
No shared storage – all data files must be transferred to/from the Condor PCs
Limited memory and disk space available since Condor uses only commodity PCs
However… Condor is well suited to many statistical and data-intensive applications !
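To make the "independent tasks, no shared storage" model concrete, the sketch below writes a generic HTCondor submit description for a batch of jobs. It is an assumption-laden illustration rather than the Liverpool service's own tooling: `write_submit_file`, `run_task.bat` and the input and output file names are hypothetical placeholders, while the submit commands themselves are standard HTCondor ones.

```shell
#!/bin/sh
# write_submit_file NJOBS FILE
# Write a generic HTCondor submit description for NJOBS independent tasks.
# run_task.bat and the input/output file names are hypothetical placeholders.
# $(Process) is expanded by HTCondor to 0, 1, ... NJOBS-1, one value per job.
write_submit_file() {
    cat > "$2" <<EOF
universe                = vanilla
executable              = run_task.bat
arguments               = \$(Process)
transfer_input_files    = input\$(Process).dat
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = stdout\$(Process).txt
error                   = stderr\$(Process).txt
log                     = tasks.log
queue $1
EOF
}

# Example (not run here): write_submit_file 100 tasks.sub
```

The file transfer lines matter because there is no shared storage: each job's input must be copied to the execute host and its results copied back when it exits.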
A “typical” Condor pool
[Diagram sequence: a desktop PC, a Condor server and the execute hosts]
(1) login and upload input data
(2) jobs are sent to the execute hosts
(3) results are returned from the execute hosts
(4) download results
University of Liverpool Condor Pool
Contains around 750 classroom PCs running the CSD Managed Windows 7 Service
Each PC can support a maximum of 4 jobs concurrently giving a theoretical capacity of 3000 parallel jobs
Typical spec: 3.3 GHz Intel i3 dual-core processor, 8 GB memory, 128 GB disk space
Tools are available to help in running large numbers of R and MATLAB jobs (other software may work but not commercial packages such as SAS and Stata)
Single job submission point for Condor jobs provided by powerful UNIX server
Service can also be accessed from a Windows PC/laptop using Desktop Condor (even from off-campus)
Desktop Condor (1) to (3)
[Three screenshots of Desktop Condor]
Personalised Medicine example
Project is an example of a Genome-Wide Association Study
Aims to identify genetic predictors of response to anti-epileptic drugs
Tries to identify regions of the human genome that differ between individuals (referred to as Single Nucleotide Polymorphisms or SNPs)
800 patients genotyped at 500 000 SNPs along the entire genome
Statistically test the association between SNPs and outcomes (e.g. time to withdrawal of drug due to adverse effects)
Large data-parallel problem using R – ideal for Condor
Divide datasets into small partitions so that individual jobs run for 15-30 minutes
Batch of 26 chromosomes (2 600 jobs) required ~ 5 hours wallclock time on Condor but ~ 5 weeks on a single PC
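The figures quoted for this batch can be turned into a rough speed-up estimate. The sketch below only does the arithmetic on the numbers given above; the variable names are illustrative, and the result is approximate because both timings are quoted to the nearest hour or week.

```shell
#!/bin/sh
# Rough arithmetic on the figures quoted for the GWAS batch.
njobs=2600           # Condor jobs in the batch
chromosomes=26       # chromosomes in the batch
condor_hours=5       # approximate wallclock time on Condor
pc_weeks=5           # approximate time on a single PC

pc_hours=$((pc_weeks * 7 * 24))                 # 5 weeks expressed in hours
echo "jobs per chromosome: $((njobs / chromosomes))"
echo "single PC hours:     $pc_hours"
echo "approx. speed-up:    $((pc_hours / condor_hours))x"
```

This works out at 100 jobs per chromosome and a speed-up of roughly 170 times over a single PC.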
Radiotherapy example
Large 3rd party application code which simulates photon beam radiotherapy treatment using Monte Carlo methods
Tried running the simulation on 56 cores of a high performance computing cluster but made no progress after 5 weeks
Divided the problem into 250, then 5 000 and eventually 50 000 Condor jobs
Required ~ 2 600 days of CPU time (equivalent to ~ 3.5 years on a dual core PC)
Condor simulation completed in less than one week
Average run time was ~ 70 min
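The radiotherapy figures can be cross-checked against each other. This sketch assumes the total CPU time is spread over the final 50 000 jobs and that a dual core PC keeps both cores busy all year; it uses awk because the shell itself only does integer arithmetic.

```shell
#!/bin/sh
# Cross-check of the radiotherapy figures using awk for floating point.
cpu_days=2600        # total CPU time quoted
njobs=50000          # final number of Condor jobs
cores=2              # dual core PC

mean_minutes=$(awk -v d="$cpu_days" -v j="$njobs" 'BEGIN { printf "%.1f", d * 24 * 60 / j }')
pc_years=$(awk -v d="$cpu_days" -v c="$cores" 'BEGIN { printf "%.2f", d / c / 365 }')
echo "mean CPU time per job: $mean_minutes min"
echo "dual core PC:          $pc_years years"
```

The results, about 75 minutes per job and about 3.6 years on a dual core PC, are consistent with the rounded values of ~ 70 min and ~ 3.5 years quoted above.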
Summary
Parallelism can help speed up the solution of many research computing problems by dividing large problems into many smaller ones which can be tackled at the same time
High Performance Computing clusters
Typically used for small numbers of long running jobs
Ideal for applications requiring lots of memory and disk storage space
Almost all systems are UNIX-based
Condor High Throughput Computing Service
Typically used for large/very large numbers of short running jobs
Limited memory and storage available on Condor PCs
Support available for applications using R (and MATLAB)
No UNIX knowledge needed with Desktop Condor
Next steps
Condor Service information: http://condor.liv.ac.uk
Information on bioinf1 and HPC clusters: http://clusterinfo.liv.ac.uk
Information on the Advanced Research Computing (ARC) facilities: http://www.liv.ac.uk/csd/advanced-research-computing
To contact the ARC team email: [email protected]
To request an account on Condor or chadwick use: http://www.liv.ac.uk/media/livacuk/computingservices/help/eScienceform.pdf
For an account on bioinf1 – just ask me ! ([email protected])