Introduction to Linux and HPC John Zaitseff, April 2015 High Performance Computing


High Performance Computing architecture

Massively Parallel Distributed Computational Cluster

• Many individual servers (“nodes”): dozens to thousands
• Multiple processors per node: between 8 and 64 cores
• Interconnected by fast networks
• Almost always run Linux
  – In our case: Rocks Linux Distribution on top of CentOS 6.x

The Trentino cluster
Image credit: John Zaitseff, UNSW

High Performance Computing architecture

[Diagram labels: Internet; Head Node; Storage Node; Internal Network Switch; Compute Node 1 to Compute Node n; Chassis 1 (Compute Node 1-1 to 1-n) through Chassis m (Compute Node m-1 to m-n)]

The Newton cluster: newton.mech.unsw.edu.au

• 10 × Dell R415 server nodes
  – Head node: newton
  – Compute nodes: newton01 to newton09
• 160 × AMD Opteron 4386 3.1GHz processor cores
  – Two physical processors per node
  – Eight CPU cores per processor
  – Only four floating-point units per processor
• 320 GB of main memory (32 GB per node)
• 12 TB of storage: 6 × 3 TB drives in RAID 6
• 1Gb Ethernet network interconnect

http://cfdlab.unsw.wikispaces.net/

The Newton cluster
Image credit: John Zaitseff, UNSW

The Trentino cluster: trentino.mech.unsw.edu.au

• 16 × Dell R815 server nodes
  – Head node: trentino
  – Compute nodes: trentino01 to trentino15
• 1024 × AMD Opteron 6272 2.1GHz processor cores
  – Four physical processors per node
  – Sixteen CPU cores per processor
  – Only eight floating-point units per processor
• 2048 GB of main memory (128 GB per node)
• 30 TB of storage: 12 × 3 TB drives in RAID 6
• 4×1Gb Ethernet network interconnect

http://cfdlab.unsw.wikispaces.net/

The back of the Trentino cluster
Image credit: John Zaitseff, UNSW
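As a quick sanity check, the headline figures for Newton and Trentino follow from the per-node numbers on the two slides. The sketch below uses Bash integer arithmetic; the variable names are illustrative only, and the storage lines assume that RAID 6 sets aside the capacity of two drives for parity.

```shell
#!/bin/bash
# Newton: 10 nodes, 2 processors per node, 8 cores per processor, 32 GB per node
newton_cores=$(( 10 * 2 * 8 ))
newton_mem_gb=$(( 10 * 32 ))
# Assumption: RAID 6 reserves two drives' worth of capacity for parity
newton_storage_tb=$(( (6 - 2) * 3 ))

# Trentino: 16 nodes, 4 processors per node, 16 cores per processor, 128 GB per node
trentino_cores=$(( 16 * 4 * 16 ))
trentino_mem_gb=$(( 16 * 128 ))
trentino_storage_tb=$(( (12 - 2) * 3 ))

echo "Newton:   $newton_cores cores, $newton_mem_gb GB, $newton_storage_tb TB"
echo "Trentino: $trentino_cores cores, $trentino_mem_gb GB, $trentino_storage_tb TB"
```

The results agree with the figures quoted on the slides (160 cores, 320 GB and 12 TB; 1024 cores, 2048 GB and 30 TB).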

The Leonardi cluster: leonardi.eng.unsw.edu.au

• 7 × HP BladeSystem c7000 blade enclosures
• 1 × HP ProLiant DL385 G7 server: leonardi
• 56 × HP BL685c G7 compute nodes
  – Compute nodes: ec01b01 to ec07b08
• 2944 × AMD Opteron 6174 2.2GHz processor cores and Opteron 6276 2.3GHz processor cores
  – Four physical processors per node
  – Twelve or sixteen CPU cores per processor
• 8448 GB of main memory (96–512 GB per node)
• 93.5 TB of storage: 70 × 2 TB drives in RAID 6+0
• 2×10Gb Ethernet network interconnect

http://leonardi.unsw.wikispaces.net/

Nodes in the Leonardi cluster
Image credit: John Zaitseff, UNSW

The Raijin cluster: raijin.nci.org.au

• 3592 × Fujitsu blade server nodes
• Multiple login nodes
• Multiple management nodes
• 57,472 Intel Xeon E5-2670 2.60GHz processor cores
• 160 TB of main memory
• 10 PB of storage using the Lustre distributed file system
• 56Gb Infiniband FDR network interconnect

http://nci.org.au/nci-systems/national-facility/peak-system/raijin/

Image credit: National Computational Infrastructure

High Performance Computing architecture

[Diagram labels: Internet; Head Node; Storage Node; Internal Network Switch; Compute Node 1 to Compute Node n; Chassis 1 (Compute Node 1-1 to 1-n) through Chassis m (Compute Node m-1 to m-n)]

Callout on the diagram: Do not run your jobs here!

Connecting to an HPC system

• Use the Secure Shell protocol (SSH)
  – Under Linux or Mac OS X: ssh username@hostname (for example, ssh [email protected])
  – Under Windows: PuTTY (Start » All Programs » PuTTY » PuTTY)
  – Can install Cygwin: “that Linux feeling under Windows”
• You will get a command line prompt, something like:
      z9693022@newton:~ $
  – May be different on different systems; may be customised
• Try it now:
  – Start PuTTY, specify host name as newton.mech.unsw.edu.au
  – Check RSA2 fingerprint: 69:7e:64:75:57:67:ad:4c:21:8e:90:7d:8e:97:70:ce
  – User name: your zID; Password: your zPass
• To exit, type exit and press ENTER.

Simple Linux commands

• List files in a directory: ls [pathname ...]
  – [ ] indicates optional parameters, ... indicates one or more parameters
  – Italic fixed-width font indicates replaceable parameters
• To show the current directory: pwd
• To change directories: cd directory
  – ~ is the home directory
  – .. is the directory above the current one
  – ~user is the home directory of user user
• Try it now:

    cd ~z9693022/src/trader-7.6    # Do not replace z9693022!
    ls                             # List files in current directory
    cd src
    pwd; ls                        # More than one command at a time
    cd ..; pwd                     # You don’t have to enter the comments...

Directories and files: paths and pathnames

• Files and directories are organised into a hierarchical tree structure
• The top of the tree is called the root directory (or simply root), and is denoted as / (slash)
• The root directory contains directories, which in turn contain files and directories of their own:

    /
        bin
        etc
        home
            z9693022
        share
            apps
                Modules
                ansys
                    14.5
                    15.0
                matlab
        usr
            bin
            share
            local
                bin

Absolute pathnames

• Any file or directory can be represented as an absolute pathname:
  – gives the full name of the file or directory
  – starts with the root “/”
  – lists each directory along the way
  – has a “/” to separate each path (or pathname) component
• For example: the directory /share/apps/ansys/15.0

[Diagram: the directory tree from the previous slide, following / → share → apps → ansys → 15.0]

Relative pathnames

• Second way of denoting a file or directory (a pathname)
• Relative to the current working directory
• Does not start with the root directory “/”
• Path components are still separated with slashes “/”
• Current directory is denoted by “.” (dot)
• Going up a level is denoted by “..” (dot-dot)
• Often just contains a filename with no directories listed
• Examples: Assume current directory is /home/z9693022/src/trader-7.6:

    README                → /home/z9693022/src/trader-7.6/README
    src/trader.c          → /home/z9693022/src/trader-7.6/src/trader.c
    ../trader-7.6.tar.xz  → /home/z9693022/src/trader-7.6.tar.xz
    src/.././README       → /home/z9693022/src/trader-7.6/README
    ./README              → /home/z9693022/src/trader-7.6/README
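The mapping from relative to absolute pathnames can be tried out without touching anyone's home directory. The following is a minimal sketch that rebuilds a similar layout under a temporary directory; the helper name resolve is hypothetical, and the temporary location stands in for /home/z9693022.

```shell
#!/bin/bash
# Build a throwaway tree that mimics the layout used in the examples.
base=$(mktemp -d)
mkdir -p "$base/src/trader-7.6/src"
touch "$base/src/trader-7.6/README" \
      "$base/src/trader-7.6/src/trader.c" \
      "$base/src/trader-7.6.tar.xz"

# resolve: print the absolute pathname that a relative pathname refers to.
# (Hypothetical helper: it changes directory inside a subshell, so the
# caller's working directory is left alone.)
resolve () {
    ( cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd)" "$(basename "$1")" )
}

cd "$base/src/trader-7.6"
for p in README src/trader.c ../trader-7.6.tar.xz src/.././README ./README; do
    printf '%-22s -> %s\n' "$p" "$(resolve "$p")"
done
```

Three of the five relative pathnames resolve to the same README file, which is the point of the “.” and “..” components.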

Important directories

• Home directory: /home/user (e.g., /home/z9693022)
• Scratch directory for temporary files: /share/scratch/user (but not available on Newton!)
• Binary directories for utility programs:
  – /bin: for essential utilities
  – /usr/bin: for other utilities and some applications
  – /usr/local/bin: for local utilities and applications
  – /home/user/bin: for your own utilities
• On our clusters, applications: /share/apps
• On our clusters, module files: /share/apps/Modules
• Note synonyms: path, pathname, filename

More with pathnames

• To change directories: cd dir
• To change to your home directory: cd ~ or cd $HOME or cd (by itself)
• To get current working directory: pwd
• To show the directory tree structure: tree, tree -d (directories only)
• To view a file page by page: less filename, “q” to quit, “h” for help
• Try it now:

    cd /home/z9693022/src/trader-7.6
    tree -d
    less README
    less src/trader.c
    cd src; pwd
    less README
    less ../README                 # Different from README!

Getting help

• Many commands have a myriad of command line options
• For a brief summary of command line options, try command --help
• For a full explanation, try man command
• For some commands, try pinfo command
• To search for a keyword in the manual: man -k keyword
• Remember, “Google is your friend”
• Try it now:

    ls --help
    cd --help                      # Does this work?
    man ls                         # See “See Also” section at end
    pinfo coreutils                # “q” to quit
    man less                       # 1571 lines!
    man cd                         # What is “BASH_BUILTINS”?
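The last two questions hint that cd is built into the shell rather than being a separate program. A minimal sketch of how to check this, using the Bash builtins type and help (both are standard in Bash; the variable names are illustrative only):

```shell
#!/bin/bash
# "type -t" reports how Bash would interpret a command name.
kind_of_cd=$(type -t cd)   # cd is part of the shell itself
kind_of_ls=$(type -t ls)   # ls is normally a program on disk (or an alias in interactive shells)
echo "cd is a $kind_of_cd; ls is a $kind_of_ls"

# Builtins are documented by the "help" builtin rather than by their own man pages.
help cd | head -n 1
```

This is why man cd lands on a page about shell builtins: the documentation for cd belongs to Bash.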

The Bourne Again (Bash) shell

• Official manual page entry:

    Bash is an sh-compatible command language interpreter that executes commands read from the standard input or from a file. Bash also incorporates useful features from the Korn and C shells (ksh and csh).

    Bash is intended to be a conformant implementation of the Shell and Utilities portion of the IEEE POSIX specification (IEEE Standard 1003.1). Bash can be configured to be POSIX-conformant by default.

• Interprets your typed commands and executes them
• Just another Linux program: nothing special about it!
• Started by the system when you log in
• You can then start another shell, if you like (e.g., ksh, tcsh, even python)
• You can start a subshell by running bash
• To exit a subshell (or the main shell): exit

Some features of Bash

• Powerful command line facilities (shortcuts):
  – Tab completion (press the TAB key to complete commands and pathnames, TAB TAB to list all possibilities)
  – Command line editing: try ↑ (Up-Arrow) to recall previous commands, CTRL-R (C-R or ^R) to search for previous commands, ← and → to move along current command line
• A full programming and scripting language:
  – Variables and arrays
  – Loops (for; while; until), control statements (if ... then ... else; case)
  – Functions and coprocesses
  – Text processing (“expansion” and “parameter substitution”)
  – Simple arithmetic calculations
  – Input/output redirection (e.g., redirect output to different files)
  – Much, much more! (The man page runs to over 5,300 lines)
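A few of the language features listed above, shown together in a short sketch. The node names come from the Newton slide; everything else (variable names, the file name) is made up for illustration.

```shell
#!/bin/bash
# Variables and arrays
nodes=(newton01 newton02 newton03)
count=${#nodes[@]}                 # number of array elements

# Simple integer arithmetic
cores=$(( count * 16 ))            # 16 cores per Newton node (2 processors x 8 cores)

# Parameter substitution: strip the shortest suffix matching ".c"
file=trader.c
stem=${file%.c}

# A loop containing a control statement
for node in "${nodes[@]}"; do
    if [ "$node" = newton02 ]; then
        echo "$node (second compute node)"
    else
        echo "$node"
    fi
done

echo "$count nodes, $cores cores, stem=$stem"
```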

Trying out some features of Bash

• Try it now:
  – cd ~z9693022/src/trader-7.6/src
  – Type “less”, then space, but do not press ENTER yet
  – Press TAB once: nothing appears
  – Press TAB a second time: all relevant completions appear
  – Type “f”, then press TAB: the filename is completed to “fileio.”
  – Press TAB TAB again: two files are listed
  – Type “h” to select the second file, then press ENTER (and “q” to quit)
• Try it now:
  – Press CTRL-R, then type “ls” (but do not press ENTER): previous commands with “ls” in them are listed
  – Press CTRL-R again a few times: will even list “pinfo coreutils”
  – Press ENTER when you get to the command you wish to execute
  – Press CTRL-C if you do not wish to execute any command

Listing files and directories

• Already know the ls command: List directory contents
• In full: ls [options] [pathname ...]
• Some options:
  – “-a” for all files (including those starting with “.”)
  – “-l” (lowercase letter L) for long (detailed) listing
  – Options sometimes can be combined: “-alF”
• Try it now: ls -laF or dir (an alias to “ls -laF”); ll (“ls -lF”)
• Example of a line in a long listing:

    -rw-r--r-- 1 z9693022 unsw 1266 May 24 07:59 README

• The columns of information are: file permissions, number of links (usually 1 for files, 2 or more for directories), file owner, group owner, size in bytes (here, 1266), date last modified, the actual filename (README), with perhaps a trailing “*” for executable files and “/” for directories.

File and directory patterns

• The Bash shell interprets certain characters in the command line by replacing them with matching pathnames
• Called pathname expansion, pattern matching, wildcards or globbing
• For existing pathnames: “*” matches any string, “?” matches any single character, “[...]” matches any one of the enclosed characters
• Try it now:

    cd ~z9693022/src/trader-7.6/src; echo 1 2 3
    echo *c                        # All filenames ending in “c”: “.” is not special
    echo ????.c                    # All filenames six characters long (4 + “.c”)
    echo M*m                       # All filenames starting with “M” and ending with “m”
    echo [it]*                     # All filenames starting with either “i” or “t”
    echo ../lib/uni*               # All filenames in ../lib starting with “uni”
    echo ../*/*.c
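The same patterns can be explored safely in a scratch directory. In this sketch the filenames are hypothetical stand-ins, not a listing of the real trader-7.6 source directory.

```shell
#!/bin/bash
# Work in a throwaway directory with a few hypothetical filenames.
demo=$(mktemp -d)
cd "$demo"
touch fileio.c fileio.h intf.c trader.c help.c Makefile.am

all_c=$(echo *c)          # names ending in "c" ("." is not special)
four=$(echo ????.c)       # exactly four characters followed by ".c"
m_m=$(echo M*m)           # starts with "M" and ends with "m"
i_or_t=$(echo [it]*)      # starts with "i" or "t"

printf '%s\n' "$all_c" "$four" "$m_m" "$i_or_t"
```

Using echo rather than ls shows exactly what the shell substitutes before any command runs.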

More file and directory patterns

• Glob patterns “*”, “?” and “[...]” only match existing pathnames
• Even for pathnames that do not exist: “{alt1,alt2,...}” lists alternatives, “{n..m}” lists all numbers between n and m, “{n..m..s}” in steps of s
  – Technically called brace expansion
• Try it now:

    cd ~z9693022/src/trader-7.6/src
    ls test-*                      # “No such file or directory”
    echo test-*                    # What happens?
    echo test-{one,two,three}
    echo newdir/{one,two,three}
    echo test-{1..100}
    echo test-{001..100}           # Zero-padding
    echo test-{1..100..3}          # By steps of three
    echo test-{100..1..-3}         # By steps of negative three
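A shorter version of the same experiment, small enough to check by eye. This sketch assumes Bash 4 or later (needed for step values and zero-padding) and default shell options (neither nullglob nor failglob set); the variable names are illustrative.

```shell
#!/bin/bash
# Brace expansion generates words whether or not matching files exist.
alts=$(echo test-{one,two,three})
steps=$(echo test-{1..10..3})       # step syntax needs Bash 4 or later
padded=$(echo test-{08..10})        # zero-padding also needs Bash 4 or later
down=$(echo {10..1..-3})

# A glob with no matches is passed through unchanged (default shell options),
# which is why "ls test-*" complains about a file literally named "test-*".
cd "$(mktemp -d)"
unmatched=$(echo test-*)

printf '%s\n' "$alts" "$steps" "$padded" "$down" "$unmatched"
```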

Naming files and directories

• Linux allows any characters in filenames except “/” and the NUL byte
• You may create filenames with “weird” characters in them:
  – spaces and tabs
  – starting with “-”: conflicts with command line options
  – question marks “?”, asterisks “*”, brackets and braces
  – other characters with special meanings: “!”, “$”, “&”, “#”, “"”, etc.
• Just because you can does not mean you should!
• To match such files: use the glob characters “*” and “?”
• Linux file systems are case-sensitive: README.TXT is different from readme.txt, which is different from Readme.txt and ReadMe.txt!
• File type suffixes (e.g., “.txt”) are optional but recommended
• Filenames starting with “.” are usually hidden from globs and ls output.
• Recommendation: Use “a” to “z”, “A” to “Z”, “0” to “9”, “-”, “_” and “.” only.
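If awkward names do turn up, quoting and the conventional “--” end-of-options marker are two further ways of handling them, in addition to globs. A minimal sketch in a scratch directory; both filenames are made up for the demonstration.

```shell
#!/bin/bash
demo=$(mktemp -d)
cd "$demo"

# Create two awkward names: one with a space, one starting with "-".
touch "my notes.txt"
touch -- -rf

# Quoting keeps the space inside a single argument.
ls -l "my notes.txt"

# "--" ends option parsing; a leading "./" works too.
ls -l -- -rf
rm ./-rf
rm "my notes.txt"

remaining=$(ls -A | wc -l)     # count what is left in the directory
echo "files left: $remaining"
```

Without the “--” or the “./”, a file called -rf would be read by rm as the options -r and -f, which illustrates why such names are best avoided.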

Managing directories

• To create a directory: mkdir dir ...
• To create parent directories as well: mkdir -p dir ...
• To remove an empty directory: rmdir dir ...
• Try it now:

    cd ~; ls
    mkdir gsoe9400/dir{1,2,3}      # Why does this fail?
    mkdir -p gsoe9400/dir{1,2,3,99} gsoe9400/x
    ls gsoe9400
    rmdir gsoe9400/dir?
    ls gsoe9400                    # Should list dir99 and x only
    rmdir gsoe9400/*               # Be careful...

Managing files

• To output the contents of one or more files: cat filename ...
• To view one or more files page by page: less filename ...
• To copy one file: cp source destination
• To copy one or more files to a directory: cp filename ... dir
• To preserve the “last modified” time-stamp: cp -p
• To copy recursively: cp -pr source destination
• To move one or more files to a different directory: mv filename ... dir
• To rename a file or directory: mv oldname newname
• To remove files: rm filename ...
• Recommendation: use “ls filename ...” before rm or mv. What happens if you accidentally type “rm *”? Or “rm * .c”? (Note the space!)

Managing files and directories, continued

• To copy whole directory trees: cp -pr filename ... destination
• To copy to and from another Linux or Mac OS X system (e.g., from Leonardi to Trentino), use Secure Copy: scp [-p -r] source ... destination
  – Either source or destination (but not both) can contain a remote system identifier followed by a colon: [user@]hostname:
• Can also use rsync or insync: insync [-d] source destination
• Examples: (remember, don’t type in the examples!)

    cp -pr ~z9693022/src/trader-7.6 .
    scp -p ~/file1.txt leonardi:file2.txt
    scp -p [email protected]:src/README .
    mkdir dir1; insync ~/orig dir1
    insync /share/scratch/$USER/data1 $HOME/data1
    insync leonardi:/share/scratch/$USER/data2 .

Managing files and directories, continued

• Try it now:

    cd ~/gsoe9400
    cp -pr ~z9693022/src/trader-7.6 .; ls
    cd trader-7.6; pwd
    cat build-aux/bootstrap
    ls */*.c
    rm */*.c; ls */*.c             # What is the exact output of ls?
    insync ~z9693022/src/trader-7.6 .
    mkdir ../new; cp src/trader.c ../new
    cd ../new; ls
    mv trader.c new.c; rm new.c
    cp -p ../trader-7.6/src/trader.* .
    cp trader.c new.c
    ls -l trader.c new.c           # What is the difference between the listings?
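The final question turns on the “last modified” time-stamp. The sketch below reproduces the effect in a scratch directory: it backdates a file to an arbitrary past date (chosen here only for illustration), copies it with and without -p, and compares the results with the shell's -nt (“newer than”) test.

```shell
#!/bin/bash
demo=$(mktemp -d)
cd "$demo"

# Create a file and backdate its "last modified" time (arbitrary past date).
echo 'int main(void) { return 0; }' > trader.c
touch -t 201505240759 trader.c

cp trader.c plain.c          # new time-stamp: the time of the copy
cp -p trader.c preserved.c   # keeps the original time-stamp

ls -l trader.c plain.c preserved.c

# "-nt" is true when the first file was modified more recently than the second.
if [ plain.c -nt trader.c ]; then plain_is_newer=yes; else plain_is_newer=no; fi
if [ preserved.c -nt trader.c ]; then preserved_is_newer=yes; else preserved_is_newer=no; fi
echo "plain newer: $plain_is_newer, preserved newer: $preserved_is_newer"
```

The contents and sizes of all three files are identical; only the dates in the long listing differ.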

Transferring files

• To copy files to another Linux or Mac OS X system: use scp, rsync or insync (you will find insync in /usr/local/bin)
• To copy files to and from a Windows machine: use WinSCP or scp, rsync or insync under Cygwin
• Try it now:
  – Start WinSCP (Start » All Programs » WinSCP » WinSCP)
    o Host name: newton.mech.unsw.edu.au
    o RSA2 fingerprint: 69:7e:64:75:57:67:ad:4c:21:8e:90:7d:8e:97:70:ce
    o User name: your zID; Password: your zPass
  – Copy ~/gsoe9400/new/new.c to the Windows desktop
  – Rename it to newnew.c (using the usual Windows right-click or F2)
  – Copy it back
  – Under PuTTY: ls newnew.c

More Linux commands

• What machine am I on? hostname
• What is the date and time? date
• Who is logged in? who
• But who is user z1234567? finger [username ...]
• What is the user name for someone? finger part-of-name
• Which files contain a particular string? grep 'pattern' filename ...
• What is the difference between two files? diff [-u] file1 file2
• How do I rename multiple files at once? rename or prename
• Where is a file named filename? find dir ... -name filename
• How big is a file or directory? du -h [filename ...]
• How much space is available in a directory? df -h [dir ...]
• How much disk quota do I have? quota -s
  – “Blocks” is how many disk blocks you are using, in chunks of 1 kB
  – On Newton: “limit” is 10240M = 10 GB

Redirecting input and output

• The terminal is treated as just another file (/dev/tty); use CTRL-D to signify the end of file
• Other special files: /dev/null (an empty file), /dev/zero (an infinite number of binary zeros: can use up your quota in a hurry!)
• Input and output from a program can be redirected to a file or even piped to another program
• To redirect output to filename, use “>filename”
• To append output to filename, use “>>filename”
• To redirect input from filename, use “<filename”
• To connect the output from one program to the input of another (pipes), use “program1 | program2”
• Multiple pipes are allowed: “program1 | program2 | ... | programn”
• Many utility programs are designed to be used in this way, as filters
• Output can be substituted into a command line: $(commandline)

Redirecting input and output, continued

• Try it now:

    cd ~/gsoe9400/trader-7.6
    ls > ../dir-list1
    cat ../dir-list1
    cat ../dir-list1 | wc -l                  # How many lines in ../dir-list1?
    ls ~/gsoe9400/trader-7.6 | wc -l          # Same as above
    rm ../dir-list1
    ls -l | grep May                          # How many files were last modified in May?
    ls -l | grep May | sort -nk5              # Same, but sort by file size (5th field)
    who | awk '{print $1}'                    # Just list first field of “who” output
    finger $(who | awk '{print $1}')          # Full details of who is logged in
    finger $(who | awk '{print $1}') | less   # One page at a time
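The same redirection operators can be exercised away from the cluster. This is a minimal sketch in a scratch directory; the filenames are hypothetical, and the listing is written to the parent directory (as in the exercise) so that it does not appear in its own output.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir "$demo/work"
cd "$demo/work"
touch alpha.txt beta.txt gamma.c

ls > ../dir-list1                  # redirect output to a file (overwrites it)
echo "extra line" >> ../dir-list1  # append to the same file
lines=$(wc -l < ../dir-list1)      # redirect input from a file

# A pipeline: list the files, keep names ending in ".txt", count them.
txt_count=$(ls | grep '\.txt$' | wc -l)

# Command substitution and arithmetic inside another command line.
echo "dir-list1 has $(( lines )) lines; $(( txt_count )) of the files are .txt"
```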

Simple scripting

• Shell scripts are just files containing a list of commands to be executed
• First line (“magic identifier”) must be #!/bin/bash
• Comments are introduced with “#”
• The script file must be made executable: chmod a+x filename
• Variables:
  – To set a variable, use varname=value (no spaces!)
  – To use a variable, use $varname or ${varname}
  – Variable names start with a letter, may contain letters, numbers and “_”
  – Variable names are case-sensitive (as with most things Linux)
• Functions (parameters are accessed using $1, $2, ...):

    funcname () {
        body of function
    }

Simple scripting, continued

• For loops:

    for varname in list ...; do
        process using ${varname}
    done

• Control statements (multiple “elif” allowed; “elif” and “else” clauses are optional):

    if [ comparison ]; then
        if-true statements
    elif [ second-comparison ]; then
        if-second-true statements
    else
        if-false statements
    fi

• Example of comparisons: string1 = string2 (is equal)
  – See the manual page for test (“man test”) for more information

Simple scripting, continued

• While loops:

    while [ comparison ]; do
        while-true statements
    done

• Until loops:

    until [ comparison ]; do
        while-false statements
    done

• Many, many other programming features available!
• Read the reference and manual pages: pinfo bash; man bash
• Some books:
  – William E. Shotts Jr., The Linux Command Line, No Starch Press, January 2012. ISBN 9781593273897, 9781593274269
  – Cameron Newham, Learning the bash Shell, 3rd Edition, O’Reilly Media, March 2005. ISBN 9780596009656, 9780596158965
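The function, loop and control-statement templates from the scripting slides can be combined into one small runnable script. This is only a sketch: the function name classify and its thresholds are invented for the example, and the numeric comparisons -lt and -eq are among those described in “man test”.

```shell
#!/bin/bash
# classify: report whether the first parameter ($1) is "low", "mid" or "high".
classify () {
    if [ "$1" -lt 3 ]; then
        echo low
    elif [ "$1" -lt 6 ]; then
        echo mid
    else
        echo high
    fi
}

# For loop over a list produced by brace expansion (1, 4, 7).
summary=""
for n in {1..7..3}; do
    summary="$summary$n=$(classify "$n") "
done
echo "$summary"

# While loop: count up while the comparison is true.
i=0
while [ "$i" -lt 3 ]; do
    i=$(( i + 1 ))
done

# Until loop: count down until the comparison becomes true.
j=3
until [ "$j" -eq 0 ]; do
    j=$(( j - 1 ))
done
echo "i=$i j=$j"
```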

Editing files under Linux

• Use an editor to edit text files
• Many choices, leading to “religious wars”!
• Some options: GNU Emacs, Vim, Nano
• Nano is very simple to use: nano filename
  – CTRL-X to exit (you will be asked to save any changes)
• GNU Emacs and Vim are highly customisable and programmable
  – For example, see the file ~z9693022/.emacs
  – Debra Cameron et al., Learning GNU Emacs, 3rd Edition, O’Reilly Media, December 2004. ISBN 9780596006488, 9780596104184
  – Arnold Robbins et al., Learning the vi and Vim Editors, 7th Edition, O’Reilly Media, July 2008. ISBN 9780596529833, 9780596159351
• Try it now:

    cd ~/gsoe9400; nano script1

Creating a simple script file

• Try it now, continued: Enter the following text:

    #!/bin/bash
    # How much disk quota am I using?
    # (We want only the last line of "quota" output:
    # use the "tail" utility, specifying 1 (one) line)
    quota_output=$(quota | tail -n 1)
    blocks_used=$(echo $quota_output | awk '{print $1}')
    blocks_limit=$(echo $quota_output | awk '{print $3}')
    percent=$(( $blocks_used * 100 / $blocks_limit ))

    echo "I am using $blocks_used blocks (${percent}%)"

• Press CTRL-X to save the file and exit the editor (follow the prompts), then:

    chmod a+x ./script1            # Make the script executable
    ./script1                      # Execute the script! (Note the use of “./”)
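The calculation in script1 can be tried without a quota system by wrapping it in a function and feeding it a sample line. The sketch below is one possible variation, not the course's script: the function name usage_percent is hypothetical, the sample figures are made up, and the two guards (a trailing “*” on the block count and a limit of zero) are assumptions about what quota output might contain.

```shell
#!/bin/bash
# usage_percent: given one line of figures in the layout script1 expects
# (blocks used, quota, limit, ...), print the percentage of the limit in use.
usage_percent () {
    local used limit
    used=$(echo "$1" | awk '{print $1}')
    limit=$(echo "$1" | awk '{print $3}')
    used=${used%\*}                 # drop a trailing "*" marker, if present
    if [ "$limit" -eq 0 ]; then     # a limit of 0 would cause division by zero
        echo "no limit"
    else
        echo $(( used * 100 / limit ))
    fi
}

# Made-up sample line, not real output from any cluster.
sample="2621440 10485760 10485760 1234 0 0"
echo "I am using $(usage_percent "$sample")% of my limit"
```

Shell arithmetic is integer-only, so the percentage is rounded down.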

Creating a script with loops

• Try it now:
  – Create and run the file script2, containing the following. What is the output? (Hint: remember “chmod a+x ./script2; ./script2”)

    #!/bin/bash
    module load matlab/2015a
    for n in {01..10}; do
        echo "n = $n;" >script${n}.m
        echo "sqrtn = sqrt(n);" >>script${n}.m
        echo "save('data${n}.txt', 'sqrtn', '-ascii');" \
            >>script${n}.m
        echo "quit" >>script${n}.m
        matlab -nojvm -r script${n} >/dev/null
        cat data${n}.txt
    done

Applications on the cluster

• Applications are managed using the module system
• Applications are stored in /share/apps
• Module files are stored in /share/apps/Modules
• Module files set shell environment variables such as PATH
• PATH controls where applications are searched (the search path)
  – Try it now: echo $PATH
• To see all available applications: module avail
• To see currently loaded applications: module list
• To load an application: module load application[/version]
• To unload an application: module unload application[/version]
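To see why changing PATH makes an application available, the effect can be imitated by hand. In this sketch the application myapp-demo and its directory are invented for the demonstration; prepending its directory to PATH is roughly what loading a module file does, although real module files may set other variables as well.

```shell
#!/bin/bash
# Create a hypothetical application directory containing one executable.
appdir=$(mktemp -d)/myapp-demo/1.0/bin
mkdir -p "$appdir"
printf '#!/bin/bash\necho "hello from myapp-demo"\n' > "$appdir/myapp-demo"
chmod a+x "$appdir/myapp-demo"

# Before: the shell cannot find "myapp-demo" anywhere on the search path.
command -v myapp-demo >/dev/null || before="not found"

# Prepend the directory to the search path.
PATH="$appdir:$PATH"
export PATH

after=$(command -v myapp-demo)     # now resolves to the new directory
output=$(myapp-demo)
echo "before: $before; after: $after; output: $output"
```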

Submitting jobs to the cluster

• So far, everything has been run on the head node: a very bad idea!
• To submit a job to the cluster compute nodes:
  – Create a shell script file as per normal
  – Add #SBATCH directives as required directly after “#!/bin/bash” (These look like shell comments, but are interpreted by sbatch)
  – Execute sbatch ./scriptfile
  – Wait for the job to run, checking its status as required
• Common #SBATCH directives (“man sbatch” for full details):
  – #SBATCH --mail-user=email: Send notifications to email address
  – #SBATCH --mail-type=ALL: What notifications to send
  – #SBATCH --time=[days-]hh:mm:ss: How much time is required
  – #SBATCH --mem=memsize: How much memory is required (MB)
  – #SBATCH --ntasks=1: How many sub-jobs (usually 1)
  – #SBATCH --cpus-per-task=n: Request n processors per task
  – #SBATCH --partition=queuename: Which partition (queue) to submit to
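The [days-]hh:mm:ss form of the --time directive is easy to misread, so it can help to convert a value to seconds before submitting. The helper below is a hypothetical illustration and is not part of Slurm; it handles only the format shown above.

```shell
#!/bin/bash
# slurm_time_to_seconds: convert "[days-]hh:mm:ss" to a number of seconds.
# (Hypothetical helper for illustration; it is not part of Slurm.)
slurm_time_to_seconds () {
    local spec=$1 days=0 h m s rest
    case $spec in
        *-*) days=${spec%%-*}; spec=${spec#*-} ;;   # split off the optional days
    esac
    h=${spec%%:*}; rest=${spec#*:}
    m=${rest%%:*}; s=${rest#*:}
    # "10#" forces base 10 so that values such as 08 are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

slurm_time_to_seconds 00:10:00      # ten minutes: prints 600
slurm_time_to_seconds 1-02:03:04    # one day, 2 h, 3 min, 4 s: prints 93784
```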

Checking your job status

• Submit your jobs using “sbatch”
• You will be given a job number to track the job
• By default, script output will go to the file slurm-jobnumber.out
• Check job status: squeue [-l] [-j jobnumber]
• Check partition and node status: sinfo [-l]
• Show which nodes are reserved: sinfo -T
• Show job status graphically: smap or sview (sview requires X11 display)
• Get overall information about the cluster: visit http://hostname/ganglia/
  – e.g., http://newton.mech.unsw.edu.au/ganglia/
  – Currently only available within UNSW
• Try it now: use a web browser to view the Ganglia page for Newton.

Managing your jobs

• To see which nodes exist on the cluster: rocks list host or sinfo -N
• To see jobs belonging to you: squeue -u $USER
• To see when a job will start: squeue --start [-j jobnumber]
• For more detailed information: scontrol show job jobnumber
• To delete a queued job (whether running or not): scancel jobnumber ...
• To place a job on hold: scontrol hold jobnumber
• To release a job currently on hold: scontrol release jobnumber
• To rerun a job (kill it and then restart it): scontrol requeue jobnumber
• To change one or more settings for a job (use with care!):

    scontrol show job jobnumber
    scontrol update jobid=jobnumber param=value ...

Submitting and checking a job

• Try it now:
  – Create and change to the directory ~/gsoe9400/job1:

        mkdir ~/gsoe9400/job1; cd ~/gsoe9400/job1

  – Copy the previously created script file script2:

        cp ../script2 job1

  – Edit the file job1 and add the following lines just after “#!/bin/bash”:

        #SBATCH --mail-user=[email protected]    # Do not replace email address: used to assess you for this class!
        #SBATCH --mail-type=ALL
        #SBATCH --time=00:10:00
        #SBATCH --mem=2048
        #SBATCH --ntasks=1
        #SBATCH --cpus-per-task=1

  – Submit the script: sbatch ./job1
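Once the job is submitted, sbatch normally replies with a line of the form “Submitted batch job” followed by the job number. The sketch below shows one way to capture that number for use with squeue and the output file; the helper name job_id_from and the job number 12345 are made up, and on the cluster the message would come from sbatch itself.

```shell
#!/bin/bash
# job_id_from: extract the job number from sbatch's confirmation message,
# which normally has the form "Submitted batch job <number>".
job_id_from () {
    echo "${1##* }"      # keep everything after the last space
}

# Sample message with a made-up job number; on the cluster this would be
#   message=$(sbatch ./job1)
message="Submitted batch job 12345"
jobid=$(job_id_from "$message")

echo "Check status with:  squeue -j $jobid"
echo "Output will be in:  slurm-$jobid.out"
```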

Conclusion

You have begun your journey to using High Performance Computing clusters effectively.

Well done!

John Zaitseff
[email protected]

Available for consultations on Tuesdays 9:30am–4pm by appointment only.

http://www.engineering.unsw.edu.au/hpc

Image credit: John Zaitseff, UNSW