advanced hpc and linux 20130820
TRANSCRIPT
-
7/26/2019 Advanced Hpc and Linux 20130820
1/70
Advanced High Performance Computing Using Linux
TABLE OF CONTENTS
1.0 Advanced Linux
1.1 Emacs: The Most Advanced Text Editor
1.2 Advanced Scripting
2.0 Advanced HPC
2.1 Computer System Architectures
2.2 Processors and Cores
2.3 Parallel Processing Performance
3.0 Introductory MPI Programming
3.1 The Story of the Message Passing Interface (MPI) and OpenMPI
3.2 Unravelling A Sample MPI Program and OpenMPI Wrappers
3.3 MPI's Introductory Routines
4.0 Intermediate MPI Programming
4.1 MPI Datatypes
4.2 Intermediate Routines
4.3 Collective Communications
4.4 Derived Data Types
4.5 Particle Advector
4.6 Creating A New Communicator
4.7 Profiling Parallel Programs
4.8 Debugging MPI Applications
1.0 Advanced Linux
1.1 Emacs: The Most Advanced Text Editor
In the introductory course, we used nano as our text editor. In the intermediate
course, we used vim. Finally, in this advanced course, we'll provide an
introduction to Emacs, which can be described as the most advanced text editor
and is particularly popular among programmers.
Emacs is one of the oldest continuously developed software applications available, first
written in 1976 by Richard Stallman, founder of the GNU free software
movement. At the time of writing it was up to version 24, with a substantial
number of forks and clones developed during its history.
The big features of Emacs are its extremely high number of built-in commands,
customisation options, and extensions, so extensive that those explored here only begin
to touch the extraordinarily diverse world that is Emacs. Indeed, Eric Raymond notes "[i]t is a common joke, both among fans and detractors of Emacs, to
describe it as an operating system masquerading as an editor".
With extensions, Emacs supports editing LaTeX formatted documents, syntax
highlighting for major programming and scripting languages, a calculator, a
calendar and planner, a web browser, a newsreader and an
email client, and an ftp client. It provides file difference, merging, and version
control, a text-based adventure game, and even a Rogerian psychotherapist.
Try doing that with Notepad!
This all said, Emacs is not easy to learn for beginners. The level of
customisation and the detailed use of meta and control characters does serve as
a barrier to immediate entry.
This tutorial will provide a usable introduction to Emacs.
1.1.1 Starting Emacs
The default Emacs installation on contemporary Linux systems assumes the
use of a graphical user interface. This is obviously not the case with an HPC
system, but for those with a home installation you should be aware that 'emacs
-nw' from the command line will launch the program without the GUI. If you wish
to make this the default you should add it as an alias to .bashrc (e.g., alias
emacs='emacs -nw').
Emacs is launched by simply typing 'emacs' on the command line. Commands are
invoked by a combination of the Control (Ctrl) key and a character key (C-<chr>), or the Meta key (Esc or Alt) and a character key (M-<chr>).
To "break" from a partially entered command, use C-g.
If an Emacs session crashed recently, M-x recover-session can recover the files
that were being edited.
The menu bar can be activated with M-`.
The help files are accessed with C-h and the manual with C-h r.
1.1.3 Files, Buffers, and Windows
Emacs has three main data structures, files, buffers, and windows, which are
essential to understand.
A file is the actual file on disk. Strictly speaking, when using Emacs one does
not actually edit a file. Rather, the file is copied into a buffer, then
edited, and then saved. Buffers can be deleted without deleting the file on disk.
The buffer is a data space within Emacs for editing a copy of the file. Emacs can
handle many buffers simultaneously, the effective limit being the maximum
buffer size, determined by the integer capacity of the processor and memory (e.g.,
for 64-bit machines, this maximum buffer size is 2^61 - 2 bytes). A buffer has a
name, usually derived from the file from which it has copied the data.
A window is the user's view of a buffer. Not all buffers may be visible to the user
at once due to the limits of screen size. A user may split the screen into multiple
windows. Windows can be created and deleted without deleting the buffer associated with the window.
Emacs also has a blank line below the mode line to display messages, and for
input for prompts from Emacs. This is called the minibuffer, or echo area.
1.1.4 Exploring and Entering Text
Cursor keys can be used to move around the text, along with Page Up and Page
Down, if the terminal supports them. However, Emacs aficionados will recommend the
use of the control keys for speed. Common commands include the following; you
may notice a pattern in the command logic:
C-v (move page down), M-v (move page up)
C-p (move to previous line), C-n (move to next line)
C-f (move forward, one character), C-b (move backward, one character)
M-f (move forward, one word), M-b (move backward, one word)
C-a (move to beginning of a line), C-e (move to end of a line)
M-a (move backward, to beginning of a sentence)
M-e (move forward, to end of a sentence)
M-{ (move backward, to beginning of a paragraph), M-} (move forward, to end of a paragraph)
M-< (move to beginning of the text), M-> (move to end of the text)
<backspace> (delete the character just before the cursor)
C-d (delete the character on the cursor)
M-<backspace> (cut the word before the cursor)
M-d (cut the word after the cursor)
C-k (cut from the cursor position to end of line)
M-k (cut to the end of the current sentence)
C-q (prefix command; use when you want to enter a control key into the buffer,
e.g., C-q ESC inserts an Escape)
Like the Page Up and Page Down keys on a standard keyboard, you will discover that Emacs also interprets the Backspace and Delete keys as expected.
A selection can be cut (or 'killed' in Emacs lingo) by marking the beginning of the
selected text with C-SPC (space), moving to the end of the selection with standard cursor
movements, and entering C-w. Text that has been cut can be pasted ('yanked') by
moving the cursor to the appropriate location and entering C-y.
Emacs commands also accept a numeric input for repetition, in the form of C-u,
the number of times the command is to be repeated, followed by the command
(e.g., C-u 8 C-n moves eight lines down the screen).
1.1.5 File Management
There are only three main file manipulation commands that a user needs to
know: how to find a file, how to save a file from a buffer, and how to save all.
The first command is C-x C-f, shorthand for "find-file". First this command
prompts for the name of the file. If the file is already copied into a buffer it will switch to that buffer. If it is not, it will create a new buffer with the name
requested.
For the second command, to save a buffer to a file with the buffer name, use
C-x C-s, shorthand for "save-buffer".
The third command is C-x s. This is shorthand for "save-some-buffers" and will
cycle through each open buffer and prompt the user for an action (save, don't
save, check and maybe save, etc.).
1.1.6 Buffer Management
There are four main commands relating to buffer management that a user needs
to know: how to switch to a buffer, how to list existing buffers, how to kill a
buffer, and how to read a buffer in read-only mode.
To switch to a buffer, use C-x b. This will prompt for a buffer name, and switch
the buffer of the current window to that buffer. It does not change your existing
windows. If you type a new name, it will create a new empty buffer.
To list current active buffers, use C-x C-b. This will provide a new window which
lists current buffers: their names, whether they have been modified, their size, and the
file that they are associated with.
To kill a buffer, use C-x k. This will prompt for the buffer name, and then remove
the data for that buffer from Emacs, with an opportunity to save it. This does not
delete any associated files.
To toggle read-only mode on a buffer, use C-x C-q.
1.1.7 Window Management
Emacs has its own windowing system, consisting of several areas of framed text.
The behaviour is similar to a tiling window manager: none of the windows
overlap with each other.
Commonly used window commands include:
C-x 0 delete the current window
C-x 1 delete all windows except the selected window
C-x 2 split the current window horizontally
C-x 3 split the current window vertically
C-x ^ make selected window taller
C-x } make selected window wider
C-x { make selected window narrower
C-x + make all windows the same height
A common use is to bring up other documents or menus. For example, with the key sequence C-h one usually calls for help files. If this is followed by k, it will
open a new vertical window, and with C-f, it will display the help information for
the command C-f (i.e., C-h k C-f). This new window can be closed with C-x 1.
1.1.8 Kill and Yank, Search/Replace, Undo
Emacs is notable for having a "very large" undo sequence, limited by system
resources rather than application resources. This undo sequence is invoked with
C-_ (control underscore), or with C-x u. However, it has a special feature: by
engaging in a simple navigation command (e.g., C-f), the undo action is pushed to
the top of the stack, and therefore the user can undo an undo command.
1.1.9 Other Features
Emacs can make it easier to read C and C++ code by colour-coding such files,
through the ~/.emacs configuration file, by adding (global-font-lock-mode t).
Programmers also find the feature of being able to run the GNU debugger (GDB)
from within Emacs useful as well. The command M-x gdb will start up gdb. If there's a breakpoint, Emacs automatically pulls up the appropriate source file, which gives
a better context than the standard GDB.
1.2 Advanced Scripting
Good knowledge of scripting is required for any advanced Linux user, and
especially for those who find that they have regular tasks, such as the processing of
data through a program. Shell scripting is not terribly difficult, although
sometimes some austere syntax bugs may prove frustrating; the machine is
just doing what you asked it to. Despite their often underrated utility, shell
scripts are not the answer to everything. They are not great at resource-intensive
tasks (e.g., extensive file operations) where speed is important. They are not
recommended for heavy-duty maths operations (use C, C++, or Fortran instead).
Nor are they recommended in situations where complex data structures, multi-dimensional
arrays (it's not a database!) and port/socket I/O are important.
In the Intermediate course, we looked at scripting in reference to regular
expression utilities, such as sed, and the programming language awk, along with
some simple examples of using Linux command invocations as variables in a
backup script; some sample "for", "while", "do/done" and "until" loops; along with
simple, optional, ladder, and nested conditionals using "if", "then", "else",
"elif" and "fi"; the use of "break" and "continue"; the "case" conditional; and
"select" for user input. The implementation of these into PBS job submission
scripts was also illustrated. In this Advanced course we will revisit these
concepts but with more sophisticated and complex examples. In addition, there
will be a close look at internal commands and filters, process substitution,
functions, arrays, and debugging.
1.2.1 Scripts With Variables
The simplest script is simply one that runs a list of system commands. At least
this saves the time of retyping the sequence each time it is used, and reduces the
possibility of error. For example, in the Intermediate course, the following script
was recommended to calculate the disk use in a directory. It's a good script, very
handy, but how often would you want to type it? Instead, enter it once and
keep it. You will recall, of course, that a script starts with an invocation of the
shell, followed by commands.
emacs diskuse.sh
#!/bin/bash
du -sk * | sort -nr | cut -f2 | xargs -d "\n" du -sh > diskuse.txt
C-x C-c, y for save
chmod +x diskuse.sh
As described in the Intermediate course, the script runs a disk usage in summary,
sorts in order of size and exports to the file diskuse.txt. The "\n" is so that
filenames containing spaces are handled correctly.
Making the script a little more complex, variables are usually better than hard-coded values. There are two potential variables in this script: the wildcard '*' and
the exported filename "diskuse.txt". In the former case, we'll keep the wildcard
as it allows a certain portability; the script can run in any directory it is
invoked from. For the latter case, however, we'll use the date command so that a
history of disk use can be created which can be reviewed for changes. It's also
good practice to alert the user when the script is completed and, although it is
not strictly necessary, it is also good practice to cleanly finish any script with
'exit'.
emacs diskuse.sh
#!/bin/bash
DU=diskuse$(date +%Y%m%d).txt
du -sk * | sort -nr | cut -f2 | xargs -d "\n" du -sh > $DU
echo "Disk summary completed and sorted."
exit
C-x C-c, y for save
1.2.2 Variables and Conditionals
Another example is a script with conditionals as well as variables. A common
conditional, and one sadly often forgotten, is whether or not a script has the requisite
files for input and output specified. If an input file is not specified, a script that
performs an action on the file will simply go idle and never complete. If an
output file is hard-coded, then the person running the script runs the risk of
overwriting a file with the same name, which could be a disaster.
The following script searches through any specified text file for text before and
after the ubiquitous email "@" symbol and outputs these as a csv file through use
of grep, sed, and sort (for neatness). If the input or the output file is not
specified, it exits after echoing the error.
emacs findemails.sh
#!/bin/bash
# Search for email addresses in file, extract, turn into csv with designated file name
INPUT=${1}
OUTPUT=${2}
if [ ! -f "$INPUT" ] || [ -z "$OUTPUT" ]; then
echo "Input file not found, or output file not specified. Exiting script."
exit 1
fi
grep --only-matching -E '[.[:alnum:]]+@[.[:alnum:]]+' $INPUT > $OUTPUT
sed -i 's/$/,/g' $OUTPUT
sort -u $OUTPUT -o $OUTPUT
sed -i '{:q;N;s/\n/ /g;t q}' $OUTPUT
echo "Data file extracted to" $OUTPUT
exit
C-x C-c, y for save
chmod +x findemails.sh
Test this file with hidden.txt as the input text and found.csv as the output text.
The output will include a final comma on the last line, but this is potentially useful
if one wants to run the script with several input files and append to the same
output file (simply change the single redirection in the grep statement to a
double, appended, redirection).
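As a sketch of that change, using hypothetical input files in1.txt and in2.txt, appending with ">>" collects matches from several inputs into one output file:

```shell
#!/bin/sh
# Hypothetical input files for illustration.
printf 'mail alice@example.com today\n' > in1.txt
printf 'and bob@example.org as well\n' > in2.txt
: > combined.csv    # start with an empty output file
for f in in1.txt in2.txt; do
    # '>>' appends, where '>' would truncate on each iteration
    grep --only-matching -E '[.[:alnum:]]+@[.[:alnum:]]+' "$f" >> combined.csv
done
cat combined.csv
```

Each pass through the loop adds its matches to the end of combined.csv rather than overwriting the previous results.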
A serious weakness of the script (so far) is that it will gather any string with the "@" symbol in it, regardless of whether it's a well-formed email address or not. So
it's not quite suitable for screen-scraping Usenet for email addresses to turn into a
spammers list. But it's getting close.
1.2.3 Reads
The read command simply reads a line from standard input. By applying the -n
option it can read a number of characters rather than a whole line, so -n1 is
"read a single character". The -r option reads the input as raw input,
so that the backslash key (for example) doesn't act like a newline escape
character, and the -p option displays a prompt. In addition, a -t timeout-in-seconds
option can also be added. Combined, these can be used to the effect of "press any key to
continue", with a limited timeframe.
Add the following to findemails.sh at the end of the file.
emacs findemails.sh
#!/bin/bash
# Search for email addresses in file, extract, turn into csv with designated file name
..
..
read -t5 -n1 -r -p "Press any key to see the list, sorted and with unique records..."
if [ $? -eq 0 ]; then
echo A key was pressed.
else
echo No key was pressed.
exit 0
fi
less $OUTPUT | \
# Output file, piped through sort and uniq.
sort | uniq
exit
C-x C-c, y for save
1.2.4 Special Characters
Scripts essentially consist of commands, keywords, and special characters.
Special characters have meaning beyond their literal meaning (a meta-meaning,
if you like). Comments are the most common example of a special meaning.
Any text following a # (with the exception of #!) is a comment and will not be
executed. Comments may begin at the beginning of a line, following whitespace,
or following the end of a command, and may even be embedded within a piped command
(as in the example in section 1.2.3 above).
A comment ends at the end of the line, and as a result a command may not follow
a comment on the same line. A quoted or an escaped # in an echo statement
does not begin a comment.
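A minimal sketch of these comment rules:

```shell
#!/bin/sh
# A full-line comment: nothing on this line is executed.
echo "first"   # a comment may follow a command
echo "The # in this string is quoted, so it is printed"
echo The \# here is escaped, so it is also printed
```

Both of the last two lines print their # rather than discarding the rest of the line.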
Another special character is the command separator, a semicolon, which
is used to permit two or more commands on the same line. This is already shown
by the various tests in the script (e.g., if [ ! -f "$INPUT" ] || [ -z "$OUTPUT" ]; then and if [ $? -eq 0 ]; then). Note the space after the semicolon. In contrast, a double semicolon (;;)
represents a terminator in a case option, which was encountered in the extract
script in the Intermediate course.
..
case $1 in
*.tar.bz2) tar xvjf $1 ;;
*.tar.gz) tar xvzf $1 ;;
*.bz2) bunzip2 $1 ;;
..
..
esac
In contrast, the colon acts as a null command. Whilst this obviously has a variety
of uses (e.g., an alternative to the touch command), a really practical advantage
is that it comes with a true exit status, and as such it can be used as a
placeholder in if/then tests. An example from the Intermediate course:
for i in *.plot.dat; do
if [ -f $i.tmp ]; then
: # do nothing and exit if-then
else
touch $i.tmp
fi
done
The use of the null command as the test at the beginning of a loop will cause it to
run endlessly (e.g., while :; do ... done), as the null command always
evaluates as true. Note that the colon is also used as a field separator in
/etc/passwd and in the $PATH variable.
A dot (.) has multiple special character uses. As a command it sources a
filename, importing the code into a script, rather like the #include directive in a
C program. This is very useful in situations where multiple scripts use a common
data file, for example (e.g., . hidden.txt). As part of a filename, of course, as was
shown in the Introductory course, the . represents the current working directory
(e.g., cp -r /path/to/directory/ ., and of course .. for the parent directory). A third
use for the dot is in regular expressions, matching one character per dot. A final
use is multiple dots in sequence in a loop, e.g.:
for a in {1..10}
do
echo -n "$a "
done
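The sourcing use of the dot command can be sketched as follows; the file name common.sh and the variable it defines are hypothetical examples:

```shell
#!/bin/sh
# Write a small file of shared definitions, then import it with '.'
printf 'GREETING="hello from a sourced file"\n' > common.sh
. ./common.sh    # runs common.sh in the current shell, like #include
echo "$GREETING"
```

Because the sourced file runs in the current shell, its variables and functions remain available after the dot command returns.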
Like the dot, the comma operator has multiple uses. Usually it is used to link
multiple arithmetic calculations. This is typically used in for loops, with a C-like
syntax, e.g.:
for ((a=1, b=1; a <= 5; a++, b++))
do
echo "$a $b"
done
A double-quote on a value does not change variable substitution. This is
sometimes referred to as "weak quoting". Using single quotes, however, means the
variable is used literally, with no substitution taking place. This is often referred to as
"strong quoting". For example, a strict single-quoted directory listing of ls with a
wildcard will only provide files that are expressed by the symbol itself (which isn't a
very good file name); compare ls * with ls '*'. This example will also work with
double quotes, and indeed double quotes are generally preferable, as they prevent
reinterpretation of all special characters except $, `, and \. These are usually the
symbols which are wanted in their interpreted mode. As the escape character
has a literal interpretation with single quotes, enclosing a single quote within
single quotes will not work as expected.
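A short sketch of the difference between weak and strong quoting:

```shell
#!/bin/sh
# Double quotes permit substitution; single quotes suppress it.
name="world"
echo "Hello, $name"   # weak quoting: prints Hello, world
echo 'Hello, $name'   # strong quoting: prints Hello, $name literally
```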
"elated to uoting is the use of the bac*slash 8X: used to escape single
characters. o not confuse it with the forward slash 8Q: has multiple uses as both
the separator in pathnames 8e.g., 8QhomeQtrainJ1:, but also a the division
operator.
In some scripts backticks (`) are used for command substitution, where the
output of a command can be assigned to a variable. Whilst this is not the POSIX
standard, it does exist for historical reasons. Nesting commands with backticks
also requires escape characters; the deeper the nesting, the more escape
characters required (e.g., echo `echo \`echo \\\`pwd\\\`\``). The preferred and
POSIX standard method is to use the dollar sign and parentheses, e.g., echo
"Hello, $(whoami)." rather than echo "Hello, `whoami`."
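A sketch contrasting the two command substitution styles; note that $( ) nests without any escape characters:

```shell
#!/bin/sh
# Historical backtick style.
user=`whoami`
echo "Hello, $user."
# POSIX $( ) style, nested without any escaping.
echo "Current directory name: $(basename "$(pwd)")"
```

The $( ) form is also easier to read, since the opening and closing delimiters are distinct.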
2.0 Advanced HPC
2.1 Computer System Architectures
As explained in the first, introductory, course, "high-performance computing
(HPC) is the use of supercomputers and clusters to solve advanced computation
problems". All supercomputers ("a nebulous term for a computer that is at the
frontline of current processing capacity") in contemporary times use parallel
computing, "the submission of jobs or processes over one or more processors
and by splitting up the task between them".
It is possible to illustrate the degree of parallelisation by using Flynn's Taxonomy
of Computer Systems (1966), where each process is considered as the execution
of a pool of instructions (instruction stream) on a pool of data (data stream).
From this complex there are four basic possibilities:
"ing%e &nstruction "tream6 "ing%e
ata "tream '"&"(
"ing%e &nstruction "tream6 u%tip%e
ata "treams '"&(
u%tip%e &nstruction "treams6 "ing%e
ata "tream '&"(
u%tip%e &nstruction "treams6
u%tip%e ata "treams '&(
2.1.1 Single Instruction Stream, Single Data Stream (SISD)
(Image from Oracle Essentials, 4th edition, O'Reilly Media, 2007)
This is the simplest and, until recently, the most common processor
architecture on desktop computer systems. Also known as a
uniprocessor system, it offers a single instruction stream and a single
data stream. Uniprocessors could, however, simulate or include
concurrency through a number of different methods:
a) It is possible for a uniprocessor system to run processes
concurrently by switching between one and another.
b) Superscalar instruction-level parallelism can be used on uniprocessors. More
than one instruction during a clock cycle is simultaneously dispatched to
different functional units on the processor.
c) Instruction prefetch, where an instruction is requested from main memory
before it is actually needed and placed in a cache. This often also includes a
prediction algorithm of what the instruction will be.
d) Pipelines, on the instruction level or the graphics level, can also serve as an
example of concurrent activity. An instruction pipeline (e.g., RISC) allows
multiple instructions on the same circuitry by dividing the task into stages. A
graphics pipeline implements different stages of rendering operations to
different arithmetic units.
2.1.2 Single Instruction Stream, Multiple Data Streams (SIMD)
SIMD architecture represents a situation where a single processor performs the
same instruction on multiple data streams. This commonly occurs in
contemporary multimedia processors: for example, the MMX instruction set from the
1990s, which led to Motorola's PowerPC AltiVec, and in more contemporary times the
AVX (Advanced Vector Extensions) instruction set used in Intel Sandy Bridge
processors and AMD's Bulldozer processor. These developments have primarily
been orientated towards real-time graphics, using short vectors. Contemporary
supercomputers are invariably MIMD clusters which can implement short-vector
SIMD instructions.
SIMD was also used especially in the 1970s, notably on the various Cray
systems. For example, the Cray-1 (1976) had eight "vector registers", which held
sixty-four 64-bit words each (long vectors), with instructions applied to the
registers. Pipeline parallelism was used to implement vector instructions, with
separate pipelines for different instructions, which themselves could be run in
batch and pipelined (vector chaining). As a result the Cray-1 could reach a peak
performance of 240 mflops, extraordinary for the day, and even acceptable in
the early 2000s.
SIMD is also known as vector processing or data parallelism, in comparison to a
regular SISD CPU which operates on scalars. SIMD lines up a row of scalar data
(of uniform type) as a vector and operates on it as a unit. For example, inverting
an RGB picture to produce its negative, or altering its brightness, etc. Without
SIMD each pixel would have to be fetched to memory, the instruction applied to
it, and then returned. With SIMD the same instruction is applied to all the data,
depending on the availability of cores, i.e., get n pixels, apply instruction, return.
The main disadvantage of SIMD, within the limitations of the process itself, is
that it requires additional registers, power consumption, and heat.
2.1.3 Multiple Instruction Streams, Single Data Stream (MISD)
Multiple Instruction, Single Data (MISD) occurs when different operations are
performed on the same data. This is quite rare and indeed debatable, as it is
reasonable to claim that once an instruction has been performed on the data, it's
not the same data anymore. If one doesn't take this definition, and allows for a
variety of instructions to be applied to the same data which can change, then
various pipeline architectures can be considered MISD.
Systolic arrays are another form of MISD. They are different from pipelines because
they have a non-linear array structure and multi-directional data flow, and
each processing element may even have its own local memory. In this situation a
matrix pipe network arrangement of processing units computes data and stores it
independently of each other. Matrix multiplication is an example of such an array
in algorithmic form, where one matrix is introduced one row at a time from
the top of the array, whilst another matrix is introduced one column at a time.
MISD machines are rare; the Cisco PXF processor is an example. They can be
fast and scalable, as they do operate in parallel, but they are really difficult to
build.
2.1.4 Multiple Instruction Streams, Multiple Data Streams (MIMD)
Multiple Instruction, Multiple Data (MIMD) machines have independent and asynchronous
processes that can operate on a number of different data streams. They are now
the mainstream in contemporary computer systems and thus can be further
differentiated between multiprocessor computers and their extension,
multicomputer multiprocessors. As the name clearly indicates, the former refers
to single machines which have multiple processors and the latter to a cluster of
these machines acting as a single entity.
Multiprocessor systems can be differentiated between shared memory and
distributed memory systems. Shared memory systems have all processors connected to a
single pool of global memory (whether by hardware or by software). This may be
easier to program, but it is harder to achieve scalability. Such an architecture is
quite common in single-system-unit multiprocessor machines.
With distributed memory systems, each processor has its own memory. Finally,
another combination is distributed shared memory, where the (physically
separate) memories can be addressed as one (logically shared) address space. A
variant combined method is to have shared memory within each multiprocessor
node, and distributed memory between them.
2.2 Processors and Cores
2.2.1 Uni- and Multi-Processors
A further distinction needs to be made between processors and cores. A
processor is a physical device that accepts data as input and provides results as
output. A uniprocessor system has one such device, although the definitions can
become ambiguous. In some uniprocessor systems it is possible that there is
more than one, but the entities engage in separate functions. For example, a
computer system that has one central processing unit may also have a co-processor for mathematical functions and a graphics processor on a separate card.
Is that system uniprocessor? Arguably not, as the co-processor will be seen as
belonging to the same entity as the CPU, while the graphics processor will have
different memory and system I/O and will be dealing with different peripherals. In
contrast, a multiprocessor system does share memory, system I/O, and
peripherals. But then the debate becomes murky with the distinction between
shared and distributed memory discussed above.
2.2.2 Uni- and Multi-Core
In addition to the distinction between uniprocessor and multiprocessor, there is
also the distinction between unicore and multicore processors. A unicore
processor carries out the usual functions of a CPU, according to the instruction
set: data handling instructions (set register values, move data, read and write),
arithmetic and logic functions (add, subtract, multiply, divide, bitwise operations
for conjunction and disjunction, negate, compare), and control-flow functions
(conditionally branch to another section of a program, indirectly branch and
return). A multicore processor carries out the same functions, but with
independent central processing units (note the lower case) called 'cores'.
Manufacturers integrate the multiple cores onto a single integrated circuit die, or
onto multiple dies in a single chip package.
In terms of theoretical architecture, a uniprocessor system could be multicore,
and a multiprocessor system could be unicore. In practice, the most common
contemporary architecture is multiprocessor and multicore. The number of cores
is represented by a prefix. For example, a dual-core processor has two cores
(e.g., AMD Phenom II X2, Intel Core Duo), a quad-core processor contains four cores
(e.g., AMD Phenom II X4, Intel i3, i5, and i7), a hexa-core processor
contains six cores (e.g., AMD Phenom II X6, Intel Core i7 Extreme Edition 980X),
and an octo-core processor contains eight cores (e.g., Intel Xeon E7-2820, AMD FX-8350), etc.
2.2.3 Uni- and Multi-Threading
In addition to the distinctions between processors and cores, whether uni or
multi, there is also the question of threads. An execution thread is the smallest
processing unit in an operating system. A thread is typically contained inside a
process, and multiple threads can exist within the same process and share resources.
On a uniprocessor, multithreading generally occurs by time-division multiplexing,
with the processor switching between the different threads, which may give the
appearance that the tasks are happening at the same time. On a multiprocessor
or multicore system, threads become truly concurrent, with every processor or
core executing a separate thread simultaneously.
2.2.$ "h9 4s 4t A Mlticore Ftre
Ideally, don't we want clusters of multicore multiprocessors with multithreaded
instructions? Of course we do; but think of the heat that this generates, and think of
the potential for race conditions (e.g., deadlocks, data integrity issues, resource
conflicts, interleaved execution issues).
These are all fundamental problems in computer architecture.
One of the reasons that multicore multiprocessor clusters have become popular
is that clock rate has pretty much stalled. Apart from the physical reasons, it is
uneconomical: it is simply not worth the cost of increasing the frequency of the clock
rate in terms of the power consumed and the heat dissipated. Intel calls the
rate/heat trade-off a "fundamental theorem of multicore processors".
%ew multicore systems are being developed all the time. &sing "I#7 76&s,
+ilera released !)core processors in (JJ9 and in (JJ9, a one hundred core
processor. In (J1( +ilera founder, r. garwal, is leading a new =I+ effort
dubbed +he ngstrom 6ro0ect. It is one of four "6funded efforts aimed at
building exascale supercomputers. +he goal is to design a chip with 1,JJJ cores.
2.3 Parallel Processing Performance
2.3.1 Speedup and Locks
Parallel programming and multicore systems should mean better performance. This
can be expressed as a ratio called speedup:

Speedup (p) = Time (serial) / Time (parallel)

This varies with the number of processors, S = T(1)/T(p), where T(p) represents
the execution time taken by the program running on p processors, and T(1)
represents the time taken by the best serial implementation of the application
measured on one processor.
Linear, or ideal, speedup is S(p) = p; for example, doubling the number of
processors results in double the speedup.
However, parallel programming is hard. More complexity means more bugs.
Correctness in parallelisation requires synchronisation (locking).
Synchronisation and atomic operations cause loss of performance and
communication latency. A probable issue in parallel computing is deadlock,
where two or more competing actions are each waiting for the other to finish,
and thus neither ever does. An apocryphal story of a Kansas railroad statute
vividly illustrates the problem of a deadlock:
"When two trains approach each other at a crossing, both shall come to a full
stop and neither shall start up again until the other has gone."
(A similar example is a livelock; the states of the processes involved in the
livelock constantly change with regard to one another, none progressing.)
Locks are currently inserted manually in typical programming languages; without
locks, programs can be put in an inconsistent state. Multiple locks in
different places and orders can lead to deadlocks. Manual lock insertion is
error-prone, tedious, and difficult to maintain. Does the programmer know what
parts of a program will benefit from parallelisation? To ensure that parallel
execution is safe, a task's effects must not interfere with the execution of
another task.
2.3.2 Amdahl's Law
where each pixel is rendered independently. Such tasks are often called
"pleasingly parallel". To give an example using the R programming language, the
SNOW (Simple Network of Workstations) package allows for embarrassingly
parallel computations (yes, we have this installed).
Whilst originally expressed by Gene Amdahl in 1967, it wasn't until over twenty
years later, in 1988, that an alternative by John L. Gustafson and Edwin H.
Barsis was proposed. Gustafson noted that Amdahl's Law assumed a computation
problem of fixed data set size. Gustafson and Barsis observed that programmers
tend to set the size of their computational problems according to the available
equipment; therefore, as faster and more parallel equipment becomes available,
larger problems can be solved. Thus scaled speedup occurs: although Amdahl's
law is correct in a fixed sense, it can be circumvented in practice by
increasing the scale of the problem.
If the problem size is allowed to grow with P, then the sequential fraction of
the workload becomes less and less important. A common metaphor is based on
driving (computation), time, and distance (computational task). In Amdahl's
Law, if a car has been travelling 40 km/h and needs to reach a point 80 km from
the point of origin, no matter how fast the vehicle travels it can only reach a
maximum of an 80 km/h average before reaching the 80 km point, even if it
travelled at infinite speed, as the first hour has already passed. With the
Gustafson-Barsis Law, it doesn't matter that the first hour has been at a
plodding 40 km/h; the average can be increased indefinitely given enough time
and distance. Just make the problem bigger!
Image from Wikipedia
3.0 Introductory MPI Programming
3.1 The Story of the Message Passing Interface (MPI) and OpenMPI
The Message Passing Interface (MPI) is a widely used standard, initially
designed by academia and industry starting in 1991, to run on parallel
computers. The goal of the group was to ensure source-code portability, and as
a result they have a standard that defines an interface and specific
functionality. As a standard, syntax and semantics are defined for core library
routines which allow programmers to write message-passing programs in Fortran
or C.
Some implementations of these core library routine specifications are available
as free and open-source software, such as Open MPI. Open MPI combined three
previous well-known implementations, namely FT-MPI from the University of
Tennessee, LA-MPI from Los Alamos National Laboratory, and LAM/MPI from
Indiana University, each of which excelled in particular areas, with additional
contributions from the PACX-MPI team at the University of Stuttgart. Open MPI
combines the quality peer review of a scientific free and open-source software
project, and has been used in many of the world's top-ranking supercomputers.
Major milestones in the development of MPI include the following:
* 1991: Decision to initiate Standards for Message Passing in a Distributed
Memory Environment.
* 1992: Workshop on the above held.
* 1992: Preliminary draft specification released for MPI.
* 1994: MPI-1. A specification, not an implementation; a library, not a
language. Designed for C and Fortran 77.
* 1997: MPI-2. Extends the message-passing model to include parallel I/O,
includes C++/Fortran 90, interaction with threads, and more.
* 2008: MPI Forum reconvened; MPI-3 development.
* The standard utilised in this course is MPI-2.
The message passing paradigm, as it is called, is attractive as it is portable
to a wide variety of distributed architectures, including distributed and
shared-memory multiprocessor systems, networks of workstations, or even
potentially a combination thereof. Although originally designed for distributed
architectures (unicore workstations connected by a common network), which were
popular at the time the standard was initiated, shared-memory symmetric
multiprocessing systems connected over networks created hybrid
distributed/shared memory systems: each machine has shared memory within it,
but memory is not shared between machines, so data is distributed over the
network communications. The MPI library standards and implementations were
modified to handle both types of memory architectures.
(Image from Lawrence Livermore National Laboratory, USA)
Using MPI is a matter of some common sense. It is the only message passing
library which can really be considered a standard. It is supported on virtually
all HPC platforms, and has replaced all previous message passing libraries,
such as PVM, PARMACS, EUI, NX, and Chameleon, to name a few predecessors.
Programmers like it because there is no need to modify their source code when
ported to a different system, as long as that system also supports the MPI
standard (there may be other reasons, however, to modify the code!). MPI has
excellent performance, with vendors able to exploit hardware features for
optimisation.
The core principle is that many processors should be able to cooperate to solve
a problem by passing messages to each other through a common communications
network. The flexible architecture does overcome serial bottlenecks, but it
also does require explicit programmer effort (the "questing beast" of automatic
parallelisation remains somewhat elusive). The programmer is responsible for
identifying opportunities for parallelism and implementing algorithms for
parallelisation using MPI.
MPI programming is best where there are not too many small communications, and
where a coarse-level break-up of tasks or data is possible.
"In cases where the data layout is fairly simple, and the communications
patterns are regular, this [data-parallel approach] is an excellent approach.
However, when dealing with dynamic, irregular data structures, data parallel
programming can be difficult, and the end result may be a program with
sub-optimal performance."
(Warren, Michael S., and John K. Salmon. "A portable parallel particle
program." Computer Physics Communications 87.1 (1995): 266-290.)
3.2 Unravelling A Sample MPI Program and OpenMPI Wrappers
For the purposes of this course, copy a number of files to the home directory
cd ~
cp -r /common/advcourse .
In the Intermediate course, an example mpi-helloworld.c program was illustrated
with an associated PBS script. Let's recall what that included and the
explanation in the C program and in the PBS script that launched it.
This is the text for mpi-helloworld.c:
#include <stdio.h>                /* Standard include for C programs. */
#include "mpi.h"                  /* Standard include for MPI programs. */

int main( int argc, char **argv ) /* Beginning of the main function, establishing
                                     the arguments and vector. To incorporate input
                                     files, argc (argument count) is the number of
                                     arguments, and argv (argument vector) is an
                                     array of characters representing the
                                     arguments. */
{
    int rank, size;               /* Get rank and size from the inputs. */

    MPI_Init( &argc, &argv );     /* Initialises the MPI execution environment.
                                     The input parameter argc is a pointer to the
                                     number of arguments and argv is a pointer to
                                     the argument vector. */

    MPI_Comm_size( MPI_COMM_WORLD, &size );
                                  /* Determines the size of the group associated
                                     with a communicator. The input parameter is
                                     simply a handle (MPI_COMM_WORLD contains all
                                     of the processes); the output parameter,
                                     size, is an integer of the number of
                                     processes in the group. */

    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
                                  /* As above, except rank is the rank of the
                                     calling process. */

    printf( "Hello world from process %d of %d\n", rank, size );
                                  /* Print "Hello world" from each process. */

    MPI_Finalize();               /* Terminates the MPI execution environment. */

    return 0;                     /* A successful program finishes! */
}
It is compiled into an executable with the command
mpicc -o mpi-helloworld mpi-helloworld.c
This is the text for the batch file pbs-helloworld, which is launched with qsub
and reviewed with less:
qsub pbs-helloworld
less pbs-helloworld
The sample "hello world" program should be understandable to any C programmer
(indeed, any programmer), and with the MPI-specific annotations it should be
clear what is going on. It is the same as any other program, but with a few
MPI-specific additions. For example, one can check the PGI mpi.h with the
following:
less /usr/local/openmpi/1.6.3-pgi/include/mpi.h
MPI compiler wrappers are used to compile MPI programs; they perform basic
error checking, integrate the MPI include files, link to the MPI libraries, and
pass switches to the underlying compiler. The wrappers are as follows:

mpif77: Open MPI Fortran 77 wrapper compiler
mpif90: Open MPI Fortran 90 wrapper compiler
mpicc:  Open MPI C wrapper compiler
mpicxx: Open MPI C++ wrapper compiler

Open MPI is comprised of three software layers: OPAL (Open Portable Access
Layer), ORTE (Open Run-Time Environment), and OMPI (Open MPI). Each layer
provides the following wrapper compilers:

OPAL: opalcc and opalc++
ORTE: ortecc and ortec++
OMPI: mpicc, mpic++, mpicxx, mpiCC (only on systems with case-sensitive file
systems), mpif77, and mpif90. Note that mpic++, mpicxx, and mpiCC all invoke
the same underlying C++ compiler with the same options; all are provided for
compatibility with other MPI implementations.
The distinction between Fortran and C routines in MPI is fairly minimal. All
the names of MPI routines and constants in both C and Fortran begin with the
same MPI_ prefix. The main differences are:
* The include files are slightly different: in C, mpi.h; in Fortran, mpif.h.
* Fortran MPI routine names are in uppercase (e.g., MPI_INIT), whereas
C-compatible MPI routine names are upper and lowercase (e.g., MPI_Init).
* The arguments to MPI_Init are different; an MPI C program can take advantage
of command-line arguments.
* The arguments in MPI C functions are more strongly typed than they are in
Fortran, resulting in specific types in C (e.g., MPI_Comm, MPI_Datatype)
whereas MPI Fortran uses integers.
* Error codes are returned in a separate argument for Fortran, as opposed to
the return value for C functions.
7onsider the mpihelloworld program in Fortran 8mpihelloworld.f:
! Fortran MPI Hello World comment
program hello 6rogram name
include 'mpif.h' Include file for =6I
integer rank, size, ierror, tag,
status(MPI_STATUS_SIZE)]ariables
call MPI_INIT(ierror) #tart =6I
call
MPI_COMM_SIZE(MPI_COMM_WORLD, size,
ierror)
%umber of processers
call
MPI_COMM_RANK(MPI_COMM_WORLD, rank,
ierror)
6rocess Is
print*, 'node', rank, ': Hello
world'
Each processor prints -5ello
2orld-
call MPI_FINALIZE(ierror) Finish =6I.
end
Compile this with mpif90 (the Fortran 90 wrapper) and submit with qsub:
mpif90 mpi-helloworld.f90 -o mpi-helloworld
qsub pbs-helloworld
The mpi-helloworld program is an example of using MPI in a manner that is
similar to a Single Instruction Multiple Data architecture. The same
instruction stream (print hello world) is used multiple times. It is perhaps
best described as Single Program Multiple Data, as it obtains the effect of
running the same program multiple times, or, if you like, different programs
with the same instructions.
3.3 MPI's Introductory Routines
3.3.1 MPI_Init()
This routine initialises the MPI execution environment; as part of this, the
default communicator MPI_COMM_WORLD is created. Communicators are considered
analogous to the mail or telephone system; every message travels in the
communicator, with every message-passing call having a communicator argument.
The input parameters are argc, a pointer to the number of arguments, and argv,
the argument vector. These are for C and C++ only. The Fortran-only output
parameter is IERROR, as integer.
The syntax for MPI_Init() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Init(int *argc, char ***argv)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_INIT(IERROR)
INTEGER IERROR

C++ Syntax
#include <mpi.h>
void MPI::Init(int& argc, char**& argv)
void MPI::Init()
3.3.2 MPI_Comm_size()
This routine determines the size of the group associated with a communicator.
The input parameter is comm, the handle for the communicator, and the output
parameters are size, the number of processes in the group of comm (integer),
and the Fortran-only IERROR, providing the error status as an integer.
A communicator is effectively a collection of processes that can send messages
to each other. Within programs, many communications also depend on the number
of processes executing the program.
The syntax for MPI_Comm_size() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Comm_size(MPI_Comm comm, int *size)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_COMM_SIZE(COMM, SIZE, IERROR)
INTEGER COMM, SIZE, IERROR

C++ Syntax
#include <mpi.h>
int Comm::Get_size() const
3.3.3 MPI_Comm_rank()
This routine determines the rank of the calling process within the
communicator. The input parameter is comm, the communicator handle; the output
parameters are rank, the rank of the calling process (integer), and IERROR, the
error status for Fortran. It is common for MPI programs to be written in a
manager/worker model, where one process (typically rank 0) acts in a
supervisory role, and the other processes act in a computational role.
The syntax for MPI_Comm_rank() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Comm_rank(MPI_Comm comm, int *rank)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_COMM_RANK(COMM, RANK, IERROR)
INTEGER COMM, RANK, IERROR

C++ Syntax
#include <mpi.h>
int Comm::Get_rank() const
3.3.4 MPI_Send()
This routine performs a standard send of a message to a destination, stated in
an "appropriate way" for the implementation. The message-passing system takes
care of delivery. However, this "appropriate way" means stating various
characteristics of the message, just like the post or email: who is sending it,
where it's being sent to, what it's about, and so forth.
The input parameters include buf, the initial address of the send buffer;
count, an integer of the number of elements; datatype, a handle of the datatype
of each send buffer element; dest, an integer rank of the destination; tag, an
integer message tag; and comm, the communicator handle. The only output
parameter is Fortran's IERROR.
If MPI_Comm represents a community of addressable space, then MPI_Send and
MPI_Recv are the envelope, the addressing information, and the data. In order
for a message to be successfully communicated the system must append some
information to the data that the application program wishes to transmit. This
includes the rank of the sender, the receiver, a tag, and the communicator. The
source is used to differentiate messages received from different sources; the
tag to distinguish messages from a single process.
The syntax for MPI_Send() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest,
    int tag, MPI_Comm comm)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, DEST, TAG, COMM, IERROR

C++ Syntax
#include <mpi.h>
void Comm::Send(const void* buf, int count, const Datatype&
    datatype, int dest, int tag) const
3.3.5 MPI_Recv()
This routine performs a standard receive of a message. The syntax for
MPI_Recv() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source,
    int tag, MPI_Comm comm, MPI_Status *status)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM, STATUS(MPI_STATUS_SIZE), IERROR

C++ Syntax
#include <mpi.h>
void Comm::Recv(void* buf, int count, const Datatype& datatype,
    int source, int tag) const

The importance of MPI_Send() and MPI_Recv() relates to the nature of process
variables, which remain private unless passed by MPI in the communications
world.
3.3.6 MPI_Finalize()
This routine terminates the MPI execution environment. The syntax for
MPI_Finalize() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Finalize()

Fortran Syntax
INCLUDE 'mpif.h'
MPI_FINALIZE(IERROR)
INTEGER IERROR

C++ Syntax
#include <mpi.h>
void Finalize()
Whilst the previous mpi-helloworld.c and mpi-helloworld.f90 examples
illustrated the use of four of the six core routines of MPI, they did not
illustrate the use of the MPI_Recv and MPI_Send routines. The following
program, of no greater complexity, does this. There is no need to provide
additional explanation of what is happening, as this should be discerned from
the routine explanations given. Each program should be compiled with mpicc and
mpif90 respectively, submitted with qsub, and the results checked.
Compile with mpicc -o mpi-sendrecv mpi-sendrecv.c, submit with qsub pbs-sendrecv
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
int main(argc,argv)
int argc;
char *argv[];
{
int myid, numprocs;
int tag,source,destination,count;
int buffer;
MPI_Status status;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
tag=1;
source=0;
destination=1;
count=1;
if(myid == source){
buffer=1234;
MPI_Send(&buffer,count,MPI_INT,destination,tag,MPI_COMM_WORLD);
printf("processor %d sent %d\n",myid,buffer);
}
if(myid == destination){
MPI_Recv(&buffer,count,MPI_INT,source,tag,MPI_COMM_WORLD,&status);
printf("processor %d received %d\n",myid,buffer);
}
MPI_Finalize();
}
The mpi-sendrecv.f90 program: compile with mpif90 mpi-sendrecv.f90 -o
mpi-sendrecv, submit with qsub pbs-sendrecv
program sendrecv
include "mpif.h"
integer myid, ierr,numprocs
integer tag,source,destination,count
integer buffer
integer status(MPI_STATUS_SIZE)
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
tag=1
source=0
destination=1
count=1
if(myid .eq. source)then
buffer=1234
Call MPI_Send(buffer, count, MPI_INTEGER,destination,&
tag, MPI_COMM_WORLD, ierr)
write(*,*)"processor ",myid," sent ",buffer
endif
if(myid .eq. destination)then
Call MPI_Recv(buffer, count, MPI_INTEGER,source,&
tag, MPI_COMM_WORLD, status,ierr)
write(*,*)"processor ",myid," received ",buffer
endif
call MPI_FINALIZE(ierr)
stop
end
The following provides a summary of the use of the six core routines in C and
Fortran.
Purpose: Include header files
    C:       #include <mpi.h>
    Fortran: INCLUDE 'mpif.h'

Purpose: Initialise MPI
    C:       int MPI_Init(int *argc, char ***argv)
    Fortran: INTEGER IERROR
             CALL MPI_INIT(IERROR)

Purpose: Determine number of processes within a communicator
    C:       int MPI_Comm_size(MPI_Comm comm, int *size)
    Fortran: INTEGER COMM, SIZE, IERROR
             CALL MPI_COMM_SIZE(COMM, SIZE, IERROR)

Purpose: Determine processor rank within a communicator
    C:       int MPI_Comm_rank(MPI_Comm comm, int *rank)
    Fortran: INTEGER COMM, RANK, IERROR
             CALL MPI_COMM_RANK(COMM, RANK, IERROR)

Purpose: Send a message
    C:       int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    Fortran: <type> BUF(*)
             INTEGER COUNT, DATATYPE, DEST, TAG
             INTEGER COMM, IERROR
             CALL MPI_SEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)

Purpose: Receive a message
    C:       int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm, MPI_Status *status)
    Fortran: <type> BUF(*)
             INTEGER COUNT, DATATYPE, SOURCE, TAG
             INTEGER COMM, STATUS(MPI_STATUS_SIZE), IERROR
             CALL MPI_RECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM,
                 STATUS, IERROR)

Purpose: Exit MPI
    C:       int MPI_Finalize()
    Fortran: CALL MPI_FINALIZE(IERROR)
4.0 Intermediate MPI Programming
4.1 MPI Datatypes
Like C and Fortran (and indeed, almost every programming language that comes
to mind), MPI has datatypes, a classification for identifying different types
of data (such as real, int, float, char, etc.). In the introductory MPI
program there wasn't really much complexity in these types; as one delves
deeper, however, more will be encountered. Forewarned is forearmed, so the
following provides a handy comparison chart between MPI, C, and Fortran.
MPI DATATYPE            FORTRAN DATATYPE
MPI_INTEGER INTEGER
MPI_REAL REAL
MPI_DOUBLE_PRECISION DOUBLE PRECISION
MPI_COMPLEX COMPLEX
MPI_LOGICAL LOGICAL
MPI_CHARACTER CHARACTER
MPI_BYTE
MPI_PACKED
MPI DATATYPE            C Datatype
MPI_CHAR signed char
MPI_SHORT signed short int
MPI_LONG signed long int
MPI_UNSIGNED_CHAR unsigned char
MPI_UNSIGNED_SHORT unsigned short int
MPI_UNSIGNED unsigned int
MPI_UNSIGNED_LONG unsigned long int
MPI_FLOAT float
MPI_DOUBLE double
MPI_LONG_DOUBLE long double
MPI_BYTE
MPI_PACKED
4.2 Intermediate Routines
In the Intermediate course, one of the last exercises involved the submission
of mpi-ping and mpi-pong. The first simply tested whether a connection existed
between multiple processors. The second program tested different packet sizes,
asynchronous and bidirectional. In this example there is ping_pong.c, from the
University of Edinburgh Parallel Computing Centre, and a Fortran 90 version of
the same from Colorado University. The usual methods can be used for compiling
and submitting these programs, e.g.,

mpicc -o mpi-pingpong mpi-pingpong.c or
mpif90 mpi-pingpong.f90 -o mpi-pingpong and
qsub pbs-pingpong

However, for this course the interesting component is what is inside the code
in terms of the MPI routines. As previously, there are the mpi.h include files,
the initialisation routines, the establishment of a communications world, and
so forth. In addition, however, there are some new routines, specifically
MPI_Wtime, MPI_Abort, and MPI_Ssend.
4.2.1 MPI_Wtime()
MPI_Wtime() returns the elapsed wall-clock time in seconds (as a double) on the
calling processor, measured from some arbitrary point in the past; it is
typically called before and after a region of code in order to time it.
The syntax for MPI_Wtime() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
double MPI_Wtime()

Fortran Syntax
INCLUDE 'mpif.h'
DOUBLE PRECISION MPI_WTIME()

C++ Syntax
#include <mpi.h>
double MPI::Wtime()
4.2.2 MPI_Abort()
MPI_Abort() terminates all MPI processes associated with a communicator; it is
best reserved for unrecoverable errors. The syntax for MPI_Abort() is as
follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Abort(MPI_Comm comm, int errorcode)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_ABORT(COMM, ERRORCODE, IERROR)
INTEGER COMM, ERRORCODE, IERROR

C++ Syntax
#include <mpi.h>
void Comm::Abort(int errorcode)
4.2.3 MPI_Ssend()
MPI_Ssend() is a synchronous send: it does not complete until the matching
receive has started, which is useful when that confirmation is required.
Otherwise, MPI_Send is the more flexible option.
The available input parameters include buf, the initial address of the send
buffer; count, a non-negative integer of the number of elements in the send
buffer; datatype, a datatype of each send buffer element as a handle; dest, an
integer rank of the destination; tag, a message tag represented as an integer;
and comm, the communicator handle. The only output parameter is Fortran's
IERROR.
The syntax for MPI_Ssend() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Ssend(void *buf, int count, MPI_Datatype datatype, int dest,
    int tag, MPI_Comm comm)
Fortran Syntax
INCLUDE 'mpif.h'
MPI_SSEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, DEST, TAG, COMM, IERROR

C++ Syntax
#include <mpi.h>
void Comm::Ssend(const void* buf, int count, const Datatype&
    datatype, int dest, int tag) const
4.2.4 Other Send and Recv Routines
Although not used in the specific program just illustrated, there are actually
a number of other send options for Open MPI. These include MPI_Bsend,
MPI_Rsend, MPI_Isend, MPI_Ibsend, MPI_Issend, and MPI_Irsend. These are worth
mentioning in summary as follows:
MPI_Isend() is a non-blocking send; it indicates to the system to start copying
data out of the send buffer. A send request can be determined as complete by
calling MPI_Wait, MPI_Waitany, MPI_Test, or MPI_Testany with the request
returned by this function. The send buffer cannot be reused until one of these
calls succeeds, or an MPI_Request_free indicates that the buffer is available.
The syntax for MPI_Isend() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest,
    int tag, MPI_Comm comm, MPI_Request *request)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_ISEND(BUF, COUNT, DATATYPE, DEST, TAG, COMM, REQUEST, IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, DEST, TAG, COMM, REQUEST, IERROR

C++ Syntax
#include <mpi.h>
Request Comm::Isend(const void* buf, int count, const
    Datatype& datatype, int dest, int tag) const
MPI_Irecv() is the corresponding non-blocking receive. The syntax for
MPI_Irecv() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source,
    int tag, MPI_Comm comm, MPI_Request *request)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_IRECV(BUF, COUNT, DATATYPE, SOURCE, TAG, COMM, REQUEST,
IERROR)
<type> BUF(*)
INTEGER COUNT, DATATYPE, SOURCE, TAG, COMM, REQUEST, IERROR

C++ Syntax
#include <mpi.h>
Request Comm::Irecv(void* buf, int count, const Datatype& datatype,
    int source, int tag) const

MPI_Wait() waits for a non-blocking send or receive to complete. Its C++
syntax is:

#include <mpi.h>
void Request::Wait(Status& status)
void Request::Wait()
A Summary of Some Other MPI Send/Receive Modes

MPI_Send(): Standard send. May be synchronous or buffering.
    Benefits: flexible trade-off; automatically uses a buffer if available,
    but goes for synchronous if not.
    Problems: can hide deadlocks; uncertainty of which type is used makes
    debugging harder.

MPI_Ssend(): Synchronous send. Doesn't return until the receive has also
completed.
    Benefits: safest mode; confident that the message has been received.
    Problems: lower performance, especially without non-blocking calls.

MPI_Bsend(): Buffered send. Copies data to a buffer; the program is free to
continue whilst the message is delivered later.
    Benefits: good performance.
    Problems: need to be aware of buffer space; buffer management issues.

MPI_Rsend(): Ready send. The matching receive must already be posted or the
message is lost.
    Benefits: slight performance increase since there's no handshake.
    Problems: risky and difficult to design.
As described previously, the arguments dest and source in the various modes of
send are the ranks of the receiving and the sending processes. MPI also allows
source to be a "wildcard" through the predefined constants MPI_ANY_SOURCE (to
receive from any source) and MPI_ANY_TAG (to receive with any tag). There is
no wildcard for dest. Again using the postal analogy, a recipient may be ready
to receive a message from anyone, but they can't send a message to anywhere!
4.2.5 The Prisoner's Dilemma
The example of the Prisoner's Dilemma (cooperation vs competition) is provided
to illustrate how non-blocking communications work. In this example, there are
ten rounds between two players, with different payoffs for each. In this
particular version the distinction is between cooperation and competition for
financial rewards. If both players cooperate they receive $2 for the round. If
they both compete, they receive $1 each for the round. But if one adopts a
competitive stance and the other a cooperative stance, the competitor receives
$3 and the cooperative player nothing.
A serial version of the code is provided (serial-gametheory.c,
serial-gametheory.f90). Review it and then attempt a parallel version from the
skeleton MPI versions (mpi-skel-gametheory.c, mpi-skel-gametheory.f90). Each
process must run one player's decision-making, then both have to transmit
their decision to the other, and then update their own tally of the result.
Consider using MPI_Send(), or MPI_Irecv() and MPI_Wait(). On completion,
review against the solution provided in mpi-gametheory.c and
mpi-gametheory.f90, and submit the tasks with qsub.
4.3 Collective Communications
MPI can also conduct collective communications. These include MPI_Bcast,
MPI_Scatter, MPI_Gather, MPI_Reduce, and MPI_Allreduce. A brief summary of
their syntax and a description of their effects is provided before a practical
example. The basic principle and motivation is that whilst collective
communications may provide a performance improvement, they will certainly
provide clearer code. Consider the following C snippet of a root processor
sending to all:

if ( 0 == rank ) {
    unsigned int proc_I;
    for ( proc_I=1; proc_I < numProcs; proc_I++ ) {
        MPI_Ssend( &param, 1, MPI_UNSIGNED, proc_I, PARAM_TAG, MPI_COMM_WORLD );
    }
}
else {
    MPI_Recv( &param, 1, MPI_UNSIGNED, 0 /*ROOT*/, PARAM_TAG, MPI_COMM_WORLD, &status );
}

This can be replaced with:

MPI_Bcast( &param, 1, MPI_UNSIGNED, 0 /*ROOT*/, MPI_COMM_WORLD );
4.3.1 MPI_Bcast()
MPI_Bcast() broadcasts a message from the root process to all processes of the
communicator group, itself included.
The syntax for MPI_Bcast() is as follows for C, Fortran, and C++.
C Syntax
#include <mpi.h>
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
    int root, MPI_Comm comm)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_BCAST(BUFFER, COUNT, DATATYPE, ROOT, COMM, IERROR)
<type> BUFFER(*)
INTEGER COUNT, DATATYPE, ROOT, COMM, IERROR

C++ Syntax
#include <mpi.h>
void MPI::Comm::Bcast(void* buffer, int count,
    const MPI::Datatype& datatype, int root) const = 0
4.3.2 MPI_Scatter()
MPI_Scatter() distributes distinct portions of the root's send buffer, one
portion to each process in the communicator group.
The syntax for MPI_Scatter() is as follows for C, Fortran, and C++.

C Syntax
#include <mpi.h>
int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype,
    void *recvbuf, int recvcount, MPI_Datatype recvtype, int root,
    MPI_Comm comm)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_SCATTER(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT,
    RECVTYPE, ROOT, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE, RECVCOUNT, RECVTYPE, ROOT
INTEGER COMM, IERROR

C++ Syntax
#include <mpi.h>
void MPI::Comm::Scatter(const void* sendbuf, int sendcount,
    const MPI::Datatype& sendtype, void* recvbuf,
    int recvcount, const MPI::Datatype& recvtype,
    int root) const
4.3.3 MPI_Gather()
MPI_Gather() is the inverse of MPI_Scatter(): it collects elements from each
process in the group into a receive buffer on the root process.
The input parameters include sendbuf, the address of the send buffer;
sendcount, an integer of the number of elements in the send buffer; sendtype,
the datatype handle of send buffer elements; recvcount, an integer of the
number of elements for any single receive (significant only at root); recvtype,
the datatype handle of receive buffer elements (significant only at root);
root, the integer rank of the receiving process; and comm, the communicator
handle. The output parameters include recvbuf, the address of the receive
buffer (significant only at root), and the ever-dependable IERROR for Fortran.
The syntax for MPI_Gather() is as follows for C, Fortran, and C++.
C Syntax
#include <mpi.h>
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
    void *recvbuf, int recvcount, MPI_Datatype recvtype, int root,
    MPI_Comm comm)

Fortran Syntax
INCLUDE 'mpif.h'
MPI_GATHER(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT,
    RECVTYPE, ROOT, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE, RECVCOUNT, RECVTYPE, ROOT
INTEGER COMM, IERROR

C++ Syntax
#include <mpi.h>
void MPI::Comm::Gather(const void* sendbuf, int sendcount,
    const MPI::Datatype& sendtype, void* recvbuf,
    int recvcount, const MPI::Datatype& recvtype,
    int root) const = 0
4.3.4 MPI_Reduce()
MPI_Reduce() performs a reduction operation (such as a sum or maximum) on data
distributed across the group, returning the result to the root process. The
input parameters include sendbuf, the address of the send buffer; count, an
integer number of elements in the send buffer; datatype, a handle of the
datatype of elements in the send buffers; op, a handle of the reduce
operation; root, the integer rank of the root process; and comm, the
communicator handle. The output parameters are recvbuf, the address of the
receive buffer for root, and Fortran's IERROR.
+he syntax for =6IO"educe8: is as follows for 7, Fortran, and 7LL.
C Syntax
#include <mpi.h>
int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Fortran Syntax
INCLUDE 'mpif.h'
MPI_REDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM,
IERROR)
SENDBUF(*), RECVBUF(*)
INTEGER COUNT, DATATYPE, OP, ROOT, COMM, IERROR
C++ Syntax
#include <mpi.h>
void MPI::Intracomm::Reduce(const void* sendbuf, void* recvbuf,
int count, const MPI::Datatype& datatype, const MPI::Op& op,
int root) const
MPI reduction operations include the following:

MPI Name      Function
MPI_MAX       Maximum
MPI_MIN       Minimum
MPI_SUM       Sum
MPI_PROD      Product
MPI_LAND      Logical AND
MPI_BAND      Bitwise AND
MPI_LOR       Logical OR
MPI_BOR       Bitwise OR
MPI_LXOR      Logical exclusive OR
MPI_BXOR      Bitwise exclusive OR
MPI_MAXLOC    Maximum and location
MPI_MINLOC    Minimum and location
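For MPI_MAXLOC and MPI_MINLOC the operands are value/location pairs (e.g. MPI_DOUBLE_INT). As a rough sketch of the combining rule in plain C (the DblInt type and maxloc() function here are illustrative stand-ins, not MPI definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Value/index pair, standing in for MPI's MPI_DOUBLE_INT operand.
 * This is a plain C illustration, not MPI code. */
typedef struct {
    double value;
    int    index;
} DblInt;

/* Emulate the MPI_MAXLOC combining rule over a local array:
 * the maximum value together with the lowest index at which
 * it occurs, as MPI_Reduce() with MPI_MAXLOC would return. */
DblInt maxloc(const DblInt *in, size_t n)
{
    DblInt best = in[0];
    for (size_t i = 1; i < n; i++)
        if (in[i].value > best.value)  /* strict: ties keep the lowest index */
            best = in[i];
    return best;
}
```

In a real reduction each process contributes one pair (its local value and its rank), and root receives the global extremum together with the rank that owns it.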
4.3.5 Other Collective Communications
Other collective communications include MPI_Allgather(), MPI_Alltoall(),
MPI_Allreduce(), and MPI_Scan().

4.4 Derived Data Types

Consider sending the first element of each row of a 5x5 array of doubles. The
program could send the data one element at a time, e.g.,
double results[5][5];
int i;
for ( i = 0; i < 5; i++ ) {
MPI_Send( &(results[i][0]), 1, MPI_DOUBLE, dest, tag, comm );
}
But this has overhead; message passing is always (relatively) expensive. So
instead, a datatype can be created that informs MPI how the data is stored so it
can be sent in one routine.

To create a derived type there are two steps: first, construct the datatype with
MPI_Type_vector() or MPI_Type_struct(), and then commit the datatype with
MPI_Type_commit().
When all the data to send is of the same datatype, use the vector method, e.g.,
int MPI_Type_vector( int count, int blocklen, int stride, MPI_Datatype old_type,
MPI_Datatype* newtype )
/* Send the first double of each of the 5 rows */
MPI_Datatype newType;
double results[5][5];
MPI_Type_vector( 5, 1, 5, MPI_DOUBLE, &newType );
MPI_Type_commit( &newType );
MPI_Ssend( &(results[0][0]), 1, newType, dest, tag, comm );
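To see which elements the vector type above picks out of the 5x5 array, the following plain C sketch (illustrative only, not an MPI routine) enumerates the flat offsets described by count=5, blocklen=1, stride=5: the first double of each row.

```c
#include <assert.h>
#include <stddef.h>

/* Plain C illustration (no MPI calls): enumerate the flat offsets
 * that MPI_Type_vector(count, blocklen, stride, ...) selects from a
 * contiguous array. Each of 'count' blocks holds 'blocklen'
 * consecutive elements, and block starts are 'stride' elements apart. */
size_t vector_offsets(int count, int blocklen, int stride, size_t *out)
{
    size_t n = 0;
    for (int b = 0; b < count; b++)
        for (int e = 0; e < blocklen; e++)
            out[n++] = (size_t)(b * stride + e);
    return n;
}
```

For a 5x5 row-major array of doubles, offsets 0, 5, 10, 15, 20 are exactly the first element of each row, which is why the MPI_Ssend() above transfers a whole column in one call.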
Note that when sending a vector, the data on the receiving processor may be of a
different type, e.g.,
double recvData[COUNT*BLOCKLEN];
double sendData[COUNT][STRIDE];
MPI_Datatype vecType;
MPI_Status st;
MPI_Type_vector( COUNT, BLOCKLEN, STRIDE, MPI_DOUBLE, &vecType );
MPI_Type_commit( &vecType );
if( rank == 0 )
MPI_Send( &(sendData[0][0]), 1, vecType, 1, tag, comm );
else
MPI_Recv( recvData, COUNT*BLOCKLEN, MPI_DOUBLE, 0, tag, comm, &st );
If you have specific parts of a struct you wish to send and the members are of
different types, use the struct datatype.
int MPI_Type_struct( int count, int blocklens[], MPI_Aint indices[],
MPI_Datatype old_types[], MPI_Datatype *newtype )
For example:
/* Send the Packet structure in a message */
struct Packet {
int a;
double array[3];
char b[10];
};
struct Packet dataToSend;
The corresponding derived type can be constructed as follows:
int blockLens[3] = { 1, 3, 10 };
MPI_Aint intSize, doubleSize;
MPI_Aint displacements[3];
MPI_Datatype types[3] = { MPI_INT, MPI_DOUBLE, MPI_CHAR };
MPI_Datatype myType;
MPI_Type_extent( MPI_INT, &intSize );       // # of bytes in an int
MPI_Type_extent( MPI_DOUBLE, &doubleSize ); // # of bytes in a double
displacements[0] = (MPI_Aint) 0;
displacements[1] = intSize;
displacements[2] = intSize + ((MPI_Aint) 3 * doubleSize);
MPI_Type_struct( 3, blockLens, displacements, types, &myType );
MPI_Type_commit( &myType );
MPI_Ssend( &dataToSend, 1, myType, dest, tag, comm );
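One caveat: the hand-computed displacements above assume each member starts immediately after the previous one, but compilers may insert padding (e.g. between the int and the first double). A safer sketch, in plain C with offsetof() rather than MPI calls (packet_displacements() is a hypothetical helper), reads the displacements from the actual struct layout:

```c
#include <assert.h>
#include <stddef.h>

/* Same Packet layout as the example above. */
struct Packet {
    int    a;
    double array[3];
    char   b[10];
};

/* Read the member displacements straight from the compiler-chosen
 * layout with offsetof(), rather than summing MPI_Type_extent()
 * results by hand. These are the values one would pass to
 * MPI_Type_struct(). */
void packet_displacements(size_t d[3])
{
    d[0] = offsetof(struct Packet, a);      /* always 0 */
    d[1] = offsetof(struct Packet, array);  /* >= sizeof(int): padding possible */
    d[2] = offsetof(struct Packet, b);
}
```

If the compiler aligns doubles on 8-byte boundaries, d[1] will be 8 rather than sizeof(int), and the hand-computed version would describe the wrong memory layout.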
+here are actually other functions for creating derived types
MPI_Type_contiguous
MPI_Type_hvector
MPI_Type_indexed
MPI_Type_hindexed
In many applications, the size of a message to receive is unknown before it is
received (e.g., the number of particles moving between domains). MPI has a way
of dealing with this elegantly. First, the receiving side calls MPI_Probe() before
actually receiving:
int MPI_Probe( int source, int tag, MPI_Comm comm, MPI_Status *status )
It can then examine the status and find the message length using:
int MPI_Get_count( MPI_Status *status,
MPI_Datatype datatype, int *count )
Then the application dynamically allocates the receive buffer and calls MPI_Recv().
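Sketched in plain C (alloc_recv_buffer() is a hypothetical helper, not an MPI routine), the pattern amounts to: probe, convert the reported byte length to an element count, allocate, then receive.

```c
#include <assert.h>
#include <stdlib.h>

/* Plain C sketch of the probe/allocate/receive pattern (no MPI calls).
 * Given the message length in bytes that MPI_Probe()'s status would
 * report, compute the element count as MPI_Get_count() does for
 * MPI_DOUBLE, and allocate a buffer of exactly that size for the
 * subsequent MPI_Recv(). */
double *alloc_recv_buffer(size_t nbytes, int *count)
{
    *count = (int)(nbytes / sizeof(double));
    return malloc(*count * sizeof(double));
}
```

The point of the pattern is that no upper bound on the message size needs to be hard-coded; the buffer is sized to the message that actually arrived.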
4.5 Particle Advector
The particle advector hands-on exercise consists of two parts.

The first example is designed to gain familiarity with the MPI_Scatter() routine
as a means of distributing global arrays among multiple processors via
collective communication. Use the skeleton code provided and determine the
number of particles to assign to each processor. Then use the function
MPI_Scatter() to spread the global particle coordinates, ids and tags among the
processors.
For an advanced test, on the root processor only, calculate the particle with the
smallest distance from the origin (hint: MPI_Reduce()). If the particle with the
smallest distance is < 1.5 from the origin, then flip the direction of movement of
all the particles. Then modify your code to use the MPI_Scatterv() function to
allow the given number of particles to be properly distributed among a variable
number of processors.
int MPI_Scatterv (
void *sendbuf,
int *sendcnts,
int *displs,
MPI_Datatype sendtype,
void *recvbuf,
int recvcnt,
MPI_Datatype recvtype,
int root,
MPI_Comm comm )
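The sendcnts and displs arrays are what distinguish MPI_Scatterv() from MPI_Scatter(). One common way to build them, sketched in plain C (scatterv_layout() is a hypothetical helper, not part of the skeleton code), gives each of the first n % p ranks one extra particle:

```c
#include <assert.h>

/* Divide n particles among p processors: the first n % p ranks get one
 * extra. Fills sendcnts[] and displs[] in the form MPI_Scatterv()
 * expects (displacements measured in elements from the start of
 * sendbuf). */
void scatterv_layout(int n, int p, int *sendcnts, int *displs)
{
    int offset = 0;
    for (int rank = 0; rank < p; rank++) {
        sendcnts[rank] = n / p + (rank < n % p ? 1 : 0);
        displs[rank] = offset;
        offset += sendcnts[rank];
    }
}
```

For example, 10 particles over 3 processors yields counts {4, 3, 3} with displacements {0, 4, 7}, so every particle is assigned exactly once regardless of whether n divides evenly.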
The second example is designed to gain a practical example of the use of MPI
derived data types. Implement a data type storing the particle information from
the previous exercise and use this data type for collective communications. Set
up and commit a new MPI derived data type, based on the struct below:
typedef struct Particle {
unsigned int globalId;
unsigned int tag;
Coord coord;
} Particle;
[hint: MPI_Type_struct(), MPI_Type_commit()]

Then seed the random number sequence on the root processor only, and
determine how many particles are to be assigned among the respective
processors (same as for the last exercise) and collectively assign their data using
the MPI derived data type you have implemented.
4.6 Creating A New Communicator
Each communicator has associated with it a group of ranked processes. Before
creating a new communicator, we must first create a group for it. A new group
can be created by eliminating processes from an existing group:
MPI_Group worldGroup, subGroup;
MPI_Comm subComm;
int *procsToExcl, numToExcl;
MPI_Comm_group( MPI_COMM_WORLD, &worldGroup );
MPI_Group_excl( worldGroup, numToExcl, procsToExcl, &subGroup );
MPI_Comm_create( MPI_COMM_WORLD, subGroup, &subComm );
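To make the renumbering concrete, the following plain C sketch (group_excl() is an illustrative stand-in, not the MPI routine) shows how excluded ranks disappear and the survivors are renumbered contiguously:

```c
#include <assert.h>

/* Plain C sketch of what MPI_Group_excl() does to the ranking (no MPI
 * calls): surviving processes keep their original order but are
 * renumbered contiguously. newToOld[newRank] gives the old rank.
 * Returns the size of the new group. */
int group_excl(int oldSize, const int *excl, int numExcl, int *newToOld)
{
    int newSize = 0;
    for (int r = 0; r < oldSize; r++) {
        int excluded = 0;
        for (int i = 0; i < numExcl; i++)
            if (excl[i] == r)
                excluded = 1;
        if (!excluded)
            newToOld[newSize++] = r;
    }
    return newSize;
}
```

So if a four-process world excludes ranks 1 and 3, the new communicator has two processes, and old ranks 0 and 2 become new ranks 0 and 1. This matters because a process's rank in subComm generally differs from its rank in MPI_COMM_WORLD.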
4.7 Profiling Parallel Programs
Parallel performance issues include the following:

* Coverage: the percentage of the code that is parallel
* Granularity: the amount of work in each parallel section
* Load balancing
* Locality: the communication structure
* Synchronisation: locking latencies

Since the performance of parallel programs is dependent on so many issues, it
is an inherently difficult task to profile parallel programs.
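The coverage issue is essentially Amdahl's law: if only a fraction p of the runtime is parallel, the speedup on n cores is bounded by 1 / ((1 - p) + p/n), so even 90% coverage caps an 8-core run below 5x. A minimal sketch:

```c
#include <assert.h>

/* Amdahl's law: the speedup bound when only a fraction 'parallel' of
 * the serial runtime benefits from 'cores' cores. Illustrates why
 * coverage dominates parallel performance. */
double amdahl_speedup(double parallel, int cores)
{
    return 1.0 / ((1.0 - parallel) + (parallel / cores));
}
```

This is why a profiler's first job is identifying the serial fraction: amdahl_speedup(0.9, 8) is only about 4.7, no matter how well the parallel 90% scales.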
TAU (Tuning and Analysis Utilities) is a portable profiling and tracing toolkit for
performance analysis of parallel programs written in Java, C, C++ and Fortran.
The steps involved in profiling parallel code are outlined as follows:

1. Instrument the source code with TAU macros.
2. Compile the instrumented code.
3. Run the program, producing profile.* files for each separate process.

The instrumentation of source code can be done manually or with the help of
another utility called PDT, which automatically parses source files and
instruments them with TAU macros.
4.8 Debugging MPI Applications
It has taken many years for this essential truth to be realised, but software
equals bugs. In parallel systems, the bugs are particularly difficult to diagnose,
and the very nature of parallelisation invites race conditions and deadlocks.
For example, what happens when two processors try to send a message to one
another at the same time?

When debugging MPI programs it is usually a good idea to do this in one's own
environment, i.e., install (from source) the compilers and version of Open MPI on
your own system. The reason for this is that it is quite time-prohibitive to conduct
debugging activities on a batch-processing high-performance computer. The HPC
systems that we have may run tasks fairly quickly when launched, but they can
take some time to begin whilst they are in the queue.
DO NOT DO JOBS ON THE HEAD NODE
REALLY, DO NOT DO MULTICORE JOBS ON THE HEAD NODE
It is possible, for small tests, to bypass this by running small jobs interactively
(following the instructions given in the Intermediate course), e.g.,
qsub -l walltime=0:30:0,nodes=1:ppn=2 -I
module load vpac
qsub pbs-sendrecv
In general, however, parallel programs are hard to program and hard to debug.
Parallelism adds a whole new abstract layer. Although the program is being
executed on n processors, it may be running in n slightly different ways on
different data.

Although time-consuming, it is usually appropriate to build the code in serial
first to the point that it's working, and working well. As part of this process,
use version control systems, and engage in unit testing (check each functional
component of the code independently) and integration testing (check the
interfaces between components) as part of this development. Use standard
methods for these tests, such as the use of mid-range, boundary, and out-of-bounds
variables.
Because parallelism adds a new level of abstraction, producing a serial version of
a code before producing a parallel version is not unlike producing pseudocode
for a serial program. Time and time again it has been shown that modelling
significantly improves the quality of a program and reduces errors, thus saving
time in the longer run. In the process of engaging in such modelling, developing
a defensive style of programming is effective, for example engaging in
techniques that prevent deadlocks, or keeping in consideration the state of a
condition when running loops or if-else statements. When conducting actual tests
on the code, tactically placed printf or write statements will assist.
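When many processes print at once the output interleaves, so it helps if every debug statement identifies its rank. A plain C sketch (debug_tag() is a hypothetical helper; a real MPI code would obtain the rank from MPI_Comm_rank() and fflush() after printing so output is not lost if the process later deadlocks):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a rank-tagged debug line, e.g. "[rank 2] x=5", into buf.
 * Returns the number of characters that snprintf() produced. In an
 * MPI program the rank argument would come from MPI_Comm_rank(). */
int debug_tag(char *buf, size_t len, int rank, const char *msg)
{
    return snprintf(buf, len, "[rank %d] %s", rank, msg);
}
```

Prefixing every line this way lets the interleaved output of all processes be separated afterwards with a simple grep on the rank number.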
For example, consider the following simple sendrecv programs; compile these
with openmpi-gcc as follows:
module load openmpi-gcc
mpicc -g mpi-debug.c -o mpi-debug
or
mpif90 -g mpi-debug.f90 -o mpi-debug
qsub -l walltime=0:20:0,nodes=1:ppn=2 -I
module load vpac
module load valgrind/3.8.1-openmpi-gcc
Note that an interactive job starts the user in their home directory, requiring a
change in directories.

Then mpiexec with 2 processors is launched, with valgrind debugging the
executable and with error output redirected to valgrind.out:
mpiexec -np 2 valgrind ./mpi-sendrecv-debug 2> valgrind.out
Valgrind is a debugging suite that automatically detects many memory
management and threading bugs. Whilst typically built for serial applications, it
can also be built with mpicc wrappers, but currently only for GNU GCC or Intel's
C++ compiler. It is important to use the same compiler in both the build
and the Valgrind test.

The file valgrind.out in this case will contain quite a few errors, but none of these
are critical to the operation of the program.
As with serial programs, gdb can also be used for thorough debugging. Execute
as:
mpiexec -np [number of processors] gdb ./executable --command=gdb.cmd
Where gdb.cmd is a text file of the commands that you want to send to gdb, e.g.,
module load gdb
mpiexec -np 2 gdb --exec=mpi-debug --command=gdb.cmd
Which should generate a result something like the following:
[lev@trifid166 advancedhpc]$ mpiexec -np 2 gdb --command=gdb.cmd mpi-debug
(remove license information)
Reading symbols from /nfs/user2/lev/programming/advancedhpc/mpi-debug...(no
debugging symbols found)...done.
Reading symbols from /nfs/user2/lev/programming/advancedhpc/mpi-debug...(no
debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x2aaaad875700 (LWP 19784)]
[New Thread 0x2aaaad875700 (LWP 19785)]
[New Thread 0x2aaaadc8b700 (LWP 19786)]
[New Thread 0x2aaaadc8b700 (LWP 19787)]
processor 0 final value: 324 with loop # 68
processor 1 final value: 2346 with loop # 68
[Thread 0x2aaaadc8b700 (LWP 19786) exited]
[Thread 0x2aaaad875700 (LWP 19784) exited]
[Thread 0x2aaaadc8b700 (LWP 19787) exited]
[Thread 0x2aaaad875700 (LWP 19785) exited]
[Inferior 1 (process 19776) exited normally]
[Inferior 1 (process 19777) exited normally]
This, of course, simply shows that the program exited successfully with the final
values as listed (hooray!). To use a serial debugger like GDB with a program that
is running in parallel is slightly more difficult. A common hack (and it is a hack)
is to find out what process IDs the job is running, then log in to the appropriate
node and run gdb -p PID. However, in order to discover those PIDs the following
code snippet is usually implemented:
{
int i = 0;
char hostname[256];
gethostname(hostname, sizeof(hostname));
printf("PID %d on %s ready for attach\n", getpid(), hostname);
fflush(stdout);
while (0 == i)
sleep(5);
}
Then at job submission those PIDs will be displayed. For example:
[lev@trifid166 advancedhpc]$ mpiexec -np 2 mpi-debug
PID 23166 on trifid166 ready for attach
PID 23167 on trifid166 ready for attach
Then log in to the appropriate nodes and run gdb -p 23166 and gdb -p 23167,
step through the function stack, and set the variable to a nonzero value, e.g.,
(gdb) set var i = 7
Then set a breakpoint after your block of code and continue execution until the
breakpoint is hit (e.g., by adding break at the loops), using the gdb commands to
display the values as they are being generated (e.g., print loop, print value, or
info locals).
info@vpac.org
www.vpac.org