![Page 1: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/1.jpg)
CME 213
Eric Darve
SPRING 2017
![Page 2: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/2.jpg)
LINEAR ALGEBRA
MATRIX-VECTOR PRODUCTS
![Page 3: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/3.jpg)
Application example: matrix-vector product
● We are going to use this example to illustrate additional MPI functionalities.
● This will lead us to process groups and topologies.
● First, we go over two implementations that use the functionalities we have already covered.
● Two simple approaches:
  • Row partitioning of the matrix, or
  • Column partitioning
![Page 4: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/4.jpg)
Row partitioning
This is the most natural approach.
[Figure: Matrix A, Vector b, Allgather()]
Step 1: replicate b on each process: MPI_Allgather()
Step 2: perform the product
See MPI code: matvecrow/
![Page 5: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/5.jpg)
![Page 6: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/6.jpg)
Column partitioning
Step 1: calculate partial products with each process
[Figure: Matrix A, Vector b, partial products]
![Page 7: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/7.jpg)
Column partitioning (cont'd)
● Step 2: reduce all partial results: MPI_Reduce()
● Step 3: send sub-blocks to all processes: MPI_Scatter()
● Steps are very similar to row partitioning.
[Figure: Vector Ab]
![Page 8: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/8.jpg)
![Page 9: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/9.jpg)
A better partitioning
● If the number of processes becomes large compared to the matrix size, we need a 2D partitioning:
● Each colored square can be assigned to a process.
● This allows using more processes.
● In addition, a theoretical analysis (more on this later) shows that this scheme runs faster.
![Page 10: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/10.jpg)
Outline of algorithm: step 1
First column contains b
Send b to the diagonal processes
Send b down each column.
This is a broadcast operation.
![Page 11: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/11.jpg)
Steps 2 and 3
● Step 2: perform matrix-vector product locally
● Step 3: reduce across columns and store result in column 0.
[Figure: Reduction across columns]
![Page 12: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/12.jpg)
Communication cost (in a nutshell)
Why is 2D partitioning better?
[Figure labels: Larger blocks; Narrow columns; Reduction: 2n; Reduction: n/2]
![Page 13: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/13.jpg)
Difficulties with 2D partitioning
● This type of decomposition brings some difficulties.
● We used two collective operations:
  • A broadcast inside a column.
  • A reduction inside a row.
● To do this in MPI, we need two concepts:
  • Communicators or process groups. This defines a subset of all the processes. For each subset, collective operations are allowed, e.g., broadcast for the group of processes inside a column.
  • Process topologies. For matrices, there is a natural 2D topology with (i,j) block indexing. MPI supports such grids (any dimension). Using MPI grids (called "Cartesian topologies") simplifies many MPI commands.
![Page 14: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/14.jpg)
PROCESS GROUPS AND COMMUNICATORS
![Page 15: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/15.jpg)
Process groups
● Groups are needed for many reasons.
● They enable collective communication operations across a subset of processes.
● They make it easy to assign independent tasks to different groups of processes.
● They provide a good mechanism to integrate a parallel library into an MPI code.
![Page 16: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/16.jpg)
Groups and communicators
● A group is an ordered set of processes.
● Each process in a group is associated with a unique integer rank. Rank values start at zero and go to N-1, where N is the number of processes in the group.
● A group is always associated with a communicator object.
● A communicator encompasses a group of processes that may communicate with each other. All MPI messages must specify a communicator.
● For example, the handle for the communicator that comprises all tasks is MPI_COMM_WORLD.
● From the programmer's perspective, a group and a communicator are almost the same. The group routines are primarily used to specify which processes should be used to construct a communicator.
● Processes may be in more than one group/communicator. They have a unique rank within each group/communicator.
![Page 17: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/17.jpg)
Main functions
MPI provides over 40 routines related to groups, communicators, and virtual topologies!
int MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
Returns the group associated with a communicator, e.g., MPI_COMM_WORLD.
int MPI_Group_incl(MPI_Group group, int p, int *ranks,
                   MPI_Group *new_group)
ranks: integer array with p entries.
Creates a new group new_group with p processes, which have ranks from 0 to p-1. Process i is the process that has rank ranks[i] in group.
int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *new_comm)
New communicator based on group.
See MPI code: groups/
![Page 18: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/18.jpg)
![Page 19: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/19.jpg)
PROCESS TOPOLOGIES
![Page 20: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/20.jpg)
Process topologies
● Many problems are naturally mapped to certain topologies such as grids.
● This is the case, for example, for matrices, or for 2D and 3D structured grids.
● The two main types of topologies supported by MPI are Cartesian grids and graphs.
● MPI topologies allow simplifying many common MPI tasks.
● MPI topologies are virtual: there may be no relation between the physical structure of the network and the process topology.
![Page 21: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/21.jpg)
Advantages of using topologies
● Convenience: virtual topologies may be useful for applications with specific communication patterns.
● Communication efficiency: a particular implementation may optimize the process mapping based upon the physical characteristics of a given parallel machine.
  • For example, nodes that are nearby on the grid (East/West/North/South neighbors) may be close in the network (lowest communication time).
● The mapping of processes onto an MPI virtual topology is dependent upon the MPI implementation.
![Page 22: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/22.jpg)
MPI functions for topologies
Many functions are available. We only cover the basic ones.
int MPI_Cart_create(MPI_Comm comm_old, int ndims,
                    int *dims, int *periods, int reorder,
                    MPI_Comm *comm_cart)
ndims: number of dimensions.
dims[i]: size of the grid along dimension i. The total grid size (the product of the dims[i]) should not exceed the number of processes in comm_old.
The array periods is used to specify whether or not the topology has wraparound connections. If periods[i] is non-zero, then the topology has wraparound connections along dimension i.
reorder is used to determine if the processes in the new group are to be reordered or not. If reorder is false, then the rank of each process in the new group is identical to its rank in the old group.
![Page 23: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/23.jpg)
Example
The processes are ordered according to their rank row-wise in increasing order.
Rank (row, column):
0 (0,0)    1 (0,1)
2 (1,0)    3 (1,1)
4 (2,0)    5 (2,1)
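The numbering in this example is a row-major ordering, which can be written down in plain C without any MPI call. The function names `grid_rank` and `grid_coords` are hypothetical; in an MPI program the same information would normally be obtained from the Cartesian communicator itself.

```c
#include <assert.h>

/* Rank of the process at (row, col) in an nrows x ncols grid,
 * assuming row-major (row-wise) numbering. */
int grid_rank(int row, int col, int nrows, int ncols)
{
    assert(0 <= row && row < nrows);
    assert(0 <= col && col < ncols);
    return row * ncols + col;
}

/* Inverse mapping: coordinates of a given rank. */
void grid_coords(int rank, int ncols, int *row, int *col)
{
    assert(rank >= 0 && ncols > 0);
    *row = rank / ncols;
    *col = rank % ncols;
}
```

For the 3 x 2 grid above, `grid_rank(2, 0, 3, 2)` gives 4, and `grid_coords(5, 2, &row, &col)` gives (2, 1).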
![Page 24: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/24.jpg)
Periodic Cartesian grids
● We chose periodicity along the first dimension (periods[0]=1), which means that any row index beyond the first or last row is wrapped around cyclically.
● For example, row index i=-1 is mapped into i=2.
● There is no periodicity imposed on the second dimension. Any reference to a column index outside of its defined range results in an error. Try it!
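The cyclic wraparound along a periodic dimension amounts to a modulo operation on the index. The small function below is an illustration of that arithmetic only (the name `wrap_index` is hypothetical); MPI performs the equivalent mapping internally for periodic dimensions.

```c
#include <assert.h>

/* Map any integer index i into the range [0, n), cyclically. */
int wrap_index(int i, int n)
{
    assert(n > 0);
    int r = i % n;             /* in C, r may be negative when i is negative */
    return (r < 0) ? r + n : r;
}
```

With three rows, `wrap_index(-1, 3)` returns 2, which matches the mapping of i=-1 to i=2 described above.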
![Page 25: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/25.jpg)
Obtaining your rank and coordinates
int MPI_Cart_rank(MPI_Comm comm_cart,
                  int *coords, int *rank)
int MPI_Cart_coords(MPI_Comm comm_cart, int rank,
                    int maxdims, int *coords)
● This allows retrieving a rank or the coordinates in the grid. This may be useful to get information about other processes.
● coords are the Cartesian coordinates of a process.
● Its size is the number of dimensions.
● Remember that the function MPI_Comm_rank is still available to query your own rank.
● See MPI code: mpi_cart/
![Page 26: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/26.jpg)
![Page 27: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/27.jpg)
Getting the rank of your neighbors
int MPI_Cart_shift(MPI_Comm comm_cart, int dir,
                   int s_step, int *rank_source, int *rank_dest)
● dir: direction
● s_step: length of the shift
● rank_dest contains the group rank of the neighboring process in the specified dimension and distance.
● rank_source is the rank of the process for which the calling process is the neighboring process in the specified dimension and distance.
● Thus, the group ranks returned in rank_dest and rank_source can be used as parameters for MPI_Sendrecv().
[Figure: rank_source, rank_dest, s_step = 4]
![Page 28: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/28.jpg)
![Page 29: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/29.jpg)
Splitting a Cartesian topology
● It is very common that one wants to split a Cartesian topology along certain dimensions.
● For example, we may want to create a group for the columns or rows of a matrix.
int MPI_Cart_sub(MPI_Comm comm_cart,
                 int *keep_dims, MPI_Comm *comm_subcart)
● keep_dims: array of boolean flags. keep_dims[i] determines whether dimension i is retained in the new communicators or split, e.g., if false then a split occurs along that dimension.
![Page 30: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/30.jpg)
Example
[Figure: 3D grid with axes x, y, z]
keep_dims[] = {true, false, true}
keep_dims[] = {false, false, true}
![Page 31: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/31.jpg)
Application example: 2D partitioning
First column contains b
Send b to the diagonal processes
Send b down each column. Broadcast!
Start with 2D communicator
Use column group
![Page 32: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/32.jpg)
2D topology for matrix
![Page 33: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/33.jpg)
Send to diagonal block
![Page 34: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/34.jpg)
Column-wise broadcast
![Page 35: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/35.jpg)
matvec2D
Reduction! Use row group
See MPI code: matvec2D/
![Page 36: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/36.jpg)
Code for row reduction
![Page 37: Lecture 20 - Stanford Universityweb.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf · Groups and communicators A groupis an ordered set of processes. Each process in a group](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa691cd61c7ce4fff46f9b3/html5/thumbnails/37.jpg)
Topologies for finite-element calculations
● A typical situation is that processes need to communicate with their neighbors.
● This becomes complicated to organize for unstructured grids.
● In that case, graph topologies are very convenient. They allow defining a neighbor relationship in a general way, using a graph. Example: MPI_Graph_create
● Examples of collective communications:
  • MPI_Neighbor_allgather(): each process gathers data from its neighbors, and all processes get a result
  • MPI_Neighbor_alltoall(): processes send to and receive from all neighbor processes