OpenMP & MPI
CISC 879 (cavazos/cisc879/Lecture-03.pdf)

Tristan Vanderbruggen & John Cavazos
Dept of Computer & Information Sciences
University of Delaware


Lecture Overview

● Introduction
● OpenMP
  ○ Model
  ○ Language extension: directives-based
  ○ Step-by-step example
● MPI
  ○ Model
  ○ Runtime Library
  ○ Step-by-step example
● Hybrid of OpenMP & MPI
● Conclusion


1 - OpenMP

OpenMP: Open Multi-Processing
Intranode parallelism

1.1 - OpenMP: Model

● Shared Memory Model:
  ○ multi-processor/core

Source: https://computing.llnl.gov/tutorials/openMP/


1.1 - OpenMP: Model

● Thread-level Parallelism:
  ○ parallelism through threads
  ○ typically: the number of threads matches the number of cores

● Fork - Join Model:

Source: https://computing.llnl.gov/tutorials/openMP/


1.1 - OpenMP: Model

● Explicit Parallelism:
  ○ programmer has full control over parallelization
  ○ can be as simple as inserting compiler directives in a serial program
  ○ or as complex as inserting subroutines to set multiple levels of parallelism, locks and even nested locks

Source: https://computing.llnl.gov/tutorials/openMP/

1.2 - OpenMP: Language

● OpenMP is not exactly a language.
  ○ It is an extension for C/C++ and Fortran.
● It is a Directive-Based Language Extension
● It works by annotating sequential code

Source: https://computing.llnl.gov/tutorials/openMP/


1.2 - OpenMP: Language


● in C, it uses pragmas

#pragma omp construct [clause, ...]

● in Fortran, it uses sentinels (!$omp, C$omp, or *$omp):

!$OMP construct [clause, ...]

Source: https://computing.llnl.gov/tutorials/openMP/


1.2 - OpenMP: Language

Source: https://computing.llnl.gov/tutorials/openMP/

● constructs are functionalities of the language

● clauses are parameters to those functionalities

● construct + clauses = directive


1.3 - OpenMP: Step-by-step Example

Two examples:
● the classic HelloWorld
● a matrix multiplication

1.3 (a) - OpenMP: Hello World


OpenMP: Environment Variables

● OpenMP has a set of environment variables that control the runtime execution
● OMP_NUM_THREADS=num
  ○ default number of threads in a parallel region
● OMP_SCHEDULE=algorithm
  ○ algorithm = dynamic or static (guided and auto are also accepted)
  ○ the algorithm to be used for loops declared with schedule(runtime)


1.3 (a) - OpenMP: Hello World

● Compile:
  ○ $> gcc -fopenmp helloworld-omp.c -o helloworld-omp
● Run:
  ○ $> qlogin -pe threads 8
  ○ $> cd hpc-II
  ○ $> export OMP_NUM_THREADS=8
  ○ $> ./helloworld-omp

1.3 (b) - OpenMP: Matrix Multiply

[Code listing not preserved in this transcript; the only surviving fragments are calls to rand().]

1.3 (b) - OpenMP: Matrix Multiply

● #pragma omp parallel shared(A,B,C) private(i,j,k)
  ○ create a parallel region
    ■ fork a team of threads (usually as many as cores)
  ○ arrays A, B, C are shared among the threads
  ○ the "iterators" are private to each thread

1.3 (b) - OpenMP: Matrix Multiply

● #pragma omp for schedule (static)
  ○ declare a parallel for-loop
    ■ to be executed by the team
  ○ schedule specifies how the iterations are divided among the threads
    ■ static/dynamic
    ■ chunk size


1.3 (b) - OpenMP: Matrix Multiply

● on a 4-core Intel i7
● for 512x512 float matrices
● Sequential: 0.92s
● OpenMP: 0.24s

Speedup of 3.83
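For reference, the arithmetic behind that figure, with the parallel efficiency added as a derived quantity. The symbols T_seq, T_par, p and E are notation introduced here and do not appear in the slides.

```latex
S = \frac{T_{\text{seq}}}{T_{\text{par}}}
  = \frac{0.92\,\text{s}}{0.24\,\text{s}} \approx 3.83,
\qquad
E = \frac{S}{p} \approx \frac{3.83}{4} \approx 0.96
```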

1.3 (b) - OpenMP: Matrix Multiply

But the speedup depends on the input size:

1.4 - OpenMP: Construct

● Constructs:
  a. barrier : synchronisation point
  b. single : only executed by one thread of the team
  c. master : only executed by the master
  d. critical : only one thread at any time
  e. sections / section : declare task parallelism

1.4 - OpenMP: Clause

● Clauses:
  a. shared/private : apply to a list of variables
  b. default : policy for variable sharing
    ■ either shared or none
  c. firstprivate : takes a list of private variables to be initialized from their values before the region
  d. lastprivate : takes a list of private variables to be copied out after the region
  e. reduction : takes an operation and a list of scalar variables
  f. num_threads : the number of threads to use in the team


1.4 - OpenMP: Barrier example


1.4 - OpenMP: Barrier example


1.4 - OpenMP: Reduction


Questions?

Any questions about OpenMP?


2 - MPI

Message Passing Interface:
Internode parallelism

2.1 - MPI: Model

● Distributed Memory, originally
● today, implementations also support shared-memory SMP

Source: https://computing.llnl.gov/tutorials/mpi/


2.2 - MPI: Language

● MPI is an Interface
  ○ MPI = Message Passing Interface
● Different implementations are available for C / Fortran

Source: https://computing.llnl.gov/tutorials/mpi/


2.3 - MPI: Step-by-step Examples

Source: https://computing.llnl.gov/tutorials/mpi/

MPI Program Structure:


2.3 (a) - MPI: Hello World


2.3 (a) - MPI: Hello World

● Compile
  ○ $> mpicc helloworld-mpi.c -o helloworld-mpi
  ○ mpicc provides the include directories and library paths

2.3 (a) - MPI: Hello World

● Run
  ○ On one node:
    ■ mpirun -n $NB_PROCESS ./helloworld-mpi
  ○ On a cluster with qsub (Sun Grid Engine):
    ■ qsub -pe mpich $NB_PROCESS mpi-qsub.sh
    ■ mpi-qsub.sh:

#!/bin/bash
#
#$ -cwd
#
mpirun -np $NSLOTS ./matmul-mpi

2.3 (b) - MPI: Matrix Multiply


2.3 (b) - MPI: Matrix Multiply


MPI initialization:


2.3 (b) - MPI: Matrix Multiply


Master initialization:


2.3 (b) - MPI: Matrix Multiply

1 - On master after initialization
2 - On worker after comm
3 - On worker after computation
4 - On master after comm


Questions?


Any questions about MPI?

3 - OpenMP & MPI

● MPI works on SMT processors
  ○ Message Passing on top of Shared Memory
● Hybrid of OpenMP & MPI:
  ○ The best of two worlds?
● MPI: Internode
● OpenMP: Intranode


4 - OpenMP & MPI


https://github.com/cavazos-lab/hpc-lecture/blob/master/lecture3/matmul-mpi-omp.c


Questions?


Any questions?