1 (+*), ! .1/ · [zy\ $ • emacs7# emacsx %mabq • ^x ^s u]8controlvxidgj7 • ^x ^c x u^z 3...

(+"*)," !.1/ %,&- $'#,(+"*),".0/./ 1 ./ 2018/5/8


Post on 05-Mar-2020


Page 1: 1 (+*), ! .1/ · [ZY\ $ • emacs7# emacsX %MABQ • ^x ^s U]8controlVXIDGJ7 • ^x ^c X U^z 3 0=4&GKFS7" * *=' 6/5(-4'V • ^g : !*?);5,5214+' • ^k : CTHQ:< 93 0' /1 8& 6

��(+"*),"��!��.1/

��������%,&- �� � ��

$'#,(+"*),".0/ .�/ 1

��������.�/ �����������

2018/5/8

Page 2:

:3�-i��<;,)j

1. 4 10�(��)o LKSgR2. 4 17�

l � ��&F�"%0i��j

3. 4 24�oRWPg�'=�l fOKg�#BURVYfOb]�5

4. 5 1�l A�4YfOb\gO�$F�+li>�^_cBdhYJgfhcgOj (��)

5. 5 8�l A�4YfOb\gO�$F�+2iM`TQaXfTN�j

6. 5 15�l 5 -ZNVd.F� �

RWPgYfOb\gOiljBi�j

7. 5 22�l HD�$F� �

8. 6 5�l 5 -5 .F� �ilj

9. 6 12�l 5 k5 .F� �imj

10. 6 19�?l pq�7$ilj

l PgURV9@(6

11. 7 3�?l pq�7$imj

12. 7 10�?l pq�7$inj

13. 7 17�?l RB-HC8EB?�!;�B*/1��

e[hVCIGPgURV9@i2�o2018�8 6�i j24� ��

Page 3:

��������������-����

������� ��������� 32018/5/8

Page 4:

Emacs basics

• Starting emacs: emacs <file name>
• ^x ^s (^ is the Control key): save the text
• ^x ^c: quit (quitting with ^z raises the load on the supercomputer; never do it)
• ^g: cancel, for when you get lost
• ^k: delete from the cursor to the end of the line; the deleted text is kept temporarily
• ^y: paste the text deleted with ^k at the cursor position
• ^s <string>: move to the next occurrence of the string
• ^M x goto-line: move to the specified line

Page 5:

UNIX notes

• rm <file name>: delete the named file
• rm *~ : delete backup files whose names end in ~, such as test.c~. Be careful not to put a space between * and ~, because "rm * ~" deletes every file.
• ls : list the contents of the current directory
• cd <directory name>: move to that directory
• cd .. : move up one directory
• cd ~ : go to the home directory, for when you get lost
• cat <file name>: show the contents of the file
• make : build the executable (works only where a Makefile exists)
• make clean : delete the executable (works only if clean is defined in the Makefile)

Page 6:

UNIX notes (continued)

• less <file name>: show the contents of the file (for when cat would overflow the screen)
• Space key: scroll one screen
• / : search for a string
• q : quit, for when you get lost

Page 7:

Notes on the matrix-matrix product sample program

• File shared by the C and Fortran versions: Mat-Mat-noopt-ofp.tar.gz
• In the job script file mat-mat-noopt.bash, change the queue name from lecture-flat to lecture5-flat and the group name from gt00 to gt05, then run pjsub.
• lecture-flat : queue for use outside class hours
• lecture5-flat: queue for use during class hours

Page 8:

Running the matrix-matrix product sample program (common to the C and Fortran versions)

• Run the following commands:
$ cd /work/gt05/t050xx
$ cp /work/gt05/z30105/Mat-Mat-noopt-ofp.tar.gz ./
$ tar xvfz Mat-Mat-noopt-ofp.tar.gz
$ cd Mat-Mat-noopt
• Then move to the directory for your language:
$ cd C : for C users
$ cd F : for Fortran users
• Build:
$ make
• Submit the job:
$ pjsub mat-mat-noopt.bash
• After the job finishes, check the output:
$ cat mat-mat-noopt.bash.oXXXX (XXXX is a number)

Page 9:

Running the matrix-matrix product sample program (C)

• The run succeeded if you see output like the following.

DDR4 run:
N = 512
Mat-Mat time = 12.511196 [sec.]
21.455619 [MFLOPS]
OK!

MCDRAM run:
N = 512
Mat-Mat time = 13.501827 [sec.]
19.881417 [MFLOPS]
OK!

Page 10:

Running the matrix-matrix product sample program (Fortran)

• The run succeeded if you see output like the following.

DDR4 run:
N = 512
Mat-Mat time[sec.] = 24.4274609088898
MFLOPS = 10.9890854527813
OK!

MCDRAM run:
N = 512
Mat-Mat time[sec.] = 27.0449259281158
MFLOPS = 9.92553856630092
OK!

Page 11:

How to use MCDRAM

• Default (uses DDR4):
mpiexec.hydra -n ${PJM_MPI_PROC} ./mat-mat-noopt
• To use MCDRAM:
mpiexec.hydra -n ${PJM_MPI_PROC} numactl -m 1 ./mat-mat-noopt

Page 12:

About the sample program (C)

• #define N 512: changing this number changes the matrix size.
• #define DEBUG 1: when set to 1, the result of the matrix-matrix product is verified.
• Specification of the MyMatMat function: it multiplies the double-precision N×N matrices A and B and stores the result in the double-precision N×N matrix C.

Page 13:

Notes on the Fortran sample program

• The matrix size is held in the variable NN, declared as:
integer, parameter :: NN=512

Page 14:

����

•MyMatMat�J��#K9�:HGIDH</.067���&+$)%�

•.���:HGIDH<1�"6�!��5��-/8��/.9�3+$)%��

• =H?;C1���FAE9L0&+�74'����,2����FAE9-7��(L,��&+ �+$)%��

• =H?;C068���9���!*�068:HGIDH<&+56�1,'"���067:HGIDH<1� "/$/74'�

>?=H@G<CBH<JMK�J�K 142018/5/8

Page 15:

kdn`,6

1. [Lrq] &�-&�!KF:G8fgi1#Z[_^HJU��H8�1#HJU��L�%Y-�BT9

2. [Lrt] &�-&�!LZmlnim\Y8i, j, k jnbKF:G�@8�%�L��:Y-OT9ILZmlnim\��R��>70HJUCW;=9

^a]mbl\hem\orp8o�p 15

�6LkcjK4AU+/u•L00: ?XQG"�J�69•L10u DSEH$<VNX=U�69•L20u ���J�69•L30u ��3 ��)HAU�69•L40u �23 ��)HAU�69(5J 'Y�)HAU9•L50u �=� ��)HAU�69�*��6YP9�Lsq��M8.�Y��AUK�AU�69

2018/5/8

Page 16:

���O� ���P

1. ���"#`^a]�!��@J0GBL0��$"����"^a\_VQUTRQRTXWRQX"U,WRR�

2. Kevin Dowd�"�����"#;-M<>.NAL4M1L=EN8,L0[`^aCKN/48N2FL(�!)��+ %�&'*)��$"-L6N:2F:IM9C5LM<?H72L0M3D<L"^a\_VQZRRXSYQRUQU"V,VRR�

4<1L@J0GBL0OSP"O�P 162018/5/8

Page 17:

����

1. !)���2. �������� 3. OpenMP���4. �*"'")�&%���,��-���OpenMP�-

5. ����6. (#+���

� �*")�&$*�,.-�,�- 172018/5/8

Page 18:

OpenMP����� ������

������������������ 182018/5/8

Page 19:

�'��!*��

• ^�w|ozx}o�3�q}w{w|ozye�iKVQTILfKVQTDFF_��4�

• "� � ,

• ���#�\HMEJ8:9C5=:<9?;=>?<\HMEJ8:<C5B@A8=:<9?;=>?>\&��� ;9:>�>�;>�

• `��g$�a• F/0\GUWYWOTB9/0e.1• F/0\GUWYWOTB9/0g-�gq}w{w|ozyb���+�sm}|~t���

• �2)g��l�dnu~• NSTPUZX5LF!*�+6F[RZST�%7]rvp}ek!*�+]• ��h(]jd�w|ozx}ol�i� cg�3�

rvp}w|ozx}o���\��� 192018/5/8

Page 20:

OpenMP���

��� ��� �������� 202018/5/8

Page 21:

OpenMP-�����• OpenMP.��<=?���-(/-9@3>;��

584A9@3>:A3CED$C�D 21

!��

PE PE PE ��<=?

PEOpenMP���4B7

OpenMP���4B7

OpenMP���4B7

OpenMP���4B7

��#�A[ ]

�,��-PE&��#�,1265⇒����)"�,� 0'+%*$ ���-��*��'+%

2018/5/8

Page 22:

What is OpenMP?

• OpenMP (OpenMP C and C++ Application Program Interface) is a standard for writing parallel programs for shared-memory parallel computers. It standardizes:
1. Directives
2. Libraries
3. Environment variables
• The user supplies the directives that make the program run in parallel. It is not automatic parallelization by the compiler.
• Compared with distributed-memory parallelization (MPI and the like), there is no data-distribution work, so implementation is simpler.

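Page 23:

OpenMP and multicore computers (part 1)

• A programming model for thread parallelization
• A good fit for recent multicore computers
• Rule of thumb for performance: suited to runs with 8 threads or fewer
• Getting high parallel efficiency with more than 8 threads takes programming effort, because:
1. The data transfer capability between main memory and cache is low compared with arithmetic performance.
2. The program may be written so that OpenMP cannot extract its parallelism (discussed later).
• Parallelization across nodes is not possible with OpenMP.
• Use MPI for parallelization across nodes.
• Automatic parallelizing compilers also do thread parallelization only.
• Compilers such as HPF and XcalableMP (University of Tsukuba) can parallelize across nodes, but they are not yet widely used.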

Page 24:

OpenMP and multicore computers (part 2)

• Typical thread counts
• 16 threads/node
  • T2K Open Supercomputer (AMD Quad Core Opteron (Barcelona)), FX10 supercomputer system (Sparc64 IXfx)
• 32 to 128 threads/node
  • HITACHI SR16000 (IBM Power7): 32 physical cores, 64 to 128 logical cores (when SMT is used)
  • Reedbush (Intel Xeon E5-2695 v4, Broadwell-EP): 36 cores
• 60 to 272 threads/node
  • Intel Xeon Phi (Intel MIC (Many Integrated Core), Knights Corner): 60 physical cores, 120 to 240 logical cores (when HT is used)
  • Oakforest-PACS (Intel MIC, Knights Landing): 68 physical cores, 272 logical cores
• In the near future, OpenMP runs with more than 100 threads are expected to become common.
• Considerable programming effort is needed.

Page 25:

Notation for OpenMP compiler directives

• In C:
#pragma omp <directive and clauses>
• In Fortran:
!$omp <directive and clauses>

Page 26:

How to compile OpenMP programs

• Add the OpenMP option to the compile options of the sequential compiler.
• Example: Intel Fortran90 compiler
ifort -O3 -qopenmp foo.f
• Example: Intel C compiler
icc -O3 -qopenmp foo.c
• Notes
• Loops without an OpenMP directive run sequentially.
• With some compilers, thread parallelization by automatic parallelization can be combined with OpenMP; with others it cannot.
• Lines with an OpenMP directive are thread-parallelized by OpenMP; where there is no directive, the compiler parallelizes automatically.
• Example: Intel Fortran90 compiler
ifort -O3 -qparallel -qopenmp foo.f

Page 27:

Running an OpenMP executable

• An executable built from an OpenMP program is run by naming the file, like any other executable.
• The number of threads is set with the environment variable OMP_NUM_THREADS.
• Example: the OpenMP executable is a.out
$ export OMP_NUM_THREADS=16
$ ./a.out
or
$ env OMP_NUM_THREADS=16 ./a.out
• Notes
• A sequentially compiled program and its OpenMP version can run at different speeds even with OMP_NUM_THREADS=1 (discussed later).
• The cause is the extra processing introduced by OpenMP (overhead).
• With many threads, the slowdown from this overhead becomes pronounced.
• It can be reduced with programming effort.

Page 28:

OpenMP������

����� ���������� 282018/5/8

Page 29:

OpenMP execution model (C)

Block A
#pragma omp parallel
{
    Block B
}
Block C

[Figure: the master thread runs Block A. At the OpenMP directive the threads are started, and thread 0 (the master thread), thread 1, thread 2, ..., thread p-1 each run Block B. The threads then terminate and Block C runs.]

※ The number of threads p is set with the environment variable OMP_NUM_THREADS.
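The fork-join behaviour described above can be tried with a small C sketch. It is only an illustration: the function name `count_block_b_runs` and the printed messages are invented here, and the `_OPENMP` guard is an added assumption so that the file still builds when OpenMP is switched off.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Runs "Block B" once per thread and returns how many threads ran it.
   The name is hypothetical; it does not come from the slides. */
int count_block_b_runs(void)
{
    int runs = 0;                          /* shared by all threads */

    printf("Block A: master thread only\n");

#pragma omp parallel
    {
        int id = 0;                        /* private: declared inside the region */
#ifdef _OPENMP
        id = omp_get_thread_num();         /* 0 .. p-1 */
#endif
#pragma omp critical
        {
            printf("Block B: thread %d\n", id);
            runs++;                        /* protected update of the shared counter */
        }
    }

    printf("Block C: master thread only\n");
    return runs;
}
```

Built with an OpenMP option (for example `-qopenmp` as on the compile slide, or `-fopenmp` with GCC) and run with `OMP_NUM_THREADS=4`, the function should report four runs of Block B; without OpenMP it reports one.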

Page 30:

OpenMP execution model (Fortran)

Block A
!$omp parallel
Block B
!$omp end parallel
Block C

[Figure: the master thread runs Block A. At the OpenMP directive the threads are started, and thread 0 (the master thread), thread 1, thread 2, ..., thread p-1 each run Block B. The threads then terminate and Block C runs.]

※ The number of threads p is set with the environment variable OMP_NUM_THREADS.

Page 31:

Work sharing constructs

• When code is run by several threads, as with the parallel directive, the part that OpenMP runs in parallel (Block B) is called a parallel region.
• The OpenMP constructs that, inside a parallel region, describe how the work is divided among the threads are called work sharing constructs.
• There are two kinds of work sharing construct:
1. Those written inside a parallel region
• for construct (do construct)
• sections construct
• single construct (master construct), and so on
2. Those combined with the parallel directive
• parallel for construct (parallel do construct)
• parallel sections construct, and so on

Page 32:

�������

��� ����������� 322018/5/8

Page 33:

for construct (do construct)

#pragma omp parallel for
for (i=0; i<100; i++){
    a[i] = a[i] * b[i];
}

[Figure: at the directive the threads are started and the iterations are divided among them; with 4 threads:]
Thread 0: for (i=0; i<25; i++){ a[i] = a[i] * b[i]; }
Thread 1: for (i=25; i<50; i++){ a[i] = a[i] * b[i]; }
Thread 2: for (i=50; i<75; i++){ a[i] = a[i] * b[i]; }
Thread 3: for (i=75; i<100; i++){ a[i] = a[i] * b[i]; }
[The threads then terminate.]

※ The user is responsible for guaranteeing that the loop carrying the directive still gives the correct result when parallelized.
※ In Fortran: !$omp parallel do ... !$omp end parallel do
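As a complete, compilable version of the loop above, a sketch along these lines can be used. The function names and the fill values are made up for the illustration; only the directive and the loop body follow the slide.

```c
#define LEN 100

/* Element-wise product a[i] = a[i] * b[i]. The iterations are independent,
   so the loop can safely be divided among threads. */
void multiply_in_place(double *a, const double *b, int n)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        a[i] = a[i] * b[i];
    }
}

/* Hypothetical driver: fills the arrays, runs the loop, returns the last element. */
double run_multiply_demo(void)
{
    double a[LEN], b[LEN];
    int i;
    for (i = 0; i < LEN; i++) {
        a[i] = (double)i;
        b[i] = 2.0;
    }
    multiply_in_place(a, b, LEN);
    return a[LEN - 1];          /* 99 * 2 */
}
```

The result does not depend on the number of threads, which is the property the user has to guarantee before adding the directive.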

Page 34:

Examples where the for construct cannot be used

for (i=0; i<100; i++) {
    a[i] = a[i] + 1;
    b[i] = a[i-1] + a[i+1];
}

• Run in parallel, the loop gives a result different from sequential execution, because a[i-1] may or may not have been updated yet.

for (i=0; i<100; i++) {
    a[i] = a[ ind[i] ];
}

• Whether the loop can be parallelized depends on the contents of ind[i].
• If a[ind[i]] is never a value that has already been updated, the loop can be parallelized.

Page 35:

sections construct

#pragma omp sections
{
    #pragma omp section
    sub1();
    #pragma omp section
    sub2();
    #pragma omp section
    sub3();
    #pragma omp section
    sub4();
}

● With 4 threads: thread 0 runs sub1(), thread 1 runs sub2(), thread 2 runs sub3(), and thread 3 runs sub4().
● With 3 threads: threads 0 to 2 run sub1(), sub2(), and sub3(); sub4() runs afterwards on a thread that has become free.

※ In Fortran: !$omp sections ... !$omp end sections
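A self-contained sketch of the sections construct follows. Here `sub1` to `sub4` are placeholder bodies invented for the example (each writes one array element), and the sections are wrapped in an explicit parallel region, which the construct needs in order to run on several threads.

```c
/* Placeholder tasks standing in for sub1()..sub4(); each writes its own slot. */
static void sub1(int *out) { out[0] = 1; }
static void sub2(int *out) { out[1] = 2; }
static void sub3(int *out) { out[2] = 3; }
static void sub4(int *out) { out[3] = 4; }

/* Runs the four tasks as OpenMP sections and returns the sum of the results. */
int run_sections(void)
{
    int out[4] = {0, 0, 0, 0};

#pragma omp parallel
    {
#pragma omp sections
        {
#pragma omp section
            sub1(out);
#pragma omp section
            sub2(out);
#pragma omp section
            sub3(out);
#pragma omp section
            sub4(out);
        }   /* implicit barrier: all sections are finished here */
    }

    return out[0] + out[1] + out[2] + out[3];
}
```

Each section is executed exactly once whatever the thread count, so the sum is the same with 1, 3, or 4 threads.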

Page 36:

critical construct

• The specified region is executed by only one thread at a time.

#pragma omp critical
{
    s = s + x;
}

[Figure: threads 0 to 3 each execute s = s + x, one after another, never at the same time.]

※ In Fortran: !$omp critical ... !$omp end critical

Page 37:

private clause

#pragma omp parallel for private(c)
for (i=0; i<100; i++){
    a[i] = a[i] + c * b[i];
}

[Figure: at the directive the threads are started; with 4 threads each thread runs its share of the loop with its own copy of c:]
Thread 0: for (i=0; i<25; i++){ a[i] = a[i] + c0*b[i]; }
Thread 1: for (i=25; i<50; i++){ a[i] = a[i] + c1*b[i]; }
Thread 2: for (i=50; i<75; i++){ a[i] = a[i] + c2*b[i]; }
Thread 3: for (i=75; i<100; i++){ a[i] = a[i] + c3*b[i]; }
[The threads then terminate.]

※ Each thread allocates its own separate copy of the variable c → faster execution.
※ A private copy starts out uninitialized. To carry the value that c had before the loop into every thread, use firstprivate(c) instead.

Page 38:

Caution when using the private clause (C)

#pragma omp parallel for private( j )
for (i=0; i<100; i++) {
    for (j=0; j<100; j++) {
        a[ i ] = a[ i ] + amat[ i ][ j ] * b[ j ];
    }
}

• The loop variable j is allocated as a separate variable in each thread.
• Without private( j ), the threads all count with the one shared j independently of each other, so the inner loop no longer runs its 100 iterations.
→ The result no longer matches the sequential one, which is a bug.
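The nested loop above can be turned into a small test program as follows. This is a sketch with invented names (`matvec_add`, `run_matvec_demo`) and all-ones test data, chosen so that the expected value is easy to check by hand.

```c
#define N 100

/* a[i] += sum_j amat[i][j] * b[j]; the inner index j must be private. */
void matvec_add(double a[N], double amat[N][N], const double b[N])
{
    int i, j;
#pragma omp parallel for private(j)
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            a[i] = a[i] + amat[i][j] * b[j];
        }
    }
}

/* Hypothetical driver: all-ones matrix and vector, so every a[i] becomes N. */
double run_matvec_demo(void)
{
    static double amat[N][N];
    double a[N], b[N];
    int i, j;
    for (i = 0; i < N; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
        for (j = 0; j < N; j++) {
            amat[i][j] = 1.0;
        }
    }
    matvec_add(a, amat, b);
    return a[0];
}
```

In C99 and later, declaring the index inside the loop header (`for (int j = 0; ...)`) has the same effect as `private(j)`, because a variable declared inside the parallel loop body is private to the thread that executes it.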

Page 39:

Caution when using the private clause (Fortran)

!$omp parallel do private( j )
do i=1, 100
  do j=1, 100
    a( i ) = a( i ) + amat( i , j ) * b( j )
  enddo
enddo
!$omp end parallel do

• The loop variable j is allocated as a separate variable in each thread.
• Without private( j ), the threads all count with the one shared j independently of each other, so the inner loop no longer runs its 100 iterations.
→ The result no longer matches the sequential one, which is a bug.

Page 40:

reduction clause (C)

• Used when the per-thread results, such as an inner product, have to be added together into a single result.
• The addition above is carried out asynchronously by each thread.
• Without the reduction clause, ddot becomes a shared variable, and the parallel result no longer matches the sequential one.

#pragma omp parallel for reduction(+: ddot )
for (i=1; i<=100; i++) {
    ddot += a[ i ] * b[ i ];
}

Only scalar variables can be written in the position of ddot (arrays cannot).
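A runnable inner-product sketch using the reduction clause is shown below. The function names and the test values are assumptions made for the example; the loop here uses 0-based indices so that it fits a 100-element C array.

```c
/* Inner product with an OpenMP reduction: each thread accumulates into its
   own private copy of ddot, and the copies are added together at the end. */
double dot(const double *a, const double *b, int n)
{
    double ddot = 0.0;
    int i;
#pragma omp parallel for reduction(+ : ddot)
    for (i = 0; i < n; i++) {
        ddot += a[i] * b[i];
    }
    return ddot;
}

/* Hypothetical driver: 100 elements, a[i] = 1 and b[i] = 2. */
double run_dot_demo(void)
{
    double a[100], b[100];
    int i;
    for (i = 0; i < 100; i++) {
        a[i] = 1.0;
        b[i] = 2.0;
    }
    return dot(a, b, 100);
}
```

With general floating-point data the summation order differs between thread counts, so the last digits of the result may differ from the sequential run; the small integer values used here avoid that.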

Page 41:

reduction clause (Fortran)

• Used when the per-thread results, such as an inner product, have to be added together into a single result.
• The addition above is carried out asynchronously by each thread.
• Without the reduction clause, ddot becomes a shared variable, and the parallel result no longer matches the sequential one.

!$omp parallel do reduction(+: ddot )
do i=1, 100
  ddot = ddot + a(i) * b(i)
enddo
!$omp end parallel do

Only scalar variables can be written in the position of ddot (arrays cannot).

Page 42:

Caution on the reduction clause

• The reduction clause performs the addition under mutual exclusion, so its performance is poor.
• As a rule of thumb, performance degrades sharply beyond 8 threads.
• As shown below, allocating an array for ddot and adding it up sequentially can be faster (although this depends on problem size and hardware).

!$omp parallel do private ( i )
do j=0, p-1
  do i=istart( j ), iend( j )
    ddot_t( j ) = ddot_t( j ) + a(i) * b(i)
  enddo
enddo
!$omp end parallel do
ddot = 0.0d0
do j=0, p-1
  ddot = ddot + ddot_t( j )
enddo

• The j loop runs over the threads (at most p threads are used).
• The index range that each thread accesses (istart, iend) is set up in advance.
• The local array ddot_t(), one element per thread in place of ddot, is allocated and initialized to 0 beforehand.
• The final loop adds the partial sums sequentially.
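The same idea can be sketched in C. This is not the slide's code: `MAX_THREADS`, the function names, and the way the index ranges are computed are assumptions for the illustration, and each thread accumulates in a local variable before storing its partial sum, a small change that keeps threads from repeatedly writing neighbouring array elements.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 256   /* illustrative upper bound, an assumption */

/* Inner product with one partial sum per thread, added up sequentially. */
double dot_partial(const double *a, const double *b, int n)
{
    double ddot_t[MAX_THREADS] = {0.0};   /* per-thread partial sums */
    double ddot = 0.0;
    int p = 1, j;
#ifdef _OPENMP
    p = omp_get_max_threads();
#endif
    if (p > MAX_THREADS) p = MAX_THREADS;

#pragma omp parallel for
    for (j = 0; j < p; j++) {
        /* contiguous index range handled by slot j */
        int istart = (int)((long long)n * j / p);
        int iend   = (int)((long long)n * (j + 1) / p);
        double s = 0.0;
        int i;
        for (i = istart; i < iend; i++) {
            s += a[i] * b[i];
        }
        ddot_t[j] = s;
    }

    for (j = 0; j < p; j++) {             /* sequential final addition */
        ddot += ddot_t[j];
    }
    return ddot;
}

/* Hypothetical driver with a[i] = 1 and b[i] = 2 for 100 elements. */
double run_partial_demo(void)
{
    double a[100], b[100];
    int i;
    for (i = 0; i < 100; i++) { a[i] = 1.0; b[i] = 2.0; }
    return dot_partial(a, b, 100);
}
```

Whether this beats the reduction clause depends on the compiler, thread count, and problem size, as the slide notes, so it is worth timing both.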

Page 43:

������OpenMP��

����������������� 432018/5/8

Page 44:

Function for getting the number of threads

• To get the number of threads, use the omp_get_num_threads() function. It returns the number of threads in the current team, so it returns 1 when called outside a parallel region.
• The type is integer (Fortran) or int (C).

● Fortran90 example
use omp_lib
integer :: nthreads

nthreads = omp_get_num_threads()

● C example
#include <omp.h>
int nthreads;

nthreads = omp_get_num_threads();

Page 45:

Function for getting the calling thread's number

• To get the number of the calling thread, use the omp_get_thread_num() function.
• The type is integer (Fortran) or int (C).

● Fortran90 example
use omp_lib
integer :: myid

myid = omp_get_thread_num()

● C example
#include <omp.h>
int myid;

myid = omp_get_thread_num();

Page 46:

Timing function

• To measure time, use the omp_get_wtime() function.
• The type is double precision (Fortran) or double (C).

● Fortran90 example
use omp_lib
real(8) :: dts, dte

dts = omp_get_wtime()
  (the processing to be timed)
dte = omp_get_wtime()
print *, "Elapse time [sec.] =", dte-dts

● C example
#include <omp.h>
double dts, dte;

dts = omp_get_wtime();
  (the processing to be timed)
dte = omp_get_wtime();
printf("Elapse time [sec.] = %lf \n", dte-dts);
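The timing pattern above can be wrapped into a small helper, sketched here in C. The helper `wall_time`, the function `timed_sum`, and the `clock()` fallback for builds without OpenMP are assumptions added for the example; the fallback measures CPU time rather than wall-clock time.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time from OpenMP when available; CPU time otherwise (assumption). */
static double wall_time(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Times a simple parallel loop; stores the elapsed seconds in *elapsed
   and returns the loop result so the work cannot be optimized away. */
double timed_sum(int n, double *elapsed)
{
    double dts, dte, s = 0.0;
    int i;

    dts = wall_time();
#pragma omp parallel for reduction(+ : s)
    for (i = 0; i < n; i++) {
        s += 1.0;
    }
    dte = wall_time();

    *elapsed = dte - dts;
    return s;
}
```

Taking both time stamps on the master thread, outside the parallel loop, means the measurement includes the cost of starting and joining the threads.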

Page 47:

������

����� ���������� 472018/5/8

Page 48:

single construct

• The block specified with the single directive is assigned to exactly one of the threads.
• Which thread it is assigned to cannot be predicted.
• Unless the nowait clause is given, a synchronization follows the block.

#pragma omp parallel
{
    Block A
    #pragma omp single
    { Block B }
    …
}

[Figure: after the program starts, the threads are launched and thread 0 (the master thread), thread 1, thread 2, ..., thread p each run Block A. Block B then runs on one thread only, followed by a synchronization.]

※ In Fortran: !$omp single ... !$omp end single

Page 49:

master construct

• It is used in the same way as the single directive.
• However, the processing specified with the master directive (for example Block B on the previous slide) is always assigned to the master thread.
• No synchronization is inserted after it finishes.
• For that reason it can be faster in some cases.

Page 50:

flush construct

• Makes the thread's view consistent with physical memory.
• Only the variables named in the flush directive are made consistent at that point. The values of other shared variables are not guaranteed to match memory (a result may be held only in a register and not yet written to memory).
• In other words, without flush, the result of threads adding into the same variable at the same time can differ from run to run.
• A flush is performed implicitly at the barrier directive, on entry to and exit from a critical region, at the exit of the parallel construct, and at the exits of the for, sections, and single constructs.
• Using flush degrades performance. Avoid it where possible.

#pragma omp flush (list of target variables)

If the list is omitted, all variables are targeted.

Page 51:

threadprivate construct

• Declares variables that are private to each thread but can be accessed globally within that thread.
• Suited to defining global variables that hold a different value in each thread.
• For example, setting a different loop start and end value for each thread.

#include <omp.h>
int myid, nthreads, istart, iend;
#pragma omp threadprivate(istart, iend)
…
void kernel() {
    int i, j;
    for (i=istart; i<iend; i++) {
        for (j=0; j<n; j++) {
            a[ i ] = a[ i ] + amat[ i ][ j ] * b[ j ];
        }
    }
}

void main() {
    /* istart and iend are threadprivate, so they are not listed in private() */
    #pragma omp parallel private (myid, nthreads)
    {
        nthreads = omp_get_num_threads();
        myid = omp_get_thread_num();
        istart = myid * (n/nthreads);
        iend = (myid+1)*(n/nthreads);
        if (myid == (nthreads-1)) {
            iend = n;
        }
        kernel();
    }
}

The global variables that hold a different value in each thread are set inside the parallel construct.
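A compilable version of the threadprivate pattern is sketched below. The array size, the all-ones test data, and the function name `run_threadprivate_demo` are assumptions for the illustration, and the `_OPENMP` guard only serves to let the file build without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 8

static double a[N], b[N], amat[N][N];

/* File-scope variables with one copy per thread. */
static int istart, iend;
#pragma omp threadprivate(istart, iend)

/* Each thread processes the rows istart..iend-1 of its own copy of the range. */
static void kernel(void)
{
    int i, j;
    for (i = istart; i < iend; i++) {
        for (j = 0; j < N; j++) {
            a[i] = a[i] + amat[i][j] * b[j];
        }
    }
}

/* Hypothetical driver: returns the sum of a[] after a = amat * b with all ones. */
double run_threadprivate_demo(void)
{
    int i, j;
    double total = 0.0;

    for (i = 0; i < N; i++) {
        a[i] = 0.0;
        b[i] = 1.0;
        for (j = 0; j < N; j++) amat[i][j] = 1.0;
    }

#pragma omp parallel
    {
        int myid = 0, nthreads = 1;       /* private: declared in the region */
#ifdef _OPENMP
        myid = omp_get_thread_num();
        nthreads = omp_get_num_threads();
#endif
        istart = myid * (N / nthreads);
        iend = (myid + 1) * (N / nthreads);
        if (myid == nthreads - 1) {
            iend = N;                     /* last thread takes the remainder */
        }
        kernel();
    }

    for (i = 0; i < N; i++) total += a[i];
    return total;                         /* N rows, each summing to N */
}
```

Because `kernel()` takes no arguments, the per-thread range reaches it only through the threadprivate globals, which is the use case the slide describes.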

Page 52:

�������

����� ����������� 522018/5/8

Page 53:

What is scheduling? (part 1)

• With the parallel do construct, the range of the target loop (for example, length 1 to n) is simply divided into as many contiguous pieces as there are threads, and the pieces are processed in parallel.

[Figure: the loop range 1 to n divided into contiguous blocks for thread 0, thread 1, thread 2, thread 3, thread 4.]

• If the computational load of the iterations assigned to each thread is not equal, the speedup obtained from threaded execution degrades.

[Figure: computational load plotted against the loop variable (iteration number) for thread 0 to thread 4.]

Page 54:

What is scheduling? (part 2)

• To improve the load balance, make the assigned pieces shorter and hand them out cyclically.

[Figure: the loop range 1 to n divided into short pieces assigned to the threads in rotation.]

• The best piece length (called the chunk size) depends on the computer hardware and on the processing in question.
• Clauses that perform this kind of assignment are provided.

Page 55:

Loop scheduling clauses (part 1)

• schedule (static, n)
• The loop is divided into chunks of the given size, which are assigned cyclically starting from thread 0 (thread 0, thread 1, and so on; this is called round-robin). The chunk size is given as n.
• When no schedule clause is written, the default is static with a chunk size of loop length / number of threads.

[Figure: chunks assigned in turn to thread 0, thread 1, thread 2, thread 3.]

Page 56:

Loop scheduling clauses (part 2)

• schedule(dynamic, n)
• The loop is divided into chunks of the given size, and each chunk is handed, first come first served, to whichever thread has finished its previous work. The chunk size is given as n.

[Figure: chunks assigned to thread 0, thread 1, thread 2, thread 3 in the order the threads become free.]

Page 57:

Loop scheduling clauses (part 3)

• schedule(guided, n)
• The loop is divided into chunks whose size shrinks step by step, and each chunk is handed, first come first served, to whichever thread has finished its previous work. The chunk size is given as n.
• When the chunk size is 1, each chunk is roughly the number of remaining iterations divided by the number of threads.
• The chunk size decreases exponentially toward 1.
• When a value k greater than 1 is given, the chunk size decreases exponentially down to k; the last chunk may be smaller than k.
• When no chunk size is given, the default is 1.

[Figure: chunks of decreasing size assigned to thread 0, thread 1, thread 2, thread 3.]
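To experiment with the three scheduling kinds, a loop with deliberately uneven work per iteration is convenient. The sketch below is an assumption-laden illustration: the triangular workload and the name `triangular_work` are invented here, and only the `schedule` clause syntax comes from the slides.

```c
/* Iteration i does i+1 units of work, so equal contiguous blocks would give
   the last thread far more work than the first. */
long triangular_work(int n)
{
    long total = 0;
    int i, j;
#pragma omp parallel for private(j) schedule(dynamic, 10) reduction(+ : total)
    for (i = 0; i < n; i++) {
        for (j = 0; j <= i; j++) {
            total += 1;
        }
    }
    return total;   /* n * (n + 1) / 2 whatever the schedule */
}
```

Replacing `schedule(dynamic, 10)` with `schedule(static, 10)` or `schedule(guided, 10)` changes only how iterations are handed to threads, not the result. OpenMP also provides `schedule(runtime)`, which reads the kind and chunk size from the environment variable OMP_SCHEDULE, so the variants can be compared without recompiling.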

Page 58:

How to use the loop scheduling clauses

● Fortran90 example
!$omp parallel do private( j, k ) schedule(dynamic,10)
do i=1, n
  do j=indj(i), indj (i+1)-1
    y( i ) = amat( j ) * x( indx( j ) )
  enddo
enddo
!$omp end parallel do

● C example
#pragma omp parallel for private( j, k ) schedule(dynamic,10)
for (i=0; i<n; i++) {
  for ( j=indj[i]; j<indj[i+1]; j++) {
    y[ i ] = amat[ j ] * x[ indx[ j ]];
  }
}

The trip count of the j loop is determined through an indirect reference, so the computational load of the i loop iterations is uneven and not known until run time. Dynamic scheduling is used for load balancing.

Page 59:

Programming notes on loop scheduling

• The chunk size of dynamic and guided has a large effect on performance.
• If the chunk size is too small, the load balance improves but the overhead of waiting for work grows.
• If the chunk size is too large, the load balance worsens but the overhead of waiting for work shrinks.
• There is a trade-off between the two.
• The chunk size has to be tuned at run time, which raises the tuning cost.
• A fast implementation using static alone is possible (in some cases).
• Run-time scheduling such as dynamic incurs system overhead, whereas static has (almost) none.
• If the loop ranges that balance the load are worked out in advance, static scheduling may be the most efficient choice.
• However, the programming cost increases.

Page 60:

Implementation example that balances the load with static scheduling alone

• An example applied to the sparse matrix-vector product

!$omp parallel do private(S,J_PTR,I)
DO K=1,NUM_SMP
  DO I=KBORDER(K-1)+1,KBORDER(K)
    S=0.0D0
    DO J_PTR=IRP(I),IRP(I+1)-1
      S=S+VAL(J_PTR)*X(ICOL(J_PTR))
    END DO
    Y(I)=S
  END DO
END DO
!$omp end parallel do

• The K loop runs once per thread; each iteration covers the loop range handled by that thread.
• KBORDER holds the per-thread loop ranges, determined in advance so that the load is balanced (each thread's range is contiguous, but the ranges have unequal lengths).
• The method applies when a contiguous assignment that balances the load can be worked out before execution. It cannot be applied when the load changes dynamically at run time.

Page 61:

OpenMP����� ����������

2018/5/8 ������ ��������� 61

Page 62:

Points to note when programming with OpenMP

• In most cases OpenMP parallelization means simple parallelization of for loops with the parallel construct.
• Complicated OpenMP parallelization is costly to program, so the programming advantage of OpenMP is lost.
• Parallelizing with the parallel construct requires understanding how to use the private clause correctly.

Page 63:

Notes on the private clause (part 1)

• In OpenMP, every variable other than the loop variable of the loop directly targeted by the directive is shared unless it is declared private.
• Variables left at the default are contended between threads, so the result does not match the sequential one.

● Example of a bug caused by the default variables
!$omp parallel do
do i=1, 100
  do j=1, 100
    tmp = b(i) + c(i)
    a( i ) = a( i ) + tmp
  enddo
enddo
!$omp end parallel do

• Only the variable of this i loop is made private without a declaration.
• The j loop variable becomes shared without a private declaration ← contention between threads, result differs from sequential ← a bug in parallel execution
• The variable tmp becomes shared without a private declaration ← contention between threads, result differs from sequential ← a bug in parallel execution

Page 64:

Notes on the private clause (part 2)

• If the target code is moved into a function to reduce the number of variables listed in the private clause, and the function ends up with many arguments, the call time increases and can cancel out the benefit of thread parallelization.

● Example of a call with many arguments
!$omp parallel do
do i=1, 100
  call foo(i,arg1,arg2,arg3,arg4,arg5, ….., arg100)
enddo
!$omp end parallel do

• Variables declared inside the function are private, so the number of variables listed in the private clause goes down.
← However, the overhead of each function call increases.
← In threaded execution the call overhead is no longer negligible, and the speedup is limited.
• Remedy: pass the data through global variables to reduce the number of arguments.

Page 65:

Notes on the private clause (part 3)

• In OpenMP, every variable that is used without a declaration in a clause becomes a shared variable.
• Global variables in C, and common variables and module variables in Fortran90, are shared variables as they stand.
• To make them private, a threadprivate declaration is needed.

• parallel��7)��=006,B��+4;)��7RTFQ:�&06,B��A+����:9B• 4;??7<+��!7�� �09,• /CD*.:<+��;HTK;��-�%

• �';RTFQ��D��:05)��=0D�B• �';RTFQ��D����:06+Threadprivate�&1B

2018/5/8 ILHSMRGPOSGUWV+U�V 65

Page 66:

Notes on nesting the parallel construct (part 1)

• The parallel construct can be written separately from the do directive.
• When a single loop is the target, some compilers generate code that forks at the do directive for each loop when the two are written separately, and execution can become slower.

Written separately:
!$omp parallel
!$omp do private(j,tmp)
do i=1, 100
  do j=1, 100
    tmp = b( @ ) + c( @ )
    a( i ) = a( i ) + tmp
  enddo
enddo
!$omp end do
!$omp end parallel

Written as one directive:
!$omp parallel do private(j,tmp)
do i=1, 100
  do j=1, 100
    tmp = b( @ ) + c( @ )
    a( i ) = a( i ) + tmp
  enddo
enddo
!$omp end parallel do

If the parallel construct targets a single loop, specify it with parallel do.

Page 67:

Notes on nesting the parallel construct (part 2)

• The parallel construct can be written separately from the do directive.
• When the inner loop is the one to be parallelized, separating them is faster.
• However, when the outer loop can be parallelized, parallelizing it gives better performance.
• The case here is that the outer loop has a data dependence and cannot be parallelized.

Before:
do i=1, n
!$omp parallel do
  do j=1, n
    <processing that can be parallelized>
  enddo
!$omp end parallel do
enddo

After:
!$omp parallel
do i=1, n
!$omp do
  do j=1, n
    <processing that can be parallelized>
  enddo
!$omp end do
enddo
!$omp end parallel
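The second form above, with the parallel region hoisted out of the sequential outer loop, looks as follows in C. The step count, array length, update rule, and the name `run_steps` are assumptions chosen to give a checkable result.

```c
#define STEPS 10
#define LEN 1000

/* The outer loop over t is sequential (step t needs the result of step t-1);
   only the inner loop over j is divided among the threads. */
double run_steps(void)
{
    static double x[LEN];
    int t, j;

    for (j = 0; j < LEN; j++) x[j] = 0.0;

#pragma omp parallel private(t)     /* threads are started once, not per step */
    {
        for (t = 0; t < STEPS; t++) {
#pragma omp for                     /* j is made private by the for construct */
            for (j = 0; j < LEN; j++) {
                x[j] = x[j] + 1.0;
            }                       /* implicit barrier before the next step */
        }
    }
    return x[LEN - 1];              /* each element was incremented STEPS times */
}
```

Every thread executes the outer loop itself, so `t` has to be private, and all threads reach the `omp for` the same number of times, which the construct requires.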

Page 68:

Example of a bug caused by a data race

• An example that adds into elements selected through an indirect index.
• Depending on the pattern of the indirect references and on the timing of the threads, the result may happen to match the sequential one, and the program can be mistaken for working correctly.
• In theory it is wrong.
• OpenMP does not guarantee data consistency for shared variables.
• Guaranteeing consistency needs mutual exclusion, for example with the critical directive.

● Loop that contains the bug
!$omp parallel do private( j )
do i=1, n
  j = indx( i )
  a( j ) = a( j ) + 1
enddo
!$omp end parallel do

● Corrected with critical
!$omp parallel do private( j )
do i=1, n
  j = indx( i )
!$omp critical
  a( j ) = a( j ) + 1
!$omp end critical
enddo
!$omp end parallel do

Page 69:

Slowdown caused by the critical directive (1/2)

• As described earlier, when the critical directive has to be inserted, performance drops, especially in runs with many threads.
• To get high performance, there is basically no choice but to change the algorithm.
• There are three approaches:
1. Restrict accesses to within a thread and remove the critical directive
• Change the algorithm so that, for the indirectly referenced data, each thread provably accesses only the data assigned to it.
2. Minimize accesses between threads
• Examine the indirectly referenced data in advance and reorder it so that fewer threads enter the critical region at the same time.
3. Separate the part that accesses data across threads from the loop and run it sequentially
• Example: the reduction clause in an inner product

Page 70:

Slowdown caused by the critical directive (2/2)

• Another implementation: use omp atomic

• ���.6-'(&�#$�����$7!�8

• It applies to an update that can be written as one statement of the form x = x op <value>
• op: +, -, *, /, and so on

!$omp parallel do private( j )
do i=1, n
  j = indx( i )
!$omp atomic
  a( j ) = a( j ) + 1
enddo
!$omp end parallel do
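The same indirect update with `omp atomic` can be written in C as sketched below. The function names and the index pattern (`i % 4`) are made up for the example so that the expected counts are known.

```c
/* Adds 1 to a[indx[i]] for every i. Several iterations can hit the same
   element, so the update is protected with omp atomic. */
void count_hits(int *a, const int *indx, int n)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        int j = indx[i];        /* private: declared inside the loop body */
#pragma omp atomic
        a[j] += 1;
    }
}

/* Hypothetical driver: 1000 updates spread evenly over 4 elements. */
int run_atomic_demo(void)
{
    static int indx[1000];
    int a[4] = {0, 0, 0, 0};
    int i;
    for (i = 0; i < 1000; i++) indx[i] = i % 4;
    count_hits(a, indx, 1000);
    return a[0];
}
```

Unlike critical, atomic protects only the single update statement that follows it, so updates to different elements of `a` need not wait for each other.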

Page 71: 1 (+*), ! .1/ · [ZY\ $ • emacs7# emacsX %MABQ • ^x ^s U]8controlVXIDGJ7 • ^x ^c X U^z 3 0=4&GKFS7" * *=' 6/5(-4'V • ^g : !*?);5,5214+' • ^k : CTHQ:< 93 0' /1 8& 6

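Drawbacks of parallelization with OpenMP (part 1)
• OpenMP is suited to parallelizing simple loops.
• Complex loops in real applications are, as they stand, sometimes not suited to OpenMP.

1. The number of variables that must be listed in the private clause becomes very large
  • When parallelizing with OpenMP from an outer loop, a large number of variables may be used inside it.
  • If you forget to write a variable in the private list, the compiler does not report an error (the user is responsible for the correctness of the parallelization).
  • At run time the result depends on timing and differs from the sequential result. Because it is unclear where the mistake is, debugging becomes very hard.
  • Remedy: some compilers can output optimization information. Use that information to check that the variables have been made private as intended.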
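One further safeguard, which is standard OpenMP but not something the slide itself mentions, is the `default(none)` clause: it forces every variable used in the region to be listed explicitly, so a forgotten `private` entry becomes a compile-time error when OpenMP is enabled. The sketch below uses made-up names (`scale_rows`, `factor`, `t`) purely for illustration.

```c
/* Multiply every element of an n-by-n row-major matrix by factor. */
void scale_rows(double *a, int n, double factor)
{
    int i, j;
    double t;

    /* default(none): a, n and factor must be declared shared, j and t private.
       Omitting j or t from private() is then rejected by an OpenMP compiler
       instead of silently producing a data race.
       The loop variable i of the parallelized loop is private automatically. */
    #pragma omp parallel for default(none) shared(a, n, factor) private(j, t)
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            t = a[i * n + j] * factor;
            a[i * n + j] = t;
        }
    }
}
```

Declaring `j` and `t` inside the loop body would make them private without any clause; the function-scope declarations are used here only to mirror the situation described above.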


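Drawbacks of parallelization with OpenMP (part 2)

2. Tuning is difficult when performance does not improve in runs with many threads
  • Typically, performance is good with fewer than 8 threads but degrades with 8 threads or more.
    1. Recent hardware has insufficient memory access performance
    2. The loop itself lacks parallelism (the loop length is short)
  • Solving this requires changing the algorithm or the implementation, which undermines the ease of programming that is OpenMP's advantage.

3. Not suited to complex thread programming
  • The specification appears to have been designed around describing simple numerical kernel loops with constructs such as parallel for.
  • Complex processing is easier to write with a native thread API such as Pthreads.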


��������������-���OpenMP�

������� ��������� 732018/5/8

Notes on the matrix-matrix product sample program (OpenMP version)
• File name (common to the C and Fortran versions):
  Mat-Mat-openmp-ofp.tar.gz
• In the job script file mat-mat-openmp.bash, change the queue name from lecture-flat to lecture5-flat and the group name from gt00 to gt05, then submit the job (pjsub).
  • lecture-flat: queue for use outside class hours
  • lecture5-flat: queue for use during class hours


Running the matrix-matrix product sample program
• Run the following commands:
  $ cdw
  $ cp /work/gt05/z30105/Mat-Mat-openmp-ofp.tar.gz ./
  $ tar xvfz Mat-Mat-openmp-ofp.tar.gz
  $ cd Mat-Mat-openmp
• Run one of the following:
  $ cd C : if you use C
  $ cd F : if you use Fortran
• Build:
  $ make
• Submit the job script:
  $ pjsub mat-mat-openmp.bash
• When the job has finished, check the output:
  $ cat mat-mat-openmp.bash.oXXXXX


��-���#,&*&+!)(���•��������������-C �.(����OpenMP������� )

N = 2000
Mat-Mat time = 1.386665 [sec.]
11538.476510 [MFLOPS]
OK!
N = 2000
Mat-Mat time = 1.386445 [sec.]
11540.305945 [MFLOPS]
OK!


��-���#,&*&+!)(���•��������������-Fortran �.(����OpenMP������� )

N = 2000
Mat-Mat time[sec.] = 9.86477398872375
MFLOPS = 1621.93274553408
OK!
N = 2000
Mat-Mat time[sec.] = 7.95836710929871
MFLOPS = 2010.46266650720
OK!


Exercises

1. Speed up the MyMatMat function by blocking.
  • Use the earlier sample program (Mat-Mat-noopt-ofp.tar.gz). This is not the OpenMP version!
  • The compiler optimization level is set to 0.
  • Because the compiler performs no optimization, changes in performance come from your own code transformations such as unrolling.

2. Parallelize the MyMatMat function with OpenMP.
  • Use the sample program for this exercise (Mat-Mat-openmp-ofp.tar.gz).
  • Also apply tuning such as unrolling and blocking.
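As a starting point for the blocking and OpenMP exercises, a cache-blocked matrix-matrix product might be sketched as follows. This is not the course's reference solution: the function name, the flat row-major layout and the block size parameter `bs` are assumptions chosen for illustration, and the actual MyMatMat interface in the sample program may differ.

```c
#include <stddef.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* C += A * B for n-by-n row-major matrices, processed in bs-by-bs blocks.
   bs must be positive. Threads are assigned whole block rows of C,
   so no two threads ever write the same element. */
void matmul_blocked(int n, int bs, const double *A, const double *B, double *C)
{
    #pragma omp parallel for
    for (int ib = 0; ib < n; ib += bs) {
        int iend = min_int(ib + bs, n);
        for (int jb = 0; jb < n; jb += bs) {
            int jend = min_int(jb + bs, n);
            for (int kb = 0; kb < n; kb += bs) {
                int kend = min_int(kb + bs, n);
                /* Multiply one block of A by one block of B. */
                for (int i = ib; i < iend; i++) {
                    for (int j = jb; j < jend; j++) {
                        double sum = C[(size_t)i * n + j];
                        for (int k = kb; k < kend; k++) {
                            sum += A[(size_t)i * n + k] * B[(size_t)k * n + j];
                        }
                        C[(size_t)i * n + j] = sum;
                    }
                }
            }
        }
    }
}
```

The block size that performs best depends on the cache sizes of the machine and on N, so it is worth measuring several values rather than fixing one in advance.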


��-����OpenMP����

��� ����������� 792018/5/8

OpenMP version of the matrix-matrix product code (C)
• The code is as follows:


#pragma omp parallel for private (j, k)
for(i=0; i<n; i++) {
  for(j=0; j<n; j++) {
    for(k=0; k<n; k++) {
      C[i][j] += A[i][k] * B[k][j];
    }
  }
}
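To see why only `j` and `k` appear in the private clause, the loop can be wrapped in a complete function as in the sketch below. The bound `NMAX` and the function signature are assumptions for illustration and are not taken from the sample program. The variable `i` is the loop variable of the parallelized loop and is therefore private automatically, while each thread writes only the rows of `C` that belong to its own `i` values.

```c
#define NMAX 100   /* hypothetical compile-time bound on the matrix size */

/* C += A * B for the leading n-by-n part of the arrays (n <= NMAX). */
void matmul_omp(int n, double C[NMAX][NMAX],
                double A[NMAX][NMAX], double B[NMAX][NMAX])
{
    int i, j, k;

    /* j and k are declared outside the loop, so they would be shared by
       default and must be made private explicitly. */
    #pragma omp parallel for private(j, k)
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            for (k = 0; k < n; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
}
```

Since no element of `C` is updated by more than one thread, neither critical nor atomic is needed here.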

OpenMP version of the matrix-matrix product code (Fortran)
• The code is as follows:


!$omp parallel do private (j, k)
do i=1, n
  do j=1, n
    do k=1, n
      C(i, j) = C(i, j) + A(i, k) * B(k, j)
    enddo
  enddo
enddo


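Report assignments

1. [L15] For the blocked matrix-matrix product code, apply unrolling to each loop and evaluate the performance. Vary the matrix size (N) and evaluate which unrolling is best for each N.

2. [L15] For the OpenMP version of the matrix-matrix product code, apply blocking and unrolling and evaluate the performance.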

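About the difficulty levels of the problems:
• L00: Extremely easy problems.
• L10: Problems that can be solved with a little thought.
• L20: Standard problems.
• L30: Problems that require several hours.
• L40: Problems that require several weeks. A complex implementation is needed.
• L50: Problems that require several months. Includes unsolved problems.
* Problems at L40 and above are worth publishing as a paper.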


�#��

3. OpenMP����������'�"���!������OpenMP�OpenACC���


References

• 寒川光, 「RISC超高速化プログラミング技法」, 共立出版, ISBN4-320-02750-7, 3,500 yen
• Kevin Dowd, translated by 久良知真子, 「ハイ・パフォーマンス・コンピューティング: RISCワークステーションで最高のパフォーマンスを引き出すための方法」, インターナショナル・トムソン・パブリッシング・ジャパン, ISBN4-900718-03-3, 4,400 yen


������-�����
