Authorized and redacted by Ocean University of China on 4th July, 2017 as an example for ASC
Presentation of
ASC17
Final Competition
Ocean University of China
Team Advisor: Professor Shen Biao
2017-4-28
Team introduction
Cluster building
Application optimization
Summary
INTRODUCTION
01
Liao Jiawen (team leader), Marine Science
Bai Zongbao, Marine Science
Sun Xiaoshan, Atmospheric Science
Zhang Jinpei, Atmospheric Science
Yuan Man, Marine Science
Cluster building
02
Our cluster
CPU
GPU
CentOS 7.3
NVIDIA Tesla P100
Cluster building
KNL
GPU
GPU
Performance Optimization of Final Competition Applications
03
Performance optimization of HPL
• The result we submitted was 9.15 TFLOPS.
• The newer result is 20.14 TFLOPS.
Performance optimization of HPCG
• Two settings can be altered: run time and matrix size.
• Changing the matrix size can improve performance.
HPCG.dat
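The two settings above can be sketched as a small script that writes the input file. This assumes the four-line layout of the reference HPCG distribution (two header lines, then the local grid dimensions, then the run time in seconds) and its lowercase file name hpcg.dat. The values of NX, NY, NZ and RUNTIME are illustrative only and are not the team's settings.

```shell
#!/bin/sh
# Sketch: generate hpcg.dat with a chosen local problem size and run time.
# NX, NY, NZ and RUNTIME are illustrative values, not the team's settings.
NX=104
NY=104
NZ=104
RUNTIME=60   # seconds

# Lines 1-2 are free-form header text; line 3 holds the local grid
# dimensions per MPI process; line 4 holds the run time in seconds.
cat > hpcg.dat <<EOF
HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
${NX} ${NY} ${NZ}
${RUNTIME}
EOF

cat hpcg.dat
```

Trying a different matrix size then only means changing NX, NY and NZ and rerunning the benchmark.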
• We should run HPCG on many nodes with multiple processes.
• When using the command "mpirun", we need to write a hostfile listing the node names.
Hostfile
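A minimal sketch of such a hostfile and the matching launch line follows. It assumes Open MPI's hostfile syntax ("name slots=N"); other MPI implementations use a different format. The node names node01 and node02, the slot counts, and the binary name ./xhpcg are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: write an Open MPI style hostfile and show the matching launch line.
# node01/node02, the slot counts and ./xhpcg are hypothetical placeholders.
cat > hostfile <<EOF
node01 slots=2
node02 slots=2
EOF

# Total processes = sum of the slots listed in the hostfile.
NP=$(awk '{ split($2, kv, "="); total += kv[2] } END { print total }' hostfile)

# Printed rather than executed so the sketch is safe to run anywhere;
# drop the leading "echo" on a real cluster.
echo mpirun --hostfile hostfile -np "$NP" ./xhpcg
```

Deriving the process count from the hostfile keeps the two in step when nodes are added or removed.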
• sudo nvidia-smi -acp 0
• # allow normal users to set GPU clocks
• sudo nvidia-smi -ac 715,1328
• # set max boost clocks (memory,graphics in MHz) for the P100 PCIe
Boost clocks
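Before and after setting the clocks, it can help to check what the GPU supports and what is currently in effect. The sketch below wraps two nvidia-smi queries in a helper; the function name show_gpu_clocks is hypothetical, and the script simply skips the queries on a machine without nvidia-smi.

```shell
#!/bin/sh
# Sketch: inspect GPU clocks before and after setting application clocks.
# show_gpu_clocks is a hypothetical helper name used only for illustration.
show_gpu_clocks() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; skipping"
        return 0
    fi
    # Memory,graphics clock pairs the GPU accepts for "nvidia-smi -ac".
    nvidia-smi -q -d SUPPORTED_CLOCKS || echo "could not query supported clocks"
    # Current clocks, including the application clocks now in effect.
    nvidia-smi -q -d CLOCK || echo "could not query current clocks"
}

show_gpu_clocks

# To return to the default application clocks after benchmarking:
#   sudo nvidia-smi -rac
```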
Other options we have not tried yet:
1. Optimizing the data through the function OptimizeProblem()
2. Changing ComputeSPMV and ComputeSYMGS
Traffic prediction based on PaddlePaddle
[Chart: DREAM vs. REALITY, by percentage]
• Difficult installation
• Many ideas
• Lack of practice
MASNUM
Step 1: Find the hotspots: propagat.inc, implsch.inc
Step 2: Use OpenACC to accelerate on 64 CPEs
Step 3: Other optimizations: arrays, compiler options, vectorization
MASNUM Optimization Results
[Figure: bar chart of Time (s), axis 0 to 14000, for propagat, implsch, and total, each with OpenACC and with vectorization, on 2 CGs, 4 CGs, and 256 CGs.]
SUMMARY
New field, New learning, New challenge !
Thank you for your attention!