team tsinghua student cluster competition @sc’19 · team tsinghua . student cluster competition...

7
Team Tsinghua Student Cluster Competition @SC’19 Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang Zhao, Wentao Han, Jidong Zhai

Upload: others

Post on 05-Nov-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

Team Tsinghua Student Cluster Competition

@SC’19Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang,

Chenggang Zhao, Wentao Han, Jidong Zhai

Page 2: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

Team Members

Prof. Jidong Zhai

Dr. Wentao Han

Chen Zhang Reproducibility

Chenggang Zhao Reproducibility

Jiaao He Arch

HPL/HPCG SST

Shengqi Chen Networking

IO-500 SST

Kezhao Huang VPIC

Liyan Zheng VPIC

Junior Backups Working in all aspects

Page 3: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

Cluster Architecture

Intel 8280 x2 NVIDIA V100 x8

Intel 8280 x2 Intel PCIe SSD RAID0

Inifiniband

EthernetRedundant

Backup Nodes

Page 4: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

Software Stack

Debian 9

NFS on ZFS

Spack

Compilers: GCC, ICC, LLVM, etc.

Libraries: CUDA, CUDNN, BLAS, MPI, etc.

Power control & montior: fan, cpu, gpu, ipmi

Page 5: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

SST - Computer Simulator• Compilers: GCC, ICC (with bugs fixed), LLVM

• Core: Manual partitioner + Human Intelligence

• Components:

• Miranda: Callback lookup table -> 7x speedup

• MemH: Optimized data structure initialization -> 1.1x speedup

• Ember: Remove debug log generation -> 1.05x speedup

Page 6: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

Reproduction - Planet Normal Modes• Local data generation

• Accurate runtime estimation

• One-key task scheduler with fancy functions

• Switch datasets

• Run monitor

• Auto plotting

Page 7: Team Tsinghua Student Cluster Competition @SC’19 · Team Tsinghua . Student Cluster Competition @SC’19. Jiaao He, Shengqi Chen, Liyan Zheng, Kezhao Huang, Chen Zhang, Chenggang

VPIC

• Well-vectorized, largely scalable, computation intensive

• Performance insensitive across nodes (with IB)

• Optimize AVX load instruction imbalance by specifying core affinity -> 1.2x speedup