Professor Boris M. Glinsky, Nikolay V. Kuchin. HP-CAST 20, 2013


Page 1:

Professor Boris M. Glinsky, Nikolay V. Kuchin

HP-CAST 20, 2013

Page 2:

The Siberian Supercomputer Center


• The supercomputer center of collective use “The Siberian Supercomputer Center” (SSCC) was created as a laboratory of the Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG) according to Directive No. 100 of the Presidium of the Siberian Branch of the Russian Academy of Sciences (SB RAS) of 06.03.2001.

• Today the center of collective use “The Siberian Supercomputer Center” is a joint project of ICM&MG and the Institute of Cytology and Genetics (IC&G) SB RAS.

• The Scientific Supervisor of the Siberian Supercomputer Center is Academician, Professor Boris G. Mikhailenko, director of ICM&MG SB RAS and Chairman of the Scientific Council on Supercomputing of the SB RAS.

• The HPC Competence Center SB RAS – Intel was organized in 2008.

• http://www.sscc.ru, http://www2.sscc.ru

Page 3:

The Siberian Supercomputer Center


The main objectives of the Siberian Supercomputer Center are as follows:

• providing modern computing resources, hardware and software, to researchers of the Siberian Branch of the Russian Academy of Sciences (SB RAS) and the universities of Siberia;

• training researchers of the SB RAS and university students in modern methods of parallel computing and in solving large-scale problems on supercomputers;

• managing the development of all supercomputer centers of the SB RAS (according to the directives of the Scientific Council on Supercomputing of the SB RAS).

Page 4:

Training, schools and seminars


1) The International Conference “Parallel Computational Technologies 2012” (PAVT 2012) was held by ICMMG with 242 participants from Russia, Kazakhstan, Ukraine, Germany, France, and the USA; see http://agora.guru.ru/display.php?conf=pavt2012.

2) A three-day NVIDIA CUDA Technology training course was held in 2012. The school was supported by NVIDIA specialists on the computing resources of the cluster. 118 attendees from research institutes of the SB RAS, higher education institutions, and companies were trained. The program and teaching materials are available at http://www2.sscc.ru/Seminars/Nvidia%20Cuda-1.htm.

3) The Workshop on Parallel Programming of Hybrid Clusters was held in December 2012; see http://www2.sscc.ru/Seminars/Shool-2012.htm.

4) Training in high-performance computing and mathematical modeling is provided by specialists of ICMMG SB RAS at five chairs of NSU and NSTU: the Mathematical Methods in Geophysics Chair (NSU), the Numerical Analysis Chair (NSU), the Parallel Computations Chair (NSU), the Computer Systems Chair (NSU), and the Parallel Computer Technologies Chair (NSTU).

5) The “Architecture, System and Application Software of Cluster Supercomputers” seminars are held regularly on the basis of the SSCC, the Chair of Computer Systems of NSU, and the SB RAS/Intel Competence Center on High-Performance Computing. The seminars are presented at http://www2.sscc.ru/Seminars/NEW/Seminars.htm.

Page 5:

COMPUTING RESOURCES GROWTH


[Chart] Growth of total peak performance by year:

2005 – 0.246 TFlops (MVS-1000/32, MVS-1000/128M)
2006 – 0.247 TFlops
2007 – > 1 TFlops (NKS-160, > 1 TFlops)
2008 – 5.8 TFlops (NKS-160 + NKS-30T at 4.8 TFlops)
2009 – 7.1 TFlops
2010 – 17.5 TFlops (NKS-160 + NKS-30T at 16.5 TFlops)
2011 – 31 TFlops (NKS-160 + NKS-30T at 30 TFlops)
2012 – 116 TFlops (NKS-160 + hybrid cluster at 115 TFlops)

Page 6:

Hardware and software

Hybrid cluster NKS-30T + GPUs with Cool Aisle Containment System (CACS):
• 576 CPU (2688 cores) Intel Xeon E5450/E5540/X5670;
• 80 CPU Intel Xeon X5670 (480 cores);
• 120 GPU NVIDIA Tesla M2090 (61440 cores);
• peak performance 115 TFlops.

Shared memory server (HP DL980 G7):
• 4 CPU (40 cores) Intel E7-4870; RAM 512 GB; 384 GFlops.

Storage subsystem:
• IBRIX cluster file system; 4 servers, 32 TB.

Software:
• RedHat 5.4 operating system; PBSPro 11.1 batch system; HP Cluster Management Utility.
• Toolkits: Intel® Cluster Studio XE 2013, including Intel compilers and MPI 4.1; NVIDIA CUDA 5; Portland Group PGI Accelerator.
• Application programs: ANSYS CFD 14.5.7 (Fluent), Gaussian 09, Bioscope.
• Software developed in ICMMG: PARMONC, AGNES.


Page 7: Professor Boris M. Glinsky Nikolay V. Kuchin › HPC › downloads › 10._SSCC... · • Which accelerator is the best for user applications (Xeon Phi, Nvidia Kepler) ? We can evaluate

Hybrid cluster (NKS-30T + GPUs)


[Diagram] NKS-30T cluster: BL2x220c G5 (64 nodes) + BL2x220c G6 (128 nodes) + BL2x220c G7 (96 nodes).
Cluster extension with GPUs: 40 SL390s G7 servers with 120 Tesla M2090 GPUs.
Shared memory server: DL980 G7, 40 cores, RAM 512 GB, 384 GFlops.
All components are connected over InfiniBand to the IBRIX cluster file system (4 servers, 32 TBytes).
Future extension / plans: IBRIX cluster file system with xx servers.

Page 8:

Hardware: more details


• 7 HP BladeSystem c7000 Enclosures:
  2 c7000 Enclosures, 32 HP BL2x220c G5, 64 compute nodes, 128 CPU Intel Xeon E5450, 512 cores, RAM 16 GB, 6.1 TFlops.
  4 c7000 Enclosures, 64 HP BL2x220c G6, 128 compute nodes, 256 CPU Intel Xeon E5540, 1024 cores, RAM 16 GB, 10.36 TFlops. The BL2x220c G6 system boards were redesigned at the end of 2010; we are going to replace all old system boards with the new redesigned ones.
  3 c7000 Enclosures, 48 HP BL2x220c G7, 96 compute nodes, 192 CPU Intel Xeon X5670, 1152 cores, RAM 24 GB, 13.5 TFlops.

• 40 SL390s G7; each server has 2 CPU Intel Xeon X5670, RAM 96 GB, and 3 GPU NVIDIA Tesla M2090. The peak performance of the 40 servers is 85 TFlops.

• DL980 G7: 4 CPU Intel E7-4870, 40 cores, RAM 512 GB, local RAID 1.5 TB, 384 GFlops.

• IBRIX cluster file system: 4 HP DL380 G6, 2 CPU Intel Xeon E5520, 8 cores, RAM 48 GB; 4 HP 2000sa Modular Smart Arrays (MSA2000sa), 32 TB (48 TB raw space).

IBRIX is a very good system, but its configuration is too small for our cluster; IBRIX throughput is a bottleneck for some user applications.

Page 9:

Software: more details


• System software: RHEL 5.4; Altair PBS Pro 11.1 batch system; HP Cluster Management Utility.

• Toolkits: Intel® Cluster Studio XE 2013 for Linux (Intel C/C++ and Intel Fortran Composer XE compilers, Intel MPI 4.1, Trace Analyzer & Collector 8.1); NVIDIA CUDA 5; Portland Group PGI Accelerator 13.4.

• Application programs: ANSYS CFD 14.5.7 (Fluent); Gaussian 09 (in the next 2 or 3 months); Bioscope, Gromacs, Quantum Espresso.

• Software developed in ICMMG: PARMONC, AGNES.

Page 10:


The program library PARMONC for parallel Monte Carlo simulation

M.A. Marchenko, ICM&MG SB RAS, [email protected]

The program library PARMONC (PARallel MONte Carlo) is designed for parallelizing time-consuming Monte Carlo simulations.

The core of the library is a parallel long-period random number generator developed at ICM&MG. Using PARMONC, Monte Carlo codes written in FORTRAN or C can be parallelized easily, without explicit use of MPI.

Scope: time-consuming applications in the natural sciences (physics, chemistry, biology, etc.).

References:

1) Marchenko, M.: PARMONC – a software library for massively parallel stochastic simulation. LNCS, vol. 6873, pp. 302–315. Springer, Heidelberg (2011).
2) http://www2.sscc.ru/soran-intel/paper/2011/parmonc.htm
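To make the idea above concrete, the sketch below shows, in plain Python, the pattern such a library implements: each worker consumes an independent random substream and returns a partial estimate, and the partial estimates are averaged, so the user-visible code contains no explicit MPI. This is only an illustration under stated assumptions: it is not PARMONC's API, the toy integrand is invented, and PARMONC itself provides the substreams through its own long-period parallel generator with FORTRAN/C bindings.

```python
# Illustrative sketch only, not the PARMONC API: parallel Monte Carlo with
# one independent random substream per worker and averaging of the partial
# results, so the "user" code contains no explicit MPI or process management.
import numpy as np
from multiprocessing import Pool

def partial_estimate(args):
    seed_seq, n_samples = args
    rng = np.random.default_rng(seed_seq)    # independent substream per worker
    x = rng.random(n_samples)
    return float(np.mean(np.exp(x)))         # toy integrand: E[exp(U)], U ~ U(0,1)

if __name__ == "__main__":
    n_workers, n_per_worker = 8, 1_000_000
    # PARMONC uses its own long-period parallel generator; here distinct
    # substreams are produced with NumPy's SeedSequence.spawn for illustration.
    substreams = np.random.SeedSequence(2013).spawn(n_workers)
    with Pool(n_workers) as pool:
        parts = pool.map(partial_estimate,
                         [(s, n_per_worker) for s in substreams])
    print("Monte Carlo estimate:", np.mean(parts), "  exact value:", np.e - 1.0)
```

The same decomposition, hidden behind library calls, is what lets existing FORTRAN or C Monte Carlo codes run in parallel without explicit MPI, as the slide notes.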

Page 11:


Usage of computing resources in 2012 (by organization):

32.98 % – ICG, Institute of Cytology and Genetics
19.84 % – IC, Institute of Catalysis
15.67 % – ICMMG, Institute of Computational Mathematics and Mathematical Geophysics
10.47 % – NSU, Novosibirsk State University
6.16 % – ICKC, Institute of Chemical Kinetics and Combustion
4.69 % – ICCT, Institute of Chemistry and Chemical Technologies, Krasnoyarsk
2.90 % – ITP, Institute of Thermophysics
1.80 % – INP, Institute of Nuclear Physics
1.31 % – ICT, Institute of Computational Technologies
1.20 % – IPGG, Institute of Petroleum Geology and Geophysics
0.79 % – ITAM, Institute of Theoretical and Applied Mechanics
0.35 % – NSTU, Novosibirsk State Technical University
0.27 % – ICBFM, Institute of Chemical Biology and Fundamental Medicine
0.26 % – IEC, Institute of the Earth Cryosphere, Tyumen
0.05 % – NIIC, Institute of Inorganic Chemistry
0.05 % – ILP, Institute of Laser Physics
0.02 % – ISP, Institute of Semiconductor Physics

Page 12:

Scientific Research Areas (from annual reports of users)


1) Industry of nano-systems – ICMMG, IC, ITAM, ICKC, IFS, INP, ICCT (Krasnoyarsk), IEC (Tyumen)
2) Information and telecommunication systems – ICT, ICMMG, NIIC, NSU, NSTU, ICG, INP
3) Energy efficiency, economy of power, nuclear power – ICT, ICMMG, IC, ITP, ICKC, INP, NSU, NSTU, IPGG
4) Life sciences – ICBFM, ICG, ICMMG, NSU, IC, IEC (Tyumen)
5) Rational use of natural resources – ICMMG, IPGG, ITP, NSU, ICCT (Krasnoyarsk), IEC (Tyumen)
6) Transport and space systems – ITAM, NSTU

From the annual reports of our users in 2012: our users carried out 152 grants, programs, projects, and topics, including 2 international grants: Russian Foundation for Basic Research (RFBR) grants – 54, Russian Academy of Sciences programs – 16, SB RAS projects – 37, Ministry of Education programs – 23, others – 22.

Page 13:

Usage of computing resources in 2009 - 2012


Accounting statistics (NKS-160 + NKS-30T):

                                     2009            2010            2011            2012
∑ peak performance (TFlops)           7.1            17.5            31.0             116
∑ CPU hours                  1 924 308.38    2 908 834.93    5 039 941.76   12 799 789.11
∑ number of jobs                   38 914          39 750          35 952          83 797

Page 14:

Processing the results of experiments in high energy physics


A common virtualized computing environment has been designed at the Institute of Nuclear Physics of the SB RAS (INP). It includes the computing resources of the HPC clusters of the Siberian Supercomputer Center and Novosibirsk State University. Additional software, including KVM, has been installed on the cluster compute nodes. A dedicated 10 Gbit/s network is used as the transport level. All data are located at INP and are accessible via NFS. A PBS prologue and epilogue are used to start and stop the KVM virtual machine on a compute node and to start/stop NFS over the InfiniBand plus 10 Gbit/s network.
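A minimal sketch of that prologue/epilogue scheme is given below, assuming the guest is managed through libvirt (virsh) and the INP data are mounted over NFS; the domain name, NFS export, and mount point are hypothetical, and the actual SSCC/INP scripts may differ. PBS MOM prologue and epilogue hooks are simply executables run before and after each job.

```python
#!/usr/bin/env python3
# Hedged sketch of the prologue/epilogue scheme described above: before the job
# starts, the KVM guest is booted and the INP data are mounted over NFS; after
# the job, everything is torn down. Names here are hypothetical, not the real
# SSCC/INP configuration.
import subprocess
import sys

VM_DOMAIN = "slc3-kedr-worker"            # hypothetical libvirt domain name
NFS_EXPORT = "inp-storage:/data/kedr"     # hypothetical NFS export at INP
MOUNT_POINT = "/mnt/inp-data"

def prologue() -> None:
    subprocess.run(["virsh", "start", VM_DOMAIN], check=True)        # boot the guest
    subprocess.run(["mount", "-t", "nfs", NFS_EXPORT, MOUNT_POINT],  # data over the
                   check=True)                                       # IB / 10 GbE path

def epilogue() -> None:
    subprocess.run(["umount", MOUNT_POINT], check=False)             # best-effort cleanup
    subprocess.run(["virsh", "shutdown", VM_DOMAIN], check=False)

if __name__ == "__main__":
    # Installed under two names (prologue and epilogue) in PBS's mom_priv
    # directory, the same script picks its phase from its own file name.
    prologue() if sys.argv[0].endswith("prologue") else epilogue()
```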

• KEDR

KEDR is a large-scale particle detector experiment carried out at the VEPP-4M electron-positron collider at INP. The offline software of the experiment has been under development since the late 1990s. After several migrations, the standard computing environment was frozen on Scientific Linux CERN 3 i386, and no further migrations are expected in the future.

The SND detector experiment, which is carried out at INP at the VEPP-2000 collider, has successfully adopted the virtualization solution previously built for the KEDR detector in order to satisfy its own needs for HPC resources. The INP user group doing data analysis for the ATLAS experiment at the LHC (CERN, Switzerland) within the framework of the ATLAS Exotics Working Group has joined this activity as well.

Page 15:

Future Work


• Which parallel file system does the new cluster need (Lustre, Panasas)?

Some bioinformatics applications use large volumes of data.

• Which accelerator is the best for user applications (Xeon Phi, NVIDIA Kepler)?

We can evaluate Intel Xeon Phi now: we have access to an Intel Xeon Phi cluster located in the Joint Supercomputer Center RAS, Moscow.

• HPC cluster workload optimization: combining CPU-intensive applications with I/O-intensive applications.

But it is impossible to predict the cluster workload.

Page 16:


Thank you