TRANSCRIPT
Information Technology Services | www.odu.edu/its | [email protected]
Information Technology Services (ITS)
HPC Infrastructure
HPC Day
October 15, 2014
HPC@ODU Introduction
High Performance Computing resources
Storage
Software
Support staff
Education and Outreach
HPC@ODU Introduction
Supporting research computing with parallel compute environments using the MPI and OpenMP programming models.
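As a concrete illustration of the two programming models, a minimal hybrid MPI + OpenMP "hello world" is sketched below. This is not part of the original slides; the compiler wrapper (mpicc) and launcher (mpirun) names are assumptions about a typical MPI installation.

/* hello_hybrid.c - minimal MPI + OpenMP sketch (illustrative only).
 * Build (assuming an MPI compiler wrapper with OpenMP support):
 *   mpicc -fopenmp hello_hybrid.c -o hello_hybrid
 * Run, for example across 2 MPI ranks:
 *   mpirun -np 2 ./hello_hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);                 /* start the MPI runtime        */

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks        */

    /* Each MPI rank spawns a team of OpenMP threads on its node. */
    #pragma omp parallel
    {
        printf("MPI rank %d of %d, OpenMP thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();                         /* shut the MPI runtime down    */
    return 0;
}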
Before 2013, the ZORKA and NIKOLA clusters provided computational resources to campus researchers.
Some examples of resources in these clusters:
40 Dell PE 1950 nodes, 4 cores each & 8 GB RAM.
4 Dell PE R900 nodes, 16 cores each & 32 GB RAM.
7 Sun Fire X4600 M2 nodes, 32 cores each & 64 GB RAM.
17 APPRO nodes with Tesla M2090 GPUs, 12 cores & 48 GB RAM each.
NFS-based scratch space mounted on all compute nodes (about 2 TB). Research mass storage accessible from the head node (approximately 60 TB of disk plus tape storage).
HPC Turing Cluster Base Configuration
New HPC cluster, primarily deployed to support GROMACS.
Funding support from Dr. Vernier, the Bio-Electric research center, and ITS.
The initial specification for the Turing cluster was as follows:
FDR-based InfiniBand switches.
Dell C8000 chassis with eight (8) sleds.
Each sled has 16 cores (E5-2660) & 32 GB RAM.
One (1) head node, a Dell R720 server with 128 GB memory.
1 Gbps switching hardware with 10 Gbps uplinks.
Base for expansion of research computing infrastructure at ODU.
Available to researchers for computation since Fall 2013.
HPC Turing Cluster Expansion
Integration of 8 compute nodes purchased by Dr. Gangfeng Ma (Civil & Environmental Engineering Department).
Thirty-six (36) compute sleds added by ITS:
720 compute cores (E5-2660v2, Ivy Bridge)
128 GB per node, total 24 TB.
2 x 500 GB local disks per compute node.
Additional head node for redundancy (Dell R720 server with 128 GB memory).
Separate login node so that interactive use does not impact the compute environment.
Turing Cluster Summer 2014 Upgrade
Seventy-six (76) CRAY CS-GB-512X compute nodes
1520 compute cores (E5-2670v2, Ivy Bridge)
128 GB per node.
2 x 250 GB local disks per compute node.
Four (4) High Memory Nodes
32 compute cores (E5-4610v2, Ivy Bridge) per node
768 GB per node.
Ten (10) Xeon Phi nodes
Each node has 2 Xeon Phi (60-core) co-processors.
Each node has 20 cores (E5-2670v2, Ivy Bridge) & 128 GB memory.
InfiniBand (FDR-based) compute backbone upgraded:
324 level 1 FDR interfaces.
144 level 2 backbone interfaces.
Turing Cluster
Item | Description | Sockets | Cores/Socket | GPU/MIC co-processors per node | Cores per GPU/MIC | Mem (GB) | Storage (TB) | Quantity | Total compute cores | Total GPU/MIC cores | Total mem (GB) | Total storage (TB)
Dell C8000 chassis Sandy-Bridge nodes | PowerEdge C8220, 2 x E5-2660 (2.2 GHz), Mellanox single-port FDR | 2 | 8 | 0 | 0 | 128 | 1 | 8 | 128 | 0 | 1024 | 8
Dell C8000 chassis Ivy-Bridge nodes | PowerEdge C8220, 2 x E5-2660v2 (2.2 GHz), Mellanox single-port FDR | 2 | 10 | 0 | 0 | 128 | 1 | 36 | 720 | 0 | 4608 | 36
Dell C6000 chassis Sandy-Bridge nodes | PowerEdge C6220, 2 x E5-2660 (2.2 GHz), Mellanox single-port FDR | 2 | 8 | 0 | 0 | 128 | 1 | 8 | 128 | 0 | 1024 | 8
Dell C6000 chassis Sandy-Bridge nodes | PowerEdge C6220, 2 x E5-2660 (2.2 GHz), Mellanox single-port FDR | 2 | 8 | 0 | 0 | 128 | 1 | 12 | 192 | 0 | 1536 | 12
CRAY SR5110 chassis Ivy-Bridge nodes | CRAY CS-GB-512X, 2 x E5-2670v2 (2.5 GHz), Mellanox single-port FDR | 2 | 10 | 0 | 0 | 128 | 0.5 | 76 | 1520 | 0 | 9728 | 38
CRAY SR5110 chassis Ivy-Bridge Xeon Phi nodes | CRAY CS-GB-512X, 2 x E5-2670v2 (2.5 GHz), 2 x Xeon Phi 5110P, Mellanox single-port FDR | 2 | 10 | 2 | 60 | 128 | 1 | 10 | 200 | 1200 | 1280 | 10
High Memory Nodes | Intel R2304LH2HKC, 4 x E5-4610v2, Mellanox single-port FDR | 4 | 8 | 0 | 0 | 768 | 4 | 4 | 128 | 0 | 3072 | 16
CRAY Appro GPU Nodes | APPRO 1426G4, 2 x Intel Xeon X5650 (2.67 GHz), 4 x NVIDIA M2090 GPUs, Mellanox single-port QDR | 2 | 6 | 4 | 512 | 48 | 1 | 17 | 204 | 34816 | 816 | 17
HPC Storage
Legacy clusters had NFS-based scratch space mounted on all compute nodes, plus research mass storage accessible from the head node (approximately 60 TB of disk plus tape storage).
Complete redesign of the computational storage infrastructure.
EMC Isilon-based scale-out NAS storage.
Integration of an additional 430 TB for computational research.
Integration of 36 TB of Lustre-based scratch space.
Turing Cluster Queuing Strategy
The current job scheduler is SGE.
Four separate job queues:
Traditional computational resources
High memory nodes
APPRO GPU nodes
Nodes with dual Intel Xeon Phi
Fair Queuing Strategy
Shared computational resources
Compensates users over time
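To illustrate the idea behind "compensates users over time", the sketch below shows a half-life decayed fair-share priority calculation. This is a simplified illustration of the general fair-share technique, not SGE's actual implementation; the one-week half-life, the priority formula, and the usage numbers are all assumptions made for the example.

/* fairshare.c - simplified sketch of half-life decayed fair-share priority.
 * NOT SGE's implementation; it only illustrates the general idea: usage
 * recorded t seconds ago is weighted by 0.5^(t / half_life), so heavy recent
 * users get lower priority while old usage gradually stops counting.
 * Build: cc fairshare.c -o fairshare -lm
 */
#include <math.h>
#include <stdio.h>

/* Weight that usage recorded 'age_seconds' ago still carries today. */
static double decay_weight(double age_seconds, double half_life_seconds)
{
    return pow(0.5, age_seconds / half_life_seconds);
}

/* Higher decayed usage -> lower priority; 'share' is the user's target fraction. */
static double fair_share_priority(double decayed_usage, double share)
{
    return share / (1.0 + decayed_usage);
}

int main(void)
{
    const double half_life = 7.0 * 24 * 3600;   /* assumed one-week half-life */

    /* Two hypothetical users with equal target shares but different history:
     * user A burned 10,000 core-hours yesterday, user B 10,000 a month ago. */
    double usage_a = 10000.0 * decay_weight( 1.0 * 24 * 3600, half_life);
    double usage_b = 10000.0 * decay_weight(30.0 * 24 * 3600, half_life);

    printf("priority A (recent heavy use): %.6f\n", fair_share_priority(usage_a, 0.5));
    printf("priority B (old heavy use)   : %.6f\n", fair_share_priority(usage_b, 0.5));
    /* B's old usage has mostly decayed away, so B now gets higher priority. */
    return 0;
}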
HPC Software Packages
Some software packages on the HPC Turing cluster:
COMSOL (finite element analysis & simulation software for physics and engineering)
MATLAB (numerical computing software for plotting functions and implementing algorithms)
CHARMM (molecular dynamics simulation & analysis software)
METIS (software package for graph partitioning)
GAUSSIAN (software package for computational chemistry and electronic structure calculations)
GROMACS (molecular dynamics simulation & analysis software for proteins, lipids and nucleic acids)
R (programming language & software environment for statistical computing and graphics)
CLC Bio (software package for analysis of biological data)
MOLPRO (software package for quantum chemistry calculations)
DDSCAT (software package for calculating scattering and absorption of light by irregular particles)
HPC@ODU Support Structure
Early 2013: HPC support staff challenges.
Two engineers moved on to other opportunities.
One systems administrator position realigned for HPC support.
Started a recruitment process to add two positions focused on HPC.
Addition of two (2) dedicated HPC systems engineers:
Jeaime Powell
Terry Stilwell
A computational scientist position will be recruited shortly.
Education & Outreach
Effective means of communicating with researchers and students on campus regarding available resources and services:
Quarterly HPC newsletter
HPC Day.
New faculty outreach.
HPC Advisory Committee.