
Cray® XC50™ Supercomputer
NVIDIA® Tesla® P100 GPU Accelerator Compute Blade

The Cray XC supercomputer series architecture has been specifically designed from the ground up to be adaptive. A holistic approach optimizes the entire system to deliver sustained real-world performance and extreme scalability across the collective integration of all hardware, networking and software.

One key differentiator with this adaptive supercomputing platform is the flexible method of implementing best-in-class processing elements via processor daughter cards (PDCs) on XC series compute blades.

Adaptive Hybrid Computing and Scalable Many-Core Performance
Supercomputer users procure their machines to satisfy specific and demanding requirements. But they need their systems to grow and evolve to maximize machine lifetime and return on investment. To solve these challenges today and into the future, the Cray® XC™ supercomputer series network and compute technology has been designed to easily accommodate new processor introductions, upgrades and enhancements. Users can augment their systems in place to upgrade to higher performance processors, or add coprocessor and accelerator components to build even higher performance Cray XC50 supercomputer configurations.

Cray XC50 GPU Compute Blade
The Cray XC series architecture implements two processor engines per compute node and four compute nodes per blade. GPU compute blades stack 16 to a chassis, and each cabinet can be populated with up to three chassis, yielding up to 384 sockets per cabinet (2 sockets × 4 nodes × 16 blades × 3 chassis). Cray XC50 supercomputers can be configured with up to hundreds of cabinets and upgraded to exceed 500 petaflops per system.

Processor Daughter Cards (PDCs)
The Cray XC50 system mates processor engine technology to the main compute blades via two configurable daughter cards. The flexible PCI Express 3.0 standard accommodates scalar processors, coprocessors and accelerators to create hybrid systems that can evolve over time. For example, PDCs can be swapped out or reconfigured while keeping the original compute base blades in place, letting sites adopt the best available processor technologies quickly.

NVIDIA® Tesla® P100 GPU Accelerators
Enhanced Performance: More Cores, Flexible Compute Modes and High-Bandwidth Memory
NVIDIA’s Tesla P100 GPU accelerators deliver in excess of 3,500 embedded cores and flexible mixed-precision computing options. The P100 offers customers a choice of double-precision, single-precision or half-precision compute operation, empowering users to trade off precision and performance for their own specific application requirements. Additionally, the P100 GPU accelerator integrates high-bandwidth memory into the package, enabling up to 3x memory bandwidth improvements over prior-generation external-memory GPU solutions.
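As an illustration of the precision choice described above (a sketch, not Cray or NVIDIA sample code), the following minimal CUDA kernel converts single-precision inputs to FP16 on the device, adds them with the half-precision intrinsic and converts the result back. The kernel and variable names are hypothetical, and it assumes a P100-class target (e.g., nvcc -arch=sm_60):

    // Minimal sketch: vector add computed with FP16 arithmetic on the device.
    #include <cstdio>
    #include <cuda_fp16.h>

    __global__ void vec_add_half(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            __half ha = __float2half(a[i]);        // demote to half precision
            __half hb = __float2half(b[i]);
            c[i] = __half2float(__hadd(ha, hb));   // half-precision add, promote result
        }
    }

    int main() {
        const int n = 1 << 20;
        float *a, *b, *c;
        // Unified memory keeps the sketch short; explicit copies work equally well.
        cudaMallocManaged(&a, n * sizeof(float));
        cudaMallocManaged(&b, n * sizeof(float));
        cudaMallocManaged(&c, n * sizeof(float));
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        vec_add_half<<<(n + 255) / 256, 256>>>(a, b, c, n);
        cudaDeviceSynchronize();
        printf("c[0] = %f\n", c[0]);  // expect 3.0

        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }

The same kernel could be written with float or double arithmetic instead; selecting the type is exactly the precision-versus-performance trade-off noted above.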

The Cray XC series Tesla P100 system delivers superior application performance, memory bandwidth and performance per watt. Cray also supports multiple programming models for the P100 GPU accelerator, including the Cray Compiler Environment, OpenACC directives-based coding and CUDA.
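As a small example of the CUDA programming model mentioned above (a host-only sketch, not taken from this datasheet), the code below queries each visible accelerator and derives an approximate peak memory bandwidth from its reported memory clock and bus width; the variable names are hypothetical:

    // Minimal CUDA host sketch: enumerate GPUs and estimate peak memory bandwidth.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int dev = 0; dev < count; ++dev) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            // Peak bandwidth ~ 2 (double data rate) x memory clock x bus width in bytes.
            double clock_ghz = prop.memoryClockRate * 1e-6;   // reported in kHz
            double bus_bytes = prop.memoryBusWidth / 8.0;     // reported in bits
            double peak_gbs  = 2.0 * clock_ghz * bus_bytes;   // GB/s, approximate
            printf("Device %d: %s, CC %d.%d, %.1f GB memory, ~%.0f GB/s peak bandwidth\n",
                   dev, prop.name, prop.major, prop.minor,
                   prop.totalGlobalMem / 1e9, peak_gbs);
        }
        return 0;
    }

On a Tesla P100 this estimate lands close to the roughly 720 GB/s HBM2 figure quoted in the specifications below.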

Learn more about NVIDIA Tesla GPU products at www.nvidia.com/tesla.

Cray Inc. • 901 Fifth Avenue, Suite 1000 • Seattle, WA 98164 • Tel: 206.701.2000 • Fax: 206.701.2500 • www.cray.com

© 2017 Cray Inc. All rights reserved. Specifications are subject to change without notice. Cray and the Cray logo are registered trademarks, and Cray XC and ClusterStor are trademarks of Cray Inc. All other trademarks mentioned herein are the properties of their respective owners. 20171215ES


Cray® XC™ Series Specifications for NVIDIA® Tesla® P100 Compute Blade

Processor
192 nodes with one Xeon® (Haswell/Broadwell) host and one P100 PCIe card per node

Memory
32-128 GB DDR4 on host and up to 16 GB of HBM2 on accelerator

Memory bandwidth: DDR up to 76.8 GB/s per node, HBM2 up to 720 GB/s per node

Compute Cabinet
Up to 192 compute nodes per cabinet

Peak performance: up to 902 TF GPU and 141 TF host processor, total up to 1.04 PF/cabinet

Interconnect

1 Aries™ routing and communications ASIC per 4 compute nodes

48 switch ports per Aries chip (500 GB/s switching capacity per chip)

Dragonfly interconnect: low latency, high-bandwidth topology

System Administration

Cray System Management Workstation (SMW)

Single-system view for system administration

System software rollback capability

Reliability Features (Hardware)

Integrated Cray Hardware Supervisory System (HSS)

Independent, out-of-band management network

Full ECC protection of all packet traffic in the Aries network

Redundant power supplies; redundant voltage regulator modules

Redundant paths to all system RAID

Hot swap blowers, power supplies and compute blades

Integrated pressure and temperature sensors

Reliability Features (Software)

HSS system monitors operation of all operating system kernels

Lustre® file system object storage target failover; Lustre metadata server failover

Software failover for critical system services including system database, system logger and batch subsystems

NodeKARE (Node Knowledge and Reconfiguration)

Operating System
Cray Linux® Environment (includes SUSE Linux SLES11, HSS and SMW software)

Extreme Scalability Mode (ESM) and Cluster Compatibility Mode (CCM)

Compilers, Libraries & Tools

Cray Compiler Environment, PGI compiler, GNU compiler

Support for the ISO Fortran 2008 standard (including coarray parallel programming), C/C++ and UPC

MPI 3.0, Cray SHMEM, other standard MPI libraries using CCM; Cray Apprentice2 and CrayPAT™ performance tools; Intel Parallel Studio Development Suite (option)
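For orientation only, here is a minimal sketch of the MPI-plus-CUDA style of program the environment above supports. It assumes an MPI-aware compiler wrapper (for example the Cray compiler wrappers, or mpicxx combined with nvcc), and every name in it is hypothetical rather than taken from this datasheet:

    // Minimal MPI + CUDA sketch: each rank selects a GPU and reports it.
    // Launch with the site's usual job launcher (e.g., aprun or srun).
    #include <cstdio>
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int ngpus = 0;
        cudaGetDeviceCount(&ngpus);
        if (ngpus > 0) {
            int dev = rank % ngpus;   // one P100 per node in this configuration
            cudaSetDevice(dev);
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            printf("rank %d of %d using GPU %d (%s)\n", rank, size, dev, prop.name);
        }
        MPI_Finalize();
        return 0;
    }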

Job Management

PBS Professional job management system

Moab Adaptive Computing Suite job management system

SLURM – Simple Linux Unified Resource Manager

External I/O Interface
InfiniBand, 40 and 10 gigabit Ethernet, Fibre Channel (FC) and Ethernet

Disk Storage
Full line of FC, SAS and IB-based disk arrays with support for FC and SATA disk drives; ClusterStor™ data storage

Parallel File System
Lustre; Data Virtualization Service (DVS) allows support for NFS, external Lustre and other file systems

Power

90 kW per compute cabinet, maximum configuration

Support for 480 VAC and 400 VAC computer rooms

6 kW per blower cabinet, 20 AMP at 480 VAC or 16 AMP at 400 VAC (three-phase, ground)

Cooling
Water cooled with forced transverse air flow: 6,900 cfm intake

Dimensions (Cabinets)
H 80.25” x W 35.56” x D 76.50” (compute cabinet)

H 80.25” x W 18.06” x D 59.00” (blower cabinet)

Weight (Operational)
4,500 lbs. maximum per compute cabinet (liquid cooled); 254 lbs./square foot floor loading
900 lbs. maximum per blower cabinet

Regulatory Compliance

EMC: FCC Part 15 Subpart B, CE Mark, CISPR 22 & 24, ICES-003, C-tick, VCCI

Safety: IEC 60950-1, TUV SUD America CB Report

Acoustic: ISO 7779, ISO 9296