TRANSCRIPT
The K computer: System Overview
Atsuya Uno
System Development Team, Development Group, Next-Generation Supercomputer R&D Center, RIKEN (The Institute of Physical and Chemical Research)
SC'11
K computer, “京”, is …
• “京 [kei]” is the nickname of the next-generation supercomputer system
– The name was chosen from public applications in July 2010.
– “京” stands for the unit 10^16, i.e. 10 peta, which is also the performance target of our project.
• No. 1 system on the latest TOP500 list
– 10.51 PFLOPS in the LINPACK benchmark (peak: 11.28 PFLOPS)
– Efficiency: 93.2%
– Execution time: 29 hours 28 minutes
– Power consumption: 12.66 MW
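The 93.2% efficiency quoted above is simply Rmax divided by Rpeak; a one-line check:

```python
# TOP500 efficiency = Rmax / Rpeak, using the figures quoted above.
rmax_pflops = 10.51    # LINPACK result
rpeak_pflops = 11.28   # theoretical peak
efficiency = rmax_pflops / rpeak_pflops
print(f"{efficiency:.1%}")   # -> 93.2%
```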
Design targets (in 2006) and Schedule
• 10 peta-FLOPS in the LINPACK benchmark
• Peta-FLOPS sustained performance in real applications
• Low-power-consumption system
• Highly reliable system
(Timeline, FY2006–FY2012: conceptual design, then detailed design, prototype and evaluation, production, installation, and adjustment, and finally tuning and improvement (ongoing at present). In parallel, the research and computer buildings each went through design and construction.)
System Overview of the K computer
System Configuration
(System configuration diagram: users connect over the Internet to front-end servers and pre/post-processing servers. The K computer's compute nodes, linked by the Tofu interconnect (6D mesh/torus network), reach the local file system (11 PB~) through IO nodes, and the global file system (30 PB~) over the global IO network. System management servers and system control servers (system configuration, job/user management) sit on the control and management network.)
Number of CPUs: > 80K
Total memory capacity: > 1 PB
Compute Nodes
• CPU: SPARC64™ VIIIfx (8 cores)
– 128 GFLOPS
– 2.0 GHz
• Memory: 16 GB
• Network: Tofu interconnect (6D mesh/torus network)
– provides a logical 3D torus network for each job
– 5 GB/s x 2 bandwidth per link
(Image of a 3D torus with x, y, and z axes. © Fujitsu Limited.)
(Compute node diagram © Fujitsu Limited: one SPARC64™ VIIIfx CPU of 128 GFLOPS with 8 cores, each core a SIMD (4 FMA) unit delivering 16 GFLOPS; the cores share a 6 MB L2$ connected to 16 GB of memory at 64 GB/s, and the node's Tofu links each run at 5 GB/s x 2 (peak).)
CPU: SPARC64™ VIIIfx
• Superscalar multi-core processor (8 cores)
– 128 GFLOPS (16 GFLOPS/core) at 2 GHz
– 256 FP registers (double precision) / core
– 2 SIMD (FMA x 4) / core
• Shared 6 MB L2$ (12-way)
– Software-controllable cache (sector cache)
• Memory throughput
– L2$ - main memory: 0.5 B/FLOP (64 GB/s)
• Hardware barrier
– High-speed synchronization of on-chip cores
• Power consumption
– 58 W at 30℃ with water cooling
More detailed information is available in “SPARC64™ VIIIfx Extensions” (http://img.jp.fujitsu.com/downloads/jp/jhpc/sparc64viiifx-extensions.pdf).
Specification
• Performance (peak): 128 GFLOPS (16 GFLOPS x 8 cores)
• Cores: 8
• Clock: 2.0 GHz
• Floating-point execution units (per core): FMA x 4 (2 SIMD), DIVIDE x 2, COMPARE x 2
• Registers (per core): floating-point (64-bit) x 256, general-purpose (64-bit) x 188
• Cache: L1I$ 32 KB (2-way), L1D$ 32 KB (2-way), shared L2$ 6 MB (12-way)
• Memory throughput: 64 GB/s (0.5 B/F)
• Process: 45 nm CMOS, chip size 22.7 mm x 22.6 mm, 760M transistors, power 58 W (TYP, 30℃), water cooling
© Fujitsu Limited.
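The peak figures in the table follow from the clock and the execution-unit counts; the sketch below redoes the arithmetic, assuming the usual convention that one fused multiply-add counts as two floating-point operations:

```python
# Peak-performance arithmetic for SPARC64 VIIIfx, from the spec table above.
clock_ghz = 2.0
fma_per_cycle = 4        # "FMA x 4 (2 SIMD)": 2 SIMD units, each 2-wide
flops_per_fma = 2        # one fused multiply-add = 2 floating-point ops
cores = 8

gflops_per_core = clock_ghz * fma_per_cycle * flops_per_fma   # 16.0
chip_gflops = gflops_per_core * cores                         # 128.0
bytes_per_flop = 64 / chip_gflops                             # 0.5 B/F
print(gflops_per_core, chip_gflops, bytes_per_flop)
```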
Tofu (Torus fusion) Interconnect
• High communication performance and a fault-tolerant network
• Network topology: 6D mesh/torus network
– Each node has 10 links (5 GB/s x 2 bandwidth)
• 4 links form a basic unit (2x3x2 mesh/torus) and 6 links serve as the XYZ links (XZ: torus / Y: mesh)
– Multi-path routing over a combination of the XYZ links and the 2x3x2 mesh/torus makes it possible to detour around faulty nodes
• From the user's point of view, the network topology of a job is a 1D, 2D, or 3D torus network.
– This torus network is dynamically configured when the nodes are assigned to the job.
(Diagram: the basic unit is a 2x3x2 mesh/torus built from 4 of each node's links; the remaining X+, X-, Y+, Y-, Z+, and Z- links connect neighboring basic units with 12 links each, and the basic units form a 3D torus.)
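As an illustration only (not Fujitsu's routing code), the logical 3D torus a job sees can be modeled with wrap-around coordinates; each node then has six neighbors:

```python
# Illustrative sketch: neighbors of a node in the logical 3D torus a job
# sees on the Tofu network, using periodic (wrap-around) coordinates.
def torus_neighbors(coord, shape):
    """coord = (x, y, z) of a node; shape = (X, Y, Z) of the job's torus."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % shape[axis]  # wrap around the torus
            neighbors.append(tuple(n))
    return neighbors

print(torus_neighbors((0, 0, 0), (4, 3, 2)))
```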
System environments
• OS: Linux-based OS on compute nodes
• File system: FEFS (Fujitsu Exabyte File System), based on the Lustre file system
– Two-level large-scale distributed file system
• Local file system for temporary files used by jobs
• Global file system for users' permanent files
– Staging functions
• Stage-in: input files on the global file system that a job uses are copied to the local file system.
• Stage-out: output files generated during job execution are moved back to the global file system after the job finishes.
• Batch-job-oriented system
– Interactive environments are available for debugging.
(Diagram: file staging between the compute nodes' local file system and the global file system; input files are staged in and output files are staged out through the IO nodes.)
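A minimal sketch of the stage-in / stage-out flow described above; the function name and directory layout are hypothetical, and on the real system the batch scheduler, not user code, drives staging:

```python
# Conceptual sketch of stage-in / stage-out; run_job_with_staging and the
# directory layout are hypothetical, for illustration only.
import pathlib
import shutil

def run_job_with_staging(input_names, global_fs, local_fs, job):
    """Copy inputs global->local, run the job locally, move outputs back."""
    g, l = pathlib.Path(global_fs), pathlib.Path(local_fs)
    l.mkdir(parents=True, exist_ok=True)
    for name in input_names:            # stage-in: copy, originals remain
        shutil.copy(g / name, l / name)
    output_names = job(l)               # the job touches only the local FS
    for name in output_names:           # stage-out: move results back
        shutil.move(str(l / name), str(g / name))
```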
Programming environments
• Programming languages and compilers
– Fortran 95 and a subset of Fortran 2003, XPFortran
– C99 and C++2003, including several extensions to GNU C and C++
– Compilers optimized to exploit the full capabilities of SPARC64™ VIIIfx
• SIMD, 256 FP registers, programmable L2 cache (sector cache), and so on
– OpenMP 3.0 is supported
• MPI library based on the MPI-2.1 specification
– Low latency and high throughput
– Collective communications are optimized for the Tofu interconnect
• Bcast, Allgather, Alltoall, and Allreduce
• Numerical libraries
– BLAS, LAPACK, FFTW, the Fujitsu scientific numerical library SSL II, and so on will be provided.
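The slides do not say how the optimized collectives are implemented; as background only, the recursive-doubling pattern commonly used for Allreduce can be simulated with plain lists standing in for per-rank values:

```python
# Illustrative sketch (not the K computer's MPI implementation): the
# recursive-doubling pattern often used for Allreduce, simulated with
# a plain list standing in for the per-rank values.
def allreduce_sum(values):
    """Simulate recursive-doubling Allreduce over len(values) ranks."""
    n = len(values)                      # assume n is a power of two
    vals = list(values)
    step = 1
    while step < n:
        # Each round, rank r exchanges with rank r XOR step and both sides
        # accumulate the partner's value, so log2(n) rounds suffice.
        vals = [vals[r] + vals[r ^ step] for r in range(n)]
        step *= 2
    return vals                          # every rank holds the global sum

print(allreduce_sum([1, 2, 3, 4]))       # -> [10, 10, 10, 10]
```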
Hardware Implementation
• A compute rack houses:
– System boards x 24 (top: 12, bottom: 12)
– IO system boards x 6
– Power supply units x 9
– Service processor boards x 2
– System disk (RAID 5)
• Peak performance: 12.3 TFLOPS / rack
• Total memory: 1.5 TB / rack
• Weight: 1,500 kg (max.)
(Photos © Fujitsu Limited: a rack (2,060 mm x 796 mm) with 12 system boards each at top and bottom, IO system boards, system disk, service processor boards, power supply units, and water-cooling pipes; a system board carrying 4 SPARC64™ VIIIfx CPUs, 4 ICCs (Tofu NW LSIs), DDR3 memory (16 GB x 4), and a water-cooling module.)
Hardware view
• Node (CPU x 1, ICC x 1, memory): 128 GFLOPS, 16 GB
• System board (SB) (node x 4): 512 GFLOPS, 64 GB
• Compute rack (SB x 24 + IO SB x 6): 12.3 TFLOPS, 1.5 TB
• Rack group (compute rack x 8 + disk rack x 2): 98.4 TFLOPS, 12 TB
• Full system (compute rack x > 800): > 10 PFLOPS, > 1 PB
(Photos: a compute rack and a disk rack.)
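A sanity check of the hierarchy figures; note that one CPU per IO system board is my assumption, not stated on the slides, used to reconcile the 864 racks with the 11.28 PFLOPS peak quoted earlier:

```python
# Hardware-hierarchy arithmetic; one CPU per IO system board is an
# assumption used to reconcile the rack count with the quoted peak.
node_gflops, node_mem_gb = 128, 16
nodes_per_sb = 4
sb_per_rack, iosb_per_rack = 24, 6

rack_tflops = nodes_per_sb * sb_per_rack * node_gflops / 1000   # 12.288 ~ "12.3"
rack_mem_tb = nodes_per_sb * sb_per_rack * node_mem_gb / 1000   # 1.536 ~ "1.5"

racks = 864                                                     # racks in the room
compute_pflops = racks * rack_tflops / 1000                     # ~10.6, i.e. > 10
total_cpus = racks * (nodes_per_sb * sb_per_rack + iosb_per_rack)
peak_pflops = total_cpus * node_gflops / 1e6                    # ~11.28
print(rack_tflops, rack_mem_tb, compute_pflops, peak_pflops)
```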
Image of the K computer
864 racks are housed in the computer room.
Facilities for the K computer
Location of the K computer in Kobe
(Map: the K computer is in Kobe, about 450 km (280 miles) west of Tokyo, near Kyoto.)
AICS (Advanced Institute for Computational Science) was established at RIKEN in July, 2010.
Layout of buildings
(Site layout: computer building, research building, substation, and chillers.)
Buildings and Cooling System
Research Building
• Six stories above ground and one below
• Floor area: ~1,800 m², ~9,000 m² (total)
Computer Building
• Three stories above ground and one below
• Floor area: ~4,300 m², ~10,500 m² (total)
Chillers (area: 1,900 m²)
• Absorption refrigerating chillers x 4
• Centrifugal water chillers x 3
• CGS (5 MW) x 2
Substation (area: 200 m²)
• 30 MW (max.), 77,000 V (receiving) → 6,600 V
Cooling System
• The K computer is cooled by air and water
– Water: CPU and ICC (input: ~15℃, output: ~17℃)
– Air: all other parts (room temperature: ~20℃)
(Diagram: chilled water from the absorption refrigerating chillers and centrifugal water chillers passes through a heat exchanger to the system boards in the compute racks, while air handlers cool the room air.)
Installation of the K computer
Period of installation: Sep 2010 to Aug 2011
• The first 8 racks were installed on 30 Sep 2010; the last racks were installed at the end of Aug 2011.
• No pillars in the room
– flexible arrangement of computer racks
– easy installation of network cables
(The computer room, 60 m x 50 m, is on the third floor of the computer building.)
Installation of the K computer (movie)
A photo in the early evening
Thank you for your attention !
AICS Booth #112