K-computer and Supercomputing Projects in Japan Makoto Taiji Computational Biology Research Core RIKEN Planning Office for the Center for Computational and Quantitative Life Science & Processor Research Team RIKEN Advanced Institute for Computational Science [email protected]


Post on 15-Dec-2015


TRANSCRIPT

  • Slide 1
  • K-computer and Supercomputing Projects in Japan Makoto Taiji Computational Biology Research Core RIKEN Planning Office for the Center for Computational and Quantitative Life Science & Processor Research Team RIKEN Advanced Institute for Computational Science [email protected]
  • Slide 2
  • Agenda
      • K-computer
      • Advanced Institute for Computational Science
      • High Performance Computing Infrastructure
      • My own perspective on future HPC, and MDGRAPE-4 (in brief)
  • Slide 3
  • My Background
      • Physics
      • Special-purpose computers for scientific simulations (1986~)
          • Monte Carlo simulations of spin systems (1986, m-TIS I)
          • FPGA-based reconfigurable machine (1990, m-TIS II)
          • Gravitational N-body problems (1992~96, GRAPE-4,5)
          • Molecular dynamics simulations (1994~, MD-GRAPE, MDM, MDGRAPE-3,4)
          • Dense matrix calculation, quasi-general-purpose machine (MACE, 2000)
      • Ultrafast laser spectroscopy (1987~92)
          • Conjugated polymers
          • Rhodopsin and bacteriorhodopsin
      • Learning process as dynamical systems, multi-agent dynamics (1996~2002)
      • Physical random number generator (1997~2004)
  • Slide 4
  • World situation of HPC (Top 500): Japan's country share is down to 6th position
  • Slide 5
  • Next-Generation Supercomputer Project
      • National project to develop a leading general-purpose supercomputer in Japan
      • Not for a single purpose (cf. Earth Simulator)
      • Location: Kobe Port Island
      • Developer: Fujitsu
      • Linpack: 10 PetaFLOPS
      • Partial operation: Spring 2011
      • Full service: Autumn 2012
      • [Image: K computer system (CG)]
  • Slide 6
  • Location of K computer: K-computer & Advanced Institute for Computational Sciences, on Port Island, Kobe, about 5 km from Sannomiya (12 min. by Portliner). [Aerial photo, June 2006. Map labels: Mt. Rokko; Sannomiya; Port Island; Kobe Sky Bridge; Portliner; Shinkansen line, Shin-Kobe Station; Ashiya; Kobe Airport; to Akashi / Awaji Island; to Osaka; Kobe Medical Industry Development Project core facilities]
  • Slide 7
  • RIKEN Advanced Institute for Computational Science: national center to cover wide fields of computational science and engineering
  • Slide 8
  • Formation of Central Hub in Kobe [diagram; recoverable labels:]
      • Advanced Institute for Computational Science (Director: Dr. Kimihiko Hirao): operation and sophistication of the supercomputer; interdisciplinary research in computational sciences and computer science
      • Registered organization: selection of applications, user support
      • Public use: academia, industry
      • Strategic use: strategic regions
  • Slide 9
  • RIKEN Advanced Institute for Computational Science [organization chart]
      • Director; Deputy Director
      • Operation Technology Division
      • Research Promotion Division
      • Research Division
          • Computational Science Research
              • Field Theory Research Team (TL: Yoshinobu Kuramashi)
              • Computational Biophysics Research Team (TL: Yuji Sugita)
              • Computational Materials Science Research Team (TL: Seiji Yunoki)
              • Computational Molecular Science Research Team (TL: Takahito Nakajima)
          • Computer Science Research
              • System Software Research Team (TL: Yutaka Ishikawa)
              • Processor Research Team (TL: Makoto Taiji)
  • Slide 10
  • Grand Challenge Applications
      • Next-Generation Integrated Nano-Science Simulation Software (2006-2011). Base site: Institute for Molecular Science
          • Goal: To create next-generation nano-materials (new semiconductor materials, etc.) by integrating theories (such as quantum chemistry, statistical dynamics and solid electron theory) and simulation techniques in the fields of new-generation information functions/materials, nano-biomaterials, and energy
          • [Figure labels: Next-Generation Energy (solar energy fixation, fuel alcohol, fuel cells, electric energy storage); Next-Generation Nano Biomolecules (polio virus, liposome, self-assembly, capsulation, protein folding, water molecules inside lysozyme cavity, viruses, anticancer drugs, protein control, nano processes for DDC; medicines, new drugs, and DDS); Next-Generation Information Function Materials (one-dimensional crystal of silicon, orbiton (orbital waves), ferromagnetic half-metals, optical switch (light off/on), doping of fullerene and carbon nanotubes, self-organized magnetic nanodots, electronic conduction in integrated systems; nonlinear optical devices, nano quantum devices, spin electronics, ultra-high-density storage devices, integrated electronic devices); mesoscale structure of Nafion membrane (Nafion, water); methods and domains (electron theory of solids, quantum chemistry, molecular dynamics; electrons, molecules, condensed matter, molecular assembly, semi-macroscopic, integrated system); length markers 5 nm, 15 nm, 27 nm, 46 nm]
      • Next-Generation Integrated Life-Science Simulation Software (2006-2012). Base site: RIKEN Wako Institute
          • Goal: To provide new tools for breakthroughs against various problems in life science by means of petaflops-class simulation technology, leading to comprehensive understanding of biological phenomena and the development of new drugs/medical devices and diagnostic/therapeutic methods
          • [Figure labels: microscopic approach (molecular scale, cellular scale): MD/first-principles/quantum chemistry simulations, protein structural analysis, molecular network analysis, drug response analysis, achievement of chemical reactions; macroscopic approach (organ and body scale): continuous entity simulations, vascular system modeling, skeleton model, fluids, heat, structures; toward therapeutic technology: High Intensity Focused Ultrasound, drug development, tailor-made medicine, Drug Delivery System, regenerative medicine, surgical procedures, catheters, micromachines, hyperthermia; size axis (micro, meso, macro) with levels whole body, cardiovascular system, organs, tissues, cells, proteins/DNA and tick values 10^0, 10^-1, 10^-3~-2, 10^-5~-4, 10^-8~-6; brain function]
  • Slide 11
  • Appointment of Strategic Regions
      • Computational resources and budget will be allocated for the following regions; a strategic organization will organize the research in each
      • Region 1. Foundations for predictive life sciences, medical care, and drug design
      • Region 2. Innovation of new materials and new energies
      • Region 3. Prediction of global change for disaster prevention and reduction
      • Region 4. Next-generation manufacturing
      • Region 5. Origin and structure of matter and the universe
      • 2009-2010: Feasibility studies
      • 2011-2015: Strategic research
  • Slide 12
  • Schedule of Project (FY2006 to FY2012). Partial operation within FY2010; full operation starts from FY2012. [Gantt chart; rows and phase labels recoverable from the slide:]
      • System (processing unit; front-end unit (total system software); shared file system): conceptual design; basic design; detailed design; prototype and evaluation; production and evaluation; production, installation, and adjustment; tuning and improvement
      • Applications (Next-Generation Integrated Nanoscience Simulation; Next-Generation Integrated Life Simulation): development, production, and evaluation; verification
      • Buildings (computer building; research building): design; construction
      • Research promotion: preparatory research; feasibility studies; strategic research
  • Slide 13
  • Features of K computer
      • "K" (kei) means 10^16
      • High performance: Linpack 10 PFLOPS
      • Massive parallelization: > 80,000 processors, > 640,000 cores
      • SPARC64 VIIIfx: processor designed for HPC, with VISIMPACT / HPC-ACE extensions
      • 16 GB / node, 2 GB / core
      • ~20 MW
  • Slide 14
  • K-Computer System
      • Number of nodes: > 80,000
      • Number of processors: > 80,000
      • Number of cores: > 640,000
      • Peak performance: > 10 PFLOPS
      • Memory capacity: > 1 PB (16 GB/node)
      • Network: Tofu interconnect (6-dim. torus); user view: 3D torus
      • Bandwidth: 5 GB/s bidirectional in each of six directions; 4 simultaneous communications
      • Bisection bandwidth: > 30 TB/s (bidirectional, nominal peak)
      • [Node diagram: CPU 128 GFLOPS (8 cores; each core SIMD (4 FMA), 16 GFLOPS); L2$: 5 MB; 64 GB/s; MEM: 16 GB; 3D-torus network (x, y, z) with six bidirectional 5 GB/s links]
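The "user view: 3D torus" on Slide 14 can be illustrated with a small sketch of how a node's six nearest neighbours wrap around the edges of the grid. This is only an illustration of torus addressing, not Fujitsu's Tofu routing: the function name `torus_neighbors` and the 4 x 4 x 4 grid size are hypothetical, and the last line is plain arithmetic on the slide's "5 GB/s" and "4 simultaneous communications" figures rather than a measured value.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbours of `coord` on a 3D torus of shape `dims`.

    Each axis contributes a -1 and a +1 neighbour; the modulo makes the
    grid wrap around, which is what distinguishes a torus from a mesh.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


# Hypothetical small grid, chosen only to make the wrap-around visible.
dims = (4, 4, 4)
nbrs = torus_neighbors((0, 0, 3), dims)

# From the slide: 5 GB/s per direction, 4 simultaneous communications.
LINK_GB_S = 5
SIMULTANEOUS_LINKS = 4
nominal_concurrent_gb_s = LINK_GB_S * SIMULTANEOUS_LINKS  # simple product, nominal only
```

For the node at (0, 0, 3) the x and y neighbours in the negative direction wrap to index 3, and the positive z neighbour wraps to index 0.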
  • Slide 15
  • Cabinet of K computer: 24 boards/cabinet, 192 CPUs, 24 TFLOPS
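The per-core, per-CPU, per-cabinet, and whole-system figures on Slides 14 and 15 are consistent with one another, which a few lines of arithmetic make explicit. This is a sketch using only numbers quoted on the slides; the system totals use the ">80,000" lower bound, so they are lower bounds too, and decimal prefixes (1 PB = 10^6 GB) are an assumption.

```python
# Figures quoted on the slides.
CORES_PER_CPU = 8
GFLOPS_PER_CORE = 16
CPUS_PER_CABINET = 192
MIN_CPUS = 80_000        # slide says "> 80,000"
GB_PER_NODE = 16

cpu_gflops = CORES_PER_CPU * GFLOPS_PER_CORE             # per-CPU peak
cabinet_tflops = CPUS_PER_CABINET * cpu_gflops / 1_000   # per-cabinet peak
system_pflops = MIN_CPUS * cpu_gflops / 1_000_000        # lower bound on system peak
memory_pb = MIN_CPUS * GB_PER_NODE / 1_000_000           # lower bound, decimal PB

print(f"CPU: {cpu_gflops} GFLOPS, cabinet: {cabinet_tflops:.1f} TFLOPS")
print(f"system: > {system_pflops:.2f} PFLOPS, memory: > {memory_pb:.2f} PB")
```

The results (128 GFLOPS per CPU, about 24.6 TFLOPS per cabinet, more than 10.24 PFLOPS and 1.28 PB overall) match the slides' rounded "128 GFLOPS", "24 TFLOPS", "> 10 PFLOPS", and "> 1 PB".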
  • Slide 16
  • What is special in K computer?
      • Network: high bandwidth, low latency
      • Processor for HPC
          • VISIMPACT: shared cache & hardware barrier; multi-core parallelization of inner loops
          • HPC-ACE: register extension; SIMD 2 FMA, 2 issue/cycle (4 FMA/core); instructions for special functions (trigonometric, inverse, square root, inverse square root, etc.)
  • Slide 17
  • T. Maruyama, Proc. Hot Chips 2009.
  • Slide 18
  • Software
      • OS: Linux
      • Compiler: the Fujitsu compiler will support Fortran (2003), C (1999), C++ (2003), GNU C/C++ extensions, and automatic vectorization for SPARC64 VIIIfx
      • OpenMP 3.0
      • MPI-2.1
      • gcc may also be available. However, it cannot generate CPU-specific instructions (e.g., SIMD), so poor performance is expected.
  • Slide 19
  • How to use it?
      • Strategic use: five Strategic Regions have been selected. For these fields, MEXT will provide some research funding, and machine time will be allocated.
      • General use: a registered organization will control the distribution of machine time.
      • Commercial use
      • Basically, RIKEN is not responsible for the usage of the machine.
  • Slide 20
  • HPCI: High Performance Computing Infrastructure
      • System to utilize academic supercomputers in Japan (2012~)
      • User communities: 5 strategic regions, industrial consortiums, national universities and institutes
      • Computing resource providers: RIKEN AICS, university centers, national institutes
  • Slide 21
  • Basic Idea of HPCI [diagram: logical structure and physical structure; labels "25 Organization" and "13 Organization"]
  • Slide 22
  • Problem in Future of HPC Hardware
      • If the problem can be parallelized, computing performance is cheap.
      • However, in every aspect, data movement dominates costs.
      • [Diagram of the data-movement hierarchy: core, cache, main memory, node, disk, system, system/apparatus/Internet]
  • Slide 23
  • Future Processors for HPC
      • The gap between top-end HPC processors and commodity processors will increase
      • What is needed for HPC
          • Many-core processors, accelerators for dense problems
          • Chip stacking for bandwidth
          • Network integration
      • Network will be the most important factor in HPC
  • Slide 24
  • Future Directions (1)
      • Network integration is essential both for general-purpose machines and special-purpose ones
      • Platform for accelerators
          • General-purpose processor cores
          • Cache or local memory
          • Fast, low-latency on-chip and off-chip networks
      • [Diagram labels: PU; accelerator; memory; memory 100 GB/s; network > 30 GB/s; on-chip network > 100 GB/s/router]
  • Slide 25
  • Future Directions (2)
      • High memory bandwidth system
      • Single-chip BlueGene/L by System-on-Chip, or chip stacking by TSV
      • B/F 1; B/F 0.1 for remote node
      • [Diagram labels: PU > 500 GFLOPS; memory > 500 GB/s; network > 50 GB/s]
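The B/F (bytes per FLOP) targets on Slide 25 follow directly from the bandwidth and compute figures on the same slide. The sketch below reproduces them and, for comparison, computes the same ratio for a K computer node. The helper name `bytes_per_flop` is hypothetical, the Slide 25 figures are lower bounds, and reading the "64 GB/s" on Slide 14 as the node's memory bandwidth is an assumption.

```python
def bytes_per_flop(bandwidth_gb_s, gflops):
    """Ratio of data bandwidth to compute rate; GB/s over GFLOPS gives bytes/FLOP."""
    return bandwidth_gb_s / gflops


# Slide 25: memory > 500 GB/s, network > 50 GB/s, PU > 500 GFLOPS.
future_local = bytes_per_flop(500, 500)    # local memory
future_remote = bytes_per_flop(50, 500)    # remote node, over the network

# Slide 14, assuming 64 GB/s is memory bandwidth for a 128 GFLOPS CPU.
k_node = bytes_per_flop(64, 128)

print(f"future local: {future_local}, future remote: {future_remote}, K node: {k_node}")
```

Under these readings the proposed design doubles the local B/F of a K node (1.0 versus 0.5) while keeping remote accesses at one tenth of the local ratio.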
  • Slide 26
  • Problem in Network
      • Molecular dynamics: strong scaling is important
      • 50,000 FLOP/particle/step; N = 10^5; 5 GFLOP/step
      • 5 TFLOPS effective performance: 1 msec/step = 170 nsec/day. Rather easy
      • 5 PFLOPS effective performance: 1 μsec/step = 200 μsec/day??? Difficult, but important
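The strong-scaling estimate on Slide 26 can be reproduced with a short calculation. The slide does not state the integration timestep; the 2 fs used below is an assumption, chosen because it reproduces the slide's "170 nsec/day" figure. The function names are hypothetical.

```python
FLOP_PER_PARTICLE_STEP = 50_000   # from the slide
N_PARTICLES = 10**5               # from the slide
TIMESTEP_FS = 2.0                 # ASSUMPTION: not stated on the slide
SECONDS_PER_DAY = 86_400


def step_time_s(effective_flops):
    """Wall-clock seconds per MD step at a given sustained FLOP rate."""
    return FLOP_PER_PARTICLE_STEP * N_PARTICLES / effective_flops


def simulated_ns_per_day(effective_flops):
    """Simulated nanoseconds per wall-clock day, ignoring all communication cost."""
    steps_per_day = SECONDS_PER_DAY / step_time_s(effective_flops)
    return steps_per_day * TIMESTEP_FS * 1e-6   # 1 ns = 1e6 fs


for label, flops in (("5 TFLOPS", 5e12), ("5 PFLOPS", 5e15)):
    print(f"{label}: {step_time_s(flops):.1e} s/step, "
          f"{simulated_ns_per_day(flops):,.1f} ns/day")
```

At 5 TFLOPS this gives 1 ms per step and about 173 ns per day. At 5 PFLOPS each step must complete in about 1 μs, giving roughly 173 μs per day, and that microsecond budget is comparable to network latency, which is why the slide marks this case as difficult.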
  • Slide 27
  • Anton (D. E. Shaw Research)
      • Special-purpose pipeline + general-purpose core + dedicated network
      • By decreasing communication latency, it can achieve high sustained performance even for small systems
      • R. O. Dror et al., Proc. Supercomputing 2009, in USB memory.
  • Slide 28
  • MDGRAPE-4
      • Special-purpose computer for molecular dynamics simulations
      • Test bed for future HPC hardware
      • FY2010-FY2012
      • System-on-Chip: accelerator, memory, general-purpose processor, network
      • ~4 TFLOPS / chip
  • Slide 29
  • Fin