Blue Gene Project Update
January 2002


Page 1

Blue Gene Project Update

January 2002

Page 2

In December 1999, IBM Research announced a five-year, $100M (US) effort to build a petaflop-scale supercomputer to attack problems such as protein folding. The Blue Gene project has two primary goals:

- Advance the state of the art of biomolecular simulation.
- Advance the state of the art in computer design and software for extremely large-scale systems.

In November 2001, a partnership with Lawrence Livermore National Laboratory was announced.

The Blue Gene Project

Page 3

Problem: Changing Workload Paradigms
- Data-intensive workloads drive emerging commercial opportunities.
- Power consumption, cooling, and size are increasingly important to users.

Drivers: Data-Intensive Computing & Technology
- Growth in traditional unstructured data (text, images, graphs, pictures, ...)
- Enormous increase in complex data types (video, audio, 3D images, ...)
- Increased dependency on optimized data management, analysis, search, and distribution.

Research Opportunity: A New Way to Build Servers
- Increase total system performance per cubic foot, per watt, and per dollar by two orders of magnitude.
- Minimize development time and expense via common, reusable cores.
- Distribute workloads by placing compute power in the memory and I/O subsystems.
- Focus on key issues: power consumption, system management, compute density, autonomic operation, scalability, RAS.

Cellular System Architecture

Page 4

Cellular System Projects

Super Dense Server (SDS)
- Uses industry-standard components
- Linux web-hosting environments
- Cluster-in-a-box using 10s-100s of low-power processors
- Workload-specific functionality per node, with a shared storage system

Blue Gene/L System
- Targets scientific and emerging commercial applications
- Collaboration with LLNL on a ~200 TF system for S&TC workloads
- Extended distributed programming model
- Scalable from small to large systems (2,048 processors/rack)
- Leverages high-speed interconnect and system-on-a-chip technologies

Blue Gene / Petaflop Supercomputer
- Science and research project
- Explore the limits of protein science, computer science, and computer architecture
- Applications: molecular dynamics, complex simulations
- Supporting research: distributed systems management, self-healing, problem determination, fast initialization, visualization, system monitoring, packaging, etc.

Page 5

Supercomputer Peak Performance

[Chart: peak speed (flops), 1E+2 to 1E+16, vs. year introduced, 1940-2010; doubling time = 1.5 yr. Machines plotted: ENIAC (vacuum tubes), UNIVAC, IBM 701, IBM 704, IBM 7090 (transistors), IBM Stretch, CDC 6600 (ICs), CDC 7600, CDC STAR-100 (vectors), CRAY-1, Cyber 205, X-MP2 (parallel vectors), CRAY-2, X-MP4, Y-MP8, i860 (MPPs), Delta, CM-5, Paragon, NWT, CP-PACS, ASCI Red, Blue Pacific, ASCI White, Blue Gene/L, Blue Gene/P.]

Page 6

[Chart: performance vs. architecture, from application-specific to general-purpose. Points plotted: QCDSP, 0.6 TFLOP; QCDOC (CMOS7SF), 20 TFLOP; ASCI Blue (PPC 604), 3.3 TFLOP; ASCI White (PPC 630), 10 TFLOP; ASCI-Q, 30 TFLOP; BG/L (CU-11), 180 TFLOP; BG/C (CU-11), 1000 TFLOP; BG/P (CU-08), 1000 TFLOP.]

Supercomputing Landscape

Page 7

Two cellular computing architectures
- Blue Gene/L
- Blue Gene/C (formerly Cyclops)

Software stack
- Kernels, host, middleware, simulators, OS
- Self-healing, autonomic computing

Application program
- Molecular dynamics application software
- Partnerships, external advisory board

Blue Gene Project components

Page 8

Massively parallel architecture applicable to a wide class of problems.

Third generation in a line of low-power, high-performance supercomputers:
- QCDSP (600 GF, based on the Texas Instruments C31 DSP): Gordon Bell Prize for Most Cost-Effective Supercomputer in '98
- QCDOC (20 TF, based on IBM system-on-a-chip technology): collaboration between Columbia University and IBM Research
- Blue Gene/L (180/360 TF):
  - Processor architecture included in the optimization
  - Outstanding price/performance
  - Partnership between IBM, the ASCI Tri-Lab, and universities
  - Initial focus on numerically intensive scientific problems

Blue Gene/L

Page 9

Blue Gene/L

Packaging hierarchy (CU-11 design):
- Chip (2 processors: two PowerPC 440 cores, embedded DRAM, I/O): 2.8/5.6 GF/s, 4 MB
- Board (8 chips, 2x2x2): 22.4/44.8 GF/s, 2.08 GB
- Rack (128 boards, 8x8x16): 2.9/5.7 TF/s, 266 GB
- System (64 cabinets, 32x32x64): 180/360 TF/s, 16 TB
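As a consistency check (our arithmetic, not on the slide): 2.8 GF/s per chip × 8 chips = 22.4 GF/s per board; × 128 boards ≈ 2.9 TF/s per rack; × 64 racks ≈ 180 TF/s for the full system. The doubled figures (5.6 GF/s through 360 TF/s) correspond to using both 440 cores on each chip for computation rather than dedicating one core to communication.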

Page 10

Blue Gene/L - The Networks

65,536 nodes interconnected with three integrated networks:

3D Torus (see the sketch after this list)
- Virtual cut-through hardware routing to maximize efficiency
- 2.8 Gb/s on all 12 node links (a total of 4.2 GB/s per node)
- Communication backbone
- 134 TB/s total torus interconnect bandwidth

Global Tree
- One-to-all or all-to-all broadcast functionality
- Arithmetic operations implemented in the tree
- ~1.4 GB/s of bandwidth from any node to all other nodes
- Tree-traversal latency of less than 1 µs

Ethernet
- Incorporated into every node ASIC
- Disk I/O
- Host control, booting, and diagnostics
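The per-node torus numbers above are internally consistent: 12 links × 2.8 Gb/s = 33.6 Gb/s ≈ 4.2 GB/s, and summing each link once over all 65,536 nodes gives the quoted ~134 TB/s aggregate. As a minimal sketch of torus addressing (hypothetical helper code, not BG/L system software), the following C program computes wraparound hop counts on the 32x32x64 system torus; the per-dimension distance never exceeds half the dimension because traffic can take the shorter way around the ring:

```c
#include <stdio.h>
#include <stdlib.h>

#define DX 32
#define DY 32
#define DZ 64   /* full-system torus dimensions from the packaging slide */

/* Wraparound distance along one torus dimension. */
static int torus_dist1(int a, int b, int dim) {
    int d = abs(a - b);
    return d < dim - d ? d : dim - d;
}

/* Manhattan hop count between two nodes; virtual cut-through routing
   forwards a packet at each intermediate hop without a full
   store-and-forward delay. */
static int torus_hops(const int p[3], const int q[3]) {
    return torus_dist1(p[0], q[0], DX)
         + torus_dist1(p[1], q[1], DY)
         + torus_dist1(p[2], q[2], DZ);
}

int main(void) {
    int a[3] = {0, 0, 0}, b[3] = {31, 31, 63};
    /* Wraparound makes the far corner only 1+1+1 = 3 hops away. */
    printf("hops = %d\n", torus_hops(a, b));
    return 0;
}
```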

Page 11

Software environment includes:
- High-performance scientific kernel
- MPI-2 subset, defined by users (see the sketch below)
- Math libraries subset, defined by users
- Compiler support for the DFPU (C, C++, Fortran)
- Parallel file system
- System management
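As a minimal sketch of this programming model, assuming only standard MPI (illustrative code, not the project's actual subset definition): a global sum is exactly the kind of collective that BG/L can map onto the arithmetic hardware of the tree network described on the previous slide.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nprocs;
    double local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    local = (double)rank;              /* each node's contribution */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);     /* result delivered to every node */

    if (rank == 0)
        printf("sum over %d nodes = %g\n", nprocs, global);
    MPI_Finalize();
    return 0;
}
```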

External collaborations: Boston University, Caltech, Columbia University, Oak Ridge National Labs, San Diego Supercomputing Center, Universidad Politecnica de Valencia, University of Edinburgh, University of Maryland, Texas A&M, Tech. Univ. of Vienna, ...

Blue Gene/L System Software

Page 12

[Diagram: the BG/L operating environment. 64K compute nodes each run kernel services and the application (MPI; C/C++ and F95; math libraries). 1K I/O nodes each run kernel services (file I/O, checkpoint, graphics, network) and carry the I/O and RAS traffic. The I/O nodes connect over 1 Gb/s Ethernet links (about 1 Tb/s in aggregate) to RAID servers, a system console, and a host.]

BG/L - Operating Environment

Page 13

Blue Gene Science

Explore simulation methodologies, analysis tools, and biological systems.

Thermodynamic and kinetic studies of peptide systems: "The free energy landscape for β-hairpin folding in explicit water," R. Zhou et al. (to appear in PNAS)

Solvation free energy/partition coefficient calculations for peptides in solution using a variety of force fields for comparison with experiment

Explore scientific and technical computing applications in other domains such as climate, materials science, ...

Page 14

Drivers for Application

Aggressive machine architecture requires:
- small memory footprint
- fine-grained concurrency in application decomposition
- exploration of reusable frameworks for application development

Biological science program requires:
- multiple force field support
- architectural support for algorithmic and methodological investigations
- framework to support migration of analysis modules from the externally hosted application to the parallel core

Page 15

Algorithm (code) | Scientific Importance | Mapping to BG/L Architecture
Ab initio molecular dynamics in biology and materials science (JEEP) | A | A
Three-dimensional dislocation dynamics (MICRO3D and PARANOID) | A | A
Molecular dynamics (MDCASK) | A | A
Kinetic Monte Carlo (BIGMAC) | A | C
Computational gene discovery (MPGSS) | A | B
Turbulence: Rayleigh-Taylor instability (MIRANDA) | A | A
Shock turbulence (AMRH) | A | A
Turbulence and instability modeling (sPPM benchmark) | B | A
Hydrodynamic instability (ALE hydro) | A | A

BG/L Application to ASCI algorithms

Page 16

Blue Matter - a Molecular Dynamics Code

Separate the MD program into three subpackages (offload function to the host where possible):
- MD core engine (massively parallel, minimal in size)
- Setup programs to set up force field assignments, etc.
- Analysis tools to analyze MD trajectories, etc.

Multiple force field support:
- CHARMM force field (done)
- OPLS-AA force field (done)
- AMBER force field (done)
- Polarizable force field (desired)

Potential parallelization strategies (see the sketch below):
- Interaction-based
- Volume-based
- Atom-based
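A minimal sketch contrasting two of these strategies, under assumed parameters (a hypothetical 512-node partition and a 4 nm box; this is not Blue Matter code): atom-based decomposition assigns ownership by atom index, while volume-based decomposition assigns it by position on the node grid.

```c
#include <stdio.h>

#define PX 8
#define PY 8
#define PZ 8   /* assumed 8x8x8 = 512-node partition */

/* Atom-based: near-equal contiguous blocks of atom indices per rank. */
static int owner_by_index(int i, int n_atoms, int n_ranks) {
    int chunk = (n_atoms + n_ranks - 1) / n_ranks;   /* ceiling division */
    return i / chunk;
}

/* Volume-based: map a position inside the box [0,L)^3 to a spatial
   cell, then linearize the 3D cell index into a rank number. */
static int owner_by_volume(double x, double y, double z, double L) {
    int cx = (int)(x / L * PX);
    int cy = (int)(y / L * PY);
    int cz = (int)(z / L * PZ);
    return (cz * PY + cy) * PX + cx;
}

int main(void) {
    printf("atom 1234 -> rank %d (atom-based)\n",
           owner_by_index(1234, 4000, PX * PY * PZ));
    printf("atom at (1,2,3) nm in a 4 nm box -> rank %d (volume-based)\n",
           owner_by_volume(1.0, 2.0, 3.0, 4.0));
    return 0;
}
```

Volume-based ownership keeps short-range interactions on nearby nodes of the torus at the cost of rebalancing as atoms move; interaction-based decomposition, the third option, distributes pairs of atoms rather than atoms themselves.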

Page 17

Phenomenon | System/size (with solvent) | Time scale | Time-step count
beta-hairpin kinetics | β-hairpin / 4,000 atoms | 5 µs | 10^9
peptide thermodynamics | α-helix, β-hairpin / 4,000 atoms | 0.1-1 µs | 10^8
protein thermodynamics | 60-100 residues / 20-30,000 atoms | 1-10 µs | 10^9
protein kinetics | 60-100 residues / 20-30,000 atoms | 500 µs | 10^11
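These step counts follow from step count = simulated time / timestep. Assuming a femtosecond-scale MD timestep (a standard choice, not stated on the slide), 5 µs at 5 fs per step is exactly 10^9 steps, and 500 µs is 10^11.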

Time Scales for Protein Folding Phenomena

Page 18

Simulation Capacity

[Chart: time steps/month, 10^6 to 10^13, vs. system size, 1,000-100,000 atoms, for four machines: 1 rack of Power3 ('01); a 512-node BG/L partition (2H03); 40 x 512-node BG/L partitions (4Q04); a 1,000,000 GFLOP/s machine (2H06).]

Page 19

Data Volumes (assuming every time step is written out)

[Chart: bytes/month, 10^12 to 10^18, vs. system size, 10^3 to 10^5 atoms. Data volume per month for: 1 rack of Power3; a 512-node BG/L partition; 40 x 512-node BG/L partitions; a 1,000,000 GFLOP/s machine.]
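A rough sizing, under assumptions not on the slide: writing three 8-byte coordinates per atom per step gives bytes/month ≈ (steps/month) × (atoms) × 24; for example, 10^9 steps/month on a 10^4-atom system yields about 2.4 × 10^14 bytes/month, in the middle of the plotted range.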

Page 20

External Scientific Interactions

- First Blue Gene Protein Science workshop held at the San Diego Supercomputer Center, March 2001
- Second Blue Gene Protein Science workshop to be held at the Maxwell Institute, U. of Edinburgh, in March 2002
- Collaborations with ORNL, Columbia, UPenn, Maryland, Stanford, ETH-Zurich, ...
- Blue Gene seminar series has hosted over 25 speakers at the T.J. Watson Research Center
- Blue Gene Applications Advisory Board formed, with 15 members from the external scientific and HPC communities

Page 21

Blue Gene Project Update

January 2002