Seaborg, Cerise Wuthrich, CMPS 5433
Seaborg
Manufactured by IBM
Distributed memory parallel supercomputer
Based on IBM’s SP RS/6000 architecture
Seaborg
Used by the National Energy Research Scientific Computing Center (funded by the Department of Energy) at Berkeley Lab
Named for Glenn Seaborg – Nobel laureate chemist who discovered 10 atomic elements, including plutonium
Nodes
416 nodes with 16 processors per node
380 compute nodes
20 nodes used for disk storage
6 login nodes
2 network nodes
8 spares
Node Architecture
16 IBM Power3 processors per node
Between 16 and 64 GB of memory per node
2 network adapters per node
Processors
IBM Power3 processors, each running at 375 MHz
POWER – Performance Optimized With Enhanced RISC
PowerPC processors are RISC-based symmetric multiprocessors (every processor is functionally identical) with 64-bit addressability
Connected to L2 cache by a bus running at 250 MHz
Dynamic branch prediction
Instruction prefetching
FP units are fully pipelined
4 FLOP/cycle x 375 MHz = 1500 million FLOP/s, or 1.5 GFLOPS per processor
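
The per-processor peak on the Processors slide can be carried up to a node and to the compute partition. The following is a minimal sketch, assuming peak performance scales linearly with processor count and that only the 380 compute nodes from the Nodes slide are counted. The function names are illustrative and do not come from the slides.

```python
# Figures taken from the slides.
CLOCK_MHZ = 375            # Power3 clock rate
FLOPS_PER_CYCLE = 4        # floating-point operations per cycle
PROCESSORS_PER_NODE = 16
COMPUTE_NODES = 380        # assumption: login, storage, network and spare nodes excluded


def peak_gflops_per_processor(clock_mhz=CLOCK_MHZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak of one processor in GFLOPS (MFLOPS / 1000)."""
    return clock_mhz * flops_per_cycle / 1000.0


def peak_gflops_per_node(processors=PROCESSORS_PER_NODE):
    """Theoretical peak of one node, assuming linear scaling across processors."""
    return peak_gflops_per_processor() * processors


def peak_tflops_system(nodes=COMPUTE_NODES):
    """Theoretical peak of the compute partition in TFLOPS."""
    return peak_gflops_per_node() * nodes / 1000.0


if __name__ == "__main__":
    print(f"per processor: {peak_gflops_per_processor():.2f} GFLOPS")
    print(f"per node:      {peak_gflops_per_node():.2f} GFLOPS")
    print(f"380 nodes:     {peak_tflops_system():.2f} TFLOPS")
```

Under these assumptions the sketch gives 24 GFLOPS per node and roughly 9.12 TFLOPS for the compute nodes. These are theoretical peaks, not measured application performance.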
Interconnection Network
Nodes connected with a high bandwidth, low latency IBM SP2 switch
Can be connected in various topologies depending on the number of nodes
Each switchboard has up to 32 links
16 links to nodes
16 links to other switchboards
Interconnection Network
Star topology is used for up to 80 nodes while still guaranteeing 4 independent shortest paths
Interconnection Network
The combination of HW and SW of the switch system is known as the CSS – Communication SubSystem
Network is highly available
Latency in the network
Within nodes, latency is 9 microseconds
Between nodes, using Message Passing Interface (MPI), the latency is 17 microseconds
Scalability
Architecture can handle from 1 to 512 nodes
The current version of Seaborg (2003) is twice the size of the original (2001)
Memory
Within each node, between 16 and 64 GB of shared memory
Between nodes, there is distributed memory
Parallel programs can be run using distributed memory message passing, shared memory threading, or a combination
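
The combined programming model on the Memory slide can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions: "nodes" are simulated as workers that each own a private chunk of data and communicate solely through a queue (standing in for message passing), while threads inside a simulated node write to a shared list (standing in for shared memory). It is not MPI code, and the names `node_worker` and `hybrid_sum` are hypothetical.

```python
import queue
import threading


def node_worker(node_id, local_data, threads_per_node, outbox):
    """Simulate one node: threads share `partials`, then one message is sent."""
    partials = [0] * threads_per_node  # shared by all threads of this simulated node

    def thread_body(t):
        # Each thread sums its own strided slice and writes to its own slot.
        partials[t] = sum(local_data[t::threads_per_node])

    threads = [threading.Thread(target=thread_body, args=(t,))
               for t in range(threads_per_node)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

    # The only data leaving the "node" is this message to the root.
    outbox.put((node_id, sum(partials)))


def hybrid_sum(data, num_nodes=4, threads_per_node=4):
    """Sum `data` using simulated message passing between nodes and threads within them."""
    outbox = queue.Queue()
    # Distributed part: each simulated node gets a private copy of its chunk.
    chunks = [list(data[n::num_nodes]) for n in range(num_nodes)]
    nodes = [threading.Thread(target=node_worker,
                              args=(n, chunks[n], threads_per_node, outbox))
             for n in range(num_nodes)]
    for nd in nodes:
        nd.start()
    for nd in nodes:
        nd.join()

    # Root collects exactly one partial result per node.
    total = 0
    for _ in range(num_nodes):
        _, partial = outbox.get()
        total += partial
    return total


if __name__ == "__main__":
    print(hybrid_sum(list(range(1000))))
```

On a real distributed memory machine the queue would be replaced by message passing between nodes, and the inner threads would run on the processors that share one node's memory.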
I/O
20 nodes run the distributed parallel I/O system called GPFS – General Parallel File System
44 terabytes of disk space
Each node runs its own copy of AIX – IBM’s Unix-based OS
Production Status/Cost
$33 million for the first version, put into operation in June 2001
At the time, it was the 2nd most powerful computer in the world and the most powerful one for unclassified research
In 2003, the number of nodes was doubled
Customers
2100 researchers at national labs and universities across the country
Restricted to Department of Energy funded massively parallel processing projects
Located at the National Energy Research Scientific Computing Center
Applications
Massively parallel scientific research
Gasoline combustion simulation
Fusion energy research
Climate modeling
Materials science
Computational biology
Particle simulations
Plasma acceleration
Large scale simulation of atomic structures
Interesting Features
In 2004, there were 2.4 times as many requests as resources available
Uses POE (Parallel Operating Environment) and LoadLeveler to schedule jobs
Survey Results – Why do you use Seaborg?
Need massively parallel computer
High speed
Achieves level of numerical accuracy
Can run several simulations in parallel
Easy to connect using ssh
Fastest and most efficient computer available for my research
Long queue times are great
Large enough memory for my needs
Survey Results – How could Seaborg be improved?
I think too many nodes are scheduled for many jobs. Scaling is not good in many cases.
“..Virtually impossible to do interactive work”
“Debuggers are terrible.”
“Compilers and debuggers are a step down from the Cray.”
Giving preference to high concurrency jobs makes smaller jobs wait