Developing a Cluster Strategy for NPACI
All Hands Meeting Panel
Feb 11, 2000
David E. Culler
Computer Science Division
University of California, Berkeley
http://www.cs.berkeley.edu/~culler
2/11/2000 NPACI Clusters 2
UCB Millennium Cluster of Clusters
• x86+Myrinet platforms w/ GbE inter-networking
• Distributed ownership, allocation, and management
[Figure: clusters connected by Gigabit Ethernet (GbE). Labels: PIII-X 64x4; PII 8x2 (five clusters); PIII 32x2; ½ TB; DLIB; PII; PIII; Ninja; Math; BioCE; Physics; Astro; NTON; Internet-2; SuperNet; Mobile Svcs; Kiosks; NOW]
Vineyard Cluster Architecture
• Distributed resource utilization and management in a “Vineyard” of Clusters.
[Figure: layered cluster architecture. Labels: Applications / Services (ISPACE / Kiosks); MPI; VEXEC; PBS; I/O; REXEC; Mgmt / Monitoring; TOOLS; VIA / GM, GbE, Multicast; NT / Linux (2.2.x), Stride Scheduler; Rootstock Distribution]
Clusters “own” HPC
Fundamental Advantages of Clusters
• Cost
• Performance
• Performance / Cost
• Track leading edge of market technology
• Incremental scalability
• Availability
• Tremendous I/O performance
• Wide-Area Network performance
– competitive internal network performance too
• Allow specialization of networked services
Fundamental Challenges
• Management
• Complete system on every node
– need scalable administration
• Incremental scalability & availability =>
– heterogeneity
– some parts inoperable at any time
• The Cluster projects are making great progress in this area
– e.g., Millennium rootstock
• Cluster tools are what you want for managing the desktops across your department
CS&E HPC hampered by “self-centered” usage model
• Have my own application for my studies
• Want the entire machine to myself
• Want it now
• Think “services”
• Think “software”
• The value is in your application.
• Make it a service and make it available to the scientific community.
• Put it on a cluster to deliver results 24x7x52
Example: TCAD Simulation Service
• star formation simulation
• earthquake simulations
• phylogeny, BLAST, ...
• http://cuervo.eecs.berkeley.edu/Volcano/
Extreme Example
• UCB Millennium / NOW has delivered 70 CPU years!
• Simple special case, but ...
• Engineered for portability, adaptability, availability
What should NPACI do?
To be relevant:
• become a “Center of Expertise” for clusters
• draw expertise toward the center for ease of dissemination
• facilitate and encourage building clusters among the partners
• invest in an interesting cluster “close to home”
– cheap! Graft Millennium
• invest in people to understand the implications
To Lead:
• Pioneer widespread computational science and engineering services
• InfiniBand
from e-commerce to
Technical Backup Slides
Rootstock Mechanics
[Figure labels: Cluster System Distribution Center; cluster stock (build, os, drvrs, mill SW, os mods); leased builds; IP network; CS; CAN; Cluster]
1. Cluster Stock
- Rootstock build pages
- Full Current Linux
- all fixes and pckgs
- SSL, SSH
- Cluster Drivers
- Cluster System Layers
- rexec, mpe, pbs
- Optional SW ($)
- Cluster Kernel Mods
2. Make the CS “graft”
- specify IP address
- pckg removes
- dhcp, dns, nis, ...
- sanity check and build
- resolv.conf, /etc/hosts, ...
- constructs cluster build (lease)
- download CS build floppy
3. CS power-on build
- xfer and localize DT
- add local admin scripts
- node build floppy
4. Node power-on build
- local stock from CS
5. Cluster Update button (future)
- 2nd dialtone, CF engine, rolling update
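Step 2 above mentions specifying an IP address and producing resolv.conf and /etc/hosts for the cluster. The slide gives no detail on how Rootstock does this, so the following is only a minimal sketch of that kind of file generation. It assumes nodes receive consecutive addresses; the function names, the `node` naming scheme, and the `cluster.example` domain are all hypothetical.

```python
import ipaddress


def hosts_entries(base_ip, node_count, prefix="node", domain="cluster.example"):
    """Return /etc/hosts lines for node_count nodes numbered from base_ip.

    Assumption: nodes get consecutive IPv4 addresses starting at base_ip.
    """
    base = ipaddress.IPv4Address(base_ip)
    lines = ["127.0.0.1\tlocalhost"]
    for i in range(node_count):
        name = f"{prefix}{i}"
        # IPv4Address supports integer addition, giving the i-th address.
        lines.append(f"{base + i}\t{name}.{domain}\t{name}")
    return lines


def resolv_conf(domain, nameservers):
    """Return resolv.conf text with one search domain and the given nameservers."""
    lines = [f"search {domain}"]
    lines += [f"nameserver {ns}" for ns in nameservers]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print("\n".join(hosts_entries("10.0.0.10", 3)))
    print(resolv_conf("cluster.example", ["10.0.0.1"]), end="")
```

Generating both files from one small description (base address, node count, domain) is one way to keep every node's configuration consistent, which is the point of the "sanity check and build" step.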
REXEC / VEXEC
• Components
– rexecd, rexec & vexecd
[Figure: one rexecd on each of Node A, Node B, Node C, Node D; vexecd (Policy A), vexecd (Policy B), and the rexec client on the Cluster IP Multicast Channel. Example request: "% rexec -n 2 -r 3 indexer", labeled “minimum $”; reply “Nodes AB”; result: run indexer on Nodes AB at 3 credits/min]
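The example above shows a request for two nodes answered with “Nodes AB” under a “minimum $” label. The slide does not say how a vexecd policy chooses nodes, so this is only a sketch of one plausible minimum-cost policy: pick the cheapest available nodes. The `NodeOffer` type, its fields, and the prices are hypothetical and are not part of the REXEC / VEXEC interfaces.

```python
from dataclasses import dataclass


@dataclass
class NodeOffer:
    name: str
    price: float            # hypothetical: credits per minute charged on this node
    available: bool = True


def select_min_cost(offers, n):
    """Return the names of the n cheapest available nodes.

    Ties are broken by node name so the result is deterministic.
    Raises ValueError if fewer than n nodes are available.
    """
    candidates = sorted(
        (o for o in offers if o.available),
        key=lambda o: (o.price, o.name),
    )
    if len(candidates) < n:
        raise ValueError(f"need {n} nodes, only {len(candidates)} available")
    return [o.name for o in candidates[:n]]


if __name__ == "__main__":
    offers = [
        NodeOffer("A", 1.0),
        NodeOffer("B", 1.5),
        NodeOffer("C", 4.0),
        NodeOffer("D", 2.5),
    ]
    print(select_min_cost(offers, 2))  # prints ['A', 'B']
```

A second policy (for example, least-loaded nodes) would only need a different sort key, which matches the slide's picture of separate vexecd instances running Policy A and Policy B.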
Computational Economy
• Market-based approach to resource allocation
– Optimizes for user value
[Figure labels: Apps (Value); Economic F.E.; API; API; Access Modules; Resource Managers; TimeShare; BatchQueue; Resources]
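The slide does not specify the market mechanism. As an illustration only, one simple market-based rule is proportional share: each application receives a fraction of a resource equal to its bid divided by the sum of all bids. The function name and the example bids below are hypothetical.

```python
def proportional_shares(bids):
    """Map each application to its fraction of a resource, proportional to its bid.

    bids: dict of application name -> bid rate (for example, credits per minute).
    Raises ValueError for negative bids or when no bid is positive.
    """
    if any(rate < 0 for rate in bids.values()):
        raise ValueError("bids must be non-negative")
    total = sum(bids.values())
    if total <= 0:
        raise ValueError("at least one bid must be positive")
    return {app: rate / total for app, rate in bids.items()}


if __name__ == "__main__":
    shares = proportional_shares({"indexer": 3, "simulation": 1})
    for app, share in shares.items():
        print(f"{app}: {share:.2f}")
```

Under this rule a user who values a result more can bid more and get a larger share, which is one reading of "optimizes for user value"; a time-share manager and a batch queue could each apply the same bids in their own way.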