![Page 1: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/1.jpg)
Rocks Clusters
SUN HPC Consortium
November 2004
Federico D. Sacerdoti
Advanced CyberInfrastructure Group
San Diego Supercomputer Center
![Page 2: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/2.jpg)
Copyright © 2004 F. Sacerdoti, M. Katz, G. Bruno, P. Papadopoulos, UC Regents
Outline
• Rocks Identity
• Rocks Mission
• Why Rocks
• Rocks Design
• Rocks Technologies, Services, Capabilities
• Rockstar
![Page 3: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/3.jpg)
Rocks Identity
• System to build and manage Linux Clusters
General Linux maintenance system for N nodes
Desktops too
Happens to be good for clusters
• Free
• Mature
• High Performance
Designed for scientific workloads
![Page 4: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/4.jpg)
Rocks Mission
• Make Clusters Easy (Papadopoulos, 00)
• Most cluster projects assume a sysadmin will help build the cluster.
• Build a cluster without assuming CS knowledge
Simple idea, complex ramifications
Automatic configuration of all components and services
~30 services on frontend, ~10 services on compute nodes
Clusters for Scientists
• Results in a very robust system that is insulated from human mistakes
![Page 5: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/5.jpg)
Why Rocks
• Easiest way to build a Rockstar-class machine with SGE ready out of the box
• More supported architectures
Pentium, Athlon, Opteron, Nocona, Itanium
• More happy users
280 registered clusters, 700 member support list
HPCwire Readers Choice Awards 2004
• More configured HPC software: 15 optional extensions (rolls) and counting.
• Unmatched Release Quality.
![Page 6: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/6.jpg)
Why Rocks
• Big projects use Rocks
BIRN (20 clusters)
GEON (20 clusters)
NBCR (6 clusters)
• Supports different clustering toolkits
Rocks Standard (RedHat HPC)
SCE
SCore (Single Process Space)
OpenMosix (Single Process Space: on the way)
![Page 7: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/7.jpg)
Rocks Design
• Uses RedHat’s intelligent installer
Leverages RedHat’s ability to discover & configure hardware
Everyone tries System Imaging at first
Who has homogeneous hardware? If so, whose cluster stays that way?
• Description Based install: Kickstart
Like Jumpstart
• Contains a viable Operating System
No need to “pre-configure” an OS
![Page 8: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/8.jpg)
Rocks Design
• No special “Rocksified” package structure. Can install any RPM.
• Where Linux core packages come from:
RedHat Advanced Workstation (from SRPMS)
Enterprise Linux 3
![Page 9: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/9.jpg)
Rocks Leap of Faith
• Install is primitive operation for Upgrade and Patch
Seems wrong at first
Why must you reinstall the whole thing?
Actually right: debugging a Linux system is fruitless at this scale. Reinstall enforces stability.
Primary user has no sysadmin to help troubleshoot
• Rocks install is scalable and fast: 15min for entire cluster
Post script work done in parallel by compute nodes.
• Power Admins may use up2date or yum for patches.
Patches reach compute nodes by reinstall
![Page 10: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/10.jpg)
Rocks Technology
![Page 11: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/11.jpg)
Cluster Integration with Rocks
1. Build a frontend node
   1. Insert CDs: Base, HPC, Kernel, optional Rolls
   2. Answer install screens: network, timezone, password
2. Build compute nodes
   1. Run insert-ethers on frontend (dhcpd listener)
   2. PXE boot compute nodes in name order
3. Start Computing
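Why boot order matters can be shown with a small sketch. It is not the insert-ethers code. It only illustrates the idea of a listener that hands out the next free name to each previously unseen MAC address as DHCP requests arrive. The function name `assign_names` and the `compute-<rack>-<rank>` pattern are assumptions made for this illustration.

```python
from typing import Dict, Iterable


def assign_names(macs_in_boot_order: Iterable[str], rack: int = 0,
                 prefix: str = "compute") -> Dict[str, str]:
    """Map each newly seen MAC address to the next free node name.

    Hypothetical helper: names are handed out in the order the
    requests arrive, so the first node to boot gets rank 0.
    """
    names: Dict[str, str] = {}
    for mac in macs_in_boot_order:
        key = mac.lower()
        if key in names:
            continue  # repeated request from a node that is already named
        names[key] = f"{prefix}-{rack}-{len(names)}"
    return names


if __name__ == "__main__":
    # Made-up MAC addresses, listed in the order the nodes were powered on.
    requests = ["00:11:22:33:44:01", "00:11:22:33:44:02",
                "00:11:22:33:44:01", "00:11:22:33:44:03"]
    for mac, name in assign_names(requests).items():
        print(mac, name)
```

Under this scheme, powering nodes on from the top of the rack to the bottom gives names that match their physical position.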
![Page 12: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/12.jpg)
Rocks Tech: Dynamic Kickstart File
On node install
![Page 13: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/13.jpg)
Rocks Roll Architecture
• Rolls are Rocks Modules
Think Apache
• Software for cluster
Packaged 3rd party tarballs
Tested
Automatically configured services
• RPMS plus Kickstart graph in ISO form.
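The idea of a roll as "packages plus a piece of the kickstart graph" can be sketched in a few lines. This is a minimal illustration, not the Rocks implementation: the module names, package lists, and the helpers `merge` and `packages_for` are all hypothetical. The sketch assumes that a node's package list is obtained by walking the graph from its appliance type, and that a roll contributes extra modules and the edges that attach them.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[str]]

# Hypothetical base graph: module name -> packages it contributes.
BASE_NODES: Graph = {
    "compute": [],
    "base": ["kernel", "openssh"],
    "client": ["autofs"],
}
# Directed edges: including a module also includes the modules it points to.
BASE_EDGES: Graph = {"compute": ["base", "client"]}

# A hypothetical roll: one new module plus the edge that attaches it.
HPC_ROLL_NODES: Graph = {"hpc-base": ["mpich"]}
HPC_ROLL_EDGES: Graph = {"compute": ["hpc-base"]}


def merge(nodes: Graph, edges: Graph,
          roll_nodes: Graph, roll_edges: Graph) -> Tuple[Graph, Graph]:
    """Return a new graph with the roll's modules and edges added."""
    merged_nodes = {**nodes, **roll_nodes}
    merged_edges = {src: list(dsts) for src, dsts in edges.items()}
    for src, dsts in roll_edges.items():
        merged_edges.setdefault(src, []).extend(dsts)
    return merged_nodes, merged_edges


def packages_for(root: str, nodes: Graph, edges: Graph) -> List[str]:
    """Depth-first walk from root, collecting packages in visit order."""
    seen: Set[str] = set()
    packages: List[str] = []

    def visit(name: str) -> None:
        if name in seen:
            return  # each module is included at most once
        seen.add(name)
        packages.extend(nodes.get(name, []))
        for child in edges.get(name, []):
            visit(child)

    visit(root)
    return packages


if __name__ == "__main__":
    print(packages_for("compute", BASE_NODES, BASE_EDGES))
    nodes, edges = merge(BASE_NODES, BASE_EDGES, HPC_ROLL_NODES, HPC_ROLL_EDGES)
    print(packages_for("compute", nodes, edges))
```

Because the roll only adds modules and edges, the base graph is left untouched and the same traversal serves both cases.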
![Page 14: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/14.jpg)
Rocks Tech: Dynamic Kickstart File
With Roll (HPC)
HPCbase
![Page 15: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/15.jpg)
Rocks Tech: Wide Area Net Install
Install a frontend without CDs
Benefits
• Can install from minimal boot image
• Rolls downloaded dynamically
• Community can build specific extensions
![Page 16: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/16.jpg)
Rocks Tech: Security & Encryption
To protect the kickstart file
![Page 17: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/17.jpg)
Rocks Tech: 411 Information Service
• 411 does NIS
Distribute passwords
• File based, simple HTTP transport
Multicast
• Scalable
• Secure
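A file-based distribution service of this kind can be pictured with a toy model. The sketch below is not the 411 implementation and makes several simplifying assumptions: an in-memory `Master` object stands in for the HTTP server, the multicast and security aspects are left out entirely, and the class and method names are invented for this example. It shows only the pull-on-change idea, where a client fetches a file when its content digest differs from the master's.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Content fingerprint used to detect changed files."""
    return hashlib.sha256(data).hexdigest()


class Master:
    """Hypothetical stand-in for the frontend: holds the authoritative files."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def publish(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def listing(self) -> Dict[str, str]:
        """What a client would request first: path -> content digest."""
        return {path: digest(data) for path, data in self.files.items()}

    def fetch(self, path: str) -> bytes:
        return self.files[path]


class Client:
    """Hypothetical stand-in for a compute node's local copies."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def sync(self, master: Master) -> List[str]:
        """Pull only files whose digest differs; return the paths updated."""
        updated: List[str] = []
        for path, remote_digest in master.listing().items():
            local = self.files.get(path)
            if local is None or digest(local) != remote_digest:
                self.files[path] = master.fetch(path)
                updated.append(path)
        return updated


if __name__ == "__main__":
    master, node = Master(), Client()
    master.publish("/etc/passwd", b"root:x:0:0:root:/root:/bin/bash\n")
    print(node.sync(master))  # first sync pulls the file
    print(node.sync(master))  # nothing changed, nothing pulled
```

Since each client only compares digests and pulls what changed, the master does little work per node, which is one plausible reading of the "Scalable" bullet.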
![Page 18: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/18.jpg)
Rocks Services
![Page 19: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/19.jpg)
Rocks Cluster Homepage
![Page 20: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/20.jpg)
Rocks Services: Ganglia Monitoring
![Page 21: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/21.jpg)
Rocks Services: Job Monitoring
SGE Batch System
![Page 22: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/22.jpg)
Rocks Services: Job Monitoring
How a job affects resources on this node
![Page 23: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/23.jpg)
Rocks Services: Configured, Ready
• Grid (Globus, from NMI)
• Condor (NMI)
Globus GRAM
• SGE
Globus GRAM
• MPD parallel job launcher (Argonne)
MPICH 1, 2
• Intel Compiler set
• PVFS
![Page 24: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/24.jpg)
Rocks Capabilities
![Page 25: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/25.jpg)
High Performance Interconnect Support
• Myrinet
All major versions, GM2
Automatic configuration and support in Rocks since first release
• InfiniBand
Via collaboration with AMD & Infinicon
IB
IPoIB
![Page 26: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/26.jpg)
Rocks Visualization “Viz” Wall
• Enables LCD Clusters
One PC / tile
Gigabit Ethernet
Tile Frame
• Applications
Large remote sensing
Volume Rendering
Seismic Interpretation
• Electronic Visualization Lab
Bio-Informatics
Bio-Imaging (NCMIR BioWall)
![Page 27: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/27.jpg)
Rockstar
![Page 28: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/28.jpg)
Rockstar Cluster
• Collaboration between SDSC and SUN
• 129 Nodes: Sun V60x (Dual P4 Xeon)
Gigabit Ethernet Networking (copper)
Top500 list positions: 201, 433
• Built on showroom floor of Supercomputing Conference 2003
Racked, Wired, Installed: 2 hrs total
Running apps through SGE
![Page 29: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/29.jpg)
Building of Rockstar
![Page 30: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/30.jpg)
Rockstar Topology
• 24-port switches
• Not a symmetric network
Best case - 4:1 bisection bandwidth
Worst case - 8:1
Average - 5.3:1
• Linpack achieved 49% of peak
• Very close to the percentage of peak achieved by 1st generation DataStar at SDSC
![Page 31: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/31.jpg)
Rocks Future Work
• High Availability: N Frontend nodes.
Not that far off (supplemental install server design)
Limited by Batch System
Frontends are long lived in practice:
Keck 2 Cluster (UCSD) uptime: 249 days, 2:56
• Extreme install scaling
• More Rolls!
• Refinements
![Page 32: Rocks Clusters SUN HPC Consortium November 2004 Federico D. Sacerdoti Advanced CyberInfrastructure Group San Diego Supercomputer Center](https://reader035.vdocument.in/reader035/viewer/2022062314/56649de85503460f94ae2908/html5/thumbnails/32.jpg)
www.rocksclusters.org
• Rocks mailing List https://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
• Rocks Cluster Register http://www.rocksclusters.org/rocks-register
• Core: {fds,bruno,mjk,phil}@sdsc.edu