Roadrunner Supercluster
University of New Mexico -- National Computational Science Alliance
Paul Alsing
23 September 1999
Cactus Workshop
Alliance/UNM Roadrunner SuperCluster
• Strategic Collaborations with
  Alta Technologies
  Intel Corp.
• Node configuration
  Dual 450 MHz Intel Pentium II processors
  512 KB cache, 512 MB ECC SDRAM
  6.4 GB IDE hard drive
  Fast Ethernet and Myrinet NICs
Alliance / UNM Roadrunner
• Interconnection Networks
  Control: 72-port Fast Ethernet Foundry switch with 2 Gigabit Ethernet uplinks
  Data: Four Myrinet Octal 8-port switches
  Diagnostic: Chained serial ports
Roadrunner System Software
• Red Hat Linux 5.2 (6.0)
• SMP Linux kernel 2.2.12
• MPI (Argonne’s MPICH 1.1.2)
• Portland Group Compiler Suite
• Myricom GM Drivers (1.086) and MPICH-GM (1.1.2.7)
• Portable Batch System (PBS)
• HPF: Parallel Fortran for clusters
• F90: Parallel SMP Fortran 90
• F77: Parallel SMP Fortran 77
• CC: Parallel SMP C/C++
• DBG: symbolic debugger
• PROF: performance profiler
Roadrunner System Libraries
• BLAS
• LAPACK
• ScaLAPACK
• PETSc
• FFTW
• Cactus
• Globus Grid Infrastructure
Parallel Job Scheduling
• Node-based resource allocation
• Job monitoring and auditing
• Resource reservations
Computational Grid
• National Technology Grid
• Globus Infrastructure
  Authentication
  Security
  Heterogeneous environments
  Distributed applications
  Resource monitoring
For more information:
• Contact Information
  http://www.alliance.unm.edu/
  [email protected]
• To Apply for an Account
  http://www.alliance.unm.edu/accounts
  [email protected]
Easy to Use
rr% ssh -l username rr.alliance.unm.edu
rr% mpicc -o prog helloWorld.c
rr% qsub -I -l nodes=64
r021% mpirun prog
Applications on RR
• MILC QCD (Bob Sugar, Steve Gottlieb)
  A body of high-performance research software for doing SU(3) and SU(2) lattice gauge theory on several different (MIMD) parallel computers in current use
• ARPI3D (Dan Weber)
  3-D numerical weather prediction model to simulate the rise of a moist warm bubble in a standard atmosphere
• AS-PCG (Danesh Tafti)
  2-D Navier-Stokes solver
• BEAVIS (Marc Ingber, Andrea Mammoli)
  1994 Gordon Bell Prize-winning dynamic simulation code for particle-laden, viscous suspensions
Applications: CACTUS
• 3D Numerical Relativity Toolkit for Computational Astrophysics (Courtesy of Gabrielle Allen and Ed Seidel)
• Roadrunner performance under the Cactus application benchmark shows near-perfect scalability.
CACTUS: The evolution of a pure gravitational wave
A subcritical Brill wave (Amplitude=4.5), showing the Newman-Penrose Quantity as volume rendered 'glowing clouds'. The lapse function is shown as a height field in the bottom part of the picture.
(Courtesy of Werner Benger)
• TeraScale computing
• “A SuperCluster in every lab”
• Efficient use of SMP nodes
• Scalable interconnection networks
• High-performance I/O
• Advanced programming models for hybrid (SMP and Grid-based) clusters
Exercises
• Log in to Roadrunner
  % ssh roadrunner.alliance.unm.edu -l cactusXX
• Request interactive session
  % qsub -I -l nodes=n
• Create Myrinet Node-Configuration File
  % gmpiconf $PBS_NODEFILE (to use 1 CPU per node)
  % gmpiconf2 $PBS_NODEFILE (to use 2 CPUs per node)
• Run Job
  % mpirun cactus_wave wavetoyf90.par (on 1 CPU per node)
  % mpirun -np 2*n cactus_wave wavetoyf90.par (on 2 CPUs per node)
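The 2*n in the last command stands for twice the number of nodes granted by PBS. It is not meant to be typed literally. A minimal sketch of computing that value follows. It assumes a POSIX sh rather than the csh used on these slides, assumes $PBS_NODEFILE lists one node name per line, and uses a hypothetical helper name, np_for_nodes, that is not part of PBS or MPICH.

```shell
# Hypothetical helper: print the MPI process count for a PBS node file.
# Assumption: the node file holds one node name per line.
# Usage: np_for_nodes NODEFILE [CPUS_PER_NODE]   (CPUS_PER_NODE defaults to 1)
np_for_nodes() {
    nodefile=$1
    cpus_per_node=${2:-1}
    # Count the non-empty lines, i.e. the allocated nodes.
    nodes=$(grep -c . "$nodefile" || true)
    echo $((nodes * cpus_per_node))
}

# Inside an interactive PBS session one might then run (not executed here):
#   gmpiconf2 $PBS_NODEFILE
#   mpirun -np "$(np_for_nodes "$PBS_NODEFILE" 2)" cactus_wave wavetoyf90.par
```

With a 4-node allocation this would pass -np 8 to mpirun, matching the "2 CPUs per node" case above.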
Compiling Cactus: WaveToy
• Log in to Roadrunner
  % ssh roadrunner.alliance.unm.edu -l cactusXX
• .cshrc
  #MPI (season to taste)
  #setenv MPIHOME /usr/parallel/mpich-eth.pgi #ethernet/Portland Grp
  #setenv MPIHOME /usr/parallel/mpich-eth.gnu #ethernet/GNU
  setenv MPIHOME /usr/parallel/mpich-gm.pgi #myrinet/Portland Grp
  #setenv MPIHOME /usr/parallel/mpich-gm.gnu #myrinet/GNU
• If you modify .cshrc, make sure to
  source .cshrc; rehash
  echo $MPIHOME #should read /usr/parallel/mpich-gm.pgi
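The four MPIHOME settings follow one naming pattern: mpich-NETWORK.COMPILER, with eth or gm for the network and pgi or gnu for the compiler. As a rough illustration of that pattern only, the sketch below maps a pair to its path. It is written for a POSIX sh, not csh, and mpihome_for is a hypothetical helper name; the paths are the ones listed on the slide.

```shell
# Hypothetical helper: print the MPICH install path for a network/compiler pair.
# Network: eth (Fast Ethernet) or gm (Myrinet). Compiler: pgi or gnu.
mpihome_for() {
    case "$1.$2" in
        eth.pgi|eth.gnu|gm.pgi|gm.gnu)
            echo "/usr/parallel/mpich-$1.$2" ;;
        *)
            echo "unknown combination: $1 $2" >&2
            return 1 ;;
    esac
}

# Example (sh syntax): select Myrinet with the Portland Group compilers.
#   MPIHOME=$(mpihome_for gm pgi); export MPIHOME
```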
Compiling Cactus: WaveToy
• Create WaveToy configuration
  % gmake wave F90=pgf90 MPI=MPICH MPICH_DIR=$MPIHOME
• Compile WaveToy
  % gmake wave
  % cd ~/Cactus/exe
  Copy all .par files into this directory (not necessary)
  % foreach file (`find ~/Cactus -name "*.par" -print`)
  foreach> cp $file .
  foreach> end
Running WaveToy on RoadRunner
• Run wave interactively on RoadRunner
  PBS job scheduler: request interactive nodes
  % qsub -I -l nodes=4 (note: -I = interactive)
• Note: the prompt changes from a front-end node name like [cactus01@rr exe] to a compute-node name, e.g. [cactus01@r034 exe]
• Note: you should compile on the front end and run on the compute nodes (open 2 windows)
  PBS job scheduler: set up a node-configuration file
  % gmpiconf $PBS_NODEFILE
• Note: the file ~/.gmpi/conf-xxxx.rr will contain the specific node names (view it with cat)
  Run the job from ~/Cactus/exe
  % mpirun cactus_wave wavetoyf90.par
  % mpirun -np 2 cactus_wave wavetoyf90.par
Running WaveToy on RoadRunner
• Run wave batch on RoadRunner
  PBS script (call it, e.g., wave.pbs):
  #PBS -l nodes=4
  # pbs script for wavetoy: 1 processor per node
  gmpiconf $PBS_NODEFILE
  mpirun ~/Cactus/exe/cactus_wave wavetoyf90.par #(use full path)
• Submit batch PBS job
  % qsub wave.pbs
  234.44 (PBS responds with your job_id #)
  % qstat -a (check status of your job)
  % qstat -n (check status, and see the nodes you are on)
  % qdel 234.44 (remove job from queue)
  % dsh killall cactus_wave (if things hang, mess up, etc.)
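The wave.pbs script above covers the 1-processor-per-node case. A sketch of generating the same script for either 1 or 2 CPUs per node is given below. It is an illustration under stated assumptions, not the workshop's own tooling: it is written for a POSIX sh, write_wave_pbs is a hypothetical helper name, and the 2-CPU variant simply combines gmpiconf2 with mpirun -np as shown on the Exercises slide.

```shell
# Hypothetical generator: write a PBS script for WaveToy to the given file.
# Usage: write_wave_pbs FILE NODES CPUS_PER_NODE   (CPUS_PER_NODE is 1 or 2)
write_wave_pbs() {
    out=$1
    nodes=$2
    cpus=$3
    if [ "$cpus" -eq 2 ]; then
        # Two CPUs per node: gmpiconf2 plus an explicit process count.
        conf=gmpiconf2
        run="mpirun -np $((nodes * cpus)) ~/Cactus/exe/cactus_wave wavetoyf90.par"
    else
        # One CPU per node: same commands as the wave.pbs script above.
        conf=gmpiconf
        run="mpirun ~/Cactus/exe/cactus_wave wavetoyf90.par"
    fi
    # \$PBS_NODEFILE is escaped so that it is expanded when the job runs,
    # not when the script is generated.
    cat > "$out" <<EOF
#PBS -l nodes=$nodes
# pbs script for wavetoy: $cpus CPU(s) per node
$conf \$PBS_NODEFILE
$run
EOF
}

# Example (not executed here): a 4-node, 2-CPU-per-node job.
#   write_wave_pbs wave2.pbs 4 2
#   qsub wave2.pbs
```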