Post on 29-Dec-2015
Using The Cluster
What We’ll Be Doing
Add users Run Linpack Compile code Compute Node Management
Add a User
Adding a User Account
Execute:
# useradd <username>
Output from ‘useradd’:

Creating user: gb
make: Entering directory `/var/411'
/opt/rocks/bin/411put --comment="#" /etc/auto.home
411 Wrote: /etc/411.d/etc.auto..home
Size: 514/207 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1
/opt/rocks/bin/411put --comment="#" /etc/passwd
411 Wrote: /etc/411.d/etc.passwd
Size: 2565/1722 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1
/opt/rocks/bin/411put --comment="#" /etc/shadow
411 Wrote: /etc/411.d/etc.shadow
Size: 1714/1093 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1
/opt/rocks/bin/411put --comment="#" /etc/group
411 Wrote: /etc/411.d/etc.group
Size: 1163/687 bytes (encrypted/plain)
Alert: sent on channel 239.2.11.71 with master 10.1.1.1
make: Leaving directory `/var/411'
411 Secure Information Service
Secure NIS replacement
Distributes files within the cluster. The default 411 configuration distributes the user account files, but 411 can be used to distribute any file to all nodes.
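As a sketch of distributing an arbitrary file, the steps below use 411put (shown elsewhere in this document) for a one-off push; the /var/411/Files.mk entry for making the file permanently managed is an assumption about the stock Rocks layout, and /etc/ntp.conf is only an illustrative file name:

```shell
# One-off push of an arbitrary file to all nodes via 411:
/opt/rocks/bin/411put /etc/ntp.conf

# To have 411 manage the file permanently, add it to /var/411/Files.mk
# (assumption: stock Rocks layout), e.g.:
#   FILES += /etc/ntp.conf
# then rebuild the 411 repository:
make -C /var/411
```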
411 Secure Information Service
When a 411-monitored file changes, an alert is multicast.
When a node receives an alert, it pulls the file associated with the alert.
Compute nodes periodically pull all files under the control of 411
User Accounts
All user accounts are housed on the frontend under: /export/home/<username>
All nodes use ‘autofs’ to automatically mount the user directory when a user logs into a node. This method provides a simple global file system.
On the frontend and every compute node, the user account is available at “/home/<username>”
Deleting a User
Use:
# userdel <username>
Note: the user’s home directory (/export/home/<username>) will not be removed. For safety, this must be removed by hand.
Running Linpack
Linpack
Linpack is a floating-point benchmark that solves a dense system of linear equations. It measures sustained floating-point operations per second.
“Gigaflops” - 1 billion floating point operations per second
This benchmark is used to rate the Top500 fastest supercomputers in the world
We use it as a comprehensive test of the system: it stresses the CPU, uses the MPICH layer, sends a modest number of messages, and ensures a user can launch a job on all nodes. It can also be run through the queueing system, to test the queueing system as well.
Running Linpack From the Command Line
Make a ‘machines’ file. Execute: vi machines
Input the following:

compute-0-0
compute-0-0
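The machines file is plain text with one hostname per line, repeated once per CPU. Rather than typing it in vi, it can be generated from the shell; this sketch assumes a single node, compute-0-0, with two CPUs:

```shell
# Build a 'machines' file listing compute-0-0 twice (once per CPU).
node=compute-0-0
cpus=2
: > machines                      # truncate/create the file
for i in $(seq 1 "$cpus"); do
    echo "$node" >> machines
done
cat machines
```

The same loop scales to more CPUs per node by raising the `cpus` count, or to more nodes by wrapping it in an outer loop over hostnames.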
Get a test Linpack configuration file:
$ cp /var/www/html/rocks-documentation/3.2.0/examples/HPL.dat .
Login as a non-root user:

# su - <userid>
Run It

Load your ssh key into your environment:

$ ssh-agent $SHELL
$ ssh-add

Execute Linpack:

$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 \
    -machinefile machines /opt/hpl/gnu/bin/xhpl
Flags:
-nolocal : don’t run Linpack on the host that is launching the job
-np 2 : give the job 2 processors
-machinefile : run the job on the nodes specified in the file ‘machines’
Successful Linpack Output

The following parameter values will be used:

N      : 2000
NB     : 64
P      : 1
Q      : 2
PFACT  : Left Crout Right
NBMIN  : 8
NDIV   : 2
RFACT  : Right
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 80)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words
----------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
  1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
  2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
  3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
============================================================================
T/V          N    NB     P     Q        Time    Gflops
----------------------------------------------------------------------------
W11R2L8   2000    64     1     2        1.96    2.724e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.1049227 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0255037 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055411 ...... PASSED
Running Linpack Through a Job Management System

Get a test SGE submission script:
$ cp /var/www/html/rocks-documentation/3.2.0/examples/sge-qsub-test.sh .
Examine the script. Most of the script concerns adding (and removing) a temporary ssh key to your environment.
Important Part Of The Script

At the top, the requested number of processors:

#$ -pe mpi 2

In the middle, what job to run:

/opt/mpich/gnu/bin/mpirun -nolocal -np $NSLOTS \
    -machinefile $TMPDIR/machines \
    /opt/hpl/gnu/bin/xhpl
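Putting those two fragments together, the skeleton of such a submission script looks roughly like this. This is a sketch, not the real sge-qsub-test.sh: the ssh-key setup and teardown that the real script performs is only indicated by comments, and the -cwd and -S directives are assumptions about a typical SGE job script:

```shell
#!/bin/bash
#$ -pe mpi 2          # at the top: request 2 processors
#$ -cwd               # assumption: run the job from the submit directory
#$ -S /bin/bash       # assumption: interpret the job with bash

# ... the real script adds a temporary ssh key to the environment here ...

# in the middle: the job to run
/opt/mpich/gnu/bin/mpirun -nolocal -np $NSLOTS \
    -machinefile $TMPDIR/machines \
    /opt/hpl/gnu/bin/xhpl

# ... and the real script removes the temporary ssh key here ...
```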
Submit the Job
Send the job off to SGE:
$ qsub sge-qsub-test.sh
Monitoring the Job
Command line:

$ qstat -f
queuename            qtype used/tot. load_avg arch     states
----------------------------------------------------------------------------
compute-0-0q         BIP   2/2       99.99    glinux
    3     0 sge-qsub-t bruno      r     06/03/2004 02:48:15 MASTER
    3     0 sge-qsub-t bruno      r     06/03/2004 02:48:15 SLAVE
Status
Job Output
SGE writes 4 files:

sge-qsub-test.sh.e0
  Stderr for job ‘0’
sge-qsub-test.sh.o0
  Stdout for job ‘0’
sge-qsub-test.sh.pe0
  Stderr from the queueing system regarding job ‘0’
sge-qsub-test.sh.po0
  Stdout from the queueing system regarding job ‘0’
Removing a Job from the Queue
Execute:
$ qdel <job id>
Find the job id with ‘qstat -f’:

queuename            qtype used/tot. load_avg arch     states
----------------------------------------------------------------------------
compute-0-0q         BIP   2/2       99.99    glinux
    3     0 sge-qsub-t bruno      r     06/03/2004 02:48:15 MASTER
    3     0 sge-qsub-t bruno      r     06/03/2004 02:48:15 SLAVE

To remove the job above:

$ qdel 3
Monitoring SGE Via The Web
Setup access to the web server:

Local access
  Configure X: redhat-config-xfree86

Remote access
  Open the http port in “/etc/sysconfig/iptables”
  Or, use ssh port forwarding:
    “ssh root@stakkato.rocksclusters.org -L 8080:localhost:80”
  Then point a web browser to “http://localhost:8080”
Frontend Web Page
SGE Job Monitoring
Ganglia Monitoring
Scaling Up Linpack
Tell SGE to allocate more processors. Edit ‘sge-qsub-test.sh’ and change:
#$ -pe mpi 2
To:
#$ -pe mpi 4
Tell Linpack to use more processors. Edit ‘HPL.dat’ and change:
1 Ps
To:
2 Ps
The number of processors Linpack uses is P * Q
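Since the processor count is P * Q, a quick sanity check that the HPL.dat grid matches the ‘-pe mpi’ request can be scripted; this sketch hard-codes the P=2, Q=2, 4-slot example above:

```shell
# HPL runs on a P x Q process grid; total MPI processes = P * Q.
P=2
Q=2
NSLOTS=4          # what '#$ -pe mpi 4' would request from SGE
if [ $((P * Q)) -eq "$NSLOTS" ]; then
    echo "grid ${P}x${Q} matches $NSLOTS slots"
else
    echo "mismatch: ${P}x${Q} = $((P * Q)), but $NSLOTS slots requested"
fi
```

If the grid and the slot request disagree, HPL either leaves processors idle or fails to start, so the check is worth doing before every submit.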
Scaling Up Linpack
Submit the larger job:

$ qsub sge-qsub-test.sh
To make Linpack use more memory (and increase performance), edit ‘HPL.dat’ and change:
1000 Ns
To:
4000 Ns
Linpack operates on an N * N matrix. Goal: consume 75% of the memory on each compute node.
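The 75% rule can be turned into arithmetic: the matrix occupies N * N * 8 bytes (double precision) spread across the compute nodes, so N ≈ sqrt(0.75 * total_memory / 8). This sketch assumes two nodes with 1 GB of RAM each; adjust the numbers for your hardware:

```shell
# Estimate the largest sensible HPL problem size N.
# Matrix storage: N * N * 8 bytes, spread across all compute nodes.
nodes=2
mem_per_node=$((1024 * 1024 * 1024))     # assumption: 1 GB per node
total=$((nodes * mem_per_node))
# N = sqrt(0.75 * total / 8); awk supplies the square root
N=$(awk -v t="$total" 'BEGIN { printf "%d", sqrt(0.75 * t / 8) }')
echo "suggested Ns line for HPL.dat: $N Ns"
```

Rounding N down slightly (e.g. to a multiple of the NB block size) is a common refinement, but the square root gives the right order of magnitude.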
Using Linpack Over Myrinet
Get a test Myrinet SGE submission script:

$ cp /var/www/html/rocks-documentation/3.2.0/examples/sge-qsub-test-myri.sh .

Submit the Myrinet-based job:

$ qsub sge-qsub-test-myri.sh

Scale up the job in the same manner as described in the previous slides.
Executing Commands Across the Cluster

Collect “ps” status: cluster-ps <regular expression>
To get the status of all the processes being executed by user ‘bruno’, execute: cluster-ps bruno

Kill processes: cluster-kill <regular expression>
To kill all the Linpack jobs, execute: cluster-kill xhpl

Execute any command line executable: cluster-fork "<command>"
To restart the ‘autofs’ service on all compute nodes, execute: cluster-fork "service autofs restart"
Executing Commands Across the Cluster

All cluster-* commands can query the database to generate a node list.

To restart the ‘autofs’ service only on the nodes in cabinet 1, execute:

cluster-fork --query="select name from nodes where rack=1" "service autofs restart"
Compile Code
Compile Test MPI Program with gcc

Compile cpi:

$ cp /opt/mpich/gnu/examples/cpi.c .
$ cp /opt/mpich/gnu/examples/Makefile .
$ make cpi
/opt/mpich/gnu/bin/mpicc -c cpi.c
/opt/mpich/gnu/bin/mpicc -o cpi cpi.o -lm

Run it:

$ /opt/mpich/gnu/bin/mpirun -nolocal -np 2 -machinefile machines $HOME/cpi/cpi
Process 0 on compute-2-1.local
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.000650
Process 1 on compute-2-1.local
Compile MPI Code with Intel Compiler
Simply change ‘gnu’ to ‘intel’
$ cp /opt/mpich/intel/examples/cpi.c $HOME
$ cp /opt/mpich/intel/examples/Makefile $HOME
$ make cpi
/opt/mpich/intel/bin/mpicc -c cpi.c
/opt/mpich/intel/bin/mpicc -o cpi cpi.o -lm
Bring In Your Own Code
FTP your code to the frontend. Let’s compile and try to run it!
Compute Node Management
Adding a Compute Node
Execute “insert-ethers”

If adding to a specific rack, for example cabinet 2:
  “insert-ethers --cabinet=2”

If adding to a specific location within a rack:
  “insert-ethers --cabinet=2 --rank=4”
Replacing a Dead Node
To replace node compute-0-4:
# insert-ethers --replace="compute-0-4"
Remove the dead node
Power up the new node
Put the new node into “installation mode”
  Boot with the Rocks Base CD, PXE boot, etc.
The next node that issues a DHCP request will assume the role of compute-0-4
Removing a Node
If decommissioning a node:
# insert-ethers --remove="compute-0-2"
Insert-ethers will remove all traces of compute-0-2 from the database and restart all relevant services. You will not be asked for any input.