© 2012 IBM Corporation
IBM Platform LSF Training Nov 19-21, 2014
IBM Platform LSF Architecture Overview
Platform Computing
Without IBM Platform LSF
Which node can run my job or task?
In a distributed environment (hundreds of hosts)
Monitoring and control of resources is complex
Work load is “silo”-based
Resource usage imbalance
Users perceive a lack of resources
With IBM Platform LSF
All nodes are grouped into a “cluster”
Now, IBM Platform LSF will run my job or task on the best node available!
Virtual pool of computing resources managed by IBM Platform LSF
Key LSF Objectives
• Provide the means to create a powerful computer system made up of many smaller systems to increase productivity and lower operating costs
• Match limited supply of resources with demand
IBM Platform LSF
LSF 8 Architecture
[Diagram labels: Enterprise Applications (Application ×5); Services in Real-time; Application Servers On-demand; Platform LSF®; Platform Enterprise Grid Orchestrator™ (EGO); Heterogeneous Enterprise Resources (Network Bandwidth, Servers, Licenses, Data Storage)]
LSF Terminology
• Cluster
A collection of TCP/IP networked hosts running Platform LSF
• Master host
A cluster requires a master host. The master host controls the rest of the hosts in the grid
• Master candidates
Master failover hosts
• Server host
A host within the cluster that submits and executes jobs and tasks
• Client host
A host within the cluster that only submits jobs and tasks
LSF Terminology (cont'd)
• Execution host
The host that executes the job or task
• Submission host
The host from which a job or task is submitted
• Job
A command submitted to Platform LSF. Can take more than one Job Slot
• Queue
A network-wide holding place for jobs which implements different job scheduling and control policies
• Job Slot
The basic unit of processor allocation in Platform LSF. Can be more than one per physical processor
LSF Overview
Individual machines are grouped into a cluster to be managed by Platform LSF
One machine in the cluster is selected as the “master” of LSF
Each slave machine in the cluster collects its own “vital signs” periodically and reports them back to the master
Users then submit their jobs to Platform LSF and the master makes a decision on where to run the job based on the collected vital signs
How It Works – IBM Platform LSF
[Diagram labels: Web Application; LSF Web Services Broker; Job Submission API; Job Queue; Cluster Workload Manager; Host Workload Manager; Load Information Manager; Intelligent Scheduler; Plugin Schedulers (Fairshare, Preemption, Resource Reservation, Advance Reservation, License Scheduling, SLA Scheduling / Service Level Agreement, MultiCluster, Other Scheduling Modules)]
Load Information Manager (LIM)
• Runs on every server host in the cluster
• Defines the cluster configuration
– Identifies the master host
– Licenses the static server hosts, dynamic server hosts and fixed client hosts
– Gathers built-in resource load information directly from /dev/kmem and forwards the
information to the master LIM
– Reports site-defined resource load information gathered by Master ELIM and slave
ELIMs to the master LIM
Master LIM
• Selected based on the order of static server hosts in
LSF_MASTER_LIST variable in lsf.conf
• Stores built-in and site-defined resource load information gathered by
the Master ELIM and slave ELIMs
• MBD queries the resource load information from the Master LIM for
MBSCHD
• If the master LIM becomes unavailable, a new master LIM is automatically
started on the failover master host
Master and Slave ELIMs
• Site-defined resources can be managed by LSF
• These site-defined resources are gathered using slave ELIMs
(elim.slave)
• Slave ELIMs are managed by a Master ELIM (melim)
• The slave ELIMs are written by the LSF Administrator
• Static ELIMs are used to report user-defined static resources
Process Information Manager (PIM)
• Runs on every server host in the cluster
• Responsible for gathering information about every process running on
the server
• Information gathered is used:
– By SBD to enforce load thresholds
– By MBD to calculate fairshare
• Automatically started by LIM
Remote Execution Server (RES)
• Runs on every server host in the cluster
• Provides fast, transparent and secure remote execution of interactive
tasks
• To prevent interactive task submissions, disable the RES daemon
Slave Batch Daemon (SBD)
• Runs on every server host in the cluster
• Receives job requests from MBD
• Is responsible for enforcing load thresholds
• Maintains the state of jobs on a server host
• Launches MBD on the master host
Master Batch Daemon (MBD)
• One MBD per cluster that runs on the master host
• Responds to user queries (bjobs, bhosts, etc.)
• Receives job requests (bsub)
• Responsible for the overall state of all jobs in the system
• Sends resource load information from the master LIM and pending job
information to MBSCHD for scheduling
• Receives scheduling decision from MBSCHD and dispatches jobs to
SBD on designated server host
• Keeps a transaction file on jobs
• Manages queues
LSF Scheduling Daemon (MBSCHD)
• One MBSCHD per cluster that runs on the master host
• Receives resource load information and pending job information from
MBD
• Makes scheduling decisions based on job requirements, policies and
resource availability
• Sends scheduling decisions to MBD for job dispatching
• Launched automatically by MBD on the master host
• If MBSCHD fails, a new MBSCHD is started
• Reads the lsf.conf file for environment information
Starting LSF Daemons
• Upon successful completion of an LSF installation, the daemons must be started
LSF Server: LIM, PIM, RES, PEM, SBD
LSF Master: LIM, PIM, RES, SBD, MBD, MBSCHD
Starting LSF Daemons (cont'd)
• To start all daemons on all hosts in the cluster you can use the following
script
lsfstartup
• Useful for cold starting a cluster
Starting LSF Daemons (cont'd)
• To start daemons on the local host:
Command                     Daemons
% lsadmin limstartup        LIM, PIM
% lsadmin resstartup        RES
% badmin hstartup           SBD [MBD, MBSCHD]
• To start daemons on remote host(s):
% lsadmin limstartup host1 [host2…hostn]
% lsadmin resstartup host1 [host2…hostn]
% badmin hstartup host1 [host2…hostn]
• To start daemons on all hosts in the lsf.cluster file:
% lsadmin limstartup all
% lsadmin resstartup all
% badmin hstartup all
Starting LSF Daemons (cont'd)
• To start daemons on the local host:
# $LSF_SERVERDIR/lsf_daemons start
or
# /etc/init.d/lsf start
or
# cd $LSF_SERVERDIR
# ./lim
# ./res
# ./sbatchd
IBM Platform LSF Daemons
[Diagram: the Master, Host 2 ... Host N each show LIM, RES, PIM, MELIM, SELIM and SBD, with load data read from /dev/kmem; MBD and MBSCHD appear only on the Master.]
LSF LIM & RES Status
% lsload
HOST_NAME  status   r15s  r1m   r15m  ut     pg    ls  it     tmp   swp   mem
training8  ok       0.0   0.0   0.0   0%     0.0   1   11728  112M  114M  52M
training3  ok       1.9   1.0   1.1   19%    1.9   1   0      121M  113M  31M
training1  -ok      2.9   3.0   1.6   49%    9.9   2   0      127M  143M  35M
training5  busy     9.5   12.0  8.1   *100%  10.9  5   0      30M   21M   40M
training2  unavail  -     -     -     -      -     -   -      -     -     -
* Indicates that a load threshold has been exceeded
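The column layout above lends itself to simple scripting. The following is a minimal sketch, assuming whitespace-separated columns exactly as in the sample; `parse_lsload` and the `exceeded` key are illustrative names, not part of LSF.

```python
def parse_lsload(text):
    """Parse lsload-style output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip footnotes or rows that do not match the header
        row = dict(zip(header, fields))
        # A leading '*' marks a load index that exceeded its threshold.
        row["exceeded"] = [k for k, v in row.items() if v.startswith("*")]
        rows.append(row)
    return rows


SAMPLE = """\
HOST_NAME status r15s r1m r15m ut pg ls it tmp swp mem
training8 ok 0.0 0.0 0.0 0% 0.0 1 11728 112M 114M 52M
training5 busy 9.5 12.0 8.1 *100% 10.9 5 0 30M 21M 40M
training2 unavail - - - - - - - - - -
* Indicates that a load threshold has been exceeded
"""

rows = parse_lsload(SAMPLE)
ok_hosts = [r["HOST_NAME"] for r in rows if r["status"] == "ok"]
print(ok_hosts)
```

On a real cluster the text would come from running `lsload`; here a pasted sample keeps the sketch self-contained.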
LSF SBD Status
% bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
training8 ok - 32 10 5 4 1 0
training1 ok - 16 0 0 0 0 0
training3 unreach - 1 0 0 0 0 0
training5 closed - 2 1 1 0 0 0
training2 unavail - 8 0 0 0 0 0
LSF SBD Status (cont'd)
% bhosts -l training5
HOST training5
STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOWS
closed_Busy 18.60 - 2 1 1 0 0 0 ()
CURRENT LOAD USED FOR SCHEDULING:
r15s r1m r15m ut pg io ls it tmp swp mem
Total 7.8 6.9 5.4 *75% 5.5 3.4 3 0 82M 44M 52M
Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M
LOAD THRESHOLD USED FOR SCHEDULING:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - *0.2 - - - - - - -
loadStop - - - *0.7 - - - - - - -
* Indicates that a load threshold has been exceeded
LSF Job Process
• By default, LSF handles a job as follows:
– Receives the job
– During the next dispatch turn, considers the job for dispatch
– Places the job on the best available host
– Sets the environment on the host
– Starts the job
Queue Definition
• User access restriction
• Host restriction
• Queue status
• Exclusive execution restriction
• Job resources requirement
Job Dispatching
• Every JOB_SCHEDULING_INTERVAL, MBD sends
jobs for scheduling to MBSCHD
• Jobs may not be dispatched in order of submission
• MBD sends job information and resource information
to MBSCHD for scheduling
• As soon as MBD receives scheduling decisions from
MBSCHD, it immediately dispatches the job for execution
Job Scheduling
• MBSCHD evaluates jobs and makes scheduling
decisions based on:
– Job priority
– Scheduling policies
– Available resources
• MBSCHD selects the most appropriate execution host and sends its
decision to MBD
Host Selection
• A host is eligible to run a job if all conditions are met:
– Job slot availability on host
– Job slot limits
– Host load levels
– Host dispatch windows
– Resource requirements of the job
– Resource requirements of the queue
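The eligibility conditions listed above can be read as a conjunction of checks. The sketch below is only an illustration of that logic, not LSF's scheduler: the `Host` fields, the single `r1m` threshold, and the memory-only resource requirement are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    max_slots: int      # job slot limit (MAX in bhosts)
    used_slots: int     # slots in use (NJOBS in bhosts)
    r1m: float          # 1-minute run queue length
    r1m_sched: float    # scheduling threshold for r1m (hypothetical field)
    window_open: bool   # dispatch window currently open
    mem_mb: int         # available memory in MB


def eligible(host, slots_needed, job_mem_mb, queue_mem_mb):
    """A host is eligible only if every condition holds."""
    checks = [
        host.max_slots - host.used_slots >= slots_needed,  # slot availability and limits
        host.r1m <= host.r1m_sched,                        # host load level
        host.window_open,                                  # dispatch window
        host.mem_mb >= job_mem_mb,                         # job resource requirement
        host.mem_mb >= queue_mem_mb,                       # queue resource requirement
    ]
    return all(checks)


idle = Host("training8", 32, 10, 0.0, 3.5, True, 4096)
busy = Host("training5", 2, 1, 9.5, 3.5, True, 40)
print(eligible(idle, 4, 1024, 512), eligible(busy, 1, 16, 16))
```

Failing any single check is enough to exclude a host, which is why `busy` is rejected on load alone even though it has a free slot.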
Job Execution
• By default:
– The execution environment is maintained to be as close to the submission
environment as possible
– LSF transfers global environment variables from the submission host to the
execution host
• LSF sets LSF-specific environment variables for jobs
Execution Environment Variables
LSB_JOBID
Job ID assigned by LSF
LSB_MCPU_HOSTS
The list of hosts that are used to run the batch job
LSB_QUEUE
The name of the queue the job is dispatched from
LSB_JOBNAME
The name of the job
LSB_INTERACTIVE
Set to “Y” if the job is submitted with the -I option
LS_JOBPID
Set to the process ID of the job
LS_SUBCWD
The directory on the submission host when the job was submitted
Refer to the LSF Reference Guide for more information
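A job script can read these variables from its environment. The sketch below assumes only the variable names listed above; `lsf_job_context` is a hypothetical helper, and a plain dict stands in for the real environment so the example runs outside LSF.

```python
import os

LSF_VARS = [
    "LSB_JOBID", "LSB_MCPU_HOSTS", "LSB_QUEUE", "LSB_JOBNAME",
    "LSB_INTERACTIVE", "LS_JOBPID", "LS_SUBCWD",
]


def lsf_job_context(env=None):
    """Collect the LSF-specific variables; unset ones map to None."""
    if env is None:
        env = os.environ
    return {name: env.get(name) for name in LSF_VARS}


# Outside LSF none of these are set, so a fake environment is used here.
ctx = lsf_job_context({"LSB_JOBID": "1234", "LSB_QUEUE": "normal"})
print(ctx["LSB_JOBID"], ctx["LSB_QUEUE"], ctx["LS_SUBCWD"])
```

Inside a running job, calling `lsf_job_context()` with no argument reads the values LSF set for that job.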
IBM Platform LSF Key Features Overview
Key LSF Concepts
• Resource - Computers, applications, licenses, storage ... a cluster can
be thought of as a collection of resources
• Transparency – job executing on any node in the cluster must appear to
the end user like it’s running on his/her local node
• Policies – resources are allocated to the jobs according to centrally
configured policies
LSF continuously matches demand with supply. A job is dispatched to run
on a remote node once its resource requirements are matched with
resources supplied by the host(s) in the cluster and the job meets the scheduling
policies currently in effect.
Resource Requirement String (cont'd)
• A resource requirement string is divided into the following
sections:
– Selection - select[selection_string]
– Usage - rusage[rusage_string]
– Ordering - order[order_string]
– Locality - span[span_string]
– Same - same[same_string]
– CU - cu[cu_string]
• The span and same sections are specifically used for parallel
jobs
Selection String
• A logical expression built from a set of resource names
• Specifies the characteristics of a server host to be considered as a
potential execution host
• Evaluated for each host
– If the expression evaluates to ‘true’, then that host is considered a candidate
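Per-host evaluation can be pictured as applying a boolean predicate to each host's resource values. This is a rough sketch under assumptions: the host records and the `select` helper are hypothetical, and the lambda merely mirrors a selection string such as `select[swp>=300 && mem>500]` rather than parsing one.

```python
hosts = [
    {"name": "training8", "type": "LINUX64", "swp": 114, "mem": 52},
    {"name": "hostA", "type": "LINUX64", "swp": 512, "mem": 2048},   # hypothetical host
    {"name": "hostB", "type": "SOL32", "swp": 300, "mem": 500},      # hypothetical host
]


def select(hosts, predicate):
    """Return the names of hosts for which the expression is true."""
    return [h["name"] for h in hosts if predicate(h)]


# Equivalent in spirit to select[swp>=300 && mem>500]
candidates = select(hosts, lambda h: h["swp"] >= 300 and h["mem"] > 500)
print(candidates)
```

Note that `hostB` is excluded because `mem>500` is a strict comparison.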
Selection String (cont'd)
• The select keyword can be omitted if the selection
section is first in the resource requirement string
• The default selection string for execution type commands such as bsub or lsrun is:
select[type==local]
• The default selection string for query type commands such as lsload is:
select[type==any]
Selection String Examples
• $ bsub -R "select[type==any && swp>=300 && \
mem>500]" job1
Select a candidate execution host of any type that has at least 300MB of
available swap and more than 500MB of available memory
• $ lsload -R "select[type==local && cpuf<18.0]"
Display all candidate execution hosts of the same type as the submission host
that have a CPU factor less than 18.0
Selection String Examples (cont'd)
• $ bsub -R "(ut<0.50 && ncpus==2) || \
(ut<0.75 && ncpus==4)" job2
Select a candidate execution host where the CPU utilization is less than 0.50 and the
number of CPUs is 2, or the CPU utilization is less than 0.75 and the number of
CPUs is 4
• $ bsub -R "type==SUNSOL && swap>300 || \
type==HPPA && swap>400" task1
Select a candidate execution host where the type is SUNSOL and more than
300MB of swap is available, or where the type is HPPA and more than 400MB
of swap is available
Resource Usage String
• Specifies resource reservations for jobs on execution hosts
• Ignored when running interactive tasks
• By default no resources are reserved
• If rusage is defined at the job level and queue level, the job level
takes precedence
• Keywords duration and decay can be used
• If a job can run with more than one rusage string, it is possible to
specify multiple strings with an “OR” operator and have LSF pick the
first one that matches
Resource Usage String Examples
• $ bsub -R "select[type==any && swap>=300 && \
mem>500] order[swap:mem] \
rusage[swap=300,mem=500]" job1
On the selected execution host, reserve 300MB of swap space and 500MB of
memory for the duration of the job
• $ bsub -R "rusage[mem=500:app_lic_v2=1 || \
mem=400:app_lic_v1.5=1]" job1
The job will use 500MB of memory with app_lic_v2, or 400MB with app_lic_v1.5
• Resource reservation is ignored for interactive tasks (e.g., lsload, lsrun)
Resource Usage String Examples (cont'd)
• $ bsub -R "select[ut<0.50 && ncpus==2] \
rusage[ut=0.50:duration=20:decay=1]" job2
On the selected execution host, reserve 50% of CPU utilization and linearly
decay the amount of CPU utilization reserved over the duration of the period
• $ bsub -R "select[type == SUNSOL && mem > 300] \
rusage[mem=300:duration=1h]" job3
On the selected execution host, reserve 300MB of memory for 1 hour
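The effect of duration and decay on a reservation can be sketched numerically. This is an illustration only, assuming the reservation falls linearly to zero over the duration when decay is on and is released entirely once the duration ends; `reserved` is a hypothetical helper and times are in minutes.

```python
def reserved(amount, duration_min, elapsed_min, decay=False):
    """Amount still reserved after elapsed_min minutes."""
    if elapsed_min >= duration_min:
        return 0.0                      # reservation released after the duration
    if not decay:
        return float(amount)            # constant until the duration ends
    return amount * (1 - elapsed_min / duration_min)  # linear decay


# rusage[ut=0.50:duration=20:decay=1], halfway through the period
halfway = reserved(0.50, 20, 10, decay=True)
# rusage[mem=300:duration=1h], no decay
steady = reserved(300, 60, 30)
print(halfway, steady, reserved(300, 60, 60))
```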
Resource Usage String Examples (cont'd)
$ bsub -q reserve -R "rusage[res1=4,tmp=1000]" Job1
% bjobs -lp
Job <215>, User <john>, Project <default>, Status <PEND>, Queue <reserve>, Command <job1>
Thu Jul 24 07:12:16: Submitted from host <delpe07>, CWD </home/john>, Requested Resources <rusage[res1=4, tmp=1000]>;
Thu Jul 24 07:12:21: Reserved <1> job slot on host <delpe07>;
Thu Jul 24 07:12:21: Reserved <581> megabyte tmp on host <581M*delpe07>;
Thu Jul 24 07:12:21: Reserved <2> res1
Order String
• Allows candidate execution hosts to be sorted according to the value of
resources
• The first index is the primary sort key, the second is your secondary sort
key, etc
• Hosts are ordered from best to worst on the given index or indices
• If defined at both the job and queue level, the job level takes precedence
• The default order string used is order[r15s:pg]
Order String Examples
• $ bsub -R "select[type==any && swp>=300 && \
mem>500] order[mem]" job1
Order the candidate execution hosts from the highest to lowest amount of
available memory
• $ lsload -R "select[type==local && cpuf < 18.0] \
order[cpuf]"
Order the candidate execution hosts from highest to lowest CPU factor
• If no order section is specified, the candidate execution hosts are sorted
using the default order string order[r15s:pg]
Order String Examples (cont'd)
• $ bsub -R "select[ut<0.8 && mem>200] \
order[r1m:ut:-mem]" small_mem_job.sh
Order the candidate execution hosts from lowest to highest one-minute run
queue length, then from lowest to highest CPU utilization, and then from lowest to highest
amount of available memory
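Multi-key ordering with a reversed index can be mimicked with a tuple sort key. This is a sketch under assumptions: the host values are made up, `order_hosts` is a hypothetical helper, and the default directions in `SMALLER_IS_BETTER` cover only the indices whose direction is stated in these slides.

```python
# True: lower values sort first by default; False: higher values sort first.
SMALLER_IS_BETTER = {"r15s": True, "r1m": True, "ut": True, "mem": False, "cpuf": False}


def order_hosts(hosts, order_string):
    """Sort hosts by an order string such as 'r1m:ut:-mem'."""
    keys = order_string.split(":")

    def sort_key(host):
        parts = []
        for key in keys:
            reverse = key.startswith("-")
            name = key.lstrip("-")
            ascending = SMALLER_IS_BETTER[name] != reverse  # '-' flips the default
            parts.append(host[name] if ascending else -host[name])
        return tuple(parts)

    return [h["name"] for h in sorted(hosts, key=sort_key)]


hosts = [
    {"name": "a", "r1m": 1.0, "ut": 0.5, "mem": 300},
    {"name": "b", "r1m": 0.5, "ut": 0.7, "mem": 400},
    {"name": "c", "r1m": 0.5, "ut": 0.7, "mem": 250},
]
print(order_hosts(hosts, "r1m:ut:-mem"))
print(order_hosts(hosts, "mem"))
```

Hosts `b` and `c` tie on the first two keys, so `-mem` decides between them and places the host with less available memory first.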
Span String
• Specifies the locality of a parallel job
• Supported options:
– span[hosts=1] which indicates that all processors allocated to this job must be on
the same execution host
– span[ptile=n] which indicates that up to n processors on each execution host
should be allocated to the job
– span[ptile=![,HOSTTYPE:n]] uses the predefined maximum job slot limit in
lsb.hosts (MXJ per host type/model) as the ptile value for every host type or model
other than those specified explicitly
• When defined at both job-level and queue-level, the job-level definition
takes precedence
Span String Examples
• $ bsub -n 16 -R "select[ut<0.15] order[ut] \
span[hosts=1]" parallel_job1
All processors required to complete this job must reside on the same
execution host
• $ bsub -n 16 -R "select[ut<0.15] order[ut] \
span[ptile=2]" parallel_job2
Up to 2 CPUs per execution host can be used to execute this job; therefore, at
least 8 execution hosts are required to complete this job
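The host count in the ptile example is a ceiling division of the requested processors by the per-host cap. A minimal sketch (`min_hosts` is an illustrative name):

```python
import math


def min_hosts(processors, ptile):
    """Fewest hosts needed when at most ptile processors run per host."""
    return math.ceil(processors / ptile)


print(min_hosts(16, 2))   # the -n 16 span[ptile=2] case
print(min_hosts(16, 3))   # a cap that does not divide evenly
```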
Same CPU String
• Used to specify that all processes of a parallel job must run on hosts
with the same resources
• The parallel job scheduler plugin must be installed to use this option
• When defined at both job-level and queue-level, both requirements are
combined to allocate processors
• Any static resource can be specified
Same CPU String Examples
• $ bsub -n 64 -R "select[type==SGI6||type==SOL7] \
same[type]" parallel_job1
Run all parallel processes on the same host type, either SGI IRIX or Solaris 7,
but not both
• $ bsub -n 64 -R "select[type==any] \
same[type:model]" parallel_job2
Run all parallel processes on the same host type and model
Multiple Resource Requirements
• Support for multiple resource requirement string (-R) options
• The administrator can easily change resource requirements in job-level submission
bsub -R "select[swp > 15]" -R "select[hpux] order[r15m]" -R "rusage[mem=100]" \
-R "order[ut]" -R "same[type]" -R "rusage[tmp=50:duration=60]" \
-R "same[model]" myjob
• LSF merges the multiple -R options into one string and selects a host
that meets all of the resource requirements
• The number of -R option sections is unlimited
• Up to a maximum of 512 characters for the entire string per -R option
Compound Resource Requirements
• Specify different requirements for some slots within a job
Example 1: Requests 32 processors: one on an X86_64 machine, on which the job reserves 16000 MB of memory, and the rest on X86_64 machines.
$ bsub -R "1*{ select[type==X86_64 && mem>16000] \
rusage[mem=16000] } \
+ 31*{ select[type==X86_64] }" myjob
Example 2: Requests 48 processors to run an MPMD application: the first 16 must be on 2 XT5 nodes, each with 8 cores; the remaining 32 processors must be on 8 XT4 nodes, each with 4 cores.
$ bsub -R "16*{ select[type==XT5] span[ptile=8] } \
+ 32*{ select[type==XT4] span[ptile=4] }" \
crayjob
Multi-phase Resource Reservation
• The bsub -R option can contain multiple durations with multiple memory and decay requirements.
Example 1:
bsub -R "rusage[mem=(500 100):duration=(10)]" myjob
After the job runs for 10 minutes, it reserves 100M of memory and releases the other 400M.
Example 2:
bsub -R "rusage[mem=(500 400 300):duration=(20 10 5):decay=(1 0 1)]" myjob
The job reserves 500M of memory with decay for the first 20 minutes, then reserves 400M with no decay for the next 10 minutes, then reserves 300M with decay for the next 5 minutes, then reserves no memory for the rest of the job’s life cycle.
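The phase boundaries in the two examples can be traced with a small lookup. This sketch only reports which phase is active at a given minute (the nominal amount and whether decay is on); it does not model the decay curve itself, and `phase_at` is a hypothetical helper that assumes `decays`, when given, has one entry per amount.

```python
def phase_at(amounts, durations, elapsed_min, decays=None):
    """Return (nominal_amount_mb, decay_enabled) for the phase active at elapsed_min."""
    if decays is None:
        decays = [0] * len(amounts)
    start = 0
    for i, amount in enumerate(amounts):
        if i >= len(durations):
            # A trailing amount without a duration lasts for the rest of the job.
            return amount, bool(decays[i])
        end = start + durations[i]
        if elapsed_min < end:
            return amount, bool(decays[i])
        start = end
    return 0, False  # every phase has ended, nothing is reserved


# rusage[mem=(500 100):duration=(10)]
print(phase_at([500, 100], [10], 5), phase_at([500, 100], [10], 15))
# rusage[mem=(500 400 300):duration=(20 10 5):decay=(1 0 1)]
print(phase_at([500, 400, 300], [20, 10, 5], 25, [1, 0, 1]))
print(phase_at([500, 400, 300], [20, 10, 5], 40, [1, 0, 1]))
```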
Job Submission & Control Commands
• bsub [options] command [cmdargs]
• bjobs [-a][-J jobname][-u usergroup|-u all][…] jobID
• bhist [-a][-J jobname][-u usergroup|-u all][…] jobID
• bbot/btop [jobID | "jobID[index_list]"] [position]
• bkill [-J jobname] [-m] [-u ] [-q] [-s signalvalue]
• bmod [bsub_options] jobID
• bpeek [-f] jobID
• bstop/bresume jobID
• bswitch destination_queue jobID
LSF Job Submission Commands
• bsub [commonly used options]
-n # - number of CPUs required for the job
-o filename – redirect stdout, stderr and resource usage
information of the job to the specified output file
-e filename – redirect stderr to the specified error file
-i filename – use the specified file as standard input for the job
-q qname – submits the job to the specified queue
-m hname – select host(s) or host group. Keywords “all” and “others” can be used
-J jobname – assigns the specified name to the job
-Q "[exit_code] [EXCLUDE(exit_code)]" - Success exit code
LSF Job Submission Commands (cont'd)
• bsub [options]
-oo filename - same as -o, but overwrite file if it exists
-eo filename - same as -e, but overwrite file if it exists
-L login_shell - Initializes the execution environment using the specified login shell
-n number - if PARALLEL_SCHED_BY_SLOT=Y in lsb.params, then specify number of
job slots, not CPUs
-g jobgroup - submit job to specified group
-sla serviceclass - submit job to specified service class
-W runlimit - if ABS_RUNLIMIT=Y uses wall clock time
-app - Application Profiles
LSF Job Submission Commands (cont'd)
• bsub [setting limits]
-C core_limit Set the per-process core file size limit (KB)
-c cpu_time Limit the total CPU time for the job ([HH:]MM)
-cwd Specify the current working directory for the job
-W runlimit Set the run time limit for the job ([HH:]MM)
-We Specify an estimated run time for the job
LSF Job Submission Commands (cont'd)
• bsub [options]
-B Send email when the job is dispatched
-H Hold the job in the PSUSP state after submission
-N Email job header only when job completes
-b begin_time Dispatch the job after the specified time with year & time
-G user_group Associate job with specified user group (fairshare)
-L login_shell Initialize the execution environment using the specified login shell
-t term_time Specify the job termination deadline with year & time
-u mail_user Send email to the specified address
LSF Interactive Job Submission
• bsub
-I – submit an interactive job
-Ip – submit an interactive job with pseudo-tty support
-Is – submit an interactive job and create a pseudo-tty with shell mode support
-XF – submit an interactive job with SSH X11 forwarding support
• By default, LSF uses ssh for interactive X-Window jobs
bsub - Condensed Host Notations
bsub -m "host[1-12,24] host[12-64]+2" job1
• Optional Configuration in lsf.conf
– LSB_MAX_ASKED_HOSTS_NUMBER=integer
– Limits the number of hosts a user can specify with the bsub -m option. The request is rejected if
more hosts are specified than the value set
– Default value is 512
• Commands can use the notation:
bsub brun bmod brestart brsvadd brsvmod bswitch bjobs bhist bacct
brsvs bmig bpeek
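To see what a condensed host expression stands for, the bracket notation can be expanded by hand. The sketch below is not LSF code: `expand_hosts` is a hypothetical helper that handles only a single `prefix[ranges]` term and returns anything else unchanged, including terms with a trailing `+2` suffix.

```python
import re


def expand_hosts(spec):
    """Expand 'host[1-3,7]' into ['host1', 'host2', 'host3', 'host7']."""
    match = re.fullmatch(r"([^\[\]]+)\[([0-9,\-]+)\]", spec)
    if not match:
        return [spec]  # plain host name or unsupported form: leave as-is
    prefix, ranges = match.groups()
    names = []
    for part in ranges.split(","):
        if "-" in part:
            low, high = part.split("-")
            names.extend(f"{prefix}{i}" for i in range(int(low), int(high) + 1))
        else:
            names.append(f"{prefix}{part}")
    return names


expanded = expand_hosts("host[1-12,24]")
print(len(expanded), expanded[0], expanded[-1])
```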
Available Methods for Submitting Jobs
• By script or command
% cd /home/user/project_dir
% bsub -q parallel -a fluent -n 4 ./my_fluent_launcher.sh
• By job spooling
% bsub < spoolfile
• Interactively
% bsub
bsub> #BSUB -q parallel -n 4
bsub> #BSUB -a fluent
bsub> cd /home/user/project_dir
bsub> ./my_fluent_launcher.sh
bsub> ^D
Job <1234> submitted to queue <parallel>
Example spoolfile
#BSUB -q parallel
#BSUB -n 4 -a fluent
cd /home/user/project_dir
./my_fluent_launcher.sh
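A spool file like the one above can also be generated from a script. The sketch below only builds the text; `make_spoolfile` and its parameters are illustrative names, and only the -q, -n and -a options shown on this slide are covered.

```python
def make_spoolfile(commands, queue=None, slots=None, app=None):
    """Build spool file text: #BSUB option lines followed by the job commands."""
    lines = []
    if queue:
        lines.append(f"#BSUB -q {queue}")
    if slots:
        lines.append(f"#BSUB -n {slots}")
    if app:
        lines.append(f"#BSUB -a {app}")
    lines.extend(commands)
    return "\n".join(lines) + "\n"


text = make_spoolfile(
    ["cd /home/user/project_dir", "./my_fluent_launcher.sh"],
    queue="parallel", slots=4, app="fluent",
)
print(text)
```

On a host where LSF is installed, the resulting text could be fed to `bsub` on standard input, for example with `subprocess.run(["bsub"], input=text, text=True)`.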
View Job Information
bjobs Can display parallel jobs and condensed host groups in an aggregate format
-a Display information about jobs in all states (including finished jobs)
-A Display summarized information about job arrays
-d Display information about jobs that finished recently
-l|-w Display information in long or wide format
-p Display information about pending jobs
-r Display information about running jobs
-g job_group Display information about jobs in specified group
-J job_name Display information about specified job or array
-m host_list Display information about jobs on specified hosts or groups
-P project_name Display information about jobs in specified project
-q queue_name Display information about jobs in specified queue
-u user_name Display information about jobs for specified users/groups
View Job Information (cont'd)
bhist
-a Display information about all jobs (overrides -d, -p, -r, and -s)
-b|-l|-w Display information in brief, long, or wide format
-d Display information about finished jobs
-p Display information about pending jobs
-s Display information about suspended jobs
-t Display job events chronologically
-C|-D|-S|-T start_time,end_time Display information about completed, dispatched, submitted, or all
jobs in specified time window
-P project Display information about jobs belonging to specified project
-q queue Display information about jobs submitted to specified queue
-u username|all Display information about jobs submitted by user
View Submitted Job Information Example
% bjobs -u all -a
JOBID  USER   STAT  QUEUE     FROM_HOST  EXEC_HOST  JOB_NAME   SUBMIT_TIME
1233   user1  DONE  normal    training8  training1  *sortName  Nov 21 10:00
1234   user1  RUN   priority  training8  training1  *verilog   Nov 21 10:00
1235   user2  PEND  night     training9             *sortFile  Nov 21 10:03
1236   user2  PEND  normal    training9             *sortName  Nov 21 10:04
View Detailed Submitted Job Information
% bjobs -l 1235
Job <1235>, User <user2>, Project <default>, Status <PEND>, Queue <night>,
Command <night_job>
Wed Nov 21 10:03:51: Submitted from host <training9>, CWD <$HOME>;
PENDING REASONS:
Dispatch window closed: 1 queue;
SCHEDULING PARAMETERS:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
View Historical Job Information Example
% bhist
Summary of time in seconds spent in various states:
JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
2299 alfred *eep 100 5 0 2 0 0 0 7
2300 alfred *eep 100 4 0 2 0 0 0 6
2301 alfred *eep 100 4 0 2 0 0 0 6
2302 alfred *eep 100 3 0 2 0 0 0 5
View Historical Job Information Example (cont'd)
% bhist -l 2302
Job <2302>, User <alfred>, Project <default>, Command <sleep 100>
Wed Mar 30 13:53:44: Submitted from host <blade02>, to Queue
<normal>,CWD<$HOME>;
Wed Mar 30 13:53:47: Dispatched to <blade02>;
Wed Mar 30 13:53:47: Starting (Pid 24595);
Wed Mar 30 13:53:52: Running with execution home </home/alfred>, Execution CWD
</home/alfred>, Execution Pid <24595>;
Wed Mar 30 13:55:32: Done successfully. The CPU time used is unknown;
Wed Mar 30 13:55:32: Post job process done successfully;
Summary of time in seconds spent in various states by Wed Mar 30 13:55:32
PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
3 0 105 0 0 0 108
Manipulating Jobs
• bbot – moves a pending job to the bottom of the queue
• btop – moves a pending job to the top of the queue
• bkill – sends a signal to kill, suspend or resume unfinished
jobs (use a job ID of “0” to kill all your jobs).
• bmod – modifies job submission options of a job
• bpeek – displays the stdout and stderr of an unfinished job
• bstop – suspend unfinished jobs
• bresume – resumes one or more suspended jobs
• bswitch – switches unfinished jobs to another queue
LSF Job Submission Examples
• Example #1
% bsub /home/LSF/scripts/sleeper
Job <1234> is submitted to default queue <normal>
% bstop 1234
Job <1234> is being stopped
% bresume 1234
Job <1234> is being resumed
• Example #2
% bsub -q night -m "host1 host2" night_job
Job <1235> is submitted to queue <night>
% bmod -m "hostGroupA" 1235
Parameters of job <1235> are being changed
LSF Job Submission Examples (cont'd)
• Example #3
% bsub -i "/home/user1/in.dat" \
-o "/home/user1/out.dat" \
-e "/home/user1/error.dat" long_job
Job <1236> is submitted to default queue <normal>
% bswitch night 1236
Job <1236> is switched to queue <night>
• Example #4
% bsub -P Research -J Projection1 project_job
Job <1238> is submitted to default queue <normal>
% btop 1238
Job <1238> has been moved to position 1 from top
% bbot 1238
Job <1238> has been moved to position 1 from bottom
File Transfer Option (-f )
• Copies files between the local (submission) host and the remote (execution) host if
there is no shared file system:
bsub -f "local_file operator [remote_file]"
The following operators can be used:
> Copies local file to remote file before job starts
< Copies remote file to local file after job completes
<< Appends the remote file to the local file after job completes
>< or <> Copies local file to remote file before job starts, and
remote file to local file after job completes
File Transfer Examples
• Submit myjob with input file /data/data2. After the job has completed, copy
the output file out to /data/out2
bsub -f "/data/data2 > data2" \
-f "/data/out2 < out" myjob data2 out
LSF Cluster Query Commands
• lsid
• lsinfo [-l][-r][-m][-t][resourcename ...]
• lshosts [-w|-l][-R "res_req"][hostname ...]
• lsload [-N|-E][-l][-R "res_req"] [hostname... ]
• lsmon [-i][-L logfile]
• bhosts [-w|-l][-R "res_req"][hostname|hostgroup]
• bmgroup [-r][hostgroup …]
• bqueues [-w|-l|-r][-m hostname|-m all] [queuename …]
• busers [username …|usergroup …| all]
• bugroup [-r][usergroup …]
View Cluster and Master Names
% lsid
Platform LSF 8.0, Jan 20 2011
Copyright 1992-2011 Platform Computing Corporation
My cluster name is cluster8
my master name is training1
View load sharing configuration information
% lsinfo
RESOURCE_NAME  TYPE     ORDER  DESCRIPTION
r15s           Numeric  Inc    15-second CPU run queue length
maxtmp         Numeric  Dec    Maximum /tmp space (Mbytes)
cpuf           Numeric  Dec    CPU factor
hname          String   N/A    Host name
TYPE_NAME
DEFAULT
LINUX
LINUXPPC64
MODEL_NAME  CPU_FACTOR  ARCHITECTURE
PC1133      23.10       x6_1189_PentiumIIICoppermine
Server Host Configuration Information
% lshosts
HOST_NAME  type     model     cpuf  ncpus  maxmem  maxswp  server  RESOURCES
training1  LINUX86  PC1133    23.1  16     2.0G    1.0G    Yes     (mg)
training8  LINUX64  Intel64   30.3  32     6.0G    2.0G    Yes     (mg)
training5  SOL32    Ultra450  25.0  8      1.0G    740M    Yes     ()
training2  LINUX86  PC1133    23.1  1      512M    256M    Yes     ()
training3  LINUX86  PC1133    23.1  1      512M    256M    Yes     ()
training9  HPPA     HP735     -     -      -       -       No      ()
Detailed Server Host Configuration Information
% lshosts -l training3
HOST_NAME: training3
type    model   cpuf   ncpus  ndisks  maxmem  maxswp  maxtmp  rexpri  server  nprocs  ncores  nthreads
X86_64  PC6000  116.1  1      1       2007M   4094M   40319M  0       Yes     1       1       1
RESOURCES: (mg)
RUN_WINDOWS: (always open)
LICENSES_ENABLED: (LSF_Base LSF_Manager LSF_Make)
LOAD_THRESHOLDS:
r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
-     3.5  -     -   -   -   -   -   -    -    -
Server Host Load Information
% lsload
HOST_NAME  status  r15s  r1m  r15m  ut   pg   ls  it     tmp   swp   mem
training8  ok      0.0   0.0  0.0   0%   0.0  1   11728  112M  114M  52M
training2  ok      0.3   0.7  0.1   9%   0.2  0   2467   330M  201M  290M
training5  ok      1.5   2.0  0.1   25%  2.9  5   0      130M  101M  90M
training1  ok      2.9   3.0  1.6   49%  6.9  2   0      127M  143M  35M
training3  ok      3.1   4.0  2.6   69%  9.9  6   0      117M  103M  25M
Detailed server host load information (including I/O information and external load indices)
% lsload -l
HOST_NAME  status  r15s  r1m  r15m  ut   pg   io   ls  it     tmp   swp   mem   licA
training8  ok      0.0   0.0  0.0   0%   0.0  3    0   11728  112M  114M  52M   12
training2  ok      0.3   0.7  0.1   9%   0.2  12   0   2467   330M  201M  290M  12
training5  ok      1.5   2.0  0.1   25%  2.9  21   5   0      130M  101M  90M   12
training1  ok      2.9   3.0  1.6   49%  6.9  80   2   0      127M  143M  35M   12
training3  ok      3.1   4.0  2.6   69%  9.9  381  6   0      117M  103M  25M   12
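Because lsload prints one whitespace-separated row per host, its output is easy to post-process. The sketch below is illustrative Python, not an LSF tool: `parse_table` and `least_loaded` are hypothetical helpers, the sample text is copied from the lsload output on this slide, and the parser assumes no column is ever blank.

```python
# Illustrative only: parse lsload-style text and pick the host with the
# lowest 1-minute run queue length (r1m). Helper names are hypothetical.
SAMPLE = """\
HOST_NAME status r15s r1m r15m ut pg ls it tmp swp mem
training8 ok 0.0 0.0 0.0 0% 0.0 1 11728 112M 114M 52M
training2 ok 0.3 0.7 0.1 9% 0.2 0 2467 330M 201M 290M
training5 ok 1.5 2.0 0.1 25% 2.9 5 0 130M 101M 90M
training1 ok 2.9 3.0 1.6 49% 6.9 2 0 127M 143M 35M
training3 ok 3.1 4.0 2.6 69% 9.9 6 0 117M 103M 25M
"""


def parse_table(text):
    """Turn a header line plus data rows into a list of dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, fields)) for fields in rows]


def least_loaded(rows, index="r1m"):
    """Return the name of the 'ok' host with the smallest value of `index`."""
    candidates = [row for row in rows if row["status"] == "ok"]
    if not candidates:
        raise ValueError("no host is in 'ok' status")
    return min(candidates, key=lambda row: float(row[index]))["HOST_NAME"]


print(least_loaded(parse_table(SAMPLE)))
```

In practice the text would come from running lsload on a cluster host; the sample string stands in for that here.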
Current Load Information
% lsmon
Hostname: training1    Refresh Rate: 10 secs
HOST_NAME  status  r15s  r1m  r15m  ut   pg   ls  it     tmp   swp   mem
training8  ok      0.0   0.0  0.0   0%   0.0  1   11728  112M  114M  52M
training2  ok      0.3   0.7  0.1   9%   0.2  0   2467   330M  201M  290M
training5  ok      1.5   2.0  0.1   25%  2.9  5   0      130M  101M  90M
training1  ok      2.9   3.0  1.6   49%  6.9  2   0      127M  143M  35M
training3  ok      3.1   4.0  2.6   69%  9.9  6   0      117M  103M  25M
Server Host Information
% bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
training8 ok 2 32 10 5 4 1 0
training5 ok 1 8 7 6 0 1 0
training2 ok - 16 3 1 1 1 0
training1 ok - - 0 0 0 0 0
training3 closed - 1 0 0 0 0 0
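The MAX and NJOBS columns can be combined to estimate how many job slots remain on each host. The following Python sketch is illustrative: `free_slots` is a hypothetical helper, the sample text is the bhosts output above, and it assumes that NJOBS counts the slots currently in use and that a "-" in MAX means no slot limit is configured.

```python
# Illustrative only: estimate remaining job slots per host from bhosts-style text.
SAMPLE = """\
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
training8 ok 2 32 10 5 4 1 0
training5 ok 1 8 7 6 0 1 0
training2 ok - 16 3 1 1 1 0
training1 ok - - 0 0 0 0 0
training3 closed - 1 0 0 0 0 0
"""


def free_slots(text):
    """Map host name -> free slots (None when MAX is '-', 0 when not 'ok')."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    result = {}
    for fields in rows:
        row = dict(zip(header, fields))
        host = row["HOST_NAME"]
        if row["STATUS"] != "ok":
            result[host] = 0          # closed or unavailable hosts accept no jobs
        elif row["MAX"] == "-":
            result[host] = None       # assumption: '-' means no limit configured
        else:
            result[host] = max(int(row["MAX"]) - int(row["NJOBS"]), 0)
    return result


for host, slots in free_slots(SAMPLE).items():
    print(host, slots)
```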
Detailed Server Host Information
% bhosts -l training3
HOST training3
STATUS CPUF JL/U MAX NJOBS RUN SSUSP USUSP RSV DISPATCH_WINDOW
closed_Wind 10.3 - 1 0 0 0 0 0 12:00-15:00
CURRENT LOAD USED FOR SCHEDULING:
r15s r1m r15m ut pg io ls it tmp swp mem
Total 0.0 0.0 0.0 0% 0.0 1 1 11720 512M 256M 412M
Reserved 0.0 0.0 0.0 0% 0.0 0 0 0 0M 0M 0M
LOAD THRESHOLD USED FOR SCHEDULING:
r15s r1m r15m ut pg io ls it tmp swp mem
loadSched - - - - - - - - - - -
loadStop - - - - - - - - - - -
Queue Information
% bqueues
QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN  SUSP
priority    73    Open:Active  10   1     1     1     1      0     1    0
interact    70    Open:Active  1    -     -     -     5      4     1    0
night       40    Open:Inact   -    -     1     -     3      3     0    0
normal      30    Open:Active  -    1     -     -     10     4     4    2
Detailed Queue Information
% bqueues -l normal
QUEUE: normal
-- For normal low priority jobs. This is the default queue.
PARAMETERS/STATISTICS
PRIO  NICE  STATUS       MAX  JL/P  JL/H  NJOBS  PEND  RUN  SSUSP  USUSP  RSV
30    20    Open:Active  -    -     -     10     4     4    1      1      0
SCHEDULING PARAMETERS
           r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
loadSched  -     0.7  -     -   -   -   -   -   -    -    -
loadStop   -     2.0  -     -   -   -   -   -   -    -    -
USERS: all users
HOSTS: all
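The loadSched and loadStop rows in the bqueues -l output act as two thresholds on a load index: loadSched gates the dispatch of new jobs, and loadStop triggers suspension of running jobs. The Python sketch below illustrates that idea for r1m with the values shown for the normal queue (0.7 and 2.0). `scheduling_action` is a hypothetical helper, and the exact boundary handling (strictly greater than) is an assumption.

```python
# Illustrative only: how a pair of load thresholds could be interpreted for an
# index where a higher value means more load (such as r1m).
def scheduling_action(r1m, load_sched=0.7, load_stop=2.0):
    """Classify a host's 1-minute run queue length against the two thresholds."""
    if r1m > load_stop:
        return "suspend running jobs"
    if r1m > load_sched:
        return "do not dispatch new jobs"
    return "dispatch allowed"


for value in (0.3, 1.0, 2.6):
    print(value, "->", scheduling_action(value))
```

A "-" in the output means no threshold is set for that index, so in this sketch only r1m is checked.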
User and User Group Information
% busers all
USER/GROUP  JL/P  MAX  NJOBS  PEND  RUN  SSUSP  USUSP  RSV
default     -     1    -      -     -    -      -      -
user1       -     -    15     10    4    1      0      0
user6       1     4    24     20    4    0      0      0
userGroupA  1     -    35     25    7    3      0      0
View user group member information
% bugroup
GROUP_NAME  USERS
userGroupA  user2 user3 UnixAdminGrp
userGroupB  NISgroup NTgroup
bparams -a
• Displays all parameters configured in lsb.params
$ bparams -a
DEFAULT_QUEUE = normal
DEFAULT_HOST_SPEC = NULL
MBD_SLEEP_TIME = 20 (seconds)
SBD_SLEEP_TIME = 15 (seconds)
JOB_ACCEPT_INTERVAL = 1
PG_SUSP_IT = 180
CLEAN_PERIOD = 3600
MAX_JOB_NUM = 1000
MAX_SBD_FAIL = 3
HIST_HOURS = 5
DEFAULT_PROJECT = default
…
Queue Configuration: Example 1
• Queue used for short running jobs
Begin Queue
QUEUE_NAME = short # mandatory
DESCRIPTION = for short running jobs
ADMINISTRATORS = userGroupA
PRIORITY = 75
USERS = userGroupA engineer/ engineer
CPULIMIT = 2
RUNLIMIT = 5 10
CORELIMIT = 0
MEM = 800/100
SWP = 500/50
HOSTS = hostGroupC
End Queue
Queue Configuration: Example 2
Begin Queue
QUEUE_NAME = normal
DESCRIPTION = default queue for single CPU jobs
PRIORITY = 30
USERS = all
INTERACTIVE = NO
NICE = 20
MEMLIMIT = 204800 # 200MB of memory
PROCLIMIT = 1
HOSTS = all
End Queue
General submission queue
Queue Configuration: Example 3
Begin Queue
QUEUE_NAME = night
DESCRIPTION = used for jobs running at night
ADMINISTRATORS = lsfadmin user1
PRIORITY = 40
DISPATCH_WINDOW = (18:00-07:30)
RUN_WINDOW = (18:00-08:00)
RES_REQ = select[(type==SUNSOL && r1m<2.0)||\
(type==HPPA && r1m<1.0)]
HOSTS = all ~training4
End Queue
After hours queue
Queue Configuration: Example 4
Begin Queue
QUEUE_NAME = interactive
DESCRIPTION = default for interactive jobs
ADMINISTRATORS = user2 userGroupA
PRIORITY = 80
INTERACTIVE = ONLY
NEW_JOB_SCHED_DELAY = 0
HOSTS = hostGroupB+5 hostGroupA+2 others
End Queue
Interactive job queue
Disabling a Queue
• Remove the queue and its configuration from the file, or
• Comment out the Begin Queue and End Queue lines
• The parameter lines in between do not need to be commented out
# Begin Queue
QUEUE_NAME = interactive
DESCRIPTION = default for interactive jobs
ADMINISTRATORS = user2 userGroupA
PRIORITY = 80
INTERACTIVE = ONLY
NEW_JOB_SCHED_DELAY = 0
HOSTS = hostGroupB+5 hostGroupA+2 others
# End Queue
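Why commenting out only the Begin Queue and End Queue lines is enough can be seen with a toy reader that accepts parameters only inside an active section. The Python sketch below is an assumption-laden illustration, not how LSF itself parses lsb.queues: `parse_queues` is a hypothetical helper, and it does not handle backslash line continuations.

```python
# Illustrative only: a toy reader for "Begin Queue ... End Queue" sections.
SAMPLE = """\
# Begin Queue
QUEUE_NAME = interactive
PRIORITY = 80
INTERACTIVE = ONLY
# End Queue

Begin Queue
QUEUE_NAME = normal
PRIORITY = 30      # default queue
HOSTS = all
End Queue
"""


def parse_queues(text):
    """Return one dict per active queue section; stray lines are ignored."""
    queues = []
    current = None                      # None means "outside any section"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # strip comments and whitespace
        if not line:
            continue
        if line == "Begin Queue":
            current = {}
        elif line == "End Queue":
            if current is not None:
                queues.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return queues


for queue in parse_queues(SAMPLE):
    print(queue["QUEUE_NAME"], queue)
```

The parameters of the interactive queue are still present in the text, but because they fall outside any active section they are skipped.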
lsb.applications - LSF Application Profile
• Application Encapsulation
– Application-specific job containers
– Maps common execution requirements for an application
– Centralized application job submission properties
– Different applications may be submitted through the same queue
– Minimize the number of queues
– Minimize system administration overhead for maintaining queues
– User manageable application profiles
– New configuration file: lsb.applications
lsb.applications - LSF Application Profile (cont.d)
• Profiles provide a central definition for application-specific LSF
attributes including:
– Pre/Post-exec and job starter
– Automatic job controls
– Process and processor limits
– Runtime hints/estimates, in addition to limits, for improved scheduling
– Re-runnable jobs
– Re-queue exit values
– Default resource requirements
– Control of job chunking
– Estimated runtime
lsb.applications - LSF Application Profile (cont.d)
• Example: execution requirements for the FLUENT application:
• bsub -app fluent -q overnight myjob
Begin Application
NAME = fluent
DESCRIPTION = FLUENT Version 6.2
CPULIMIT = 180/hostA # 3 hours of host hostA
FILELIMIT = 20000
DATALIMIT = 20000 # jobs data segment limit
CORELIMIT = 20000
PROCLIMIT = 5 # job processor limit
PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
REQUEUE_EXIT_VALUES = 55 34 78
End Application
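REQUEUE_EXIT_VALUES lists the exit codes that cause a job to be requeued automatically. A minimal Python sketch of that check follows. `should_requeue` is a hypothetical helper, and it handles only a plain space-separated list of integers such as the one in the fluent profile.

```python
# Illustrative only: decide whether an exit code is in a requeue list.
def should_requeue(exit_code, requeue_exit_values="55 34 78"):
    """Return True when exit_code appears in the space-separated list."""
    codes = {int(value) for value in requeue_exit_values.split()}
    return exit_code in codes


for code in (0, 34, 1):
    print(code, should_requeue(code))
```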