Claudio Grandi INFN Bologna CHEP'03 Conference, San Diego March 27th 2003
BOSS: a tool for batch job monitoring and book-keeping
Claudio Grandi (INFN Bologna)
BOSS: "Batch Object Submission System"
- Is a tool for job monitoring and book-keeping
- Allows users to handle job-specific information
- Is not a job scheduler, but can be interfaced with most schedulers:
  - LSF (CERN, INFN)
  - PBS (Bristol, Caltech, UFL, Imperial College, INFN)
  - FBSNG (Fermilab)
  - Condor (INFN, U. Wisconsin)
- Has been designed to work on computing farms
- Is compatible with use on a WAN, but is not (yet) robust against network failures
Basic BOSS components
- boss executable: the BOSS interface to the user
- MySQL database: where BOSS stores job information
- jobExecutor executable: the BOSS wrapper around the user job
- dbUpdator executable: the process that writes to the database while the job is running
- The local scheduler may be a "Grid" scheduler
Basic flow
- Accepts job submissions from users
- Stores info about the job in a DB
- Builds a wrapper around the job (jobExecutor)
- Sends the wrapper to the local scheduler
- The wrapper sends info about the job to the DB
[Diagram. Labels: boss submit, boss query, boss kill, BOSS, BOSS DB, local scheduler, farm node (x2), wrapper.]
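The flow above can be sketched in a few lines. This is only an illustration, not BOSS code: an in-memory SQLite database stands in for the MySQL database, and the `STATUS` column, the status strings, and the `fork_scheduler` function are hypothetical names introduced for the example.

```python
import sqlite3

# In-memory SQLite stands in for the BOSS MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE JOB (ID INTEGER PRIMARY KEY, EXEC TEXT, STATUS TEXT)")


def submit(executable, scheduler):
    """Record the job, build a wrapper around it, hand the wrapper to the scheduler."""
    cur = db.execute(
        "INSERT INTO JOB (EXEC, STATUS) VALUES (?, 'submitted')", (executable,)
    )
    job_id = cur.lastrowid

    def wrapper():
        # The wrapper, not the user job, reports progress to the database.
        db.execute("UPDATE JOB SET STATUS = 'running' WHERE ID = ?", (job_id,))
        # ... a real wrapper would start the user executable here ...
        db.execute("UPDATE JOB SET STATUS = 'ended' WHERE ID = ?", (job_id,))

    scheduler(wrapper)
    return job_id


def fork_scheduler(wrapper):
    """Hypothetical local scheduler: runs the wrapper immediately."""
    wrapper()


def query(job_id):
    """Return the status recorded for a job."""
    return db.execute("SELECT STATUS FROM JOB WHERE ID = ?", (job_id,)).fetchone()[0]


job_id = submit("test.pl", fork_scheduler)
print(job_id, query(job_id))
```

The point of the sketch is the division of labour: the submission step only records the job and delegates to a scheduler, while all later updates come from the wrapper.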
User-defined information
- User registers a job type:
  - Schema for the information to be monitored: a new table is created in the BOSS database with a defined structure
  - Algorithms to retrieve the information from the job: the user programs (filters) are stored in the database as blobs
- User submits jobs:
  - One or more job types can be specified for the job
  - A new entry is created for the job in the database tables
  - The filters are extracted from the database and made available to the running job
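A minimal sketch of the two steps above, assuming an in-memory SQLite database in place of MySQL. The `JOBTYPE` table and the function names `register_job_type` and `submit_job` are hypothetical; only the idea (one table per job type, filters stored as blobs, one new row per submitted job) comes from the slide.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the BOSS MySQL database
db.execute("CREATE TABLE JOB (ID INTEGER PRIMARY KEY, EXEC TEXT)")
db.execute("CREATE TABLE JOBTYPE (NAME TEXT PRIMARY KEY, FILTER BLOB)")


def register_job_type(name, schema, filter_source):
    """Create the per-type table and store the filter program as a blob."""
    names = [name, *schema.keys(), *schema.values()]
    if not all(n.isidentifier() for n in names):
        raise ValueError("table, column and type names must be plain identifiers")
    columns = ", ".join(f"{col} {sql_type}" for col, sql_type in schema.items())
    db.execute(f"CREATE TABLE {name} (JOBID INTEGER, {columns})")
    db.execute("INSERT INTO JOBTYPE VALUES (?, ?)", (name, filter_source))


def submit_job(executable, job_types):
    """Add the job to JOB and to each job-type table; return its id and filters."""
    filters = {}
    for name in job_types:
        row = db.execute(
            "SELECT FILTER FROM JOBTYPE WHERE NAME = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(f"job type {name!r} is not registered")
        filters[name] = row[0]
    job_id = db.execute("INSERT INTO JOB (EXEC) VALUES (?)", (executable,)).lastrowid
    for name in filters:  # names were validated at registration time
        db.execute(f"INSERT INTO {name} (JOBID) VALUES (?)", (job_id,))
    return job_id, filters


FILTER = b"#!/usr/bin/perl\n# filter program stored verbatim\n"
register_job_type("test", {"COUNTER": "INTEGER"}, FILTER)
job_id, filters = submit_job("test.pl", ["test"])
```

After submission the `test` table holds a row for the job with an empty `COUNTER`, ready to be filled in while the job runs.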
The job interface to BOSS

User job:
    #!/usr/bin/perl
    $i = 0;
    while($i<3){
      sleep(1);
      $i++;
      print "counter $i\n";
    }

User job stdout (passed through the BOSS jobExecutor):
    counter 1
    counter 2
    counter 3

Filter:
    #!/usr/bin/perl
    while(<STDIN>){
      if($_=~/.*counter\s+(\d+).*/){
        print "COUNTER=$1\n";
      }
    }

Filter output:
    COUNTER=1
    COUNTER=2
    COUNTER=3

Journal (read by the BOSS dbUpdator):
    1234 JOB T_START xxx
    1234 JOB …… ……
    1234 test counter 1
    1234 test counter 2
    1234 test counter 3
    1234 JOB …… ……
    1234 JOB T_STOP yyy

BOSS DB, table test: JOBID COUNTER 12345 0
- The job interfaces to BOSS are its standard input, output and error streams
- The user-defined algorithms are filters that read stdin/out/err and write key=value pairs
- The keys are the user-defined schema variables
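As an illustration of what such a filter does, here is a Python sketch equivalent in spirit to the Perl filter on this slide. The function name `filter_lines` and the sample input are made up for the example; for lines containing a single `counter <n>` it emits the same `COUNTER=<n>` pairs as the Perl version.

```python
import re

# Matches "counter" followed by whitespace and a number, anywhere in a line.
COUNTER_LINE = re.compile(r"counter\s+(\d+)")


def filter_lines(lines):
    """Yield 'COUNTER=<n>' for every input line that contains 'counter <n>'."""
    for line in lines:
        match = COUNTER_LINE.search(line)
        if match:
            yield f"COUNTER={match.group(1)}"


# Lines that do not match the pattern are simply ignored.
job_output = ["counter 1", "starting step", "counter 2", "counter 3"]
updates = list(filter_lines(job_output))
print("\n".join(updates))
```

The key on the left of `=` is the name of a column in the job-type schema, which is how the output can be mapped onto the database table.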
Runtime data flow
[Diagram. Labels: USER, STDIN, STDOUT, STDERR, LOG, OUT pipe, ERR pipe, tee (x2), tee pipe (x3), RunTime Filter pipe (x3), jobExecutor, Journal, dbUpdator, BOSS DB.]
[Legend: standard input or output; standard error; other I/O streams; user supplied or returned to the user; temporary processes and files; BOSS processes and files.]
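A rough sketch of the idea behind this data flow, under simplifying assumptions: only stdout is handled, the filter runs in-process rather than as a separate program behind a pipe, and in-memory buffers stand in for the log and journal files. The names `run_wrapped` and `counter_filter` are hypothetical; the journal line layout (`<job id> <job type> <key> <value>`) follows the journal example on the previous slide.

```python
import io
import re
import subprocess
import sys


def run_wrapped(job_id, job_type, command, line_filter, log, journal):
    """Run the user job, tee its stdout to a log, and journal the filter output."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        log.write(line)  # tee: the user still receives the full output
        for key, value in line_filter(line):  # filter: extract key/value updates
            journal.write(f"{job_id} {job_type} {key} {value}\n")
    return proc.wait()


def counter_filter(line):
    """Return [('counter', n)] for lines containing 'counter <n>', else []."""
    match = re.search(r"counter\s+(\d+)", line)
    return [("counter", match.group(1))] if match else []


# A tiny stand-in user job that prints three counter lines.
user_job = [sys.executable, "-c", "for i in (1, 2, 3): print('counter', i)"]
log, journal = io.StringIO(), io.StringIO()
exit_code = run_wrapped(1234, "test", user_job, counter_filter, log, journal)
```

A separate process could then read the journal and apply each line to the database, which keeps database access out of the path of the user job's own I/O.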
Queries
- Standard queries: get job status and user-defined quantities
    % boss q -all -specific -type test
    ID  S_USR   EXECUTABLE  ST  EXE_HOST    START TIME      STOP TIME       comment   counter
    1   grandi  test.pl 15  E   pccms10.bo  14:30:00 06/06  14:30:16 06/06  ...STOP   15
    2   grandi  test.pl 15  R   pccms10.bo  14:30:02 06/06  --------------  START...  13
- Advanced queries: use SQL to query job info (standard + user defined); the output is suitable for parsing by a script
    % boss SQL -query "select JOB.ID,EXEC,counter from JOB,test WHERE JOB.ID=test.JOBID"
    3,4,23,9
    ID EXEC counter
    1 test.pl 15
    2 test.pl 13
  - Information line (3,4,23,9): number of fields, width of 1st field, …, width of nth field
  - Header line: ID EXEC counter
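A script consuming the SQL output might look like the sketch below. It assumes that the header and data lines are fixed-width, with columns laid out back to back using the widths from the information line; the slide does not spell out the exact layout, so treat that as an assumption. `parse_boss_sql` is a hypothetical helper, and the sample text is built in the code rather than captured from a real BOSS session.

```python
def parse_boss_sql(text):
    """Parse an information line, a header line, and fixed-width data rows."""
    lines = text.splitlines()
    numbers = [int(n) for n in lines[0].split(",")]
    n_fields, widths = numbers[0], numbers[1:]
    if len(widths) != n_fields:
        raise ValueError("information line does not match its field count")

    def split(line):
        # Cut the line into consecutive slices of the advertised widths.
        cells, start = [], 0
        for width in widths:
            cells.append(line[start:start + width].strip())
            start += width
        return cells

    header = split(lines[1])
    return [dict(zip(header, split(line))) for line in lines[2:] if line.strip()]


# Build a sample in the assumed layout: 3 fields of widths 4, 23 and 9.
widths = (4, 23, 9)
table = [("ID", "EXEC", "counter"), ("1", "test.pl", "15"), ("2", "test.pl", "13")]
sample = "3,4,23,9\n" + "\n".join(
    "".join(cell.ljust(w) for cell, w in zip(row, widths)) for row in table
)
rows = parse_boss_sql(sample)
```

Reading the widths from the information line means the script does not need to guess where one field ends and the next begins, even if a value contains spaces.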
Interface to the scheduler
- User registers a scheduler:
  - Scripts for job submission, deletion and query
  - The scripts are stored in the database as blobs
  - The fork scheduler is already registered
- User submits/deletes/queries jobs:
  - The scheduler can be specified for the submission
  - The boss executable fetches the scripts from the database and uses them as the interface to the scheduler
- Job submission via ClassAd file is supported
  - BOSS manages the keys it understands and passes the others to the submission script
  - User-defined keys are possible!
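The key handling for ClassAd submission can be illustrated as follows. Everything specific here is hypothetical: the set `BOSS_KEYS`, the key names in the sample, and the parser, which only understands simple `key = value;` lines and is not an implementation of the full ClassAd grammar.

```python
# Hypothetical set of keys handled by BOSS itself; the real list is not given here.
BOSS_KEYS = {"executable", "jobtype", "stdout"}


def parse_classad(text):
    """Parse simplified 'key = value;' lines (not the full ClassAd grammar)."""
    ad = {}
    for raw in text.splitlines():
        line = raw.strip().rstrip(";").strip()
        if not line or line in ("[", "]") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        ad[key.strip().lower()] = value.strip().strip('"')
    return ad


def split_keys(ad):
    """Separate the keys BOSS handles from those passed to the submission script."""
    own = {k: v for k, v in ad.items() if k in BOSS_KEYS}
    passed_on = {k: v for k, v in ad.items() if k not in BOSS_KEYS}
    return own, passed_on


classad = """
[
  executable = "test.pl";
  jobtype = "test";
  queue = "short";
]
"""
own, passed_on = split_keys(parse_classad(classad))
```

Passing unknown keys through unchanged is what makes user-defined keys possible: a site-specific submission script can give them a meaning without any change to BOSS.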
BOSS as a grid-tool
[Diagram. Labels: boss submit, boss query, boss kill, boss registerScheduler, BOSS DB, local BOSS gateway, GRID scheduler, gatekeeper (x2), farm node (x2).]
- Tested on the European DataGrid testbed
- Interface scripts included in the BOSS distribution
- See talk by P. Capiluppi
- dbUpdator uses native MySQL calls
- Proof of concept using R-GMA (from EDG-WP3) as BOSS transport layer (H. Nebrensky, Brunel Univ.)
BOSS and R-GMA
[Diagram. Labels: User Interface, Computing Element, Worker Node, Firewall, boss executable, BOSS DB, R-GMA Receiver servlets, R-GMA Registry, R-GMA Producer servlets, R-GMA enabled dbUpdator, jobExecutor starts user job, BOSS journal, User output, Input/Output Sandbox, EDG WP1 + GRAM, subscribe, lookup.]
Current use of BOSS
- CMS 2002 productions:
  - about 500,000 jobs running in about 20 regional centers
  - complete book-keeping of every single job
- CMS/EDG stress test (Nov.-Dec. 2002):
  - about 10,000 jobs submitted by 4 user interfaces on the European DataGrid testbed
  - allowed validation of jobs for which the output sandbox was lost due to EDG internals
- R-GMA demo at EDG review (Feb. 2002):
  - proof of concept
BOSS data analysis
- boss2root (by D. Bonacorsi)
  - Produces ROOT trees from BOSS MySQL tables
  - Used to analyze the data of the CMS/EDG stress test
    - complete classification of problems
    - graphical representation of results
Summary
- BOSS is a tool that allows real-time monitoring and book-keeping of batch jobs
  - User-defined information is archived for different job types
- Has been used by CMS for 2002 official productions
- Has been used during the CMS/EDG stress test in a grid environment
- Is a general tool: nothing CMS or even HEP specific
Web site: http://www.bo.infn.it/cms/computing/BOSS/