Transcript
Page 1: Using ssh as portal - The CMS CRAB over glideinWMS experience

CHEP 2013 Using SSH as portal for CMS 1

CHEP 2013

Using ssh as portal

The CMS CRAB over glideinWMS experience

by I Sfiligoi¹, S Belforte², J Letts¹, T Martin¹, M D Saiz Santos¹ and F Fanzago³

¹University of California San Diego
²Università e INFN di Trieste
³Università e INFN di Padova

Page 2: Using ssh as portal - The CMS CRAB over glideinWMS experience

Agenda

● How did we get there?

● What did we do?

● How did it work out?

Page 3: Using ssh as portal - The CMS CRAB over glideinWMS experience

CMS AnaOps

● CMS is one of the 4 experiments at the Large Hadron Collider

● CMS Analysis Operations (AnaOps) is a computing task focused on the operational aspects of enabling physics data analysis
  – i.e. it provides the computing infrastructure used by the actual physicists

Page 4: Using ssh as portal - The CMS CRAB over glideinWMS experience

CRAB: CMS AnaOps submission tool

● CMS AnaOps has been using CRAB as the user interface to the Grid since 2004

● Initially it was a purely client-side interface
  – CRAB was basically just a fancy wrapper
  – So the Client machine had to have the full Grid stack to function

Page 5: Using ssh as portal - The CMS CRAB over glideinWMS experience

CRAB2 Server

● CRAB2 Server introduced in 2007

● Client does not need the Grid stack anymore

Page 6: Using ssh as portal - The CMS CRAB over glideinWMS experience

glideinWMS

● In 2009 CMS added glideinWMS to the list of supported middleware

● glideinWMS, based on HTCondor, does not support remote job submission
  – CRAB2 Server had to sit on the same node as (one of) the HTCondor scheduler(s)

Page 7: Using ssh as portal - The CMS CRAB over glideinWMS experience

glideinWMS and CRAB2 Server

[Figure: A schematic view. Surviving labels: HTCondor Scheduler, glideinWMS]

Page 8: Using ssh as portal - The CMS CRAB over glideinWMS experience

Operational experience

● CMS AnaOps experienced a lot of operational issues with CRAB2 Server submitting through glideinWMS
  – One major problem being the server losing track of job status

● Users were complaining a lot

Page 9: Using ssh as portal - The CMS CRAB over glideinWMS experience

A look at the CRAB2 Server

● Designed to be very modular and flexible
  – But that made it complex, too

● CMS decided to give up on fixing it
  – CRAB3 would simply replace it

Page 10: Using ssh as portal - The CMS CRAB over glideinWMS experience

A look at the CRAB2 Server (continued)

● But by late 2012, CRAB3 still did not exist

Page 11: Using ssh as portal - The CMS CRAB over glideinWMS experience

What was wrong with CRAB2 Client?

● The CRAB2 Client never went away
  – A significant fraction of CMS users kept using it after CRAB2 Server was put in production
  – And they were, by and large, liking it

● It just could not be used to submit to glideinWMS
  – Due to the lack of (native) remote submission to HTCondor

Page 12: Using ssh as portal - The CMS CRAB over glideinWMS experience

Why not just use ssh?

● The major stated benefit of CRAB2 Server was “remote submission using a standard interface” (there were of course other stated benefits, but they had very little impact when paired with glideinWMS)

● But what do users normally use for remote access?
  – SSH

● So, what if we used ssh in the CRAB2 Client as well?
  – The gsissh dialect, to support x509 proxies

Page 13: Using ssh as portal - The CMS CRAB over glideinWMS experience

The ssh-enabled CRAB2 Client

● There are three stages in the CRAB2 Client
  – Job submission
    ● Includes uploading the input sandbox
  – Job monitoring
  – Output log fetching

● The Client uses ssh and/or scp to talk to a node running the HTCondor scheduler
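
The three stages above can be sketched as plain command lines. This is a minimal illustration, not the CRAB2 code: the host name, the remote directory layout, the `logs` subdirectory and all function names are assumptions made for the example. It uses the `gsissh` and `gsiscp` commands from GSI-OpenSSH together with the standard HTCondor `condor_submit` and `condor_q` tools, and only builds the argument lists without running them.

```python
import shlex
from typing import List

# Hypothetical scheduler node; not taken from the talk.
SCHEDD_HOST = "submit.example.org"


def upload_sandbox_cmd(local_sandbox: str, remote_dir: str,
                       host: str = SCHEDD_HOST) -> List[str]:
    """Submission, step 1: copy the input sandbox to the scheduler node."""
    return ["gsiscp", local_sandbox, f"{host}:{remote_dir}/"]


def submit_cmd(remote_dir: str, submit_file: str,
               host: str = SCHEDD_HOST) -> List[str]:
    """Submission, step 2: run condor_submit on the scheduler node."""
    remote = (f"cd {shlex.quote(remote_dir)} && "
              f"condor_submit {shlex.quote(submit_file)}")
    return ["gsissh", host, remote]


def monitor_cmd(cluster_id: int, host: str = SCHEDD_HOST) -> List[str]:
    """Monitoring: query the job status with condor_q."""
    return ["gsissh", host, f"condor_q {int(cluster_id)}"]


def fetch_logs_cmd(remote_dir: str, local_dir: str,
                   host: str = SCHEDD_HOST) -> List[str]:
    """Fetching: copy the output logs back (assumed to live in <remote_dir>/logs)."""
    return ["gsiscp", "-r", f"{host}:{remote_dir}/logs", local_dir]
```

Each returned list could be handed to `subprocess.run(cmd, check=True)`; keeping command construction separate from execution makes the wrapper easy to inspect and test.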

Page 14: Using ssh as portal - The CMS CRAB over glideinWMS experience

The ssh-enabled CRAB2 Client

[Figure: A schematic view. On the Client node, crab_submit, crab_monitor and crab_fetch talk over ssh to the HTCondor Scheduler, which feeds glideinWMS. Surviving command labels: scp input_sandbox, ssh condor_submit, ssh condor_q, scp output_logs]

Page 15: Using ssh as portal - The CMS CRAB over glideinWMS experience

Server side setup

● gsisshd is a standard package
  – Just installed it from RPM

● Using lcmaps for authorization
  – With callbacks to GUMS in the USA, and Argus in Europe

● Each user gets a standard Linux account
  – No technical limits on what the user can run, but we would kill any offenders, if spotted

Page 16: Using ssh as portal - The CMS CRAB over glideinWMS experience

Users liked it!

● Took only 4 months for all users to completely switch from Server to ssh-Client for glideinWMS

● And another 4 months to get 90% of all work

[Figure label: ssh-based Client]

Page 17: Using ssh as portal - The CMS CRAB over glideinWMS experience

Smooth operations

● We have not seen any major problems with the ssh-based CRAB2 Client itself
  – Now almost all problems are in the Grid layer

● Maybe not too surprising
  – CRAB2 Client is really just a fancy wrapper
  – SSH is a very mature tool

Operational load is now at a historical low, in spite of increasing usage

Page 18: Using ssh as portal - The CMS CRAB over glideinWMS experience

Works at scale

[Figure: plots annotated "Includes non-ssh jobs as well" and "Essentially ssh-only"]

Page 19: Using ssh as portal - The CMS CRAB over glideinWMS experience

Hiccups during the voyage

● We occasionally hit an ssh quirk
  – In order to speed up ssh connections, we re-use the ssh control session (-S)
  – If it got stuck, the user needed to remove it manually

● CRAB2 Client now automatically detects and fixes the problem
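
The talk does not say how the Client detects a stuck control session, so the following is only one possible approach, sketched under stated assumptions. It asks the master process whether it is still alive with OpenSSH's `-O check` control command and deletes the control socket when the check fails or hangs. The function names, the timeout value and the use of plain `ssh` rather than `gsissh` are illustrative choices, not the CRAB2 implementation.

```python
import os
import subprocess


def control_session_alive(ctl_path: str, host: str,
                          run=subprocess.run, timeout: float = 10.0) -> bool:
    """Return True if the master behind the control socket answers '-O check'."""
    cmd = ["ssh", "-S", ctl_path, "-O", "check", host]
    try:
        result = run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # A check that never returns is treated as a stuck session.
        return False
    return result.returncode == 0


def remove_stale_control_socket(ctl_path: str, host: str,
                                run=subprocess.run) -> bool:
    """Delete the control socket if its master is gone or stuck.

    Returns True when a stale socket was removed, False otherwise.
    """
    if not os.path.exists(ctl_path):
        return False
    if control_session_alive(ctl_path, host, run=run):
        return False
    os.remove(ctl_path)
    return True
```

Calling `remove_stale_control_socket(...)` before each connection would let the next ssh invocation create a fresh control session. The `run` parameter exists only so the logic can be exercised without a real ssh server.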

Page 20: Using ssh as portal - The CMS CRAB over glideinWMS experience

Remaining issue

● The major remaining issue is the distribution of the CRAB2 Client code
  – This still must be installed on the client node
  – Any change must be pushed to ALL the users
    ● Making server-side changes challenging

Page 21: Using ssh as portal - The CMS CRAB over glideinWMS experience

Conclusions

● Using (gsi)ssh as a portal has proven to be very effective

● By removing complexity from the code we have to maintain, life got much easier

Page 22: Using ssh as portal - The CMS CRAB over glideinWMS experience

Acknowledgments

● The idea of using ssh came from the RCondor package (see poster #322)

● This work was partially sponsored by the US National Science Foundation under Grants No. PHY-1148698 and PHY-1120138.
