11 March 2004 Getting Ready for the Grid
SAM: Tevatron Experiments Using the Grid
• CDF and D0 Need the Grid
– Requirements, the CAF and SAM
– Grid from the User Perspective
• Grid to Meet the Need
– How SAM works
– SAM usage by D0 and CDF
• Near Future: SAMGrid
Rick St. Denis, University of Glasgow
Reviews: Director’s (technically), International Finance Committee (fiscally), FNAL PAC (for its physics merit)
Maximize physics output @ low Lumi
– L3 output rate: 80 -> 360 Hz by 2006
Spokespersons’ Requirements for CDF
50% computing outside FNAL
CDF needs the Grid
Scale of CDF Requirements
Year   THz    % offsite   CPU speed   # duals
FY04   3.7    25%         3 GHz       150
FY05   9.0    50%         5 GHz       +360
FY06   16.5   50%         8 GHz       +220
6-7 sites with 100 duals each by 2006, plus 700 at FNAL
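The offsite percentages in the table can be cross-checked against the dual counts. This is a minimal sketch, assuming (the slide does not say so) that the "# duals" column counts offsite dual-CPU nodes added each year, each contributing two CPUs at that year's clock speed:

```python
# Cross-check of the "Scale of CDF Requirements" table.
# Assumption (not stated in the source): "# duals" counts offsite
# dual-CPU nodes added per year, each with 2 CPUs at that year's speed.
ROWS = [
    # (fiscal year, total THz needed, offsite fraction, GHz per CPU, duals added)
    ("FY04", 3.7, 0.25, 3.0, 150),
    ("FY05", 9.0, 0.50, 5.0, 360),
    ("FY06", 16.5, 0.50, 8.0, 220),
]


def offsite_capacity(rows):
    """Return {year: (cumulative offsite THz from duals, offsite THz required)}."""
    result = {}
    cumulative_thz = 0.0
    for year, total_thz, fraction, ghz, duals in rows:
        cumulative_thz += duals * 2 * ghz / 1000.0  # GHz -> THz
        result[year] = (cumulative_thz, total_thz * fraction)
    return result


if __name__ == "__main__":
    for year, (have, need) in offsite_capacity(ROWS).items():
        print(f"{year}: duals provide {have:.2f} THz, offsite target {need:.2f} THz")
```

Under that assumption the cumulative offsite capacity (about 0.9, 4.5 and 8.0 THz) tracks the offsite targets (about 0.9, 4.5 and 8.3 THz), which suggests the reading is at least plausible.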
CDF Computing Model
• Develop Analysis on desktop
– Access to all CDF data from anywhere
• Large scale processing on batch clusters
– Submission from anywhere
– Interactive tools: ls, top, head/tail/cat
– Output to scratch space or desktop
Implemented now with CAF (not Grid standard)
Exists Now
Central Analysis Facility
• CAF is a pile of PCs with a pile of disks (1200 processors and 100 TB).
• This can be implemented anywhere as dCAF: Decentralized CAF.
• Output of jobs can go to the desktop or to a scratch area.
• A password is needed for this: authentication uses Kerberos.
Sequential Access through Metadata
• Metadata: SAM groups files into datasets by their attributes (metadata), such as production pass version or top quark mass.
• File Retrieval: SAM moves files to users as they request them.
• File Storage: SAM allows output files to be stored with new metadata.
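The idea of defining a dataset by metadata attributes can be sketched in a few lines. This is an illustration only: `FileRecord`, `select_dataset`, the file names and the attribute values are hypothetical and are not SAM's API or real catalogue entries.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One catalogued file with free-form metadata attributes (hypothetical model)."""
    name: str
    metadata: dict = field(default_factory=dict)


def select_dataset(catalogue, **criteria):
    """Return names of files whose metadata matches every given attribute."""
    return [
        rec.name
        for rec in catalogue
        if all(rec.metadata.get(key) == value for key, value in criteria.items())
    ]


# Invented example entries, for illustration only.
CATALOGUE = [
    FileRecord("top_a.root", {"production_pass": "5.3", "top_mass": 175}),
    FileRecord("top_b.root", {"production_pass": "5.3", "top_mass": 178}),
    FileRecord("top_c.root", {"production_pass": "4.9", "top_mass": 175}),
]

if __name__ == "__main__":
    print(select_dataset(CATALOGUE, production_pass="5.3"))
    print(select_dataset(CATALOGUE, production_pass="5.3", top_mass=175))
```

The point of the design is that users name a dataset by what the files are, not by where they live; location is left to the system.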
Metadata
[sam@nglas08 ~]$ sam get metadata --file=Bs_conc_4o5_3.root
File Type: SAMMC Data File
File Name: Bs_conc_4o5_3.root
File ID: 2494282
File Size: 530926740 [B]
File Start Time: 01/29/2004 16:00:00
File End Time: 01/29/2004 17:00:00
Application Family: generator
Application Version: 1.00
Description: BsDspi_phipi MONTE CARLO Dataset 4o5 part 3
Run Number: 167634
totalevents = 7290
Work Group: cdf
Node Name: cdfsam.cnaf.infn.it
dataset = BsMC-lucchesi_test
html = http://www.pd.infn.it/~lucchesi
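A record like this is easy to post-process. The sketch below parses a few of the lines shown on the slide and derives the average event size; it assumes only the "Key: value" and "key = value" layout seen here, and the real output of `sam get metadata` may differ in detail.

```python
# A few lines copied from the slide; the real command output may be laid out differently.
SAMPLE = """\
File Name: Bs_conc_4o5_3.root
File ID: 2494282
File Size: 530926740 [B]
Run Number: 167634
totalevents = 7290
dataset = BsMC-lucchesi_test
"""


def parse_metadata(text):
    """Parse 'Key: value' and 'key = value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        for sep in (" = ", ": "):
            if sep in line:
                key, value = line.split(sep, 1)
                fields[key.strip()] = value.strip()
                break
    return fields


def bytes_per_event(fields):
    """Average event size from the 'File Size' and 'totalevents' fields."""
    size_bytes = int(fields["File Size"].split()[0])  # drop the "[B]" unit
    return size_bytes / int(fields["totalevents"])


if __name__ == "__main__":
    record = parse_metadata(SAMPLE)
    print(record["dataset"], round(bytes_per_event(record)))
```

For this file the average comes out at roughly 73 kB per event.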
Use Cases
• User Level MC Production
– All Users have access
– No data on site -> write to tape at FNAL
• User Level Data Access
– All users have access
– Selected samples automatically copied on site
SAM provides this
Functionality
• User selects a place to run, saying what dataset they will use
• System checks they can do this (privileges)
• User access to data at any place
• User output is stored on any disk or written back to tape at FNAL; results are made available for transfer to any site for others to analyse.
User Perspective
[Diagram: the user's analysis program goes through the CAF GUI/CLI to the Grid, whose sites are Toronto, Korea, Italy, Taiwan, FermiCAF and the UK. Labels: "Only Fermilab: uses SAM"; "Outside Lab / Grid: uses SAM".]
Meeting the Needs
• SAM: How it works
• Progress in SAM
• CDFGridWorkshop: “Nerd’s Paradise”
• D0 and CDF Usage
[Diagram: node fcdfdata016 with its disk/cache runs the station central-analysis: the station daemon (smaster), a stager daemon (stagerng) and the FSS daemon (fss). Further disk/cache nodes each run their own stager daemon (stagerng).]
A Farm: Station with Stagers and Caches
[Diagram: one station (smaster) serving Node1 to Node5, each with its own cache and stager (stagerng).]
What can 20 duals and 6 TB do?
Stream                 Events   Days   Input Size
Top, W/Z               20.5M    10.3   4.5 TB
Hadronic B and charm   156M     78.3   34.2 TB
Need to transfer 0.6 GB/min or 1 TB/Day
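The transfer figure can be checked with a little arithmetic. This sketch uses only the numbers on the slide and assumes 1 TB = 1000 GB. Reading the quoted rate as the two streams running together is an interpretation, not something the slide states.

```python
STREAMS = {
    # stream: (events, processing days, input size in TB), taken from the slide
    "Top, W/Z": (20.5e6, 10.3, 4.5),
    "Hadronic B and charm": (156e6, 78.3, 34.2),
}


def tb_per_day(size_tb, days):
    """Average input rate needed to keep the farm fed."""
    return size_tb / days


def gb_per_min_to_tb_per_day(gb_per_min):
    """Convert GB/min to TB/day, assuming 1 TB = 1000 GB."""
    return gb_per_min * 60 * 24 / 1000.0


if __name__ == "__main__":
    for name, (events, days, size_tb) in STREAMS.items():
        rate = tb_per_day(size_tb, days)
        print(f"{name}: {rate:.2f} TB/day, {events / days / 1e6:.2f}M events/day")
    print(f"0.6 GB/min = {gb_per_min_to_tb_per_day(0.6):.3f} TB/day")
```

Each stream alone needs about 0.44 TB/day at about 2M events/day, and 0.6 GB/min is 0.864 TB/day, so the quoted "1 TB/Day" is a rounded figure that is close to both streams combined.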
[Animation: node fcdfdata016 hosts the disks/cache, the station central-analysis (smaster) and a stager (stagerng). The user submits a job from the fcdfdata016 prompt:]
<fcdfdata016> sam submit --script=userscript --group=groupname --cpu-per-event= --defname=
>>>>>> Starting project with the Station Master
Station Master contacted, result: Started project 49008(49008_sam_) for group test
Waiting for the project to initialize...
Callback from server: 'OK|Project is ready'
[Animation: a project master (pmaster) now runs for the project alongside smaster and stagerng.]
>>>>>> Submitting the job to the batch system.
Job <52554> is submitted to queue <sam_lo>.
[Animation: the job waits in the batch system (LSF) as "52554 <user> PSUSP" while the optimizer schedules transfers and eworker processes copy the requested files from Enstore into the disks/cache with encp.]
[Animation: the job state changes to "52554 <user> RUN". The batch job runs samscript.sh, which starts the userscript; the userscript runs as a consumer of the project (pmaster) while eworkers continue to fetch files from Enstore with encp.]
SAMManager:sam Getting next input file...
SAMManager:sam Project master will call back.
[Animation: the consumer works through the delivered files while the remaining encp transfers from Enstore complete; files are then deleted from the cache (rm).]
[Animation: the optimizer starts eworkers again, this time copying files with rcp from another cache ("Other Cache") rather than from Enstore.]
[Animation: the job keeps running until the userscript has consumed its files. The batch job, samscript.sh, userscript and consumer then disappear, followed by the project master (pmaster), leaving the station (smaster), stager (stagerng), disks/cache and batch system (LSF).]
<fcdfdata016> sam submit….
<fcdfdata016> sam submit….
<fcdfdata016> sam run project…
[Animation: several users share one station. The batch system (LSF) shows "52668 <user1> RUN", "52675 <user2> RUN" and "52756 <user3> PSUSP"; each job has its own project master (pmaster), and the running jobs have their own samscript.sh, userscript and consumer. Eworkers fetch files with rcp from another cache and with encp from Enstore.]
SAM Animation
worldScenerio.html
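The file-delivery sequence in the animation (local cache, rcp from another cache, encp from Enstore, then rm) can be sketched as a toy model. The class and function names are hypothetical and are not SAM's API. Releasing each file right after use is an assumption made to keep the example short; it is not SAM's actual cache policy.

```python
from collections import deque


class Station:
    """Toy station: serves files from local cache, another cache, or tape."""

    def __init__(self, local_cache, other_cache, tape):
        self.local_cache = set(local_cache)
        self.other_cache = set(other_cache)
        self.tape = set(tape)
        self.log = []  # (file, transfer method) in delivery order

    def deliver(self, filename):
        """Make one file available locally and record how it was obtained."""
        if filename in self.local_cache:
            method = "cache hit"
        elif filename in self.other_cache:
            method = "rcp"   # copy from another cache
        elif filename in self.tape:
            method = "encp"  # copy from the Enstore tape system
        else:
            raise FileNotFoundError(filename)
        self.local_cache.add(filename)
        self.log.append((filename, method))
        return filename

    def release(self, filename):
        """Drop a consumed file from the local cache (the 'rm' frame)."""
        self.local_cache.discard(filename)


def run_project(station, dataset, process):
    """Consumer loop: ask for the next file, process it, release it."""
    pending = deque(dataset)
    results = []
    while pending:
        path = station.deliver(pending.popleft())
        results.append(process(path))
        station.release(path)
    return results


if __name__ == "__main__":
    station = Station(local_cache={"a"}, other_cache={"b"}, tape={"c"})
    run_project(station, ["a", "b", "c"], process=str.upper)
    print(station.log)
```

The user script only ever asks for "the next file"; which of the three sources supplies it is the station's decision.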
Storing Files
Getting things to tape from Glasgow
<fcdfdata016> sam store descrip.py --source=<file loc>[--dest=/pnfs…..]
[Animation: node fcdfdata016 with its disks runs the FSS (fss) of station central-analysis and a stager (stagerng). descrip.py holds the metadata (information about the file). SAM checks the information and the location, then an eworker transfers the file using encp, rcp or bbftp.]
[Animation: storing a file from a node "really far away". The remote node has a disk and its own FSS and stager; its FSS routes through fcdfdata016. After "sam store", an eworker copies the file with bbftp to a temporary disk on fcdfdata016, where the central-analysis FSS and stager take over. A second eworker writes the file to Enstore with encp, and the temporary copy is then removed (rm).]
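The routed store shown in these frames amounts to a short, ordered plan of transfers. A minimal sketch follows; `plan_store` and the node name `glasgow-node` are hypothetical, and the direct path for a local node is an assumption based on the local store frames rather than a documented rule.

```python
def plan_store(source_node, gateway_node="fcdfdata016", routed=True):
    """Return the ordered (source, destination, tool) steps for storing one file to tape."""
    if not routed or source_node == gateway_node:
        # Local case: write straight to the tape system.
        return [(source_node, "Enstore", "encp")]
    tmp = f"{gateway_node}:tmp"
    return [
        (source_node, tmp, "bbftp"),  # wide-area hop to the gateway's temporary disk
        (tmp, "Enstore", "encp"),     # local hop from temporary disk to tape
        (tmp, None, "rm"),            # remove the temporary copy
    ]


if __name__ == "__main__":
    for step in plan_store("glasgow-node"):
        print(step)
```

Splitting the path this way keeps the wide-area tool (bbftp) and the tape tool (encp) on the sides of the network where each works best.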
D0 SAM
D0 relies entirely on SAM for analysis
[Plots of D0 SAM usage:]
• D0 Files: 4000-8000 files/day
• D0 Data Volume: 1-3 TB/day
• D0 Files per Month by Year (1999-2003, Run II start marked): 100,000 files
• D0 Total Files: 2.5 million files served
• D0 Data per Month by Year (1999-2003, Run II start marked): 50 TB per month
• D0 Total Data Moved: 700 TB moved
Progress in SAM: CDF
• All 800,000 CDF data files are in SAM
• SAM is in beta testing on the CDF CAF (1200 CPUs): passed 20 TB/day delivery
• Karlsruhe uses SAM routinely
• MINOS uses SAM for its data handling
• Steve Mrenna (Phenomenology) is depositing ALPGEN files in SAM for common CDF/D0 use.
Florida workshop:
• 11 installations in about 2 hours. Integrated with dCAF in 2 cases in 2 days.
• 3 in Asia, 4 in Europe
• 6 sites committed to summer 2004 usage of their facilities for all of CDF (mostly MC)
• SAM installation now: initsam cdf <stationname>
• Follow-up on April 1.
• Each site has a local user support person to reduce load on core development team.
• Generally: Security ate 80% of the effort!
Now 20!
CDF
Installations progress
Participating institutes: installation and testing progress (Florida Workshop: After 2 Days)
INSTITUTE | krb5 | Caf Head | Caf Node | DCAF Works | CDF Sam Software | Sam Station | sam_par_ret | Sam AC++Dump | Sam File Store | Sam File Store Remote | Sam AC++Dump on CAF
MIT Yes ?
Korea Yes Yes Yes Yes Yes knu Yes Yes
Pisa Yes Yes Yes Yes Yes pisa Yes Yes Yes Yes Yes
Japan Yes Yes Yes Problems Yes japan Yes Yes Yes
Karlsruhe Yes Yes Yes Problems Yes fzzka Yes Yes Yes Yes Yes
Liverpool Yes Yes Yes Problems Yes liverpool Yes Yes Yes
Toronto Yes In progress Yes toronto Yes
Taiwan Yes Yes Yes Yes Yes taiwan Yes Yes
TTU Yes -ttu,-ttu-phys Yes
Glasgow Yes In Progress Yes glasgow Yes Yes
UCSD Yes Yes Yes Yes Yes ucsd Yes
CNAF Yes Yes Yes Yes Yes cnaf Yes Yes Yes Yes Yes
[Plots of CDF SAM usage:]
• Karlsruhe: 2 TB/day
• CDF dCache on CAF: all CDF on CAF reads 25 TB/day (non-Grid running)
• CDF Files in a Month: Karlsruhe, 1500 files/day
• CDF Events Transfer in a Month: Karlsruhe, 5-10M events/day
• All CDF Files Moved by SAM (2002-2003): 300K files (D0: 2.5M files)
• Total CDF Data Moved (2002-2003): 200 TB (D0: 700 TB)
Advantage of Local Processing
• Karlsruhe processes 2TB/day. Rest of CDF on Central Cluster processes 25TB/day.
• 5 active users at Karlsruhe versus 100 for the rest of CDF: each Karlsruhe user gets 1.6 times the resources (0.4 vs 0.25 TB/day per user).
• They pin the datasets that are of interest and when new ones come, copy them in automatically.
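The factor of 1.6 follows from the per-user throughput at each site. A short check, using only the figures on this slide:

```python
def tb_per_user_per_day(tb_per_day, active_users):
    """Data processed per active user per day."""
    return tb_per_day / active_users


karlsruhe = tb_per_user_per_day(2.0, 5)    # Karlsruhe: 2 TB/day, 5 users
central = tb_per_user_per_day(25.0, 100)   # central cluster: 25 TB/day, 100 users
ratio = karlsruhe / central

if __name__ == "__main__":
    print(f"Karlsruhe {karlsruhe:.2f}, central {central:.2f} TB/user/day, ratio {ratio:.1f}")
```

This treats data volume processed as the measure of "resources", which is how the slide's numbers combine to give 1.6.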
In the near-term future: JIM
Adding Grid Standard Tools
CDF Grid Strategy
• 25% of CDF computing from external resources. All CDF computing on CDF Grid by April 15: utilize resources fully controlled by CDF (Kerberos/fbsng: dCAF + SAM)
• October 15, 2004: JIM to capture shared resources
• June 2005: 50% of computing resources external
Simple JIM
[Diagram: Desktop (anywhere); Condor Submitter (@ regional centers); SAM DB and Condor Matchmaker (@ FNAL); Globus GK, CAF Submitter and SAM Station (@ each site); worker nodes (WN) and dCache on private LANs. Milestones: June 2004 testing, June 2005 required.]
Detailed JIM
[Diagram: flow of job, data and meta-data between the following components:
– Job handling: User Interface, Submission, Global Job Queue, Grid Client, Match Making, Resource Selector, Info Collector, Info Gatherer
– Global DH Services: SAM Naming Server, SAM Log Server, Resource Optimizer, SAM DB Server, RC MetaData Catalog, Bookkeeping Service
– Site: Data Handling (SAM Station (+other servs), SAM Stager(s), MSS, Cache), Grid Gateway, JIM Advertise, Local Job Handling (Local Job Handler: CAF, D0MC, BS, ...), Cluster, Worker Nodes, AAA, Dist. FS
– Information and monitoring: Info Manager, XML DB server, Site Conf., Glob/Loc JID map, Info Providers, MDS, Web Serv, Grid Monitoring, User Tools]
Conclusions
• CDF has embraced the need for the Grid to achieve its physics mission
• SAM is working for D0 and growing in CDF