19 February 2004 SAMGrid Project Review
SAMGrid: Future Plans
• CDF Accepts the Need for the Grid
  – Requirements
• D0 Relies on the Grid
  – Requirements
• How to Meet the Need
  – Status of SAMGrid
  – The Grid Tools
Rick St. Denis, University of Glasgow
Director’s review, International Finance Committee: 50% of computing outside FNAL
Maximize physics output at low luminosity
  – L3 output rate: 80 -> 360 Hz by 2006
Spokespersons’ Requirements for CDF
CDFGrid supported by FNAL PAC
CDF needs the Grid
Scale of CDF Requirements
Year   THz    % offsite   CPU speed   # duals
FY04   3.7    25%         3 GHz       150
FY05   9.0    50%         5 GHz       +360
FY06   16.5   50%         8 GHz       +220
6-7 sites, 100 duals each, by 2006 + 700 @ FNAL
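As a rough cross-check of the table, the sketch below recomputes the dual counts from the THz and offsite-fraction columns. It rests on two assumptions that the slide does not state: a "dual" is a two-CPU node running at the quoted CPU speed, and each year's "# duals" is the offsite increment bought at that year's speed. Under those assumptions the quoted counts are reproduced to within about 10%.

```python
# (fiscal year, total CPU need in THz, offsite fraction, CPU speed in GHz)
# Values are copied from the requirements table.
REQUIREMENTS = [
    ("FY04", 3.7, 0.25, 3.0),
    ("FY05", 9.0, 0.50, 5.0),
    ("FY06", 16.5, 0.50, 8.0),
]

CPUS_PER_DUAL = 2  # assumption: a "dual" is a two-CPU node


def incremental_offsite_duals(rows):
    """Return {fiscal_year: number of duals to add offsite in that year}."""
    added = {}
    installed_ghz = 0.0  # offsite capacity already in place from earlier years
    for year, total_thz, offsite_fraction, cpu_ghz in rows:
        offsite_ghz = total_thz * 1000.0 * offsite_fraction
        new_ghz = offsite_ghz - installed_ghz
        added[year] = new_ghz / (CPUS_PER_DUAL * cpu_ghz)
        installed_ghz = offsite_ghz
    return added


if __name__ == "__main__":
    for year, duals in incremental_offsite_duals(REQUIREMENTS).items():
        print(f"{year}: about {duals:.0f} duals")
```

The three increments sum to roughly 750 duals, in line with the 6-7 sites of 100 duals each quoted for 2006.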
What can 20 duals and 6 TB do?
Stream                      Events    Days   Input size
A: Top, W/Z                 20.5 M    10.3   4.5 TB
H: Hadronic B and charm     156 M     78.3   34.2 TB
Need to transfer 0.6 GB/min or 1 TB/Day
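The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: it assumes decimal units (1 TB = 1000 GB = 1e9 kB), and the function names are illustrative. A sustained 0.6 GB/min is 0.864 TB/day, which the slide rounds to 1 TB/day; both streams come out at roughly 2 million events and 0.44 TB of input per processing day, or about 220 kB per event.

```python
MINUTES_PER_DAY = 60 * 24


def gb_per_min_to_tb_per_day(gb_per_min, gb_per_tb=1000.0):
    """Convert a sustained transfer rate from GB/min to TB/day."""
    return gb_per_min * MINUTES_PER_DAY / gb_per_tb


def stream_rates(events_millions, days, input_tb):
    """Derive per-day and per-event figures for one stream of the table."""
    return {
        "events_per_day_millions": events_millions / days,
        "tb_per_day": input_tb / days,
        # 1 TB = 1e9 kB in decimal units
        "kb_per_event": input_tb * 1e9 / (events_millions * 1e6),
    }


if __name__ == "__main__":
    print(f"0.6 GB/min = {gb_per_min_to_tb_per_day(0.6):.3f} TB/day")
    print("Stream A:", stream_rates(20.5, 10.3, 4.5))
    print("Stream H:", stream_rates(156.0, 78.3, 34.2))
```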
CDF Computing Model
• Develop analysis on desktop
  – Access to all CDF data from anywhere
• Large-scale processing on batch clusters
  – Submission from anywhere
  – Interactive tools: ls, top, head/tail/cat
  – Output to scratch space or desktop
Implemented Now with CAF
Use Cases for Summer 2004
• User Level MC Production
  – All CDF users have access
  – No data on site -> SAM write
• User Level Data Access
  – All users have access
  – Selected samples on site: full SAM support
SAM Essential for Summer 2004
Medium Term Vision
• Many Sites
• Fully transparent submission to all of CDF resources: 75% FNAL, 25% outside
• Fully transparent input and output of data
• Farm future: not as a special facility
Summer 04 Functionality
• User selects submission site, saying what dataset they will use
• System checks they can do this (privileges)
• User access with SAM/dCache
• User registers output with SAM
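The four steps above can be sketched as a toy workflow. Everything here is hypothetical: the `Catalog` class is an in-memory stand-in for the data-handling layer and does not reflect SAM's or dCache's actual interfaces; it only illustrates the order of the steps (select site and dataset, check privileges, access input, register output).

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy in-memory stand-in for the data-handling catalog (not SAM's API)."""
    datasets: dict    # dataset name -> list of file names
    privileges: dict  # user -> set of (site, dataset) pairs the user may use
    registered: list = field(default_factory=list)

    def can_run(self, user, site, dataset):
        return (site, dataset) in self.privileges.get(user, set())

    def files(self, dataset):
        return list(self.datasets[dataset])

    def register_output(self, user, filename):
        self.registered.append((user, filename))


def run_job(catalog, user, site, dataset, process):
    # Step 1: the user has chosen `site` and declared `dataset`.
    # Step 2: the system checks privileges before anything runs.
    if not catalog.can_run(user, site, dataset):
        raise PermissionError(f"{user} may not use {dataset} at {site}")
    # Step 3: input files are delivered by the data-handling layer.
    outputs = [process(name) for name in catalog.files(dataset)]
    # Step 4: the user registers the output files.
    for out in outputs:
        catalog.register_output(user, out)
    return outputs
```

A call such as `run_job(catalog, "alice", "UK", "top", my_analysis)` would either raise `PermissionError` at step 2 or return the list of registered output names.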
CDF Grid from a User Perspective
[Figure: the CAF GUI/CLI and AC++ connect through the Grid to Toronto, Korea, Italy, Taiwan, FermiCAF, and the UK. Labels: Only Fermilab (uses SAM), Outside Lab (uses SAM), Grid (uses SAM).]
October 04
• To extend beyond 25% outside computing, JIM is essential: JIM test for CDF in June 04, production in October 04
• However, the 25% of resources already seems insufficient for the production passes: JIM will be wanted earlier.
CDF Grid Strategy
• 25% of CDF computing from external resources. All CDF computing on CDF Grid by April 15: utilize resources fully controlled by CDF: Kerberos/fbsng: dCAF + SAM
• October 15, 2004: JIM to capture shared resources
• June 2005: 50% of Computing resources external
D0 Priorities
• MC production using JIM
• Reprocessing using as many DH tools as possible
• Analysis
• Remote is 10% now.
• All MC and 20% of reprocessing are now offsite
Meeting the Needs
• Progress in SAM
• JIM Status
• RunJob
• CDFGridWorkshop: “Nerd’s Paradise”
• Strict Project Management and process to respond to operational issues
In the near-term future: JIM
Adding Grid Standard Tools
Simple JIM
[Figure components: Desktop (anywhere); Condor submitter (@ regional centers); SAM DB and Condor matchmaker (@ FNAL); Globus GK, CAF submitter, and SAM station (@ each site); worker nodes (WN) and dCache on private LANs. Milestones: testing June 2004, required June 2005.]
Detailed JIM
[Figure: flow of jobs, data, and meta-data between User Interfaces and Sites. Components shown:
Job submission and brokering: User Interface, Submission, Grid Client, Global Job Queue, Match Making, Resource Selector, Info Gatherer, Info Collector.
Data Handling: Global DH Services, SAM Naming Server, SAM Log Server, Resource Optimizer, SAM DB Server, RC MetaData Catalog, Bookkeeping Service, SAM Stager(s), SAM Station (+ other servs), MSS, Cache.
Local Job Handling: Grid Gateway, Local Job Handler (CAF, D0MC, BS, ...), JIM Advertise, Cluster, Worker Nodes, AAA, Dist. FS.
Information and monitoring: Info Manager, XML DB server, Site Conf., Glob/Loc JID map, Info Providers, MDS, Web Serv, Grid Monitoring, User Tools.]
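The Match Making component in the figure pairs a submitted job with a site based on what the sites advertise. A minimal sketch of that idea follows. The advertisement fields (`free_cpus`, `cached_datasets`) and the rule "prefer sites that already cache the dataset, then pick the one with the most free CPUs" are assumptions made for illustration; the slide does not describe the actual JIM or Condor matchmaking policy.

```python
def match(job, sites):
    """Return the name of the best site for `job`, or None if none qualifies.

    `job` and each entry of `sites` are plain dicts with hypothetical fields.
    """
    has_cpus = [s for s in sites if s["free_cpus"] >= job["cpus"]]
    # Prefer sites that already hold the dataset, so no transfer is needed.
    with_data = [s for s in has_cpus if job["dataset"] in s["cached_datasets"]]
    candidates = with_data or has_cpus
    if not candidates:
        return None
    # Rank the remaining candidates by free CPUs.
    return max(candidates, key=lambda s: s["free_cpus"])["name"]


if __name__ == "__main__":
    sites = [
        {"name": "FermiCAF", "free_cpus": 10, "cached_datasets": {"top"}},
        {"name": "UK", "free_cpus": 50, "cached_datasets": set()},
        {"name": "Toronto", "free_cpus": 20, "cached_datasets": {"top"}},
    ]
    print(match({"cpus": 5, "dataset": "top"}, sites))
```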
Progress in SAM
• Dbserver, the database server between applications and Oracle, was upgraded to use a common schema for CDF and D0.
• All CDF data files are in SAM
• SAM is in beta testing on the CDF CAF (1200 CPUs): passed 20 TB/day delivery
• Minos uses SAM for its data handling
• Steve Mrenna (Phenomenology) is depositing ALPGEN files in SAM for common CDF/D0 use.
Planned Sam Projects
Not yet started…
Planned SAM Projects
• MC/farm requests
  – Merge the MC request system with the farm request system: eliminate double work.
• Autodestinations
  – Awkward to use, being discussed in design.
Assimilating/Disseminating the Grid
SAMGrid, ARDA, GridPP2, Grid3+, PPDG4/5, and making them aware of us
How to Go Grid Standard
• GGF participation
• Internal implementation of interfaces and standards
• Workshop participation in our strong areas where the Grid has a vacuum
• Projects to deploy well-defined standards.
GGF
• Programme committee for workshop on nuclear and particle physics applications: Paper exists with catalog of GGF groups and how their interests overlap with ours
• Workshop on use cases: paper failed acceptance, learned too late: prepare for next time
GridPP2
• Bid will reward us with positions; recognized a need for a metadata task force: strong interest in the SAM solution.
• ARDA problematic; agree there is a standard, and mine is the right one to start from.
• There is more to grid than ARDA.
Project to GridProject
• Chains and Links/SBIRII
• Caching
• Schema rationalized
Chains and Links/SBIR-II
• Query language to pursue for the Grid
• SBIR-II forces a well-defined interface
• Within the Metadata workshop context:
  – Priority for Grid high
  – Oversubscription of key CDF/D0 personnel: great opportunity, could lose it.
Caching
• SRM interface within SAM to SAM cache
• SRM interface to dCache
• Application of caching according to requirements: multiple local SAM caches, multiple dCache-linked caches as a SAM cache, capitalize on redundant distributed cache (n.b. worker nodes)
dCache and SAM
• dCache shapes traffic into disk: if a SAM cache is large, dCache must be used instead of NFS mounts
• dCache gives the user what is requested. 1 TB gets the same priority as 1 GB: CDF users must send email requesting data to be staged.
• SAM examines the consumption rate before staging the next files: no email needed.
• SAM uses dCache for its caching at FNAL.
• This needs further work with SRM
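The idea of looking at the consumption rate before staging more files can be illustrated with a small rule. The slide does not give the policy SAM actually applies, so the function below is a hypothetical sketch: it stages just enough files to cover what the job is expected to read during one staging latency, capped at `max_ahead`. All parameter names are illustrative.

```python
import math


def files_to_stage(consumed_files, elapsed_s, staged_unread, stage_latency_s, max_ahead):
    """Return how many more files to request so the consumer does not run dry.

    consumed_files  : files the job has finished reading so far
    elapsed_s       : seconds over which those files were consumed
    staged_unread   : files already on disk that the job has not read yet
    stage_latency_s : typical seconds needed to stage one request
    max_ahead       : upper limit on files kept staged ahead of the job
    """
    if elapsed_s <= 0 or consumed_files <= 0:
        # No measured rate yet: keep a single file ready.
        return 1 if staged_unread == 0 else 0
    rate = consumed_files / elapsed_s            # files per second
    needed = math.ceil(rate * stage_latency_s)   # files read while staging runs
    needed = min(needed, max_ahead)
    return max(0, needed - staged_unread)
```

A slow consumer therefore triggers few staging requests and a fast one triggers more, without the user having to ask for data to be staged.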
SAM Schema Modularization
• Modularize the schema,
• Modularize the API,
• Define interfaces,
• Migrate,
• Easier project management.
Link of Schema to Cache
• dCache without tape but with Postgres becomes a Local Replica Catalog
• Need a protocol to connect to the central (virtual) database (performance)
• CSS-DSG - CCF Group strong interactions
• Awareness of Grid
• Pull out local replica API, cache in SAM schema
Authorization and Accounting
• Work with CMS
• Gabriele and Virginia Tech
• Virtual Organizations/VOX: can we have a solution soon?
Monitoring
• XML-based information gathering
• Grid mechanisms: MonALISA
• Look at making components with hooks:
  – C++ API
  – Caching interfaces
  – DBServer
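To make "XML-based information gathering" concrete, here is a small sketch that parses a site description and summarizes it with Python's standard `xml.etree.ElementTree`. The XML layout (`site`, `cluster`, `cache` elements and their attributes) is invented for this example; the schema used by the SAMGrid information providers is not given in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical site advertisement; element and attribute names are assumptions.
SAMPLE = """
<site name="ExampleSite">
  <cluster name="batch" total_cpus="200" free_cpus="35"/>
  <cache name="disk1" size_gb="6000" used_gb="4200"/>
</site>
"""


def summarize(xml_text):
    """Collect free CPUs and free cache space from one site document."""
    root = ET.fromstring(xml_text.strip())
    clusters = root.findall("cluster")
    caches = root.findall("cache")
    return {
        "site": root.get("name"),
        "free_cpus": sum(int(c.get("free_cpus")) for c in clusters),
        "cache_free_gb": sum(
            int(c.get("size_gb")) - int(c.get("used_gb")) for c in caches
        ),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```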
Conclusions
• D0 and CDF reliant on Grid
• JIM deployment is on track for the March and June milestones.
• FNAL has a huge role to play in the Grid if it can work together and technically address problems.