“Executive Summary”, PPARC, MRC London, 15 May 2003
Tony Doyle - University of Glasgow

Outline
• Questions from PPARC?
• Motivation
• Overview
• How Does the Grid Work?
• Middleware Development
• Testbed Status
• Is the Middleware Robust?
• LHC Computing Grid
• Applications
• Tier-1 and -2 Centre Resources
• Dissemination
• Achievements
• Timeline
• UK Grid Priorities
• Roadmap
• Future Grid Components

Questions from PPARC?
1. What has been achieved in the first 18 months?
• See Oversight Committee (4): Executive Summary
2. Development in the medium-long term?
• GridPP1 (09/01-08/04): Prototype, short term, “Web to Grid”
• GridPP2 (09/04-08/07): Production, medium term, “Prototype to Production”
• (09/07-): Exploitation, long term, “LHC exploitation”

Rare Phenomena – Huge Background
[Figure: all interactions compared with the Higgs signal, separated by 9 orders of magnitude]

Executive Summary
Ref: PMB-26-EXEC
• Introduction: The prototype Grid is working
• Project Management: Under control via Project Map
• Resources: Small modifications
• CERN: Scaling issues for LCG-1
• DataGrid: Making impact Internationally
• Applications: Engaged (value added)
• Tier-1/A: Tier-A production mode
• Tier-2: Latent resources
• Dissemination: UK flagship project
• Future Funding: GridPP2 proposal in preparation

Overview
£17m 3-year project funded by PPARC through the e-Science Programme
• EDG - UK Contributions: Architecture, Testbed-1, Network Monitoring, Certificates & Security, Storage Element, R-GMA, LCFG, MDS deployment, GridSite, SlashGrid, Spitfire…
• Applications (start-up phase): BaBar, CDF/D0 (SAM), ATLAS/LHCb, CMS, (ALICE), UKQCD
• CERN - LCG (start-up phase): funding for staff and hardware...
[Pie chart legend: CERN, DataGrid, Tier-1/A, Applications, Operations]
http://www.gridpp.ac.uk

£17m++ 3-Year Project
• Five components
– Tier-1/A = Hardware + CLRC e-Science Staff
– DataGrid = 15 DataGrid Posts + CLRC PPD Staff
– Applications = 13 Experiments Posts (to interface middleware)
– Operations = Travel (~100 people) + Management + Early Investment
– CERN = 25 LCG posts + Tier-0 + LTA
[Pie chart, 6/Oct/2002: slices of £3.79m, £5.67m, £3.67m, £2.08m and £1.79m; legend: CERN, DataGrid, Tier-1/A, Applications, Operations]
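As a quick consistency check, the five slice values from the pie chart can be summed to see whether they account for the £17m headline figure. This is a minimal sketch: the transcript does not preserve which amount belongs to which component, so the values are kept as an unlabelled list, and the variable names are illustrative only.

```python
from decimal import Decimal

# Slice values in £m, as they appear in the chart. The mapping from
# amounts to components is not recoverable here, so none is assumed.
slices_gbp_m = [
    Decimal("3.79"),
    Decimal("5.67"),
    Decimal("3.67"),
    Decimal("2.08"),
    Decimal("1.79"),
]

total_gbp_m = sum(slices_gbp_m)

# Share of each slice, rounded to one decimal place for display.
shares = [
    (amount / total_gbp_m * 100).quantize(Decimal("0.1"))
    for amount in slices_gbp_m
]

print(f"total: £{total_gbp_m}m")
for amount, share in zip(slices_gbp_m, shares):
    print(f"£{amount}m is {share}% of the total")
```

Decimal arithmetic is used so that the two-decimal amounts add up exactly rather than picking up binary floating-point error.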

Project Management - 7 Elements

Year 0, Year 1
The GridPP Project has passed prototype deployment stage 1...

How Does the Grid Work?
1. Authentication: grid-proxy-init
2. Job submission: dg-job-submit
3. Monitoring and control: dg-job-status, dg-job-cancel, dg-job-get-output
4. Data publication and replication: globus-url-copy, GDMP
5. Resource scheduling – use of Mass Storage Systems: JDL, sandboxes, storage elements
0. Web User Interface…
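The first three steps above can be sketched as a small dry run that writes a job description and lists the commands a user would type, without executing anything. This is an illustration under assumptions: the command names come from the slide, but the argument layout (a JDL file path for submission, a job identifier for status and output), the file name `hello.jdl`, the `<jobId>` placeholder and the helper functions are hypothetical and not taken from the talk.

```python
def render_jdl(attributes):
    """Render a dict as a ClassAd-style job description (JDL) string."""
    lines = []
    for key, value in attributes.items():
        if isinstance(value, (list, tuple)):
            # Lists such as sandboxes are written as {"a", "b"}.
            rendered = "{" + ", ".join(f'"{item}"' for item in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{key} = {rendered};")
    return "\n".join(lines)


def workflow_commands(jdl_path, job_id):
    """List the command lines for steps 1 to 3 (assumed argument layout)."""
    return [
        ["grid-proxy-init"],             # 1. authentication
        ["dg-job-submit", jdl_path],     # 2. job submission
        ["dg-job-status", job_id],       # 3. monitoring
        ["dg-job-get-output", job_id],   # 3. retrieve the output sandbox
    ]


# A trivial job: run hostname and bring its output back in the sandbox.
jdl = render_jdl({
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
})
commands = workflow_commands("hello.jdl", "<jobId>")

print(jdl)
for command in commands:
    print("$ " + " ".join(command))
```

Printing the commands rather than running them keeps the sketch usable on a machine with no Grid middleware installed.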

Middleware Development

Testbed Status, 10th May 2003 14:40
UK-wide development using EU-DataGrid tools (v1.47).
Not yet robust, but sufficient for prototype development.
See http://www.gridpp.ac.uk/map/

Is the Middleware Robust?
1. Code Base (1/3 Mloc)
2. Software Evaluation Process
3. Testbed Infrastructure: Unit Test, Build, Integration, Certification, Production
4. Code Development Platforms
[Figure: Test and Validation process (EU 2nd Year Review, 04-05 Jan. 2003, Quality Assurance, n° 12)]
• Unit Test: WPs add unit tested code to CVS repository; individual WP tests on WP specific machines (*)
• Build: build system runs nightly build & auto. tests; tagged package
• Integration: Integration Team runs overall release tests on the Development Testbed, ~15 cpu (*), office hours; tagged release selected for certification
• Certification: Test Group performs Grid certification and Apps. Representatives perform Application Certification on the Certification Testbed, ~40 cpu (**); fix problems; release candidate; certified release selected for deployment
• Production: certified public release for use by apps. on the Production Testbed, ~1000 cpu (*), 24x7 (**); Users
• Bugzilla anomalies reports
(*) Current infrastructure; (**) with LCG
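The staged flow in the test and validation figure can be modelled as a simple promotion rule: a release moves to the next stage only when the tests at its current stage pass. The sketch below is a simplification under assumptions. The stage names come from the slide, but the rule that a failure holds the release at its current stage until it is fixed, and the function names, are illustrative rather than a description of the actual EDG procedure.

```python
# Stage names as listed on the slide, in order.
STAGES = ["Unit Test", "Build", "Integration", "Certification", "Production"]


def promote(stage, tests_passed):
    """Return the next stage if tests passed, else stay at the current one."""
    index = STAGES.index(stage)
    if not tests_passed or index == len(STAGES) - 1:
        # Failure (fix problems and retry) or already in production.
        return stage
    return STAGES[index + 1]


def run_pipeline(results):
    """Apply a sequence of pass/fail results, returning the stage history."""
    stage = STAGES[0]
    history = [stage]
    for passed in results:
        stage = promote(stage, passed)
        history.append(stage)
    return history


# One failed integration round, fixed on the next attempt.
history = run_pipeline([True, True, False, True, True])
print(" -> ".join(history))
```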
Anomalies follow-up (EU 2nd Year Review, 04-05 Jan. 2003, Quality Assurance, n° 9)
Associated statistics: Testbeds evolution
                        2001      2002
Sites                   6         12
CPU                     400       1000
Storage                 2000 GB   5000 GB
EDG MW lines of code    197 000   335 000
Users                   100       400
[Chart: Bugzilla anomalies follow-up; x-axis: Month; y-axis: Number of anomalies (0 to 120); series: number of new anomalies, number of pending anomalies; main EDG releases marked: v1.1.0, v1.1.2, v1.2.Beta, v1.2.0, v1.3.0, v1.4.0]

LHC Computing Grid
• Grid deployment project
• Not grid development
• Establishes the global computing infrastructure
• Allows all participating physicists to exploit LHC data
• Fosters and develops the required collaboration between
– LHC experiments
– peer computing centres
– middleware providers
• Based at CERN
• Started March 2002

LHC Computing Grid
• Solid programme established
• Agreed between and supported by all LHC experiments
• LHC experiments contributing to development of common software (e.g. new persistency solution)
• First LHC global grid service for deployment in July 2003
• Basis for a generic Science Grid Infrastructure

POOL
Persistency Framework:
[Diagram: components PersistencyMgr, StorageMgr, CacheMgr, FileCatalog, PlacementSvc, DictionarySvc, StreamerSvc; interfaces IPers, IReadWrite, ICnv, IPlacement, IFCatalog, IReflection, IPReflection; language C++]

Application Interfaces
[Diagram: SAM components arranged by Grid layer]
• Client Applications: Web; Python codes, Java codes; Command line; D0 Framework C++ codes
• Collective Services: Request Formulator and Planner; Catalog protocols; Significant Event Logger; Naming Service; Database Manager; Catalog Manager; SAM Resource Management; Batch Systems - LSF, FBS, PBS, Condor; Data Mover; Job Services; Storage Manager; Job Manager; Cache Manager; Request Manager; “Dataset Editor”, “File Storage Server”, “Project Master”, “Station Master”, “Stager”, “Optimiser”
• Connectivity and Resource: CORBA; UDP; File transfer protocols - ftp, bbftp, rcp; GridFTP; Mass Storage systems protocols e.g. encp, hpss
• Authentication and Security: GSI; SAM-specific user, group, node, station registration; Bbftp ‘cookie’
• Fabric: Tape Storage Elements; Disk Storage Elements; Compute Elements; LANs and WANs; Resource and Services Catalog; Replica Catalog; Meta-data Catalog; Code Repository
Name in “quotes” is SAM-given software component name
Indicates component that will be replaced, enhanced or added using PPDG and Grid tools

Tier-1/A: Year of Growth
[Chart: GridPP Certificates (Personal, Server), Oct-01 to Aug-02, scale 0 to 140]
[Chart: BaBar Use]

Tier-2 Web-Based Monitoring
ScotGrid reached its 350,000th processing hour on Tuesday 6th May 2003.
Status of Cambridge now available via Ganglia. Posted on Mon 28 April 2003 10:02 BST

Tier-1 and -2 Centre Resources
[Pie chart: CPU Resources by Experiment; slices of 2%, 18%, 8%, 9%, 25%, 13%, 10%, 15%; legend: ALICE, ATLAS, CMS, LHCb, BaBar, CDF, D0, Other]
Estimated resources at start of GridPP2 (Sept. 2004)
Tier-2: (6000 CPUs + 500 TB)
Tier-1: 1000 CPUs + 500 TB
Tier-2       Number of CPUs   Total Disk (TB)   Total Tape (TB)
London       2454             99                20
NorthGrid    2718             209               332
SouthGrid    918              67                8
ScotGrid     368              79                0
Total        6458             455               360
Shared distributed resources: required to meet experiment requirements
Connected by network and grid…
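The Tier-2 totals in the table above can be checked by summing each column. In this minimal sketch the figures are copied from the table and the variable names are illustrative. The CPU and tape columns reproduce the stated totals, while the disk column sums to 454 TB against a stated 455 TB; rounding of the per-centre figures is one plausible explanation, though the slide does not say.

```python
# Per-centre figures from the table: (CPUs, disk in TB, tape in TB).
tier2 = {
    "London": (2454, 99, 20),
    "NorthGrid": (2718, 209, 332),
    "SouthGrid": (918, 67, 8),
    "ScotGrid": (368, 79, 0),
}
stated_total = (6458, 455, 360)

# Sum each column across the four Tier-2 centres.
computed_total = tuple(sum(column) for column in zip(*tier2.values()))

# Difference between the computed and the stated totals, per column.
difference = tuple(c - s for c, s in zip(computed_total, stated_total))

for name, computed, stated in zip(("CPUs", "Disk (TB)", "Tape (TB)"),
                                  computed_total, stated_total):
    print(f"{name}: computed {computed}, stated {stated}")
```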

Dissemination: e-Science and Web
e-Science ‘All Hands’ Meeting held at Sheffield, 2-4 September 2002
– ~ 300 people in total
– ~ 19 GridPP People attended
– ~ 13 GridPP ‘Abstracts’ accepted (total ~100)
– ~ 10 GridPP Posters displayed
– 4 GridPP Invited talks
– 3 GridPP Demonstrations
GridPP Web Page Requests:
Currently 35,000 per month
month: reqs: pages
Nov 2001: 29918: 4374
Dec 2001: 29315: 5576
Jan 2002: 50892: 7594
Feb 2002: 66166: 7724
Mar 2002: 135683: 8180
Apr 2002: 222939: 12008
May 2002: 249830: 11879
Jun 2002: 205480: 12679
Jul 2002: 148604: 14125
Aug 2002: 186690: 21449
Sep 2002: 266853: 24318
Oct 2002: 237031: 26893
Nov 2002: 243710: 29796
Dec 2002: 204119: 27101
Jan 2003: 251185: 27291
Feb 2003: 295381: 30002
Mar 2003: 419985: 35650
Apr 2003: 316816: 34548

Dissemination: Posters
10 Posters for NeSC Opening and e-Science All Hands Meeting
Posters: ATLAS, SAM, OptorSim, GridPP, Tier-1/A, ScotGrid, BaBar, LHCb, CMS, Storage

Dissemination: Demonstrations
• Super Computing 2002, Baltimore, US
• Major event in November 2002
• GridPP participated in three successful Worldwide demos
– WorldGrid
– Replica Location Service
– SAMGrid
• These and other Web-based demos available online from http://www.gridpp.ac.uk/demos/

Achievements I
1. Dedicated people actively developing a Grid
2. All with personal certificates
3. Using the largest UK grid testbed (16 sites and more than 100 servers)
4. Deployed within EU-wide programme
5. Linked to Worldwide Grid testbeds

Achievements II
6. Grid Deployment Programme Defined: The Basis for LHC Computing
7. Active Tier-1/A Production Centre meeting International Requirements
8. Latent Tier-2 resources being monitored
9. Significant middleware development programme
10. First simple applications using the Grid testbed (open approach)

GridPP: From Prototype (hundreds) to Production (thousands) Grid
[Chart: Certs. (Personal, Server), Oct-01 to Aug-02, scale 0 to 140, placed on a 2001 to 2005 timeline]
General deployment of e-Science methods…

Timeline
[Gantt chart: 2001 to 2005 by quarter (Q1 to Q4), running from Prototypes to Production]
• DataGrid
• GridPP1 - Procure, Install, Compute, Data
• Develop, Test, Refine
• LHC Computing Grid
• Initial Grid Tests
• Grid Service
• EGEE
• GridPP2
• Middleware and Hardware Upgrades
• Worldwide Grid Demonstrations
• Transition and Planning Phase… GridPP2 Proposal

Priorities: GridPP2 Proposal
1. Tier-1/A staff – National Grid Centre
2. Tier-1/A hardware – International Role
3. Tier-2 staff – UK e-Science Grid
4. Applications
– Grid Integration (GridPP2)
– Development (experiments proposals)
5. Middleware – EU-wide development
6. Tier-2 hardware – non-PPARC funding
7. CERN staff – quality assurance
8. CERN hardware – pro-rata contribution
• Final proposal writing phase…
ALL of these are required to address the LHC Computing Challenge

Roadmap: GridPP2 Proposal
[Gantt chart: 2003 to 2008; phases: GridPP1, GridPP2, Exploitation]
Rows: App. Development, App. Integration, Middleware, EGEE, Tier-2 Hardware, Tier-2 Staff, Tier-1 Hardware, Tier-1 Staff, Tier-0 Hardware, Tier-0 Staff, Dissemination, Travel + Ops, Management

Components: GridPP2 Proposal
[Pie chart, GridPP2 era: slices of 1%, 6%, 8%, 10%, 23%, 12%, 1%, 3%, 3%, 3%, 2%, 8%, 11%, 9%]
Legend: Tier-0 Hardware, Tier-0 Staff, Tier-1 Hardware, Tier-1 Staff, Tier-2 Hardware, Tier-2 Staff, Tier-2 Staff (Inst.), App. Integration, App. Development, Middleware, EGEE, Management, Dissemination, Travel and Ops
Group labels: App. Development, Tier-2, Middleware, Tier-1, App. Int., CERN, EGEE

Future Grid Components
1. Dynamic Grid Optimisation e.g. OptorSim
• Automatic data replication to improve data access
2. Grid User Interfaces e.g. GUIDO
• Hide complexity of middleware from the end-user physicist
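To illustrate how automatic data replication can improve data access, the sketch below models a site with a small local store that copies every remotely read file and evicts the least recently used replica when full. This is not OptorSim's algorithm, which the slide names only as an example of dynamic Grid optimisation; the class, its fields and the "always replicate, evict least recently used" policy are assumptions chosen for brevity.

```python
from collections import OrderedDict


class LocalStorageElement:
    """Toy site-local store: replicate on remote read, evict LRU when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1 file")
        self.capacity = capacity
        self.replicas = OrderedDict()  # file name -> True, oldest first
        self.local_hits = 0
        self.remote_reads = 0

    def access(self, filename):
        """Read a file, returning where it was served from."""
        if filename in self.replicas:
            # Mark the replica as most recently used.
            self.replicas.move_to_end(filename)
            self.local_hits += 1
            return "local"
        self.remote_reads += 1
        if len(self.replicas) >= self.capacity:
            # Evict the least recently used replica to make room.
            self.replicas.popitem(last=False)
        self.replicas[filename] = True
        return "remote"


# A job that keeps returning to file "a" benefits from the local replica.
site = LocalStorageElement(capacity=2)
trace = [site.access(name) for name in ["a", "b", "a", "a", "c", "a"]]
print(trace)
print(f"local hits: {site.local_hits}, remote reads: {site.remote_reads}")
```

Comparing `local_hits` with `remote_reads` for different capacities or access patterns gives a rough feel for when replication pays off.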