TRANSCRIPT
LCG Project Status & Plans (with an emphasis on applications software)
Torre Wenaus, BNL/CERN
LCG Applications Area Manager
http://cern.ch/lcg/peb/applications
US ATLAS PCAP Review
November 14, 2002
PCAP Review, November 14, 2002 Slide 2
The LHC Computing Grid (LCG) Project
Approved (3 years) by CERN Council, September 2001; meanwhile extended by 1 year due to LHC delay
Injecting substantial new facilities and personnel resources
Scope:
  Common software for physics applications – tools, frameworks, analysis environment
  Computing for the LHC – computing facilities (fabrics); grid middleware, deployment
  Deliver a global analysis environment
Goal – prepare and deploy the LHC computing environment
PCAP Review, November 14, 2002 Slide 3
Goal of the LHC Computing Grid Project - LCG
Phase 1 – 2002–05: development of common applications, libraries, frameworks; prototyping of the environment; operation of a pilot computing service
Phase 2 – 2006–08: acquire, build and operate the LHC computing service
PCAP Review, November 14, 2002 Slide 4
CERN will provide the data reconstruction & recording service (Tier 0) – but only a small part of the analysis capacity

Summary of Computing Capacity Required for all LHC Experiments in 2008

                            --------- CERN ----------    Other     Total    CERN as       Total      CERN as
                            Tier 0    Tier 1    Total    Tier 1    Tier 1   % of Tier 1   Tier 0+1   % of total
Processing (K SI2000)       12,000     8,000   20,000    49,000    57,000   14%           69,000     29%
Disk (PetaBytes)               1.1       1.0      2.1       8.7       9.7   10%             10.8     20%
Magnetic tape (PetaBytes)     12.3       1.2     13.5      20.3      21.6    6%             33.9     40%

Non-CERN Hardware Need

Current planning for capacity at CERN + principal Regional Centres:
  2002:   650 KSI2000 – <1% of capacity required in 2008
  2005: 6,600 KSI2000 – <10% of 2008 capacity
PCAP Review, November 14, 2002 Slide 5
LHC Manpower needs for Core Software
From the LHC Computing Review (FTEs); only computing professionals counted.

        2000: have (miss)   2001    2002    2003    2004    2005
ALICE        12 (5)         17.5    16.5    17      17.5    16.5
ATLAS        23 (8)         36      35      30      28      29
CMS          15 (10)        27      31      33      33      33
LHCb         14 (5)         25      24      23      22      21
Total        64 (28)        105.5   106.5   103     100.5   99.5
PCAP Review, November 14, 2002 Slide 6
The LHC Computing Grid Project Structure
[Organization chart: the Project Overview Board oversees the Project Leader and the Project Execution Board (PEB); the Software and Computing Committee (SC2) provides requirements, the work plan and monitoring, fed by RTAGs; the PEB directs the project work packages (WPs) and liaises with the external Grid projects.]
PCAP Review, November 14, 2002 Slide 7
Funding LCG
National funding of Computing Services at Regional Centres
Funding from the CERN base budget
Special contributions of people and materials at CERN during Phase 1 of the project – Germany, United Kingdom, Italy, ……
Grid projects
Institutes taking part in applications common projects
Industrial collaboration
Institutes providing grid infrastructure services – operations centre, user support, training, ……
PCAP Review, November 14, 2002 Slide 8
Project Execution Board
Decision taking – as close as possible to the work, by those who will be responsible for the consequences
Two bodies set up to coordinate & take decisions:
  Architects Forum
    software architect from each experiment and the applications area manager
    makes common design decisions and agreements between experiments in the applications area
    supported by a weekly applications area meeting open to all participants
  Grid Deployment Board
    representatives from the experiments and from each country with an active Regional Centre taking part in the LCG Grid Service
    forges the agreements, takes the decisions, defines the standards and policies needed to set up and manage the LCG Global Grid Services
    coordinates the planning of resources for physics and computing data challenges
PCAP Review, November 14, 2002 Slide 9
LCG Areas of Work
Fabric (Computing System):
  Physics Data Management, Fabric Management, Physics Data Storage, LAN Management, Wide-area Networking, Security, Internet Services
Grid Technology:
  Grid middleware; standard application services layer; inter-project coherence/compatibility
Physics Applications Software:
  application software infrastructure – libraries, tools; object persistency, data management tools; common frameworks – simulation, analysis, ..; adaptation of physics applications to the Grid environment; Grid tools, portals
Grid Deployment:
  Data Challenges, Grid Operations, Network Planning, Regional Centre Coordination, Security & access policy
PCAP Review, November 14, 2002 Slide 10
Fabric Area
CERN Tier 0+1 centre
  automated systems management package
  evolution & operation of the CERN prototype – integration into the LCG grid
Tier 1, 2 centre collaboration
  develop/share experience on installing and operating a Grid
  exchange information on planning and experience of large fabric management
  look for areas for collaboration and cooperation
Technology tracking & costing
  new technology assessment nearing completion
  re-costing of Phase 2 will be done next year
PCAP Review, November 14, 2002 Slide 11
Grid Technology (GTA) in LCG
Quote from a recent Les Robertson slide: "LCG expects to obtain Grid technology from projects funded by national and regional e-science initiatives – and from industry – concentrating ourselves on deploying a global grid service."
All true, but there is a real role for the GTA in LCG, not just deployment: ensuring that the needed middleware is/will be there – tested, selected and of production grade
Message seems to be getting through; (re)organization in progress to create an active GTA
  has been dormant and subject to an EDG conflict of interest up to now
  new leader: David Foster, CERN
PCAP Review, November 14, 2002 Slide 12
A few of the Grid Projects with strong HEP collaboration
[Logos of Grid projects with strong HEP collaboration, grouped into US projects and European projects.]
Many national and regional Grid projects – GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, …
PCAP Review, November 14, 2002 Slide 13
Grid Technology Area
This area of the project is concerned with ensuring that the LCG requirements are known to current and potential Grid projects:
  active lobbying for suitable solutions – influencing plans and priorities
  evaluating potential solutions
  negotiating support for tools developed by Grid projects
  developing a plan to supply solutions that do not emerge from other sources
BUT this must be done with caution:
  important to avoid HEP-special solutions
  important to migrate to standards as they emerge (avoid emotional attachment to prototypes)
PCAP Review, November 14, 2002 Slide 14
Grid Technology Status
A base set of requirements has been defined (HEPCAL)
  43 use cases, ~2/3 of which should be satisfied ~2003 by currently funded projects
Good experience of working with Grid projects in Europe and the United States
  practical results from testbeds used for physics simulation campaigns
  built on the Globus toolkit
GLUE initiative – working on integration of the two main HEP Grid project groupings:
  around the (European) DataGrid project
  subscribing to the (US) Virtual Data Toolkit (VDT)
PCAP Review, November 14, 2002 Slide 15
Grid Deployment Area
New leader: Ian Bird, CERN, formerly Jefferson Lab
Job is to set up and operate a Global Grid Service
  stable, reliable, manageable Grid for Data Challenges and regular production work
  integrating computing fabrics at Regional Centres
  learn how to provide support, maintenance, operation
Short term (this year): consolidate (stabilize, maintain) middleware – and see it used for some physics
  learn what a "production grid" really means by working with the Grid R&D projects
PCAP Review, November 14, 2002 Slide 16
Grid Deployment Board
Computing management of regional centers and experiments
Decision forum for planning, deploying and operating the LCG grid
  service and resource scheduling and planning
  registration, authentication, authorization, security
  LCG Grid operations
  LCG Grid user support
One person from each country with an active regional center – typically a senior manager of a regional center
This role is currently [inappropriately] admixed with a broader responsibility: middleware selection
  for political/expediency reasons (dormancy of the GTA)
  not harmful, because it is being led well (David Foster)
  should be corrected during/after LCG-1
PCAP Review, November 14, 2002 Slide 17
Medium term (next year):
Target June 2003 – deploy a Global Grid Service (LCG-1)
  sustained 24 x 7 service
  including sites from three continents
  identical or compatible Grid middleware and infrastructure
  several times the capacity of the CERN facility, and as easy to use
Having stabilised this base service – progressive evolution:
  number of nodes, performance, capacity and quality of service
  integrate new middleware functionality
  migrate to de facto standards as soon as they emerge
Priority: move from testbeds to a reliable service
PCAP Review, November 14, 2002 Slide 18
Centers taking part in LCG-1
Tier 1 Centres: FZK Karlsruhe, CNAF Bologna, Rutherford Appleton Lab (UK), IN2P3 Lyon, University of Tokyo, Fermilab, Brookhaven National Lab
Other Centres: GSI, Moscow State University, NIKHEF Amsterdam, Academia Sinica (Taipei), NorduGrid, Caltech, University of Florida, Ohio Supercomputing Centre, Torino, Milano, Legnaro, ……

Estimates of Processing Capacity Planned in Regional Centres that are taking part in Phase 1 (K-SI2000):

Year                      2002    2003    2004    2005
CERN                       200     380     730   1,440
Other Tier 1 Centres       450   1,290   2,730   5,240
Other Regional Centres     610   1,620   2,200   2,220
Total                    1,260   3,290   5,660   8,900
CERN as % of total         16%     12%     13%     16%
PCAP Review, November 14, 2002 Slide 19
LCG-1 as a service for LHC experiments
Mid-2003: 5-10 of the larger regional centres available as one of the services used for simulation campaigns
2H03: add more capacity at operational regional centres; add more regional centres; activate operations centre and user support infrastructure
Early 2004: principal service for physics data challenges
PCAP Review, November 14, 2002 Slide 20
Applications Area Projects
Software Process and Infrastructure (operating)
  librarian, QA, testing, developer tools, documentation, training, …
Persistency Framework (operating)
  POOL hybrid ROOT/relational data store
Mathematical Libraries (operating)
  math and statistics libraries; GSL etc. as NAG C replacement
Core Tools and Services (just launched)
  foundation and utility libraries, basic framework services, system services, object dictionary and whiteboard, grid-enabled services
Physics Interfaces (being initiated)
  interfaces and tools by which physicists directly use the software: interactive (distributed) analysis, visualization, grid portals
Simulation (coming soon)
  Geant4, FLUKA, virtual simulation, geometry description & model, …
Generator Services (coming soon)
  generator librarian, support, tool development
PCAP Review, November 14, 2002 Slide 21
Applications Area Organization
[Organization chart: the Apps Area Leader (overall management, coordination, architecture) oversees the Project Leaders of the area's projects, who in turn oversee Work Package (WP) Leaders; the Architects Forum spans the projects. Direct technical collaboration between experiment participants, IT, EP, ROOT and LCG personnel.]
PCAP Review, November 14, 2002 Slide 22
[Chart: candidate RTAG timeline from March, laid out by quarter from 02Q1 to 04Q2. Candidate RTAGs: simulation tools; detector description & model; conditions database; data dictionary; interactive frameworks; statistical analysis; detector & event visualization; physics packages; framework services; C++ class libraries; event processing framework; distributed analysis interfaces; distributed production systems; small scale persistency; software testing; software distribution; OO language usage; LCG benchmarking suite; online notebooks. Blue: RTAG/activity launched or (light blue) imminent.]
PCAP Review, November 14, 2002 Slide 23
LCG Applications Area Timeline Highlights
[Timeline chart, 2002–2005 by quarter. Applications milestones: architectural blueprint complete; POOL V0.1 internal release; hybrid event store available for general users; distributed production using grid services; full persistency framework; distributed end-user interactive analysis. LCG milestones: LCG launch week; first Global Grid Service (LCG-1) available; LCG-1 reliability and performance targets; "50% prototype" (LCG-3); LCG TDR.]
PCAP Review, November 14, 2002 Slide 24
Personnel status
15 new LCG hires in place and working; a few more soon
Manpower ramp is on schedule
Contributions from UK, Spain, Switzerland, Germany, Sweden, Israel, Portugal, US
~10 FTEs from IT (DB and API groups) also participating
~7 FTEs from experiments (CERN EP and outside CERN) also participating, primarily in the persistency project at present
Important experiment contributions also in the RTAG process
PCAP Review, November 14, 2002 Slide 25
Software Architecture Blueprint RTAG
Established early June 2002
Goals:
  integration of LCG and non-LCG software to build coherent applications
  provide the specifications of an architectural model that allows this, i.e. a 'blueprint'
Mandate:
  define the main domains and identify the principal components
  define the architectural relationships between these 'frameworks' and components, identify the main requirements for their inter-communication, and suggest possible first implementations
  identify the high-level deliverables and their order of priority
  derive a set of requirements for the LCG
PCAP Review, November 14, 2002 Slide 26
Architecture Blueprint Report
Executive summary
Response of the RTAG to the mandate
Blueprint scope
Requirements
Use of ROOT
Blueprint architecture design precepts
  high-level architectural issues, approaches
Blueprint architectural elements
  specific architectural elements, suggested patterns, examples
Domain decomposition
Schedule and resources
Recommendations

After 14 RTAG meetings and much email... a 36-page final report, accepted by the SC2 on October 11.
http://lcgapp.cern.ch/project/blueprint/BlueprintReport-final.doc
http://lcgapp.cern.ch/project/blueprint/BlueprintPlan.xls
PCAP Review, November 14, 2002 Slide 27
Architecture requirements
Long lifetime: support technology evolution
Languages: LCG core software in C++ today; support language evolution
Seamless distributed operation
TGV and airplane work: usability off-network
Modularity of components
Component communication via public interfaces
Interchangeability of implementations
PCAP Review, November 14, 2002 Slide 28
Architecture Requirements (2)
Integration into coherent framework and experiment software
Design for the end-user's convenience more than the developer's
Re-use existing implementations
Software quality at least as good as any LHC experiment's
Meet performance and quality requirements of trigger/DAQ software
Platforms: Linux/gcc, Linux/icc, Solaris, Windows
PCAP Review, November 14, 2002 Slide 29
Software Structure

[Layered diagram: Foundation Libraries (STL, ROOT libs, CLHEP, Boost, …) at the bottom; above them the Basic Framework providing implementation-neutral services; above that the Simulation, Reconstruction, Visualization and other frameworks; Applications at the top. Optional libraries (Grid middleware, ROOT, Qt, …) plug in alongside.]
PCAP Review, November 14, 2002 Slide 30
Component Model
Granularity driven by component replacement criteria, development team organization, and dependency minimization
Communication via public interfaces
Plug-ins
  logical module encapsulating a service that can be loaded, activated and unloaded at run time (sketched below)
APIs targeted not only at end-users but at embedding frameworks and internal plug-ins
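To make the component/plug-in idea concrete, here is a minimal C++ sketch of a service hidden behind a public interface, with the implementation chosen by name at run time. All names (IHistogramSvc, PluginRegistry, PrintingHistogramSvc) are illustrative assumptions, not actual LCG interfaces; a real plug-in system would add dynamic loading and unloading of shared libraries, omitted here.

```cpp
// Minimal sketch of the component/plug-in pattern; all names are
// illustrative only, not actual LCG interfaces.
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

// Public abstract interface: clients depend only on this.
struct IHistogramSvc {
  virtual ~IHistogramSvc() = default;
  virtual void fill(const std::string& name, double value) = 0;
};

// Registry mapping component names to factories, so an implementation
// can be selected (or replaced) at run time without client changes.
class PluginRegistry {
  std::map<std::string, std::function<std::unique_ptr<IHistogramSvc>()>> factories_;
public:
  void add(const std::string& n, std::function<std::unique_ptr<IHistogramSvc>()> f) {
    factories_[n] = std::move(f);
  }
  std::unique_ptr<IHistogramSvc> create(const std::string& n) {
    return factories_.at(n)();
  }
};

// One interchangeable implementation; another (e.g. a ROOT-based one)
// could be registered under a different name with no client changes.
struct PrintingHistogramSvc : IHistogramSvc {
  void fill(const std::string& name, double v) override {
    std::cout << name << " <- " << v << '\n';
  }
};

int main() {
  PluginRegistry reg;
  reg.add("printing", [] { return std::make_unique<PrintingHistogramSvc>(); });
  auto svc = reg.create("printing");   // implementation chosen at run time
  svc->fill("pt", 42.0);
}
```

The point is the one the blueprint makes: clients depend only on the public interface, so implementations are interchangeable without touching client code.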
PCAP Review, November 14, 2002 Slide 31
Distributed Operation
Architecture should enable, but not require, the use of distributed resources via the Grid
Configuration and control of Grid-based operation via dedicated services
  making use of optional grid middleware services at the foundation level of the software structure
  insulating higher-level software from the middleware
  supporting replaceability
Apart from these services, Grid-based operation should be largely transparent
Services should gracefully adapt to 'unplugged' environments – transition to 'local operation' modes, or fail informatively (see the sketch below)
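A minimal sketch of that fallback pattern, assuming hypothetical names (IFileCatalog, GridCatalog, LocalCatalog) rather than any actual LCG or EDG API: higher-level code sees one catalog interface, and an 'unplugged' environment simply gets a local implementation.

```cpp
// Sketch of graceful 'unplugged' fallback: callers see one interface;
// the grid-backed or local implementation is chosen behind it.
// All names are illustrative, not actual LCG middleware APIs.
#include <iostream>
#include <memory>
#include <optional>
#include <string>

struct IFileCatalog {
  virtual ~IFileCatalog() = default;
  // Map a logical file name to a physical location; empty optional
  // is the "fail informatively" channel.
  virtual std::optional<std::string> lookup(const std::string& lfn) = 0;
};

struct GridCatalog : IFileCatalog {
  std::optional<std::string> lookup(const std::string& lfn) override {
    // In reality this would query a replica catalog over the network;
    // here we fabricate a plausible-looking replica location.
    return "gridftp://somegridnode/" + lfn;
  }
};

struct LocalCatalog : IFileCatalog {
  std::optional<std::string> lookup(const std::string& lfn) override {
    return "file:/localdata/" + lfn;   // 'local operation' mode
  }
};

std::unique_ptr<IFileCatalog> makeCatalog(bool online) {
  if (online) return std::make_unique<GridCatalog>();
  return std::make_unique<LocalCatalog>();   // transparent to callers
}

int main() {
  for (bool online : {true, false}) {        // e.g. offline on the TGV
    auto cat = makeCatalog(online);
    if (auto pfn = cat->lookup("run42.events"))
      std::cout << (online ? "grid:  " : "local: ") << *pfn << '\n';
  }
}
```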
PCAP Review, November 14, 2002 Slide 32
Managing Objects
Object Dictionary
  to query a class about its internal structure (introspection)
  essential for persistency, data browsing, interactive rapid prototyping, etc.
  the ROOT team and LCG plan to develop and converge on a common dictionary (common interface and implementation), with an interface anticipating a proposed C++ standard (XTI)
  to be used by LCG, ROOT and CINT; timescale ~1 year
  (a toy illustration of introspection follows below)
Object Whiteboard
  uniform access to application-defined transient objects
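As a toy illustration of what a dictionary provides (and nothing more), here is a self-contained C++ sketch in which generic code walks an object's members using hand-written metadata. The real LCG/ROOT dictionary is generated from headers and far richer; Member, ClassInfo and trackDict are invented for this example.

```cpp
// Toy introspection dictionary: generic code queries a class's
// structure at run time, which is what persistency and data
// browsing need. Purely illustrative, not the LCG/ROOT dictionary.
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Member { std::string name, type; std::size_t offset; };
struct ClassInfo { std::string name; std::vector<Member> members; };

struct Track { double pt; int charge; };

// Hand-written here; in practice generated from the class headers.
ClassInfo trackDict() {
  return {"Track",
          {{"pt", "double", offsetof(Track, pt)},
           {"charge", "int", offsetof(Track, charge)}}};
}

// Generic browser: knows nothing about Track except its dictionary.
void dump(const ClassInfo& ci, const void* obj) {
  std::cout << ci.name << " {\n";
  for (const auto& m : ci.members) {
    const char* p = static_cast<const char*>(obj) + m.offset;
    std::cout << "  " << m.type << ' ' << m.name << " = ";
    if (m.type == "double")   std::cout << *reinterpret_cast<const double*>(p);
    else if (m.type == "int") std::cout << *reinterpret_cast<const int*>(p);
    std::cout << '\n';
  }
  std::cout << "}\n";
}

int main() {
  Track t{41.5, -1};
  dump(trackDict(), &t);   // prints Track { double pt = 41.5 ... }
}
```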
PCAP Review, November 14, 2002 Slide 33
Other Architectural Elements
Python-based Component Bus
  plug-in integration of components providing a wide variety of functionality
  component interfaces to the bus derived from their C++ interfaces
Scripting Languages
  Python and CINT (ROOT) to both be available
  access to objects via the object whiteboard in these environments
Interface to the Grid
  must support convenient, efficient configuration of computing elements with all needed components
PCAP Review, November 14, 2002 Slide 34
(LHCb) Example of LCG–Experiment SW Mapping
[Diagram: the Gaudi object diagram with LCG products mapped onto it. The Application Manager drives Algorithms, fed by an Event Selector; Algorithms work on the Transient Event Store, Transient Detector Store and Transient Histogram Store, each served by its Event Data / Detector Data / Histogram Service and connected to data files through Converters and a Persistency Service – LCG POOL for event and histogram data, LCG DDD for detector data. The Message Service, JobOptions Service, Particle Properties Service (HepPDT) and other services map onto the LCG core libraries and services (CLS) and other LCG services.]
PCAP Review, November 14, 2002 Slide 35
Domain Decomposition

[Diagram: domain decomposition of the applications area software. Domains and components include Event Generation (EvtGen), Detector Simulation (engine, geometry, event model), Reconstruction (algorithms, calibration, fitter), Analysis and Interactive Services (GUI, NTuple, modeler), Core Services (dictionary, whiteboard, plug-in manager, scheduler, monitor, scripting, grid services, file catalog) and Persistency (store manager), all resting on Foundation and Utility Libraries and on external products: ROOT, GEANT4, FLUKA, DataGrid, Python, Qt, MySQL, …]

Products mentioned are examples, not a comprehensive list
Grey: not in common project scope (also event processing framework, TDAQ)
PCAP Review, November 14, 2002 Slide 36
Use of ROOT in LCG Software
Among the LHC experiments, ALICE has based its applications directly on ROOT
The three others base their applications on components with implementation-independent interfaces
  look for software that can be encapsulated into these components
All experiments agree that ROOT is an important element of LHC software
  leverage existing software effectively and do not unnecessarily reinvent wheels
Therefore the blueprint establishes a user/provider relationship between the LCG applications area and ROOT
This will draw on a great ROOT strength: users are listened to very carefully!
  the ROOT team has been very responsive to needs for new and extended functionality coming from the persistency effort
PCAP Review, November 14, 2002 Slide 37
Personnel Resources – Required and Available
Estimate of Required Effort
FTEs today: 15 LCG, 10 CERN IT, 7 CERN EP + experiments
[Chart: estimated required effort (FTEs, 0–60) by quarter ending, Sep-02 through Mar-05, broken down by project: SPI, math libraries, physics interfaces, generator services, simulation, core tools & services, POOL. Blue = available effort, with a 'Now' marker at the current quarter. Future estimate: 20 LCG, 13 IT, 28 EP + experiments.]
Summary of LCG-funded Resources Used – Estimate to End 2002 (experience-weighted FTEs)

CERN Special Funding
  Applications                         8.9
    Persistency                        1.8
    Software Process Support           2.0
    Simulation                         1.8
    Root                               1.4
    Architecture                       0.0
    Grid Interfacing                   1.1
    Training                           0.7
  Fabric                               5.4
    System Management & Operations     1.6
    Development (e.g. Monitoring)      1.8
    Data Storage Management            0.8
    Grid Security                      1.1
  Grid Technology                      3.0
    Data Management                    2.3
    Grid Gatekeeper                    0.7
  Grid Deployment                      4.6
    Int                                2.1
    OPS                                2.5
  Management                           2.1
    LCG                                2.1
EU Funding
  DataGrid                             6.9

Total weighted FTEs in calendar 2002: 30.8
PCAP Review, November 14, 2002 Slide 39
RTAG Conclusions and Recommendations
Use of ROOT as described
Start common project on core tools and services
Start common project on physics interfaces
Start RTAG on analysis, including distributed aspects
Tool/technology recommendations: CLHEP, CINT, Python, Qt, AIDA, …
Develop a clear process for adopting third-party software
PCAP Review, November 14, 2002 Slide 40
Core Libraries and Services Project
Pere Mato (CERN/LHCb) is leading the new Core Libraries and Services (CLS) project
Project being launched now: developing immediate plans over the next week or so, and a full work plan over the next couple of months
Scope: foundation and utility libraries; basic framework services; object dictionary; object whiteboard; system services; grid-enabled services
Many areas of immediate relevance to POOL
A clear process for adopting third-party libraries will be addressed early in this project
PCAP Review, November 14, 2002 Slide 41
Physics Interfaces Project
Launching now, led by Vincenzo Innocente (CMS)
Covers the interfaces and tools by which physicists will directly use the software
Should be treated coherently, hence coverage by a single project
Expected scope once the analysis RTAG concludes: interactive environment, analysis tools, visualization, distributed analysis, grid portals
PCAP Review, November 14, 2002 Slide 42
POOL
Pool of persistent objects for LHC, currently in prototype
Targeted at event data, but not excluding other data
Hybrid technology approach
  object-level data storage using a file-based object store (ROOT)
  RDBMS for metadata: file catalogs, object collections, etc. (MySQL)
Leverages existing ROOT I/O technology and adds value
  transparent cross-file and cross-technology object navigation (a toy sketch of the navigation idea follows below)
  RDBMS integration
  integration with Grid technology (e.g. EDG/Globus replica catalog)
  network- and grid-decoupled working modes
Follows and exemplifies the LCG blueprint approach
  components with well-defined responsibilities
  communicating via public component interfaces
  implementation technology neutral
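To illustrate just the navigation idea (not the POOL API – PersistentRef, Catalog and the token format below are invented for this sketch), here is a self-contained C++ example of a lazy persistent reference: dereferencing consults a catalog and loads the object on demand, so cross-file navigation looks like ordinary pointer use.

```cpp
// Sketch of hybrid-store navigation: a persistent reference holds a
// (file, object) token; nothing is read until it is dereferenced.
// Names and token format are illustrative, not the POOL API.
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct Event { int run, number; };

// Stand-in for the RDBMS file catalog plus the ROOT I/O layer.
class Catalog {
  std::map<std::string, Event> store_;   // token -> object "on disk"
public:
  void put(const std::string& token, Event e) { store_[token] = e; }
  Event load(const std::string& token) { return store_.at(token); }
};

// Lazy-loading reference: load happens on first dereference only.
class PersistentRef {
  Catalog* cat_;
  std::string token_;
  mutable std::unique_ptr<Event> cache_;
public:
  PersistentRef(Catalog* c, std::string t) : cat_(c), token_(std::move(t)) {}
  const Event& operator*() const {
    if (!cache_) cache_ = std::make_unique<Event>(cat_->load(token_));
    return *cache_;
  }
};

int main() {
  Catalog cat;
  cat.put("rootfile.1:ev.7", Event{1042, 7});   // hypothetical token
  PersistentRef ref(&cat, "rootfile.1:ev.7");
  std::cout << "run " << (*ref).run << " event " << (*ref).number << '\n';
}
```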
PCAP Review, November 14, 2002 Slide 43
Pool Release Schedule
End September – V0.1 (released on schedule)
  all core components for navigation exist and interoperate
  assumes ROOT object (TObject) on read and write
End October – V0.2
  first collection implementation
End November – V0.3 (first public release)
  EDG/Globus FileCatalog integrated
  persistency for general C++ classes (not instrumented by ROOT)
  event metadata annotation and query
June 2003 – production release
PCAP Review, November 14, 2002 Slide 44
Four Experiments, Four Viewpoints, Two Paths
Four viewpoints (of course) among the experiments; two basic positions
  ATLAS, CMS, LHCb: similar views; ALICE: differing
The blueprint establishes the basis for a good working relationship among all
  LCG applications software is developed according to the blueprint
  to be developed and used by ATLAS, CMS and LHCb, with ALICE contributing mainly via ROOT
ALICE continues to develop their own line, making direct use of ROOT as their software framework
PCAP Review, November 14, 2002 Slide 45
What next in LCG?
Much to be done in middleware functionality:
  Job Scheduling, Workflow Management, Database access, Replica Management, Monitoring, Global Resource Optimisation, Evolution to Web Services, ….
But we also need an evolution from Research & Development to Engineering:
  effective use of high-bandwidth wide area networks – work on protocols, file transfer
  high quality computer centre services and high quality Grid services – sociology as well as technology
PCAP Review, November 14, 2002 Slide 46
Concluding Remarks
Good engagement and support from the experiments and CERN; determination to make the LCG work
LCG organizational structure seems to be working, albeit slowly
  facilities money problematic; manpower on track
  Grid Technology and Grid Deployment areas still in flux – important for the US to make sure its interests are represented and its experience injected
Applications area going well
  architecture laid out
  working relationship among the four experiments
  many common projects
  first software deliverable (POOL) on schedule