Grids for the LHC
Paula Eerola, Lund University, Sweden
Four Seas Conference, Istanbul, 5-10 September 2004
Acknowledgement: much of the material is from Ian Bird, Lepton-Photon Symposium 2003, Fermilab.
Outline
– Introduction
  – What is a Grid?
  – Grids and high-energy physics?
– Grid projects
  – EGEE
  – NorduGrid
– LHC Computing Grid project
  – Using grid technology to access and analyze LHC data
– Outlook
Introduction
What is a Grid?
About the Grid
– WEB: get information on any computer in the world
– GRID: get CPU resources, disk resources, tape resources on any computer in the world
– The Grid needs advanced software, middleware, which connects the computers together
– The Grid is the future infrastructure of computing and data management
Short history
– 1996: start of the Globus project for connecting US supercomputers together (funded by the US Defense Advanced Research Projects Agency...)
– 1998: early Grid testbeds in the USA - supercomputing centers connected together
– 1998: Ian Foster, Carl Kesselman: "The Grid: Blueprint for a New Computing Infrastructure"
– 2000-: PC capacity increases, prices drop, supercomputers become obsolete; the Grid focus moves from supercomputers to PC clusters
– 1990's - WEB, 2000's - GRID? Huge commercial interests: IBM, HP, Intel, ...
Grid prerequisites
– Powerful PCs are cheap
– PC clusters are everywhere
– Networks are improving even faster than CPUs
– Network & storage & computing exponentials:
  – CPU performance (# transistors) doubles every 18 months
  – Data storage (bits per area) doubles every 12 months
  – Network capacity (bits per sec) doubles every 9 months
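The three doubling times above imply very different growth over a decade. A small Python sketch (illustrative only, not from the slides) makes the gap concrete:

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Capacity multiplier after `months`, for a quantity that doubles
    every `doubling_time_months` months."""
    return 2.0 ** (months / doubling_time_months)

# Growth over one decade (120 months) at the doubling rates quoted above
for name, doubling in [("CPU", 18), ("storage", 12), ("network", 9)]:
    print(f"{name:7s} x{growth_factor(120, doubling):.0f}")
```

Over ten years, network capacity grows roughly a hundred times more than CPU performance (about 2^13.3 versus 2^6.7), which is why shipping jobs and data across the network to remote clusters becomes attractive.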
Grids and high-energy physics?
– The Large Hadron Collider, LHC, starts in 2007
– 4 experiments - ATLAS, CMS, ALICE, LHCb - with physicists from all over the world
– LHC computing = data processing, data storage, production of simulated data
– LHC computing is of unprecedented scale
Massive data flow: the 4 experiments are accumulating 5-8 PetaBytes of data/year.
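For scale, 5-8 PB/year corresponds to a sustained average rate of roughly 160-250 MB/s. A quick back-of-the-envelope check (assuming 1 PB = 10^15 bytes):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # ~3.15e7 s

def avg_rate_mb_per_s(pb_per_year: float) -> float:
    """Average sustained data rate in MB/s implied by a yearly volume in PB."""
    return pb_per_year * 1e15 / SECONDS_PER_YEAR / 1e6

print(f"{avg_rate_mb_per_s(5):.0f}-{avg_rate_mb_per_s(8):.0f} MB/s")  # ~159-254 MB/s
```

This is the yearly average; peak rates out of the detectors are higher, hence the hundreds of MB/s quoted for the Tier 0 later in the talk.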
Needed capacity:
– Storage: 10 PetaBytes of disk and tape
– Processing: 100,000 of today's fastest PCs
– World-wide data analysis: physicists are located in all the continents
Computing must be distributed for many reasons:
– Not feasible to put all the capacity in one place
– Political, economic, staffing: easier to get funding for resources at the home country
– Faster access to data for all physicists around the world
– Better sharing of computing resources required by physicists
LHC Computing Hierarchy

Tier 0 = CERN. Tier 0 receives raw data from the experiments (at ~100-1500 MBytes/s) and records them on permanent mass storage (PBs of disk; tape robot). First-pass reconstruction of the data, producing summary data.

Tier 1 Centres = large computer centers (about 10), e.g. the FNAL, IN2P3, INFN and RAL centers. Tier 1's provide permanent storage and management of raw, summary and other data needed during the analysis process.

Tier 2 Centres = smaller computer centers (several 10's). Tier 2 Centres provide disk storage and concentrate on simulation and end-user analysis.

Below the Tier 2 centres sit the institutes, with workstations and physics data caches.
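The tier model can be pictured as a tree with the roles above attached to each level. In the sketch below, the Tier 1 names are the four shown on the slide, while the Tier 2 names and the fan-out of three per Tier 1 are placeholders for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tier: int
    roles: list
    children: list = field(default_factory=list)

# Tier 0: raw data recording and first-pass reconstruction at CERN
tier0 = Site("CERN", 0, ["record raw data", "first-pass reconstruction"])

for t1_name in ("FNAL", "IN2P3", "INFN", "RAL"):
    t1 = Site(t1_name, 1, ["permanent storage of raw/summary data"])
    tier0.children.append(t1)
    # hypothetical fan-out: each Tier 1 serving three Tier 2 centres
    t1.children.extend(
        Site(f"{t1_name}-T2-{i}", 2, ["simulation", "end-user analysis"])
        for i in range(3)
    )

def n_sites(site: Site) -> int:
    """Count the sites in the subtree rooted at `site`."""
    return 1 + sum(n_sites(c) for c in site.children)

print(n_sites(tier0))  # 17 in this toy tree: 1 Tier 0 + 4 Tier 1 + 12 Tier 2
```

The point of the hierarchy is that each level pushes data outward and specializes: permanent custody at the top, simulation and end-user analysis toward the leaves.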
Grid technology as a solution

Grid technology can provide optimized access to and use of the computing and storage resources.

Several HEP experiments currently running (BaBar, CDF/D0, STAR/PHENIX), with significant data and computing requirements, have already started to deploy grid-based solutions.

Grid technology is not yet an off-the-shelf product: it requires development of middleware, protocols, services, ...

Grid development and engineering projects: EDG, EGEE, NorduGrid, Grid3, ...
Grid projects
US, Asia, Australia

USA:
– NASA Information Power Grid
– DOE Science Grid
– NSF National Virtual Observatory
– NSF GriPhyN
– DOE Particle Physics Data Grid
– NSF TeraGrid
– DOE ASCI Grid
– DOE Earth Systems Grid
– DARPA CoABS Grid
– NEESGrid
– DOH BIRN
– NSF iVDGL
– ...

Asia, Australia:
– Australia: ECOGRID, GRIDBUS, ...
– Japan: BIOGRID, NAREGI, ...
– South Korea: National Grid Basic Plan, Grid Forum Korea, ...
Europe
– EGEE
– NorduGrid
– EDG, LCG
– UK GridPP
– INFN Grid, Italy
– Cross-grid projects in order to link together Grid projects

Many Grid projects have particle physics as the initiator; other fields are joining in: healthcare, bioinformatics, ...

They address different aspects of grids:
– Middleware
– Infrastructure
– Networking, cross-Atlantic interoperation
PARTNERS
70 partners organized in nine regional federations
Coordinating and Lead Partner: CERN
CENTRAL EUROPE – FRANCE - GERMANY & SWITZERLAND – ITALY - IRELAND & UK - NORTHERN EUROPE - SOUTH-EAST EUROPE - SOUTH-WEST EUROPE – RUSSIA - USA
STRATEGY
– Leverage current and planned national and regional Grid programmes
– Build on existing investments in Grid Technology by EU and US
– Exploit the international dimensions of the HEP-LCG programme
– Make the most of planned collaboration with NSF CyberInfrastructure initiative

A seamless international Grid infrastructure to provide researchers in academia and industry with a distributed computing facility

ACTIVITY AREAS

SERVICES: deliver "production level" grid services (manageable, robust, resilient to failure); ensure security and scalability

MIDDLEWARE: professional Grid middleware re-engineering activity in support of the production services

NETWORKING: proactively market Grid services to new research communities in academia and industry; provide necessary education
EGEE: goals and partners
– Create a European-wide Grid Infrastructure for the support of research in all scientific areas, on top of the EU Research Network infrastructure (GEANT)
– Integrate regional grid efforts

9 regional federations covering 70 partners in 26 countries
http://public.eu-egee.org/
EGEE project
– Project funded by EU FP6, 32 MEuro for 2 years
– Project start 1 April 2004
– Activities:
  – Grid Infrastructure: provide a Grid service for science research
  – Next generation of Grid middleware, gLite
  – Dissemination, training and applications (initially HEP & Bio)
EGEE: timeline
Grid in Scandinavia: the NorduGrid Project
Nordic Testbed for Wide Area Computing and Data Handling
www.nordugrid.org
NorduGrid: original objectives and current status

Goals 2001 (project start):
– Introduce the Grid to Scandinavia
– Create a Grid infrastructure in Nordic countries
– Apply available Grid technologies/middleware
– Operate a functional Testbed
– Expose the infrastructure to end-users of different scientific communities

Status 2004:
– The project has grown world-wide: nodes in Germany, Slovenia, Australia, ...
– 39 nodes, 3500 CPUs
– Created its own NorduGrid middleware, ARC (Advanced Resource Connector), which is operating in a stable way
– Applications: massive production of ATLAS simulation and reconstruction
– Other applications: AMANDA simulation, genomics, bioinformatics, visualization (for meteorological data), multimedia applications, ...
Current NorduGrid status
The LHC Computing Grid, LCG
The distributed computing environment to analyse the LHC data
lcg.web.cern.ch
LCG - goals

Goal: prepare and deploy the computing environment that will be used to analyse the LHC data.

Phase 1: 2003-2005
– Build a service prototype
– Gain experience in running a production grid service

Phase 2: 2006-2008
– Build and commission the initial LHC computing environment
[Timeline 2003-2006, milestones: event simulation productions; LCG full multi-tier prototype batch+interactive service; LCG service opens; Technical Design Report for Phase 2; LCG with upgraded m/w, management etc.]
LCG composition and tasks

The LCG Project is a collaboration of
– The LHC experiments
– The Regional Computing Centres
– Physics institutes

Development and operation of a distributed computing service:
– computing and storage resources in computing centres, physics institutes and universities around the world
– reliable, coherent environment for the experiments

Support for applications:
– provision of common tools, frameworks, environment, data persistency
Resource targets '04

Country        CPU (kSI2K)   Disk (TB)   Support (FTE)   Tape (TB)
CERN                   700         160            10.0        1000
Czech Rep.              60           5             2.5           5
France                 420          81            10.2         540
Germany                207          40             9.0          62
Holland                124           3             4.0          12
Italy                  507          60            16.0         100
Japan                  220          45             5.0         100
Poland                  86           9             5.0          28
Russia                 120          30            10.0          40
Taiwan                 220          30             4.0         120
Spain                  150          30             4.0         100
Sweden                 179          40             2.0          40
Switzerland             26           5             2.0          40
UK                    1656         226            17.3         295
USA                    801         176            15.5        1741
Total                 5600        1169           120.0        4223
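The per-country rows are easy to query once held in a small dictionary. The numbers below are copied from the table above (hypothetical convenience code, not part of the talk):

```python
# country -> (CPU kSI2K, Disk TB, Support FTE, Tape TB), from the table above
targets = {
    "CERN":        (700, 160, 10.0, 1000),
    "Czech Rep.":  (60, 5, 2.5, 5),
    "France":      (420, 81, 10.2, 540),
    "Germany":     (207, 40, 9.0, 62),
    "Holland":     (124, 3, 4.0, 12),
    "Italy":       (507, 60, 16.0, 100),
    "Japan":       (220, 45, 5.0, 100),
    "Poland":      (86, 9, 5.0, 28),
    "Russia":      (120, 30, 10.0, 40),
    "Taiwan":      (220, 30, 4.0, 120),
    "Spain":       (150, 30, 4.0, 100),
    "Sweden":      (179, 40, 2.0, 40),
    "Switzerland": (26, 5, 2.0, 40),
    "UK":          (1656, 226, 17.3, 295),
    "USA":         (801, 176, 15.5, 1741),
}

# largest pledged CPU and tape contributions
print(max(targets, key=lambda c: targets[c][0]))  # UK
print(max(targets, key=lambda c: targets[c][3]))  # USA
```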
LCG status Sept '04

Tier 0: CERN

Tier 1 Centres: Brookhaven, CNAF Bologna, PIC Barcelona, Fermilab, FZK Karlsruhe, IN2P3 Lyon, Rutherford (UK), Univ. of Tokyo, CERN

Tier 2 centers: South-East Europe (HellasGrid, AUTH, Tel-Aviv, Weizmann), Budapest, Prague, Krakow, Warsaw, Moscow region, Italy, ...
LCG status Sept '04
– First production service for LHC experiments operational
  – Over 70 centers, over 6000 CPUs, although many of these sites are small and cannot run big simulations
  – LCG-2 middleware: testing, certification, packaging, configuration, distribution and site validation
– Grid operations centers in RAL and Taipei (+US): performance monitoring, problem solving, 24x7 globally
– Grid call centers in FZK Karlsruhe and Taipei
– Progress towards inter-operation between LCG, NorduGrid, Grid3 (US)
Outlook
EU vision of e-infrastructure in Europe
Moving towards an e-infrastructure

[Diagram: the GÉANT research network with IPv6, Grids, and Grids middleware as building blocks of the e-infrastructure]
Moving towards an e-infrastructure

[Diagram: Grids middleware combined into a Grid-empowered e-infrastructure - "all in one" e-Infrastructure]
Summary
– Huge investment in e-science and Grids in Europe: regional, national, cross-national, EU
– Emerging vision of a European-wide e-science infrastructure for research
– High Energy Physics is a major application that needs this infrastructure today and is pushing the limits of the technology