Computing development projects - GRIDS
M. Turala
The Henryk Niewodniczanski Institute of Nuclear Physics PAN
and the Academic Computing Center Cyfronet AGH, Kraków
Warszawa, 25 February 2005
Outline
- computing requirements of the future HEP experiments
- HEP world-wide computing models and related grid projects
- Polish computing projects: PIONIER and GRIDS
- Polish participation in the LHC Computing Grid (LCG) project
Data preselection in real time
- many different physics processes
- several levels of filtering
- high efficiency for events of interest
- total reduction factor of about 10^7
LHC data rate and filtering
[Figure: the trigger and data-acquisition chain]
- collision rate: 40 MHz (equivalent to ~1000 TB/s)
- Level 1 (special hardware): output 75 kHz (75 GB/s, fully digitised)
- Level 2 (embedded processors / farm): output 5 kHz (5 GB/s)
- Level 3 (farm of commodity CPUs): output 100 Hz (100 MB/s)
- data recording and offline analysis
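As a cross-check, the per-level bandwidths quoted in the figure are simply the output event rates multiplied by an event size of roughly 1 MB (the figure given on the next slide); a minimal back-of-envelope sketch:

```python
# Back-of-envelope check of the per-level data rates quoted above,
# assuming an event size of ~1 MB (see the next slide).
EVENT_SIZE_MB = 1.0

levels = {
    "Level 1 output": 75_000,  # Hz
    "Level 2 output": 5_000,   # Hz
    "Level 3 output": 100,     # Hz
}

for name, rate_hz in levels.items():
    print(f"{name}: {rate_hz} Hz x {EVENT_SIZE_MB} MB = "
          f"{rate_hz * EVENT_SIZE_MB / 1000:.1f} GB/s")

# Level 1 output: 75000 Hz x 1.0 MB = 75.0 GB/s
# Level 2 output: 5000 Hz x 1.0 MB = 5.0 GB/s
# Level 3 output: 100 Hz x 1.0 MB = 0.1 GB/s  (i.e. ~100 MB/s to storage)
```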
Data rate for LHC p-p events
Typical parameters:
- nominal rate: 10^9 events/s (luminosity 10^34 cm^-2 s^-1, collision rate 40 MHz)
- registration rate: ~100 events/s (270 events/s)
- event size: ~1 MB/event (2 MB/event)
- running time: ~10^7 s/year
- raw data volume: ~2 PB/year/experiment
- Monte Carlo data: ~1 PB/year/experiment
The rate and volume of HEP data double every 12 months!
Already today the BaBar, Belle, CDF and D0 experiments produce about 1 TB/day.
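The yearly volume and the online reduction factor follow directly from these parameters; a small sketch of the arithmetic (values taken from the list above):

```python
# Rough arithmetic behind the numbers on this slide.
nominal_rate = 1e9        # events/s produced in the detector
registration_rate = 100   # events/s written to storage
event_size_mb = 2.0       # MB/event (upper figure quoted above)
running_time_s = 1e7      # s/year of data taking

reduction_factor = nominal_rate / registration_rate
raw_volume_pb = registration_rate * event_size_mb * running_time_s / 1e9  # MB -> PB

print(f"online reduction factor ~ {reduction_factor:.0e}")          # ~1e+07
print(f"raw data volume ~ {raw_volume_pb:.0f} PB/year/experiment")  # ~2 PB
```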
Data analysis scheme (one experiment; from M. Delfino)
[Figure: data-flow diagram for one experiment - Detector -> Event Filter (selection & reconstruction) -> raw data -> Event Reconstruction -> Event Summary Data -> Batch Physics Analysis -> Processed Data / analysis objects -> Interactive Data Analysis by thousands of scientists, with Event Simulation feeding the chain. Indicative figures per experiment: raw data ~1 PB/year, ~500 TB of Event Summary Data, ~200 TB/year of processed data; data rates ranging from ~100-200 MB/s through 0.1-1 GB/s up to 1-100 GB/s (and 64 GB/s at the detector end); CPU needs of roughly 35K to 350K SI95 per processing stage.]
Multi-tier model of data analysis
LHC computing model (Cloud)
[Figure: the LHC Computing Centre at CERN serves national and regional Tier 1 centres (Germany, France, Italy, UK, NL, USA-FermiLab, USA-Brookhaven, ...), which in turn serve Tier 2 centres at laboratories (Lab a, b, c, m) and universities (Uni a, b, n, x, y), down to physics-department and desktop resources.]
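Purely as an illustration of the topology in the figure (the tier assignments follow the slide; the data structure and helper below are hypothetical), the hierarchy can be written down as a simple mapping:

```python
# Illustrative sketch only: tier assignments follow the figure,
# the code itself is hypothetical.
tiers = {
    "Tier 0": ["CERN (the LHC Computing Centre)"],
    "Tier 1": ["Germany", "France", "Italy", "UK", "NL",
               "USA (FermiLab)", "USA (Brookhaven)", "..."],
    "Tier 2": ["regional centres at laboratories and universities"],
    "Tier 3/4": ["physics departments, desktops"],
}

def print_fanout(hierarchy):
    """List how data fan out from CERN down to the desktop."""
    for tier, centres in hierarchy.items():
        print(f"{tier}: {', '.join(centres)}")

print_fanout(tiers)
```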
ICFA Network Task Force (1998): required network bandwidth (Mbps)

                                                       1998                2000           2005
BW utilized per physicist (and peak BW used)           0.05-0.25 (0.5-2)   0.2-2 (2-10)   0.8-10 (10-100)
BW utilized by a university group                      0.25-10             1.5-45         34-622
BW to a home laboratory or regional centre             1.5-45              34-155         622-5000
BW to a central laboratory housing major experiments   34-155              155-622        2500-10000
BW on a transoceanic link                              1.5-20              34-155         622-5000

A 100-1000x bandwidth increase was foreseen for 1998-2005.
See the ICFA-NTF Requirements Report: http://l3www.cern.ch/~newman/icfareq98.html
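For context only (this calculation is not from the slide), converting a few of the foreseen link capacities into the time needed to move a 1 TB dataset shows why multi-hundred-Mbps links were called for:

```python
# Hypothetical illustration: time to move a 1 TB dataset over links of
# the sizes foreseen in the table above.
dataset_tb = 1.0
for bw_mbps in (155, 622, 2500, 10000):
    seconds = dataset_tb * 8e6 / bw_mbps  # 1 TB = 8e6 Mbit
    print(f"{bw_mbps:>5} Mbps: {seconds / 3600:.1f} h per TB")

#   155 Mbps: 14.3 h per TB
#   622 Mbps: 3.6 h per TB
#  2500 Mbps: 0.9 h per TB
# 10000 Mbps: 0.2 h per TB
```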
LHC computing - specifications for Tier0 and Tier1

CERN (Tier 0)          ALICE   ATLAS   CMS     LHCb
CPU (kSI95)            824     690     820     225
Disk pool (TB)         535     410     1143    330
Aut. tape (TB)         3200    8959    1540    912
Shelf tape (TB)        -       -       2632    310
Tape I/O (MB/s)        1200    800     800     400
Cost 2005-07 (MCHF)    18.1    23.7    23.1    7.0

Tier 1 (per centre)    ALICE   ATLAS   CMS     LHCb
CPU (kSI95)            234     209     417     140
Disk pool (TB)         273     360     943     150
Aut. tape (TB)         400     1839    590     262
Shelf tape (TB)        -       -       683     55
Tape I/O (MB/s)        1200    800     800     400
Number of Tier 1s      4       6       5       5
Cost average (MCHF)    7.1     8.5     13.6    4.0
Development of Grid projects
EU FP5 Grid Projects 2000-2004 (EU funding: 58 M€), from M. Lemke at CGW04
[Figure: timeline of the projects listed below (plus AVO), with start dates between October 2000 and October 2002, arranged by applications, middleware and infrastructure.]
- Infrastructure: DataTAG
- Computing: EuroGrid, DataGrid, DAMIEN
- Tools and middleware: GridLab, GRIP
- Applications: EGSO, CrossGrid, BioGrid, FlowGrid, MOSES, COG, GEMSS, GRACE, MammoGrid, OpenMolGrid, SeLeNe
- P2P / ASP / Web services: P2People, ASP-BP, GRIA, MMAPS, GRASP, GRIP, WEBSI
- Clustering: GridStart
Strong Polish Participation in FP5 Grid Research Projects (from M. Lemke at CGW04)
2 Polish-led projects (out of 12):
- CrossGrid: CYFRONET Cracow, ICM Warsaw, PSNC Poznan, INP Cracow, INS Warsaw
- GridLab: PSNC Poznan
Significant share of funding to Poland versus EU25:
- FP5 IST Grid research funding: 9.96 %
- FP5 wider IST Grid project funding: 5 %
- GDP: 3.8 %
- population: 8.8 %
[Figure: map of CrossGrid partners]
CrossGrid testbeds
[Figure: map of testbed sites]
- 16 sites in 10 countries, about 200 processors and 4 TB of disk storage
- testbeds for development, production, testing, tutorials and external users
- middleware: from EDG 1.2 to LCG-2.3.0
Last week CrossGrid successfully concluded its final review.
CrossGrid applications
- Medical: blood flow simulation, supporting vascular surgeons in the treatment of arteriosclerosis
- Flood prediction: flood prediction and simulation based on weather forecasts and geographical data
- Physics: distributed data mining in high energy physics, supporting the LHC collider experiments at CERN
- Meteo / pollution: large-scale weather forecasting combined with air pollution modelling (for various pollutants)
Grid for real-time data filtering
[Figure: map of PIONIER network nodes]
Studies on a possible use of remote computing farms for event filtering; in 2004 beam-test data were shipped to Cracow, and back to CERN, in real time.
LHC Computing Grid project (LCG)
Objectives: design, prototyping and implementation of the computing environment for the LHC experiments (Monte Carlo production, reconstruction and data analysis):
- infrastructure
- middleware
- operations (VO)
Schedule:
- phase 1 (2002-2005, ~50 MCHF): R&D and prototyping (up to 30% of the final size)
- phase 2 (2006-2008): preparation of a Technical Design Report and Memoranda of Understanding, deployment (2007)
Coordination:
- Grid Deployment Board: representatives of the world HEP community, supervising LCG grid deployment and testing
Computing Resources - Dec. 2004 (from F. Gagliardi at CGW04)
[Figure: map of countries providing resources and countries anticipating joining EGEE/LCG]
In EGEE-0 (LCG-2): 91 sites, >9000 CPUs, ~5 PB of storage.
Three Polish institutions involved:
- ACC Cyfronet Cracow
- ICM Warsaw
- PSNC Poznan
Polish investment in the local infrastructure; EGEE supporting the operations.
Polish Participation in the LCG project
Polish Tier2:
- INP / ACC Cyfronet Cracow
  - resources (plans for 2004): 128 processors (50%); storage: ~10 TB disk, ~10 TB tape (UniTree) (?)
  - manpower: engineers/physicists, ~1 FTE + 2 FTE (EGEE)
  - ATLAS data challenges: qualified in 2002
- INS / ICM Warsaw
  - resources (plans for 2004): 128 processors (50%); storage: ~10 TB disk, ~10 TB tape
  - manpower: engineers/physicists, ~1 FTE + 2 FTE (EGEE)
  - connected to the LCG-1 world-wide testbed in September 2003
Polish networking - PIONIER
(from the report of PSNC to ICFA ICIC, Feb. 2004, M. Przybylski)
- 5200 km of fibre installed, connecting 21 MAN centres
- multi-lambda connections planned
- good connectivity of HEP centres to MANs:
  - IFJ PAN to MAN Cracow: 100 Mb/s, being upgraded to 1 Gb/s
  - INS to MAN Warsaw: 155 Mb/s
[Figure: map of PIONIER nodes and installed fibre, with fibres and nodes planned for 2004; external connections to GEANT, Stockholm and Prague]
PC Linux cluster at ACC Cyfronet (CrossGrid - LCG-1)
- 4 nodes (1U): 2x PIII 1 GHz, 512 MB RAM, 40 GB HDD, 2x Fast Ethernet 100 Mb/s
- 23 nodes (1U): 2x Xeon 2.4 GHz, 1 GB RAM, 40 GB HDD, Ethernet 100 Mb/s + 1 Gb/s
- HP ProCurve switch: 40 ports at 100 Mb/s, 1 port at 1 Gb/s (uplink)
- monitoring: 1U unit with KVM, keyboard, touch pad and LCD
- 100 Mb Ethernet connection to the Internet
Last year 40 nodes with IA-64 processors were added; for 2005, investments in 140 Linux 32-bit processors and 20-40 TB of disk storage are planned.
ACC Cyfronet in LCG-1
Sept. 2003: sites taking part in the initial LCG service (red dots), among them Kraków (Poland) and Karlsruhe (Germany).
This was the very first really running global computing and data Grid, covering participants on three continents: small test clusters at 14 institutions with a Grid middleware package (mainly parts of EDG and VDT) formed a global Grid testbed.
from K.-P. Mickel at CGW03
Linux cluster at INS / ICM (CrossGrid - EGEE - LCG), from K. Nawrocki
Present state:
- cluster at the Warsaw University (Physics Department)
- worker nodes: 10 CPUs (Athlon 1.7 GHz)
- storage element: ~0.5 TB
- network: 155 Mb/s
- LCG 2.3.0, registered in the LCG Test Zone
Near future (to be ready in June 2005):
- cluster at the Warsaw University (ICM)
- worker nodes: 100-180 CPUs (64-bit)
- storage element: ~9 TB
- network: 1 Gb/s (PIONIER)
PC Linux cluster at ACC Cyfronet (CrossGrid - EGEE - LCG)
LCG cluster at ACC Cyfronet, statistics for 2004:

Experiment   CPU time [hours]   Wall time [hours]   CPU time [s]   Wall time [s]
ATLAS        20493.2            25000.3             73775472       90001024
ALICE        2327.7             3574.6              8379585        12868726
LHCb         10471.8            11674.4             37698467       42027750
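One way to read the table is as a CPU efficiency, i.e. CPU time divided by wall time, for each experiment; a small sketch using the figures in seconds:

```python
# CPU efficiency (CPU time / wall time) from the 2004 Cyfronet statistics above.
usage = {  # experiment: (CPU time [s], wall time [s])
    "ATLAS": (73_775_472, 90_001_024),
    "ALICE": (8_379_585, 12_868_726),
    "LHCb":  (37_698_467, 42_027_750),
}

for exp, (cpu, wall) in usage.items():
    print(f"{exp}: {cpu / 3600:,.0f} CPU-hours, efficiency {cpu / wall:.0%}")

# ATLAS: 20,493 CPU-hours, efficiency 82%
# ALICE: 2,328 CPU-hours, efficiency 65%
# LHCb: 10,472 CPU-hours, efficiency 90%
```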
ATLAS Data Challenge (DC2) status (from L. Robertson at C-RRB, Oct. 2004)
ATLAS DC2 CPU usage by grid: LCG 41%, NorduGrid 30%, Grid3 29%.
DC2 Phase I started at the beginning of July 2004 and is finishing now; three Grids were used:
- LCG (~70 sites, up to 7600 CPUs)
- NorduGrid (22 sites, ~3280 CPUs (800), ~14 TB)
- Grid3 (28 sites, ~2000 CPUs)
Totals: ~1350 kSI2k.months, ~120,000 jobs, ~10 million events fully simulated (Geant4), ~27 TB of data.
All three Grids have been proven to be usable for a real production.
[Pie chart: ATLAS DC2 on LCG, September, with per-site shares of 0-14% across about 30 sites, including at.uibk, ca.triumf, ca.ualberta, ca.umontreal, ca.utoronto, ch.cern, cz.golias, cz.skurut, de.fzk, es.ifae, es.ific, es.uam, fr.in2p3, it.infn.cnaf, it.infn.lnl, it.infn.mi, it.infn.na, it.infn.roma, it.infn.to, it.infn.lnf, jp.icepp, nl.nikhef, pl.zeus, ru.msu, tw.sinica, uk.bham, uk.ic, uk.lancs, uk.man and uk.rl]
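A few simple averages are implied by these totals; a back-of-envelope sketch (the conversion of kSI2k.months to hours assumes 30-day months):

```python
# Simple averages implied by the ATLAS DC2 totals quoted above.
events = 10e6            # fully simulated events
jobs = 120_000
data_tb = 27.0
cpu_ksi2k_months = 1350

print(f"~{events / jobs:.0f} events per job")                    # ~83
print(f"~{data_tb * 1e6 / events:.1f} MB per simulated event")   # ~2.7 MB
print(f"~{cpu_ksi2k_months / jobs * 30 * 24:.1f} kSI2k-hours per job")  # ~8.1
```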
Polish LHC Tier2 - future
"In response to the LCG MoU draft document, and using data from the PASTA report, plans for the Polish Tier2 infrastructure have been prepared; they are summarized in the table below.

                    2001   2002   2003   2004    2005    2006    2007    2008    2009    2010
CPU (kSI2000)              4      20     40      100     150     500     1000    1500    2000
Disk, LHC (TB)             1      5      10      20      30      100     300     500     600
Tape, LHC (TB)             5      5      10      10      20      50      100     180     180
WAN (Mb/s)          34     155    622    10000   10000   10000   10000   20000   20000   20000
Manpower (FTE)             2      3      4       4       5       6       6       6       6

It is planned that in the next few years the LCG resources will grow incrementally, mainly through local investments. A step is expected around 2007, when the matter of LHC computing funding should finally be resolved."
from the report to LCG GDB, 2004
Thank you for your attention