LHCOPN Status and Plans
1
LHCOPN Status and Plans
David Foster
Head, Communications and Networks, CERN
Joint Techs, Hawaii
January 2008
2
Acknowledgments
Many presentations and materials in the public domain have contributed to this presentation, too numerous to mention individually.
3
LHC
[Aerial photo: the LHC ring, with Mont Blanc (4810 m) in the background and downtown Geneva.]
4 CERN – March 2007
• 26 659 m in circumference
• SC magnets pre-cooled to -193.2°C (80 K) using 10 080 tonnes of liquid nitrogen
• 60 tonnes of liquid helium bring them down to -271.3°C (1.9 K)
• 600 million proton collisions per second
• The internal pressure of the LHC is 10⁻¹³ atm, ten times less than the pressure on the Moon
5
CERN’s Detectors
• To observe the collisions, collaborators from around the world are building four huge experiments: ALICE, ATLAS, CMS, LHCb
• Detector components are constructed all over the world
• Funding comes mostly from the participating institutes, less than 20% from CERN
[Diagram: locations of the CMS, ALICE, ATLAS and LHCb detectors around the ring.]
6
The LHC Computing Challenge
• Signal/noise: 10⁻⁹
• Data volume: high rate × large number of channels × 4 experiments → 15 PetaBytes of new data each year
• Compute power: event complexity × number of events × thousands of users → 100k of today's fastest CPUs
• Worldwide analysis & funding: computing funded locally in major regions & countries; efficient analysis everywhere → GRID technology
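A quick sanity check on what these figures imply for the network: assuming (purely for illustration) that the 15 PB of new data per year is exported at a steady rate, the sustained bandwidth works out to a few Gbit/s, which is the scale the 10G LHCOPN links are provisioned for:

```python
# Back-of-envelope: sustained rate implied by 15 PB/year.
# The even spreading over the year is an assumption; real traffic is
# bursty, and provisioning must also cover "catch-up" after outages.
PB = 10**15  # petabyte in bytes (decimal convention assumed)

annual_volume_bytes = 15 * PB
seconds_per_year = 365 * 24 * 3600

avg_rate_bytes_s = annual_volume_bytes / seconds_per_year
avg_rate_gbit_s = avg_rate_bytes_s * 8 / 1e9

print(f"average export rate: {avg_rate_gbit_s:.1f} Gbit/s")  # roughly 3.8 Gbit/s
```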
7 CERN – March 2007
8 CERN – March 2007
10
The WLCG Distribution of Resources
Tier-0 – the accelerator centre
• Data acquisition and initial processing of raw data
• Distribution of data to the different Tiers
Tier-1 (11 centres) – “online” to the data acquisition process; high availability
• Managed mass storage – grid-enabled data service
• Data-heavy analysis
• National, regional support
Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois), Brookhaven (NY)
Tier-2 – ~200 centres in ~40 countries
• Simulation
• End-user analysis – batch and interactive
11
Centers around the world form a Supercomputer
• The EGEE and OSG projects are the basis of the Worldwide LHC Computing Grid Project (WLCG)
• Inter-operation between Grids is working!
12
Tier-1 Centers: TRIUMF (Canada); GridKa (Germany); IN2P3 (France); CNAF (Italy); SARA/NIKHEF (NL); Nordic Data Grid Facility (NDGF); ASCC (Taipei); RAL (UK); BNL (US); FNAL (US); PIC (Spain)
The Grid is now in operation, working on: reliability, scaling up, sustainability
13
Guaranteed bandwidth can be a good thing
14
LHCOPN Mission
• To assure the T0-T1 transfer capability.
  • Essential for the Grid to distribute data out to the T1s.
  • Capacity must be large enough to deal with most situations, including “catch-up”.
• The excess capacity can be used for T1-T1 transfers.
  • Lower priority than T0-T1.
  • May not be sufficient for all T1-T1 requirements.
• Resiliency objective
  • No single failure should cause a T1 to be isolated.
  • Infrastructure can be improved.
• Naturally started as an unprotected “star” – insufficient for a production network, but enabled rapid progress.
• Has become a reason for, and has leveraged, cross-border fiber.
  • Excellent side effect of the overall approach.
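The resiliency objective is mechanically checkable: for every link, remove it and verify each T1 is still reachable from the T0. A minimal sketch, using a small hypothetical topology (not the real LHCOPN link list) in which cross-border fibre provides the backup paths:

```python
from collections import defaultdict, deque

# Hypothetical topology for illustration only: a star out of CERN plus
# two cross-border fibre (CBF) links between Tier-1s.
links = [
    ("CERN", "DE-KIT"), ("CERN", "IT-INFN-CNAF"), ("CERN", "NL-T1"),
    ("DE-KIT", "IT-INFN-CNAF"),   # CBF backup
    ("DE-KIT", "NL-T1"),          # CBF backup
]
tier1s = ["DE-KIT", "IT-INFN-CNAF", "NL-T1"]

def reachable(links, src):
    """Set of nodes reachable from src, by breadth-first search."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen

def isolated_by_single_failure(links, tier1s):
    """List of (failed link, isolated T1) pairs; empty means objective met."""
    bad = []
    for i in range(len(links)):
        remaining = links[:i] + links[i + 1:]
        ok = reachable(remaining, "CERN")
        bad += [(links[i], t1) for t1 in tier1s if t1 not in ok]
    return bad

print(isolated_by_single_failure(links, tier1s))  # [] -> no T1 isolated
```

With the CBF links removed, the same check immediately flags the pure star as failing the objective, which is exactly the weakness of the 2004 starting point described later in the talk.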
15
LHCOPN Design Information
• All technical content is on the LHCOPN Twiki: http://lhcopn.cern.ch
• Coordination process
  • LHCOPN meetings (every 3 months)
  • Active working groups
    – Routing
    – Monitoring
    – Operations
• Active interfaces to external networking activities
  • European Network Policy Groups
  • US Research Networking
  • Grid Deployment Board
  • LCG Management Board
  • EGEE
16 CERN – March 2007
17
CERN External Network Links
[Network diagram: CH-CERN (Tier-0) connects via the CIXP and the CERN WAN network to SWITCH, ISPs (COLT, Interoute, Global Crossing), GÉANT2, Equinix-TIX, the WHO and CITIC74 CICs, local Tier-2s (TIFR, UniGeneva), Russian Tier-2s via RIPN, USLHCnet (Chicago – NYC – Amsterdam), and over the LHCOPN to the Tier-1s: CA-TRIUMF, ES-PIC, DE-KIT, FR-CCIN2P3, IT-INFN-CNAF, NDGF, NL-T1, TW-ASGC, UK-T1-RAL, US-T1-BNL, US-FNAL-CMS. Link capacities range from 100 Mbps to 40 Gbps.]
18
[Router-level diagram: CERN External Network E513-E (AS513) – CIXP peerings with SWITCH (AS559), GÉANT2 (AS20965), Akamai (AS21357) and Reuters; commodity Internet via COLT (AS8220), Level3 (AS3356) and Global Crossing (AS3549); WHO (158.232.0.0/16) and CITIC74 (195.202.0.0/20) customer networks; RIPE RIS boxes and the I-root and K-root DNS servers; GÉANT2 E2E links; TIX; and USLHCnet (AS1297, 192.65.196.0/23) PoPs at Geneva, Chicago (StarLight), New York (MAN LAN) and Amsterdam, with connections to FNAL (AS3152), ESnet (AS293) and Abilene (AS11537). Last update: 2007-08-01.]
19
Transatlantic Link Negotiations Yesterday
A major provider lost their shirt on this deal!
20
LHCOPN Architecture 2004 Starting Point
• Tier-2s and Tier-1s are inter-connected by the general-purpose research networks.
• Any Tier-2 may access data at any Tier-1.
[Diagram: the Tier-1s (IN2P3, TRIUMF, ASCC, FNAL, BNL, Nordic, CNAF, SARA, PIC, RAL, GridKa) ringed by Tier-2s.]
21
GÉANT2: Consortium of 34 NRENs
• Multi-wavelength core (to 40) + 0.6-10G loops
• Dark fiber core among 16 countries: Austria, Belgium, Bosnia-Herzegovina, Czech Republic, Denmark, France, Germany, Hungary, Ireland, Italy, Netherlands, Slovakia, Slovenia, Spain, Switzerland, United Kingdom
• 22 PoPs, ~200 sites
• 38k km leased services, 12k km dark fiber
• Supporting light paths for LHC, eVLBI, et al.
H. Doebbeling
22
23
Basic Link Layer Monitoring
• perfSONAR deployment is well advanced (but not yet complete). Monitors the “up/down” status of the links.
• Integrated into the “End to End Coordination Unit” (E2ECU) run by DANTE.
• Provides simple indications of “hard” faults.
• Insufficient to understand the quality of the connectivity.
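The aggregation the E2ECU performs can be sketched in a few lines: an end-to-end link spanning several domains is “up” only if every per-domain segment reports up. The function and segment names below are illustrative, not the actual E2ECU interface:

```python
# Sketch: derive end-to-end link status from per-domain segment reports.
# A single "down" segment takes the whole path down; an "unknown"
# segment leaves the path status uncertain rather than up.
def e2e_status(segments):
    """segments: dict mapping segment name -> 'up' / 'down' / 'unknown'."""
    if any(s == "down" for s in segments.values()):
        return "down"
    if any(s == "unknown" for s in segments.values()):
        return "degraded/unknown"
    return "up"

# Hypothetical multi-domain path for a CERN-GridKa light path.
cern_gridka = {"CERN-edge": "up", "GEANT2-core": "up", "DFN-edge": "down"}
print(e2e_status(cern_gridka))  # down
```

This also shows why such monitoring only catches “hard” faults: a lossy but technically up segment still yields “up”, which motivates the active measurements on the next slides.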
24
25
26
Active Monitoring
• Active monitoring needed
  • Implementation consistency needed for accurate results
  • One-way delay
  • TCP achievable bandwidth
  • ICMP-based round-trip time
  • Traceroute information for path changes
  • Needed for service-quality issues
• First mission is T0-T1 and T1-T1
  • The T1 deployment could also be used for T1-T2 measurements as a second step, with corresponding T2 infrastructure.
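Of the measurements listed, path-change detection is the simplest to illustrate: compare hop lists from two traceroute runs and report where they diverge. A minimal sketch; the hop names are made up for the example:

```python
# Sketch: detect routing-path changes between two traceroute runs by
# comparing the hop lists position by position.
def path_changed(old_hops, new_hops):
    """Return a list of (position, old_hop, new_hop) where paths differ."""
    diffs = []
    for i in range(max(len(old_hops), len(new_hops))):
        old = old_hops[i] if i < len(old_hops) else None
        new = new_hops[i] if i < len(new_hops) else None
        if old != new:
            diffs.append((i, old, new))
    return diffs

# Illustrative hop names: a path that picked up an extra GEANT2 hop.
yesterday = ["cern-gw", "geant2-gva", "geant2-par", "in2p3-gw"]
today     = ["cern-gva", "geant2-gva", "geant2-fra", "geant2-par", "in2p3-gw"]
for pos, old, new in path_changed(yesterday, today):
    print(f"hop {pos}: {old} -> {new}")
```

In practice such a check would run periodically against archived traceroute output, with load-balanced hops handled more carefully than this positional comparison does.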
27
Background Stats
28
Monitoring Evolution
• Long-standing collaboration on measurement and monitoring technologies
  • Monitoring working group of the LHCOPN
  • ESnet and DANTE have been leading the effort
• Proposal for a managed service by DANTE
  • Manage the tools and archives
  • Manage the hardware and O/S
  • Manage integrity of information
• Sites have some obligations
  • On-site operations support
  • Provision of a terminal server
  • Dedicated IP port on the border router
  • PSTN/ISDN line for out-of-band communication
  • Gigabit Ethernet switch
  • GPS antenna
  • Protected power
  • Rack space
29
Operational Procedures
• Have to be finalised, but need to deal with change and incident management.
  • Many parties involved.
  • Have to agree on the real processes involved.
• Recent operations workshop made some progress
  • Try to avoid, wherever possible, too many “coordination units”.
  • All parties agreed we need some centralised information to have a global view of the network and incidents.
  • Further workshop planned to quantify this.
  • We also need to understand existing processes used by T1s.
30
Resiliency Issues
• The physical fiber-path considerations continue.
  • Some lambdas have been re-routed; others still may be.
• Layer-3 backup paths for RAL and PIC are still an issue.
  • In the case of RAL, excessive costs seem to be a problem.
  • For PIC, still some hope of a CBF between RedIRIS and Renater.
• Overall the situation is quite good with the CBF links, but can still be improved.
  • Most major “single” failures are protected against.
31
T0-T1 Lambda routing (schematic) – Connect. Communicate. Collaborate.
[Map: lambda routes from the T0 at Geneva across Europe (Basel, Zurich, Milan, Frankfurt, Stuttgart, Strasbourg/Kehl, Hamburg, Copenhagen, Paris, Lyon, London, Barcelona, Madrid, Amsterdam) and across the Atlantic (VSNL N, VSNL S, AC-2/Yellow) to Starlight and MAN LAN; TRIUMF and ASGC are reached via SURFnet, ASGC onward via SMW-3 or 4 (?).]
T0-T1s: CERN-RAL, CERN-PIC, CERN-IN2P3, CERN-CNAF, CERN-GRIDKA, CERN-NDGF, CERN-SARA, CERN-TRIUMF, CERN-ASGC, USLHCNET NY (AC-2), USLHCNET NY (VSNL N), USLHCNET Chicago (VSNL S)
32
T1-T1 Lambda routing (schematic) – Connect. Communicate. Collaborate.
[Map: same European and transatlantic footprint as the T0-T1 schematic.]
T1-T1s: GRIDKA-CNAF, GRIDKA-IN2P3, GRIDKA-SARA, SARA-NDGF
33
Some Initial Observations – Connect. Communicate. Collaborate.
Between CERN and Basel:
• The following lambdas run in the same fibre pair: CERN-GRIDKA, CERN-NDGF, CERN-SARA, CERN-SURFnet-TRIUMF/ASGC (x2), USLHCNET NY (AC-2).
• The following lambdas run in the same (sub-)duct/trench: all of the above, plus CERN-CNAF and USLHCNET NY (VSNL N) [supplier is COLT].
• The following lambda MAY run in the same (sub-)duct/trench as all of the above: USLHCNET Chicago (VSNL S) [awaiting info from Qwest…].
Between Basel and Zurich:
• The following lambdas run in the same trench: CERN-CNAF, GRIDKA-CNAF (T1-T1).
• The following lambda MAY run in the same trench as the above: USLHCNET Chicago (VSNL S) [awaiting info from Qwest…].
Key: GEANT2, NREN, USLHCNET, via SURFnet, T1-T1 (CBF)
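These observations are an instance of shared-risk link group analysis: group the lambdas by the physical duct or trench segments they traverse, and flag any segment whose single cut would take down several lambdas at once. A sketch, with a path mapping that paraphrases the CERN-Basel and Basel-Zurich observations above (confirmed fibre-pair sharing only; the MAY cases are omitted):

```python
from collections import defaultdict

# Illustrative mapping of lambdas to the physical segments they share,
# paraphrased from the slide; not an authoritative fibre-routing record.
lambda_paths = {
    "CERN-GRIDKA":         ["CERN-Basel"],
    "CERN-NDGF":           ["CERN-Basel"],
    "CERN-SARA":           ["CERN-Basel"],
    "USLHCNET NY (AC-2)":  ["CERN-Basel"],
    "CERN-CNAF":           ["CERN-Basel", "Basel-Zurich"],
    "GRIDKA-CNAF (T1-T1)": ["Basel-Zurich"],
}

def shared_risk_groups(paths, threshold=2):
    """Map each segment to the lambdas crossing it; keep segments whose
    single cut would affect at least `threshold` lambdas."""
    by_segment = defaultdict(list)
    for lam, segments in paths.items():
        for seg in segments:
            by_segment[seg].append(lam)
    return {seg: lams for seg, lams in by_segment.items()
            if len(lams) >= threshold}

for seg, lams in sorted(shared_risk_groups(lambda_paths).items()):
    print(f"cutting {seg} would affect: {', '.join(sorted(lams))}")
```

Note how this sharpens the resiliency objective: a topology can survive any single *link* failure on paper while a single backhoe on the CERN-Basel duct still takes out most of the T0-T1 lambdas.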
34
Closing Remarks
• The LHCOPN is an important part of the overall requirements for LHC networking.
• It is a (relatively) simple concept.
  • Statically allocated 10G paths in Europe
  • Managed bandwidth on the 10G transatlantic links via USLHCNet
• Multi-domain operations remain to be completely solved.
  • This is a new requirement for the parties involved and a learning process for everyone.
• Many tools and ideas exist; the work now is to pull this all together into a robust operational framework.
35
Simple solutions are often the best!