LHCOPN Status and Plans

David Foster, Head, Communications and Networks, CERN. Joint Techs, Hawaii, January 2008.


Page 1: LHCOPN Status and Plans

LHCOPN Status and Plans

David Foster
Head, Communications and Networks

CERN
January 2008

Joint Techs
Hawaii

Page 2: LHCOPN Status and Plans

Acknowledgments

Many presentations and material in the public domain have contributed to this presentation, too numerous to mention individually.

Page 3: LHCOPN Status and Plans

[Photograph labels: LHC; Mont Blanc, 4810 m; Downtown Geneva]

Page 4: LHCOPN Status and Plans

CERN – March 2007

26 659 m in circumference

Superconducting (SC) magnets pre-cooled to -193.2°C (80 K) using 10 080 tonnes of liquid nitrogen

60 tonnes of liquid helium bring them down to -271.3°C (1.9 K)

600 million proton collisions/second

The internal pressure of the LHC is 10⁻¹³ atm, ten times less than the pressure on the Moon

Page 5: LHCOPN Status and Plans

CERN’s Detectors
• To observe the collisions, collaborators from around the world are building four huge experiments: ALICE, ATLAS, CMS, LHCb
• Detector components are constructed all over the world
• Funding comes mostly from the participating institutes, less than 20% from CERN

[Detector images labelled: CMS, ALICE, ATLAS, LHCb]

Page 6: LHCOPN Status and Plans

The LHC Computing Challenge
• Signal/Noise 10⁻⁹
• Data volume
    • High rate x large number of channels x 4 experiments
    → 15 PetaBytes of new data each year
• Compute power
    • Event complexity x number of events x thousands of users
    → 100 k of today's fastest CPUs
• Worldwide analysis & funding
    • Computing funding locally in major regions & countries
    • Efficient analysis everywhere
    → GRID technology
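As a rough sense of scale, the 15 PetaBytes per year quoted above can be turned into an average sustained rate. The sketch below assumes decimal petabytes (10^15 bytes), a 365-day year, and data spread evenly over the year; none of these assumptions come from the slide, and real transfers are bursty rather than uniform.

```python
# Back-of-envelope conversion of a yearly data volume into an average rate.
# Assumptions (not from the slide): 1 PB = 10**15 bytes, a 365-day year,
# and data spread evenly over the whole year.

SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_gbps(petabytes_per_year: float) -> float:
    """Return the average rate in gigabits per second."""
    bytes_per_year = petabytes_per_year * 10**15
    bits_per_second = bytes_per_year * 8 / SECONDS_PER_YEAR
    return bits_per_second / 10**9


if __name__ == "__main__":
    print(f"{average_rate_gbps(15):.1f} Gbit/s")
```

Under these assumptions the yearly volume averages out to a little under 4 Gbit/s across all four experiments, before any replication to multiple sites or catch-up after outages is considered.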

Page 7: LHCOPN Status and Plans


Page 8: LHCOPN Status and Plans


Page 9: LHCOPN Status and Plans

The WLCG Distribution of Resources

Tier-0 – the accelerator centre
• Data acquisition and initial processing of raw data
• Distribution of data to the different Tiers

Tier-1 (11 centers) – “online” to the data acquisition process, high availability
• Managed Mass Storage – grid-enabled data service
• Data-heavy analysis
• National, regional support

Canada – TRIUMF (Vancouver)
France – IN2P3 (Lyon)
Germany – Forschungszentrum Karlsruhe
Italy – CNAF (Bologna)
Netherlands – NIKHEF/SARA (Amsterdam)
Nordic countries – distributed Tier-1
Spain – PIC (Barcelona)
Taiwan – Academia Sinica (Taipei)
UK – CLRC (Oxford)
US – FermiLab (Illinois), Brookhaven (NY)

Tier-2 – ~200 centres in ~40 countries
• Simulation
• End-user analysis – batch and interactive

Page 10: LHCOPN Status and Plans

Centers around the world form a Supercomputer

• The EGEE and OSG projects are the basis of the Worldwide LHC Computing Grid Project (WLCG)

Inter-operation between Grids is working!

Page 11: LHCOPN Status and Plans

Tier-1 Centers: TRIUMF (Canada); GridKA (Germany); IN2P3 (France); CNAF (Italy); SARA/NIKHEF (NL); Nordic Data Grid Facility (NDGF); ASCC (Taipei); RAL (UK); BNL (US); FNAL (US); PIC (Spain)

The Grid is now in operation, working on: reliability, scaling up, sustainability

Page 12: LHCOPN Status and Plans


Guaranteed bandwidth can be a good thing

Page 13: LHCOPN Status and Plans

LHCOPN Mission
• To assure the T0-T1 transfer capability.
    • Essential for the Grid to distribute data out to the T1s.
    • Capacity must be large enough to deal with most situations, including “catch up”.
• The excess capacity can be used for T1-T1 transfers.
    • Lower priority than T0-T1
    • May not be sufficient for all T1-T1 requirements
• Resiliency Objective
    • No single failure should cause a T1 to be isolated.
• Infrastructure can be improved
    • Naturally started as an unprotected “star”: insufficient for a production network, but it enabled rapid progress.
    • Has become a reason for, and has leveraged, cross-border fiber.
        • Excellent side effect of the overall approach.
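The “catch up” requirement can be made concrete with a small calculation: after an outage, the backlog drains only at the rate of the spare capacity above the nominal flow. The function below is a minimal sketch; its name and the example figures (a 12-hour outage, 4 Gbit/s nominal flow, 10 Gbit/s link) are illustrative assumptions, not values from the presentation.

```python
# Sketch of the "catch up" arithmetic. All numbers used with this function
# are hypothetical; the slides do not give nominal rates per link.


def catch_up_hours(outage_hours: float, nominal_gbps: float,
                   capacity_gbps: float) -> float:
    """Hours needed to clear the backlog built up during an outage.

    While the link is down, data accumulates at the nominal rate. Once the
    link is back, new data keeps arriving at the nominal rate, so only the
    spare capacity (capacity - nominal) is available to drain the backlog.
    """
    if capacity_gbps <= nominal_gbps:
        raise ValueError("no spare capacity: the backlog never drains")
    backlog = outage_hours * nominal_gbps      # in Gbit/s x hours
    spare = capacity_gbps - nominal_gbps       # in Gbit/s
    return backlog / spare


if __name__ == "__main__":
    print(catch_up_hours(outage_hours=12, nominal_gbps=4, capacity_gbps=10))
```

With these example figures the backlog takes 8 hours to clear. The same formula shows why a link sized close to the nominal rate is not enough: as the spare capacity shrinks, the catch-up time grows without bound.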

Page 14: LHCOPN Status and Plans

LHCOPN Design Information
• All technical content is on the LHCOPN Twiki: http://lhcopn.cern.ch
• Coordination Process
    • LHCOPN Meetings (every 3 months)
    • Active Working Groups
        – Routing
        – Monitoring
        – Operations
• Active Interfaces to External Networking Activities
    • European Network Policy Groups
    • US Research Networking
    • Grid Deployment Board
    • LCG Management Board
    • EGEE

Page 15: LHCOPN Status and Plans


Page 16: LHCOPN Status and Plans

CERN External Network Links

[Diagram of the CERN WAN network around CH-CERN (Tier0). Link capacity labels in the diagram (10Gbps, 5G, 6G, 40G, 20G, 20G, 12.5G, 20G, 1Gbps, 100Mbps) cannot be matched to individual links in this transcript. Connected parties as labelled:]

• LHCOPN Tier1s: CA-TRIUMF, ES-PIC, DE-KIT, FR-CCIN2P3, IT-INFN-CNAF, NDGF, NL-T1, TW-ASGC, UK-T1-RAL, US-T1-BNL (Tier1c), US-FNAL-CMS (Tier1c)
• Tier2s: TIFR, UniGeneva, Russian Tier2s
• ISPs: COLT, Interoute, Globalcrossing
• CIC: WHO, CITIC74
• Other: SWITCH, CIXP, RIPN, USLHCnet (Chicago – NYC – Amst), Geant2, Equinix-TIX

Page 17: LHCOPN Status and Plans

CERN External Network E513-E – AS513

[Network diagram, last update 20070801. Labels recoverable from the transcript:]

• Networks and peers: SWITCH AS559, GEANT AS20965, Akamai AS21357, FNAL AS3152, ESnet AS293, Abilene AS11537, RIPE RIS(04) AS12654, Reuters AS65020, USLHCnet AS1297 (192.65.196.0/23), Internet GC AS3549, Internet Level3 AS3356, Internet COLT AS8220, JINR AS2875, KIAE AS6801, RadioMSU AS2683, Tier2 UniGe, CITIC74 (195.202.0.0/20), WHO (158.232.0.0/16)
• Exchange points and PoPs: CIXP E513-X, IX Europe, TIX, StarLight Force10, Chicago POP, New York POP, Amsterdam
• Hosts and devices: g513-e-rci76-1, g513-e-rci76-2, e513-x-mfte6-1, e513-e-rci65-3, e513-e-rci76-1, e513-e-rci76-2, e513-e-rci72-4, e513-e-shp3m-4, e513-e-mhpyl-1, l513-c-rftec-1, l513-c-rftec-2, r513-c-rca80-1, swice2.switch.ch (C7606), swice3.switch.ch (C7606), rt1.gen.ch.geant2.net (JT640), rt1.par.fr.geant2.net (JT640), e600gva1, e600gva2, e600chi.uslhcnet.org, e600nyc.uslhcnet.org, e600ams, x424nyc.uslhcnet.org, who-7204-a, who-7204-b, as1-gva (C2509), as2-gva (C2511), as1(-5)-csen (C2511), tt31.ripe.net, tt87.ripe.net, ext-dns-1, ext-dns-2, I-root dns server, K-root dns server
• Other labels: GPN, LHCOPN, GN2 - E2E, GPRS - VPN, evo-us, evo-eu

Page 18: LHCOPN Status and Plans


Transatlantic Link Negotiations Yesterday

A major provider lost their shirt on this deal!

Page 19: LHCOPN Status and Plans

LHCOPN Architecture 2004 Starting Point

Tier-2s and Tier-1s are inter-connected by the general purpose research networks

Any Tier-2 may access data at any Tier-1

[Diagram labels: Tier-1s IN2P3, TRIUMF, ASCC, FNAL, BNL, Nordic, CNAF, SARA, PIC, RAL, GridKa; multiple Tier-2s]

Page 20: LHCOPN Status and Plans

GÉANT2: Consortium of 34 NRENs

Multi-Wavelength Core (to 40) + 0.6-10G Loops

Dark Fiber Core Among 16 Countries: Austria, Belgium, Bosnia-Herzegovina, Czech Republic, Denmark, France, Germany, Hungary, Ireland, Italy, Netherlands, Slovakia, Slovenia, Spain, Switzerland, United Kingdom

22 PoPs, ~200 Sites
38k km Leased Services, 12k km Dark Fiber
Supporting Light Paths for LHC, eVLBI, et al.

H. Doebbeling

Page 21: LHCOPN Status and Plans


Page 22: LHCOPN Status and Plans

Basic Link Layer Monitoring
• perfSONAR very well advanced in deployment (but not yet complete). Monitors the “up/down” status of the links.
• Integrated into the “End to End Coordination Unit” (E2ECU) run by DANTE
• Provides simple indications of “hard” faults.
• Insufficient to understand the quality of the connectivity

Page 23: LHCOPN Status and Plans


Page 24: LHCOPN Status and Plans


Page 25: LHCOPN Status and Plans

Active Monitoring
• Active monitoring needed
    • Implementation consistency needed for accurate results
    • One-way delay
    • TCP achievable bandwidth
    • ICMP based round trip time
    • Traceroute information for path changes
    • Needed for service quality issues
• First mission is T0-T1 and T1-T1
• T1 deployment could also be used for T1-T2 measurements as a second step and with corresponding T2 infrastructure.
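One reason implementation consistency matters is that one-way delay and round trip time depend on clocks in different ways. The sketch below simulates a single probe with integer microsecond timestamps; the `Probe` type, the function names, and the delay values are all hypothetical, and the receiver is assumed to echo the probe with zero processing time. It shows that a receiver clock offset biases the one-way figure directly, while the round trip time, measured entirely on the sender's clock, is unaffected.

```python
# Toy simulation of one probe packet. Names and numbers are illustrative
# assumptions; this is not how any particular monitoring tool is implemented.
from dataclasses import dataclass


@dataclass
class Probe:
    sent_us: int       # send time, read from the sender's clock
    received_us: int   # arrival time, read from the receiver's clock
    echoed_us: int     # echo arrival time, read from the sender's clock


def simulate_probe(send_us: int, forward_us: int, reverse_us: int,
                   receiver_offset_us: int) -> Probe:
    """Build the timestamps a sender/receiver pair would record.

    receiver_offset_us is how far the receiver's clock runs ahead of the
    sender's clock. The receiver is assumed to echo immediately.
    """
    true_arrival = send_us + forward_us
    return Probe(
        sent_us=send_us,
        received_us=true_arrival + receiver_offset_us,
        echoed_us=true_arrival + reverse_us,
    )


def one_way_delay_us(probe: Probe) -> int:
    """One-way delay: compares timestamps from two different clocks."""
    return probe.received_us - probe.sent_us


def round_trip_us(probe: Probe) -> int:
    """Round trip time: both timestamps come from the sender's clock."""
    return probe.echoed_us - probe.sent_us


if __name__ == "__main__":
    synced = simulate_probe(0, 10_000, 12_000, receiver_offset_us=0)
    skewed = simulate_probe(0, 10_000, 12_000, receiver_offset_us=5_000)
    print(one_way_delay_us(synced), one_way_delay_us(skewed))
    print(round_trip_us(synced), round_trip_us(skewed))
```

In this example a 5 ms clock offset turns a true 10 ms forward delay into a measured 15 ms, while both probes report the same 22 ms round trip. One-way delay measurements are therefore only comparable between sites whose measurement hosts share an accurate time reference.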

Page 26: LHCOPN Status and Plans


Background Stats

Page 27: LHCOPN Status and Plans

Monitoring Evolution
• Long-standing collaboration on measurement and monitoring technologies
    • Monitoring working group of the LHCOPN
    • ESnet and DANTE have been leading the effort
• Proposal for a Managed Service by DANTE
    • Manage the tools, archives
    • Manage the hardware, O/S
    • Manage integrity of information
• Sites have some obligations
    • On-site operations support
    • Provision of a terminal server
    • Dedicated IP port on the border router
    • PSTN/ISDN line for out of band communication
    • Gigabit Ethernet Switch
    • GPS Antenna
    • Protected power
    • Rack Space

Page 28: LHCOPN Status and Plans

Operational Procedures
• Have to be finalised but need to deal with change and incident management.
    • Many parties involved.
    • Have to agree on the real processes involved
• Recent Operations workshop made some progress
    • Try to avoid, wherever possible, too many “coordination units”.
    • All parties agreed we need some centralised information to have a global view of the network and incidents.
    • Further workshop planned to quantify this.
    • We also need to understand existing processes used by T1s.

Page 29: LHCOPN Status and Plans

Resiliency Issues
• The physical fiber path considerations continue
    • Some lambdas have been re-routed. Others still may be.
• Layer3 backup paths for RAL and PIC are still an issue.
    • In the case of RAL, excessive costs seem to be a problem.
    • For PIC, still some hope of a cross-border fiber (CBF) between RedIRIS and RENATER
• Overall the situation is quite good with the CBF links, but can still be improved.
    • Most major “single” failures are protected against.
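The resiliency objective (no single failure should isolate a T1) can be checked mechanically on a link list. The sketch below is a toy model, not the LHCOPN operational tooling: it uses only the European CERN-T1 and T1-T1 lambdas named on the lambda routing slides, treats each lambda as one independent link, and ignores layer-3 backup over general purpose networks, transatlantic paths, and shared physical routes. The function names are hypothetical.

```python
# Toy single-link-failure check. The link list is taken from the T0-T1 and
# T1-T1 lambda lists in this presentation (European T1s only); everything
# else about the model is a simplifying assumption.
from collections import deque

LAMBDAS = [
    ("CERN", "RAL"), ("CERN", "PIC"), ("CERN", "IN2P3"), ("CERN", "CNAF"),
    ("CERN", "GRIDKA"), ("CERN", "NDGF"), ("CERN", "SARA"),
    ("GRIDKA", "CNAF"), ("GRIDKA", "IN2P3"), ("GRIDKA", "SARA"),
    ("SARA", "NDGF"),
]


def reachable(links, start="CERN"):
    """Return the set of sites reachable from start (breadth-first search)."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def isolated_by_single_failure(links, start="CERN"):
    """Map each link whose loss cuts sites off from start to those sites."""
    sites = {site for link in links for site in link}
    result = {}
    for failed in links:
        remaining = [link for link in links if link != failed]
        cut_off = sites - reachable(remaining, start)
        if cut_off:
            result[failed] = cut_off
    return result


if __name__ == "__main__":
    for link, sites in isolated_by_single_failure(LAMBDAS).items():
        print(link, "->", sorted(sites))
```

In this simplified model only the CERN-RAL and CERN-PIC links are single points of failure, which is consistent with the slide singling out backup paths for RAL and PIC as the open issue.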

Page 30: LHCOPN Status and Plans

T0-T1 Lambda routing (schematic)

Connect. Communicate. Collaborate

[Schematic map. Country labels: CH, DE, DK, NL, UK, FR, ES, IT. Locations: Geneva (T0), Basel, Zurich, Milan, Lyon, Strasbourg/Kehl, Stuttgart, Frankfurt, Hamburg, Copenhagen, Amsterdam, London, Paris, Madrid, Barcelona, NY, Starlight, MAN LAN. T1 labels: GRIDKA, CNAF, SARA, BNL, FNAL, IN2P3, PIC, RAL, NDGF, TRIUMF, ASGC. Other labels: SURFnet, Atlantic Ocean, VSNL N, VSNL S, AC-2/Yellow. ASGC: ??? via SMW-3 or 4 (?)]

T0-T1s:
• CERN-RAL
• CERN-PIC
• CERN-IN2P3
• CERN-CNAF
• CERN-GRIDKA
• CERN-NDGF
• CERN-SARA
• CERN-TRIUMF
• CERN-ASGC
• USLHCNET NY (AC-2)
• USLHCNET NY (VSNL N)
• USLHCNET Chicago (VSNL S)

Page 31: LHCOPN Status and Plans

T1-T1 Lambda routing (schematic)

[Schematic map with the same location labels as the T0-T1 lambda routing slide]

T1-T1s:
• GRIDKA-CNAF
• GRIDKA-IN2P3
• GRIDKA-SARA
• SARA-NDGF

Page 32: LHCOPN Status and Plans

Some Initial Observations

[Schematic map with the same location labels as the lambda routing slides]

(Between CERN and Basel)
Following lambdas run in same fibre pair:
• CERN-GRIDKA
• CERN-NDGF
• CERN-SARA
• CERN-SURFnet-TRIUMF/ASGC (x2)
• USLHCNET NY (AC-2)
Following lambdas run in same (sub-)duct/trench:
• (all above +)
• CERN-CNAF
• USLHCNET NY (VSNL N) [supplier is COLT]
Following lambda MAY run in same (sub-)duct/trench as all above:
• USLHCNET Chicago (VSNL S) [awaiting info from Qwest…]

(Between Basel and Zurich)
Following lambdas run in same trench:
• CERN-CNAF
• GRIDKA-CNAF (T1-T1)
Following lambda MAY run in same trench as all above:
• USLHCNET Chicago (VSNL S) [awaiting info from Qwest…]

Key: GEANT2, NREN, USLHCNET, Via SURFnet, T1-T1 (CBF)
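Observations like these are usually handled as shared-risk groups: sets of lambdas that a single physical event (a fibre or trench cut) would take down together. The sketch below is a minimal illustration using only the European lambdas and the two trench groupings listed on this slide; it leaves out the SURFnet and USLHCNET lambdas, knows nothing about other backup paths, and its names are hypothetical. It flags any T1 whose listed lambdas all sit inside one group.

```python
# Toy shared-risk check built from the lambda names on this slide.
# Assumptions: only European CERN-T1 and T1-T1 lambdas are modelled, and a
# site is "fully exposed" when every lambda terminating there is in one group.

LAMBDAS = [
    "CERN-RAL", "CERN-PIC", "CERN-IN2P3", "CERN-CNAF", "CERN-GRIDKA",
    "CERN-NDGF", "CERN-SARA",
    "GRIDKA-CNAF", "GRIDKA-IN2P3", "GRIDKA-SARA", "SARA-NDGF",
]

SHARED_RISK_GROUPS = {
    "CERN-Basel duct/trench": {"CERN-GRIDKA", "CERN-NDGF", "CERN-SARA",
                               "CERN-CNAF"},
    "Basel-Zurich trench": {"CERN-CNAF", "GRIDKA-CNAF"},
}

T1_SITES = ["RAL", "PIC", "IN2P3", "CNAF", "GRIDKA", "NDGF", "SARA"]


def lambdas_at(site, lambdas):
    """Return the lambdas that terminate at the given site."""
    return {name for name in lambdas if site in name.split("-")}


def fully_exposed_sites(groups, lambdas, sites):
    """Map each risk group to the sites whose lambdas all lie inside it."""
    exposed = {}
    for group_name, members in groups.items():
        hit = sorted(
            site for site in sites
            if lambdas_at(site, lambdas)
            and lambdas_at(site, lambdas) <= members
        )
        if hit:
            exposed[group_name] = hit
    return exposed


if __name__ == "__main__":
    print(fully_exposed_sites(SHARED_RISK_GROUPS, LAMBDAS, T1_SITES))
```

Within this simplified model the only flagged case is CNAF on the Basel-Zurich trench, because both CERN-CNAF and GRIDKA-CNAF are listed as running there. A check of this kind complements a per-link failure analysis, since two logically independent lambdas offer no protection if they share a trench.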

Page 33: LHCOPN Status and Plans

Closing Remarks
• The LHCOPN is an important part of the overall requirements for LHC Networking.
• It is a (relatively) simple concept.
    • Statically Allocated 10G Paths in Europe
    • Managed Bandwidth on the 10G transatlantic links via USLHCNet
• Multi-domain operations remain to be completely solved
    • This is a new requirement for the parties involved and a learning process for everyone
• Many tools and ideas exist and the work is now to pull this all together into a robust operational framework

Page 34: LHCOPN Status and Plans

Simple solutions are often the best!