27.05.2004 Bernd Panzer-Steindel, CERN/IT: WAN RAW/ESD Data Distribution for LHC


Page 1: WAN RAW/ESD Data Distribution for LHC

27.05.2004 Bernd Panzer-Steindel, CERN/IT

Page 2: T0-T1 dataflow

T0: Mass Storage recording of the RAW data from the 4 LHC experiments
T0: First ESD production

RAW data and ESD export to the Tier1 centers
  - one copy of the RAW data spread over the T1 centers of an experiment
  - several copies of the ESD data sets (3-6), experiment dependent
  - ESD size ~= 0.5 * RAW data (each T1 holds 2/3 or one copy of the ESD)
  - ~10 PB per year (requirements from the latest discussions with the 4 experiments)

T1 -> T0: Data import (new ESD versions, MC data, AOD, etc.)

near real time export to the Tier1 centers during LHC running (200 days per year) + ALICE heavy ion data during the remaining 100 days

data transfers are between mass storage systems

near real time == from disk cache
sizing of the tape system == minimal data recall from tape
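To get a feel for these numbers, the export volume can be turned into an average network rate. The sketch below assumes decimal units (1 PB = 10^15 bytes) and assumes that the full ~10 PB is moved during the 200 days of LHC running. The slide does not say how the volume is split between the running period and the ALICE heavy ion period, so this is an illustration, not a stated requirement. It also ignores protocol overhead and any link inefficiency.

```python
PB = 10**15               # decimal petabyte in bytes (assumption: decimal units)
SECONDS_PER_DAY = 86_400


def average_rate_gbit_per_s(volume_bytes: float, days: float) -> float:
    """Average rate in Gbit/s needed to move volume_bytes within `days` days."""
    seconds = days * SECONDS_PER_DAY
    return volume_bytes * 8 / seconds / 1e9


if __name__ == "__main__":
    # ~10 PB per year, exported during 200 days of LHC running (assumed split).
    rate = average_rate_gbit_per_s(10 * PB, 200)
    print(f"average export rate: {rate:.2f} Gbit/s")
```

Under these assumptions the aggregate average comes out at roughly 4.6 Gbit/s. Peak rates, recovery from backlogs and the T1 to T0 imports would come on top of that.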

Page 3: Network

There are currently 7+ Tier1 centers (RAL, Fermilab, Brookhaven, Karlsruhe, IN2P3, CNAF, PIC, …)

The T0 export requirements need at least a 10 Gbit/s link per Tier1 (plus more if one includes the Tier1-Tier2 communications)

The CERN T0 needs at least a 70 Gbit/s connection

the efficiency is still unknown
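The 70 Gbit/s figure follows from one 10 Gbit/s link for each of the 7 Tier1 centers. A minimal sketch of that arithmetic is given below. The `efficiency` parameter is a hypothetical placeholder: the slide states that the real efficiency is still unknown, so any value passed in is an assumption for exploring scenarios, not a measurement.

```python
def aggregate_link_gbit(n_tier1: int, per_link_gbit: float = 10.0) -> float:
    """Nominal T0 bandwidth: one link of per_link_gbit for each Tier1 center."""
    return n_tier1 * per_link_gbit


def usable_gbit(nominal_gbit: float, efficiency: float) -> float:
    """Usable bandwidth for an assumed link efficiency in (0, 1]."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return nominal_gbit * efficiency


if __name__ == "__main__":
    nominal = aggregate_link_gbit(7)   # 7 Tier1 centers, 10 Gbit/s each
    print(f"nominal T0 connection: {nominal:.0f} Gbit/s")
    # Hypothetical efficiencies, only to show the spread of outcomes.
    for eff in (0.3, 0.5, 0.8):
        print(f"  efficiency {eff:.0%}: {usable_gbit(nominal, eff):.0f} Gbit/s usable")
```

Since the list of Tier1 centers is open-ended ("7+"), the nominal figure is a lower bound and grows by 10 Gbit/s with each additional center.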

Page 4

We need to start Service Data Challenges which should test/stress all necessary layers for these large continuous data transfers

  - network hardware: circuit switching versus packet switching, QoS
  - transport: TCP/IP parameters, new implementations
  - transfer mechanisms: GRIDFTP
  - mass storage systems: ENSTORE, CASTOR, HPSS, etc.
  - coupling to the mass storage systems: SRM 1.x
  - replication system
  - data movement service (control and bookkeeping layer)

Key points:
  - resilience and error-recovery !!
  - modular layers
  - simplicity
  - performance
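As an illustration of what resilience and error-recovery could mean in the data movement service, the sketch below wraps a single transfer in a retry loop with exponential backoff and keeps a small bookkeeping record. Everything here is hypothetical: `TransferRecord`, `transfer_with_retry` and the `transfer` callable are names invented for this example and do not correspond to GRIDFTP, SRM or any mass storage API. A real service would also need persistent bookkeeping and checksum verification.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TransferRecord:
    """Bookkeeping entry for one file (hypothetical structure)."""
    name: str
    attempts: int = 0
    done: bool = False
    errors: List[str] = field(default_factory=list)


def transfer_with_retry(
    name: str,
    transfer: Callable[[str], None],       # hypothetical: raises OSError on failure
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> TransferRecord:
    """Try a transfer up to max_attempts times, doubling the wait after each failure."""
    record = TransferRecord(name)
    for attempt in range(1, max_attempts + 1):
        record.attempts = attempt
        try:
            transfer(name)
            record.done = True
            return record
        except OSError as exc:
            record.errors.append(str(exc))
            if attempt < max_attempts:
                # Exponential backoff: base, 2*base, 4*base, ...
                sleep(base_delay_s * 2 ** (attempt - 1))
    return record
```

Keeping the retry logic and the record separate from the transfer mechanism itself is one way to respect the "modular layers" key point: the same loop could drive any transfer callable.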

Page 5: Proposed timescales and scheduling

2004
  Midyear:
    - 10Gbit "end-to-end" tests with Fermilab
    - First version of the LHC Community Network proposal
  Endyear:
    - 10Gbit "end-to-end" test complete with European Partner
    - Measure performance variability and understand H/W and S/W issues to ALL sites.
    - Document circuit switched options and costs, first real test if possible.

2005
  Midyear:
    - Circuit/Packet switch design completed.
    - LHC Community network proposal completed.
    - All T1 Fabric architecture documents completed.
    - LCG TDR completed
  Endyear:
    - Sustained throughput test achieved to some sites: 2-4 Gb/sec for 2 months.
    - H/W and S/W problems solved.

2006
  Midyear:
    - All CERN b/w provisioned.
    - All T1 bandwidth in production (10Gb links)
    - Sustained throughput tests achieved to most sites.
  Endyear:
    - Verified performance to all sites for at least 2 months.
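The 2005 target of 2-4 Gb/sec sustained for 2 months can be translated into a data volume per site. The sketch below assumes that "2 months" means 60 days and that the rate is held without interruption. Both are simplifications made for this example.

```python
SECONDS_PER_DAY = 86_400


def volume_pb(rate_gbit_per_s: float, days: float) -> float:
    """Data volume in decimal PB moved at a constant rate over `days` days."""
    bits = rate_gbit_per_s * 1e9 * days * SECONDS_PER_DAY
    return bits / 8 / 1e15


if __name__ == "__main__":
    # Assumption: 2 months taken as 60 days of uninterrupted transfer.
    for rate in (2.0, 4.0):
        print(f"{rate:.0f} Gbit/s for 60 days: {volume_pb(rate, 60):.2f} PB")
```

Under these assumptions a single site would receive roughly 1.3 to 2.6 PB during the test, which gives an idea of the disk and tape capacity that has to be set aside for it.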

Page 6

These WAN service data challenges need dedication of
  - material: CPU servers, disk servers, tape drives, etc.
  - services: HSM, load balanced GRIDFTP, network
  - personnel: for running the DC, debugging, tuning, software selection and tests on the T0 and the different T1 centers

dedication of material and personnel for longer time periods: months, not weeks!
important for getting the necessary experience; only 2 years for a reliable working system worldwide, T1 – T0 network

to be watched: interference with ongoing productions (HSM, WAN capacity, etc.)

need to start now with a more detailed plan and start to ‘fill’ the network right now!
challenging, interesting and very important

who is participating, when and how?
discussion ...