10Gbit between GridKa and openlab (and their obstacles)

Service Challenge Meeting - ISGC 2005, 26 Apr. 2005 - Bruno Hoeft
Forschungszentrum Karlsruhe GmbH, Institute for Scientific Computing
P.O. Box 3640, D-76021 Karlsruhe, Germany
http://www.gridka.de





Outline

• LAN of GridKa
  – Structure
  – Projection of installation in 2008
• WAN
  – History (1G / 2003)
  – Current 10G
  – Testbed
  – Challenges crossing multiple NRENs (National Research and Education Networks)
  – Quality and quantity evaluation of the GridKa-openlab network connection
  – File transfer
  – Caching effect

The LHC multi-Tier Computing Model (LHC Computing Grid Project - LCG)

[Diagram: Tier 0 centre at CERN; Tier 1 centres: Germany (FZK), USA (Fermi, BNL), UK (RAL), France (IN2P3), Italy (CNAF), ...; Tier 2: university and lab computing centres (Uni a-e, Lab x-z); Tier 3: institute computers; Tier 4: desktops; working groups and virtual organizations. GridKa, the Grid Computing Centre Karlsruhe, is the German Tier 1.]

Projects at GridKa

[Diagram: LHC experiments (CERN), among them Atlas, and non-LHC experiments (SLAC, USA; FermiLab, USA) - calculating jobs with "real" data.]

1.5 million jobs and 4.2 million hours of computation in 2004


Gradual extension of GridKa resources

                           Apr 2004   Oct 2004   Apr 2005   % of 2008
  Processors                    680       1070       1280       30 %
  Computing power / kSI2k       580        920       1290       12 %
  Disk [TB]                     160        220        270       18 %
  Tape [TB]                     280        375        475       12 %
  Internet [Gb/s]                 2        10*         10       50 %

April 2005:

• biggest Linux cluster within the German science community
• largest online storage at a single installation in Germany
• strongest internet connection in Germany
• available on the Grid with over 100 installations in Europe

* internet connection for the service challenge
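The "% of 2008" column implies the 2008 target capacities. A quick cross-check of the table above (a sketch: the implied targets are derived from the percentages, not stated on the slide):

```python
# Derive the implied 2008 targets from the April 2005 values and "% of 2008"
resources = {                         # name: (Apr 2005 value, percent of 2008)
    "Processors":            (1280, 30),
    "Computing power kSI2k": (1290, 12),
    "Disk TB":               (270,  18),
    "Tape TB":               (475,  12),
    "Internet Gb/s":         (10,   50),
}
for name, (value, pct) in resources.items():
    target = value * 100 / pct        # implied 2008 target capacity
    print(f"{name}: ~{target:.0f}")
```

For example, 1280 processors at 30% imply roughly 4300 processors planned for 2008, and 10 Gb/s at 50% implies a 20 Gb/s internet connection.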

Network installation

[Diagram: GridKa LAN - compute nodes on a private network behind a switch; login servers and a PIX firewall at the router toward DFN (direct internet access over 1 Gbit/s links); file servers (NAS) attached to a SAN for storage; sc nodes with their own 10 Gbit/s uplink, service challenge only. Link legend: Ethernet 100 Mbit / 320 Mbit / 1 Gbit / 10 Gbit, FibreChannel 2 Gbit.]

Network installation, incl. management network

[Diagram: same layout as the previous slide, extended with a management network: a master controller and RM instances on the nodes, monitored with Ganglia, Nagios and Cacti; sc nodes again on the dedicated 10 Gbit/s service-challenge-only link.]

Projection of installation in 2008

[Chart: growth of WorkerNodes, FileServers and DiskStorage, 2001-2008 (y-axis 0-2500).]

[Diagram: four blocks (A-D), each with CN racks, file servers and a block administration unit behind its own backbone router; the backbone routers are interconnected with 10 Gbit links and connect to a 10 Gbit internet uplink and a 10 Gbit light path to CERN. Block A is completed end of 2005; further blocks follow in 2006 and 2007 (?).]

WAN 2003/4 -- Gigabit GridKa - CERN (DataTag)

[Diagram: GridFTP servers at GridKa and CERN; Karlsruhe - Frankfurt over DFN (2x 1 Gbps access, 2.4 Gbps backbone), Frankfurt - Geneva over Géant (10 Gbps). GridFTP tested over 1 Gbps: 98% of 1 Gbit achieved.]

10Gigabit WAN SC GridKa - CERN (openlab)

[Diagram: 10 Gbps end-to-end, Karlsruhe - Frankfurt - Geneva via DFN and Géant. Hops: r-internet.fzk.de, ar-karlsruhe1-ge5-2-700.g-win.dfn.de, cr-karlsruhe1-po12-0.g-win.dfn.de, dfn.de1.de.geant.net, de.it1.it.geant.net, it.ch1.ch.geant.net, swiCE2-P6-1.switch.ch, swiCE3-G4-3.switch.ch.]

10Gigabit WAN SC GridKa - CERN (openlab)

[Diagram: alternate 10 Gbps path with LSP routing via France: r-internet.fzk.de, ar-karlsruhe1-ge5-2-700.g-win.dfn.de, cr-karlsruhe1-po12-0.g-win.dfn.de, dfn.de1.de.geant.net, de.fr1.fr.geant.net, fr.ch1.ch.geant.net, swiCE3-G4-3.switch.ch.]

10Gigabit WAN SC GridKa - CERN (openlab)

- Bandwidth evaluation (TCP/UDP)
- MPLS via France (MPLS: MultiProtocol Label Switching)
- LBE (Least Best Effort)
- GridFTP server pool: HD to HD, storage to storage
- SRM

Hardware

• Various dual Xeon 2.8 and 3.0 GHz IBM x-series (Intel and Broadcom NICs)
• Recently added 3.0 GHz EM64T (800 MHz FSB)
• Cisco 6509 with 4x 10 Gb ports and lots of 1 Gb ports
• Storage: DataDirect 2A8500 with 16 TB
• Linux RH ES 3.0 (U2 and U3), GPFS
• 10 GE link to GEANT via DFN (least best effort)

TCP/IP stack:
• 4 MB buffer
• 2 MB window size
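The 2 MB window size matters on a WAN path: a single TCP stream can move at most one window per round trip. A minimal sketch of that ceiling (the RTT values below are assumptions for illustration; the slides do not state the measured Karlsruhe-Geneva RTT):

```python
def window_limited_mbits(window_bytes: int, rtt_s: float) -> float:
    """Upper bound for one TCP stream: one full window per round trip."""
    return window_bytes * 8 / rtt_s / 1e6

window = 2 * 1024 * 1024              # 2 MB window size, as configured above
for rtt_ms in (10, 20, 50):           # assumed RTTs, for illustration only
    limit = window_limited_mbits(window, rtt_ms / 1000)
    print(f"RTT {rtt_ms} ms -> at most {limit:.0f} Mbit/s per stream")
```

At around 50 ms the ceiling is roughly 335 Mbit/s, in the same range as the 348 Mbit/sec single-stream figure reported later in the talk.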

Quality Evaluation (UDP stream)

LAN (10gkt111 - 10gkt113): symmetric, 957 Mbit/sec in both directions (jitter 0.028 / 0.022 ms; ooo = out of order: 0).

[10gtk111] iperf -s -u
[10gtk113] iperf -c 192.108.46.111 -u -b 1000M
[ ID] Interval    Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 3]  0-10 sec    1141 MBytes  957 Mbits/sec  0.028 ms  0/813885 (0%)

[10gtk113] iperf -s -u
[10gtk111] iperf -c 192.108.46.113 -u -b 1000M
[ ID] Interval    Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 3]  0-30 sec    3422 MBytes  957 Mbits/sec  0.022 ms  0/2440780 (0%)

WAN (10gkt113 - oplapro73): asymmetric, 953 Mbit/sec (jitter 0.019 ms; ooo: 215, 0.008%) vs. 884 Mbit/sec (jitter 0.015 ms; ooo: 4239, 0.018%).

[oplapro73] iperf -s -u
[10gtk113] iperf -c 192.16.160.13 -u -b 1000M
[ ID] Interval    Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 3]  0-30 sec    3408 MBytes  953 Mbits/sec  0.019 ms  7299/2438606 (0.3%)
[ 3]  0-30 sec    215 datagrams received out-of-order

[10gtk113] iperf -s -u
[oplapro73] iperf -c 192.108.46.113 -u -b 1000M
[ ID] Interval    Transfer     Bandwidth      Jitter    Lost/Total Datagrams
[ 3]  0-30 sec    3162 MBytes  884 Mbits/sec  0.015 ms  130855/2386375 (5.5%)
[ 3]  0-30 sec    4239 datagrams received out-of-order

Summary: LAN symmetric (957 = 957 Mbit/sec), WAN asymmetric (953 vs. 884 Mbit/sec). TCP variants noted: Reno, scalable.
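The loss figures iperf prints can be re-derived from its Lost/Total datagram counts. A small cross-check using the WAN numbers above:

```python
def loss_pct(lost: int, total: int) -> float:
    """Datagram loss as a percentage, as iperf reports it."""
    return 100.0 * lost / total

# WAN runs from the slide: GridKa -> openlab, then openlab -> GridKa
print(round(loss_pct(7299, 2438606), 1))     # matches the 0.3% iperf reports
print(round(loss_pct(130855, 2386375), 1))   # matches the 5.5% iperf reports
```

The asymmetry is visible in the raw counts: the openlab-to-GridKa direction loses roughly 18 times as many datagrams as the reverse.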

Bandwidth evaluation (TCP stream)

[Diagram: single TCP stream (Reno) across the WAN, 10gkt101-10gtk105 to oplapro73, measured with iperf: 348 Mbit/sec.]

Bandwidth evaluation (parallel TCP streams)

[Diagram: iperf (Reno) across the WAN, 10gkt10[1-5] to oplapro7[1-5]: 5 nodes at 112 MByte/sec each, 24 parallel streams.]
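Five nodes at 112 MByte/sec each give the aggregate below (a sketch; decimal units are assumed for the MByte-to-Gbit conversion):

```python
nodes = 5
per_node_mbyte = 112                          # MByte/sec per node, from the slide
aggregate_mbyte = nodes * per_node_mbyte      # total MByte/sec across all nodes
aggregate_gbit = aggregate_mbyte * 8 / 1000   # assuming 1 MByte = 10^6 bytes
print(aggregate_mbyte, "MByte/sec, about", aggregate_gbit, "Gbit/sec")
```

That is 560 MByte/sec, or roughly 4.5 Gbit/s of the 10 Gbit link.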

Evaluation of max throughput

[Chart: aggregate throughput in Mbit/s (0-7000) from 18:00 to 20:00. 9 nodes per site: 8 x 845 Mbit + 1 x 540 Mbit; higher speed on a single stream results in packet loss.]
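The per-stream figures sum to the plotted aggregate (simple arithmetic on the slide's numbers):

```python
# 9 nodes per site: 8 streams at 845 Mbit plus 1 stream at 540 Mbit
streams = [845] * 8 + [540]
total_mbit = sum(streams)
print(total_mbit, "Mbit/sec, about", total_mbit / 1000, "Gbit/sec")
```

The nominal sum is 7300 Mbit/sec, consistent with the roughly 7 Gbit usable quoted in the conclusion.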

GridFTP SC1 throughput

[Chart: SC1, 4th-8th Feb. 05, 500 MByte/sec sustained (Mbit/s axis up to 4900); transfers to /dev/null and to HD/SAN.]

19 nodes:
- 15 worker nodes x 20 MByte/sec (IDE/ATA HD)
- 1 file server x 50 MByte/sec (SCSI HD)
- 3 file servers x 50 MByte/sec (SAN)
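The node breakdown reproduces the sustained figure exactly (a sketch summing the per-class rates from the slide):

```python
# node classes from the slide: (count, MByte/sec each)
classes = [
    (15, 20),   # worker nodes, IDE/ATA HD
    (1,  50),   # file server, SCSI HD
    (3,  50),   # file servers, SAN
]
total = sum(count * rate for count, rate in classes)
print(total, "MByte/sec")   # matches the "500 MByte sustained" headline
```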


SC2

• approx 1/5 of the load


SC2

• five nodes at GridKa

• GridFTP to GPFS

SC2 - part 2
- Troubleshooting with radiant
- Shaping different host performances (load balancing)
- Parallel threads did not perform better
- Best performance:
  • 20 parallel file copies
  • equal nodes (no performance difference)

Cache: GridFTP to /dev/null

[Chart: throughput in MByte/sec (30-80) vs. TCP window size in MByte (1.0-1.5).]

Cache: GridFTP to GPFS

[Chart: throughput in MByte/sec (0-80) vs. TCP window size in MByte (0.5-1.5).]

Cache

[Charts: throughput to /dev/null (30-80 MByte/sec) and to GPFS (0-80 MByte/sec) vs. TCP window size in MByte (0.5-1.5).]

Cache: GridFTP of an 8 GByte file

[Charts: throughput in MByte/sec (20-80) vs. TCP window size in MByte (0.5-1.5), and current vs. average throughput over the measurement points (0-3.0). Annotations: 2 GByte, 16% of time.]

Conclusion

• multi-NREN 10Gbit link: up to 7.0 Gbit usable
• SC2 part 1: approx. 1/5 of the CERN aggregated load
• SC2 part 2: 250 MByte/sec stable, peaks over 400 MByte/sec
• TCP for WAN: (un-)modified?

Future Work

• GPFS as data destination (for SC; migrate to production)
• Digging into HW details to discover bottlenecks (packet drop due to bad PCI timing) - some solved, but still ongoing
• Stabilise the transport - since SC2 is not approaching the edge, the impression is far more stable
• Installation of SRM
• Installation of dCache
• Planned 2006: light path via X-Win (DFN) / Géant2 to CERN

Since SC2 (end of March): single stream up to 115 MByte/sec, sustained multi-stream 60 to 70 MByte/sec.


Forschungszentrum Karlsruhe in der Helmholtz-Gemeinschaft