1 a basic r&d for an analysis framework distributed on wide area network hiroshi sakamoto...

17
1 A Basic R&D for an A Basic R&D for an Analysis Framework Analysis Framework Distributed on Wide Area Distributed on Wide Area Network Network Hiroshi Sakamoto Hiroshi Sakamoto International Center for Elementary International Center for Elementary Particle Physics (ICEPP), the Particle Physics (ICEPP), the University of Tokyo University of Tokyo For For Regional Center R&D Collaboration, Regional Center R&D Collaboration, Computing Research Center, KEK Computing Research Center, KEK ICEPP, the University of Tokyo ICEPP, the University of Tokyo

Upload: lambert-merritt

Post on 26-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

11

A Basic R&D for an Analysis A Basic R&D for an Analysis Framework Distributed on Wide Framework Distributed on Wide

Area NetworkArea Network

Hiroshi SakamotoHiroshi SakamotoInternational Center for Elementary Particle Physics International Center for Elementary Particle Physics

(ICEPP), the University of Tokyo(ICEPP), the University of TokyoForFor

Regional Center R&D Collaboration,Regional Center R&D Collaboration,Computing Research Center, KEKComputing Research Center, KEK

ICEPP, the University of TokyoICEPP, the University of Tokyo

Page 2: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

22

KEK-ICEPP R&D CollaborationKEK-ICEPP R&D Collaboration

ICEPP, Univ. TokyoICEPP, Univ. TokyoRegional Analysis Center for LHC ATLAS ExperimentRegional Analysis Center for LHC ATLAS Experiment

KEKKEKCentral HEP Facility in JapanCentral HEP Facility in Japan

Joint R&D ProgramJoint R&D ProgramComputing Research Center, KEK and ICEPP, TokyoComputing Research Center, KEK and ICEPP, Tokyo

Started in 2001Started in 2001

For Distributed Analysis FrameworkFor Distributed Analysis Framework

Introduces Computing Grid TechnologyIntroduces Computing Grid Technology

Page 3: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

33

R&D ItemsR&D ItemsPC FarmPC Farm

Blade ServersBlade Servers

Storage SystemStorage SystemIDE File Server PrototypeIDE File Server Prototype

Wide Area NetworkWide Area NetworkKEK-ICEPP GbE ConnectionKEK-ICEPP GbE Connection

CERN-ICEPP Connectivity StudyCERN-ICEPP Connectivity Study

Integrated StudyIntegrated StudyHPSS Access via WANHPSS Access via WAN

Page 4: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

44

HP Blade ServerHP Blade ServerHP ProLiant BL20p HP ProLiant BL20p

Xeon 2.8GHz 2CPU/nodeXeon 2.8GHz 2CPU/node

1GB memory1GB memory

SCSI 36GBx2, hardware RAID1SCSI 36GBx2, hardware RAID1

3 GbE NIC3 GbE NIC

iLO remote administration tooliLO remote administration tool

8 blades/ Enclosure(6U)8 blades/ Enclosure(6U)

Total 108 blades(216 CPUs) in 3 racksTotal 108 blades(216 CPUs) in 3 racks

Rocks 2.3.2(RedHat 7.3 base) for Cluster managRocks 2.3.2(RedHat 7.3 base) for Cluster managementement

Page 5: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

55

HP Blade Server 2HP Blade Server 2

Page 6: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

66

HP Blade Server Network HP Blade Server Network ConfigurationConfiguration

SW

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

SW

SW

FrontendServer

CampusLAN

Private LANPXE/Control

Private LANData

Private LANiLO

Blade Servers

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

NIC1

NIC2

NIC3

iLO

Page 7: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

77

File Server PrototypeFile Server Prototype

2 Xeon 2.8GHz (FSB 533MHz)2 Xeon 2.8GHz (FSB 533MHz)

Memory 1GBMemory 1GB

2 PCI RAID I/F, 3ware 7500-82 PCI RAID I/F, 3ware 7500-8

2 GbE(1000Base-T) LAN2 GbE(1000Base-T) LAN

4U Rack mount Case, 16 hot-swap IDE HDD bay4U Rack mount Case, 16 hot-swap IDE HDD bay

16 250GB HDD + 40GB HDD16 250GB HDD + 40GB HDD

460W N+1 Redundant Power Supply460W N+1 Redundant Power Supply

7 servers, total 24.5TB7 servers, total 24.5TB

Page 8: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

88

8 HDD RAID5

0

20

40

60

80

100

120

140

160

180

EXT2 EXT3 XFS J FS ReiserFS

FileSystem

Tra

nsf

er

Rat

e (

MB

/s)

READ WRITE

File Server Prototype 2File Server Prototype 2BenchmarksBenchmarks

Read 350MB/s, Read 350MB/s, Write 170-250MB/s Write 170-250MB/s RAID0 16 HDDRAID0 16 HDD

Read Read 140-160MB/s, 140-160MB/s, Write 20-50MB/s, Write 20-50MB/s, RAID5 8 HDDRAID5 8 HDD

16 HDD RAID0

0

50

100

150

200

250

300

350

400

EXT2 EXT3 XFS J FS ReiserFS

FileSystem

Tra

nsf

er

Rat

e (

MB

/s)

READ WRITE

RAID0 16HDD

RAID5 8 HDD

Page 9: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

99

GbE between KEK and ICEPPGbE between KEK and ICEPP10Gbps SuperSINET Backbone10Gbps SuperSINET Backbone

CommodityCommodity

Project Oriented 1GbpsProject Oriented 1GbpsKEK-ICEPP for ATLAS ExperimentKEK-ICEPP for ATLAS Experiment

Page 10: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1010

WAN Performance MeasurementWAN Performance Measurement“A” settingTCP 479Mbps -P 1 -t 1200 -w 128KBTCP 925Mbps -P 2 -t 1200 -w 128KBTCP 931Mbps -P 4 -t 1200 -w 128KBUDP 953Mbps -b 1000MB -t 1200 -w 128KB

"A" setting: 104.9MB/s "B" setting: 110.2MB/s

“B” settingTCP 922Mbps -P 1 -t 1200 -w 4096KBUDP 954Mbps -b 1000MB -t 1200 -w 4096KB

Page 11: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1111

Wide Area NetworkWide Area NetworkInternational networkInternational network

2.5Gbps x 2 (=5Gbps) to NY2.5Gbps x 2 (=5Gbps) to NY

Page 12: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1212

CERN-Tokyo (Very preliminary)CERN-Tokyo (Very preliminary)[UDP] 17:00-17:30option: -b 1000MB -w 256KB[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams [ 3] 0.0-1095.5 sec 116 GBytes 906 Mbits/sec 0.015 ms 4723300/89157875 (5.3%)[ 3] 0.0-1095.5 sec 134933 datagrams received out-of-order

[TCP] 11:00-16:30[ ID] Interval Transfer Bandwidth Option[SUM] 0.0-1202.7 sec 645 MBytes 4.50 Mbits/sec -P 4 -w 256KB[SUM] 0.0-1203.3 sec 1.13 GBytes 8.04 Mbits/sec -P 8 -w 256KB[SUM] 0.0-1215.0 sec 2.35 GBytes 16.6 Mbits/sec -P 16 -w 256KB[SUM] 0.0-1219.9 sec 5.17 GBytes 36.4 Mbits/sec -P 32 -w 256KB[SUM] 0.0-1230.9 sec 8.30 GBytes 57.9 Mbits/sec -P 64 -w 256KB[SUM] 0.0-1860.0 sec 30.2 GBytes 140 Mbits/sec -P 128 -w 256KB[SUM] 0.0-1860.2 sec 37.0 GBytes 171 Mbits/sec -P 160 -w 256KB

0

20

40

60

80

100

120

140

4 8 16 32 64 128

Page 13: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1313

1Gbps

100Mbps

ICEPP KEK

100 CPUs6CPUs

HPSS 120TB

GRID testbed environmentwith HPSS through GbE-WAN

NorduGrid - grid-manager - gridftp-serverGlobus-mdsGlobus-replicaPBS server

NorduGrid - grid-manager - gridftp-serverGlobus-mdsPBS server

PBS clientsPBS clients

HPSS servers

~ 60km0.2TBSECE

CECE

SE

User PCs

Page 14: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1414

Basic Network Performance Basic Network Performance

0 0.2 0.4 0.6 0.8 1[106]

0

200

400

600744Mb/s 2.8ms

700Mb/s 0.39ms

HPSS mover -> KEK client HPSS mover -> ICEPP client

HPSS mover Bufsize=256kB

Network performance by netperf

TCP Buffer size of Client (Byte)

Transfe

r speed (M

Bit/s)

RTT~3msRTT~3ms

packet loss free.packet loss free.

MTU=1500MTU=1500

CPU/NIC is CPU/NIC is bottleneck.bottleneck.

Max TCP Buffer Max TCP Buffer Size(256k) in Size(256k) in HPSS servers HPSS servers cannot be cannot be changed.changed.(optimized for (optimized for IBM SP switch)IBM SP switch)

WANLAN

Page 15: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1515

Client disk speed @ KEK = 48MB/sClient disk speed @ ICEPP=33MB/s

0 2 4 6 8 100

20

40

60

80

KEK client (LAN) ICEPP client(WAN)

# of file transfer in parallel

Agg

rega

te T

rans

fer

spee

d (M

B/s

)Pftp→pftp HPSS mover disk Client disk

client disk speed 35-45MB/s

to /dev/null

to client disk Ftp buffer=64MB

Page 16: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1616

SummarySummary

R&D for Distributed Analysis FrameworkR&D for Distributed Analysis FrameworkStudy of FabricStudy of Fabric

Blade ServersBlade Servers

Disk Array/File Server PrototypeDisk Array/File Server Prototype

Network ConnectivityNetwork ConnectivityKEK-ICEPPKEK-ICEPP

CERN-ICEPPCERN-ICEPP

Integrated PerformanceIntegrated PerformanceHPSS Access via WANHPSS Access via WAN

Page 17: 1 A Basic R&D for an Analysis Framework Distributed on Wide Area Network Hiroshi Sakamoto International Center for Elementary Particle Physics (ICEPP),

1717

ProspectsProspects

Network ConnectivityNetwork ConnectivityCERN-KEK/Tokyo to 10Gbps in 2004CERN-KEK/Tokyo to 10Gbps in 2004

Tokyo-Taipei Connectivity Study soonTokyo-Taipei Connectivity Study soon

File Server UpgradeFile Server Upgrade100TB Disk Capacity100TB Disk Capacity

Hierarchical Storage Study soonHierarchical Storage Study soon

Prepare for ExperimentsPrepare for Experiments2004 Data Challenges2004 Data Challenges