Post on 23-Jan-2016

Page 1: SC3 experiences

SC3 experiences

Ron Trompert

SARA

Page 2: SC3 experiences

SC3 Infrastructure

Starting point: DMF-based HSM

DMF has no SRM implementation

DMF does not support functionality promised by the SRM standard, like file pinning.

Page 3: SC3 experiences

SC3 Infrastructure

dCache

dCache provides an SRM interface

dCache provides flexibility with respect to HSM backends, in case we need to switch to another HSM setup for some reason

Page 4: SC3 experiences

SC3 Infrastructure: throughput phase

Page 5: SC3 experiences

SC3 Throughput phase

Disk2disk: 100-110 MB/s

Problems with stability of the nodes: solved by limiting the number of I/O movers

Disk2tape: 50 MB/s

Not enough bandwidth, SAN not dedicated
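To put the throughput figures above in perspective, the rates can be converted to daily volumes. This is a minimal sketch, assuming decimal units (1 TB = 1,000,000 MB) and a rate sustained for a full 24 hours; the function name `tb_per_day` is only illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(rate_mb_per_s):
    """Convert a sustained rate in MB/s to TB/day (decimal units assumed)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


# Rates quoted on the slide: disk2disk 100-110 MB/s, disk2tape 50 MB/s.
for label, rate in [("disk2disk (low)", 100),
                    ("disk2disk (high)", 110),
                    ("disk2tape", 50)]:
    print(f"{label}: {rate} MB/s is about {tb_per_day(rate):.2f} TB/day")
```
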

Page 6: SC3 experiences

SC3 Infrastructure: service phase

Page 7: SC3 experiences

SC3 service phase statistics

Percentage of computational resources used (October-December)

          LHCb   ATLAS
SARA      28     0
NIKHEF    21     39

Page 8: SC3 experiences

SC3 service phase statistics

             LHCb   ATLAS
GBs in       7638   881
GBs out      5      0
GB stored    3334   900

Page 9: SC3 experiences

SC3 service phase statistics

Setting up the infrastructure took longer than we had hoped so unfortunately we missed ALICE.

Sizes and number of files transferred to the SRM SE

                                            LHCb     ATLAS
Average file size                           188 MB   211 MB
# inbound transfers                         41508    4277
# inbound transfers, file size < 100 MB     5013     3526
# inbound transfers, file size < 1 MB       4922     3261
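The share of small files differs sharply between the two experiments, which is easier to see as percentages. The sketch below only recomputes ratios from the transfer counts on this slide; the dictionary layout and the helper name `share_percent` are illustrative choices, not part of the original material.

```python
# Transfer counts copied from the slide.
transfers = {
    "LHCb":  {"total": 41508, "under_100MB": 5013, "under_1MB": 4922},
    "ATLAS": {"total": 4277,  "under_100MB": 3526, "under_1MB": 3261},
}


def share_percent(part, total):
    """Return part/total as a percentage."""
    return 100.0 * part / total


for vo, counts in transfers.items():
    small = share_percent(counts["under_100MB"], counts["total"])
    tiny = share_percent(counts["under_1MB"], counts["total"])
    print(f"{vo}: {small:.1f}% of inbound transfers < 100 MB, {tiny:.1f}% < 1 MB")
```

For LHCb roughly one transfer in eight was below 100 MB, whereas for ATLAS it was roughly four in five.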

Page 10: SC3 experiences

SC3 service phase observations

Networking problems

Hardware problems

The 10GE link to CERN was dedicated, but the 10G switch was not. Traffic switched back and forth between the dedicated 10GE link and GÉANT.

Routing problems

Considerably less data stored for ATLAS than expected.

The plans on the Wiki listed 20 TB

Page 11: SC3 experiences

SC3 service phase observations

Communication problems

Network changes not reported

We were not informed of changes in subnets.

Problems are not always reported

Failed transfers are not always reported

Network outage CERN-SARA between Xmas and New Year: nobody informed us

Monitoring: experiment monitoring websites are listed in the Wiki, but we also found other monitoring website URLs in emails.

Not clear what the experiments' exact plans are

When there are no transfers and no problems are reported, it is not clear whether something is wrong or things are going just as planned.

Page 12: SC3 experiences

SC3 service phase observations

Failed transfers by attempting to overwrite files

Not allowed by PNFS

At dCache sites running a gridftp door on their SRM node, files can be removed immediately using edg-gridftp-rm or glite-gridftp-rm

At dCache sites that don’t run a gridftp door on the srm node an advisory delete can be done. But then files are not immediately deleted.
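Because overwrites are refused and an advisory delete does not take effect immediately, a client that wants to replace a file has to delete it, wait until it is really gone, and only then upload. The sketch below shows that control flow only. The `exists`, `delete` and `upload` callables are hypothetical placeholders for whatever grid tools a site uses; they are not real dCache or gLite APIs, and the retry count and wait time are arbitrary.

```python
import time


def replace_remote_file(path, exists, delete, upload, retries=5, wait_seconds=1.0):
    """Delete-then-upload, for a namespace that refuses overwrites.

    exists, delete and upload are caller-supplied (hypothetical) callables
    that each take the remote path as their only argument.
    """
    if exists(path):
        delete(path)
        # A delete may be advisory only, so poll until the entry is gone.
        for _ in range(retries):
            if not exists(path):
                break
            time.sleep(wait_seconds)
        else:
            raise TimeoutError(f"{path} still present after {retries} checks")
    upload(path)


# In-memory stand-in for a remote namespace, for demonstration only.
namespace = {"/pnfs/example/file1"}
uploads = []
replace_remote_file("/pnfs/example/file1",
                    exists=lambda p: p in namespace,
                    delete=namespace.discard,
                    upload=uploads.append,
                    wait_seconds=0.0)
print(uploads)
```
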

Page 13: SC3 experiences

SC3 service phase observations

dCache security: (gsi)dcap

Using dccp, anyone can get anything in /pnfs/grid.sara.nl/data/<vo>

Unix permissions on directories are not honoured

Files in a directory with -rwxr-x--- are world readable.

File permissions are honoured, but when data is copied into /pnfs it gets -rw-r--r--.

Using gsidcap you are authenticated but the behaviour above stays the same.

Write permissions are OK.

Maybe this is OK for HEP VOs but for some VOs this is too liberal.
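For reference, the permission strings mentioned above can be decoded with Python's standard `stat` module. This sketch only illustrates ordinary Unix mode bits, assuming octal mode 0750 for the directory and 0644 for the copied file; it does not query dCache or PNFS, and the helper name `describe` is illustrative.

```python
import stat


def describe(mode):
    """Return (ls-style permission string, readable-by-others flag)."""
    return stat.filemode(mode), bool(mode & stat.S_IROTH)


# Assumed modes: a directory with rwxr-x--- and a file with rw-r--r--.
dir_mode = stat.S_IFDIR | 0o750
file_mode = stat.S_IFREG | 0o644

for name, mode in [("directory", dir_mode), ("copied file", file_mode)]:
    text, world_readable = describe(mode)
    print(f"{name}: {text} world-readable bit set: {world_readable}")
```

On a regular Unix file system the directory mode alone would keep other users away from the files inside it, which is why ignoring it matters for VOs that expect their data to be private.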

Page 14: SC3 experiences

SC3 service phase observations

Oracle database

Every now and then it just hangs and needs to be restarted.

Backups didn’t work but FTS and LFC did.

Page 15: SC3 experiences

SC3 service phase observations

A user wanted to run a job using ROOT I/O, which is rfio/dcap based.

Rfio/dcap are unauthenticated protocols to access data

Rfio comes automatically when installing a classic SE with yaim.

We don’t really like it but what do the other T1s think about this?

Page 16: SC3 experiences

SC4 Outlook

Current plans (being updated)

-Replace old SE by SRM SE

-Setup DB node for FTS/LFC

-Setup T2 tests

-Separate T1 tape storage from general storage