SARA Reken- en Netwerkdiensten
NL-T1 Report
Ron Trompert
ATLAS NL Cloud meeting 05-04-2011
Contents
Infrastructure
Usage: CPU, Disk, Tape
I/O: Disk storage, Compute, Tape
LHCOPN dashboard
Issues
New procurements
dCache
Infrastructure
CPU Usage
[Plots: CPU usage at SARA and at NIKHEF]
ATLAS Disk Usage
SARA disk in GB
NIKHEF disk in GB
ATLAS Tape Usage
SARA tape usage in TB
I/O storage
[Plots: storage I/O, in and out]
I/O Gina
I/O tape
Read performance: about 400-500 MB/s from tape to dCache on average when there is no heavy tape writing going on
Have done some on-the-fly tuning to get there:
  Adapted the dCache HSM copy script
  CXFS/DMF client node tuning (queue lengths)
Performance is OK given the circumstances but below our target of 1 GB/s
Should be better with new hardware and DMF5
Have replaced hpn-ssh with Globus GridFTP for copying files between the CXFS/DMF client nodes and dCache
Write performance is also about 400-500 MB/s
Adapted the HSM copy script to compute checksums, and to flush and fsync when writing to the CXFS/DMF clients, to avoid data corruption
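The checksum, flush and fsync steps mentioned above can be sketched roughly as follows. This is not the actual HSM copy script, only a minimal illustration in Python. The function names, the chunk size and the choice of Adler-32 as the checksum are assumptions made for the example.

```python
import os
import tempfile
import zlib

CHUNK = 1024 * 1024  # read/write in 1 MiB pieces (arbitrary choice)


def adler32_of(path):
    """Compute the Adler-32 checksum of a file, chunk by chunk."""
    value = 1  # Adler-32 starting value
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def copy_with_verify(src, dst):
    """Copy src to dst, force the data to stable storage, then verify it."""
    value = 1
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(CHUNK)
            if not chunk:
                break
            value = zlib.adler32(chunk, value)
            fout.write(chunk)
        fout.flush()             # push Python's buffer to the OS
        os.fsync(fout.fileno())  # ask the OS to commit the data to storage
    source_sum = value & 0xFFFFFFFF
    # Note: this re-read may be served from the page cache on the same host.
    if adler32_of(dst) != source_sum:
        raise IOError("checksum mismatch after copy: %s" % dst)
    return source_sum


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "in.dat")
        dst = os.path.join(d, "out.dat")
        with open(src, "wb") as f:
            f.write(b"example payload" * 1000)
        print("adler32: %08x" % copy_with_verify(src, dst))
```

The point of the design is that the checksum is computed on the data as it is read from the source, so a mismatch on the destination side signals that something went wrong between read and write.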
LHC OPN Dashboard
NL-T1 meeting | 30-11-2010
Issues
Part of the farm at NIKHEF has not been usable due to longstanding network issues related to the built-in switches of blade centers delivered in the autumn of 2009. The vendor has not been very active in attempting to resolve this, but we hope to have a solution soon.
Because of this, ATLAS jobs run on only part of the farm and therefore stay queued longer. Pilot factories submit so many jobs that the batch system sometimes appears unable to find any runnable non-ATLAS jobs behind this huge queue, which leaves job slots unused.
According to the VO ID card, ATLAS jobs need 3072 MB of virtual memory. Our batch systems nevertheless set the limit at 4096 MB, and this is still too small for some ATLAS jobs. How is ATLAS going to tackle this?
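To make the memory numbers above concrete, here is a small sketch of how a job could compare its requirement with the limit it actually runs under. It assumes the batch limit is enforced as an address-space limit (RLIMIT_AS) on a Unix system, which the slides do not state. The helper names are made up for this example.

```python
import resource

MB = 1024 * 1024


def fits_within_limit(required_mb, limit_bytes):
    """Return True if required_mb of virtual memory fits under limit_bytes."""
    if limit_bytes == resource.RLIM_INFINITY:
        return True  # no limit set
    return required_mb * MB <= limit_bytes


def current_address_space_limit():
    """Return the soft address-space limit of this process, in bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    return soft


if __name__ == "__main__":
    limit = current_address_space_limit()
    # 3072 MB is the VO ID card value quoted above.
    print("VO card requirement fits:", fits_within_limit(3072, limit))
```

With the numbers from the slide, a job that stays at 3072 MB fits under a 4096 MB limit, while any job that grows beyond 4096 MB does not.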
Is ATLAS able to use CVMFS using the mount point /cvmfs/atlas.cern.ch/? This would solve two problems:
  A third of the content of the BDII
  No quota on experiment software disk
We have seen transfers from ATLASLOCALGROUPDISK to elsewhere. Isn’t LOCAL supposed to be LOCAL?
FTS channels: Wouldn’t it be good to let the site admins within the NL cloud be channel admins of their own channels? Then you can tune the channel any way you want, or turn it off yourself when going into downtime.
New procurements
Compute: 50 KSi2006 rate in total, of which 27 at NIKHEF and 23 at SARA
Tape: 2 PB
Disk: 850 TiB at SARA, 280 TiB at NIKHEF
Pledges are still under discussion
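The procurement figures above mix binary (TiB) and decimal (PB) units. The short sketch below adds up the per-site numbers from the slide and converts the disk totals to decimal TB. The variable and function names are only for illustration.

```python
TIB = 2 ** 40   # bytes in one tebibyte
TB = 10 ** 12   # bytes in one terabyte


def tib_to_tb(tib):
    """Convert tebibytes to decimal terabytes."""
    return tib * TIB / TB


# Figures as quoted on the slide.
compute = {"NIKHEF": 27, "SARA": 23}     # KSi2006 rate
disk_tib = {"SARA": 850, "NIKHEF": 280}  # TiB

total_compute = sum(compute.values())
total_disk_tib = sum(disk_tib.values())

if __name__ == "__main__":
    print("compute total:", total_compute)
    print("disk total: %d TiB = %.1f TB" % (total_disk_tib, tib_to_tb(total_disk_tib)))
```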
New procurements: Mass storage
Scalable solution with DMF5
Investigating faster SAN storage
dCache@SARA
The Golden release 1.9.5-* has been a very reliable workhorse over the past year. But ...
There will be a new Golden release, 1.9.12, with some very nice features for admins and also for users, for example:
srmGetTurl no longer waits the standard 4 seconds
WebDAV (http/https). Mount dCache on your laptop.
So, we intend to upgrade
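As an illustration of what WebDAV access could look like from a user's side, the sketch below builds a standard WebDAV PROPFIND request for a directory listing using only the Python standard library. The host name, port and path are hypothetical placeholders, not the real SARA endpoint. Authentication (for example with a grid certificate) is left out.

```python
import urllib.request

# Hypothetical endpoint, for illustration only.
EXAMPLE_URL = "https://dcache.example.org:2880/data/"


def build_propfind(url, depth="1"):
    """Build a WebDAV PROPFIND request; Depth 1 lists a directory's children."""
    return urllib.request.Request(url, method="PROPFIND", headers={"Depth": depth})


def list_directory(url):
    """Send the PROPFIND request and return the XML response body as text."""
    with urllib.request.urlopen(build_propfind(url)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_propfind(EXAMPLE_URL)
    print(req.get_method(), req.full_url)
```

Calling `list_directory` needs a reachable WebDAV server. The `__main__` part only builds the request, so it runs without network access.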