Statistics of CAF usage, Interaction with the GRID
Marco MEONI, CERN - Offline Week – 11.07.2008
Outline
CAF Usage and Users’ grouping
Disk monitoring
Datasets
CPU Fairshare monitoring
User query
Conclusions & Outlook
CERN Analysis Facility
Cluster of 40 machines, in production for two years: 80 CPUs, 8 TB of disk pool
35 machines as PRO partition, 5 as DEV
Head node is xrootd redirector and PROOF master
Other nodes are xrootd data servers and PROOF slaves
Available resources in CAF must be fairly used
Highest attention to how disks and CPUs are used
Users are grouped: at present, by sub-detector and physics working group (PWG)
Users can belong to several groups (PWG has precedence over sub-detector)
Each group:
– has a disk space (quota) which is used to stage datasets from AliEn
– has a CPU fairshare target (priority) to regulate concurrent queries
CAF Usage
CAF Groups

Group       #Users   Disk quota (GB)   CPU quota (%)
PWG0             5              1000              10
PWG1             1              1000              10
PWG2            21              1000              10
PWG3             8              1000              10
PWG4            17              1000              10
EMCAL            1                 -              10
HMPID            1                 -              10
ITS              3                 -              10
T0               1                 -              10
MUON             3                 -              10
PHOS             1                 -              10
TPC              2                 -              10
TOF              1                 -              10
ZDC              1                 -              10
proofteam        5               100              10
testusers       40                 -              10
marco            1               200              10
COMMON           1              2000              10
Quotas are not absolute
– 18 registered groups
– ~60 registered users
– 165 users have used CAF: please register to groups!
Resource Monitoring
• MonALISA (ML) ApMon running on each node (see the sketch after this list)
– sends monitoring information every minute
– default monitoring (load, CPU, memory, swap, disk I/O, network)
– additional information:
• PROOF and disk server status (xrootd/olbd)
• number of PROOF sessions (proofd master)
• number of queued staging requests and hosted files (DS manager)
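For illustration, a minimal sketch of publishing such per-node values with the MonALISA ApMon C++ client; the destination host, cluster name and parameter names are assumptions (not the ones used on CAF), and constructor/overload signatures may vary between ApMon versions:

```cpp
#include <ApMon.h>   // MonALISA ApMon C++ client

int main() {
    // Destination MonALISA service (hypothetical host/port).
    char dest[] = "monalisa.cern.ch:8884";
    char *destinations[] = { dest };
    ApMon apm(1, destinations);

    // One round of node metrics; names and values are illustrative.
    apm.sendParameter((char *) "PROOF_CAF", (char *) "lxb6047",
                      (char *) "proof_sessions", 7);
    apm.sendParameter((char *) "PROOF_CAF", (char *) "lxb6047",
                      (char *) "staging_queue", 42);
    return 0;
}
```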
Status Table
[Per-node status table for lxb6047–lxb6080: raw files hosted (total 10992), sim files hosted (total 53665), raw data size in KB (total 154294464) and sim data size in KB (total 4508933068); lxb6054 holds no files, lxb6066 holds roughly twice the average]
Hosted files and Disk Usage
#Raw files: 11k, #Sim files: 54k; Raw on disk: 154 GB, Sim on disk: 4.5 TB
[Charts: number of files and disk pool usage (KB), split into raw data and sim data]
ESDs from RAW data production ready to be staged
• Datasets (DS) are used to stage files from AliEn
• A DS is a list of files (usually ESDs or archives) registered by users for processing with PROOF
• DSs may share the same physical files
• The staging script issues new staging requests and touches files every 5 mins
• Files are uniformly distributed by the xrootd data manager (see the sketch below)
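For illustration, a minimal sketch (not the CAF tooling itself) of how a user registers a dataset from a ROOT session with the standard PROOF dataset API; the master host, dataset name and file URLs are placeholders:

```cpp
// registerDS.C - run inside a ROOT session with: .x registerDS.C
{
   // Connect to the CAF PROOF master (placeholder host name).
   TProof *proof = TProof::Open("alicecaf.cern.ch");

   // Collect the files that make up the dataset (URLs are placeholders).
   TFileCollection *fc = new TFileCollection("run82XX_myESDs");
   fc->Add("root://server//alice/sim/2008/run82XX/001/AliESDs.root");
   fc->Add("root://server//alice/sim/2008/run82XX/002/AliESDs.root");

   // Register it; the DS manager then stages the files into the pool.
   proof->RegisterDataSet("run82XX_myESDs", fc);
   proof->ShowDataSets();
}
```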
Interaction with the GRID
Dataset Manager
• The DS manager takes care of the quotas at file level
• Physical location of files is regulated by xrootd
• The DS manager daemon sends:
– the overall number of files
– number of new, touched, disappeared, corrupted files
– staging requests
– disk utilization for each user and for each group
– number of files on each node and total size
Dataset Monitoring
– PWG1 is using 0% of 1 TB
– PWG3 is using 5% of 1 TB
Datasets List (name | #files | default tree | #events | size | staged; a usage sketch follows the list)
• /COMMON/COMMON/ESD5000_part | 1000 | /esdTree | 100000 | 50 GB | 100 %
• /COMMON/COMMON/ESD5000_small | 100 | /esdTree | 10000 | 4 GB | 100 %
• /COMMON/COMMON/run15034_PbPb | 967 | /esdTree | 939 | 500 GB | 97 %
• /COMMON/COMMON/run15035_PbPb | 962 | /esdTree | 952 | 505 GB | 98 %
• /COMMON/COMMON/run15036_PbPb | 961 | /esdTree | 957 | 505 GB | 99 %
• /COMMON/COMMON/run82XX_part1 | 10000 | /esdTree | 999500 | 289 GB | 99 %
• /COMMON/COMMON/run82XX_part2 | 10000 | /esdTree | 922600 | 289 GB | 92 %
• /COMMON/COMMON/run82XX_part3 | 10000 | /esdTree | 943100 | 288 GB | 94 %
• /COMMON/COMMON/sim_160000_esd | 95 | /esdTree | 9400 | 267 MB | 98 %
• /PWG0/COMMON/run30000X_10TeV_0.5T | 2167 | /esdTree | 216700 | 90 GB | 100 %
• /PWG0/COMMON/run31000X_0.9TeV_0.5T | 2162 | /esdTree | 216200 | 57 GB | 100 %
• /PWG0/COMMON/run32000X_10TeV_0.5T_Phojet | 2191 | /esdTree | 219100 | 83 GB | 100 %
• /PWG0/COMMON/run33000X_10TeV_0T | 2191 | /esdTree | 219100 | 108 GB | 100 %
• /PWG0/COMMON/run34000X_0.9TeV_0T | 2175 | /esdTree | 217500 | 65 GB | 100 %
• /PWG0/COMMON/run35000X_10TeV_0T_Phojet | 2190 | /esdTree | 219000 | 98 GB | 100 %
• /PWG0/phristov/kPhojet_k5kG_10000 | 100 | /esdTree | 1100 | 4 GB | 11 %
• /PWG0/phristov/kPhojet_k5kG_900 | 97 | /esdTree | 2000 | 4 GB | 20 %
• /PWG0/phristov/kPythia6_k5kG_10000 | 99 | /esdTree | 1600 | 4 GB | 16 %
• /PWG0/phristov/kPythia6_k5kG_900 | 99 | /esdTree | 1100 | 4 GB | 11 %
• /PWG2/COMMON/run82XX_test4 | 10 | /esdTree | 1000 | 297 MB | 100 %
• /PWG2/COMMON/run82XX_test5 | 10 | /esdTree | 1000 | 297 MB | 100 %
• /PWG2/akisiel/LHC500C0005 | 100 | /esdTree | 97 | 663 MB | 100 %
• /PWG2/akisiel/LHC500C2030 | 996 | /esdTree | 995 | 4 GB | 99 %
• /PWG2/belikov/40825 | 1355 | /HLTesdTree | 1052963 | 143 GB | 99 %
• /PWG2/hricaud/LHC07f_160033DataSet | 915 | /esdTree | 91400 | 2 GB | 99 %
• /PWG2/hricaud/LHC07f_160038_root_archiveDataSet | 862 | /esdTree | 86200 | 449 GB | 100 %
• /PWG2/jgrosseo/sim_1600XX_esd | 33568 | /esdTree | 3293900 | 103 GB | 98 %
• /PWG2/mvala/PDC07_pp_0_9_82xx_1 | 99 | /rsnMVTree | 990000 | 1 GB | 100 %
• /PWG2/mvala/RSNMV_PDC06_14TeV | 677 | /rsnMVTree | 6442101 | 24 GB | 100 %
• /PWG2/mvala/RSNMV_PDC07_09_part1 | 326 | /rsnMVTree | 2959173 | 5 GB | 100 %
• /PWG2/mvala/RSNMV_PDC07_09_part1_new | 326 | /rsnMVTree | 2959173 | 5 GB | 100 %
• /PWG2/pganoti/FirstPhys900Field_310000 | 1088 | /esdTree | 108800 | 28 GB | 100 %
• /PWG3/arnaldi/PDC07_LHC07g_200314 | 615 | /HLTesdTree | 45000 | 787 MB | 94 %
• /PWG3/arnaldi/PDC07_LHC07g_200315 | 594 | /HLTesdTree | 42600 | 744 MB | 95 %
• /PWG3/arnaldi/PDC07_LHC07g_200316 | 366 | /HLTesdTree | 30700 | 513 MB | 99 %
• /PWG3/arnaldi/PDC07_LHC07g_200317 | 251 | /HLTesdTree | 20100 | 333 MB | 100 %
• /PWG3/arnaldi/PDC08_170167_001 | 1 | N/A | 33 MB | 0 %
• /PWG3/arnaldi/PDC08_LHC08t_170165 | 976 | /HLTesdTree | 487000 | 4 GB | 99 %
• /PWG3/arnaldi/PDC08_LHC08t_170166 | 990 | /HLTesdTree | 495000 | 4 GB | 100 %
• /PWG3/arnaldi/PDC08_LHC08t_170167 | 975 | /HLTesdTree | 424500 | 8 GB | 87 %
• /PWG3/arnaldi/myDataSet | 975 | /HLTesdTree | 424500 | 8 GB | 87 %
• /PWG4/anju/myDataSet | 946 | /esdTree | 94500 | 27 GB | 99 %
• /PWG4/arian/jetjet15-50 | 9817 | /esdTree | 973300 | 630 GB | 99 %
• /PWG4/arian/jetjetAbove_50 | 94 | /esdTree | 8000 | 7 GB | 85 %
• /PWG4/arian/jetjetAbove_50_real | 958 | /esdTree | 90500 | 73 GB | 94 %
• /PWG4/elopez/jetjet15-50_28000x | 7732 | /esdTree | 739800 | 60 GB | 95 %
• /PWG4/elopez/jetjet50_r27000x | 8411 | /esdTree | 793100 | 92 GB | 94 %
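To show how these datasets are consumed, a minimal sketch of listing and processing one of them from a ROOT session; the master host is a placeholder and "MySelector.C" stands in for the user's analysis selector:

```cpp
// In a ROOT session: process a staged common dataset with PROOF.
{
   TProof *proof = TProof::Open("alicecaf.cern.ch"); // CAF master (placeholder)

   proof->ShowDataSets(); // prints a table like the one above

   // Run an analysis selector over one of the common datasets;
   // "+" compiles the selector on the workers.
   proof->Process("/COMMON/COMMON/run82XX_part1", "MySelector.C+");
}
```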
• Jury produced Pt spectrum plots staging his own DS (run #40825, TPC+ITS, field on)
• Start staging common DSs of reconstructed runs?
~4.7 GB used out of 6 GB (34 × 200 MB − 10%)
CPU Fairshare
• Usages retrieved every 5 mins, averaged every 6 hours
• New priorities are computed by applying a correction formula on the usage interval [α·quota .. β·quota], with α = 0.5, β = 2 (a worked sketch follows below):
  f(x) = q + q·exp(k·x), with k = (1/q)·ln(1/4), where q is the group quota
[Plot: correction function f(x) vs. CPU usage x for q = 10%; the priority falls from priorityMax towards priorityMin as the usage grows past the quota (q)]
• Priorities are used for CPU fairshare and converge to quotas
• Usages are averaged to gracefully converge to quotas
• If there is no competition, users get the maximum number of CPUs
• Only relative priorities are modified!
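A small self-contained sketch of the correction formula above (an illustration, not the CAF implementation), using q = 10% as for the PWG groups:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// f(x) = q + q*exp(k*x), with k = (1/q)*ln(1/4), evaluated on usages
// clamped to [alpha*q, beta*q] with alpha = 0.5, beta = 2.
double Priority(double usage, double q, double alpha = 0.5, double beta = 2.0) {
    const double k = std::log(0.25) / q;                       // k = (1/q) * ln(1/4)
    const double x = std::min(std::max(usage, alpha * q), beta * q);
    return q + q * std::exp(k * x);                            // decays towards q
}

int main() {
    const double q = 0.10;                                     // 10% quota
    for (double u : {0.05, 0.10, 0.20})                        // α·q, q, β·q
        std::printf("usage %.0f%% -> priority %.1f%%\n",
                    100 * u, 100 * Priority(u, q));
    return 0;
}
// Prints: usage 5% -> 15.0%, usage 10% -> 12.5%, usage 20% -> 10.6%
```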
Priority Monitoring
CPU quotas in practice
– only PWGs + default groups
– default usually has the highest usage
Query Monitoring
– When a user query completes, the PROOF master sends statistics:
• read bytes
• consumed CPU time (the basis for CPU fairshare)
• number of processed events
• user waiting time
– Values are aggregated per user and per group
Query Monitoring
[Charts: aggregated values, accumulated and per interval; a sketch of such an aggregation follows below]
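As an illustration of the aggregation step, a hypothetical sketch that accumulates the four per-query values per group, both overall and per interval; all field names and numbers are made up:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical per-query statistics, mirroring the four values
// reported by the PROOF master above.
struct QueryStats {
    long long readBytes = 0;
    double    cpuTime   = 0;   // basis for the CPU fairshare
    long long events    = 0;
    double    waitTime  = 0;
};

int main() {
    std::map<std::string, QueryStats> accumulated;   // kept forever
    std::map<std::string, QueryStats> perInterval;   // reset each interval

    // One completed query reported for group PWG2 (made-up numbers).
    auto record = [](std::map<std::string, QueryStats> &m, const std::string &g) {
        m[g].readBytes += 512 * 1024 * 1024;
        m[g].cpuTime   += 37.5;
        m[g].events    += 100000;
        m[g].waitTime  += 4.2;
    };
    record(accumulated, "PWG2");
    record(perInterval, "PWG2");

    std::printf("PWG2 accumulated CPU time: %.1f s\n",
                accumulated["PWG2"].cpuTime);
    perInterval.clear();   // start the next interval
    return 0;
}
```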
Outlook
• User session monitoring
– on average 4-7 sessions in parallel (daytime hours, EU time), with peaks of 15-20 users during tutorial sessions: running history is missing
– need to monitor #workers per user when load-based scheduling is introduced
• Additional monitoring per single query (disk used and files/sec not implemented yet)
• Network
– traffic correlation among nodes
– xrootd activity with the new bulk staging requests
• Debug
– tool to monitor and kill a hanging session when Reset doesn't work (currently the cluster needs to be restarted)
• Hardware
– new ALICE Mac cluster "ready" (16 workers)
– new IT 8-core machines coming
• Training
– PROOF/CAF is the key setup for interactive user analysis (and more)
– number of people attending the monthly tutorial is increasing (20 people last week!)