PROOF Farm preparation for Atlas FDR-1
Wensheng Deng, Tadashi Maeno, Sergey Panitkin,Robert Petkus, Ofer Rind, Torre Wenaus, Shuwei
Ye
BNL
Outline
Introduction • Atlas FDR-1 • Farm preparation for FDR-1 • PROOF tests • Analyses
Sergey Panitkin
S. Rajagopalan, FDR meeting for U.S.
FDR: What is it?
Provides a realistic test of the computing model from online (SFO) to analysis at Tier-2’s.
Exercise the full software infrastructure (CondDB, TAGDB, trigger configuration, simulation with mis-alignments, etc.) using mixed events.
Implement the calibration/alignment model.
Implement the Data Quality monitoring.
Specifics (from D. Charlton, T/P week):
Prepare a sample of mixed events that looks like raw data (bytestream)
Stream the events from the SFO output at Point 1, including express and calibration streams
Copy to Tier 0 (and replicate to Tier 1’s)
Run calibration and DQ procedures on express/calibration stream
Bulk processing after 24-48 hours incorporating any new calibrations.
Distribute ESD and AOD to Tier-1s (later to Tier 2’s as well)
Make TAG and DPDs
Distributed Analysis
Reprocess data after a certain time.
FDR-1 Time Line
January: Sample preparation, mixing events
Week of Feb. 4: FDR-1 run. Stream data through SFOs.
Transfer to T0, processing of ES and CS.
Bulk processing completed by weekend.
Including ESD and AOD production
Regular shifts: DQ monitoring, Calibration and Tier-0 processing shifts
Expert coverage at Tier-1 as well to ensure smooth data transfer.
Week of February 11: AOD samples transferred to Tier-1s
DPD production at Tier-1.
Week of February 18/25: All data samples should be available for subsequent analysis.
At some later point: Reprocessing at Tier-1’s and re-production of DPDs.
FDR-1 should complete before April and feedback into FDR-2
PROOF farm preparation
The existing Atlas PROOF farm at BNL was expanded in anticipation of FDR-1
10 new nodes, each with: 8 CPU cores, 16 GB RAM, 500 GB hard drive, 1 Gb network; an additional 64 GB Solid State Disk (SSD) per node is expected
Standard Atlas software stack, Ganglia monitoring, latest version of ROOT (5.18 as of Jan. 28, 2008)
Current Farm Configuration
“Old farm”: 10 nodes with 4 GB RAM each; 40 cores (1.8 GHz Opterons); 20 TB of HDD space (10 x 4 x 500 GB)
Extension: 10 nodes with 16 GB RAM each; 80 cores (2.0 GHz Kentsfields); 5 TB of HDD space (10 x 500 GB); 640 GB of SSD space (10 x 64 GB)
Farm resource distribution issues
The new “extension” machines are “CPU heavy”: 8 cores, 1 HDD
Tests showed that 1 CPU core requires ~10 MB/s in a typical I/O-bound Atlas analysis
Tests showed that 1 SATA HDD can sustain ~20 MB/s, i.e. enough for only ~2 cores
In order to provide adequate bandwidth for all 8 cores per box, we needed to augment the “extension” machines with SSDs
SSDs provide high bandwidth, capable of sustaining the 8-core load, but have relatively small volume (64 GB per machine); they will be able to accommodate only a fraction of the expected FDR1 data
Hence, SSD space should be actively managed; the exact scheme of data management needs to be worked out
The following slides attempt to summarize the current discussion of data management with the current PROOF farm configuration
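The I/O arithmetic above can be sketched as a quick back-of-the-envelope check; the throughput figures are the measured values quoted on this slide:

```python
# Back-of-the-envelope I/O check for the "extension" nodes,
# using the measured rates quoted above.
CORE_DEMAND_MB_S = 10    # ~10 MB/s per core in I/O-bound Atlas analysis
HDD_RATE_MB_S = 20       # one SATA HDD sustains ~20 MB/s
SSD_RATE_MB_S = 120      # Mtron SSD sustained read ~120 MB/s
CORES_PER_NODE = 8

def cores_sustained(device_rate_mb_s: float) -> int:
    """How many analysis cores one device can feed at full I/O demand."""
    return int(device_rate_mb_s // CORE_DEMAND_MB_S)

print(cores_sustained(HDD_RATE_MB_S))   # a single HDD feeds only 2 of the 8 cores
print(cores_sustained(SSD_RATE_MB_S))   # the SSD covers the full 8-core load
```

This is why one 500 GB HDD per 8-core box is the bottleneck, and why the SSD, despite its small volume, restores the balance.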
New Solid State Disks
Model: Mtron MSP-SATA7035064
Capacity: 64 GB
Average access time: ~0.1 ms (typical HDD: ~10 ms)
Sustained read: ~120 MB/s
Sustained write: ~80 MB/s
IOPS (sequential/random): 81,000/18,000
Write endurance: >140 years @ 50 GB written per day
MTBF: 1,000,000 hours
7-bit Error Correction Code
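The quoted endurance figure can be translated into an implied total-write budget, using only the numbers on this slide:

```python
# Translate the quoted endurance (>140 years at 50 GB written per day)
# into the implied total write volume for the 64 GB Mtron SSD.
GB_PER_DAY = 50
YEARS = 140
total_write_gb = GB_PER_DAY * 365 * YEARS
print(total_write_gb)          # 2,555,000 GB, i.e. ~2.5 PB of total writes
print(GB_PER_DAY / 64)         # < 1 full-capacity rewrite per day at that rate
```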
Farm resource distribution
[Diagram: farm resource distribution — “Old Farm” (40 cores, 20 TB HDD) and Extension (80 cores, 5 TB HDD, 640 GB SSD), exported as the storage pools BNLXRDHDD1, BNLXRDHDD2, and BNLXRDSSD]
Plans for FDR1 and beyond
Test data transfer from dCache: direct transfer (xrdcp) via the Xrootd door on dCache, or two-step transfer (dccp + xrdcp) through intermediate storage
Integration with Atlas DDM
Implement dq2 registration for dataset transfers
Gain experience with SSDs; scalability tests with SSDs and regular HDDs
Choice of optimal PROOF configuration for SSD nodes
Data staging mechanism within the farm
HD to SSD data transfer
SSD space monitoring and management
Analysis policies (free-for-all, analysis train, subscription, etc.)
Test “fast Xrootd access” – new I/O mode for Xrootd client
Test Xrootd/PROOF federation (geographically distributed) with Wisconsin
Organize local user community to analyze FDR data
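The two transfer paths listed above (direct xrdcp via the dCache Xrootd door, and the two-step dccp + xrdcp fallback) can be sketched as command builders. Only the tool names come from the slides; the hostnames and paths below are hypothetical placeholders:

```python
# Sketch of the two transfer paths: command strings only, not executed here.
# Hostnames and paths are hypothetical; xrdcp and dccp are the tools
# named on the slide.
def direct_transfer_cmd(lfn: str) -> str:
    """Direct copy through the (assumed) Xrootd door on dCache."""
    return (f"xrdcp root://dcache-door.example.bnl.gov//{lfn} "
            f"root://proof-farm.example.bnl.gov//data/{lfn}")

def two_step_transfer_cmds(lfn: str, scratch: str = "/scratch") -> list:
    """Fallback: dccp out of dCache to local scratch, then xrdcp onto the farm."""
    local = f"{scratch}/{lfn.rsplit('/', 1)[-1]}"
    return [
        f"dccp /pnfs/example/{lfn} {local}",
        f"xrdcp {local} root://proof-farm.example.bnl.gov//data/{lfn}",
        f"rm {local}",
    ]

for cmd in two_step_transfer_cmds("fdr1/AOD.0001.pool.root"):
    print(cmd)
```

The fallback costs one extra copy through intermediate storage, which is why the direct Xrootd-door path is preferred when the door is stable.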
Data Flow I
We expect that all the data (AODs, DPDs, TAGs, etc.) will first arrive at dCache.
We assume that a certain subset of the data will be copied from dCache to the PROOF farm for analysis in ROOT.
This movement is expected to be done using a set of custom scripts and is initiated by the Xrootd/PROOF farm manager.
Scripts will copy datasets using xrdcp via the Xrootd door on dCache.
A fallback solution exists in case the Xrootd door on dCache is unstable.
Copied datasets will be registered in DQ2.
On the Xrootd farm, datasets will be stored on HDD space (currently ~25 TB).
Certain high-priority datasets will be copied to the SSD disks by the farm manager for analysis with PROOF.
Determination of the high-priority datasets will be based on physics analysis priorities (FDR coordinator, PWG, etc.).
The exact scheme for SSD “subscription” (subscription, on-demand loading, etc.) needs to be worked out.
Look at ALICE.
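A minimal sketch of the active SSD management described above: high-priority datasets are staged into the 640 GB pool, evicting lower-priority ones when space runs out. The eviction policy and all names are assumptions — the slide explicitly leaves the exact scheme open:

```python
# Hypothetical priority-driven manager for the 640 GB SSD pool:
# high-priority datasets are staged in, lowest-priority ones are
# evicted when space runs out. The policy is an assumption; the
# slide leaves the exact "subscription" scheme to be worked out.
SSD_CAPACITY_GB = 640

class SSDManager:
    def __init__(self, capacity_gb=SSD_CAPACITY_GB):
        self.capacity = capacity_gb
        self.staged = {}  # dataset name -> (size_gb, priority)

    def used(self):
        return sum(size for size, _ in self.staged.values())

    def stage(self, name, size_gb, priority):
        """Stage a dataset, evicting strictly lower-priority ones if needed."""
        if size_gb > self.capacity:
            return False
        victims = sorted(
            (n for n, (_, p) in self.staged.items() if p < priority),
            key=lambda n: self.staged[n][1],
        )
        while self.used() + size_gb > self.capacity and victims:
            del self.staged[victims.pop(0)]   # evict lowest priority first
        if self.used() + size_gb > self.capacity:
            return False                      # not enough evictable space
        self.staged[name] = (size_gb, priority)
        return True

mgr = SSDManager()
mgr.stage("fdr1.StreamEgamma.AOD", 400, priority=1)
mgr.stage("fdr1.StreamMuon.AOD", 200, priority=1)
ok = mgr.stage("fdr1.Express.DPD", 300, priority=5)  # forces one eviction
```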
Integration with Atlas DDM
[Diagram: DDM integration data flow — Grid transfers bring FDR data from T0 into dCache; Panda also writes to dCache. The farm manager copies datasets to /data on the Xrootd/PROOF farm via xrdcp with dq2 registration, and stages them to /ssd via xrdcp driven by tentakel. Datasets are registered in DQ2; an Atlas user locates files for analysis with dq2_ls -fp -s BNLXRDHDD1 “my_dataset”]
FDR tests
Batch analyses with Xrootd as data server: AOD analysis; compare speed with dCache (D. Adams, H. Ma)
Store (all?) TAGs on the farm: our previous tests showed that Athena analyses gain from TAGs stored on Xrootd
Use the PROOF farm for physics analysis: Athena ROOT Access (ARA) analysis of AODs using PROOF; ARA was demonstrated to run on PROOF in January (Shuwei Ye)
Store (all?) FDR1 DPDs on the farm: FDR1 DPDs made by H. Ma have already been copied to the farm
DPD-based analyses: Stephanie Majewski plans to study the increase in sensitivity of an inclusive SUSY search using information from isolated tracks
Root version mismatch issues
All datasets for FDR1 will be produced with rel. 13, which relies on ROOT v5.14.
The PROOF farm currently uses the latest production version of ROOT, v5.18. This version has many improvements in functionality and stability compared to v5.14 and is recommended by the PROOF developers.
Due to changes in the xrootd protocol, clients running ROOT v5.14 cannot work with xrootd/PROOF servers from v5.18.
In order to run ARA analysis on PROOF, or to utilize the farm as an Xrootd SE for AOD/TAG analysis, the PROOF farm needs to be downgraded to v5.14. Such a downgrade would hurt ROOT-based analysis of AANTs and DnPDs.
In principle we can run two farms in parallel: the old farm with PROOF v5.14, and the extension farm with PROOF v5.18.
The data management scheme described on the previous slides can be trivially applied to both farms.
This is a temporary solution: Athena is expected to use ROOT v5.18 in the next release, which will largely remove the version mismatch problems.
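The parallel-farm workaround amounts to a simple dispatch rule on the client's ROOT version. A sketch, with hypothetical farm endpoints (the 5.14 vs 5.18 split is the one described above):

```python
# Route a session to the farm matching the client's ROOT version.
# Endpoints are hypothetical placeholders; the incompatibility between
# 5.14 clients and 5.18 xrootd/PROOF servers is as described above.
FARMS = {
    "5.14": "old-farm.example.bnl.gov",   # rel. 13 / ARA clients
    "5.18": "ext-farm.example.bnl.gov",   # current production ROOT
}

def pick_farm(root_version: str) -> str:
    """Match on major.minor: a 5.14 client cannot talk to a 5.18 server."""
    major_minor = ".".join(root_version.split(".")[:2])
    try:
        return FARMS[major_minor]
    except KeyError:
        raise ValueError(f"no PROOF farm runs ROOT {root_version}")

print(pick_farm("5.14.00h"))
print(pick_farm("5.18"))
```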
Current status
Work in progress!
File transfer from dCache is functional. A new LRC was created. Files copied to Xrootd are registered in the LRC via a custom dq2_cr. Datasets can be found using DDM tools (dq2-list-dataset-replicas):
user.HongMa.fdr08_run1.0003050.StreamEgamma.merge.AOD.o1_r6_t1.DPD_v130040_V5
INCOMPLETE: BNLPANDA,BNLXRDHDD1
COMPLETE:
The list of files in a dataset on Xrootd can be obtained via dq2_ls. Several FDR1 AOD datasets and one DPD dataset were transferred using this mechanism.
Issues: still need better integration with DDM; possible problem with large file transfers via the dCache door.
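The replica listing shown above can be turned into structured data with a few lines of parsing. The two-line INCOMPLETE:/COMPLETE: format is inferred from the example on this slide; any other output shape would need adjusting:

```python
# Parse dq2-list-dataset-replicas output of the form shown above into
# a dict mapping status -> list of sites. Format inferred from the
# single example on this slide.
def parse_replicas(output: str) -> dict:
    replicas = {}
    for line in output.splitlines():
        if ":" in line:
            status, _, sites = line.partition(":")
            replicas[status.strip()] = [s for s in sites.strip().split(",") if s]
    return replicas

example = """INCOMPLETE: BNLPANDA,BNLXRDHDD1
COMPLETE:"""
print(parse_replicas(example))
# {'INCOMPLETE': ['BNLPANDA', 'BNLXRDHDD1'], 'COMPLETE': []}
```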