STAR COMPUTING
STAR Computing Status, Out-source Plans, Residual Needs
Torre Wenaus, STAR Computing Leader
BNL
RHIC Computing Advisory Committee Meeting, BNL
October 11, 1999
Outline
STAR Computing Status
Out-source plans
Residual needs
Conclusions
Manpower
Very important development in the last 6 months: a big new influx of postdocs and students into computing and related activities
Increased participation and pace of activity in QA, online computing, production tools and operations, databases, and reconstruction software
Planned dedicated database person never hired (funding); databases consequently late, but we are now transitioning from an interim to our final database
Still missing online/general computing systems support person
  Open position cancelled due to lack of funding
  Shortfall continues to be made up by the local computing group
Some of our Youthful Manpower
Dave Alvarez, Wayne, SVT
Lee Barnby, Kent, QA and production
Jerome Baudot, Strasbourg, SSD
Selemon Bekele, OSU, SVT
Marguerite Belt Tonjes, Michigan, EMC
Helen Caines, Ohio State, SVT
Manuel Calderon, Yale, StMcEvent
Gary Cheung, UT, QA
Laurent Conin, Nantes, database
Wensheng Deng, Kent, production
Jamie Dunlop, Yale, RICH
Patricia Fachini, Sao Paulo/Wayne, SVT
Dominik Flierl, Frankfurt, L3 DST
Marcelo Gameiro, Sao Paulo, SVT
Jon Gans, Yale, online
Dave Hardtke, LBNL, Calibrations, DB
Mike Heffner, Davis, FTPC
Eric Hjort, Purdue, TPC
Amy Hummel, Creighton, TPC, production
Holm Hummler, MPG, FTPC
Matt Horsley, Yale, RICH
Jennifer Klay, Davis, PID
Matt Lamont, Birmingham, QA
Curtis Lansdell, UT, QA
Brian Lasiuk, Yale, TPC, RICH
Frank Laue, OSU, online, calibrations
Lilian Martin, Subatech, SSD
Marcelo Munhoz, Sao Paulo/Wayne, online
Aya Ishihara, UT, QA
Adam Kisiel, Warsaw, online, Linux
Hui Long, UCLA, TPC
Vladimir Morozov, LBNL, simulation
Alex Nevski, RICH
Sergei Panitkin, Kent, online
Caroline Peter, Geneva, RICH
Li Qun, LBNL, TPC
Jeff Reid, UW, QA
Fabrice Retiere, calibrations
Christelle Roy, Subatech, SSD
Dan Russ, CMU, trigger, production
Raimond Snellings, LBNL, TPC, QA
Jun Takahashi, Sao Paulo, SVT
Aihong Tang, Kent
Greg Thompson, Wayne, SVT
Fuqiang Wang, LBNL, calibrations
Robert Willson, OSU, SVT
Richard Witt, Kent
Gene Van Buren, UCLA, documentation, tools, QA
Eugene Yamamoto, UCLA, calibrations, cosmics
David Zimmerman, LBNL, Grand Challenge
A partial list of young students and postdocs now active in aspects of software...
Status of Computing Requirements
Internal review (particularly of simulation) in process in connection with evaluating PDSF upgrade needs
No major changes with respect to earlier reviews
RCF resources should meet STAR reconstruction and central analysis needs (recognizing that the 1.5x re-reconstruction factor allows little margin for the unexpected)
Existing (primarily Cray T3E) offsite simulation facilities inadequate for simulation needs
Simulation needs addressed by PDSF ramp-up plans
Current STAR Software Environment
Current software base a mix of C++ (55%) and Fortran (45%)
  Rapid evolution from ~20%/80% in September '98
  New development, and all physics analysis, in C++
ROOT adopted 11/98 as analysis tool and foundation for the framework
  Legacy Fortran codes and data structures supported without change
  Deployed in offline production and analysis in Mock Data Challenge 2, Feb-Mar '99
ROOT adopted for the event data store after MDC2
  Complemented by the MySQL relational DB: no more Objectivity
Post-reconstruction C++/OO data model 'StEvent' implemented
  Initially purely transient; design unconstrained by I/O (ROOT or Objectivity)
  Later implemented in persistent form using ROOT without changing the interface (see the sketch below)
  Basis of all analysis software development
Next step: migrate the OO data model upstream to reconstruction
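The storage-independence of StEvent can be pictured with a minimal sketch; everything below except the StEvent name is a hypothetical stand-in, not STAR code. Analysis written against the StEvent interface is untouched when a transient source is swapped for a persistent ROOT-backed one.

    // Minimal sketch (hypothetical names except StEvent): analysis code sees
    // only the StEvent interface; whether events live in memory or come back
    // from ROOT files is hidden behind an abstract source.
    #include <cstddef>
    #include <iostream>
    #include <memory>

    struct StEvent {                       // stand-in for STAR's event model
        std::size_t nPrimaryTracks = 0;
    };

    struct EventSource {                   // transient and ROOT-backed
        virtual std::unique_ptr<StEvent> next() = 0;   // implementations
        virtual ~EventSource() = default;              // share this interface
    };

    struct TransientSource : EventSource {
        int remaining = 3;
        std::unique_ptr<StEvent> next() override {
            if (remaining-- <= 0) return nullptr;
            auto ev = std::make_unique<StEvent>();
            ev->nPrimaryTracks = 100 + remaining;
            return ev;
        }
    };

    int main() {
        TransientSource source;  // a persistent source would drop in unchanged
        while (auto ev = source.next())
            std::cout << "event with " << ev->nPrimaryTracks << " tracks\n";
    }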
MDC2 and Post-MDC2
STAR MDC2: full production deployment of the ROOT-based offline chain and I/O
  All MDC2 production based on ROOT
  Statistics suffered from software and hardware problems and the short MDC2 duration; about 1/3 of the 'best case scenario'
  Very active physics analysis and QA program
  StEvent (OO/C++ data model) in place and in use
During and after MDC2: addressing the problems
  Program size: up to 850MB; reduced to <500MB in a broad cleanup
  Robustness of multi-branch I/O (multiple file streams) improved
  XDF-based I/O maintained as a stably functional alternative
  Improvements to 'Maker' organization of component packages
  Completed by late May; infrastructure stabilized
Software Status for Engineering Run
Offline environment and infrastructure stabilized
Shift of focus to consolidation: usability improvements, documentation, user-driven enhancements, developing and responding to QA
DAQ-format data supported in offline from raw files through analysis
Stably functional data storage
  'Universal' I/O interface transparently supports all STAR file types: DAQ raw data, XDF, ROOT (Grand Challenge and online pool to come); sketched below
  ROOT I/O debugging proceeded through June; now stable
StEvent in wide use for physics analysis and QA software
  Persistent version of StEvent implemented and deployed
Very active analysis and QA program
Calibration/parameter DB not ready (now, 10/99, being deployed)
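A hedged sketch of the 'universal' interface idea follows; the reader classes and the dispatch-by-extension rule are illustrative assumptions, not the actual StIOMaker code.

    // Sketch: one openFile() call serves all supported STAR file types by
    // dispatching on the name; class names and the extension rule are
    // hypothetical, not the real framework dispatch.
    #include <iostream>
    #include <memory>
    #include <stdexcept>
    #include <string>

    struct EventReader {
        virtual bool readNext() = 0;       // advance to the next event
        virtual ~EventReader() = default;
    };
    struct DaqReader  : EventReader { bool readNext() override { return false; } };
    struct XdfReader  : EventReader { bool readNext() override { return false; } };
    struct RootReader : EventReader { bool readNext() override { return false; } };

    std::unique_ptr<EventReader> openFile(const std::string& name) {
        auto endsWith = [&](const std::string& suffix) {
            return name.size() >= suffix.size() &&
                   name.compare(name.size() - suffix.size(),
                                suffix.size(), suffix) == 0;
        };
        if (endsWith(".daq"))  return std::make_unique<DaqReader>();
        if (endsWith(".xdf"))  return std::make_unique<XdfReader>();
        if (endsWith(".root")) return std::make_unique<RootReader>();
        throw std::runtime_error("unsupported file type: " + name);
    }

    int main() {
        auto reader = openFile("cosmics.root");  // identical call for .daq/.xdf
        while (reader->readNext()) { /* process the event */ }
        std::cout << "done\n";
    }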
Real Data Processing
Currently live detector is the TPC; 75% of the TPC read out (beam data and cosmics)
Can read and analyze zero-suppressed TPC data all the way to DST
  Real-data DST read and used in StEvent post-reco analysis
Bad channel suppression implemented and tested
First-order alignment worked out (~1mm); the rest to come from residuals analysis
10,000 cosmics with no field and several runs with field on
All interesting real data from the engineering run passed through regular production reconstruction and QA
  Now preparing for a second iteration incorporating improvements in reconstruction codes and calibrations
Event Store and Data Management
Success of ROOT-based event data storage from MDC2 on relegated Objectivity to a metadata management role, if any
  ROOT provides storage for the data itself
  We can use a simpler, safer tool in the metadata role without compromising our data model, and avoid the complexities and risks of Objectivity
MySQL adopted: relational DB, open software, widely used, very fast, but not a full-featured heavyweight like Oracle
  Wonderful experience so far: excellent tools, very robust, extremely fast
  Scalability OK so far (e.g. 2M rows of 100 bytes); multiple servers can be used as needed to address scalability needs
  Not taxing the tool, because metadata, not large-volume data, is stored
Objectivity is gone from STAR
Requirements: STAR 8/99 View (My Version)
Requirement               Obj '97   Obj '99    ROOT '97       ROOT '99
C++ API                   OK        OK         OK             OK
Scalability               OK        ?          No file mgmt   MySQL
Aggregate I/O             OK        ?          OK             OK
HPSS                      Planned   OK?        No             OK
Integrity, availability   OK        OK         No file mgmt   MySQL
Recovery from lost data   OK        OK         No file mgmt   OK, MySQL
Versions, schema evolve   OK        Your job   Crude          Almost OK
Long term availability    OK?       ???        OK?            OK
Access control            OS        Your job   OS             OS, MySQL
Admin tools               OK        Basic      No             MySQL
Recovery of subsets       OK        OK         No file mgmt   OK, MySQL
WAN distribution          OK        Hard       No file mgmt   MySQL
Data locality control     OK        OK         OS             OS, MySQL
Linux                     No        OK         OK             OK
RHIC Data Management: Factors For Evaluation
My perception of changes in the STAR view from '97 to now is shown
Factors, each rated for Objectivity vs. ROOT+MySQL:
Cost
Performance and capability as data access solution
Quality of technical support
Ease of use, quality of doc
Ease of integration with analysis
Ease of maintenance, risk
Commonality among experiments
Extent, leverage of outside usage
Affordable/manageable outside RCF
Quality of data distribution mechanisms
Integrity of replica copies
Availability of browser tools
Flexibility in controlling permanent storage location
Level of relevant standards compliance, eg. ODMG
Java access
Partitioning DB and resources among groups
STAR Production Database
MySQL-based production database (for want of a better term) in place, with the following components
File catalogs (see the query sketch below)
  Simulation data catalog: populated with all simulation-derived data in HPSS and on disk
  Real data catalog: populated with all real raw and reconstructed data
Run log and online log
  Fully populated and interfaced to online run log entry
Event tag databases
  Database of DAQ-level event tags; populated by an offline scanner; needs to be interfaced to the buffer box and extended with downstream tags
Production operations database
  Production job status and QA info
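As an illustration of how light-weight this metadata layer is, here is a hedged sketch of a catalog lookup through the standard MySQL C API; the host, database, table, and column names are hypothetical, not STAR's actual schema.

    /* Sketch: file-catalog lookup via the MySQL C API (compile/link with
       `mysql_config --cflags --libs`). All names below are hypothetical. */
    #include <mysql.h>
    #include <stdio.h>

    int main(void) {
        MYSQL *conn = mysql_init(NULL);
        if (!mysql_real_connect(conn, "db.star.bnl.gov", "reader", "",
                                "FileCatalog", 0, NULL, 0)) {
            fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
            return 1;
        }
        /* Reconstructed-output files for one run (hypothetical schema). */
        if (mysql_query(conn, "SELECT path, size FROM files "
                              "WHERE run = 289005 AND type = 'dst-root'")) {
            fprintf(stderr, "query failed: %s\n", mysql_error(conn));
            mysql_close(conn);
            return 1;
        }
        MYSQL_RES *res = mysql_store_result(conn);
        MYSQL_ROW row;
        while ((row = mysql_fetch_row(res)) != NULL)
            printf("%s  %s bytes\n", row[0], row[1]);
        mysql_free_result(res);
        mysql_close(conn);
        return 0;
    }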
ROOT Status in STAR
ROOT is with us to stay! No major deficiencies or obstacles found; no post-ROOT visions contemplated
ROOT community growing: Fermilab Run II, ALICE, MINOS; we are leveraging community developments
First US ROOT workshop at FNAL in March
  Broad participation: >50 from all major US labs and experiments
  ROOT team present; heeded our priority requests
    I/O improvements: robust multi-stream I/O and schema evolution
    Standard Template Library support
    Both emerging in subsequent ROOT releases
  FNAL participation in development and documentation; ROOT guide and training materials recently released
Our framework is based on ROOT, but application codes need not depend on ROOT (neither is it forbidden to use ROOT in application codes)
Software Releases and Documentation
Release policy and mechanisms stable and working fairly smoothly
Extensive testing and QA: nightly (latest version) and weekly (higher-statistics testing before the 'dev' version is released to 'new')
Software build tools switched from gmake to cons (perl): more flexible, easier to maintain, faster
Major push in recent months to improve the scope and quality of documentation
  Documentation coordinator (coercer!) appointed
  New documentation and code navigation tools developed
  Needs prioritized; pressure being applied; new doc has started to appear
  Ongoing monthly tutorial program
With cons, doc/code tools, database tools, ... perl has become a major STAR tool
Software by type:
  Type        All      Modified in last 2 months
  C           18938     1264
  C++        115966    52491
  FORTRAN     93506    54383
  IDL          8261      162
  KUMAC        5578        0
  MORTRAN      7122     3043
  Makefile     3009     2323
  scripts     36188    26402
QA
Major effort during and since MDC2
Organized effort under 'QA Czar' Peter Jacobs; weekly meetings and QA reports
'QA signoff' integrated with software release procedures
Suite of histograms and other QA measures in continuous use and development
Automated tools managing production and extraction of QA measures from test and production running recently deployed (a sketch follows)
Acts as a very effective driver for debugging and development of the software, engaging a lot of people
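One way such an automated QA measure can be implemented in our ROOT environment is sketched below; the file and histogram names are hypothetical, and the Kolmogorov-Smirnov cut is an illustrative choice, not the actual STAR QA criterion.

    // Hedged sketch of an automated QA check in ROOT: compare a production
    // histogram to a reference and flag large deviations. File/histogram
    // names and the 0.01 threshold are hypothetical.
    #include "TFile.h"
    #include "TH1.h"
    #include <cstdio>

    int main() {
        TFile* ref  = TFile::Open("qa_reference.root");
        TFile* test = TFile::Open("qa_run00289005.root");
        if (!ref || !test) { std::printf("missing QA file\n"); return 1; }

        TH1* href = (TH1*)ref->Get("tpc_hits_per_track");
        TH1* hnew = (TH1*)test->Get("tpc_hits_per_track");
        if (!href || !hnew) { std::printf("missing histogram\n"); return 1; }

        // Kolmogorov-Smirnov probability that both follow one distribution.
        double prob = hnew->KolmogorovTest(href);
        std::printf("KS probability = %.3f -> %s\n", prob,
                    prob > 0.01 ? "QA pass" : "QA FAIL: hold for signoff");
        return prob > 0.01 ? 0 : 2;
    }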
Current Software Status
Infrastructure for year one pretty much there
Simulation stable; ~7TB of production simulation data generated
Reconstruction software for year one mostly there
  Lots of current work on quality, calibrations, global reconstruction
  TPC in the best shape; EMC in the worst (two new FTEs should help EMC catch up; 10% installation in year 1)
  Well exercised in production; ~2.5TB of reconstruction output generated in production
Physics analysis software now actively underway in all working groups, contributing strongly to reconstruction and QA
Major shift of focus in recent months away from infrastructure and towards reconstruction and analysis
  Reflected in the program of STAR Computing Week last week: predominantly reco/analysis
Priority Work for Year One Readiness
In Progress...
Extending data management tools (MySQL DB + disk file management + HPSS file management + multi-component ROOT files)
Complete schema evolution, in collaboration with the ROOT team (see the sketch after this list)
Completion of the DB: integration of slow control as a data source, completion of online integration, extension to all detectors
Extend and apply the OO data model (StEvent) to reconstruction
Continued QA development
Reconstruction and analysis code development, responding to QA results and addressing year 1 code completeness
Improving and better integrating visualization tools
Management of CAS processing and data distribution, both for mining and individual-physicist-level analysis
Integration and deployment of the Grand Challenge
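For the schema evolution item, the mechanism on the class author's side is small; the class below is a hypothetical example, not STAR code. ROOT's automatic schema evolution keys off the dictionary version in ClassDef: bumping it when members change lets new code read files written with the old layout.

    // Hedged sketch (hypothetical class): ROOT schema evolution via ClassDef.
    // Version 1 of this class stored only mNHits; version 2 adds mDedx.
    // Old files still read back; the new member is left default-initialized.
    // (rootcint generates the dictionary that records the version.)
    #include "TObject.h"

    class StTrackSummary : public TObject {
    public:
        StTrackSummary() : mNHits(0), mDedx(0) {}
    private:
        Int_t   mNHits;
        Float_t mDedx;                    // added in version 2

        ClassDef(StTrackSummary, 2)       // was (StTrackSummary, 1)
    };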
STAR Analysis: CAS Usage Plan
CAS processing with DST input based on managed production by the physics working groups (PWG) using the Grand Challenge Architecture
Later stage processing on micro-DSTs (standardized at the PWG level) and ‘nano-DSTs’ (defined by individuals or small groups) occurs under the control of individual physicists and small groups
Mix of LSF-based batch and interactive on both Linux and Sun, with far greater emphasis on Linux
For I/O intensive processing, local Linux disks (14GB usable) and Suns available
Usage of local disks and availability of data to be managed through the file catalog
Web-based interface to management, submission and monitoring of analysis jobs in development
Grand Challenge
What does the Grand Challenge do for the user?
Optimizes access to the HPSS-based data store; improves data access for individual users
Allows event access by query
  Present a query string to the GCA (e.g. NumberLambdas>1)
  Receive an iterator over the events which satisfy the query, as files are extracted from HPSS (see the sketch below)
Pre-fetches files, so that 'the next' file is requested from HPSS while you are analyzing the data in your first file
Coordinates data access among multiple users
  Coordinates ftp requests so that a tape is staged only once per set of queries which request files on that tape
General user-level HPSS retrieval tool
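From the user's chair the flow looks roughly like this sketch; every class and method name below is a hypothetical stand-in, not the real GCA client API.

    // Hypothetical sketch of query-driven event iteration in the GCA style:
    // submit a tag predicate, then iterate events as their files are staged
    // from HPSS. None of these names are the actual GCA client interface.
    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    struct EventHandle { int run, event; };

    class GcQuery {                        // stand-in for the GCA client
    public:
        GcQuery(const std::string& components, const std::string& predicate) {
            // would contact STACS, build the file list, start pre-fetching
        }
        bool next(EventHandle& ev) {       // blocks until the file is on disk
            if (mCursor >= mMatches.size()) return false;
            ev = mMatches[mCursor++];
            return true;
        }
    private:
        std::vector<EventHandle> mMatches{{289005, 1}, {289005, 7}};
        std::size_t mCursor = 0;
    };

    int main() {
        GcQuery q("dst,hits", "glb_trk_tot>0 & glb_trk_tot<10");
        for (EventHandle ev; q.next(ev); )
            std::cout << "run " << ev.run << " event " << ev.event << "\n";
    }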
Grand Challenge queries
Queries based on physics tag selections:
SELECT (component1, component2, …)
FROM dataset_name
WHERE (predicate_conditions_on_properties)
Example:
SELECT dst, hits
FROM Run00289005
WHERE glb_trk_tot>0 & glb_trk_tot<10
Event components:
fzd, raw, dst-xdf, dst-root, hits, StrangeTag, FlowTag, StrangeMuDst, …
Mapping from run/event/component to file via the database
GC index assembles tags + component file locations for each event
Tag based query match yields the files requiring retrieval to serve up that event
Event list based queries allow using the GCA for general-purpose coordinated HPSS retrieval
Event list based retrieval:
SELECT dst, hits
Run 00289005 Event 1
Run 00293002 Event 24
Run 00299001 Event 3
...
Grand Challenge in STAR
Block view of STAR-GC (block diagram from D. Olson, 8 Oct. 1999, star-gc_8oct99.ppt): root4star, through StIOMaker, consults the STAR MySQL DB server and reads ROOT event files from the file system; STACS (the GC server) stages those files out of HPSS via pftp.
STAR GC Implementation Plan
Interface GC client code to the STAR framework
  Already runs on Solaris, Linux
  Needs integration into framework I/O management
  Needs connections to the STAR MySQL DB
Apply the GC index builder to STAR event tags
  Interface is defined
  Has been used with non-STAR ROOT files
  Needs connection to STAR ROOT and MySQL DB
(New) manpower for implementation now available
  Experienced in STAR databases; needs to come up to speed on the GCA
Current STAR Status at RCF
Computing operations during the engineering run fairly smooth, apart from very severe security disruptions
  Data volumes small, and the direct DAQ->RCF data path not yet commissioned
Effectively using the newly expanded Linux farm
Steady reconstruction production on CRS; transition to year 1 operation should be smooth
  New CRS job management software deployed in MDC2 works well and meets our needs
Analysis software development and production underway on CAS
  Tools managing analysis operations under development
Integration of Grand Challenge data management tools into production and physics analysis operations to take place over the next few months
  Not needed for early running (low data volumes)
Concerns: RCF Manpower
Understaffing directly impacts
  Depth of support/knowledge base in crucial technologies, e.g. AFS, HPSS
  Level and quality of user and experiment-specific support
  Scope of RCF participation in software: much less central support/development effort in common software than at other labs (FNAL, SLAC)
    e.g. ROOT used by all four experiments, but no RCF involvement
Exacerbated by very tight manpower within the experiment software efforts
  Some generic software development supported by LDRD (NOVA project of the STAR/ATLAS group)
The existing overextended staff is getting the essentials done, but the data flood is still to come
Concerns over RCF understaffing recently increased with the departure of Tim Sailer
Concerns: Computer/Network Security
Careful balance required between ensuring security and providing a productive and capable development and production environment
Not yet clear whether we are in balance or have already strayed into an unproductive environment
  Unstable offsite connections, broken farm functionality, database configuration gymnastics, the farm (even the interactive part) cut off from the world, limited access to our data disks
Experiencing difficulties, and expecting new ones, particularly from the 'private subnet' configuration unilaterally implemented by RCF
  Need should be (re)evaluated in light of the new lab firewall
RCF security closely coupled to overall lab computer/network security; a coherent site-wide plan, as non-intrusive as possible, is needed
  We are still recovering from the knee-jerk 'slam the doors' response of the lab to the August incident
  Punching holes in the firewall to enable work to get done
  I now regularly use PDSF@NERSC when offsite to avoid being tripped up by BNL security
Other Concerns
HPSS transfer failures
  During MDC2, in certain periods up to 20% of file transfers to HPSS failed dangerously: transfers seem to succeed, with no errors and the file seemingly visible in HPSS with the right size, but on reading the file turns out not to be readable
  John Riordan has the list of errors seen during reading
  In reconstruction we can guard against this (see the sketch below), but it would be a much more serious problem for DAQ data: cannot afford to read back from HPSS to check integrity
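The reconstruction-side guard can be as simple as the following hedged sketch: stage the archived file back and confirm ROOT can actually open and parse it, since a correct size alone proves nothing. The path and the staging step are assumptions.

    // Hedged sketch: verify that a file archived to HPSS is really readable.
    // Assumes the file has been staged back to a local path (hypothetical).
    #include "TFile.h"
    #include <cstdio>

    // True if the staged-back copy opens cleanly and contains keys,
    // catching the 'right size but unreadable' failure mode.
    bool verifyReadable(const char* stagedPath) {
        TFile* f = TFile::Open(stagedPath, "READ");
        if (!f || f->IsZombie()) { delete f; return false; }
        bool ok = f->GetNkeys() > 0;
        f->Close();
        delete f;
        return ok;
    }

    int main() {
        const char* path = "/scratch/readback/run00289005.dst.root";
        std::printf("%s: %s\n", path,
                    verifyReadable(path) ? "transfer verified" : "TRANSFER BAD");
        return 0;
    }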
Continuing networking disruptions
  A regular problem in recent months: the network dropping out or becoming very slow for unknown/unannounced reasons
  If unintentional: bad network management. If intentional: bad network management.
Public Information and Documentation Needed
Clear list of the services RCF provides, the level of support for each service, the resources allocated to each experiment, and the personnel responsible for support:
- rcas: LSF, ....
- rcrs: CRS software, ....
- AFS: (stability, home directories, ...)
- Disks: (inside / outside access)
- HPSS
- experiment DAQ/online interface
Web-based information is very incomplete, e.g. information on planned facilities for year one and after (largely a restatement of the first point)
General communication: RCF still needs improvement in general user communication and responsiveness
Outsourcing Computing in STAR
Broad local RCF, BNL-STAR and remote usage of STAR software. STAR environment setup counts since early August:
  RCF        118571
  BNL Sun     33508
  BNL Linux   13418
  Desktop      6038
  HP            801
  LBL         29308
  Rice        12707
  Indiana     19852
Non-RCF usage currently comparable to RCF usage: good distributed computing support is essential in STAR
  Enabled by the AFS-based environment; AFS an indispensable tool, but inappropriate for data access usage
  Agreement reached with RCF for read-only access to RCF NFS data disks from STAR BNL computers; seems to be working well
New BNL-STAR facilities: 6 dual 500MHz/18GB machines (2 arrived), 120GB disk
  For software development, software and OS/compiler testing, online monitoring, services (web, DB, ...)
  Supported and managed by STAR personnel
  Supports the STAR environment for Linux desktop boxes
STAR Offsite Computing: PDSF
PDSF at LBNL/NERSC: virtually identical configuration to RCF
  Intel/Linux farm, limited Sun/Solaris, HPSS-based data archiving
  Current (10/99) scale relative to STAR RCF: CPU ~50% (1200 Si95), disk ~85% (2.5TB)
Long-term goal: resources ~equal to STAR's share of RCF
  Consistent with the long-standing plan that RCF hosts ~50% of the experiments' computing facilities, with simulation and some analysis offsite
  Ramp-up currently being planned
Other NERSC resources: T3Es a major source of simulation cycles
  210,000 hours allocated in FY00: one of the larger allocations in terms of CPU and storage
  Focus in future will be on PDSF; no porting to the next MPP generation
STAR Offsite Computing: PSC
Cray T3Es at the Pittsburgh Supercomputing Center
  STAR Geant3-based simulation used at PSC to generate ~4TB of simulated data in support of the Mock Data Challenges and software development
  Supported by the local CMU group; reliance on PSC had been increasing
  Recently retired when our allocation ran out and could not be renewed
STAR Offsite Computing: Universities
Physics analysis computing at home institutions
  Processing of 'micro-DSTs' and DST subsets; software development
  Primarily based on small Linux clusters
Relatively small data volumes; aggregate total of ~10TB/yr
  Data transfer needs of US institutes should be met by the network
  Overseas institutes will rely on tape-based transfers; the existing self-service scheme will probably suffice
Some simulation production at universities: Rice, Dubna
Residual Needs
Data transfer to PDSF and other offsite institutes
  Existing self-service DLT probably satisfactory for non-PDSF tape needs, but little experience to date
  100GB/day network transfer rate today adequate for PDSF/NERSC data transfer
  Future PDSF transfer needs (network, tape) to be quantified once the PDSF scale-up is better understood
Conclusions: STAR at RCF
Overall: RCF is an effective facility for STAR data processing and management
  Sound choices in overall architecture, hardware, software
  Well aligned with the HENP community: community tools and expertise easily exploited; valuable synergies with non-RHIC programs, notably ATLAS
  Production stress tests have been successful and instructive
  On schedule; facilities have been there when needed
RCF interacts effectively and consults appropriately for the most part, and is generally responsive to input
  Weak points are security issues and interactions with general users (as opposed to experiment liaisons and principals)
Mock Data Challenges have been highly effective in exercising, debugging and optimizing RCF production facilities as well as our software
Based on status to date, we expect STAR and RCF to be ready for whatever RHIC Year 1 throws at us.