
STAR COMPUTING

STAR Computing Status, Out-source Plans, Residual Needs

Torre Wenaus, STAR Computing Leader, BNL

RHIC Computing Advisory Committee Meeting, BNL

October 11, 1999


Outline

STAR Computing Status

Out-source plans

Residual needs

Conclusions


Manpower

Very important development in the last 6 months: a big new influx of postdocs and students into computing and related activities

Increased participation and pace of activity in QA, online computing, production tools and operations, databases, and reconstruction software

Planned dedicated database person never hired (funding); databases consequently late, but we are now transitioning from an interim to our final database

Still missing an online/general computing systems support person: the open position was cancelled due to lack of funding, and the shortfall continues to be made up by the local computing group


Some of our Youthful Manpower

Dave Alvarez, Wayne, SVT

Lee Barnby, Kent, QA and production

Jerome Baudot, Strasbourg, SSD

Selemon Bekele, OSU, SVT

Marguerite Belt Tonjes, Michigan, EMC

Helen Caines, Ohio State, SVT

Manuel Calderon, Yale, StMcEvent

Gary Cheung, UT, QA

Laurent Conin, Nantes, database

Wensheng Deng, Kent, production

Jamie Dunlop, Yale, RICH

Patricia Fachini, Sao Paulo/Wayne, SVT

Dominik Flierl, Frankfurt, L3 DST

Marcelo Gameiro, Sao Paulo, SVT

Jon Gangs, Yale, online

Dave Hardtke, LBNL, Calibrations, DB

Mike Heffner, Davis, FTPC

Eric Hjort, Purdue, TPC

Amy Hummel, Creighton, TPC, production

Holm Hummler, MPG, FTPC

Matt Horsley, Yale, RICH

Jennifer Klay, Davis, PID

Matt Lamont, Birmingham, QA

Curtis Lansdell, UT, QA

Brian Lasiuk, Yale, TPC, RICH

Frank Laue, OSU, online

Lilian Martin, Subatech, SSD

Marcelo Munhoz, Sao Paulo/Wayne, online

Aya Ishihara, UT, QA

Adam Kisiel, Warsaw, online, Linux

Frank Laue, OSU, calibration

Hui Long, UCLA, TPC

Vladimir Morozov, LBNL, simulation

Alex Nevski, RICH

Sergei Panitkin, Kent, online

Caroline Peter, Geneva, RICH

Li Qun, LBNL, TPC

Jeff Reid, UW, QA

Fabrice Retiere, calibrations

Christelle Roy, Subatech, SSD

Dan Russ, CMU, trigger, production

Raimond Snellings, LBNL, TPC, QA

Jun Takahashi, Sao Paulo, SVT

Aihong Tang, Kent

Greg Thompson, Wayne, SVT

Fuqiang Wang, LBNL, calibrations

Robert Willson, OSU, SVT

Richard Witt, Kent

Gene Van Buren, UCLA, documentation, tools, QA

Eugene Yamamoto, UCLA, calibrations, cosmics

David Zimmerman, LBNL, Grand Challenge

A partial list of young students and postdocs now active in aspects of software...


Status of Computing Requirements

Internal review (particularly of simulation) in progress in connection with evaluating PDSF upgrade needs

No major changes with respect to earlier reviews

RCF resources should meet STAR reconstruction and central analysis needs (recognizing that the 1.5x re-reconstruction factor allows little margin for the unexpected)

Existing (primarily Cray T3E) offsite simulation facilities inadequate for simulation needs

Simulation needs addressed by PDSF ramp-up plans


Current STAR Software Environment

Current software base is a mix of C++ (55%) and Fortran (45%); rapid evolution from ~20%/80% in September ‘98

New development, and all physics analysis, in C++

ROOT as analysis tool and foundation for the framework, adopted 11/98; legacy Fortran codes and data structures supported without change

Deployed in offline production and analysis in Mock Data Challenge 2, Feb-Mar ‘99

ROOT adopted for the event data store after MDC2; complemented by the MySQL relational DB: no more Objectivity

Post-reconstruction: C++/OO data model ‘StEvent’ implemented

Initially purely transient; design unconstrained by I/O (ROOT or Objectivity)

Later implemented in persistent form using ROOT without changing the interface (see the sketch below)

Basis of all analysis software development

Next step: migrate the OO data model upstream to reconstruction
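The transient/persistent split above is the key design point: analysis code sees only an abstract event interface, and the ROOT-backed persistent form slots in behind it without interface changes. A minimal C++ sketch of that pattern (class and method names are invented for illustration; this is not the actual StEvent API):

    // Hypothetical sketch (not the real StEvent classes): a transient OO
    // event interface whose clients never see the I/O layer, so a ROOT-based
    // persistent implementation can be substituted without interface changes.
    #include <cstddef>

    struct MyTrack {                    // illustrative track summary
        float pt, eta, phi;             // transverse momentum, pseudorapidity, azimuth
    };

    class MyEvent {                     // transient event: pure in-memory container
    public:
        virtual ~MyEvent() {}
        virtual std::size_t numberOfTracks() const = 0;
        virtual const MyTrack& track(std::size_t i) const = 0;
    };

    // Analysis code depends only on the abstract interface; whether the
    // concrete event was filled from a ROOT file, an XDF file, or built
    // in memory is invisible to it.
    double meanPt(const MyEvent& evt) {
        double sum = 0;
        for (std::size_t i = 0; i < evt.numberOfTracks(); ++i)
            sum += evt.track(i).pt;
        return evt.numberOfTracks() ? sum / evt.numberOfTracks() : 0;
    }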


MDC2 and Post-MDC2

STAR MDC2: full production deployment of the ROOT based offline chain and I/O

All MDC2 production based on ROOT

Statistics suffered from software and hardware problems and the short MDC2 duration; about 1/3 of the ‘best case scenario’

Very active physics analysis and QA program

StEvent (OO/C++ data model) in place and in use

During and after MDC2: addressing the problems

Program size: up to 850MB, reduced to <500MB in a broad cleanup

Robustness of multi-branch I/O (multiple file streams) improved

XDF based I/O maintained as a stably functional alternative

Improvements to ‘Maker’ organization of component packages

Completed by late May; infrastructure stabilized


Software Status for Engineering Run

Offline environment and infrastructure stabilized

Shift of focus to consolidation: usability improvements, documentation, user-driven enhancements, developing and responding to QA

DAQ format data supported in offline from raw files through analysis

Stably functional data storage

‘Universal’ I/O interface transparently supports all STAR file types: DAQ raw data, XDF, ROOT (Grand Challenge and online pool to come); a sketch of the idea follows below

ROOT I/O debugging proceeded through June; now stable

StEvent in wide use for physics analysis and QA software

Persistent version of StEvent implemented and deployed

Very active analysis and QA program

Calibration/parameter DB not ready (now, 10/99, being deployed)
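As an illustration of the ‘universal’ I/O idea above, here is a hedged C++ sketch of a format-dispatching reader factory; the class names and the file-extension rule are invented for illustration and do not reflect the real StIOMaker interface:

    // Hypothetical sketch of format-transparent I/O: analysis code asks for
    // an EventReader and never cares whether the file is DAQ raw, XDF, or ROOT.
    #include <memory>
    #include <string>
    #include <stdexcept>

    class EventReader {
    public:
        virtual ~EventReader() {}
        virtual bool nextEvent() = 0;   // advance to the next event; false at end of file
    };

    // Stub implementations standing in for the real format-specific readers.
    class DaqReader  : public EventReader { public: bool nextEvent() { return false; } };
    class XdfReader  : public EventReader { public: bool nextEvent() { return false; } };
    class RootReader : public EventReader { public: bool nextEvent() { return false; } };

    // Factory choosing the concrete reader from the file name (illustrative rule).
    std::unique_ptr<EventReader> makeReader(const std::string& file) {
        const std::string ext = file.substr(file.find_last_of('.') + 1);
        if (ext == "daq")  return std::make_unique<DaqReader>();
        if (ext == "xdf")  return std::make_unique<XdfReader>();
        if (ext == "root") return std::make_unique<RootReader>();
        throw std::runtime_error("unknown file type: " + file);
    }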


Real Data Processing

Currently live detector is the TPC; 75% of the TPC read out (beam data and cosmics)

Can read and analyze zero suppressed TPC data all the way to DST; real data DSTs read and used in StEvent post-reco analysis

Bad channel suppression implemented and tested

First order alignment worked out (~1mm), the rest to come from residuals analysis

10,000 cosmics with no field and several runs with field on

All interesting real data from the engineering run passed through regular production reconstruction and QA; now preparing for a second iteration incorporating improvements in reconstruction codes and calibrations


Event Store and Data Management

The success of ROOT-based event data storage from MDC2 on relegated Objectivity to a metadata management role, if any

ROOT provides storage for the data itself

We can use a simpler, safer tool in metadata role without compromising our data model, and avoid complexities and risks of Objectivity

MySQL adopted (relational DB, open software, widely used, very fast, but not a full-featured heavyweight like ORACLE)

Wonderful experience so far. Excellent tools, very robust, extremely fast

Scalability OK so far (e.g. 2M rows of 100 bytes); multiple servers can be used as needed to address scalability needs

Not taxing the tool because metadata, not large volume data, is stored

Objectivity is gone from STAR
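A minimal sketch of the ROOT-as-event-store pattern described above, writing a toy event tree to a ROOT file. It uses only standard ROOT TFile/TTree calls; the branch contents are invented and far simpler than STAR's actual persistent StEvent schema:

    // Toy illustration of ROOT providing the event data store itself
    // (metadata such as the file catalog lives separately, in MySQL).
    #include "TFile.h"
    #include "TTree.h"

    void writeToyEvents() {
        TFile f("toy_events.root", "RECREATE");     // one output file stream
        TTree tree("T", "toy event tree");

        Int_t   nTracks = 0;
        Float_t meanPt  = 0;
        tree.Branch("nTracks", &nTracks, "nTracks/I");
        tree.Branch("meanPt",  &meanPt,  "meanPt/F");

        for (int i = 0; i < 1000; ++i) {            // toy event loop
            nTracks = 100 + i % 50;
            meanPt  = 0.5f + 0.001f * i;
            tree.Fill();                            // append one event
        }
        tree.Write();    // ROOT handles compression, schema, and random access
        f.Close();
    }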


Requirements: STAR 8/99 View (My Version)

Requirement                 Obj 97    Obj 99     ROOT 97         ROOT 99
C++ API                     OK        OK         OK              OK
Scalability                 OK        ?          No file mgmt    MySQL
Aggregate I/O               OK        ?          OK              OK
HPSS                        Planned   OK?        No              OK
Integrity, availability     OK        OK         No file mgmt    MySQL
Recovery from lost data     OK        OK         No file mgmt    OK, MySQL
Versions, schema evolve     OK        Your job   Crude           Almost OK
Long term availability      OK?       ???        OK?             OK
Access control              OS        Your job   OS              OS, MySQL
Admin tools                 OK        Basic      No              MySQL
Recovery of subsets         OK        OK         No file mgmt    OK, MySQL
WAN distribution            OK        Hard       No file mgmt    MySQL
Data locality control       OK        OK         OS              OS, MySQL
Linux                       No        OK         OK              OK


RHIC Data Management: Factors For Evaluation

My perception of changes in the STAR view from ‘97 to now is shown for each factor, comparing Objectivity (Objy) and ROOT+MySQL:

Cost

Performance and capability as data access solution

Quality of technical support

Ease of use, quality of doc

Ease of integration with analysis

Ease of maintenance, risk

Commonality among experiments

Extent, leverage of outside usage

Affordable/manageable outside RCF

Quality of data distribution mechanisms

Integrity of replica copies

Availability of browser tools

Flexibility in controlling permanent storage location

Level of relevant standards compliance, eg. ODMG

Java access

Partitioning DB and resources among groups


STAR Production Database

MySQL based production database (for want of a better term) in place with the following components (a hypothetical catalog-query sketch follows below):

File catalogs

Simulation data catalog: populated with all simulation-derived data in HPSS and on disk

Real data catalog: populated with all real raw and reconstructed data

Run log and online log: fully populated and interfaced to online run log entry

Event tag databases: database of DAQ-level event tags, populated by an offline scanner; needs to be interfaced to the buffer box and extended with downstream tags

Production operations database: production job status and QA info
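To make the file-catalog role concrete, here is a hedged sketch of a lookup against the MySQL production database using the standard MySQL C API; the host, credentials, table and column names (files, run, component, path, size) are all invented for illustration:

    // Hypothetical file-catalog query: list reconstructed-DST files for one run.
    #include <mysql/mysql.h>
    #include <cstdio>

    int main() {
        MYSQL* db = mysql_init(NULL);
        if (!mysql_real_connect(db, "db.example.host", "reader", "password",
                                "FileCatalog", 0, NULL, 0)) {
            std::fprintf(stderr, "connect failed: %s\n", mysql_error(db));
            return 1;
        }
        if (mysql_query(db,
                "SELECT path, size FROM files "
                "WHERE run = 289005 AND component = 'dst-root'") == 0) {
            MYSQL_RES* result = mysql_store_result(db);
            MYSQL_ROW  row;
            while ((row = mysql_fetch_row(result)) != NULL)
                std::printf("%s  %s bytes\n", row[0], row[1]);   // path, size
            mysql_free_result(result);
        }
        mysql_close(db);
        return 0;
    }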


ROOT Status in STAR

ROOT is with us to stay! No major deficiencies or obstacles found; no post-ROOT visions contemplated

ROOT community growing: Fermilab Run II, ALICE, MINOS; we are leveraging community developments

First US ROOT workshop at FNAL in March

Broad participation, >50 from all major US labs and experiments

ROOT team present; heeded our priority requests

I/O improvements: robust multi-stream I/O and schema evolution

Standard Template Library support

Both emerging in subsequent ROOT releases

FNAL participation in development and documentation; ROOT guide and training materials recently released

Our framework is based on ROOT, but application codes need not depend on ROOT (neither is it forbidden to use ROOT in application codes).


Software Releases and Documentation

Release policy and mechanisms stable and working fairly smoothly

Extensive testing and QA: nightly (latest version) and weekly (higher statistics testing before the ‘dev’ version is released to ‘new’)

Software build tools switched from gmake to cons (perl): more flexible, easier to maintain, faster

Major push in recent months to improve the scope and quality of documentation

Documentation coordinator (coercer!) appointed

New documentation and code navigation tools developed

Needs prioritized; pressure being applied; new doc has started to appear

Ongoing monthly tutorial program

With cons, doc/code tools, database tools, … perl has become a major STAR tool

Software by type:

Type        All       Modified in last 2 months
C           18938     1264
C++         115966    52491
FORTRAN     93506     54383
IDL         8261      162
KUMAC       5578      0
MORTRAN     7122      3043
Makefile    3009      2323
scripts     36188     26402


QA

Major effort during and since MDC2

Organized effort under ‘QA Czar’ Peter Jacobs; weekly meetings and QA reports

‘QA signoff’ integrated with software release procedures

Suite of histograms and other QA measures in continuous use and development

Automated tools managing production and extraction of QA measures from test and production running recently deployed

Acts as a very effective driver for debugging and development of the software, engaging a lot of people


Current Software Status

Infrastructure for year one pretty much there

Simulation stable; ~7TB of production simulation data generated

Reconstruction software for year one mostly there

Lots of current work on quality, calibrations, global reconstruction

TPC in the best shape; EMC in the worst (two new FTEs should help EMC catch up; 10% installation in year 1)

Well exercised in production; ~2.5TB of reconstruction output generated in production

Physics analysis software now actively underway in all working groups, contributing strongly to reconstruction and QA

Major shift of focus in recent months away from infrastructure and towards reconstruction and analysis

Reflected in the program of STAR Computing Week last week: predominantly reco/analysis


Priority Work for Year One Readiness

In Progress...

Extending data management tools (MySQL DB + disk file management + HPSS file management + multi-component ROOT files)

Complete schema evolution, in collaboration with the ROOT team

Completion of the DB: integration of slow control as a data source, completion of online integration, extension to all detectors

Extend and apply the OO data model (StEvent) to reconstruction

Continued QA development

Reconstruction and analysis code development: responding to QA results and addressing year 1 code completeness

Improving and better integrating visualization tools

Management of CAS processing and data distribution, both for mining and individual physicist level analysis

Integration and deployment of the Grand Challenge


STAR Analysis: CAS Usage Plan

CAS processing with DST input based on managed production by the physics working groups (PWG) using the Grand Challenge Architecture

Later stage processing on micro-DSTs (standardized at the PWG level) and ‘nano-DSTs’ (defined by individuals or small groups) occurs under the control of individual physicists and small groups

Mix of LSF-based batch and interactive processing on both Linux and Sun, but with far greater emphasis on Linux

For I/O intensive processing, local Linux disks (14GB usable) and Suns available

Usage of local disks and availability of data to be managed through the file catalog

Web-based interface to management, submission and monitoring of analysis jobs in development


Grand Challenge

What does the Grand Challenge do for the user?

Optimizes access to the HPSS based data store; improves data access for individual users

Allows event access by query: present a query string to the GCA (e.g. NumberLambdas>1) and receive an iterator over the events which satisfy the query, as files are extracted from HPSS (a usage sketch follows below)

Pre-fetches files so that “the next” file is requested from HPSS while you are analyzing the data in your first file

Coordinates data access among multiple users

Coordinates ftp requests so that a tape is staged only once per set of queries which request files on that tape

General user-level HPSS retrieval tool
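The query-then-iterate usage pattern above can be sketched as follows; the GCQuery and GCEventIterator classes are invented stand-ins for the GCA client interface, not its real API:

    // Hypothetical sketch of Grand Challenge style access: submit a tag query,
    // then iterate over matching events while the GCA pre-fetches files from HPSS.
    #include <string>
    #include <iostream>

    class GCEventIterator {                    // stand-in for the GCA event iterator
    public:
        bool next()        { return false; }   // true while matching events remain (stub)
        int  run()   const { return 0; }
        int  event() const { return 0; }
    };

    class GCQuery {                            // stand-in for the GCA client
    public:
        explicit GCQuery(const std::string& q) : query_(q) {}
        GCEventIterator execute() const { return GCEventIterator(); }
    private:
        std::string query_;                    // e.g. a tag-based selection
    };

    int main() {
        GCQuery q("SELECT dst, hits FROM Run00289005 "
                  "WHERE glb_trk_tot>0 & glb_trk_tot<10");
        GCEventIterator it = q.execute();
        while (it.next())                      // files for "the next" event are staged
            std::cout << it.run() << " " << it.event() << "\n";   // while this one is analyzed
        return 0;
    }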


Grand Challenge queries

Queries based on physics tag selections:

SELECT (component1, component2, …)

FROM dataset_name

WHERE (predicate_conditions_on_properties)

Example:

SELECT dst, hits

FROM Run00289005

WHERE glb_trk_tot>0 & glb_trk_tot<10


Event components:

fzd, raw, dst-xdf, dst-root, hits, StrangeTag, FlowTag, StrangeMuDst, …

Mapping from run/event/component to file via the database

GC index assembles tags + component file locations for each event

Tag based query match yields the files requiring retrieval to serve up that event

Event list based queries allow using the GCA for general-purpose coordinated HPSS retrieval

Event list based retrieval:

SELECT dst, hits

Run 00289005 Event 1

Run 00293002 Event 24

Run 00299001 Event 3

...



Block view of STAR-GC (slide from D. Olson, star-gc_8oct99.ppt, 8 Oct. 1999)

[Block diagram, “Grand Challenge in STAR”: root4star with StIOMaker, STAR MySQL DB server, file system (ROOT event files), STACS (GC server), HPSS, pftp]


STAR GC Implementation Plan

Interface GC client code to the STAR framework: already runs on Solaris and Linux; needs integration into framework I/O management; needs connections to the STAR MySQL DB

Apply the GC index builder to STAR event tags: interface is defined; has been used with non-STAR ROOT files; needs connection to the STAR ROOT and MySQL DB

(New) manpower for implementation now available: experienced in STAR databases; needs to come up to speed on the GCA


Current STAR Status at RCF

Computing operations during the engineering run fairly smooth, apart from very severe security disruptions

Data volumes small, and the direct DAQ->RCF data path not yet commissioned

Effectively using the newly expanded Linux farm

Steady reconstruction production on CRS; transition to year 1 operation should be smooth

New CRS job management software deployed in MDC2 works well and meets our needs

Analysis software development and production underway on CAS

Tools managing analysis operations under development

Integration of Grand Challenge data management tools into production and physics analysis operations to take place over the next few months

Not needed for early running (low data volumes)


Concerns: RCF Manpower

Understaffing directly impacts:

Depth of support/knowledge base in crucial technologies, e.g. AFS, HPSS

Level and quality of user and experiment-specific support

Scope of RCF participation in software; much less central support/development effort in common software than at other labs (FNAL, SLAC)

e.g. ROOT used by all four experiments, but no RCF involvement

Exacerbated by very tight manpower within the experiment software efforts

Some generic software development supported by LDRD (NOVA project of the STAR/ATLAS group)

The existing overextended staff is getting the essentials done, but the data flood is still to come

Concerns over RCF understaffing recently increased with the departure of Tim Sailer


Concerns: Computer/Network Security

Careful balance required between ensuring security and providing a productive and capable development and production environment

Not yet clear whether we are in balance or have already strayed into an unproductive environment

Unstable offsite connections, broken farm functionality, database configuration gymnastics, the farm (even the interactive part) cut off from the world, limited access to our data disks

Experiencing difficulties, and expecting new ones, particularly from the ‘private subnet’ configuration unilaterally implemented by RCF

Need should be (re)evaluated in light of the new lab firewall

RCF security closely coupled to overall lab computer/network security; a coherent site-wide plan, as non-intrusive as possible, is needed

We are still recovering from the knee-jerk ‘slam the doors’ response of the lab to the August incident

Punching holes in the firewall to enable work to get done

I now regularly use PDSF@NERSC when offsite to avoid being tripped up by BNL security


Other Concerns

HPSS transfer failures

During MDC2, in certain periods up to 20% of file transfers to HPSS failed

Dangerously, transfers seem to succeed: no errors, and the file is seemingly visible in HPSS with the right size, but on reading we find the file is not readable

John Riordan has a list of errors seen during reading

In reconstruction we can guard against this, but it would be a much more serious problem for DAQ data: cannot afford to read back from HPSS to check its integrity

Continuing networking disruptions

A regular problem in recent months: the network dropping out or very slow for unknown/unannounced reasons

If unintentional: bad network management. If intentional: bad network management


Public Information and Documentation Needed

A clear list is needed of the services RCF provides, the level of support of these services, the resources allocated to each experiment, and the personnel responsible for support:

- rcas: LSF, ....

- rcrs: CRS software, ....

- AFS: (stability, home directories, ...)

- Disks: (inside / outside access)

- HPSS

- experiment DAQ/online interface

Web based information is very incomplete, e.g. information on planned facilities for year one and after (largely a restatement of the first point)

General communication: RCF still needs improvement in general user communication and responsiveness


Outsourcing Computing in STAR

Broad local RCF, BNL-STAR and remote usage of STAR software. STAR environment setup counts since early August:

RCF        118571
BNL Sun    33508
BNL Linux  13418
Desktop    6038
HP         801
LBL        29308
Rice       12707
Indiana    19852

Non-RCF usage currently comparable to RCF usage: good distributed computing support is essential in STAR

Enabled by the AFS based environment; AFS an indispensable tool, but inappropriate for data access usage

Agreement reached with RCF for read-only access to RCF NFS data disks from STAR BNL computers; seems to be working well

New BNL-STAR facilities: 6 dual 500MHz/18GB nodes (2 arrived), 120GB disk

For software development, software and OS/compiler testing, online monitoring, services (web, DB, …)

Supported and managed by STAR personnel

Supports the STAR environment for Linux desktop boxes


STAR Offsite Computing: PDSF

pdsf at LBNL/NERSC: virtually identical configuration to RCF

Intel/Linux farm, limited Sun/Solaris, HPSS based data archiving

Current (10/99) scale relative to STAR RCF: CPU ~50% (1200 Si95), disk ~85% (2.5TB)

Long term goal: resources ~equal to STAR’s share of RCF

Consistent with the long-standing plan that RCF hosts ~50% of the experiments’ computing facilities: simulation and some analysis offsite

Ramp-up currently being planned

Other NERSC resources: T3Es a major source of simulation cycles

210,000 hours allocated in FY00: one of the larger allocations in terms of CPU and storage

Focus in future will be on PDSF; no porting to the next MPP generation


STAR Offsite Computing: PSC

Cray T3Es at the Pittsburgh Supercomputing Center

STAR Geant3 based simulation used at PSC to generate ~4TB of simulated data in support of the Mock Data Challenges and software development

Supported by the local CMU group

Recently retired when our allocation ran out and could not be renewed

Increasing reliance on PSC


STAR Offsite Computing: Universities

Physics analysis computing at home institutions: processing of ‘micro-DSTs’ and DST subsets, software development

Primarily based on small Linux clusters

Relatively small data volumes; aggregate total of ~10TB/yr

Data transfer needs of US institutes should be met by the network; overseas institutes will rely on tape based transfers

Existing self-service scheme will probably suffice

Some simulation production at universities: Rice, Dubna


Residual Needs

Data transfer to PDSF and other offsite institutes

Existing self-service DLT probably satisfactory for non-PDSF tape needs, but little experience to date

100GB/day network transfer rate today adequate for PDSF/NERSC data transfer

Future PDSF transfer needs (network, tape) to be quantified once the PDSF scale-up is better understood


Conclusions: STAR at RCF

Overall: RCF is an effective facility for STAR data processing and management

Sound choices in overall architecture, hardware, software

Well aligned with the HENP community: community tools and expertise easily exploited; valuable synergies with non-RHIC programs, notably ATLAS

Production stress tests have been successful and instructive

On schedule; facilities have been there when needed

RCF interacts effectively and consults appropriately for the most part, and is generally responsive to input

Weak points are security issues and interactions with general users (as opposed to experiment liaisons and principals)

Mock Data Challenges have been highly effective in exercising, debugging and optimizing RCF production facilities as well as our software

Based on status to date, we expect STAR and RCF to be ready for whatever RHIC Year 1 throws at us.