
Page 1

HENP Computing at BNL

Torre Wenaus, STAR Software and Computing Leader

BNL

RHIC & AGS Users Meeting, Asilomar, CA

October 21, 1999

Page 2

Content

Bruce's talk

ATLAS

Linux

Mock Data Challenges

D0

Focus on areas really changing the scale of HENP computing at BNL

Mount’s APOGEE talk

Security

Software ‘attracting good people’

ROOT; Phenix’s online threaded

Objectivity, MySQL

RIKEN comp center

Esnet

Open Science

Page 3

Historical Perspective

Prior to RHIC, BNL has hosted many small to modest scale AGS experiments

With RHIC, BNL moves into the realm of large collider detectors, with a computing task at a scale similar to SLAC, Fermilab, CERN, etc.

This has required a dramatic change in the scale of HENP computing at BNL

RHIC Computing Facility (RCF) established Feb 1997 to supply primary (non-simulation) RHIC computing needs

Successful operations in two 'Mock Data Challenge' production stress tests and in the summer 1999 engineering run

First physics run in early 2000

Presence of RCF was a strong factor in the selection of BNL as the principal US computing site for the CERN LHC ATLAS experiment

Requirements and computing plan similar to RCF

Will operate in close coordination with RCF

LHC and ATLAS operations begin in 2005

Page 4

This Talk

Will focus on the major growth of HENP computing as a BNL activity brought by these new programs

RHIC computing at BNL

ATLAS computing at BNL

Brief mention of some other programs

Conclusions

Thanks to Bruce Gibbard, RHIC computing facility head, and others (indicated on slides) for materials

Page 5

RHIC Computing at RCF

Four experiments: PHENIX, STAR, PHOBOS, BRAHMS (4:4:2:1 relative scales of computing task)

Aggregate raw data recording rate of ~60 MBytes/sec; annual raw data volume ~600 TBytes

NB: the size of global WWW content is estimated at 7 TBytes

Event reconstruction: 13,000 SPECint95 (450 MHz PC = 18 SPECint95; translated into PC counts below)

Event filtering (data mining) and physics analysis: 7,000 SPECint95

'Mining' interesting data off tape for physics analyses: aggregate access rates of ~200 MBytes/sec

Iterative, interactive analysis of disk-based data by hundreds of users: aggregate access rates of ~1000 MBytes/sec

Software development and distribution: 100s of developers; many 100k lines of code per experiment; RCF is the primary development and distribution (AFS) site
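As a rough illustration using only the numbers quoted above, the CPU requirements translate into commodity-farm terms as

\[ N_{\mathrm{reco}} \approx \frac{13\,000\ \mathrm{SPECint95}}{18\ \mathrm{SPECint95\ per\ 450\ MHz\ PC}} \approx 720\ \mathrm{PCs}, \qquad N_{\mathrm{analysis}} \approx \frac{7\,000}{18} \approx 390\ \mathrm{PCs} \]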

Page 6

Computing Strategies

Extensive use of community/commercial/commodity products, hardware and software

Increasing use of open software (e.g. Linux, MySQL database)

Exploit the 'embarrassingly parallel' nature of HENP computing (see the sketch below)

Farms of loosely coupled processors (Linux PCs on Ethernet)

Limited use of Sun machines for I/O intensive analysis

Hierarchical storage management (disk + tape robot/shelf) and flexible partitioning of event data based on access characteristics

Optimize storage cost and access latencies to interesting data

Extensive use of OO software technologies

Adopted by all four RHIC experiments, ATLAS, other BNL HENP software efforts (e.g. D0), and virtually all other forthcoming experiments

Primarily C++; some Java

Object I/O: Objectivity, a commercial OO database, and ROOT, a community (CERN) developed tool
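As a toy sketch of the 'embarrassingly parallel' strategy (illustrative only; the file names and node count are invented, and real farms use batch and farm-management software rather than this splitter), independent event files can simply be dealt out round-robin across loosely coupled Linux nodes, each processing its share with no inter-node communication:

```cpp
// Toy illustration of the 'embarrassingly parallel' model: independent input
// files are dealt round-robin to farm nodes; each node then reconstructs its
// own file list independently. Names and counts are illustrative.
#include <cstdio>
#include <string>
#include <vector>

int main() {
  const int nNodes = 100;                       // e.g. 100 Linux farm nodes
  std::vector< std::vector<std::string> > work(nNodes);

  // Pretend we have 1000 raw-data files from a run.
  for (int i = 0; i < 1000; ++i) {
    char name[64];
    std::sprintf(name, "raw_run123_file%04d.daq", i);
    work[i % nNodes].push_back(name);           // round-robin assignment
  }

  // Each node's job script would then process only its own list.
  std::printf("node 0 gets %u files, first is %s\n",
              static_cast<unsigned>(work[0].size()), work[0][0].c_str());
  return 0;
}
```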

Page 7

Event Data Storage and Management

Major software challenge: event data storage and management

ROOT: HENP community tool (from CERN)

Used by all RHIC experiments for event data storage

Objectivity: commercial object database

Used by PHENIX for the conditions database; RCF did the Linux port

Relational databases (MySQL, ORACLE)

Many cataloguing applications in the experiments and at RCF

A MySQL-based complement to ROOT developed by STAR for its event store, replacing Objectivity (sketched below)

Grand Challenge Architecture

Managed access to HPSS-resident data, particularly for data mining

LBNL-led with ANL and BNL participation; deployment at RCF

Particle Physics Data Grid: transparent wide-area data processing

US HENP 'Next Generation Internet' project, primarily LHC directed

RCF/RHIC will act as an early testbed
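To make the ROOT-plus-relational-database approach concrete, here is a minimal, hypothetical sketch (not STAR's actual code; the class, branch and file names are invented) of writing per-event quantities to a ROOT file, whose run and file metadata would then be registered in a MySQL catalog:

```cpp
// Hedged sketch: per-event data written to a ROOT file, with the file's
// metadata catalogued externally (e.g. in MySQL). Names are illustrative.
#include "TFile.h"
#include "TTree.h"

int main() {
  TFile f("run123_dst.root", "RECREATE");       // one file per production job
  TTree tree("dst", "Reconstructed event summary");

  Int_t   nTracks = 0;                          // illustrative event-level quantities
  Float_t vertexZ = 0.f;
  tree.Branch("nTracks", &nTracks, "nTracks/I");
  tree.Branch("vertexZ", &vertexZ, "vertexZ/F");

  for (Int_t iEvt = 0; iEvt < 1000; ++iEvt) {   // event loop (reconstruction output)
    nTracks = 4000 + iEvt % 100;                // placeholder values
    vertexZ = 0.1f * (iEvt % 50);
    tree.Fill();
  }

  tree.Write();
  f.Close();
  // A relational catalog (MySQL) would then record the file name, run number,
  // event count and HPSS location so analyses can locate the data.
  return 0;
}
```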

Page 8

ATLAS Computing at BNL

A Toroidal LHC ApparatuS: one of 4 experiments at the LHC, a 14 TeV pp collider

ATLAS computing at CERN estimated to be >10 times that of RHIC

Augmented by regional centers outside CERN; total scale similar to the CERN installation

US ATLAS will have one primary 'Tier 1' regional center, at BNL: ~20% of the CERN facility; ~2x RCF

BNL also manages the US ATLAS construction project; ~20% of full ATLAS detector

Simulation, data mining, physics analysis, and software development will be primary missions of the BNL Tier 1 center

Page 9

ATLAS: Commonality and Synergy with RHIC

Qualitative requirements and Tier 1 quantitative requirements similar to RCF

Exploit economies of scale in hardware and software

Share technical expertise

Learn from and build on RHIC computing as a 'real world testbed'

Commonality: complete coincidence of supported platforms

Intel/Linux processor farms, Sun/Solaris

Objectivity -- and shared concerns over Objectivity!

HPSS -- and shared concerns over HPSS!

Data mining, Grand Challenge

ROOT as an interim analysis tool

Particle Physics Data Grid

Page 10

Current Status

RHIC RCF

Hardware for first year physics in place, except for some tape store hardware (5 drives; IBM server upgrades)

Extensive testing and tuning to be done: performance, reliability, robustness

All year 1 requirements satisfied except for disk capacity (later augmentation an option; not critically needed now)

In production use by the experiments

Positive review by the Technical Advisory Committee just concluded

US ATLAS Tier 1 center

Initial facility in place; usage by US ATLAS ramping up

Operating out of RCF

ATLAS software installed and operating

More hardware on the way; further increases at the proposal stage

Dedicated manpower ramping up

Page 11

Conclusions

RHIC and RCF have brought BNL to the forefront of HENP computing

Computing scale, imminent operation, mainstream approaches and community involvement make RHIC computing an important testbed for today's technologies and a stepping stone to the next generation

Performance to date gives confidence for RHIC operations

Strong software efforts at BNL in the experiments

BNL as host of US ATLAS Tier 1 center will be a leading HENP computing center in the years to come

Leveraging the facilities, expertise and experience of RCF and the RHIC program

Facility installation to be complemented by a software development effort integrated with the local US ATLAS group

Programs well supported by Brookhaven as part of an increased attention to scientific computing at the lab

Lots of potential for involvement!

Page 12

RIKEN QCDSP Parallel Computer

Special purpose massively parallel machine based on DSPs for quantum field theory calculations

4D mesh with nearest neighbor connections

12,288 nodes, 600 Gflops

Custom designed and built; collaboration centered at Columbia

RIKEN BNL Research Center

192 motherboards, 64 processors each

Page 13

CDIC - Center for Data Intensive Computing

Newly established BNL center developing collaborative projects

Close ties to SUNY at Stony Brook

Some of the HENP projects proposed or begun:

RHIC Visualization

Newly established collaboration with Stony Brook to develop dynamic 3D visualization tools for RHIC interactions and a 'beam's eye' view

RHIC Computing

Proposed collaboration with IBM to use idle PC cycles for RHIC physics simulation (generator level)

Data Mining

New project studying the application of 'rough sets' data mining concepts to RHIC event classification and feature extraction

Accelerator Design

Proposed parallel simulation of beam dynamics for accelerator design and optimization

Page 14

Visualization

RHIC Au-Au collision animation

(Quicktime movie available on web)

PHENIX event simulation

Page 15

ESnet Utilization

Page 16

Open Software/Open Science Conference

BNL, Oct 2, 1999

Educate scientists on open source projects

Stimulate open source applications in science

Present science applications to open source developers

Page 17

HENP Computing Challenges

Experiment          Data         Compute
E895 (AGS)          10 TB/yr     600 SPECint95
BaBar (SLAC)        400 TB/yr    5,000 SPECint95
STAR (RHIC)         266 TB/yr    10,100 SPECint95
PHENIX (RHIC)       700 TB/yr    8,500 SPECint95
D0 Run II (FNAL)    280 TB/yr    4,075 SPECint95
CDF Run II (FNAL)   464 TB/yr    3,650 SPECint95
ATLAS (LHC)         1100 TB/yr   2,000,000 SPECint95

Experiment          Countries   Institutes   Collaborators   Time Frame
E895 (AGS)          3           12           49              2000
BaBar (SLAC)        9           85           600             2010
STAR (RHIC)         7           34           400             2010
PHENIX (RHIC)       10          41           400             2010
D0 Run II (FNAL)    11          77           500             2005
CDF Run II (FNAL)   8           41           490             2005
ATLAS (LHC)         34          144          1700            2015

Craig Tull, LBNL

Page 18

STAR at RHIC

RHIC: Relativistic Heavy Ion Collider at Brookhaven National Laboratory

Colliding Au-Au nuclei at 200 GeV/nucleon

Principal objective: discovery and characterization of the Quark Gluon Plasma

Additional spin physics program in polarized p-p

Engineering run 6-8/99; first year physics run 1/00

STAR experiment

One of two large 'HEP-scale' experiments at RHIC, >400 collaborators each (PHENIX is the other)

Heart of the experiment is a Time Projection Chamber (TPC) drift chamber (operational), together with a Si tracker (year 2) and an electromagnetic calorimeter (staged over years 1-3)

Hadrons, jets, electrons and photons over large solid angle

Page 19

The STAR Computing Task

Data recording rate of 20 MB/sec; ~12 MB raw data per event (~1 Hz)

~4000+ tracks/event recorded in the tracking detectors (factor of 2 uncertainty in physics generators)

High statistics per event permit event-by-event measurement and correlation of QGP signals such as strangeness enhancement, J/psi attenuation, high-Pt parton energy loss modifications in jets, and global thermodynamic variables (e.g. Pt slope correlated with temperature)

17M Au-Au events (equivalent) recorded in a nominal year (see the check below)

Relatively few but highly complex events requiring large processing power

Wide range of physics studies: ~100 concurrent analyses in ~7 physics working groups
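As a consistency check using only figures quoted in this talk, the nominal-year event sample and event size reproduce the raw data volume given later (page 26):

\[ 17\times10^{6}\ \mathrm{events} \times 12\ \mathrm{MB/event} \approx 2\times10^{8}\ \mathrm{MB} \approx 200\ \mathrm{TB\ of\ raw\ data\ per\ nominal\ year} \]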

Page 20

RHIC/STAR Computing Facilities

Dedicated RHIC computing center at BNL, the RHIC Computing Facility

Data archiving and processing for reconstruction and analysis

Three production components: reconstruction (CRS) and analysis (CAS) services and managed data store (MDS)

10,000 (CRS) + 7,500 (CAS) SPECint95 CPU

~50 TB disk, 270 TB robotic tape, 200 MB/s I/O bandwidth, managed by the High Performance Storage System (HPSS) developed by a DOE/commercial consortium (IBM et al.)

Current scale: ~2500 Si95 CPU, 3 TB disk for STAR

Limited resources require the most cost-effective computing possible

Commodity Intel farms (running Linux) for all but I/O intensive analysis (Sun SMPs)

Smaller outside resources:

Simulation and analysis facilities at outside computing centers

Limited physics analysis computing at home institutions

Page 21

Implementation of the RHIC Computing Model: Incorporation of Offsite Facilities

(Diagram: the BNL HPSS tape store linked to offsite facilities, including a T3E, an SP2, Berkeley, Japan, MIT, and many universities)

Doug Olson, LBNL

Page 22

HENP Computing: Today’s Realities

Very Large Data Volumes

Large, Globally Distributed Collaborations

Long Lived Projects (>15 years)

Large (1-2M LOC), Complex Analyses

Distributed, Heterogeneous Systems

Very Limited Computing Manpower

Most Computing Manpower are not Professionals

Not necessarily a bad thing! Good understanding of, and direct interest in, the problem among developers

Reliance on Open and Commercial Software & Standards

Evolving Computer Industry & Technology

Page 23

Event Data Storage

Management of Petabyte data volumes is arguably the most difficult task in HENP computing today

Solutions must map effectively onto OO software technology

Intensive community effort in object database technology over the last 5 years

Focus on Objectivity, the only commercial product that scales to PBytes

Great early promise; strong potential to minimize in-house development and to match the OO architecture of the experiments well

Reality has been more difficult: development effort much greater than expected, and mixed results on scalability

In parallel with Objectivity, community solutions have also been developed

Particularly the ROOT system from CERN, supporting I/O of C++ based object models

When complemented by a relational database, it provides a robust and scalable solution that integrates well with experiment software

The jury is still out

STAR and some other experiments have dropped Objectivity in favor of ROOT+RDBMS

BaBar at SLAC is in production with Objectivity, and is working through the problems

Page 24

Data Management

Coupled to the event data storage problem, but distinct, is the problem of managing effective archiving and retrieval of the data

A hierarchical storage management system is required, capable of managing Terabytes of disk-resident, rapid-access data and Petabytes of tape-resident data with medium-latency access

Industry offers very few solutions today

One (and only one) has been identified: HPSS

Deployed at RCF (and many other sites), successfully but with caveats

Demands high manpower levels for development and 24x7 support

Still under development, particularly in HENP applications, with stability and robustness issues

Community HENP solutions under development in this area as well (Fermilab, DESY)

Page 25

Distributed Computing

In current generation experiments such as RHIC, and to a much greater degree in the next generation such as LHC, distributed computing is essential

Fully empowering physicists not at the experimental site to participate in development and analysis, with effective access to the data

Distributing the computing and data management task among several large sites

The central site can no longer afford to support computing on its own

Near and long term efforts are underway to address the need

e.g. the NOVA project at BNL (Networked Object-based enVironment for Analysis): a small project to address immediate and near term needs (STAR/RHIC, ATLAS, possibly others)

Large, LHC-directed projects such as the Particle Physics Data Grid project and the MONARC regional center modelling project

Page 26

Computing Requirements

Nominal year processing and data volume requirements:

Raw data volume: 200TB

Reconstruction: 2800 Si95 total CPU, 30 TB DST data

10x event size reduction from raw to reco

1.5 reconstruction passes/event assumed

Analysis: 4000 Si95 total analysis CPU, 15 TB micro-DST data

1-1000 Si95-sec/event per MB of DST depending on the analysis

Wide range, from CPU-limited to I/O-limited

~100 active analyses, 5 passes per analysis

Micro-DST volumes from 0.1 to several TB

Simulation: 3300 Si95 total including reconstruction, 24TB

Total nominal year data volume: 270TB

Total nominal year CPU: 10,000 Si95
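The totals follow directly from the component figures above:

\[ \mathrm{CPU:}\ 2800 + 4000 + 3300 \approx 10\,000\ \mathrm{Si95}; \qquad \mathrm{Data:}\ 200 + 30 + 15 + 24 \approx 270\ \mathrm{TB} \]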

Page 27

STAR Computing Facilities: RCF

Data archiving and processing for reconstruction and analysis (not simulation; that is done offsite)

General user services (email, web browsing, etc.)

Three production components: reconstruction and analysis services (CRS, CAS) and managed data store (MDS)

Nominal year scale:

10,000 (CRS) + 7,500 (CAS) SPECint95 CPU

Intel farms running Linux for almost all processing; limited use of Sun SMPs for I/O intensive analysis

Cost-effective, productive, well-aligned with the HENP community

~50 TB disk, 270 TB robotic tape, 200 MB/s, managed by HPSS

Current scale (when new procurements are in place):

~2500 Si95 CPU, 3 TB disk for STAR

~8 TB of data currently in HPSS

Page 28

Computing Facilities

Dedicated RHIC computing center at BNL, the RHIC Computing Facility

Data archiving and processing for reconstruction and analysis

Simulation done offsite

10,000 (reco) + 7,500 (analysis) Si95 CPU

Primarily Linux; some Sun for I/O intensive analysis

~50 TB disk, 270 TB robotic tape, 200 MB/s, managed by HPSS

Current scale (STAR allocation, ~40% of total):

~2500 Si95 CPU

3 TB disk

Support for (a subset of) physics analysis computing at home institutions

Page 29

Mock Data Challenges

MDC1: Sep/Oct '98

>200k (2 TB) events simulated offsite; 170k reconstructed at RCF (goal was 100k)

Storage technologies exercised (Objectivity, ROOT)

Data management architecture of the Grand Challenge project demonstrated

Concerns identified: HPSS, AFS, farm management software

MDC2: Feb/Mar '99

New ROOT-based infrastructure in production

AFS improved; HPSS improved but still a concern

Storage technology finalized (ROOT)

New problem area, STAR program size, addressed in new procurements and OS updates (more memory, swap)

Both data challenges:

Effective demonstration of productive, cooperative, concurrent (in MDC1) production operations among the four experiments

Bottom line verdict: the facility works, and should perform in physics data taking and analysis

Page 30

Offline Software Environment

Current software base is a mix of Fortran (55%) and C++ (45%), from ~80%/20% (~95%/5% in non-infrastructure code) in 9/98

New development, and all post-reco analysis, in C++

Framework built over ROOT adopted 11/98

Origins in the 'Makers' of ATLFAST (see the sketch at the end of this slide)

Supports legacy Fortran codes and table (IDL) based data structures developed in the previous StAF framework without change

Deployed in offline production and analysis in our 'Mock Data Challenge 2', 2-3/99

Post-reconstruction analysis: C++/OO data model 'StEvent'

StEvent interface is ‘generic C++’; analysis codes are unconstrained by ROOT and need not (but may) use it

Next step: migrate the OO data model upstream to reco
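To illustrate the 'Maker' pattern the framework inherits from ATLFAST (a minimal sketch; the class names and event loop below are invented for illustration and are not the actual STAR interfaces), each module implements Init/Make/Finish and a chain invokes Make() once per event:

```cpp
// Hedged sketch of a Maker-style module chain: each module is initialized
// once, called once per event, and finalized at the end. Names illustrative.
#include <iostream>
#include <vector>

class Maker {                          // framework-style module interface
public:
  virtual ~Maker() {}
  virtual int Init()   { return 0; }   // called once before the event loop
  virtual int Make()   = 0;            // called once per event
  virtual int Finish() { return 0; }   // called once after the event loop
};

class EventCountMaker : public Maker { // example module: counts events
  long fEvents;
public:
  EventCountMaker() : fEvents(0) {}
  int Make()   { ++fEvents; return 0; }
  int Finish() { std::cout << "Processed " << fEvents << " events\n"; return 0; }
};

int main() {
  std::vector<Maker*> chain;
  chain.push_back(new EventCountMaker());

  for (size_t i = 0; i < chain.size(); ++i) chain[i]->Init();
  for (int evt = 0; evt < 100; ++evt)            // event loop driven by the chain
    for (size_t i = 0; i < chain.size(); ++i) chain[i]->Make();
  for (size_t i = 0; i < chain.size(); ++i) chain[i]->Finish();

  for (size_t i = 0; i < chain.size(); ++i) delete chain[i];
  return 0;
}
```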

Page 31

Initial RHIC DB Technology Choices

A RHIC-wide Event Store Task Force in Fall ‘97 addressed data management alternatives

Requirements formulated by the four experiments

Objectivity and ROOT were the 'contenders' put forward

STAR and PHENIX selected Objectivity as the basis for data management

Concluded that only Objectivity met the requirements of their event stores

ROOT selected by the smaller experiments, and seen by all as an analysis tool with great potential

Issue for the two larger experiments: where to draw the dividing line between Objectivity and ROOT in the data model and data processing

Page 32

Event Store Requirements -- And Fall ‘97 View

Requirement                           Objectivity   ROOT
Good C++ API                          OK            OK
Scalability to RHIC data volumes      OK            No file mgmt
Adequate I/O throughput               OK            OK
HPSS compatibility                    Planned       No
Integrity, availability of data       OK            No file mgmt
Recovery from permanently lost data   OK            No file mgmt
Object versioning, schema evolution   OK            Crude
Long term availability                OK?           OK?
Access control                        OS            OS
Administration tools                  OK            No
Backup, recovery of subsets of data   OK            No file mgmt
WAN distribution of data              OK            No file mgmt
Data locality control                 OK            OS
Linux support                         No            OK

Page 33

Requirements: STAR 8/99 View (My Version)

Requirement                 Obj 97    Obj 99     ROOT 97        ROOT 99
C++ API                     OK        OK         OK             OK
Scalability                 OK        ?          No file mgmt   MySQL
Aggregate I/O               OK        ?          OK             OK
HPSS                        Planned   OK?        No             OK
Integrity, availability     OK        OK         No file mgmt   MySQL
Recovery from lost data     OK        OK         No file mgmt   OK, MySQL
Versions, schema evolve     OK        Your job   Crude          Almost OK
Long term availability      OK?       ???        OK?            OK
Access control              OS        Your job   OS             OS, MySQL
Admin tools                 OK        Basic      No             MySQL
Recovery of subsets         OK        OK         No file mgmt   OK, MySQL
WAN distribution            OK        Hard       No file mgmt   MySQL
Data locality control       OK        OK         OS             OS, MySQL
Linux                       No        OK         OK             OK
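As a hedged illustration of how a relational database fills the 'file management' gaps noted in the table above (the schema, table, and host names are assumed for illustration and are not STAR's actual catalog), a production job might register each ROOT file and its HPSS location, and an analysis job might later query by run:

```cpp
// Hedged sketch: a MySQL-based file catalog providing the bookkeeping that
// plain ROOT files lack. Schema and connection details are illustrative.
#include <mysql/mysql.h>
#include <cstdio>

int main() {
  MYSQL* db = mysql_init(0);
  if (!mysql_real_connect(db, "dbhost", "user", "passwd", "filecatalog", 0, 0, 0)) {
    std::fprintf(stderr, "connect failed: %s\n", mysql_error(db));
    return 1;
  }

  // Register a newly produced ROOT file and its HPSS location.
  const char* insert =
    "INSERT INTO files (name, run, events, hpss_path) "
    "VALUES ('run123_dst.root', 123, 1000, '/hpss/star/dst/run123_dst.root')";
  if (mysql_query(db, insert))
    std::fprintf(stderr, "insert failed: %s\n", mysql_error(db));

  // An analysis job asks the catalog which files hold a given run.
  if (!mysql_query(db, "SELECT name, hpss_path FROM files WHERE run = 123")) {
    MYSQL_RES* res = mysql_store_result(db);
    MYSQL_ROW  row;
    while ((row = mysql_fetch_row(res)))
      std::printf("%s -> %s\n", row[0], row[1]);
    mysql_free_result(res);
  }

  mysql_close(db);
  return 0;
}
```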

Page 34

RHIC Data Management: Factors For Evaluation

My perception of changes in the STAR view from '97 to now is shown

Factors for evaluation (Objectivity vs ROOT+MySQL):

Cost

Performance and capability as a data access solution

Quality of technical support

Ease of use, quality of documentation

Ease of integration with analysis

Ease of maintenance, risk

Commonality among experiments

Extent, leverage of outside usage

Affordable/manageable outside RCF

Quality of data distribution mechanisms

Integrity of replica copies

Availability of browser tools

Flexibility in controlling permanent storage location

Level of relevant standards compliance, e.g. ODMG

Java access

Partitioning of DB and resources among groups

Page 35

Object Database: Storage Hierarchy vs User View

User deals only with ‘object model’ of his own design; storage details are hidden

Page 36

ATLAS and US ATLAS

One of two large HEP experiments at CERN’s Large Hadron Collider (LHC)

Proton-proton collider; 14 TeV in the center of mass; 1 billion events/year

Principal objective: discovery and characterization of physics 'beyond the Standard Model': Higgs, Supersymmetry, …

Startup 2005+

Brookhaven hosts the US Project Office for US contributions to ATLAS: ~$170M; about 20% of the project

Brookhaven recently selected as host lab for US ATLAS Computing and site of the US Regional Center

Extension of the RHIC Computing Facility

US ATLAS Computing projected to grow to ~$15M/yr

Page 37

Conclusions

HENP is (unfortunately!) still pushing the envelope in the scale of the data processing and management tasks of present and next generation experiments

The HENP community has looked to the commercial and open software worlds for tools and approaches, with strong successes in some areas (OO programming), qualified successes in others (HPSS), and the jury still out on some (object databases)

Moore’s Law and the rise of Linux have made provisioning CPU cycles less of an issue

The community has converged on OO as the principal tool to make software development tractable

But solutions to data storage and management are much less clear

A need on the rise is distributed computing, but internet-driven growth in capacities and technologies will be a strong lever

Developments within the HENP community continue to be important, either as fully capable solutions or interim solutions pending further commercial/open software developments

Page 38

Conclusions

The circumstances of STAR:

Startup this year

Slow start in addressing event store implementation and C++ migration

Large base of legacy software

Extremely limited manpower and computing resources

These drive us to very practical and pragmatic data management choices, which leverage existing STAR strengths:

Beg, steal and borrow from the community

Deploy community and industry standard technologies

Isolate implementation choices behind standard interfaces, to revisit and re-optimize in the future

Component and standards-based software greatly eases the integration of new technologies, preserving compatibility with existing tools for selective and fall-back use, while efficiently migrating legacy software and legacy physicists

After some course corrections, we have a capable data management architecture for startup that scales to STAR's data volumes

… but Objectivity is no longer in the picture.