Object Persistency & Data Handling
Session C - Summary
Dirk Duellmann
CHEP 2000, Padova

Slide 4: Espresso Feasibility Study
- We identified solutions for the most critical components of a scalable and performant ODBMS
  - Prototype implementation shows promising performance and scalability
  - A strict component approach allows the effort to be split into independently developed, replaceable modules
- The development of an Open Source ODBMS seems possible within the HEP or general science community
- A collaborative effort of the order of 15 person years seems sufficient to produce such a system with production quality

Slide 8: HERA
- ZEUS
  - Objectivity based TagDB in production
  - significant performance gain for event selection
- H1
  - H1 will move to an analysis and event display framework based on ROOT
  - DST and micro-DST (based on BOS & PAW) will be replaced by analysis objects stored in ROOT trees
- HERA-B
  - Conditions database based on Berkeley DB
  - ROOT currently being integrated

Slide 9: Files + Metadata Approach
- RHIC
  - STAR moved from Objectivity to ROOT I/O
    - ROOT files for event data
    - file catalogue implemented using MySQL
  - PHENIX
    - ROOT files for event data
    - Objectivity/DB for conditions, configuration and file catalogue
- Fermilab Run II
  - CDF
    - ROOT for event data
    - file catalogue stored in Oracle
  - D0
    - D0OM for event data
    - Metadata based on Oracle

Slide 10: Sequential Access Model
- Integrated information about
  - Tape volumes
  - File catalogue
  - Runs
  - Event properties
  - Trigger configuration
- Uses Enstore as MSS
  - 1.5TB on Mammoth tapes (1.TB/day peak)
- Being used
  - to store Monte Carlo data
  - D0 analysis tasks

Slide 14: Mass Storage Systems
- The CASTOR project at CERN has moved into production
  - staging system backward compatible with SHIFT, with additional HSM functionality
  - main client will be COMPASS (@ 35MB/s)
  - planned ALICE Mock Data Challenge to prove feasibility of 100MB/s over one week
- EUROSTORE - Esprit project over the last 2 years
  - Parallel Filesystem (QSW) + HSM (DESY)
  - Prototype installation & testing (CERN)
  - Operational system has been demonstrated
  - Follow-on proposal has been submitted with the aim to provide a fully tested product including a Linux port
  - Deployment at DESY foreseen for end 2000

Slide 25: Language Binding & Insulation
- Language Support - at least for C++ and Java or (in some cases) FORTRAN
- Trade-off between
  - Risk for Experiment Code - insulation against change of persistency solution
  - Maintainability - additional manual work and many additional classes
  - Transparency for End Users - as simple to use as transient data
  - Flexibility - more than one storage solution at the same time; implement workable schema evolution
  - Performance - e.g. is I/O on demand needed? If yes, what is the right granularity: one object? One event?

Slide 31: Language Binding & Insulation
- Two main approaches are used
  - In place access of persistent objects
    - The framework is implemented using language binding
    - Only C++ pointers to persistent data are exposed to user (CMS, STAR, PHENIX)
  - Access of transient copies
    - Complete conversion into transient objects (BaBar)
    - On demand conversion into transient objects using smart pointers (LHCb)
    - Experiment specific insulation layer usually coupled to a specific application framework
- In both cases: split into two interfaces
  - framework uses the more flexible, performant, exposed lower level
  - end user uses the more insulated, transparent, customised higher level
- Is the mapping layer in between really experiment specific?
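
The "on demand conversion into transient objects" approach listed above can be illustrated with a small sketch. This is not code from any of the experiments named on the slide, which use C++ smart pointers; it is a Python analogue under stated assumptions. `Track`, `LazyRef`, `ToyStore`, and the record keys are all hypothetical names chosen for illustration. The handle defers reading and converting the stored record until user code first dereferences it.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Track:
    """Transient object seen by end-user code (hypothetical class)."""
    pt: float
    charge: int


class LazyRef(Generic[T]):
    """Smart-pointer-like handle: converts the stored record on first access."""

    def __init__(self, key: str, loader: Callable[[str], T]) -> None:
        self._key = key
        self._loader = loader
        self._obj: Optional[T] = None

    def get(self) -> T:
        if self._obj is None:
            # First dereference triggers the read and the conversion.
            self._obj = self._loader(self._key)
        return self._obj

    @property
    def loaded(self) -> bool:
        return self._obj is not None


class ToyStore:
    """Stand-in for a persistent store: maps keys to raw records (dicts)."""

    def __init__(self, records):
        self._records = records
        self.reads = 0  # counts how many records were actually read

    def load_track(self, key: str) -> Track:
        # Persistent-to-transient conversion lives here, in the lower level
        # used by the framework; end users only see Track objects.
        self.reads += 1
        raw = self._records[key]
        return Track(pt=raw["pt"], charge=raw["q"])

    def ref(self, key: str) -> LazyRef:
        return LazyRef(key, self.load_track)


store = ToyStore({"trk/1": {"pt": 12.5, "q": -1}, "trk/2": {"pt": 3.0, "q": 1}})
refs = [store.ref(k) for k in ("trk/1", "trk/2")]

first = refs[0].get()   # reads and converts trk/1
again = refs[0].get()   # served from the cached transient copy
print(first, "reads so far:", store.reads)
```

Only the record that is actually dereferenced is read, which is one way to frame the granularity question raised on the earlier slide (one object versus one event per I/O operation).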
Slide 32: Schema Evolution & Object Conversion
- BaBar - Objectivity/DB
  - presented a conversion scheme using their transient/persistent mapping scheme
- STAR - ROOT
  - implemented an additional conversion mechanism which replaces the user schema evolution provided by ROOT
- CLEO III - Objectivity/DB
  - implements a system based on opaque data objects stored in Objectivity
- Is an experiment independent implementation of schema evolution possible?

Slide 33: From Data Storage to Data Management
- Consistent management of a distributed data store needs knowledge about the semantics of the data
  - which files belong to one event collection, run, calibration period
    - they should be discarded together
    - staged together
    - exported together
- Strong coupling to system details
  - which application logic
  - batch system
  - mass storage system
- Significantly larger functionality & complexity
  - significant development effort

Slide 34: Performance Optimisation of Complex Storage Systems
- Successful system optimisation requires correlated diagnostics on all levels
  - Mass Storage System
    - number of mounted tapes, file lifetime in disk pool
  - Data Server
    - I/O per server, per filesystem, per network interface
  - Lock Server
    - number of locks, number of waiting processes, locked resources
  - Client Host
    - I/O per client, per filesystem, per machine, total CPU usage
    - number of running processes
  - Client Application
    - number of used objects, containers and databases, transaction timing
    - regular profiling runs
- All system components need monitoring instrumentation
  - understanding of chaotic areas like analysis servers is definitely non-trivial

Slide 35: Transactions & Recovery
- Are transactions needed to allow fail safe concurrent access?
  - Is it cheaper/easier to work in the old (manual) way?
    - With sequential recovery: just throw away the last file, the last group of files, change some meta data
    - Application level consistency checks?
  - IT industry seems to have a different opinion
    - Use transactions to enforce consistency between the different parts of the store
    - Is the recovery of our data and meta data really that much simpler?
    - How does one integrate transactions in multiple storage systems?
- The production experience of the next generation of experiments will tell us more.

Slide 36: Summary of the Summary
- Significant progress in providing object persistency for a real life experiment
  - BaBar successfully went into production with an ODBMS based store
- Management of complex systems is a significant effort
  - Solutions for schema evolution, insulation layers, data import/export have been developed for specific experiment frameworks
  - Can some of those solutions be generalised?
- Still open questions
  - direct use of persistent objects or converted copies?
  - single ODBMS system or files + metadata in an RDBMS?
- More experience needed from running experiments
  - RHIC and Fermilab Run II experiments will soon be able to tell us more
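
To make the "files + metadata in an RDBMS" option and the transaction question more concrete, here is a minimal sketch of a file catalogue. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for the MySQL and Oracle catalogues mentioned in the talk. The table layout, column names, run numbers, and file names are invented for illustration and do not reflect any experiment's actual schema. The sketch registers all files of one run in a single transaction, so a failed registration leaves no partial entries, and it discards the files of a run together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " path TEXT PRIMARY KEY,"
    " run INTEGER NOT NULL,"
    " n_events INTEGER NOT NULL)"
)


def register_run(conn, run, files):
    """Insert all files of one run in a single transaction (all or nothing)."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO files (path, run, n_events) VALUES (?, ?, ?)",
            [(path, run, n_events) for path, n_events in files],
        )


def files_for_run(conn, run):
    """Return the catalogued file paths of a run, sorted by path."""
    rows = conn.execute(
        "SELECT path FROM files WHERE run = ? ORDER BY path", (run,)
    )
    return [row[0] for row in rows]


def discard_run(conn, run):
    """Remove the catalogue entries of a run together; returns rows removed."""
    with conn:
        cur = conn.execute("DELETE FROM files WHERE run = ?", (run,))
    return cur.rowcount


register_run(conn, 1001, [("run1001_a.root", 5000), ("run1001_b.root", 4200)])

try:
    # The second entry repeats an existing path, violating the primary key,
    # so the whole registration of run 1002 is rolled back.
    register_run(conn, 1002, [("run1002_a.root", 100), ("run1001_a.root", 100)])
except sqlite3.IntegrityError:
    pass

print(files_for_run(conn, 1001))
print(files_for_run(conn, 1002))
```

The catalogue transaction only protects the metadata. Keeping it consistent with the event files themselves, or with a separate mass storage system, is the harder multi-system problem the Transactions & Recovery slide asks about, and this sketch does not address it.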
