A Uniform and Coherent Approach to Object Persistency

Page 1: A Uniform and Coherent Approach to Object Persistency

Vincenzo Innocente, CERN/EP

Software Strategy

A Uniform and Coherent Approach to Object Persistency

Vincenzo Innocente

Page 2: A Uniform and Coherent Approach to Object Persistency

HEP Data

[Diagram: Event Collection, Collection Data, Event, Electrons, Tracks, Tracker Alignment, Ecal calibration, User Tag (N-tuple)]

Environmental data: detector and accelerator status; calibrations, alignments

Event-Collection Data (luminosity, selection criteria, …)

Event Data, User Data

Navigation is essential for an effective physics analysis

Complexity requires coherent access mechanisms

Page 3: A Uniform and Coherent Approach to Object Persistency

Not in original design

Later selected DAQ

Later more filters to DVNs and N-tuple

Page 4: A Uniform and Coherent Approach to Object Persistency

CMS Experiment-Data Analysis

[Diagram: Persistent Object Store Manager, Object Database Management System, Detector Control, Online Monitoring, Event Filter, Object Formatter, Simulation (G3 or G4), Quasi-online Reconstruction, Data Quality, Calibrations, Group Analysis, User Analysis (on demand). Labels: "Environmental data", "store", "Request part of event", "Store rec-Obj", "Store rec-Obj and calibrations"]

Page 5: A Uniform and Coherent Approach to Object Persistency

Uniform approach

Coherent data access model: same mechanisms, same language, same transaction model

Save effort: a single team of experts, a single team of administrators

Leverage experience: developers can easily move from one application to another (from event-data to calibration-data applications)

Reuse design and code: basic requirements are often the same; we can use the same code to manage event data, calibrations, “n-tuple”

Main road to producing better and higher-quality software
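To illustrate the reuse argument, the sketch below applies one generic container, with a single stage/commit/abort transaction model, to two unrelated data types. `PersistentContainer`, `EventSummary` and `EcalCalibration` are hypothetical names invented for this example, and in-memory vectors stand in for a real object store; none of this is CARF or Objectivity/DB code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical generic container: objects are staged inside a transaction
// and become visible only after commit(). In-memory vectors stand in for
// the real persistent store.
template <typename T>
class PersistentContainer {
public:
    // Queue an object for the current transaction.
    void stage(T obj) { pending_.push_back(std::move(obj)); }

    // Make every staged object visible, then empty the staging area.
    void commit() {
        for (T& obj : pending_) committed_.push_back(std::move(obj));
        pending_.clear();
    }

    // Discard staged objects without touching committed ones.
    void abort() { pending_.clear(); }

    std::size_t size() const { return committed_.size(); }

    const T& at(std::size_t i) const {
        assert(i < committed_.size());
        return committed_[i];
    }

private:
    std::vector<T> pending_;
    std::vector<T> committed_;
};

// Two illustrative payload types (the fields are assumptions).
struct EventSummary { unsigned run; unsigned event; };
struct EcalCalibration { unsigned channel; double gain; };
```

The same template would be instantiated as `PersistentContainer<EventSummary>` for event data and `PersistentContainer<EcalCalibration>` for calibrations, so both kinds of application share one access mechanism and one transaction model.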

Page 6: A Uniform and Coherent Approach to Object Persistency

Reconstruction Sources

Page 7: A Uniform and Coherent Approach to Object Persistency

CMS Reconstruction Model

[Diagram: Detector Element, Raw Data, Sim Hits, Digis, Rec Hits, Conditions, Geometry, Event, Algorithm (several instances), Rec Objs (several instances)]

Page 8: A Uniform and Coherent Approach to Object Persistency

Page 9: A Uniform and Coherent Approach to Object Persistency

Raw Event

[Diagram: RawEvent, RawData (vector of Digi), ReadOut, Index]

RawData are identified by the corresponding ReadOut.

RawData belonging to different “detectors” are clustered into different containers. The granularity will be adjusted to optimize I/O performance.

An index at RawEvent level is used to avoid accessing all containers in search of a given RawData.

A range index at RawData level could be used for fast random access in complex detectors.

Index implemented as an ordered vector of pairs.
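An index implemented as an ordered vector of pairs can be sketched as follows. This is a minimal illustration, not the CARF implementation: `ReadOutIndex`, `ReadOutId` and `Slot` are hypothetical names, and the assumption is that each ReadOut is identified by an integer and each RawData by its position in a container.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical identifiers, not CARF types: a ReadOut is named by an
// integer id and a RawData object by its slot in a container.
using ReadOutId = std::uint32_t;
using Slot = std::size_t;
using Entry = std::pair<ReadOutId, Slot>;

class ReadOutIndex {
public:
    // Insert or overwrite an entry, keeping the vector sorted by ReadOutId.
    void insert(ReadOutId id, Slot slot) {
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), id, keyLess);
        if (pos != entries_.end() && pos->first == id) {
            pos->second = slot;
        } else {
            entries_.emplace(pos, id, slot);
        }
        assert(std::is_sorted(entries_.begin(), entries_.end()));
    }

    // Binary search; returns nullptr when the ReadOut has no entry.
    const Slot* find(ReadOutId id) const {
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), id, keyLess);
        if (pos == entries_.end() || pos->first != id) return nullptr;
        return &pos->second;
    }

    std::size_t size() const { return entries_.size(); }

private:
    // Orders an entry against a bare key, as std::lower_bound requires.
    static bool keyLess(const Entry& entry, ReadOutId key) { return entry.first < key; }

    std::vector<Entry> entries_;
};
```

A sorted vector keeps the entries contiguous and makes lookups logarithmic, at the cost of linear-time insertion; that trade-off suits an index that is written once and read many times.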

Page 10: A Uniform and Coherent Approach to Object Persistency

Reconstruction Object Model

All persistent objects are managed by CARF.

Physics Modules access them through standard C++ pointers.
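The division of labour described here, where the framework owns the objects and physics code sees only plain pointers, might look like the sketch below. `ObjectManager` is a hypothetical stand-in for the framework and `Track` is reduced to a single assumed field; the real CARF interfaces are not shown in the slides.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Track { double pt; };

// Hypothetical stand-in for the framework: it owns every object it hands out.
class ObjectManager {
public:
    // Take ownership of a new object and return a non-owning pointer to it.
    const Track* adopt(double pt) {
        owned_.push_back(std::unique_ptr<Track>(new Track{pt}));
        return owned_.back().get();
    }

private:
    std::vector<std::unique_ptr<Track>> owned_;
};

// A "physics module": it sees only plain pointers and never deletes them,
// so it cannot tell managed objects from ordinary transient ones.
double sumPt(const std::vector<const Track*>& tracks) {
    double sum = 0.0;
    for (const Track* t : tracks) {
        assert(t != nullptr);
        sum += t->pt;
    }
    return sum;
}
```

Because `sumPt` takes ordinary `const Track*` values, it can be given pointers returned by `adopt` and the address of a local `Track` in the same call.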

Page 11: A Uniform and Coherent Approach to Object Persistency

CMS Reconstructed Objects

[Diagram: RecEvent, S-Track Reconstructor, S-Track (vector of RHits), TrackSecInfo, TrackConstituents; labels “rec”, “esd”, “aod”]

Reconstructed Objects produced by a given “algorithm” are managed by a Reconstructor.

A Reconstructed Object (Track) is split into several independent persistent objects to allow their clustering according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.).

The top-level object acts as a proxy.

Intermediate reconstructed objects (RHits) are cached by value into the final objects.
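One way to picture a top-level object acting as a proxy for separately stored parts is the sketch below. It assumes that each part is fetched on first access through a loader callback, which stands in for a database read; `LazyPart`, the loader signature and the fields of `TrackSecInfo` and `TrackConstituents` are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct TrackSecInfo { double chi2; };
struct TrackConstituents { std::vector<int> hitIds; };

// Loads a part on first access and caches it. The loader callback stands in
// for a read from wherever that part is physically clustered.
template <typename Part>
class LazyPart {
public:
    explicit LazyPart(std::function<Part()> loader) : loader_(std::move(loader)) {}

    const Part& get() const {
        if (!cached_) {
            assert(loader_);
            cached_.reset(new Part(loader_()));
        }
        return *cached_;
    }

    bool loaded() const { return static_cast<bool>(cached_); }

private:
    std::function<Part()> loader_;
    mutable std::unique_ptr<Part> cached_;
};

// Top-level proxy: summary data held by value, the other parts on demand.
class Track {
public:
    Track(double pt,
          std::function<TrackSecInfo()> secLoader,
          std::function<TrackConstituents()> constituentLoader)
        : pt_(pt),
          sec_(std::move(secLoader)),
          constituents_(std::move(constituentLoader)) {}

    double pt() const { return pt_; }
    const TrackSecInfo& secInfo() const { return sec_.get(); }
    const TrackConstituents& constituents() const { return constituents_.get(); }
    bool secInfoLoaded() const { return sec_.loaded(); }
    bool constituentsLoaded() const { return constituents_.loaded(); }

private:
    double pt_;
    LazyPart<TrackSecInfo> sec_;
    LazyPart<TrackConstituents> constituents_;
};
```

An analysis that only calls `pt()` never triggers either loader, which is the benefit of clustering the parts by access pattern.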

Page 12: A Uniform and Coherent Approach to Object Persistency

CARF2000 Event Structure

Page 13: A Uniform and Coherent Approach to Object Persistency

CMS Event Structure

[Diagram: Run, EventCollection (two instances), RecEvent (four instances), RawEvent; legend: Persistent, Transient]

In case of re-reconstruction the original structure is kept. Event objects are cloned and new collections created.
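The re-reconstruction rule can be sketched as follows, under the assumption that a RecEvent holds a link to its RawEvent rather than a copy of it. The struct layouts, the `recoVersion` field and the `reReconstruct` function are hypothetical and only illustrate the cloning step.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct RawEvent { unsigned id; };

// Assumption: a RecEvent links to its RawEvent and does not copy it.
struct RecEvent {
    std::shared_ptr<const RawEvent> raw;
    std::string recoVersion;
};

struct EventCollection {
    std::string name;
    std::vector<std::shared_ptr<RecEvent>> events;
};

// Build a new collection whose RecEvents are clones tagged with a new
// version. The original collection and its events are left untouched.
EventCollection reReconstruct(const EventCollection& original,
                              const std::string& newVersion) {
    EventCollection result;
    result.name = original.name + "/" + newVersion;
    for (const auto& ev : original.events) {
        assert(ev != nullptr);
        // Copies the link to the raw event, not the raw data itself.
        auto clone = std::make_shared<RecEvent>(*ev);
        clone->recoVersion = newVersion;
        result.events.push_back(std::move(clone));
    }
    return result;
}
```

After the call, both collections exist side by side and their events share the same RawEvent objects.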

Page 14: A Uniform and Coherent Approach to Object Persistency

Physical clustering

Page 15: A Uniform and Coherent Approach to Object Persistency

CMS needs a real DBMS

An experiment lasting 20 years cannot rely just on ASCII files and file systems for its production bookkeeping, “condition” database, etc.

Even today at LEP, the management of all real and simulated data-sets (from raw data to n-tuples) is a major enterprise.
  Multiple models used (DST, N-tuple, HEPDB, FATMAN, ASCII)

A DBMS is the modern answer to such a problem.

An ODBMS provides a coherent and scalable solution for managing all kinds of data:
  seamless integration with OO languages
  internal navigation capability

Page 16: A Uniform and Coherent Approach to Object Persistency

CMS Experience

CMS has used Objectivity/DB for the current prototype activity, in close contact with IT in the context of the RD45 project.

Database Developers (just OO and C++):
  Designing and implementing persistent classes is not harder than for native C++ classes.

Physics Software Developers (do not see Objectivity):
  Persistent objects are accessed using standard C++.
  The same code can access either persistent or transient objects.

Framework (easy to manage DB):
  Flexible and transparent distinction between logical associations and physical clustering.
  Fully transparent I/O with performance essentially limited by the disk speed (random access).

Page 17: A Uniform and Coherent Approach to Object Persistency

CMS Experience

Administration (essentially file management):
  Very flexible file-level management (localization, archival, replication) using AMS features.
  Several tools available to monitor activities and performance.
  File size overhead (5% for realistic CMS object sizes) not larger than for other “products”.

Physicists (easy to use):
  Personal databases are invaluable and in common use.
  Analysis performance and flexibility improved by shallow (link) and deep (data) local copy of selected event sample.
  Use the same type of event catalog as production.
  Framework and CMS tools hide all details.
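The difference between a shallow (link) and a deep (data) copy of a selected event sample can be sketched as below. The `Event` fields, the `Sample` alias and both function names are hypothetical; shared pointers stand in for whatever reference mechanism the database provides.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

struct Event { unsigned id; double missingEt; };
using EventRef = std::shared_ptr<const Event>;
using Sample = std::vector<EventRef>;
using Selector = std::function<bool(const Event&)>;

// Shallow (link) copy: the selected entries point at the same event
// objects as the source sample, so no event data is duplicated.
Sample shallowCopy(const Sample& source, const Selector& select) {
    Sample out;
    for (const EventRef& ev : source) {
        assert(ev != nullptr);
        if (select(*ev)) out.push_back(ev);
    }
    return out;
}

// Deep (data) copy: each selected event is duplicated, so the new sample
// no longer depends on the source objects.
Sample deepCopy(const Sample& source, const Selector& select) {
    Sample out;
    for (const EventRef& ev : source) {
        assert(ev != nullptr);
        if (select(*ev)) out.push_back(std::make_shared<Event>(*ev));
    }
    return out;
}
```

A shallow copy is cheap and stays tied to the source data, while a deep copy costs storage but gives a self-contained local sample.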

All our tests show that Objectivity/DB can satisfy CMS requirements in terms of performance, scalability and flexibility for all kinds of data.

Page 18: A Uniform and Coherent Approach to Object Persistency

Alternatives: other ODBMS

Versant is a viable commercial alternative to Objectivity.
  Do we have time to build an effective partnership (e.g. MSS interface)?

Espresso (by IT/DB) should be able to produce a fully fledged ODBMS in a couple of years once the proof-of-concept prototype is ready.

Migrate CARF from Objectivity to another ODBMS:
  We expect that it would take about one year.
  Will not affect the basic principles of CMS software architecture and data model.
  Will involve only the core CARF development team.
  Will not disrupt production and physics analysis.

Page 19: A Uniform and Coherent Approach to Object Persistency

Alternatives: ORDBMS

ORDBMS (relational DB with OO interface) are appearing on the market.
  Up to now they have looked targeted at those who already have a relational system and wish to make a transition to OO.

A new ORACLE product has all the appearances of a fully fledged ODBMS.

IT/DB is in the process of evaluating this new product as an event store.
  If it looks promising, CMS will join this evaluation next year.

We will consider the impact of ORDBMS on the CMS Data Model and on migration effort before the end of 2001.

Page 20: A Uniform and Coherent Approach to Object Persistency

Fallback Solution: Hybrid Models

We believe that this solution could seriously compromise our ability to perform our physics program competitively.

(R)DBMS for Event Catalog, Calibration, etc.
Object-stream files for event data
Ad-hoc networked data server and MSS interface

Less flexible:
  Rigid split between DBMS and event data
  One-way navigation from DBMS to event data

More complex:
  Two different I/O systems
  More effort to learn
  More resources for developing and maintaining our application software

This approach will be used by several experiments at BNL and Fermilab (RDBMS not directly accessible from user applications).

CMS is closely following these experiences.

Page 21: A Uniform and Coherent Approach to Object Persistency

Conclusion

CMS has chosen to follow a uniform and coherent approach for the development of Experiment-Data Analysis Software.

Today a Functional Prototype exists and includes:
  A modular Object Oriented Framework
  A Service and Utility Toolkit
  A Persistent Object Service based on Objectivity/DB
  Specialized applications for DAQ, Simulation, Reconstruction and Visualization
  A set of plug-in modules for detector and physics simulation, reconstruction and analysis

CMS is currently reviewing the present architecture, the software design and the technical choices to prepare for the next software development cycle.