IWLSC – CHEP 2006
The Evolution of Databases in HEP
A Time-Traveller's Tale
[ With an intermezzo covering LHC++ ]
Jamie Shiers, CERN
~ ~ ~…, DD-CO, DD-US, CN-AS, CN-ASD, IT-ASD, IT-DB, IT-GD, ¿¿-??, …
The Evolution of Databases in HEP
Abstract
The past decade has been an era of sometimes tumultuous change in the area of Computing for High Energy Physics.
This talk addresses the evolution of databases in HEP, starting from the LEP era and the visions presented during the CHEP 92 panel "Databases for High Energy Physics" (D. Baden, B. Linder, R. Mount, J. Shiers).
It then reviews the rise and fall of Object Databases as a "one size fits all solution" in the mid to late 90's and finally summarises the more pragmatic approaches that are being taken in the final stages of preparation for LHC data taking.
The various successes and failures (depending on one's viewpoint) regarding database deployment during this period are discussed, culminating in the current status of database deployment for the Worldwide LCG.
Once upon a time…
Databases for HEP Panel at CHEP 1992:
1. Should we buy or build database systems for our calibration and book-keeping needs?
2. Will database technology advance sufficiently in the next 8 to 10 years to be able to provide byte-level access to petabytes of SSC/LHC data?
Drew Baden, University of Maryland (PASS Project); B. Linder, Oracle Corporation; Richard Mount, Caltech; Jamie Shiers, CERN.
Buy or build? – Calibration & Bookkeeping
1. Is it technically possible to use a commercial system?
2. Would it be manageable administratively and financially?
Questions already investigated during LEP planning phase
Computer Physics Communications 45 (1987) 299-310
“Possibility” in 1984 (technically but not Q2);
“Probability” in 1992 (technically and conceivably also Q2).
Calibration & Bookkeeping in 1992
Many experiments had independently developed solutions to these problems, with largely overlapping functionality
Two (of the) efforts at producing common solutions:
1. FATMEN – file catalog and more (used by DELPHI, L3, OPAL etc.)
2. HEPDB – calibration database based on DBL3 and OPCAL
Both of these were based on ZEBRA-RZ, with ZEBRA-FZ for “updates” and zftp/zserv for distribution
FATMEN also had an Oracle backend (later dropped…)
Arguments - 1992
A significant amount of HEP-specific code would need to be added – roughly comparable in size (within a factor of 2) to existing home-grown solutions
Commercial systems require well trained support staff at every site. (Home-grown too, but experience shows that this is typically much less than for commercial solutions.)
Licensing and support for large, diverse HEP collaborations clearly a concern
We shall revisit these questions later in the show…
¿Use of existing packages for SSC/LHC?
Could / should Zebra be used in 2001?
Hope to exploit language features of Fortran90 (?) and/or C++
File catalog for multi-PB of data? Move towards nameserver approach (LFC = castor ns)
Existing data access and management tools? Away from sequential access…
Where do we store the data?
Actually in a database; Assumed for metadata; less likely for data itself
In some ‘super filesystem’ – e.g. based on IEEE MSS reference model
Zebra RZ and Scalability Issues
Some fields within RZ were limited to 16 bits
Fine for storage of HBOOK histograms etc but a limitation for larger catalogs
Move from 16-bit fields to 32-bit (Sunanda Banerjee)
A disconcertingly recurrent theme over the next decade…
(Also introduction of ‘eXchange’ format – makes zftp redundant in favour of binary ftp) (Burkhard Holl)
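The 16-bit limitation above can be illustrated with a short sketch (Python used purely for illustration; the 70,000-entry count is an invented example, not a real catalog figure):

```python
# A catalog counter stored in a 16-bit field silently wraps; widening
# the field to 32 bits removes the limit (70,000 entries is invented).
N_RECORDS = 70_000

as_16bit = N_RECORDS & 0xFFFF          # what a 16-bit field actually stores
as_32bit = N_RECORDS & 0xFFFF_FFFF     # the 32-bit fix

print(as_16bit)   # 4464 -- the count has wrapped (70000 - 65536)
print(as_32bit)   # 70000
```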
CHEP 92 – the Birth of OO in HEP?
Wide-ranging discussions on the future of s/w development in HEP
A number of proposals were presented – including ZOO (I preferred “OOZE”) – leading to:
RD41 – MOOSE: the applicability of OO to offline particle physics code
RD44 – GEANT4: produce a global object-oriented analysis and design of an improved GEANT simulation toolkit for HEP
RD45 – A Persistent Object Manager for HEP
(and later also LHC++ (subsequently ANAPHE))
ROOT
CERN - Computing Challenges 12
LHC++ - a “C++ CERNLIB”
Jamie Shiers, IT/ASD group, CERN
http://wwwinfo.cern.ch/asd/lhc++/index.html
Background - CERNLIB
Widely used, Fortran-based library, developed at CERN over ~30 years
“Natural” evolution to Fortran 90/95/2K; significant effort spent on F90 preparation
Circa 1993: strong growth in OO (C++)
Decision to drop plans for “F90 CERNLIB”; effort redirected towards OO solutions
Why C++? Why LHC++?
“The equivalent of the current CERN program library” (ATLAS TP)
“… only language that meets these requirements is C++” (CMS TP)
Decision of NA45 to move to C++
Needs of R&D projects (RD41, 44, 45)
Working group set up mid-95
Provide CERNLIB-like functionality in C++
Libraries for HEP Computing
Goals of Working Group
Re-use existing work where possible
• e.g. CLHEP
Use standard solutions where appropriate
• de-facto or de-jure: STL, OpenGL, NAG C, ODMG, ...
Investigate commercial components
• including license / distribution issues / pros & cons
Formal User Requirements gathering step
• following ESA PSS-05 guidelines
Be prepared for change!
• LHC lifetime is extremely long
CERNLIB & LHC++
CERNLIB → LHC++:
KERNLIB, MATHLIB, PACKLIB → C++, STL, CLHEP; NAG C library
MINUIT → GEMINI
HBOOK → Histograms (HTL)
GRAFLIB (HIGZ etc.) → OpenGL, OpenInventor
MCLIBs (Pythia, Herwig, ...) → Event Generators (Pythia 7, ...)
GEANT3 → GEANT4
PGMs: PAW, ... → Applications, Components & Modules
Status (Oct. 98)
[ LHC++ ] has achieved the goal of demonstrating that ... standard tools ... could solve typical HEP problems.
~2-3K lines of code + commercial components provide a concrete CERNLIB replacement.
Several experiments (CMS, NA45, ...) heavily using LHC++ components
~100 sites registered for access
• ~120 predicted in 1998 COCOTIME review
• On track for 150-200 by end-1999
Modularity
Emphasis on modularity of approach Can (have) replace(d) individual
components
Fahrenheit
Vendor STL
Software Migration
Strategy developed assuming migration
Has already taken place! RogueWave to STL; OpenGL implementations
HEP-specific code, e.g. histOOgrams to HTL
Future migrations: OpenInventor to Fahrenheit? C++ to Java? Java to ???
Lifetime of future experiments >> individual s/w packages!
Wylbur to now ~ now to end-LHC
LHC++ Report - LCB 10 March 1999 34
Referee Report on LHC++
LCB 10 March 1999
J. Knobloch
http://nicewww.cern.ch/atlas/wwwslides/lhcpp9903/index.htm
December 98 - Outstanding issues
• Alternatives to HEP Explorer should be investigated - growing interest in Java-based tools.
• There is interest in histogram storage without an OO DB
• Availability issues (frequency of releases, platforms and some licensing issues) remain to be finalised.
• Improvement in end-user training - slow outside CERN.
• CLHEP issues: releases coupled with rest of LHC++ or not? Need for a policy on further development.
• All experiments participating in LHC++ should define their milestones.
• Requirements and solutions for scripting should be evaluated and implemented.
• Information should be provided on LHC++ usage.
• Release strategy and priority setting should be looked at again, perhaps best through setting up an executive board.
Asked the 4 Experiments
• Current usage of LHC++
• Position on the various components
• Priorities
• Concerns
• Activities
• Received answers from all 4 experiments
• ALICE repeats general rejection of LHC++ expressed at the last LCB:
We have briefly evaluated LHC++ and we were not very satisfied.
At the moment we are using ROOT and we are content with it. Currently we have no plans regarding LHC++
We will consider very seriously any solution that is adopted by the other three experiments as a common solution
We believe that ROOT can be such a common solution and we encourage the other experiments to consider it as such
Proposal of conclusions
• Major progress has been made: meetings, new personnel
• Plan to establish an alternative to Explorer with something more adapted to the user requirements
• Improve the HEP-specific documentation of commercial components
• Document the overall architecture and dependencies
• Clarify the boundaries of LHC++
• Items from previous LCBs: scripting, usage statistics
• LHC++ is now moving from the development phase to the production phase. It is therefore appropriate to revisit the management of the project by establishing a “LHC++ Executive Board” composed of the LHC++ project leader and representatives of the LHC experiments
Discussing development and release plans
Setting priorities
Now back to our main thread…
We left off back at CHEP ’92… (The birth of OO in HEP?)
RD45 – Initial Milestones
[The project ] should be approved for an initial period of one year. The following milestones should be reached by the end of the 1st year.
1. A requirements specification for the management of persistent objects typical of HEP data together with criteria for evaluating potential implementations. [ Later dropped – experiments far from ready ]
2. An evaluation of the suitability of ODMG's Object Definition Language for specifying an object model describing HEP event data.
3. Starting from such a model, the development of a prototype using commercial ODBMSes that conform to the ODMG standard. The functionality and performance of the ODBMSes should be evaluated.
It should be noted that the milestones concentrate on event
data. Studies or prototypes based on other HEP data should not be excluded, especially if they are valuable to gain experience in the initial months.
RD45 – Initial Steps
Contacts with the main ODBMS vendors of that time: O2, ObjectStore, Objectivity, Versant, Poet, …
Many presentations scheduled at CERN
Training on O2, Objectivity/DB, a few licenses acquired…
Prototyping focussed on Objectivity/DB with Versant (later) as primary fall-back
Scalability of architecture
Objectivity/DB Scalability
Item – Limit
Page size – 2^16 bytes
# DBs / federated DB – 2^16 − 1
# containers / DB – 2^15 − 1
# logical pages / container – 2^16 − 1
Maximum DB size – file-system limit
2^16 databases (files) of 100GB = 6.5PB
CERN has requested extended OID
16 bits again!
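The limits above multiply out to the figure quoted on the slide. A back-of-the-envelope check (a sketch; the 100GB-per-file assumption is the slide's, and decimal units are assumed):

```python
# Back-of-the-envelope capacity of an Objectivity/DB federation,
# using the 16-bit limits quoted above (decimal units).
MAX_DBS = 2**16                    # databases (files) per federation
DB_SIZE_BYTES = 100 * 10**9        # assume each file grows to 100GB

total_pb = MAX_DBS * DB_SIZE_BYTES / 10**15
print(total_pb)                    # 6.5536 -- i.e. the ~6.5PB on the slide
```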
LCRB review, March 1996
The RD45 project has made excellent progress in identifying and applying solutions for object persistence for HEP based on standards and commercial products
RD45 should be approved for a further year.
The LCRB agrees with the program of future work outlined in the RD45 status report and regards the following activities (below) and milestones (next) as particularly important:
Provide the object persistence services needed for the first release of GEANT4 in early 1997
Collaborate with ATLAS and CMS in the development of those aspects of the Computing Technical Proposals which may be affected by the nature of object persistence services
RD45 Milestones - 96
1. Identify and analyse the impact of using an ODBMS for event data on the Object Model, the physical organisation of the data, coding guidelines and the use of third party class libraries
2. Investigate and report on ways that Objectivity/DB features for replication, schema evolution and object versions can be used to solve data management problems typical of the HEP environment
3. Make an evaluation of the effectiveness of an ODBMS and MSS as the query and access method for physics analysis. The evaluation should include performance comparisons with PAW and Ntuples
RD45 Milestones - 97
1. Demonstrate, by the end of 1997, the proof of principle that an ODBMS can satisfy the key requirements of typical production scenarios (e.g. event simulation and reconstruction), for data volumes up to 1TB. The key requirements will be defined, in conjunction with the LHC experiments, as part of this work,
2. Demonstrate the feasibility of using an ODBMS + MSS for Central Data Recording, at data rates sufficient to support ATLAS and CMS test-beam activities during 1997 and NA45 during their 1998 run,
3. Investigate and report on the impact of using an ODBMS for event data on end-users, including issues related to private and semi-private schema and collections, in typical scenarios including simulation, (re-)reconstruction and analysis.
RD45 Milestones - 98
1. Provide, together with the IT/PDP group, production data management services based on Objectivity/DB and HPSS with sufficient capacity to solve the requirements of ATLAS and CMS test beam and simulation needs, COMPASS and NA45 tests for their '99 data taking runs.
2. Develop and provide appropriate database administration tools, (meta-)data browsers and data import/export facilities, as required for (1).
3. Develop and provide production versions of the HepOODBMS class libraries, including reference and end-user guides.
4. Continue R&D, based on input and use cases from the LHC collaborations to produce results in time for the next versions of the collaborations' Computing Technical Proposals (end 1999).
Toward the 2001 Milestone: “Choice of ODBMS vendor”
“If the ODBMS industry flourishes it is very likely that by 2005 CMS will be able to obtain products, embodying thousands of man-years of work, that are well matched to its worldwide data management and access needs. The cost of such products to CMS will be equivalent to at most a few man-years. We believe that the ODBMS industry and the corresponding market are likely to flourish. However, if this is not the case, a decision will have to be made in approximately the year 2000 to devote some tens of man-years of effort to the development of a less satisfactory data management system for the LHC experiments.”
(CMS Computing Technical Proposal, section 3.2, page 22)
And the rest, as they say, is History…
Risk Analysis: Issues
Choice of Technology
ODBMS, ORDBMS, RDBMS, “light-weight” persistency, files + meta-data, ...
Choice of Vendor (historically)
#1 Objectivity, #2 Versant
Size of market
Did not take off as anticipated; unlikely to grow significantly in short-medium term
Risk Analysis and Actions
ODBMS market has not grown as predicted
Need to understand alternatives
Possibilities include:
“Open Source” (?) ODBMS solution
ORDBMS-based solution (also for event data)
“Hybrid solutions”, incl. meta-data + files
RD45 investigating directly, based on experience at FNAL / BNL ...
Essential to consider all requirements
And not just file I/O…
Espresso
Espresso is a proof-of-concept prototype built to answer questions from Risk Analysis
Could we build an alternative to Objectivity/DB?
How much manpower would be required?
Can we overcome limitations of Objy’s current architecture?
Support for VLDBs, multi-FD work-arounds etc.
Test / validate important architectural choices
Espresso – Lessons Learnt
Initial prototype suggests that building a full ODBMS is technically feasible
Discussions with other sites suggest that interest goes well beyond HEP
Manpower estimates / possible resources indicate “project” would have to start “soon”
2002: 3-year project with full system end-2004
ODBMS – In Retrospect
Used – in production – by several experiments at CERN, SLAC and other labs for a total of a few PB of data for just under a decade
It was – for some extended period – the baseline of ATLAS and CMS
Enhancements to the product obtained (with some effort)
MSS interface (actually much more general: xrootd); Linux port; VLDB support (ODMG compliance)
Much experience with ODBMS-like solutions obtained
“Risk analysis” clearly identified need for fallback, proof-of-concept prototype (Espresso) and eventually full solution (POOL)
Migration to Oracle+DATE (COMPASS) / POOL (LHC expts) successfully reported on at CHEP 2004 (scale of several hundred TB)
The Story So Far…
• 1992: CHEP – DB panel, CLHEP K/O, CVS …
• 1994: start of OO projects
• 1997: proposal of ODBMS+MSS; BaBar
• 2001: CMS change of baseline away from Objectivity
• Now: LCG Persistency Framework RTAG– Resulted in POOL project…
15. Observations – IT “Eloise” Retreat 2000
• Large volume event data storage and retrieval is a complex problem that the particle physics community has had to face for decades.
• The LHC data presents a particularly acute problem in the cataloguing and sparse retrieval domains, as the number of recorded events is very large and the signal to background ratios are very small. All currently proposed solutions involve the use of a database in one way or another.
• A satisfactory solution has been developed over the last years based on a modular interface complying with the ODMG standard, including C++ binding, and the Objectivity/DB object database product.
• The pure object database market has not had strong growth and the user and provider communities have expressed concerns. The “Espresso” software design and partial implementation, performed by the RD-45 collaboration, has provided an estimate of 15 person-years of qualified software engineers for development of an adequate solution using the same modular interface. This activity has completed, resulting in the recent snapshot release of the Espresso proof-of-concept prototype. No further development or support of this prototype is foreseen by DB group.
• Major relational database vendors have announced support for Object-Relational databases, including C++ bindings.
• Potentially this could fulfil the requirements for physics data persistency using a mainstream product from an established company.
• CERN already runs a large Oracle relational database service
Recommendation
• The conclusion of the Espresso project, that a HEP-developed object database solution for the storage of event data would require more resources than available, should be announced to the user community.
• The possibility of a joint project between Oracle and CERN should be explored to allow participation in the Oracle 9i beta test with the goals of evaluating this product as a potential fallback solution and providing timely feedback on physics-style requirements. Non-staff human resources should be identified such that there is no impact on current production services for Oracle and Objectivity.
Fellow, later also openlab resources
Oracle for Physics Data
Work on LHC Computing started ~1992 (some would say earlier…)
Numerous projects kicked off 1994/5 to look at handling multi-PB of data; move from Fortran to OO (C++) etc.
Led to production solutions from ~1997
Always said that ‘disruptive technology’, like the Web, would have to be taken into account
In 2002, major project started to move 350TB of data out of ODBMS solution; >100MB/s for 24 hour periods
Now ~2TB of physics data stored in Oracle on Linux servers. A few % of total data volume; expected to double in 2004 [ I guess it’s 10 x this by now? ]
Oracle 10g launch, ZH
LCG and Oracle
Current thinking is that bulk data will be streamed to [ ROOT ] files
RDBMS backend also being studied for ‘analysis data’
File catalog (10^9 files) and file-level metadata will be stored in Oracle in a Grid-aware catalog
[ This was the “RLS” family ]
In longer term, event level metadata may also be stored in the database, leading to much larger data volumes
A few PB, assuming total data volume of 100-200PB [ This was probably an overestimate – TAG : RAW ratio? ]
Current storage management system – CASTOR at CERN – also uses a [Oracle] database to manage the naming / location of files
Bulk data stored in tape silos and faulted in to huge disk caches
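A catalog at this scale is essentially a logical-to-physical name mapping with replicas. A minimal sketch (hypothetical schema and file names invented for illustration; the real RLS/LFC schemas differ, and the production backend was Oracle, with sqlite3 standing in here):

```python
# Minimal sketch of a grid file catalog: logical file names (LFNs) map
# via a GUID to one or more physical replicas (PFNs).
# Hypothetical schema and names -- not the real RLS/LFC layout.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE logical (guid TEXT PRIMARY KEY, lfn TEXT UNIQUE NOT NULL);
    CREATE TABLE replica (guid TEXT NOT NULL REFERENCES logical(guid),
                          pfn  TEXT NOT NULL);
""")
db.execute("INSERT INTO logical VALUES ('guid-0001', '/grid/expt/run1/events.root')")
db.executemany("INSERT INTO replica VALUES (?, ?)",
               [("guid-0001", "castor://tier0.example/events.root"),
                ("guid-0001", "dcache://tier1.example/events.root")])

# Resolve an LFN to all of its physical replicas
pfns = [row[0] for row in db.execute(
    """SELECT r.pfn FROM logical l JOIN replica r ON r.guid = l.guid
       WHERE l.lfn = ?""", ("/grid/expt/run1/events.root",))]
print(len(pfns))  # 2 replicas
```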
Jamie Shiers, November 2004 – Scientific and Enterprise Grid Infrastructures
Physics Research – Outlook
• Re-engineering all DBMS services for physics on Oracle 10g RAC
• Goals:
– Isolation – 10g ‘services’ and / or physical separation
– Scalability – in both database processing power and storage
– Reliability – automatic failover in case of problems
– Manageability – simplified administration
• We will return to this question later, in the ‘Enterprise Grids’ section…
Physics Activities - Futures
Re-engineering all DB services for Physics on Oracle 10g RAC
Goals are:
Isolation – 10g ‘services’ and / or physical separation
Scalability – in both database processing power and storage
Reliability – automatic failover in case of problems
Manageability – significantly easier to administer than now
Will revisit this under ‘Enterprise Grids’ later…
Oracle Grid Tech. day, Moscow
CERN & Oracle
Share a common vision regarding the future of high performance computing
Widespread use of commodity dual-processor PCs running Linux; focus on Grid computing
CERN has managed to influence the Oracle product
Oracle 10g features:
Support for native IEEE floats & doubles;
Support for “Ultra large” Databases (ULDB); 16 bit fields again!
Cross-platform transportable tablespaces;
Instant-client developer etc.
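The native IEEE support is the feature that matters most for physics data: a binary float round-trips bit-exactly through its byte representation, with no decimal conversion step. A small sketch (the value is an arbitrary example number):

```python
# An IEEE-754 double round-trips bit-exactly through its 8-byte
# representation -- the property a native binary float/double type gives.
import struct

value = 91.1876                        # an arbitrary example measurement
raw = struct.pack("<d", value)         # 8 bytes, IEEE 754 binary64
restored = struct.unpack("<d", raw)[0]

print(len(raw), restored == value)     # 8 True
```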
LHC DB Applications
Clear that many “LHC construction / operations” applications will use Oracle
This is true also for detector construction / monitoring / calibration applications
But the main change is closer to “physics applications”
There was no “general purpose” DB service for the physics community at the time of LEP
Some applications certainly (OPAL online tape DB, …)
But these are legion at the time of LHC… See hidden slides for some more info…
Moreover…
2D Workshop - Background
A number of discussions, initially with the LHC online community, revealed a significant number of new database applications, and developers, in the pipeline
Meeting in October 2004 with offline and online representatives from all 4 LHC experiments led to proposal of Database Developers’ Workshop
This is it!
Significant amount of preparation by many people
Please profit from it – and not the wireless network
Silencing your mobile phone would be much appreciated…
[ A free Tom Kyte book was given to all attendees]
(Workshop) Goals & Motivation
We are (finally) very close to LHC startup
Many new projects coming along
There is simply no time to waste!
Non-optimal (DB) applications are very expensive to ‘fix’ a posteriori
Try to establish some basic ‘best practice’ guidelines
And also a well defined dev → integration → production path
WLCG and Database Services
Many ‘middleware’ components require a database:
dCache – PostgreSQL (CNAF porting to Oracle?)
CASTOR / DPM / FTS* / LFC / VOMS – Oracle or MySQL
Some MySQL only: RB, R-GMA#, SFT#
Most of these fall into the ‘Critical’ or ‘High’ category at Tier0
See definitions below; T0 = C/H, T1 = H/M, T2 = M/L
Implicit requirement for ‘high-ish service level’ (to avoid using a phrase such as H/A…)
At this level, no current need beyond site-local+ services
Which may include RAC and / or DataGuard [ TBD together with service provider ]
Expected at AA & VO levels
*gLite 1.4 end October #Oracle version foreseen +R/O copies of LHCb FC?
Those Questions again…
1. Should we buy or build database systems for our calibration and book-keeping needs?
It now seems to be accepted that we build our calibration & book-keeping systems on top of a database system.
Both commercial and open-source databases are supported.
2. Will database technology advance sufficiently in the next 8 to 10 years to be able to provide byte-level access to petabytes of SSC/LHC data?
We (HEP) have run production database services up to the PB level. The issues related to licensing, and – perhaps more importantly – support, to cover the full range of institutes participating in an LHC experiment, remain.
Risk analysis suggests a more cautious – and conservative – approach, such as that currently adopted. (Who are today the concrete alternatives to the market leader?)
If you want to know more…
Visit http://hepdb.blogspot.com/
And also:
http://wwwasd.web.cern.ch/wwwasd/cernlib/rd45/
http://wwwasd.web.cern.ch/wwwasd/lhc++/indexold.html
http://hep-proj-database.web.cern.ch/hep-proj-database/
General Conclusions
“The past decade has been an era of sometimes tumultuous change in the area of Computing for High Energy Physics.”
This is certainly true – virtually no stone has been left unturned
Could we have got where we are now more directly?
With hindsight this might seem clear, but…
“Predicting is always very difficult. Especially predicting the future.”– Niels Bohr