Post on 17-Jan-2016
EDG Applications
The European DataGrid Project Team
http://www.eu-datagrid.org
EU DataGrid - Applications 2
EDG Application Areas
High Energy Physics
Biomedical Applications
Earth Observation Science Applications
High Energy Physics
4 experiments on the LHC, among them CMS, ATLAS and LHCb
~6-8 PetaBytes / year
~10^8 events / year
~10^3 batch and interactive users
CERN’s Network in the World
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
Data Flow in LHC
Real data: DAQ and Trigger -> RAW Data -> Reconstruction -> Event Summary Data (ESD) and Reconstruction Tags
Monte Carlo: Physics Generator -> Generator Data -> Detector Simulation -> RAWmc Data -> Reconstruction -> Event Summary Data (ESD) and Reconstruction Tags
Both chains also carry RAW (or RAWmc) Tags and Conditions / Calibration Data
Example: CMS Monte Carlo Production
CMS jobs description
CMKIN : MC Generation of the proton-proton interaction for a physics channel (dataset)
CMSIM: Detailed simulation of the CMS detector, processing the data produced during the CMKIN step
CMKIN job: writes its output data to a Grid Storage Element
CMSIM job: reads from a Grid Storage Element and writes its output data to a Grid Storage Element

         size/event    time*/event
CMKIN    ~0.05 MB      ~0.4-0.5 sec
CMSIM    ~1.8 MB       ~6 min
* PIII 1GHz 512MB 46.8 SI95
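The per-event figures above can be turned into a rough budget for a production run. The following is a minimal sketch, assuming the quoted sizes and times scale linearly with the number of events and taking 0.45 sec as the midpoint of the CMKIN range; the function name `estimate` is illustrative and not part of any CMS tool.

```python
# Per-event figures quoted on the slide (PIII 1 GHz, 512 MB, 46.8 SI95).
CMKIN_MB_PER_EVENT = 0.05
CMKIN_SEC_PER_EVENT = 0.45   # assumption: midpoint of the quoted 0.4-0.5 sec
CMSIM_MB_PER_EVENT = 1.8
CMSIM_SEC_PER_EVENT = 6 * 60  # ~6 min per event


def estimate(n_events):
    """Return (output size in GB, CPU time in hours) for CMKIN + CMSIM.

    Assumes linear scaling and 1 GB = 1000 MB.
    """
    total_mb = n_events * (CMKIN_MB_PER_EVENT + CMSIM_MB_PER_EVENT)
    total_sec = n_events * (CMKIN_SEC_PER_EVENT + CMSIM_SEC_PER_EVENT)
    return total_mb / 1000.0, total_sec / 3600.0
```

Calling `estimate(1000)` gives the storage and CPU cost of simulating a thousand events on the reference machine; CMSIM dominates both numbers.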
CMS production components interfaced to EDG middleware
Production is managed from the EDG User Interface with IMPALA/BOSS
CMS Virtual Organization server at NIKHEF (Amsterdam)
[Diagram: on the CMS side, RefDB supplies parameters to IMPALA/BOSS on the UI, alongside the BOSS DB; on the EDG side, JDL is passed to the Workload Management System, which serves CEs with CMS software installed and SEs. Arrows are labelled "Push data or info" and "Pull info".]
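The diagram shows job descriptions reaching the Workload Management System as JDL. As an illustration only, the sketch below renders a small JDL text from Python. The attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are the commonly documented JDL attributes and should be checked against the EDG JDL documentation; the helper `render_jdl` and the file names in the usage note are hypothetical and are not what IMPALA/BOSS actually generates.

```python
def render_jdl(executable, arguments, input_sandbox, output_sandbox):
    """Render a minimal JDL job description as text (illustrative sketch)."""

    def quoted_list(items):
        # JDL lists are written as {"a", "b"}.
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    # Always ship stdout and stderr back with the job output.
    outputs = list(output_sandbox) + ["job.out", "job.err"]
    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        "InputSandbox = %s;" % quoted_list(input_sandbox),
        "OutputSandbox = %s;" % quoted_list(outputs),
    ]
    return "\n".join(lines)
```

For example, `render_jdl("run_cmkin.sh", "25", ["run_cmkin.sh"], ["cmkin.ntpl"])` produces a six-line description for a hypothetical wrapper script; the script and output file names are placeholders.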
CMS production components interfaced to EDG middleware
CMKIN jobs running on all EDG Testbed sites with CMS software installed
CMSIM jobs running on a CE close to the input data
Produced data: scripts for batch replication to a dedicated SE
[Diagram: the same components as on the previous slide, with the Replica Manager added for data registration and input data location, and a WN shown behind a CE.]
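Running CMSIM on a CE close to its input data amounts to a data-locality match. The sketch below is not the EDG Resource Broker algorithm; it is a simplified stand-in that assumes three hypothetical dictionaries (replica locations per logical file name, the SEs considered close to each CE, and free CPUs per CE) and picks the least loaded CE that has a replica nearby.

```python
def pick_close_ce(input_lfn, replica_locations, close_ses, free_cpus):
    """Pick a CE that has a replica of input_lfn on one of its close SEs.

    replica_locations: dict mapping LFN -> set of SE names holding a replica
    close_ses:         dict mapping CE name -> set of SE names close to it
    free_cpus:         dict mapping CE name -> number of free CPUs
    Returns the chosen CE name, or None if no CE is close to the data.
    """
    holders = replica_locations.get(input_lfn, set())
    candidates = [ce for ce, ses in close_ses.items() if ses & holders]
    if not candidates:
        return None
    # Prefer the CE with the most free CPUs; break ties by name so the
    # result is deterministic.
    return max(candidates, key=lambda ce: (free_cpus.get(ce, 0), ce))
```

A real broker would also weigh queue lengths, software tags and access protocols; the point here is only that the replica lookup constrains the list of eligible CEs before ranking.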
CMS production components interfaced to EDG middleware
Job monitoring and bookkeeping: BOSS DBs, EDG Logging & Bookkeeping service
[Diagram: the same components, now also showing job output filtering and runtime monitoring.]
CMS use of the system (Statistics)
[Figures: CEs and SEs in use; number of events produced versus time]
Events Production within EDG is part of the Official CMS production
http://cmsdoc.cern.ch/cms/production/www/html/general/
Summary of CMS work and plans for use of EDG middleware
RESULTS
We can distribute and run CMS software in the EDG environment
We generated ~250K events for physics with ~10000 jobs in a 3-week period
OBSERVATIONS and PLANNING for the future
We were able to quickly add new sites to provide extra resources
There was a fast turnaround in bug fixing and installing new software
The stress test was labor intensive (since the software was still under development)
Release EDG 2.0 should fix the major problems and allow for enhanced scalability; we look forward to evaluating it and using it in our Data Challenge work
EDG EO challenge: Processing / validation of 1y of GOME data
Level 1: Raw satellite data from the GOME instrument (~75 GB, ~5000 orbits/y)
ESA (IT) – KNMI (NL): Processing of raw GOME data to ozone profiles. 2 alternative algorithms, ~28000 profiles/day
Level 2 (example of 1 day total O3)
IPSL (FR): Validate some of the GOME ozone profiles (~10^6/y), coincident in space and time with ground-based measurements. Visualization & analysis
LIDAR data (7 stations, 2.5 MB per month)
DataGrid environment
EO WebMap Portal
Processing Sequence
Components: Web Portal, EO Product Catalogue, EO Product Archive, EO Grid Engine, EO Replica Catalogue, EDG User Interface, EDG Resource Broker, EDG Computing Element, EDG Storage Element
1. Search Level-1 catalogue
2. Retrieve Level-2 products
3. Level-2 products already registered in RC?
   Yes? 4. Return available Level-2 products
   No? 5. Perform GRID processing on-the-fly
6. Transfer Level-1 data from Archive to the Grid
7. Register Level-1 data
8. Submit jobs to process Level-1 data
9. Process Level-1 data
10. Transfer Level-2 data to SE
11. Register Level-2 data
12. Return new Level-2 products
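The processing sequence is essentially a lookup in the replica catalogue with on-the-fly processing as the fallback. The sketch below mirrors steps 3 to 12 under simplifying assumptions: the replica catalogue and the product archive are plain dictionaries, and `process_level1` is a hypothetical callable standing in for the grid job. None of these names come from the EO Grid Engine itself.

```python
def get_level2_products(orbit, replica_catalogue, archive, process_level1):
    """Return Level-2 products for an orbit, processing on the fly if needed.

    replica_catalogue: dict acting as a stand-in for the EO Replica Catalogue
    archive:           dict mapping orbit -> Level-1 data (EO Product Archive)
    process_level1:    callable turning Level-1 data into Level-2 products
    """
    key = ("L2", orbit)
    if key in replica_catalogue:                # step 3: already registered?
        return replica_catalogue[key]           # step 4: return available products
    level1 = archive[orbit]                     # step 6: transfer from the archive
    replica_catalogue[("L1", orbit)] = level1   # step 7: register Level-1 data
    level2 = process_level1(level1)             # steps 8-9: submit and process
    replica_catalogue[key] = level2             # steps 10-11: store and register
    return level2                               # step 12: return new products
```

A second request for the same orbit is served from the catalogue without resubmitting any job, which is the point of registering Level-2 products in step 11.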
GOME Ozone Profile Validation
Goals of the DataGrid application: validate satellite data with all available ground-based data in an easy way:
Comparison of ozone profiles provided by the satellite with lidar data at different locations and times (see the web portal)
Statistical comparison and analysis in order to improve algorithms
[Figure: the ERS/GOME satellite and the lidar at the Haute Provence Observatory, with the ozone layer and altitude marks at 10 km and 50 km]
Validation Processing Sequence
Components: satellite data validation lidar site, GRID Portal, Level 2 Catalogue, Lidar data catalogue, GRID Computing Element, Storage Elements with Lidar data, Storage Elements with GOME L2 data
Steps 1 to 4 in the diagram cover:
Queries and data information retrieval from the Lidar metadata catalogue
Queries and data information retrieval from the GOME Level 2 orbit or pixel metadata catalogues
Submission of the job to the GRID
When completed, comparison between lidar and satellite ozone profiles
Validation Output
Figure 1: Estimation of the bias between GOME and lidar using one month of data.
Figure 2: Example of 2 profiles: comparison between the GOME profile and the lidar profile for 2 October 2000.
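The validation pairs GOME and lidar profiles that coincide in space and time and then estimates a bias. The slides do not give the coincidence criteria or the bias definition, so the sketch below makes assumptions that are purely illustrative: a 500 km distance limit, a 12 hour time window, profiles already interpolated to a common altitude grid, and bias defined as the mean relative difference in percent.

```python
import math
from datetime import timedelta


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def coincident(gome, lidar, max_km=500.0, max_dt=timedelta(hours=12)):
    """True if two profiles are close in space and time.

    The thresholds are hypothetical defaults, not values from the project.
    """
    dist = great_circle_km(gome["lat"], gome["lon"], lidar["lat"], lidar["lon"])
    return dist <= max_km and abs(gome["time"] - lidar["time"]) <= max_dt


def mean_bias(pairs):
    """Mean relative difference (GOME - lidar) / lidar per level, in percent.

    pairs: non-empty list of (gome, lidar) profiles sharing one altitude grid.
    """
    n_levels = len(pairs[0][0]["ozone"])
    bias = []
    for k in range(n_levels):
        diffs = [100.0 * (g["ozone"][k] - l["ozone"][k]) / l["ozone"][k]
                 for g, l in pairs]
        bias.append(sum(diffs) / len(diffs))
    return bias
```

Each profile here is a dictionary with `lat`, `lon`, `time` (a `datetime`) and `ozone` (a list of values per altitude level); on the grid, the pairing and the per-pair comparison could be split across jobs.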
Perspectives for Biomedical Applications
Grids open new perspectives in large scale genomics analysis
  Complete genome annotation
  Cross-genomes analysis
  Data mining on distributed databases
  Pipelining of huge automatic bio-informatics analysis
Medical image processing
  Large databases processing
  Anatomy and physiology modeling
  Epidemiological studies
Biomedical Applications
Bio-informatics
  Phylogenetics: BBE Lyon (T. Sylvestre)
  Search for primers: Centrale Paris (K. Kurata)
  Statistical genetics: CNG Evry (N. Margetic)
  Bio-informatics web portal: IBCP (C. Blanchet)
  Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq)
  Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez)
  Geometrical protein comparison: Univ. Padova (C. Ferrari)
Medical imaging
  MR image simulation: CREATIS (H. Benoit-Cattin)
  Medical data and metadata management: CREATIS (J. Montagnat)
  Mammographies analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed)
  Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)
Legend: applications deployed, applications tested on EDG, applications under preparation
Medical Imaging
Data: medical images, metadata
Metadata fields: LFN, image, patient, hospital, ...
1. Query
2. Visualisation
3. Similarity search
4. Scores
5. Best results visualisation
Graphic layer
Job Monitoring
Grid File Browsing
File registration and retrieval
Graphical Interfaces
Image registration
Image retrieval
Local files, Grid files, Metadata
Query over metadata, Query result
Image Registration
LFN image patient hospital ...
Imager
SE
Similarity search
Similarity computation
Results visualization
Job monitoring, ranked list of images
Source image, most similar images, low score images
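The similarity search compares a source image against stored images and returns them ranked by score. The slides do not say which similarity measure is used, so the sketch below substitutes normalised correlation on flattened pixel lists as a stand-in; the function names and the LFN keys are illustrative. On the grid, each score could be computed by a separate job and the ranking done once all scores are back.

```python
def similarity(a, b):
    """Normalised correlation between two equally sized, flattened images.

    Returns a value in [-1, 1]; 0.0 if either image is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def rank_images(source, candidates):
    """Rank candidate images by similarity to the source, best first.

    candidates: dict mapping an LFN to its flattened pixel list.
    Returns a list of (lfn, score) tuples.
    """
    scores = [(lfn, similarity(source, pixels)) for lfn, pixels in candidates.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

The head of the returned list corresponds to the "most similar images" panel and the tail to the "low score images" panel.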
Future: Interfacing medical data with the Grid
[Diagram of a medical server linking a medical (trusted) site to the grid middleware]
Domains: Medical (trusted) site, Grid middleware
Components: Imager, Medical server, core, Client 1 interface, Client 2 interface, grid-server interface, RS interface, RC interface, Metadata interface, Storage Element, Storage Element (MSS), Replica Catalog, Replication Service
Data handling: header blanking, encryption; Master File, Replica
File metadata: ACL, size, checksum, ...
Application metadata: ACL, encryption key, sensitive metadata, ...
Parallel Processing
Magnetic Resonance Images simulation using the grid
3 levels of parallelism:
Parallel isochromat computations
Multi-slice MRI computation
Parallel magnetization kernel
Simulation chain: virtual object and MRI sequence -> magnetisation computation kernel -> reconstruction algorithm -> MRI image
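Of the three levels of parallelism, the multi-slice level is the easiest to illustrate because slices are independent. The sketch below is not the project's magnetisation kernel: it assumes a toy voxel model using the textbook spin-echo signal equation, and it uses a local thread pool as a stand-in for distributing slices to grid jobs. All names and the default TR/TE values are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def spin_echo_signal(rho, t1, t2, tr, te):
    """Textbook spin-echo signal for one voxel (times in ms)."""
    return rho * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)


def simulate_slice(slice_voxels, tr=500.0, te=20.0):
    """Compute the signal of every voxel in one slice.

    slice_voxels: list of (rho, T1, T2) tuples, one per voxel.
    """
    return [spin_echo_signal(rho, t1, t2, tr, te) for rho, t1, t2 in slice_voxels]


def simulate_volume(slices, max_workers=4):
    """Simulate all slices concurrently; results keep the input slice order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate_slice, slices))
```

Because each slice depends only on its own voxels, the same decomposition maps onto one grid job per slice, with the finer isochromat and kernel levels parallelised inside each job.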
Summary
Use Cases
High Energy Physics
Earth Observation
Biomedical Applications
Further Information
High Energy Physics
http://datagrid-wp8.web.cern.ch/DataGrid-WP8/
Bio-Informatics
http://marianne.in2p3.fr/datagrid/wp10/index.html
Earth Observation
http://styx.esrin.esa.it/grid/