Grid Tutorial 2008, SURFnet – November 2008 1
Use Case 1: Sciamachy Data Center
SciaGrid project results
Wim Som de Cerff, John van de Vegte, Richard van Hees, David Groep, Jan Just Keijser, Maurice Bouwhuis, Pieter van Beek
Content
What is the NL-SCIA-DC?
Why Grid?
Implementation
Results and outlook
What is Sciamachy?
SCIAMACHY is a passive imaging spectrometer: SCanning Imaging Absorption spectroMeter for Atmospheric CartograpHY
Satellite instrument on the ESA ENVISAT satellite
Objective is to perform global measurements of trace gases (e.g. ozone, NO2, CH4) and aerosols in the troposphere and in the stratosphere.
The solar radiation transmitted, backscattered and reflected from the atmosphere is recorded at relatively high resolution (0.2 nm to 0.5 nm) over the range 240 nm to 1700 nm, and in selected regions between 2000 nm and 2400 nm.
SCIAMACHY has three different viewing geometries: nadir, limb, and sun/moon occultation. Together these yield total column values as well as distribution profiles in the stratosphere and (in some cases) the troposphere for trace gases and aerosols.
Sciamachy product examples
[Figure: Ozone hole, Southern Hemisphere (October 2008)]
Why is NL-SCIA-DC needed?
Complementing ESA’s distribution facilities
User need for fast and complete access to GOME and SCIAMACHY data
Supporting the development of Dutch algorithms
Distribution of Dutch data products
Domain specific search/query capabilities
Goals for NL-SCIA-DC
Provide to the users:
Access to Sciamachy, GOME, MERIS and AATSR data
Selection methods, for easy selection of data
Downloading of selected datasets and products
Deployment of Dutch data products
Test environment for new data processors
(fast) dataset (re)processing capabilities
Overview of NL-SCIA-DC
Tape Archive
Data
GOME level 1b, 2: from 1996 up to now
  1.5 Terabyte of data
  metadata and data products databases
  All pixels can be queried and browsed
Sciamachy level 0, 1 and 2: from 2002 up to now
  40 Terabyte of data, and growing
  metadata and data products database
  Accessible through catalogue, including extracted metadata
  All pixels can be queried and browsed
  Archive and metadata database are automatically updated (satellite dish, ftp, DVD)
All data can be accessed online
  Via browser or application
Data in the product databases
Instrument  Product   Coverage      #files  Data policy
GOME        level 1   full-mission  60650   ESA registration
GOME        level 2   full-mission  59483   ESA registration
GOME        Fresco    1996          5403    Freely available
GOME        TOGOMI    1996          5389    Freely available
Sciamachy   level 0   full-mission  58223   Restricted
Sciamachy   level 1b  full-mission  83520   ESA registration
Sciamachy   level 2   full-mission  27895   ESA registration
Sciamachy   Fresco    2007          4616    Freely available
Sciamachy   TOSOMI    2007-01       459     Freely available
Sciamachy   IMLM      2004          4245    Freely available
Sciamachy   IMAP      2006-03       400     Freely available
Meris                 2003-2008     26490   G-POD user
• PostgreSQL 8.3 with PostGIS extension used
• Database is now 112 Gbyte and growing
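A pixel query against a PostGIS-enabled database of this kind might look roughly like the sketch below. The table name `scia_pixels` and the columns `orbit`, `obs_time` and `footprint` are hypothetical, since the slides do not show the schema; only the PostGIS functions `ST_Intersects` and `ST_GeomFromText` are real. The code only builds a parameterised SQL string that could be handed to a PostgreSQL driver; it does not connect to a database.

```python
from datetime import date


def bbox_wkt(lon_min, lat_min, lon_max, lat_max):
    """Return a closed WKT polygon for a longitude/latitude bounding box."""
    corners = [
        (lon_min, lat_min),
        (lon_max, lat_min),
        (lon_max, lat_max),
        (lon_min, lat_max),
        (lon_min, lat_min),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


def build_pixel_query(lon_min, lat_min, lon_max, lat_max, start, end):
    """Build SQL and parameters selecting pixels that intersect a box.

    Table and column names are assumptions made for illustration only.
    """
    sql = (
        "SELECT orbit, obs_time "
        "FROM scia_pixels "
        "WHERE obs_time >= %s AND obs_time < %s "
        "AND ST_Intersects(footprint, ST_GeomFromText(%s, 4326))"
    )
    params = (start, end, bbox_wkt(lon_min, lat_min, lon_max, lat_max))
    return sql, params


# Example: pixels over a made-up box for October 2008.
sql, params = build_pixel_query(
    3.0, 50.5, 7.5, 54.0, date(2008, 10, 1), date(2008, 11, 1)
)
print(sql)
print(params)
```

Passing the geometry as a bound parameter rather than pasting it into the SQL text keeps the query safe to reuse with user-supplied regions.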
Users
The NL-SCIA-DC has 120 registered users from 22 countries, from 71 different organizations.
Bulk data users: data is delivered directly to them by sftp. Current bulk data users (standing order) are KNMI, SRON, BIRA (Belgium), University of Heidelberg (Germany) and ISAO (Italy).
TEMIS (ESA): the Tropospheric Emission Monitoring Internet Service (TEMIS) aims to compute and deliver global concentrations of tropospheric trace gases, and aerosol and UV products derived from observations of nadir-viewing satellite instruments such as GOME, SCIAMACHY and (A)ATSR. TEMIS is part of the Data User Programme (DUP) of the European Space Agency (ESA). The service of TEMIS centres around four themes: air pollution monitoring, UV radiation monitoring, support to Protocol monitoring, support to aviation control.
PROMOTE (GMES): to deliver the Atmosphere GMES Service Element, a sustainable and reliable operational service to support informed decisions on the atmospheric policy issues of stratospheric ozone depletion, surface UV exposure, air quality and climate change.
User interface
‘classic’ client – server
Java Applet
Search, process, download
Why Grid?
Datasets are large and not easily downloaded to a workstation
Users want to run their algorithm on a larger set of Sciamachy data
Running an algorithm on a large set takes too long on a single workstation
Algorithms are mostly embarrassingly parallel: very well suited to run in a Grid environment!
Also very interesting for reprocessing of data
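Embarrassingly parallel here means that each orbit file can be processed without reference to any other, so the work splits cleanly across workers. A minimal sketch, assuming a hypothetical `retrieve` function and made-up orbit numbers; the local thread pool merely stands in for Grid worker nodes:

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve(orbit):
    """Hypothetical per-orbit algorithm: depends only on its own input."""
    # Placeholder computation standing in for a real retrieval.
    return orbit, sum(range(orbit % 1000))


def process_orbits(orbits, workers=4):
    """Run `retrieve` on every orbit independently and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order; no task waits on another's output.
        return dict(pool.map(retrieve, orbits))


orbits = [34000, 34001, 34002, 34003]  # made-up orbit numbers
results = process_orbits(orbits)
print(results)
```

Because the tasks share no state, the parallel result is identical to what a plain serial loop would produce; only the elapsed time changes when the workers are real, separate machines.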
[Diagram: processing chains. Inputs: SCIA State, SCIA consolidated L0, SCIA offline L2, SCIA consolidated L2 and SCIA L2 CH4, each followed by a Meta DB ingest step (labels: DMOP, L0 orbit, L2 orbit). CH4 steps: CH4DailyCatter, ASCII2NC, CH4 PLOT (IDL), CH4MontlyAVG, with intermediate products L2 orbit CH4, L2 daily, NC, L3 monthly and pictures.]
Sciamachy chains for metadata extraction, CH4 level 2, level 2 daily average, level 3 daily and plot processing
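The CH4 part of these chains (per-orbit level 2 data gathered into daily sets, then averaged per month) could be sketched as below. The record layout of (date, value) pairs, the function names and the sample numbers are all assumptions for illustration; this is not the code of the actual processors named in the diagram.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def gather_daily(orbit_records):
    """Group per-orbit (date, value) records by calendar day."""
    daily = defaultdict(list)
    for day, value in orbit_records:
        daily[day].append(value)
    return dict(daily)


def monthly_average(daily):
    """Average each day's values, then average those daily means per month."""
    per_month = defaultdict(list)
    for day, values in daily.items():
        per_month[(day.year, day.month)].append(mean(values))
    return {month: mean(day_means) for month, day_means in per_month.items()}


# Made-up CH4 values for three orbits on two days.
records = [
    (date(2004, 6, 1), 1700.0),
    (date(2004, 6, 1), 1720.0),
    (date(2004, 6, 2), 1750.0),
]
daily = gather_daily(records)
monthly = monthly_average(daily)
print(monthly)
```

Each stage consumes only the output of the previous one, which is why such a chain can be rerun from any intermediate product when data are reprocessed.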
SciaGrid Project
Together with NIKHEF and SARA
NIVR GO financed project
Aim: ‘Griddify’ the NL-SCIA-DC
Share archives and databases at KNMI and SRON
Make data accessible for resources at NIKHEF and SARA (Grid)
Run NL-SCIA-DC jobs on Grid infrastructure, using the NL-SCIA-DC GUI
In the project:
Experiments with Storage Resource Broker (SRB)
Robot certificate
Pilot job engine
Overview
[Diagram: NL-SCIA-DC architecture by domain.
User domain: NL-SCIA-DC interactive users, NL-SCIA-DC bulk data users, PROMOTE/TEMIS interactive users
Data source domain: FTP site DLR, satellite receiver, DVD, FTP site ESRIN, FTP site X
WWW domain: WWW PROMOTE / TEMIS, web client NL-SCIA-DC, WWW NL-SCIA-DC
GRID domain: Grid FTP servers, processing scheduler, SARA/Nikhef resources
KNMI domain: data ingest, data distribution, product processing, instrument data, instrument metadata, NL-SCIA-DC server
Shared domain: NL-SCIA-DC metadata, control, NADC
SRON domain: data ingest, data distribution, product processing, instrument data, instrument metadata, NL-SCIA-DC server
Other labels: NL-SCIA-DC Intermediate, Grid Process Request]
Results SciaGrid
SRB did not solve our problem. Drawbacks:
  Adding an existing archive is not easy
  Licensing of SRB
  Future?
Solved the metadata part in another way; Grid FTP selected for data access
Certificates: NL-SCIA-DC has the (first issued) robot certificate! Users can use their own NL-SCIA-DC login to submit jobs
Pilot Job framework used:
  Gain better successful submission ratios
  Minimize Grid component installations at KNMI/SRON
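The pilot-job idea can be sketched as follows: instead of submitting one Grid job per work item, a few pilot jobs are submitted, and each pilot that actually starts keeps pulling work tokens from a shared pool until the pool is empty. A pilot that never starts then costs no work items, which is one way to read the better submission ratio. The in-memory queue below is only a stand-in for a token-pool server; it does not reproduce any real server API, and all names are hypothetical.

```python
import queue
import threading


def pilot(name, tokens, done, lock):
    """A pilot job: fetch tokens until the pool is empty, processing each."""
    while True:
        try:
            token = tokens.get_nowait()
        except queue.Empty:
            return  # no work left, so the pilot exits
        # Placeholder for running the real processing on this work item.
        with lock:
            done.append((name, token))


def run_pilots(work_items, n_pilots=3):
    """Fill the pool with tokens and let n_pilots drain it concurrently."""
    tokens = queue.Queue()
    for item in work_items:
        tokens.put(item)
    done, lock = [], threading.Lock()
    pilots = [
        threading.Thread(target=pilot, args=(f"pilot-{i}", tokens, done, lock))
        for i in range(n_pilots)
    ]
    for p in pilots:
        p.start()
    for p in pilots:
        p.join()
    return done


work = [f"orbit-{n}" for n in range(10)]  # made-up work items
done = run_pilots(work)
print(sorted(token for _, token in done))
```

Every token is handed out exactly once, regardless of how many pilots run or which of them picks it up.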
SciaGrid setup
[Diagram: components: NL-SCIA-DC with the NADC processing suite, NL-SCIA-DC fileserver, NL-SCIA-DC GUI, NL-SCIA-DC robot certificate server, TOPOS token-pool server, Grid RB/WMS, and Grid worker nodes on the EGEE Grid. Connection labels: GridFTP, HTTPS, GRAM, SOAP, scp.]
[Diagram: the NL-SCIA-DC overview by domain (same as on the Overview slide), here with a status panel: Status NL-SCIA-DC: Available; Debug…]
Summary and outlook
Grid experiment was successful: connection to the Grid established
Data is accessible at Grid resources
Jobs can be submitted using the NL-SCIA-DC GUI
Release of the user interface asap, so users can actually use the new functionality
NL-SCIA-DC operations in SciaVisie project
Grid component expanded in Big Grid (?)