
Grid Tutorial 2008, SURFnet – November 2008

Use Case 1: Sciamachy Data Center

SciaGrid project results

Wim Som de Cerff, John van de Vegte, Richard van Hees, David Groep, Jan Just Keijser, Maurice Bouwhuis, Pieter van Beek


Content

What is the NL-SCIA-DC?

Why Grid?

Implementation

Results and outlook


What is Sciamachy?

SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CartograpHY) is a passive imaging spectrometer

Satellite instrument on the ESA ENVISAT satellite

The objective is to perform global measurements of trace gases (e.g. ozone, NO2, CH4) and aerosols in the troposphere and in the stratosphere.

The solar radiation transmitted, backscattered and reflected from the atmosphere is recorded at relatively high resolution (0.2 nm to 0.5 nm) over the range 240 nm to 1700 nm, and in selected regions between 2000 nm and 2400 nm.

SCIAMACHY has three different viewing geometries: nadir, limb, and sun/moon occultation. Together these yield total column values as well as distribution profiles in the stratosphere and (in some cases) the troposphere for trace gases and aerosols.


Sciamachy product examples

Figure: Ozone hole, Southern Hemisphere (October 2008)


Why is NL-SCIA-DC needed?

Complementing ESA’s distribution facilities

User need for fast and complete access to GOME and SCIAMACHY data

Supporting the development of Dutch algorithms

Distribution of Dutch data products

Domain specific search/query capabilities


Goals for NL-SCIA-DC

Provide to the users:

Access to Sciamachy, GOME, MERIS and AATSR data

Selection methods, for easy selection of data

Downloading of selected datasets and products

Deployment of Dutch data products

Test environment for new data processors

(fast) dataset (re)processing capabilities


Overview of NL-SCIA-DC

Tape Archive


Data

GOME level 1b and 2:

• From 1996 up to now

• 1.5 Terabyte of data

• Metadata and data products databases

• All pixels can be queried and browsed

Sciamachy level 0, 1 and 2:

• From 2002 up to now

• 40 Terabyte of data, and growing

• Metadata and data products database

• Accessible through catalogue, including extracted metadata

• All pixels can be queried and browsed

• Archive and metadata database are automatically updated (satellite dish, ftp, DVD)

All data can be accessed online, via browser or application


Data in the product databases

| Instrument | Product | Coverage | #files | Data policy |
|---|---|---|---|---|
| GOME | level 1 | full-mission | 60650 | ESA registration |
| GOME | level 2 | full-mission | 59483 | ESA registration |
| GOME | Fresco | 1996 | 5403 | Freely available |
| GOME | TOGOMI | 1996 | 5389 | Freely available |
| Sciamachy | level 0 | full-mission | 58223 | Restricted |
| Sciamachy | level 1b | full-mission | 83520 | ESA registration |
| Sciamachy | level 2 | full-mission | 27895 | ESA registration |
| Sciamachy | Fresco | 2007 | 4616 | Freely available |
| Sciamachy | TOSOMI | 2007-01 | 459 | Freely available |
| Sciamachy | IMLM | 2004 | 4245 | Freely available |
| Sciamachy | IMAP | 2006-03 | 400 | Freely available |
| Meris | | 2003-2008 | 26490 | G-POD user |

• PostgreSQL 8.3 with PostGIS extension used

• Database is now 112 Gbyte and growing
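To illustrate what a domain-specific pixel query against a PostgreSQL/PostGIS metadata database might look like, here is a minimal sketch that builds a parameterized bounding-box and time-range query. The table and column names (`scia_pixels`, `footprint`, `obs_time`, `orbit`) and the function name are hypothetical and do not describe the actual NL-SCIA-DC schema. `ST_Intersects` and `ST_MakeEnvelope` are standard functions in current PostGIS releases, and the `%s` placeholders follow the psycopg parameter style.

```python
from datetime import datetime


def build_pixel_query(lon_min, lat_min, lon_max, lat_max, start, end):
    """Return (sql, params) selecting pixels whose footprint intersects a lon/lat box.

    All table and column names are hypothetical placeholders.
    """
    if lon_min > lon_max or lat_min > lat_max:
        raise ValueError("bounding box minimum exceeds maximum")
    sql = (
        "SELECT orbit, obs_time "
        "FROM scia_pixels "
        "WHERE ST_Intersects(footprint, ST_MakeEnvelope(%s, %s, %s, %s, 4326)) "
        "AND obs_time >= %s AND obs_time < %s "
        "ORDER BY obs_time"
    )
    # Values are passed separately so the database driver can quote them safely.
    params = (lon_min, lat_min, lon_max, lat_max, start, end)
    return sql, params


# Example: a box roughly covering the Netherlands, January 2007.
sql, params = build_pixel_query(3.0, 50.5, 7.5, 53.7,
                                datetime(2007, 1, 1), datetime(2007, 2, 1))
print(sql)
print(params)
```

With a database connection, the pair would typically be passed to a cursor as `cursor.execute(sql, params)`.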


Users

The NL-SCIA-DC has 120 registered users from 22 countries, from 71 different organizations.

Bulk data users. Data is delivered directly to them by sftp. Current bulk data users (standing order) are KNMI, SRON, BIRA (Belgium), University of Heidelberg (Germany) and ISAO (Italy).

TEMIS (ESA): The Tropospheric Emission Monitoring Internet Site (TEMIS) aims to compute and deliver global concentrations of tropospheric trace gases, and aerosol and UV products derived from observations of nadir-viewing satellite instruments such as GOME, SCIAMACHY and (A)ATSR. TEMIS is part of the Data User Programme (DUP) of the European Space Agency (ESA). The TEMIS service centres on four themes: air pollution monitoring, UV radiation monitoring, support to protocol monitoring, and support to aviation control.

PROMOTE (GMES): To deliver the Atmosphere GMES Service Element, a sustainable and reliable operational service to support informed decisions on the atmospheric policy issues of stratospheric ozone depletion, surface UV exposure, air quality and climate change.


User interface

‘Classic’ client-server

Java Applet

Search, process, download


Why Grid?

Datasets are large and not easily downloaded to a workstation

Users want to run their algorithm on a larger set of Sciamachy data

Running an algorithm on a large set takes too long on a single workstation

Algorithms are mostly embarrassingly parallel: very much suited to run in a Grid environment!

Also very interesting for reprocessing of data
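"Embarrassingly parallel" here means that each orbit (or file) can be processed without any information from the others, so the work splits into independent tasks. The sketch below shows the pattern on a single machine with a thread pool; on a Grid, each call would instead become a separate job. The function names and the placeholder computation are assumptions for illustration only, not the actual retrieval algorithms.

```python
from concurrent.futures import ThreadPoolExecutor


def process_orbit(orbit_number):
    """Stand-in for a retrieval algorithm working on a single orbit file."""
    # Hypothetical placeholder computation; a real job would read a data file.
    return orbit_number, orbit_number % 7


def process_orbits(orbit_numbers, max_workers=4):
    """Run process_orbit on every orbit independently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # No task depends on another, so the order of execution does not matter.
        return dict(pool.map(process_orbit, orbit_numbers))


results = process_orbits(range(30000, 30010))
print(len(results))
```

Because the tasks share no state, adding workers (or Grid nodes) shortens the wall-clock time without changing the code of the algorithm itself.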

Labels recovered from the figure:

Inputs: SCIA State, SCIA consolidated L0, SCIA offline L2, SCIA consolidated L2, SCIA L2 CH4

Metadata steps: SCIA State Meta DB ingest, SCIA L0 Meta DB ingest, SCIA L2 Meta DB ingest, SCIA CH4 Meta DB ingest

CH4 processing steps: CH4DailyCatter, ASCII2NC, CH4MonthlyAVG, CH4 PLOT (IDL)

Data items: DMOP, L0 orbit, L2 orbit, L2 daily, L3 monthly, NC, Picture

Figure: Sciamachy chains for metadata extraction, CH4 level 2, level 2 daily average, level 3 daily and plot processing
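The averaging step in such a chain can be illustrated with a minimal sketch that groups daily values by month and averages them. The input format (a mapping from date to a single daily value), the function name and the numbers are assumptions made for illustration; the actual chain operates on data files and is not reproduced here.

```python
from collections import defaultdict
from datetime import date


def monthly_average(daily_means):
    """Average daily values per (year, month). Input: {date: float}."""
    buckets = defaultdict(list)
    for day, value in daily_means.items():
        buckets[(day.year, day.month)].append(value)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}


# Synthetic values, chosen only to make the arithmetic easy to follow.
daily = {
    date(2004, 1, 1): 1.0,
    date(2004, 1, 2): 3.0,
    date(2004, 2, 1): 5.0,
}
monthly = monthly_average(daily)
print(monthly)
```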


SciaGrid Project

Together with NIKHEF and SARA

NIVR GO financed project

Aim: ‘Griddify’ the NL-SCIA-DC

Share archives and databases at KNMI and SRON

Make data accessible for resources at NIKHEF and SARA (Grid)

Run NL-SCIA-DC jobs on Grid infrastructure, using the NL-SCIA-DC GUI

In the project:

Experiments with Storage Resource Broker (SRB)

Robot certificate

Pilot job engine


Overview

Labels recovered from the architecture diagram, grouped by domain:

User domain: NL-SCIA-DC interactive users, NL-SCIA-DC bulk data users, PROMOTE/TEMIS interactive users

Data source domain: FTP site DLR, satellite receiver, DVD, FTP site ESRIN, FTP site X

WWW domain: WWW PROMOTE / TEMIS, web client NL-SCIA-DC, WWW NL-SCIA-DC

GRID domain: Grid FTP servers, Processing Scheduler, SARA/Nikhef resources

KNMI domain: Data Ingest, Data Distribution, Product Processing, Instrument data, Instrument metadata

Shared domain: NL-SCIA-DC metadata, Control, NADC

SRON domain: Data Ingest, Data Distribution, Product Processing, Instrument data, Instrument metadata

Other labels: NL-SCIA-DC Server (twice), NL-SCIA-DC Intermediate, Grid Process Request (twice)


Results SciaGrid

SRB did not solve our problem. Drawbacks:

• Adding an existing archive is not easy

• Licensing of SRB

• Future?

Solved the metadata part in another way; GridFTP selected for data access

Certificates: NL-SCIA-DC has the (first issued) robot certificate! Users can use their own NL-SCIA-DC login to submit jobs

Pilot job framework used to:

• Gain better successful submission ratios

• Minimize Grid component installations at KNMI/SRON
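The pilot job idea can be sketched as follows: rather than submitting one Grid job per task, generic pilot jobs are submitted, and each pilot that actually starts on a worker node pulls task tokens from a central pool until the pool is empty. In this sketch an in-memory `queue.Queue` stands in for the token pool server and threads stand in for worker nodes. All names (`run_pilot`, the token strings) are hypothetical; this illustrates the pattern only and is not the SciaGrid implementation.

```python
import queue
import threading


def run_pilot(pilot_id, token_pool, done):
    """Pull tokens until the pool is empty; each token names one unit of work."""
    while True:
        try:
            token = token_pool.get_nowait()
        except queue.Empty:
            return  # nothing left to do, so the pilot exits
        # Hypothetical work step: a real pilot would fetch data and run a processor.
        done.append((pilot_id, token))


# Fill the pool with one token per task (here: ten made-up orbit identifiers).
token_pool = queue.Queue()
for orbit in range(100, 110):
    token_pool.put(f"orbit-{orbit}")

done = []  # list.append is thread-safe in CPython
pilots = [threading.Thread(target=run_pilot, args=(i, token_pool, done))
          for i in range(3)]
for p in pilots:
    p.start()
for p in pilots:
    p.join()

print(len(done))
```

A pilot that never starts simply takes no token, so a failed submission does not lose a task; the remaining pilots pick up the work.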


SciaGrid setup

Labels recovered from the setup diagram:

Components: NL-SCIA-DC GUI, NL-SCIA-DC Robot Certificate server, NL-SCIA-DC Fileserver, NL-SCIA-DC NADC processing suite, TOPOS Token-pool Server, Grid RB/WMS, Grid Worker Nodes (three shown), The EGEE Grid

Protocols: GridFTP, HTTPS, GRAM, SOAP, scp


Overview diagram repeated (same domains and components as on the earlier Overview slide), with added status labels:

Status NL-SCIA-DC: Available

Debug…


Summary and outlook

Grid experiment was successful: connection to the Grid established

Data is accessible at Grid resources

Jobs can be submitted using the NL-SCIA-DC GUI

Release of the user interface as soon as possible, so users can actually use the new functionality

NL-SCIA-DC operations in SciaVisie project

Grid component expanded in Big Grid (?)