european grid of solar observations egso european grid of solar observations simon martin rutherford...
TRANSCRIPT
European Grid of Solar Observations
EGSO: European Grid of Solar Observations
Simon Martin, Rutherford Appleton Laboratory
CDS Users Meeting, NAM 2003
Outline
Introduction
  Problem, EGSO, Grids
EGSO details
  Enhanced solar catalogues
  Processing data sets
  CDS/Spectral data
Summary
  Progress
  Conclusions
  Questions/suggestions
Generic Problem Of Solar Physics
Observations used to build up a picture of the plasma in multi-dimensional parameter space (incl. x, y, z, t, T & )
Users need access to as many wavelengths as possible
Data centres and observatories located around the world
Difficult to find and access all relevant data
  Very heterogeneous data sets/catalogues
  Increasing data volumes (ILWS > 1.0 TB/day)
  Large and small data centres (with varying resources)
Users scattered around the world
  Do not need to know where the data is located
  Capabilities/resources of users' computing vary greatly
Virtual Solar Observatories – A Solution
Make all archives ‘speak the same language’
  Consistent UI, search and analysis tools
Some objectives:
  Users are aware of, and have access to, all available data
  Searches for required data can be made using metadata alone – the data are only accessed later
  The results of a search can be used to retrieve data from several sources simultaneously, preferably in a processed form
  New data can be added with little or no impact on the users
Several related projects in the solar community:
  The European Grid of Solar Observations (EGSO, EC funded)
  US Virtual Solar Observatory (US-VSO, funded by NASA)
  Sun-Earth Connector (CoSEC, funded by NASA under ILWS)
  Approach and emphasis of the projects differ
EGSO is the largest and is collaborating closely with the others
EGSO
EGSO is a grid ‘testbed’ which will federate solar data centres, forming a single ‘virtual archive’
EGSO will lay the foundations for a ‘virtual solar observatory’
EGSO will provide tools for searching and analysing this solar data:
EGSO will improve access to solar data
  Users do not need knowledge of individual archives
Ten partners in Europe and the US, led by UCL-MSSL
  EC funded project
  3 in UK, 2 in France, 2 in Italy, 1 in Switzerland, 2 in US
  Several associate partners, mainly in the US
  EGSO and the US VSO planning to collaborate closely
Grids
A grid is essentially a network of shared computers and devices (grid resources)
  Grid resources are heterogeneous and distributed
  Resources include supercomputers, laptops, desktops, handheld devices, instruments…
To a user, these resources are available in a simple, transparent and secure way
  The grid deals with locations, security and heterogeneity on behalf of the user
  Analogy with electricity grids
Grids (continued)
Grids must be scalable, fault tolerant, secure
Types of Grids:
  Computational, e.g. SETI
  Data, e.g. EU DataGrid
  Service: service not provided by a single machine, e.g. MRI scanner
  Collaborative, e.g. Access grids
EGSO is largely a data grid, but also a service grid
Obtaining Data From EGSO
1. Identify suitable observations
  Search EGSO catalogues using GUI, search & visualisation tools
  Refine search
2. Locate the data
  Grid locates and accesses data
3. Process the data
  Extraction and calibration of data
  Custom processing
4. Retrieve the data
  Data returned to user
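The four steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the catalogue entries, archive names and the `fraction_kept` parameter are invented for the example and are not part of EGSO.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    instrument: str
    start: str        # ISO 8601 start time
    provider: str     # hypothetical archive name
    size_mb: float    # size of the raw data set


# Toy catalogue standing in for the EGSO metadata catalogues (invented entries).
CATALOGUE = [
    Observation("CDS", "2003-04-01T10:00:00", "archive-a", 120.0),
    Observation("EIT", "2003-04-01T10:05:00", "archive-b", 2.0),
    Observation("CDS", "2003-04-02T09:00:00", "archive-a", 95.0),
]


def identify(catalogue, instrument, day):
    """Step 1: search metadata only; no data files are touched."""
    return [o for o in catalogue
            if o.instrument == instrument and o.start.startswith(day)]


def locate(obs):
    """Step 2: resolve where the data live, on behalf of the user."""
    return f"{obs.provider}/{obs.instrument}/{obs.start}"


def process(obs, fraction_kept=0.1):
    """Step 3: extract/calibrate at source; returns the size (MB) left to transfer."""
    return obs.size_mb * fraction_kept


def retrieve(catalogue, instrument, day):
    """Step 4: return (location, size in MB) for each processed product."""
    return [(locate(o), process(o)) for o in identify(catalogue, instrument, day)]
```

Calling `retrieve(CATALOGUE, "CDS", "2003-04-01")` returns one location with a reduced transfer size; the user never names an archive directly.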
Identifying Suitable Observations (1)
In order to provide an enhanced search capability, EGSO will improve the quality and availability of metadata
  Enhanced “cataloguing” describes the data more fully
Standardized metadata versions of observing catalogues tie together the heterogeneous data sets from different instruments
New types of catalogue allow searches on events, features and phenomena rather than just date & time, pointing, etc.
Ancillary data used to provide additional search criteria
  Images, time series, derived products, etc.
Search Registry describes all metadata available for search, improving performance
Identifying Suitable Observations (2)
EGSO’s enhanced catalogues, based on new metadata standards, allow detailed searches to be conducted for many instruments, or by event/feature:
Unified Observing Catalogues (UOC)
  Metadata form of observing catalogues used to tie together the heterogeneous data, leaving the data unchanged
  Self-describing (e.g. XML), quantised by time and instrument, with no dependencies on ancillary data or proprietary software, and with any errors corrected (pre-processing)
  Standards defined for future data sets (e.g. STEREO, ILWS, Solar-B)
Solar Event Catalogues (SEC)
  Built from information contained in published lists
  Flare lists, CME lists, lists in SGD, etc.
Solar Feature Catalogue (SFC)
  Lists of the occurrence of events, phenomena and features provide an alternative means of selecting data
  Derived using image recognition software
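A minimal sketch of what a self-describing UOC entry might look like, assuming two archives that record the same quantities differently. The element names, the record layouts of "archive A" and "archive B", and the helper names are all hypothetical; the actual EGSO metadata standard is not given here.

```python
import xml.etree.ElementTree as ET


def to_uoc_entry(instrument, start, end, wavelength_nm):
    """Build one self-describing catalogue entry (hypothetical element names)."""
    entry = ET.Element("observation")
    ET.SubElement(entry, "instrument").text = instrument
    ET.SubElement(entry, "start").text = start
    ET.SubElement(entry, "end").text = end
    wavelength = ET.SubElement(entry, "wavelength", unit="nm")
    wavelength.text = f"{wavelength_nm:.1f}"
    return entry


def from_archive_a(row):
    """Hypothetical archive A: header-style keys, wavelength in Angstrom."""
    return to_uoc_entry(row["INSTRUME"], row["DATE_OBS"], row["DATE_END"],
                        row["WAVELNTH"] / 10.0)


def from_archive_b(row):
    """Hypothetical archive B: plain tuple, wavelength already in nm."""
    instrument, start, end, wavelength_nm = row
    return to_uoc_entry(instrument, start, end, wavelength_nm)


entry_a = from_archive_a({"INSTRUME": "CDS", "DATE_OBS": "2003-04-01T10:00:00",
                          "DATE_END": "2003-04-01T10:30:00", "WAVELNTH": 584.0})
entry_b = from_archive_b(("EIT", "2003-04-01T10:05:00",
                          "2003-04-01T10:06:00", 19.5))
xml_text = ET.tostring(entry_a, encoding="unicode")
```

Both entries end up with the same fields and units, while the original data files are left unchanged.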
Identifying Suitable Observations (3)
Pre-processing images to eliminate errors in UOC
Image (right) shows a poor quality Meudon H-alpha image with circle showing position according to FITS header information
Other problems include defects in data, weather (ground based), image shape…
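One way such a header error could be detected is to estimate the disk centre from the image itself and compare it with the header value. The sketch below uses a plain intensity threshold and centroid, which is an assumption made for illustration and not the EGSO pre-processing method; `make_disk` only generates synthetic test data.

```python
def make_disk(size, cx, cy, radius, value=10):
    """Synthetic square image with a uniform bright disk (test data only)."""
    return [[value if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 0
             for x in range(size)]
            for y in range(size)]


def disk_centre(image, threshold):
    """Centroid (x, y) of all pixels brighter than the threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def header_offset(image, header_centre, threshold):
    """Difference between the measured centre and the one stated in the header."""
    centre = disk_centre(image, threshold)
    if centre is None:
        return None
    return (centre[0] - header_centre[0], centre[1] - header_centre[1])
```

A non-zero offset flags an entry whose catalogue position should be corrected before it is used for searches.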
Identifying Suitable Observations (4)
GUI provided for defining queries:
  Simple queries may be date/time, wavelength, etc.
  Synoptic images may be used to select pointing (next slide)
  Results from initial search will be accompanied by Quick-look images to refine query
More complex queries can be formulated since catalogues are stored in an RDB (SQL)
  Can also search by SEC, SFC
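Because the catalogues sit in a relational database, an event-based search reduces to a join. The sketch below uses an in-memory SQLite database; the table names, columns and rows are invented for illustration and do not reflect the actual EGSO schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE uoc (id INTEGER PRIMARY KEY, instrument TEXT, t_start TEXT, t_end TEXT);
CREATE TABLE sec (id INTEGER PRIMARY KEY, kind TEXT, t_peak TEXT, goes_class TEXT);
INSERT INTO uoc (instrument, t_start, t_end) VALUES
  ('CDS', '2003-04-01T09:50:00', '2003-04-01T10:30:00'),
  ('EIT', '2003-04-01T12:00:00', '2003-04-01T12:01:00'),
  ('CDS', '2003-04-02T08:00:00', '2003-04-02T08:40:00');
INSERT INTO sec (kind, t_peak, goes_class) VALUES
  ('flare', '2003-04-01T10:10:00', 'M1.2'),
  ('flare', '2003-04-03T01:00:00', 'C3.0');
""")


def observations_during(connection, kind):
    """Observations whose time range contains the peak of an event of this kind.

    ISO 8601 timestamps of equal length sort correctly as plain strings,
    so BETWEEN works on the TEXT columns.
    """
    sql = """
        SELECT u.instrument, u.t_start, s.goes_class
        FROM uoc AS u
        JOIN sec AS s ON s.t_peak BETWEEN u.t_start AND u.t_end
        WHERE s.kind = ?
        ORDER BY u.t_start
    """
    return connection.execute(sql, (kind,)).fetchall()
```

Here `observations_during(conn, "flare")` selects data by event rather than by date and pointing, which is the kind of query the SEC is meant to enable.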
Quick-look Processing
EGSO – Query Resolving
[Diagram: query-resolving architecture. Labelled components: (G)UI; Search Info. Requestor; Query Generator; Search Query Resolver; Search Registry; Catalogue Warehouse (cache) with UOC, SEC, SFC, Ancill.; Summary Images, etc.; Data Requests; Providers (Observing Catalogue, etc.)]
Nature of searches and User Interface will be derived from Use Cases
Exact nature of interface to providers under review (Grid and/or P2P)
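A rough sketch of how a Search Registry could let a resolver send each part of a query only to the catalogues able to answer it. The registry contents, field names and the `plan_query` function are assumptions made for this example; the slides do not specify the actual interface.

```python
# Hypothetical registry: which search fields each catalogue can answer.
SEARCH_REGISTRY = {
    "UOC": {"instrument", "time", "wavelength", "pointing"},
    "SEC": {"event_type", "time"},
    "SFC": {"feature_type", "time"},
}


def plan_query(query, registry=SEARCH_REGISTRY):
    """Split a query into per-catalogue sub-queries.

    Returns (sub_queries, unsupported), where unsupported lists the fields
    that no registered catalogue describes.
    """
    known = set().union(*registry.values())
    unsupported = sorted(set(query) - known)
    sub_queries = {}
    for name, fields in registry.items():
        part = {key: value for key, value in query.items() if key in fields}
        # Skip catalogues that could only contribute the shared time field.
        if set(part) - {"time"}:
            sub_queries[name] = part
    return sub_queries, unsupported


sub_queries, unsupported = plan_query({
    "instrument": "CDS",
    "event_type": "flare",
    "time": "2003-04-01",
    "observer": "unknown",
})
```

In this toy case the SFC is never contacted, which is one way a registry could improve search performance.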
Processing Data
Current archives have amassed large quantities of data, and future missions will generate huge amounts of data
  Moving this across networks is clearly undesirable
EGSO aims to increase access to solar data
  If the user needs to process data, the demands placed on the user's hardware become too great, and uptake may be poor
As far as possible, process the data at source
  Involves extraction and calibration of a subset of the raw data
  Software for processing defined by instrument team (IDL, C…)
Standard processing, e.g. image cleaning, Quick-look
  Processing reduces volumes of data moved around
  Simplifies requirements on user's own system
Special processing, including the ability to upload custom code, will also be possible. Users can perform investigations on large data sets on dedicated processing facilities.
  e.g. using image recognition techniques to search for bright points on CDS images
  Data mining
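To illustrate why processing at source reduces the data moved, the sketch below extracts a small cut-out around a detected bright pixel and reports the fraction of the image that would be transferred. The simple threshold detector is a stand-in assumed for the example, not the image recognition technique referred to above, and all function names are hypothetical.

```python
def find_bright_points(image, threshold):
    """Return (x, y) of every pixel above the threshold (simplistic stand-in)."""
    return [(x, y)
            for y, row in enumerate(image)
            for x, value in enumerate(row)
            if value > threshold]


def extract_subregion(image, x0, y0, width, height):
    """Cut a width x height window whose top-left corner is (x0, y0)."""
    return [row[x0:x0 + width] for row in image[y0:y0 + height]]


def transfer_fraction(image, cutout):
    """Fraction of the original pixels that the cut-out contains."""
    full = sum(len(row) for row in image)
    return sum(len(row) for row in cutout) / full


# Synthetic 8 x 8 image with a single bright pixel at x=5, y=3.
image = [[0] * 8 for _ in range(8)]
image[3][5] = 100

points = find_bright_points(image, threshold=50)
x, y = points[0]
cutout = extract_subregion(image, x - 1, y - 1, 3, 3)
fraction = transfer_fraction(image, cutout)
```

Only 9 of the 64 pixels leave the archive in this toy case; the same idea applies to extracting and calibrating a subset of a raw data set.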
CDS/Spectral Data
Can search CDS archive in terms of time and position
  Need to have a target in mind
Spectral data is hard to specify in metadata terms.
Adding metadata to spectral data would allow more complex queries to be resolved
  e.g. searching for data sets with particularly strong spectral lines
What metadata could be used?
  Velocities (derived), emission line strengths, spectral type classification (derived)
  Metadata could be applied to other spectral data, including ground based
Processing
  Common processing requirements, others (e.g. Chianti?)
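As an illustration of the derived metadata suggested above, the sketch below computes an emission line strength and a Doppler velocity from a single line profile, using v = c (λ_obs − λ_rest) / λ_rest with the intensity-weighted centroid as λ_obs. The function name, the constant-background assumption and the returned field names are hypothetical.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def line_metadata(wavelengths, intensities, rest_wavelength, background=0.0):
    """Derive searchable metadata from one spectral line profile.

    Assumes a constant background and a single line in the window.
    """
    net = [max(value - background, 0.0) for value in intensities]
    # Line strength: trapezoidal integral of the background-subtracted profile.
    strength = sum(
        0.5 * (net[k] + net[k + 1]) * (wavelengths[k + 1] - wavelengths[k])
        for k in range(len(net) - 1)
    )
    total = sum(net)
    if total == 0:
        return {"strength": 0.0, "centroid": None, "velocity_km_s": None}
    centroid = sum(w * n for w, n in zip(wavelengths, net)) / total
    # Positive velocity means the line is shifted to longer wavelengths.
    velocity = C_KM_S * (centroid - rest_wavelength) / rest_wavelength
    return {"strength": strength, "centroid": centroid, "velocity_km_s": velocity}
```

Storing values like these alongside each raster would allow a query such as "data sets with particularly strong lines" to be answered from metadata alone.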
EGSO Progress
Just successfully completed first year review in Brussels (EC funded).
System architecture complete, working on:
  Image recognition techniques to form new catalogues
  Metadata for catalogues
  Implementing aspects of architecture
Demonstration due Summer 2003.
Summary
EGSO will improve access to solar data
  Scattered heterogeneous datasets can be found and accessed by anyone without detailed knowledge of multiple archives
EGSO can help working with large data sets by:
  Reducing the number of files downloaded by use of improved catalogues and search facilities
  Reducing the amount of data to be transferred by processing/extraction/calibration ‘at source’
  Offering processing facilities & power (grid services)
Spectral data (e.g. CDS) not so straightforward.
Questions/Suggestions
www.egso.org