16-17 Oct 2003, IVOA Data Access Layer, Strasbourg 2003: IVOA Data Access Layer (DAL) Working Group, Doug Tody, National Radio Astronomy Observatory, International VIRTUAL OBSERVATORY ALLIANCE


IVOA Data Access Layer (DAL) Working Group

Doug Tody
National Radio Astronomy Observatory

International VIRTUAL OBSERVATORY ALLIANCE


IVOA Data Access Layer (DAL)

• DAL Working Group Priorities
  – Update simple image access (SIA) to V1.1
  – Introduce simple spectral access (SSA) V1.0
  – Introduce web services versions of DAL services
  – Drive VO technology development as required for DAL (e.g., dataset identifiers, data models, VOTable)


Simple Spectral Access (SSA)

• Goals
  – provide uniform access to both 1D spectra and SEDs
  – simplify interface for both data providers and client applications
  – powerful "multiwavelength" spectral analysis capability
• Spectral survey
  – use-cases to drive interface design
  – identify early data providers and application developers
• Current issues
  – spectral data model
  – interface design issues
  – spectral dataset representation


Cambridge SIA V1.1 Priorities

• Essential
  – Registry integration
  – Pixflags support for lossy compressed data (e.g., HCOMPRESS)
• Image Characterization
  – Image provenance and identification (collection ID, dataset ID, virtual data provenance, replica support)
  – Spectral bandpass (already present; may need tweaking for consistency)
  – Time of observation
  – Spatial resolution
  – Limiting flux (harder; may not make V1.1)
• Other
  – VO technology integration (normalize UCDs, data models, etc.)
  – Use of image attributes to refine query (e.g., band)
  – Default for case where there are multiple versions of same dataset
  – Spatial bandpass - 3
  – Image type (future, v2)
  – Logical hierarchies to describe complex metadata (as in IDHA, v2)


DAL Interface Issues

• Next version of SIA requires progress in the following areas:
  – dataset identifiers
  – component data models, dataset characterization
  – data model, dataset representation

• These are actually required for all DAL services, not just SIA


Global Dataset Identifiers

• Required to identify data returned by DAL services
  – Images: data collection ID, dataset ID
  – Catalogs: catalog ID, record ID
• Will enable
  – replica management and selection
  – virtual data management and characterization

• Discussion being led by Registries group


Data Model Representation

• All DAL data access is data model based
• Must be able to represent data models unambiguously in VOTable
• VOTable UTYPE proposal to provide "pointer into data model"

• Discussion will be in VOTable group


Agenda for DAL Working Group
Strasbourg, October 2003

• DAL Recap
  – Service class hierarchy
  – Concept of different views of same data
• SIA V1.1 / DAL Interface Issues
  – image identifiers, virtual data
  – component data models (UTYPE, UCD normalization)
  – getImage acref templating (Francois)
• SSA Straw man
  – SSA overview / interface (Doug)
  – SED introduction (Markus)
  – 1D spectral data model (Jonathan)
  – discussion of SSA issues
• Review process for development of SSA specification
• Update DAL Priorities and Schedule

After a brief review of the services architecture, most of the discussion in this WG meeting focused on enhancements to SIA and the general DAL infrastructure, and on the scope and design of the simple spectral access (SSA) interface.


DAL Scope: Types of Data (Cambridge)

[Diagram: types of data within DAL scope. Labels: Dataset; Time Series; Catalog; Source Catalog; Event List; Visibility Data; Image; NDImage; 1D Spectrum; SED. Caption: Primary DAL Services.]

Concept of DAL service architecture from Cambridge. Reviewed and reaffirmed without objection.


SIA V1.1 / DAL Infrastructure

• Some key issues
  – Registry integration, service metadata
  – Image identifiers (data collection ID, dataset ID)
  – Data characterization (coverage, bandpass, resolution, etc.)
  – Data provenance, virtual data characterization
  – Data model representation, UCD normalization
  – Templating the URL access reference

• For the most part these issues actually affect all DAL/VO data access and are not specific to SIA.

Some hot topics affecting SIA and all DAL services were discussed.


Dataset Identifiers

• Globally unique dataset (resource) identifier
  – ivoa://<authority-ID>/<resource-ID>#<dataset-ID>
    • naming authority (namespace): authority-ID
    • data collection: resource-ID
    • dataset or record: dataset-ID
  – Images: data collection, dataset
  – Catalogs: table, record ID
• Key points
  – data tagged by a unique global identifier
  – global identifiers may exist independently of any specific registry
  – identifiers of published data are persistent
  – authority IDs are globally unique, globally allocated
  – each authority controls name allocation within their namespace
  – caveat: this only works in a simple way for physical datasets

Required for many aspects of data access: publication, data provenance, replication, virtual data.
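
The identifier form ivoa://<authority-ID>/<resource-ID>#<dataset-ID> can be split into its three parts with a generic URI parser. The following is a minimal Python sketch, not part of any IVOA specification: the class name DatasetId, the function parse_dataset_id, and the example identifier are assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatasetId:
    """Hypothetical holder for the three parts of a global dataset identifier."""
    authority_id: str  # naming authority (namespace)
    resource_id: str   # data collection (or table, for catalogs)
    dataset_id: str    # dataset (or record, for catalogs)

    def __str__(self) -> str:
        return f"ivoa://{self.authority_id}/{self.resource_id}#{self.dataset_id}"


def parse_dataset_id(text: str) -> DatasetId:
    """Split 'ivoa://<authority-ID>/<resource-ID>#<dataset-ID>' into its parts."""
    parts = urlsplit(text)
    if parts.scheme != "ivoa":
        raise ValueError(f"not an ivoa identifier: {text!r}")
    resource_id = parts.path.lstrip("/")
    if not (parts.netloc and resource_id and parts.fragment):
        raise ValueError(f"incomplete identifier: {text!r}")
    return DatasetId(parts.netloc, resource_id, parts.fragment)


# Example with a made-up authority and collection name.
ident = parse_dataset_id("ivoa://nrao.example/nvss#image-0001")
print(ident.authority_id, ident.resource_id, ident.dataset_id)
```

Because the identifier carries no registry address, it can be compared or stored independently of any specific registry, which is the property the key points above rely on.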


Replica management

• Data Replication
  – data replication required for efficient access, data backup
  – replica management and selection of datasets is enabled by dataset IDs
• How it works:
  – replica manager service can harvest individual registries and build a replica catalog
  – query replica manager service to discover replicas
  – query individual service to confirm existence, get metadata, get data
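
The harvest-then-query flow can be illustrated with a small in-memory replica catalog. This is only a sketch under stated assumptions: the registry contents, the service URLs, and the function names build_replica_catalog and find_replicas are invented for the example, and a real replica manager would harvest registries over the network.

```python
from collections import defaultdict


def build_replica_catalog(registries):
    """Merge harvested registry records into {dataset_id: [service_url, ...]}.

    `registries` maps a registry name to a list of (dataset_id, service_url)
    pairs; this record shape is an assumption for illustration.
    """
    catalog = defaultdict(list)
    for records in registries.values():
        for dataset_id, service_url in records:
            if service_url not in catalog[dataset_id]:
                catalog[dataset_id].append(service_url)
    return dict(catalog)


def find_replicas(catalog, dataset_id):
    """Return the services believed to hold a replica of the dataset."""
    return catalog.get(dataset_id, [])


# Two hypothetical registries that both list the same dataset ID.
registries = {
    "registry-a": [("ivoa://nrao.example/nvss#image-0001", "http://a.example/sia")],
    "registry-b": [("ivoa://nrao.example/nvss#image-0001", "http://b.example/sia")],
}
catalog = build_replica_catalog(registries)
print(find_replicas(catalog, "ivoa://nrao.example/nvss#image-0001"))
```

A client would still query each listed service to confirm that the replica exists before fetching metadata or data, as the last step of the list describes.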


Virtual data

• Data access layer
  – most data is virtual data derived from external data sources
  – data access services subset, transform, or otherwise generate data
• Dataset IDs
  – will allow data provenance to be specified
  – dataset A derived from datasets B, C by operation P
• This is an essential step to allow us to describe virtual data, but how do we do so? [TBD]
• Current "acref" URL is a kind of virtual data reference
  – e.g., "http://archive.nrao.edu/sia/nvss?POS=12.32,-11.2&SIZE=0.1&..."
  – acref implicitly specifies data provenance
  – may also be unstable, contain irrelevant access-specific details
• Use of a getData method instead of an explicit acref URL might allow virtual data generation to be standardized for a given access protocol

Dataset identifiers will provide the basis for describing virtual data, but how we do so is still TBD. Most likely doing so will involve defining the generation operation and inputs.
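
One way to picture "dataset A derived from datasets B, C by operation P" is a small record holding the operation, its inputs, and its parameters. The Python sketch below is an illustration only, since the actual mechanism is still TBD: the Derivation class, the operation name "cutout", and the input dataset ID are assumptions, and only the parameters visible in the example acref are used (the elided ones are not guessed).

```python
from dataclasses import dataclass
from typing import Tuple
from urllib.parse import parse_qsl, urlsplit


@dataclass(frozen=True)
class Derivation:
    """Hypothetical provenance record: output = operation(inputs, parameters)."""
    operation: str
    inputs: Tuple[str, ...]
    parameters: Tuple[Tuple[str, str], ...] = ()


def derivation_from_acref(acref, operation, inputs):
    """Read the generation parameters implied by an acref query string."""
    query = urlsplit(acref).query
    return Derivation(operation, tuple(inputs), tuple(parse_qsl(query)))


# Visible part of the acref quoted on the slide; the input ID is made up.
acref = "http://archive.nrao.edu/sia/nvss?POS=12.32,-11.2&SIZE=0.1"
record = derivation_from_acref(acref, "cutout", ["ivoa://nrao.example/nvss#field-12"])
print(record.operation, record.inputs, dict(record.parameters))
```

Separating the record from the URL keeps the provenance stable even if the access-specific details of the acref change.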


Component data models

• Required in DAL to
  – characterize complex objects
  – represent data for transport and analysis
• Modeling complex objects
  – Standard and custom component data models are aggregated to model more complex objects, e.g., datasets

Providing a means to determine the ‘quality’ of data will be essential to enable automated data analysis via the VO. Dataset characterization via component data models will provide this capability.


Sample Component Data Models

• Observation metadata
  – observatory, instrument, project, observer, etc.
• Standard 'coverage' metadata
  – sky, time, bandpass, etc.
• Dataset characterization
  – time of observation (lo, high, refvalue)
  – spectral bandpass (lo, high, refvalue, ID)
  – spatial bandpass (lo, high; resolution?)
  – sensitivity or limiting flux (flux 'bandpass'?)
  – observable
• World coordinate systems
• Storage models

The data models WG is actively working to define these component data models.
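
The (lo, high, refvalue) pattern used for the time and spectral characterization can be sketched as a reusable component that larger models aggregate. This is an illustrative Python sketch, not the data model under development: the class names Range and Characterization, the field names, and the example values and units are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    """Hypothetical (lo, high, refvalue) component shared by several axes."""
    lo: float
    high: float
    refvalue: Optional[float] = None

    def __post_init__(self):
        if self.lo > self.high:
            raise ValueError("lo must not exceed high")

    def overlaps(self, other: "Range") -> bool:
        """True if the two closed intervals share at least one point."""
        return self.lo <= other.high and other.lo <= self.high


@dataclass(frozen=True)
class Characterization:
    """Aggregate of component models describing one dataset."""
    time: Range                        # time of observation
    spectral: Range                    # spectral bandpass
    spectral_id: Optional[str] = None  # bandpass ID, e.g. a filter name


# Made-up values: times in MJD, wavelengths in metres.
image = Characterization(
    time=Range(52928.0, 52929.0, 52928.5),
    spectral=Range(4.0e-7, 5.0e-7, 4.4e-7),
    spectral_id="B",
)
print(image.spectral.overlaps(Range(4.5e-7, 6.0e-7)))
```

Reusing one component for every axis is the design choice the aggregation bullet points at: a client that understands Range can compare coverage on any axis without knowing the full dataset model.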


UTYPE, UCD normalization

• Proposed VOTable FIELD, PARAM, GROUP Attributes:

  name="application-name" -- A name freely defined by an application
  id="ID-name" -- An XML identifier unique within a document
  ref="ID-ref" -- Reference to an ID elsewhere in document
  ucd="ucd-name" -- The Unified Content Descriptor ("fuzzy")
  utype="ns:datamodel-name" -- The uniform attribute type related to a data-model; "ns" represents an optional namespace attribute.

• A possible alternative would be to use a namespace within UCD, but this would overload UCD and interfere with its current usage.

UTYPE or something like it is required to represent data models rigorously in VOTable for data analysis in the VO.


Sample Data Model: Spectral Bandpass

UTYPE      UCD               Name           ID
ID         INST_FILTER_CODE  user-defined   none
Unit       UNITS             user-defined   none
RefValue   INST_FILTER_REF   user-defined   none
HiLimit    INST_FILTER_MAX   user-defined   none
LoLimit    INST_FILTER_MIN   user-defined   none
Response   DATA_LINK         user-defined   none
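
The table above can be turned into VOTable FIELD elements that carry both a ucd and the proposed utype attribute. The Python sketch below uses only the standard library and is not a schema-complete VOTable: the namespace prefix "bp", the lower-cased field names, and the function name bandpass_table are assumptions, and required attributes such as datatype are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# (UTYPE, UCD) pairs copied from the sample spectral bandpass table.
BANDPASS_FIELDS = [
    ("ID", "INST_FILTER_CODE"),
    ("Unit", "UNITS"),
    ("RefValue", "INST_FILTER_REF"),
    ("HiLimit", "INST_FILTER_MAX"),
    ("LoLimit", "INST_FILTER_MIN"),
    ("Response", "DATA_LINK"),
]


def bandpass_table(namespace="bp"):
    """Build a TABLE element whose FIELDs point into the bandpass data model."""
    table = ET.Element("TABLE", {"name": "spectral_bandpass"})
    for utype, ucd in BANDPASS_FIELDS:
        ET.SubElement(
            table,
            "FIELD",
            {
                "name": utype.lower(),            # user-defined name (assumed)
                "ucd": ucd,                       # "fuzzy" content descriptor
                "utype": f"{namespace}:{utype}",  # pointer into the data model
            },
        )
    return table


print(ET.tostring(bandpass_table(), encoding="unicode"))
```

The point of the example is that the name stays free for the application while utype gives an exact data model reference, so a client can find the HiLimit field without guessing from the UCD.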


Access reference templating

• Motivation
  – SIA query response table can get very large if there is a matrix of options for each possible output image.
• Some Possible Solutions
  – ACREF template
  – getData method
• Thoughts
  – templating the acref is a form of getData method
  – should we just add a getData method instead?
  – but what is settable may be dataset dependent
  – metadata can flag attributes which can be set in template
  – acref would be template string
  – hence can collapse what could be P1*P2*PN redundant entries


Access reference templating

• Motivation
  – SIA query response table can get large if there is a matrix of options for each possible output image.
  – Should be easier to recognize simple variations on the same image.
  – A simple one step 'getImage' method could be useful.
• Proposals
  – Parameter substitution on acref template (F. Bonnarel)
    • e.g., image format, compression, image generation parameters
  – Formal getData method

No clear consensus at this point. Further discussion is needed.

Some form of templating could be good so long as it does not complicate the interface for the client. Some felt that the current approach is ok. XPATH or similar technology should be investigated for implementation.
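
Parameter substitution on an acref template can be sketched with Python's string.Template. This is an illustration of the idea only, since no syntax was agreed: the template URL, the parameter names id, format and compress, and the option values are assumptions.

```python
from itertools import product
from string import Template

# One table row carries a template instead of one row per option combination.
ACREF_TEMPLATE = Template(
    "http://archive.example.org/sia/getImage?ID=$id&FORMAT=$format&COMPRESS=$compress"
)

# Attributes that the (hypothetical) metadata flags as settable by the client.
SETTABLE = {
    "format": ["fits", "jpeg"],
    "compress": ["none", "hcompress"],
}


def resolve(template, fixed, **choices):
    """Fill in the template for one specific choice of settable attributes."""
    return template.substitute(fixed, **choices)


def expand(template, fixed, settable):
    """Enumerate every acref the single template row stands for."""
    names = sorted(settable)
    for combo in product(*(settable[name] for name in names)):
        yield resolve(template, fixed, **dict(zip(names, combo)))


fixed = {"id": "img1"}
print(resolve(ACREF_TEMPLATE, fixed, format="fits", compress="hcompress"))
print(len(list(expand(ACREF_TEMPLATE, fixed, SETTABLE))), "rows collapsed into one")
```

With two formats and two compression options, one template row replaces four explicit rows, which is the collapse of redundant entries that motivates the proposal. The cost is that the client must perform the substitution, which is the added complexity some participants were wary of.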


Simple Spectral Access (SSA)

• SSA Overview (Doug)
  – Goals, Interface, Data Formats
• SED Introduction (Markus)
• Spectral data model (Jonathan)
• SSA Issues (all)

General Agreements on SSA
  – Provide uniform interface for both 1D spectra and SEDs
  – Develop uniform data model for both 1D spectra and SEDs
  – Service interface will be similar to SIA, CS (query/response, getData)
  – Data output formats will include at least text, VOTable, FITS, graphics

SSA will provide an opportunity to learn how to 1) map VO data models into multiple external representations, and 2) package actual datasets in XML/VOTable, including representing data models in XML.


SSA Interface Issues

• Registry integration
• Query
• Query response
• Dataset retrieval
• Data model
• Data representation


SSA Interface Issues

• Registry integration
  – Service metadata query
  – SSA service metadata
• SSA service verifier
  – Verify service is correct
  – Read service metadata, enter into a registry

Agreed without objection. Service verification and registration of service metadata should be provided for all DAL services.


SSA Interface Issues

• Query
  – Query attributes
    • pos, size, spectral resolution, bandpass, time
    • velocity, redshift, spectral class, object name, etc.
    • Spatial resolution
    • Other?
  – Query interface
    • Simple keyword queries (now)
    • Query language (ADQL) queries (later)

General agreement that the query is an important aspect of SSA. Spectra are generally more highly processed than, e.g., images, and may have attributes such as velocity, redshift, etc., which one would like to query on.

Implementation of a general query mechanism for SSA may require something like ADQL.
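
A simple keyword query of the kind listed under "now" can be sketched as a URL builder in the style of the SIA example URL. The SSA parameter set was not defined at this meeting, so everything here is an assumption: the base URL, the function name build_query, and the keyword REDSHIFT are hypothetical, and only POS and SIZE are borrowed from the SIA example.

```python
from urllib.parse import urlencode


def build_query(base_url, pos, size, **constraints):
    """Build a keyword query URL; extra constraints become upper-case keywords."""
    params = {"POS": f"{pos[0]},{pos[1]}", "SIZE": str(size)}
    for name, value in constraints.items():
        if value is not None:
            params[name.upper()] = str(value)
    # Keep commas readable, as in the SIA-style POS=ra,dec convention.
    return f"{base_url}?{urlencode(params, safe=',')}"


url = build_query("http://archive.example.org/ssa", (12.32, -11.2), 0.1, redshift=0.05)
print(url)
```

Flat keyword=value pairs are easy to implement but cannot express compound conditions, which is why a query language such as ADQL is listed for later.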


SSA Interface Issues

• Query Response
  – form VOTable as in SIA
  – this is a flat summary table for simplicity
  – alternative would be sequence of structured objects

Not discussed due to lack of time.

Unless a reason is found to deviate, the expectation is that the query response will be a flat VOTable, as with SIA.


SSA Interface Issues

• Dataset Retrieval
  – one getData method per spectrum/SED
  – data format options: text, xml, votable, fits, graphics, html, ...

Agreed that spectra output formats should include at least text, VOTable, FITS, and graphics. How data is represented in each format is a different issue.


SSA Interface Issues

• Data model
  – Uniform model for 1D spectra and SED
  – As simple as it can be while solving this problem
  – Range of observables
  – …

The general spectral data model as presented by Jonathan was well received by the WG and will serve as the basis for further development of SSA via a subgroup with members from both DAL and DM.


SSA Interface Issues

• Dataset representation
  – text (# keyword = value, records)
  – votable (how do we represent dataset in XML?)
  – fits (table or image?)

Most of the discussion here was of what FITS format to use. It was agreed that a FITS table was the most general, but would be harder for existing applications to use and would duplicate what VOTable will already provide. Use of a simple linearized 1D spectrum represented as a FITS image will be investigated.
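
The text representation, "# keyword = value" header lines followed by data records, can be sketched as a writer and reader pair. This is a minimal Python illustration under stated assumptions: the keyword names, the whitespace-separated numeric record layout, and the function names to_text and from_text are not taken from any agreed format.

```python
def to_text(keywords, records):
    """Write '# keyword = value' header lines followed by one record per line."""
    lines = [f"# {key} = {value}" for key, value in keywords.items()]
    lines += [" ".join(str(x) for x in record) for record in records]
    return "\n".join(lines) + "\n"


def from_text(text):
    """Parse the format written by to_text back into (keywords, records)."""
    keywords, records = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, _, value = line[1:].partition("=")
            keywords[key.strip()] = value.strip()
        else:
            records.append(tuple(float(x) for x in line.split()))
    return keywords, records


# Made-up spectrum: two (wavelength, flux) records with two header keywords.
text = to_text({"OBJECT": "M31", "FLUXUNIT": "Jy"}, [(4000.0, 1.5), (4001.0, 1.6)])
print(text)
print(from_text(text))
```

A format this simple is readable by almost any tool, but it leaves the mapping from keywords to the spectral data model implicit, which is the gap the VOTable representation is meant to close.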


Process

• Process for development of SSA spec
  – Spectral survey
  – Discuss SSA design issues (this meeting)
  – Initial draft specification
  – Discuss, revise draft specification
  – Initial implementations


Priorities and Schedule

• SSA V1.0
  – Initial specification
  – Initial implementations
• DAL Technology
  – Component data models
  – Data model representation
• SIA V1.1
• Web service implementations