NVO Summer School, Aspen 9-Sep-2005

Data Access Layer
Doug Tody (NRAO)

US NATIONAL VIRTUAL OBSERVATORY

Data Access Layer

• What does it do?
  – Provides access to data
    • data discovery
    • mediation to a standard model
    • data retrieval
    • on-demand data generation
    • server-side computation (subsetting, filtering)
• What is it for?
  – Supports client data analysis
    • distributed, multiwavelength
• How does it work?
  – Object (dataset) oriented
    • catalog, image, spectrum, time series, SED, etc.
  – Services
    • cone search (also SkyNode), SIA, SSA

Cone Search

• Provides basic catalog access
  – Query by position and aperture (cone in space)
  – Query consists of base-URL (service endpoint) plus parameters
    • e.g., http://base-url?RA=12.0&DEC=0.0&SR=1.0
  – Catalog returned as a VOTable
• Advantages
  – Simple but powerful, provides standard interface
  – Easy to implement and use
• Limitations
  – Catalog metadata is not defined
  – No data model support
• Future
  – Supplanted by basic SkyNode (Greene, Saturday)
  – Supports metadata discovery, SQL-like syntactical queries
  – We will continue to support the basic cone search query, however!
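
As a rough illustration of "base-URL plus parameters", the query can be assembled with the Python standard library. This is a minimal sketch, not a reference client: the function name `cone_search_url` is made up for this example, and `http://base-url` is the placeholder endpoint from the slide, not a real service.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Return a cone search query URL (all angles in decimal degrees)."""
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must lie in [-90, 90] degrees")
    if radius_deg < 0:
        raise ValueError("SR must be non-negative")
    params = {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg}
    # Pick the joining character depending on what the base URL already has.
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return base_url + separator + urlencode(params)


# "http://base-url" is the placeholder from the slide, not a live endpoint.
url = cone_search_url("http://base-url", 12.0, 0.0, 1.0)
print(url)
```

The returned document would then be read as a VOTable; no request is made here.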

Simple Image Access (SIA)

• Basic Usage, Highest Level
  – Client queries Registry to find interesting services
  – Each service is queried (in turn or simultaneously) for data
  – Client collates and analyzes results
  – Selected datasets are retrieved
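
The "queried simultaneously, then collated" step might look something like the sketch below. It assumes the caller supplies a `query_one(url)` function that returns a list of row dictionaries; that function, the `fake_query` stand-in, and the service URLs are all hypothetical, and no network access takes place.

```python
from concurrent.futures import ThreadPoolExecutor


def query_all(service_urls, query_one, max_workers=8):
    """Query every service concurrently; return (collated rows, failures)."""
    collated = []
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all queries first so they run side by side.
        futures = {url: pool.submit(query_one, url) for url in service_urls}
        for url, future in futures.items():
            try:
                rows = future.result()
            except Exception as exc:  # one failing service should not sink the rest
                failures[url] = exc
                continue
            # Tag each row with the service it came from before collating.
            collated.extend({"service": url, **row} for row in rows)
    return collated, failures


def fake_query(url):
    """Stand-in for a real SIA query (hypothetical)."""
    if "broken" in url:
        raise RuntimeError("service unavailable")
    return [{"title": f"image from {url}"}]


rows, failed = query_all(["http://a", "http://broken", "http://b"], fake_query)
```

Recording failures per service, rather than aborting, reflects the idea that the client collates whatever the available services return.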

Simple Image Access (SIA)

• Basic Usage, Single Service
  – Query
    • find data of interest from a single service
    • http://base-url?POS=12.0,0.0&SIZE=0.2&FORMAT=image/fits
  – Query response
    • VOTable, one row per candidate dataset
    • "access reference" (a URL) points to data
  – Data selection
    • Performed by the client using query response metadata
  – Dataset retrieval
    • Retrieve actual datasets, if any
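
Client-side data selection can be as simple as filtering the response rows. The sketch below assumes the rows have already been parsed into dictionaries keyed by UCD, that the image scale has been reduced to a single number in deg/pix, and that the access URL sits under a key named `access_reference`; all three are simplifications made for this example.

```python
def select_images(rows, wanted_format="image/fits", max_scale_deg=None):
    """Keep retrievable rows in the wanted format, finest pixel scale first."""
    chosen = []
    for row in rows:
        if row.get("VOX:Image_Format") != wanted_format:
            continue
        if not row.get("access_reference"):
            continue  # described by the service but not retrievable online
        scale = abs(row["VOX:Image_Scale"])
        if max_scale_deg is not None and scale > max_scale_deg:
            continue
        chosen.append(row)
    return sorted(chosen, key=lambda row: abs(row["VOX:Image_Scale"]))


# Made-up response rows for illustration only.
candidates = [
    {"VOX:Image_Title": "A", "VOX:Image_Format": "image/fits",
     "VOX:Image_Scale": 0.002, "access_reference": "http://example.org/a"},
    {"VOX:Image_Title": "B", "VOX:Image_Format": "image/jpeg",
     "VOX:Image_Scale": 0.001, "access_reference": "http://example.org/b"},
    {"VOX:Image_Title": "C", "VOX:Image_Format": "image/fits",
     "VOX:Image_Scale": 0.0005, "access_reference": "http://example.org/c"},
    {"VOX:Image_Title": "D", "VOX:Image_Format": "image/fits",
     "VOX:Image_Scale": 0.0001, "access_reference": ""},
]
selected = select_images(candidates)
```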

Service Capabilities

• Types of Services
  – Atlas    Precomputed survey image (entire image)
  – Pointed  Image from pointed observation (entire image)
  – Cutout   Cutout existing image (pixels unchanged)
  – Mosaic   Reprojected image (pixels resampled)
• Virtual Data
  – Data model mediation
  – Subsetting, filtering, etc. on the fly
  – Possible to view same data in different ways
• Interface
  – RESTful interface currently (HTTP GET)
  – Document oriented (VOTable, FITS, JPEG, etc.)

Data Model

• SIA data model is the familiar "astronomical image"
  – Generally this means a 2D sky projection
  – Data array is logically a regular grid of pixels
  – Encoded as a FITS image, GIF/JPEG, etc.
• Standardized dataset metadata
  – Provenance
  – Image geometry
  – Scale
  – Format
  – Position, WCS
  – Time of observation
  – Spectral bandpass
  – Access information

Input Parameters

• Required parameters
  – POS     center of ROI (ra, dec decimal degrees ICRS)
  – SIZE    width; or width, height
  – FORMAT  ALL, GRAPHIC, image/fits, image/jpeg, text/html, …
• Optional parameters
  – INTERSECT  values: covers, enclosed, center, overlaps
  – VERB       table verbosity
• Service-defined parameters
  – used to further refine queries, but not yet standardized
    • e.g., BAND, SURVEY, etc.
• Image generation parameters
  – NAXIS, CFRAME, EQUINOX, CRPIX, CRVAL, CDELT, ROTANG, PROJ
    • used for cutout/mosaic services to specify image to be generated
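
The parameter list above maps naturally onto a small query builder. This is a sketch under stated assumptions: `sia_query_url` is a name invented here, `http://base-url` is a placeholder, INTERSECT values are matched case-insensitively and sent upper-cased as a choice of this example, and service-defined parameters are passed through unchecked because they are not standardized.

```python
from urllib.parse import urlencode

INTERSECT_VALUES = {"COVERS", "ENCLOSED", "CENTER", "OVERLAPS"}


def sia_query_url(base_url, ra_deg, dec_deg, size_deg,
                  image_format="image/fits", intersect=None, **service_params):
    """Build an SIA query URL from the required and optional parameters."""
    # SIZE is either a single width or a (width, height) pair.
    if isinstance(size_deg, (int, float)):
        size = str(size_deg)
    else:
        width, height = size_deg
        size = f"{width},{height}"
    params = {"POS": f"{ra_deg},{dec_deg}", "SIZE": size, "FORMAT": image_format}
    if intersect is not None:
        if intersect.upper() not in INTERSECT_VALUES:
            raise ValueError(f"unknown INTERSECT value: {intersect!r}")
        params["INTERSECT"] = intersect.upper()
    # Service-defined parameters (e.g. BAND, SURVEY) are appended as given.
    params.update(service_params)
    separator = "&" if "?" in base_url else "?"
    # Keep ',' and '/' literal so POS and the MIME type stay readable.
    return base_url + separator + urlencode(params, safe=",/")


simple = sia_query_url("http://base-url", 12.0, 0.0, 0.2)
refined = sia_query_url("http://base-url", 12.0, 0.0, (0.2, 0.1),
                        intersect="overlaps", BAND="optical")
```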

Query Response

• Output is a VOTable
  – Must contain a RESOURCE element tagged type="results", containing the results of the query.
• The ‘results’ resource contains a single table
  – Each row of the table describes a single data object which can be retrieved.
• The fields of the table describe the attributes of the dataset
  – These are the attributes of the SIA data model
  – In SIA 1.0, the UCD is used to identify the data model attribute
    • e.g., POS_EQ_RA_MAIN, VOX:Image_Scale, etc.
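
To make the response structure concrete, here is a minimal sketch that reads a TABLEDATA-style VOTable with the standard library and keys each row by UCD. The sample document is made up for illustration, the attribute `type="results"` is taken as the marker for the results resource, and binary or FITS serializations are not handled.

```python
import xml.etree.ElementTree as ET

# Made-up response for illustration; real services return many more fields.
SAMPLE = """<VOTABLE version="1.1">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" datatype="char" arraysize="*" ucd="VOX:Image_Title"/>
      <FIELD name="ra" datatype="double" ucd="POS_EQ_RA_MAIN"/>
      <FIELD name="dec" datatype="double" ucd="POS_EQ_DEC_MAIN"/>
      <DATA><TABLEDATA>
        <TR><TD>Example field</TD><TD>12.0</TD><TD>0.0</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local(tag):
    """Strip any XML namespace so the sketch works with or without one."""
    return tag.rsplit("}", 1)[-1]


def results_by_ucd(votable_text):
    """Return the rows of the 'results' table as dicts keyed by UCD."""
    root = ET.fromstring(votable_text)
    for resource in root.iter():
        if local(resource.tag) == "RESOURCE" and resource.get("type") == "results":
            break
    else:
        raise ValueError("no RESOURCE element with type='results'")
    tables = [el for el in resource.iter() if local(el.tag) == "TABLE"]
    if not tables:
        raise ValueError("results resource contains no TABLE")
    table = tables[0]
    # Fall back to the field name when a FIELD carries no UCD.
    keys = [f.get("ucd") or f.get("name")
            for f in table.iter() if local(f.tag) == "FIELD"]
    rows = []
    for tr in table.iter():
        if local(tr.tag) != "TR":
            continue
        cells = [td.text or "" for td in tr if local(td.tag) == "TD"]
        rows.append(dict(zip(keys, cells)))
    return rows


rows = results_by_ucd(SAMPLE)
```

Cell values are left as strings here; a fuller client would convert them using each FIELD's datatype.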

Query Response

• Image metadata
  – Describes the image object (required)
• Coordinate system metadata
  – Image WCS
• Spectral bandpass metadata
  – Prototype data model describing spectral bandpass of image
• Processing metadata
  – Tells whether the service modified the image data
• Access metadata
  – Tells client how to access the dataset (required)
• Resource-specific metadata
  – Additional optional service-defined metadata describing image

Image Metadata

VOX:Image_Title      Brief description of image
POS_EQ_RA_MAIN       RA (ICRS)
POS_EQ_DEC_MAIN      Dec (ICRS)
INST_ID              Instrument name
VOX:Image_MJDateObs  MJD of observation
VOX:Image_Naxes      Number of image axes
VOX:Image_Naxis      Length of each axis
VOX:Image_Scale      Image scale, deg/pix
VOX:Image_Format     Image file format

Image Retrieval

• Completely optional
  – Typically only a fraction of the available images are retrieved
• Query response
  – If an access reference is provided, the data can be retrieved
  – SIAP can also be used to describe data which is not online
  – The same data may be available in multiple formats
• Image retrieval
  – Very simple; access reference is a URL
  – Standard tools can be used to fetch the data
    • (browser, wget, curl, i/o library, etc.)
  – Data is often computed on-the-fly
  – All retrieval is synchronous (currently)
  – No provision for restricting access (currently)
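
Because the access reference is just a URL, an ordinary i/o library is enough to retrieve the data. The sketch below uses only the Python standard library; the function name `fetch_dataset` and the 60 second timeout are choices made for this example, and the call blocks until the transfer finishes, matching the synchronous retrieval described above.

```python
import shutil
import urllib.request


def fetch_dataset(access_reference, destination, timeout=60):
    """Stream the dataset behind an access reference URL to a local file."""
    with urllib.request.urlopen(access_reference, timeout=timeout) as response, \
            open(destination, "wb") as out:
        # Copy in chunks so large (possibly on-the-fly) images need not fit in memory.
        shutil.copyfileobj(response, out)
    return destination
```

A call such as `fetch_dataset(row_url, "image.fits")` would save the image locally, where `row_url` stands for an access reference taken from a query response.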

Service Registration

Future Development

• SIA V1.1
  – Based on work done on SSA
  – Expanded query interface
    • no longer limited to positional queries
  – Much richer query response
    • generic dataset identification, characterization, etc.
    • metadata extension mechanism
  – Selected features
    • VOTable 1.1 with UCD 1+, GROUP, UTYPE
    • query response can be ordered by "score"
    • logical groupings of related query records
    • compression support
  – Versioning
    • required to make protocol upgrades manageable

Future Development

• Service verification
  – for testing at development time
  – when registered; level of compliance metric
• Grid capabilities
  – Data staging
    • asynchronous image generation (long running jobs)
    • batch generation of images (multiple images)
  – Data management
    • support for single sign-on authentication, authorization
    • network data caching, third party delivery (VOStore etc.)
  – Web service interface
    • resource metadata
    • service availability (etc.)
• ADQL integration
  – Capability to use query language for queries

Simple Spectral Access (SSA)

• What is it?
  – Provides access to 1D spectra, time series, SEDs
  – Tabular spectrophotometric data (photometry points)
  – Represents second generation, data model-based DAL interfaces
• Status
  – Draft V0.9 query interface reviewed in Kyoto (May 05)
  – Revisions in progress; draft PR targeted for Madrid (Oct 05)
  – Much work has been done on the data models; however, they are still being revised
  – Some initial prototypes already exist (services, client apps)
• IVOA/Madrid discussions will be held immediately after ADASS and are open to all

Basic Usage

• SSA specification may be complex, but basic usage is simple
• Simple query
  – POS, SIZE, FORMAT - like cone search, SIA
  – Possibly refined by spectral or time bandpass, etc.
  – Most metadata in query response is optional
• Data retrieval
  – Simple retrieval is again URL-based
  – Get back a dataset "document" (VOTable, FITS, JPEG, etc.)
  – In simplest case could be wavelength, flux as text (for Spectrum)
  – Pass-through of external data is permitted
• Data Analysis
  – Standard data model isolates application from quirks of external project data
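
For the simplest case mentioned above, wavelength and flux as text, a client needs very little code. The sketch below assumes one sample per line with whitespace or comma separation and `#` comments; that layout is an assumption of this example, not something the slides define.

```python
def parse_text_spectrum(text):
    """Parse 'wavelength flux' lines into two parallel lists of floats."""
    wavelengths, fluxes = [], []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.replace(",", " ").split()
        if len(fields) < 2:
            raise ValueError(f"line {line_number}: expected wavelength and flux")
        wavelengths.append(float(fields[0]))
        fluxes.append(float(fields[1]))
    return wavelengths, fluxes


# Made-up sample; units are whatever the service declares.
sample = """# wavelength flux
4000.0 1.25
4001.0, 1.5
"""
wave, flux = parse_text_spectrum(sample)
```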

Concepts - Dataset-oriented

• Data object type
  – Spectrum, TimeSeries, SED
• Dataset creation type
  – Atlas      Whole datasets, uniform survey data
  – Pointed    Whole datasets, variable instrumental data
  – Cutout     Subset, data samples are not modified
  – Resampled  Subset, data samples computed by service
• Dataset derivation
  – Observed   An observation
  – Composite  Combination of several observations
  – Simulated  Simulated observation made from real data
  – Synthetic  Data from a theoretical model

Data Models

• Data models used in SSA
  – Spectral data     Spectrum, TimeSeries, SED
  – Dataset           Generic dataset descriptor
  – Target            Astronomical target observed
  – Curation          Origin of data
  – Characterization  Physical characteristics of data
  – Provenance        Instrument which generated the data
• User defined data models
  – Metadata extension mechanisms
    • additional data model attributes (table fields)
    • additional resources in VOTable, linked back to main table
  – Provide a mechanism to "subclass" dataset to tailor it for a given data collection

Spectral Data (SED)

(Figure; labels: spectrum segment, photometry point)

Spectral/SED Data Model

Query Interface

• Mandatory query parameters
  – POS     RA, DEC (ICRS)
  – SIZE    diameter (decimal degrees)
  – TIME    date1,date2 (epoch in decimal years UTC)
  – BAND    wave1,wave2 (meters in vacuum; source or observer)
  – FORMAT  VOTable, fits, xml, text, graphics, html, external
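
Since BAND is given in meters, a client working in Angstroms has to convert before querying. The following sketch shows that conversion alongside the other mandatory parameters of the draft interface; the function name, the `votable` format token, and the `http://base-url` placeholder are assumptions of this example, and the draft parameter set was still under revision.

```python
from urllib.parse import urlencode

ANGSTROM_IN_METERS = 1e-10


def ssa_query_url(base_url, ra_deg, dec_deg, diameter_deg,
                  years, band_angstrom, data_format="votable"):
    """Build a draft-style SSA query URL; BAND is converted to meters."""
    year1, year2 = years
    wave1, wave2 = (w * ANGSTROM_IN_METERS for w in band_angstrom)
    if wave1 > wave2 or year1 > year2:
        raise ValueError("ranges must be given as (low, high)")
    params = {
        "POS": f"{ra_deg},{dec_deg}",
        "SIZE": diameter_deg,
        "TIME": f"{year1},{year2}",
        # Six significant digits is plenty for a bandpass limit.
        "BAND": f"{wave1:.6g},{wave2:.6g}",
        "FORMAT": data_format,
    }
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params, safe=",")


ssa_url = ssa_query_url("http://base-url", 12.0, 0.0, 0.2,
                        (2000.0, 2005.5), (4000, 7000))
```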

Query Interface

• Recommended query parameters
  – APERTURE    approx spatial resolution (decimal degrees)
  – SPECRES     spectral resolution (meters)
  – TOP         number of top-ranked records to return
  – OBJTYPE     mandatory if service returns multiple object types
  – COLLECTION  data collection identifier

Query Interface

• Optional parameters
  – CREATORID    creator-assigned dataset identifier (at most 1)
  – PUBID        publisher-assigned dataset identifier (at most N)
  – COMPRESS     enable compression (for both data _and_ queries?)
  – SNR          signal-to-noise ratio
  – REDSHIFT     redshift range (dlambda/lambda)
  – TARGETCLASS  star, galaxy, pulsar, PN, QSO, AGN, etc.

Query Response

• Classes of query metadata
  – Query metadata       Describes the query itself
  – Dataset metadata     Describes data object; object-specific
  – Target metadata      Astronomical target
  – Curation metadata    External identification of dataset
  – Characterization     Coverage, Accuracy, Frame, etc.
  – Instrument metadata  Service-defined; hard to standardize
  – Access metadata      Describes how to access the dataset

Query Response

• Query Metadata
  – Query.Score     How well object matches query
  – Query.LName     Logical name (identifier)
  – Query.LNameKey  Logical name key (id-ref)
    • Example: LName="MyObj123" LNameKey="server,format"
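
One way a client might use these fields is to rank records by score and to gather records that share a logical name. The sketch assumes records are dictionaries keyed by the attribute names above and that a higher Query.Score means a better match; the slides do not state the direction of the score, so treat that as an assumption.

```python
from collections import defaultdict


def top_ranked(records, top=None):
    """Sort records by Query.Score (assumed: higher is better), keep the first `top`."""
    ranked = sorted(records, key=lambda r: float(r["Query.Score"]), reverse=True)
    return ranked if top is None else ranked[:top]


def group_by_lname(records):
    """Group records that share a Query.LName, e.g. one dataset in several formats."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("Query.LName")].append(record)
    return dict(groups)


# Made-up records for illustration only.
records = [
    {"Query.Score": "0.4", "Query.LName": "MyObj123", "format": "fits"},
    {"Query.Score": "0.9", "Query.LName": "MyObj456", "format": "votable"},
    {"Query.Score": "0.7", "Query.LName": "MyObj123", "format": "votable"},
]
best_two = top_ranked(records, top=2)
groups = group_by_lname(records)
```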

Query Response

• Dataset Metadata
  – Dataset.Type              Spectrum, TimeSeries, SED, etc.
  – Dataset.DataModel         DM name, e.g., "SSA-V0.90"
  – Dataset.Title             Brief descriptive title of dataset
  – Dataset.SSA.NSamples      Total samples in dataset
  – Dataset.SSA.Aperture      Characteristic aperture diameter
  – Dataset.SSA.TimeAxis      TimeCoord axis (external data)
  – Dataset.SSA.SpectralAxis  SpectralCoord axis (external data)
  – Dataset.SSA.FluxAxis      Flux axis (external data)
  – Dataset.CreationType      atlas, pointed, cutout, resampled
  – Dataset.Derivation        observed, composite, simulated, synthetic

Query Response

• Target Metadata
  – Target.Name           Name of astronomical object
  – Target.Class          Target class (star, galaxy, QSO, etc.)
  – Target.SpectralClass  Spectral class (e.g., 'O', 'B', etc.)
  – Target.Redshift       Nominal redshift for object
  – Derived.VarAmpl       Variability amplitude (fraction 0-1)
  – Derived.SNR           Observed signal to noise ratio

Query Response

• Curation Metadata
  – Curation.Collection   Data collection name (identifier)
  – Curation.Creator      Creator identity (identifier)
  – Curation.CreatorID    Creator-assigned dataset identifier
  – Curation.PublisherID  Publisher-assigned dataset identifier
  – Curation.Date         Dataset creation date (ISO date string)
  – Curation.Version      Dataset version (within same ID)

Query Response

• Characterization 1 - Coverage
  – .Location.Spatial          Position (e.g., RA, DEC)
  – .Location.Time             Observation time characteristic value
  – .Location.Spectral         Spectral bandpass characteristic value
  – .Location.Spectral.BandID  Bandpass ID (band or filter name)
  – .Bounds.Spatial            Aperture footprint (polygon on sky)
  – .Bounds.Time               Low/High time values
  – .Bounds.Spectral           Low/High spectral values
  – .Bounds.Flux               Limiting flux, saturation limit (Jansky)
  – .Fill.Spatial              Spatial sampling filling factor (0-1)
  – .Fill.Time                 Time sampling filling factor (0-1)
  – .Fill.Spectral             Spectral sampling filling factor (0-1)
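
The low/high bounds make it easy for a client to check whether a dataset covers a bandpass of interest. The sketch below is a plain interval-overlap test; it assumes each record stores its spectral bounds as a (low, high) pair in meters under the key `Bounds.Spectral`, which is a representation chosen for this example.

```python
def bands_overlap(bounds, band):
    """True if two (low, high) intervals share at least one point."""
    low1, high1 = sorted(bounds)
    low2, high2 = sorted(band)
    return low1 <= high2 and low2 <= high1


def covering_records(records, band):
    """Keep records whose spectral bounds overlap the requested band."""
    return [r for r in records if bands_overlap(r["Bounds.Spectral"], band)]


# Made-up records; bounds are in meters.
records = [
    {"title": "optical", "Bounds.Spectral": (4.0e-7, 7.0e-7)},
    {"title": "near-IR", "Bounds.Spectral": (1.0e-6, 2.5e-6)},
]
hits = covering_records(records, (6.0e-7, 9.0e-7))
```

The same test applies unchanged to Bounds.Time with low/high time values.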

Query Response

• Characterization 2 - Accuracy
  – Accuracy.*.Calibrated  uncalibrated, relative, absolute
  – Accuracy.*.Resolution  Resolution of measured signal
  – Accuracy.*.StatErr     Statistical error (measured)
  – Accuracy.*.SysErr      Systematic error (estimated)

('*' = Spatial, Time, Spectral, Flux)

Query Response

• Characterization 3 - Reference Frames
  – Frame.Spatial.Type     Coordinate frame (default ICRS)
  – Frame.Spatial.Equinox  Coordinate system equinox (J2000)
  – Frame.Time.System      Timescale (TT)
  – Frame.Time.SIDim       SI factor and dimension
  – Frame.Spectral.SIDim   SI factor and dimension
  – Frame.Flux.SIDim       SI factor and dimension
  – Frame.Flux.UCD         UCD of flux value (flux type)

(These apply only to the query response)
(SIDim metadata still under construction)

Query Response

• Instrument Metadata
  – Instrument.Name      Instrument name (identifier)
  – Instrument.Exposure  Total exposure time (seconds)
  – Instrument.<other>   Service-defined
• Notes
  – Optional; provided for instrumental data collections
  – In general, Collection, Bounds.Time, etc. are preferred
  – In general Instrument metadata is service-defined
  – Use Observation model as a starting point

Query Response

• Access Metadata
  – Access.Reference  Data access URL
  – Access.Format     MIME type of returned dataset
  – Access.Size       Approximate dataset size (bytes)
  – Access.Server     Server endpoint URL
• Staging support goes here in the future
  – e.g., will dataset access require asynchronous staging
  – estimated cost to construct dataset

Service Metadata

• Usage
  – Describe service type and capabilities
  – Characterize service (data resources served, coverage, etc.)
  – Describe interface (optional query parameters)
• Interface
  – Requires new service metadata query method
  – Returns resource metadata descriptor (XML)
• Format
  – Registry resource descriptor (XML)

Data Retrieval

• Based on GET as with SIA
  – Variety of formats available
  – Compression supported
• Data representation
  – Data model defines logical content of data
  – The same data object may be represented in various formats
  – Hence we need to specify both the data model, and the file format

Data Retrieval

• Data models
  – SSA data model for fully-compliant data
  – Provider-defined data model for external data
• Data formats
  – VOTable (a container), native XML (direct serialization)
  – FITS binary table (another container; uses FITS spectral WCS)
  – Text, e.g., CSV
  – Graphics (JPEG etc.)
  – text/html (rendered into browser page)