tap meeting, jhu nov 19-20 20071 ivoa data access layer table access protocol doug tody (nrao/nvo )...

25
TAP Meeting, JHU Nov 19-20 2007 1 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO) INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCE US National Virtual Observatory

Upload: brianna-short

Post on 27-Mar-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 1

IVOA Data Access LayerTable Access Protocol

Doug Tody (NRAO/NVO)

INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCEUS National Virtual Observatory

Page 2: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 2

TAP Key Issues

• Brief TAP History• Some TAP Use-Cases• Key Requirements/Issues

Page 3: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 3

TAP Perspective

• History– TAP discussions began in one year ago in Moscow– VOQL-TEG discussions established overall architecture

• Main Documents– "Recommendations for a Table Access Protocol" (Sept06)

• Plante, Budavari, Greene, Good, McGlynn, M.NS, Szalay, Williams

– "Homogeneous Access to Tabular Data" (March-May 2007)• Stebe, Andrews, Rixon, Ortiz

– "Table Access Protocol Design Analysis" (Sept07)• Tody, Dowler, et.al. (mainly based upon NVO, DAL discussions)

Page 4: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 4

TAP Perspective

• Evolution of TAP Concept– Two IVOA Notes describe largely similar interfaces– ADQL and "simple" queries– Metadata query capabilities– GET/POST interface, optional data staging, async, etc.– VOTable, CSV/TSV, XML, etc. as output

• Spring-Fall Evolution– More consistent with other DAL interfaces– Unified access to both table data and metadata

• Simplied interface as a result (two main operations vs 8 or so)• Revised metadata query mechanism more easily extended

– Added support for multi-region queries, VOSpace, schemas, etc.

• Look back to Moscow (Plante et.al.)– Very similar in concept to what we have now, except for REGION

Page 5: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 5

Primary TAP Use-Cases

• ADQL-Based Table Query– Large queries (requires async, vospace, authentication)– Multi-table operations (join etc.)– Multi-region queries including user table upload– Advanced ADQL/SQL capabilities (region, cross-match,

functions)

• Notes– Required to support cross-match portal, advanced apps– Full functionality ultimately required (everything above)– Simple synchronous GET version still useful though

Page 6: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 6

Primary TAP Use-Cases

• Simple Table/Catalog Query– Filter-type operation upon a single table– Most basic astronomical catalog access is of this type– ADQL, async useful but not required for most simple queries– Not necessarily limited to small queries; can stream output

• Notes– Probably sufficient for most small data providers– Cone search is no longer an option, too limited

Page 7: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 7

Primary TAP Use-Cases

• Table Metadata Query– Metadata describing stored data is also data

• Can be virtual/computed OTF, subsetted, transformed, etc.• Queries against both data/metadata required at the service level• Service metadata - capabilities etc - is a separate issue

– Typical use case:• Client application queries TAP service for available data• Tables, table columns, relationships, etc.

• Notes– Basic metadata model should be simple– Extensibility required for advanced query support

Page 8: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 8

Primary TAP Use-Cases

• Data Access Query– This is "ADQL integration into DAL"– ADQL query against a DAL data model (complex data etc.)

• Notes– Less critical than others, but important in the longer term– Main requirement is consistency with other DAL interfaces

Page 9: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 9

Key Requirements/Issues

• ADQL and Grid capabilities

– Motivation• Required for portals, large computations, advanced applications• Needs everything: ADQL, multi-region, async, vospace, SSO, etc.

– Issues or Options• Not controversial: everyone agrees we need this• Not required however for basic usage (most usage)• Complex; will take time to prototype, specify, standardize• Unrealistic to expect community implementation w/o frameworks

Page 10: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 10

Key Requirements/Issues

• Simple Query capability

– Motivation• Provide a simple, basic table access capability (90/10 rule)• Probably needed anyway for simple table metadata queries• Adequate for most simple filter-type queries of single table• Supplants cone search; much more powerful but still simple• Provide robust implementation while we develop advanced stuff

– Notes• Provides minimal-TAP option for small data providers• Avoid splitting community; first step toward full TAP service• Sufficient for publishing a few typical catalogs

Page 11: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 11

Key Requirements/Issues

• TAP Information Schema

– Motivation• Provide uniform access to both table data and metadata• Same query/access interface re-used for both• Supports virtual data, dynamic queries, format options, etc• Easily extended without changing interface• Don't do one thing now, another later• Strategy: adopt registry tableset model for initial schema

– Alternatives• Dataless VOTable headers• Return static registry-compliant XML

Page 12: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 12

Key Requirements/Issues

• Proposed Core TAP Schema (Common with registry)

– TAP_SCHEMA.tables• name [[catalog.]schema.]table• type base table, view, output, etc.• description table description

– TAP_SCHEMA.columns• name column name• tableName table name• description column description• unit unit in VO standard format• ucd UCD if any• utype UTYPE if any• dataType dataType as in VOTable/registry• arrayShape array "shape"/size as in VOTable/registry• std standard column (else custom addition)

Page 13: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 13

Key Requirements/Issues

• Possible Examples of TAP Schema Usage– List Tables

request=SimpleQuery&FROM=TAP_SCHEMA.tables

– List Table Columnsrequest=SimpleQuery&FROM=TAP_SCHEMA.columns&WHERE=tableName,

foo

– Registry Compliant XML TableSetrequest=SimpleQuery&FROM=TAP_SCHEMA.tableset&FORMAT=xml

Page 14: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 14

Key Requirements/Issues

• Consistency of DAL Interfaces– Motivations

• Family of services: generic dataset, catalog, image, spectrum, etc.– Consistency promotes ease of use, code reuse

• Multi-operation GET/POST, error returns, versioning,– parameter syntax and semantics, output formats, UTYPE, etc.

• Related issue is consistency with UWS, VOSI, VOSpace, SSO, etc.

– Suggested Guideline• TAP should conform to DAL interface concept, form and semantics

except where there is good justification to do otherwise.

Page 15: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 15

Page 16: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 16

TAP Design Study

• History– Based upon work done by ESAC/VOQL-TEG and DAL WG in spring

2007– Also NVO tiger team, SkyNode experience, data center experience

• TAP Design Goals– Provide capability for ADQL queries to support advanced analysis– Define minimal implementation

• for small data provider, common queries• replace legacy cone search with more general facility

– Both data access and metadata access supported natively by service – Provide for scalability, in particular multi-position queries– Support Grid capabilities, i.e, async, staging, authentication– TAP should be consistent with other DAL interfaces where possible– Provide registry integration for automated service discovery

Page 17: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 17

TAP Interface Summary

• Form of interface– HTTP GET/POST based (other protocols possible, e.g. SOAP,

CEA)– Multiple output formats (VOTable, CSV/TSV, XML, VOSpace,

etc.)

• Operations– AdqlQuery ADQL-based queries, full functionality– SimpleQuery Simple data queries, metadata queries– GetCapabilities Return metadata describing the service– GetAvailability Monitor runtime service function and

health

Page 18: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 18

AdqlQuery Operation

• Scope and Form of Interface– General capability for ADQL-based queries– Both GET and POST versions are required

• GET is synchronous, indempotent, simple, RESTful• POST required for async, staging, large queries

– Semantics, e.g., parameters, identical for both versions– ADQL query is URL-encoded so use in GET is not a problem

• Parameters– QUERY The query string (ADQL; URL-encoded)– FORMAT Output data format (VOTable, CSV, XML, etc.)– <staging> Only used in POST version; for VOSpace – <async> Only used in POST version; for driving UWS – MAXREC Maximum records in the output table– RUNID Pass-through; used for logging

(others TBD)

Page 19: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 19

AdqlQuery Operation

• Field Names, UTYPE and UCD– Suggest this be done at level of field rather than by operation– Literal field names directly access database table– A UTYPE reference resolves into a literal table field name

• e.g., “ssa:Target.Name” resolves to table field “TargetName”

– UTYPE (in this context) is a special case of UTYPE ("ucd:")

• Field name resolution– Both literal and UTYPE/UCD field names resolve to table field– All queries evaluated equivalently after field name resolution– Data models, at the level of TAP, involve only mappings– UFI can automate this, or it can be done client side

Page 20: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 20

AdqlQuery Operation

• Multi-Position Queries– AKA multi-cone search; but doesn't have to be limited to position– Common use-case involves user source list with thousands of positions– Required for scalability to reduce operation overhead

• How It Works– Uses ADQL, REGION, POST form of operation– VOTable used to upload source table (ID, POS, SIZE, etc.)

• other fields are passed through to output• output is tagged by source ID• can be generalized to any input parameter, not just position

– POST (e.g., multipart/form-data) used to upload params, VOTable– Parameters are common to both GET and POST forms

• Data Scoping– Query, Local (DBMS), and VOSpace (Net) tables are equivalent– POST is a Query space table

Page 21: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 21

SimpleQuery Operation

• Scope and Form of Interface– Provides capability for simple non-ADQL queries– Used for both data queries and metadata queries (like ADQL/SQL)– Only a synchronous GET version is required– Only a single table is queried at a time

• Motivation– Simple to implement, easy to use– >90% of actual catalog queries are simple filters of a single table– We need something like this anyway for simple metadata queries

• but why limit it to only metadata?– Small data providers publish a few simple catalogs– Simpler to implement, likely to be more robust implementation

Page 22: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 22

SimpleQuery Operation

• Parameters– SELECT Table fields to be returned (default all)– FROM The table (or view) to be accessed– WHERE A filter to be applied to the table (default

none)– POS,SIZE Find data only in this spatial region– FORMAT Output data format

– MAXREC Maximum records out– RUNID Pass-through for logging

(etc)

• Provides– Simplified SQL-lite query (90/10 rule)– Both data and metadata queries– Simple cone search capability

Page 23: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 23

SimpleQuery Operation

• Metadata Queries– Information Schema concept

• great concept; definition/implementation imperfect• but it is a standard, widely (but not completely) implemented

– Concept• represent database/table metadata as data tables (views)• allows use of standard data table interface to query metadata• easily extensible without changing service interface• views can be used for things such as registry view

– Examples• FROM=SCHEMA.tables • FROM=SCHEMA.columns&WHERE=tableName,foo• FROM=SCHEMA.columns&WHERE=tableName,foo&FORMAT=xml

Page 24: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 24

Simple Cone Search

• Approach– Integrate into SimpleQuery to allow additional constraints

• would probably be too ambitious in a separate SCS standard– Re-use common DAL position syntax (POS, SIZE)

• extensible in terms of region type and spatial frame– UTYPE/UCD field syntax allows data models to be used– Table to be queried is specified with FROM– ADQL,REGION provides an advanced alternative with common

semantics

• Examples– REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2– REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2&WHERE=flu

x,5/

Page 25: TAP Meeting, JHU Nov 19-20 20071 IVOA Data Access Layer Table Access Protocol Doug Tody (NRAO/NVO ) I NTERNATIONAL V IRTUAL O BSERVATORY A LLIANCE US National

TAP Meeting, JHU Nov 19-20 2007 25

Minimal TAP Service

• Requirements– Implements SimpleQuery operation

• possibly getCapabilities and getAvailability as well?

– Provides basic data query capability– Provides basic metadata query capability (tables, columns)– No ADQL support required (but may use SQL back end)– No UTYPE support required