vamdc tutorial for prospective data-providers guy rixon sup@vamdc meeting, ipr, november 2013

47
data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

Upload: steven-nicholson

Post on 30-Dec-2015

220 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

VAMDC tutorialfor prospectivedata-providers

Guy RixonSUP@VAMDC meeting, IPR, November 2013

Page 2: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

2

• VAMDC orientation

• Introduction to node building (technical)

• Self-paced investigation with on-line tutorial material

Agenda

Page 3: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

3

Orientation

Page 4: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

4

Query

Extract

VAMDC is in the cloud

Page 5: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

5

A flock of databases

Science applicationScience applicationScience applicationScience application

VAMDCdata

nodes

Page 6: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

6

For list of databases see:

http://portal.vamdc.eu/vamdc_portal/nodes.seam

Page 7: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

7

Two-stage selection

SelectSelectSelectSelect

XSAMS

Filter and Filter and extractextract

Filter and Filter and extractextract

Science code

OR

Page 8: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

8

XSAMS

• XML Schema for Atoms, Molecules and Solids

• IAEA originally; developed by VAMDC

• Rich ⇒ good for transforming to other formats

• See http://www.vamdc.eu/documents/standards/dataModel/vamdcxsams/index.html

Page 9: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

9

E.g. XSAMS for phys. chem.

Page 10: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

10

Many UIs

web sites VAMDC web-portal

scripts applicationsYour software

here

Page 11: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

11

Some UIs and applications

• VAMDC web portal - the starting point

• SpectCol - combine spectroscopy and collisions

• Specview - STScI’s spectrum viewer with VAMDC support

• Query Builder - app to generate queries for scripting

• VAMDC as IVOA PDL service - astronomy integration

• Taverna - workflow engine with VAMDC plug-in

• Selection of Python scripts from VAMDC

• Various XSAMS-processing web-services

Page 12: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

12

Finding things: registry

ApplicationApplicationApplicationApplication

Registry web-Registry web-serviceservice

Registry web-Registry web-serviceservice Data nodeData nodeData nodeData node1: Register

2: Discover 3: Query

Avoids hard-coding addresses: data nodes may move

Page 13: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

13

VAMDC web portal: query

Page 14: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

14

VAMDC web-portal: results

Page 15: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

15

VAMDC web-portal: display

Page 16: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

16

Portal, nodes & processors

PortalPortalPortalPortal

Data nodeData nodeData nodeData nodeData nodeData nodeData nodeData nodeData nodeData nodeData nodeData nodeProcessorProcessorProcessorProcessor

XSAMSXSAMSRegistryRegistryRegistryRegistry

http://portal.vamdc.eu/

http://registry.vamdc.eu/Independent services,

can be used with any UI

Page 17: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

17

SpectCol application

Implements the original use case for matching spectroscopic and collisional data

See http://www.vamdc.eu/software

Page 18: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

18

Specview application

This query UI available as a Java library

Line IDs forastronomy:

VAMDC data added to existing

application

See http://www.vamdc.eu/software

Page 19: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

19

Introduction tonode building

Page 20: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

20

Options for data providers

• Publish your data into VAMDC by:

• adding your data to an existing node, or

• build a new node around your data and run it, and

• host the node at your site or

• have the node hosted at another VAMDC site

Page 21: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

21

Database → data node

Science application (e.g. portal)Science application (e.g. portal)Science application (e.g. portal)Science application (e.g. portal)

VAMDCdata

nodes

Page 22: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

22

====

Web serverWeb server

Node softwareNode software

DatabaseDatabaseWeb service,

runs in web server

Parts of a data node

VAMDCstandard

Databasespecific

Page 23: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

23

Qualifying as a data node

• A web service is a VAMDC node if it:

• implements the VAMDC-TAP protocol

• is publicly visible on port 80

• is registered in the VAMDC registry

• actually emits data

• (No constraint on how you achieve that)

Page 24: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

24

VAMDC-TAP

• VAMDC Table Access Protocol

• Based on IVOA Table Access Protocol

• Specifies a facade for queries to DB via web service

• Also ancillary interfaces for registration, availability checks

Science application (e.g. Science application (e.g. portal)portal)

Science application (e.g. Science application (e.g. portal)portal)

Synchronous query

Capabilities

Availability

Page 25: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

25

VAMDC-TAP: queryhttp://some.server/tap/sync?

LANG=VSS2&FORMAT=XSAMS&QUERY=SELECT+*+...

• Synchronous-query URL within TAP service

• HTTP HEAD → statistics, no data raised

• HTTP GET, POST → data raised, returned in HTTP response

• Query language, result format variable

• VAMDC requires VSS2 and XSAMS

• could add others

Page 26: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

26

VSS2 query-language

• VAMDC SQL Sub-set #2

• ANSI SQL with much of the detail excluded

• E.g. SELECT * WHERE collider.AtomSymbol=’He’ AND target.MoleculeInchiKey=...

• Operates on a virtual, single table with columns defined by VAMDC dictionary

• See http://www.vamdc.eu/documents/standards/queryLanguage/index.html

Page 27: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

27

VAMDC dictionary

• Lists, defines:

• RESTRICTABLES: columns to constraints in query

• RETURNABLES: columns that can be in the results

• REQUESTABLES: columns/structures desired in results

• See http://dictionary.vamdc.eu/

Page 28: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

28

VAMDC-TAP: capabilities

• Describes service interfaces in a form that the registry understands

• Responds to HTTP GET

• XML document

• Capability for VAMDC-TAP lists:

• version of standards

• version of software

• search terms supported in query

• sample queries

• E.g. http://ag02.ast.cam.ac.uk/chianti/tap/capabilities

http://some.server/tap/capabilities

Page 29: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

29

VAMDC-TAP: availability

• Check that web-service is up

• XML document (XSL for browser display)

• E.g. http://ag02.ast.cam.ac.uk/chianti/tap/availability

http://some.server/tap/availability

Page 30: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

30

MySQL

Django

Customcode

for DB

Pythonhttpd/WSGI

Commonnode

software

From VAMDC;code in GitHub;docs on vamdc.eu site

Readily available;usually as optionalpackage in Linuxdistro

You write this bit

See http://www.vamdc.eu/software

VAMDC standard “stack” for node

Page 31: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

31

What does Django do?• Represents DB tables as “Model” objects

• Handles joins

• E.g. http://ag02.ast.cam.ac.uk/tutorials/_downloads/models.py

• Represents queries as “Q” objects

• E.g. http://ag02.ast.cam.ac.uk/tutorials/_downloads/queryfunc.py

• Represents query results as “query-set objects”

• Cursors on DB query

• ⇒ lazy evaluation

• E.g. http://ag02.ast.cam.ac.uk/tutorials/_downloads/queryfunc.py

Page 32: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

32

Extract VSS2Extract VSS2from URLfrom URL

Extract VSS2Extract VSS2from URLfrom URL

VSS2 →VSS2 →Django QDjango QVSS2 →VSS2 →

Django QDjango Q

Apply Q →Apply Q →main result setmain result set

Apply Q →Apply Q →main result setmain result set

Derive Derive subsidiary subsidiary result-setsresult-sets

Derive Derive subsidiary subsidiary result-setsresult-sets

Truncate main Truncate main result-setresult-set

Truncate main Truncate main result-setresult-set

Processing a query

Extract statsExtract statsExtract statsExtract stats

Generate and Generate and stream XMLstream XML

Generate and Generate and stream XMLstream XML

Nest Nest result-setsresult-sets

Nest Nest result-setsresult-sets

Set HTTP Set HTTP status & headersstatus & headers

Set HTTP Set HTTP status & headersstatus & headers

Common part

Common partusing dictionaryfrom custom part

Custom part

Custom part

Custom part

Custom part

Custom part

Common part

Common part,using dictionaryfrom custom

part

Page 33: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

33

Therefore, you write:

• models.py: define table structure to Django

• dictionaries.py: define mappings to VAMDC

• RESTRICTABLES: VSS2 → Django Q

• RETURNABLES: Django query-set → XSAMS

• queryfunc.py: implement query flow as per previous slide

Page 34: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

34

Pause to digest that information...

...possibly reviewing examples:• http://ag02.ast.cam.ac.uk/tutorials/_downloads/qu

eryfunc.py

• http://ag02.ast.cam.ac.uk/tutorials/_downloads/models.py

• http://ag02.ast.cam.ac.uk/tutorials/_downloads/dictionaries.py

Page 35: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

35

Design sequence

• Choose DB tables; define as Django models

• Design query strategy (queryfunc.py) for models

• Choose search terms (RESTRICTABLES dictionary)

• “Wire up” models to XSAMS (RETURNABLES dictionary)

• Test; iterate, refine

Page 36: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

36

Database ingestion

• Node software doesn’t care how you load data

• MySQL can read either SQL scripts or ASCII files

• ASCII inputs have to match chosen DB-schema

• Node software includes code to re-arrange ASCII files:

• see imptools package: https://github.com/VAMDC/NodeSoftware/tree/master/imptools

• docs at http://www.vamdc.eu/documents/nodesoftware/importing.html

Page 37: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

37

Testing sequence

Define DB

Code Python

modules

Test with internal server + TAP

validator

Deploy on server

Tweakable?

Working?

Re-evaluation

N

Y

N

Y

Test with portal

Working?

Register

NDone

Y

Page 38: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

38

TAP validatorSee http://www.vamdc.eu/software

Enter Capabilities URL for your node in settings page...

Download and run locally

Page 39: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

39

TAP validator (2)

XSAMSresults

Query

Validity report here

Page 40: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

40

Registration, step 1• Go to http://registry.vamdc.eu/ and select production*

registry

• Select “create entry” from side-bar

• Fill out name of service; select “catalog service” type

*or use dev registry for practice: http://casx019-zone1.ast.cam.ac.uk/registry/

Page 41: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

41

Registration, step 2• Fill out “core information” on next form

Page 42: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

42

Registration, step 3

• Select “edit” for this registry entry (use “browse registry” to search for entry if necessary)

• Select “Edit metadata ... by VOSI”

• Paste in the capabilities URL for your node and submit

Page 43: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

43

More information

• Node-software manual: http://www.vamdc.eu/documents/nodesoftware/

• VAMDC standards: http://www.vamdc.eu/standards

• Node-software video tutorials: http://ag02.ast.cam.ac.uk/tutorials/self-study/data-provider-self-study/index.html

Page 44: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

44

Self paced tutorialsusing VAMDC’son-line material

Page 45: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

45

On-line tutorial suitehttp://www.vamdc.eu/usersupport/tutorials

Page 46: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

46

Self-paced tutorials

Page 47: VAMDC tutorial for prospective data-providers Guy Rixon SUP@VAMDC meeting, IPR, November 2013

47

Examples of node building