David Schade
Canadian Astronomy Data Centre
Data Services Curation Practice and User Metrics
Services Assessment (Schade & Arviset 2006-2007)
In 2006 the Executive Committee agreed that we produce a survey to assess the state of implementation of services:
– Registries
– SIAP
– ConeSearch
How widely are these implemented?
Are these services compliant?
The report was delivered at Interop 2007 and discussed by Exec Comm
Simple Image Access: 128 Implementations
ConeSearch: 441 Implementations
– 81% Fully compliant
– 19% Not compliant
SSAP
– Early implementations
– Usually not fully compliant
SLAP
– 4 v0.5 prototype implementations
– All fully compliant
IVOA Registries: harvesting and keyword search
Most were found to be mostly compliant with the pre-v1.0 specification and were in the process of upgrading in 2006-2007.
2009 Simple Image Access: 207 Implementations
– 100 fully compliant
– 73 mostly compliant
– 34 not compliant
– (found problems with our own SIA services at CADC)
• Registries were very difficult to use to find all SIA services
• Services that have been de-registered still appear
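As a quick arithmetic check on the 2009 Simple Image Access counts above, the following Python sketch converts them into shares of the 207 implementations. The variable and function names are illustrative assumptions and are not part of the original survey.

```python
# Counts reported for the 2009 Simple Image Access survey.
counts = {
    "fully compliant": 100,
    "mostly compliant": 73,
    "not compliant": 34,
}


def compliance_shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total = sum(counts.values())  # should match the 207 implementations reported
shares = compliance_shares(counts)

for name, pct in shares.items():
    print(f"{name}: {counts[name]} of {total} ({pct}%)")
```

The three categories sum to 207, so fewer than half of the surveyed SIA implementations (about 48%) were fully compliant.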
Recommendations of 2007 report
– The goals of implementation should be Full Compliance, not less
– Non-compliant services should be flagged in the registries
– IVOA needs to release stable versions of protocols in clear and unambiguous language
– Automated curation tools should be created for VO services
– Issues of backward compatibility or a supported portability path should be considered by WGs
– IVOA needs an ongoing effort to assess progress toward wide implementation of fully compliant services
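To illustrate what an automated curation tool might look like, here is a minimal Python sketch that tests one requirement of the Simple Cone Search standard: the returned VOTable must contain exactly one FIELD for each of the UCDs ID_MAIN, POS_EQ_RA_MAIN, and POS_EQ_DEC_MAIN. This is not the method used in the survey. The function names and the sample documents are hypothetical, and a real validator would have to check many more requirements.

```python
import xml.etree.ElementTree as ET

# UCDs that the Simple Cone Search standard requires in the response table.
REQUIRED_UCDS = ("ID_MAIN", "POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN")


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}FIELD'."""
    return tag.rsplit("}", 1)[-1]


def check_cone_search_response(votable_xml):
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(votable_xml)
    except ET.ParseError as exc:
        return [f"response is not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "VOTABLE":
        problems.append("root element is not VOTABLE")

    # Collect the ucd attribute of every FIELD element in the document.
    ucds = [el.get("ucd") for el in root.iter() if local_name(el.tag) == "FIELD"]
    for ucd in REQUIRED_UCDS:
        n = ucds.count(ucd)
        if n != 1:
            problems.append(f"expected exactly one FIELD with ucd={ucd}, found {n}")
    return problems


# Hypothetical sample responses, for illustration only.
GOOD = """<VOTABLE version="1.1"><RESOURCE><TABLE>
<FIELD name="id" ucd="ID_MAIN" datatype="char" arraysize="*"/>
<FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
<FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
</TABLE></RESOURCE></VOTABLE>"""

BAD = """<VOTABLE version="1.1"><RESOURCE><TABLE>
<FIELD name="id" ucd="ID_MAIN" datatype="char" arraysize="*"/>
<FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
</TABLE></RESOURCE></VOTABLE>"""

print(check_cone_search_response(GOOD))
print(check_cone_search_response(BAD))
```

A tool along these lines could run periodically against registered services and feed the results back to the registries, which is one way to act on the recommendation to flag non-compliant services.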
Metrics of Success
The IVOA will be demonstrated to be a success when thousands of astronomers use services based on IVOA-developed standards as part of their daily work
The IVOA should develop a strategy to measure its progress toward success by measuring the growth of service usage by astronomers
Usage statistics provide feedback mechanism into the development process
Proposal for VO Implementation Working Group
CADC Statistics:
“VO” Access (SIA) < 1% of total usage by file/volume
“Programmatic” access is > 80% by volume
We need more high-quality collections in SIA
18% of users tried SIA
150,000 SIA queries in 2008
CFHT data (MegaPrime Camera):
96.4 Terabytes delivered in 2008
Among “normal” users of CFHT Legacy Survey data:
– Raw data was 1.7% of the data downloads
– Processed data was 98.3%
>>>> Advanced Data Products are wanted
Among PI users of CFHT:
– Raw data was 37.7% of the data downloads
– Processed data was 62.3%
>>>> Different behaviour
CADC Data Flows (last week)
Metrics of Success
• To establish an inclusive international group that will:
1) Monitor, evaluate, and report on implementations of VO standards and protocols
2) Compile statistics on usage of VO-based services and report back to Exec.
3) Produce an annual status report on VO implementation of IVOA standards (and other related developments)
4) Advise Exec on how TCG and developer effort may be better focused on the most important problems.
5) Produce proposals to the Executive Committee on specific actions to improve the success of the IVOA
• Deliver 2 Terabytes per week to users
• 2500 distinct users
• 87 countries
• Serve all Canadian astronomy research universities
• Largest astronomy data centre in the world (by data volume)
CADC: HST, CFHT, JCMT, CGPS, FUSE, MOST, Gemini, BLAST, MACHO
Strategic Goals
• Strengthen engagement between CADC and the university community
• Make the grid a useful facility for observational astronomy
Risks
• Closer engagement with university community brings risks to CADC and to science teams
• The Grid (Compute Canada) will find it difficult to adapt to Observational Astronomy
• CANFAR will find it hard to adapt to the Grid
• CANFAR-CANARIE interaction is challenging.