What Makes a Data Archive Tick: Marrying Content and User Support Steven Worley National Center for Atmospheric Research Computational and Information Systems Laboratory May 17-21, 2010 Summer Institute for Data Curation for Earth and Environmental Science Graduate School of Library and Information Science University of Illinois, Urbana-Champaign

Page 1: What Makes a Data Archive Tick:  Marrying Content and User Support

What Makes a Data Archive Tick: Marrying Content and User Support

Steven Worley
National Center for Atmospheric Research
Computational and Information Systems Laboratory

May 17-21, 2010
Summer Institute for Data Curation for Earth and Environmental Science
Graduate School of Library and Information Science
University of Illinois, Urbana-Champaign

Page 2: What Makes a Data Archive Tick:  Marrying Content and User Support

① How to make and keep the archive content relevant to the users?

② How to engage the users?

Page 3: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Know your users
• Define your focus community
  • Cannot serve everyone
  • Design service not to limit others
• At decision points (e.g. changes in service) ask:
  • “Is this a significant benefit for my users?”
• The case @ NCAR
  • Atmospheric, oceanic, and some related geo-science research
  • Graduate students and higher education
  • NCAR scientists, researchers @ universities with graduate degree programs in meteorology and oceanography
  • Over 50% of 6000+ unique users, annually, are outside focus group

Page 4: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Understand their science: current work and trends
• Attend seminars, symposia, meetings where they present their work
• Corollary: Have science-educated staff
• The case @ NCAR – Research Data Archive
  • All have MS degrees, or greater
    • meteorology (6)
    • oceanography (2)
    • computing science (1)
    • exception – admin. (1)

Page 5: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Understand their science: current work and trends
• Routinely review journals, bulletins, and relevant newsletters
  • Search for science strongly dependent on your data focus
  • Contact authors, offer data sharing service
• @ NCAR

Page 6: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Understand their science: current work and trends
• Develop close contacts with a few key users
  • Seek ‘honest’ opinions about your service
• Make your service known – presentations, publications
• @ NCAR

Page 7: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Know how your users work
• How do they prefer to handle data?
  • Digital files – write and run program codes to evaluate content
  • Digital files – specific formats that are application friendly
    • E.g. netCDF, GIS, WMO ASCII text convenient for worksheets
  • Images of analyses (charts, line graphs, 2D/3D contoured plots)
• @ NCAR
  • Digital files are key
  • Some images for discovery, but not critical
• Design the systems to deliver what users want

Page 8: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Choosing the content
• At decision points (e.g. adding a new dataset) ask:
  • “Can we handle this efficiently?”
  • Does it supplement or extend the central data foci?
  • Does it address a new need or trend?
  • Are the formats aligned with user preferences?
    • If not, can we make a cost effective conversion?
  • Do you have staff (data scientists / stewards) that can understand the scientific content?
• @ NCAR
  • Atmospheric, oceanic, related geo-sciences observations or analyses derived from observations to support climate and weather research.

Page 9: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Choosing the content
• Evaluate user metrics
  • What datasets are most popular?
  • Who is using what – can you distinguish your focus group?
  • Are there any trends?
  • Caution: this is only part of the story
• @ NCAR
  • Our user registration allows us to track this
  • Examples

Page 10: What Makes a Data Archive Tick:  Marrying Content and User Support

Unique Users by service path

Users in four service categories
• MSS to CISL HPC environment
• Web to world-wide community
• Orders – one off consulting assisted data preparation
• TIGGE

6 thousand users annually
• FY09: MSS=266, Web=5649, Orders=196, TIGGE=44

Page 11: What Makes a Data Archive Tick:  Marrying Content and User Support

Amount of data by service path

Users in four service categories
• MSS to CISL HPC environment
• Web to world-wide community
• Orders – one off consulting assisted data preparation
• TIGGE

162 TB in FY09
• FY09 (TB): MSS=31, Web=120, Orders=9, TIGGE=2
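The FY09 figures on the two slides above can be combined to compare service paths. The following is a minimal sketch, not the archive's own reporting code: the function name `summarize` and the 1 TB = 1000 GB conversion are assumptions made for illustration, and only the eight input numbers come from the slides.

```python
# FY09 figures quoted on the "Unique Users" and "Amount of data" slides.
users_by_path = {"MSS": 266, "Web": 5649, "Orders": 196, "TIGGE": 44}
terabytes_by_path = {"MSS": 31, "Web": 120, "Orders": 9, "TIGGE": 2}


def summarize(users, terabytes):
    """Return totals plus, per path, user share, volume share, and GB per user."""
    total_users = sum(users.values())
    total_tb = sum(terabytes.values())
    rows = {}
    for path in users:
        rows[path] = {
            "user_share": users[path] / total_users,
            "volume_share": terabytes[path] / total_tb,
            # Assumption: 1 TB = 1000 GB; the slides do not state the convention.
            "gb_per_user": terabytes[path] * 1000 / users[path],
        }
    return total_users, total_tb, rows


total_users, total_tb, rows = summarize(users_by_path, terabytes_by_path)
for path, r in rows.items():
    print(
        f"{path:7s} users {r['user_share']:6.1%}  "
        f"volume {r['volume_share']:6.1%}  {r['gb_per_user']:8.1f} GB/user"
    )
```

With these inputs the Web path accounts for roughly 92% of users and 74% of volume, while MSS users are about 4% of users but move about 19% of the data. The per-path counts are simply summed here, so a person who used more than one path would be counted once per path.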

Page 12: What Makes a Data Archive Tick:  Marrying Content and User Support


User ranked popular datasets


Top 10 datasets/groups FY09 (~6000 unique users annually)

Unique users | FY09 datasets | Titles
2878 | ds082.0, ds083.2, ds083.0 | NCEP FNL Operational Model Global Tropospheric Analyses
924 | ds090.0 | NCEP/NCAR Global Reanalysis Products
510 | ds758.0, ds759.3, ds759.2 | NGDC Global 2' and 5' Elevations, USGS 30 ARC-second
477 | ds461.0, ds351.0, ds337.0, ds464.0, ds353.4 | NCEP ADP/PREPBUFR Global Surface and Upper Air Observations
358 | ds608.0 | NCEP North American Regional Reanalysis (NARR)
264 | ds609.2 | GCIP NCEP ETA model output
262 | ds540.1, ds540.0 | International Comprehensive Ocean-Atmosphere Data Set (ICOADS)
190 | ds744.4 | QSCAT/NCEP Blended Ocean Winds
173 | ds277.0 | NCEP V2.0 OI Global SST, V3.0 Extended Reconstructed Analyses
153 | ds335.0, ds336.0 | Unidata (IDD) Observations and Model Data
5921 | All Datasets | All DSS datasets

Page 13: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Remain flexible – expect constant change
• Be ready to take opportunities when they come along
  • Re-adjust priorities
  • Resist ‘tight’ mission control
  • Take advice from advisory groups, but don’t depend on them exclusively
  • Use holistic approach
• @ NCAR, an unplanned-for example
  • Arctic System Reanalysis – NSF sponsored research critical to assess the changes happening in the Arctic
  • Need controlled access to first prototype data – We do this!

Page 14: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Sustaining for the long-term
• Richness and data value grow over time
  • Data assets tend to complement each other – add value to many different research questions
  • Scientific publications lead to broader and increased interest
  • Definitive data citation is a work in progress
• Staffing needs to be base/core funded
  • Grant directed funding can lead to a fractured, ad hoc, incomplete archive
  • Can be a major frustration for users
• @ NCAR – the Research Data Archive
  • Began 40+ years ago
  • Today sustained by 9 persons

Page 15: What Makes a Data Archive Tick:  Marrying Content and User Support

How to make and keep the archive content relevant to the users?

Collaborations
• Participate/volunteer for committees and panels that tackle data issues (all sorts)
  • Learn from others, share knowledge
• Share efforts and data with other organizations
  • No one group can do it all (don’t have resources and all expertise required)
• @ NCAR (conf. like SIDC for EES)
  • Volunteerism: NAS, AMS, NOAA, WMO, NASA
  • National and International data agreements with:
    • European Centre for Medium-Range Weather Forecasts
    • Japan Meteorological Agency
    • U.S. National Weather Service, National Centers for Environmental Prediction

Page 16: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Data Discovery – how can people find you?
• All 600+ RDA Datasets have metadata in GCMD
  • Automatically exported via OAI-PMH
  • Similarly: RDA > CDP@NCAR > BADC in UK
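The OAI-PMH export mentioned above can be illustrated from the harvester's side. This is a hedged sketch only: the base URL, the record identifier, and the sample response are hypothetical placeholders, and the function names are invented for illustration. What is standard is the `ListRecords` verb, the `oai_dc` metadata prefix, and the two XML namespaces.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_titles(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        results.append((identifier, title))
    return results


# Hypothetical response body, hand-written for this example.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:ds000.0</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://archive.example.org/oai")  # placeholder endpoint
records = parse_titles(SAMPLE)
print(url)
print(records)
```

A real harvester would fetch the URL over HTTP and follow resumption tokens for long result lists; both are left out to keep the sketch self-contained.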

Page 17: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Design your portal to evolve – it will/should

2002
• Search
• Navigation
• List of menus
• Unique layout of links
• Picture of people

Page 18: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

2008
• Search
  • Two ways
• Navigation
• Links
• News
• Text
• People

Page 19: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Page 20: What Makes a Data Archive Tick:  Marrying Content and User Support


How to Engage the Users?

Primary design feature for web portal
• Data Discovery – Find Data!

2010
• All about search
• Gone from top
  • people
  • text
  • news

Page 21: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Navigation once they arrive
• Working principles
  • Uniform across web portal
  • Keep organizational elements out of prime visual territory
• @ NCAR
  • Have user registration – only required to get data
  • All discovery metadata open – unlimited searching

Page 22: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The complete data knowledge package, and data cycle

What is a complete data knowledge package?
• Rich metadata plus the data files!
• One example: http://dss.ucar.edu/datasets/ds277.0/

Page 23: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The pieces that make rich metadata
• Dataset navigation (Access, Documentation, Software)
• Title
• Summary

Page 24: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The pieces that make rich metadata
• Period of data record
• Update cycle
• Scientific parameters (Variables)
• Earth reference levels

Page 25: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The pieces that make rich metadata
• Times – temporal increment
• Data types – points or grids
• Geo-spatial coverage
• Source organizations

Page 26: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The pieces that make rich metadata
• Related Internet sites
• Publications
• Acknowledgement statement

Page 27: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The pieces that make rich metadata
• Volume – size of the dataset
• Data formats
• Related datasets in the NCAR collection
• Consulting contact (email and phone)
• A 2nd pointer to Data Access
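The metadata pieces listed on the last five slides can be collected into one record and checked for completeness. The sketch below is illustrative only: the class name, field names, and example values are assumptions chosen to mirror the slide bullets, not the archive's actual schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DatasetRecord:
    """One field per metadata piece named on the slides (names are hypothetical)."""
    title: str = ""
    summary: str = ""
    period_of_record: str = ""
    update_cycle: str = ""
    variables: list = field(default_factory=list)
    reference_levels: list = field(default_factory=list)
    temporal_increment: str = ""
    data_types: list = field(default_factory=list)  # points or grids
    geospatial_coverage: str = ""
    source_organizations: list = field(default_factory=list)
    related_sites: list = field(default_factory=list)
    publications: list = field(default_factory=list)
    acknowledgement: str = ""
    volume: str = ""
    data_formats: list = field(default_factory=list)
    related_datasets: list = field(default_factory=list)
    consulting_contact: str = ""


def missing_pieces(record):
    """Return the names of metadata pieces that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Made-up values, used only to exercise the check.
draft = DatasetRecord(title="Example dataset", summary="Illustrative record only")
print(missing_pieces(draft))
```

A check like this could run before a dataset page is published, so that gaps in the knowledge package are visible to the steward rather than discovered by a user.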

Page 28: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The complete data knowledge package, and data cycle

Data Cycle Facts
• Datasets are re-published – new versions.
• Datasets are corrected and extended in time or space.
• Scientific analysis and publication will occur randomly along the data cycle.

Data referencing is more challenging than traditional publication referencing because of the data cycle.

How can you accurately trace/recover what has been used for publication?

Page 29: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

The complete data knowledge package, and data cycle

@ NCAR
• Don’t have a systematic (organization-wide) way to handle the data cycle
• We do not discard/delete old versions of data
• Ad hoc approach
  • Currently building version-tracking software
  • Versioning will be included in DOI implementation
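One way to picture version tracking is an append-only log per dataset that can answer "which version was current on a given day?". This is a minimal sketch under stated assumptions, not the software NCAR was building: the class names, the dataset ID `ds000.0`, the version labels, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    label: str
    published: date
    note: str


class VersionLog:
    """Append-only history for one dataset; old versions are never removed."""

    def __init__(self, dataset_id):
        self.dataset_id = dataset_id
        self._versions = []

    def publish(self, label, published, note):
        """Record a new version; versions must arrive in chronological order."""
        if self._versions and published < self._versions[-1].published:
            raise ValueError("versions must be published in chronological order")
        self._versions.append(Version(label, published, note))

    def version_on(self, day) -> Optional[Version]:
        """Return the version that was current on the given day, if any."""
        current = None
        for version in self._versions:
            if version.published <= day:
                current = version
        return current


# Hypothetical dataset and dates, for illustration only.
log = VersionLog("ds000.0")
log.publish("1.0", date(2008, 1, 15), "initial release")
log.publish("1.1", date(2009, 6, 1), "error corrected, record extended")

used = log.version_on(date(2009, 3, 1))
print(log.dataset_id, used.label, used.note)
```

If a publication records the dataset ID and access date, a log of this kind is enough to recover the version that was used, which is the tracing question raised on the previous slide.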

Page 30: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Consultation
Critical two-way communication
1. Benefits for the user
  • Guidance to best available datasets
  • Consolidate research ideas into required data sources
  • Software assistance
  • Customized data preparation if necessary
2. Benefits to the archive stewardship
  • Detect ways to improve our search process
  • Learn about data requirement trends
  • Occasionally, acquire new data resources from scientific efforts
  • Learn about data problems we might have

Page 31: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Provide research tool support and documentation
• Provide users a starting point for data evaluation
  • Simple access programs – the languages used by the focus community
  • Pointers to applications (IDL, MATLAB, NCL, NCO, etc.)
  • Specific examples are VERY helpful!
• Must maintain software/applications and documentation for the long-term.
  • Guarantee users will understand the meaning and have access.

Page 32: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Provide research tool support and documentation
@ NCAR
• Remain aware of proprietary software traps, e.g. for documents
  • Will .xls be viable 50 years from now, given that .xlsx is now standard?
  • Is .pdf any better?
• Prefer data file formats that define everything to the byte/bit level
  • Computer code could always be written to access these.
• All kinds of reports, project descriptions, and documents that explain the intent of the data are vital for the long-term.
• Use dedicated document directories for each dataset
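The point about byte-level format definitions can be made concrete with a small decoder. The record layout below is invented for illustration (it is not a real archive format): a big-endian record of a 4-byte station number, a 4-byte date as YYYYMMDD, and two 2-byte values stored in tenths. Once a layout like this is documented, a few lines in any language can read the data.

```python
import struct

# Hypothetical layout, big-endian:
#   int32 station number, int32 date (YYYYMMDD),
#   int16 temperature in tenths of deg C, int16 pressure in tenths of hPa
RECORD = struct.Struct(">iihh")


def decode(raw):
    """Decode one fixed-length record into named, scaled values."""
    station, yyyymmdd, temp_tenths, pres_tenths = RECORD.unpack(raw)
    return {
        "station": station,
        "date": yyyymmdd,
        "temperature_c": temp_tenths / 10,
        "pressure_hpa": pres_tenths / 10,
    }


# Build a sample record so the example is self-contained (values are made up).
sample = RECORD.pack(12345, 20100517, -53, 10132)
decoded = decode(sample)
print(RECORD.size, decoded)
```

The same documentation that lets this decoder be written today would let an equivalent one be written decades from now, without depending on any particular application surviving.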

Page 33: What Makes a Data Archive Tick:  Marrying Content and User Support

How to Engage the Users?

Follow-up aid
• Notification service for significant dataset changes
  • If an error is corrected – should notify all users of the data
• Subscription service
  • Inform users when new data is available
  • Prepare special products based on user determined template – e.g. past requests
• @ NCAR
  • We have automated notification service
    • Provided users register accurately
  • We do not have subscription service - yet
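The notification idea above reduces to a lookup: who downloaded the changed dataset, and do we hold a usable address for them? The sketch below assumes a simple download log of (user, dataset) pairs and a registration table keyed by user name. Both structures, all names, and the sample data are hypothetical, not a description of the NCAR system.

```python
def users_to_notify(download_log, registered_emails, dataset_id):
    """Split past users of a dataset into reachable emails and unreachable users."""
    recipients = set()
    unreachable = set()
    for user, downloaded in download_log:
        if downloaded != dataset_id:
            continue
        email = registered_emails.get(user)
        # A very loose validity check; real registration would verify addresses.
        if email and "@" in email:
            recipients.add(email)
        else:
            unreachable.add(user)
    return sorted(recipients), sorted(unreachable)


# Made-up log and registrations for illustration.
download_log = [
    ("alice", "ds000.0"),
    ("bob", "ds000.0"),
    ("alice", "ds000.0"),  # repeat downloads should produce one notice
    ("carol", "ds111.1"),
]
registered_emails = {
    "alice": "alice@example.edu",
    "bob": "",  # registered without a usable address
    "carol": "carol@example.edu",
}

recipients, unreachable = users_to_notify(download_log, registered_emails, "ds000.0")
print(recipients, unreachable)
```

The second return value shows why accurate registration matters: a user with no valid address has used the corrected data but cannot be told about the correction.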

Page 34: What Makes a Data Archive Tick:  Marrying Content and User Support

① How to make and keep the archive content relevant to the users?

② How to engage the users?

http://dss.ucar.edu/