December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Part 1 Large Data Sets



Page 1: December 9, 2015 NISO Webinar: Two-Part Webinar: Emerging Resource Types - Part 1 Large Data Sets

Ronin Institute

Making Big Data Useful: The NSIDC Experience

Ruth Duerr

This work is licensed under a Creative Commons Attribution v4.0 License.

Page 2:

Outline

• Introduction to NSIDC
• The story of sea ice
• Lessons learned

Page 3:

Introduction to NSIDC


Page 4:

NSIDC: An overview

NSIDC affiliations and sponsorship

Cooperative Institute for Research in Environmental Sciences

Main sponsors:
• National Science Foundation
• NASA
• National Oceanic and Atmospheric Administration

World Data System

Page 5:

The National Snow and Ice Data Center…

• Provides tools for data access
• Researches the cryosphere and data science
• Educates the public about the cryosphere
• Supports data users
• Manages and distributes scientific data
• Supports local and traditional knowledge

Page 6:

NSIDC: An overview

Products
▪ Satellite
▪ In situ (station data and the like)
▪ Model output
▪ Most digital, some analog

More than 600 data and information products, most freely available online

Users

Page 7:

NSIDC: An overview

NSIDC 2011 Metrics

2011 Ingest: 13 TB, 3 million files
2011 Distribution: 183 TB, 23 million files
Total Archive: 102 TB, 20 million files

Metrics are for calendar years.

Page 8:

NSIDC: An overview

Exchange for Local Observations and Knowledge in the Arctic (ELOKA)

Why
• Community-based knowledge of the Arctic informs science, policy, and development.
• Arctic communities want their knowledge shared broadly and ethically, and passed on to future generations.
• There is a need for local knowledge holders to decide how to manage their “data” and how to effectively share it.

What
• ELOKA provides data management services and user support to facilitate the collection, use, exchange, and preservation of local observations and knowledge of the Arctic.
• ELOKA is mostly funded by NSF.
• Data includes interviews, maps, and community measurements of sea ice, observations of narwhal behavior, environmental change, and ecology.
• See eloka-arctic.org

Page 9:

The Story of Sea Ice


Page 10:

The Remote Sensing Record
• Satellite-based passive microwave sensors have been measuring sea ice since 1972
• Consistent collection of data started in 1978 with the SMMR series of instruments
• Why passive microwave?
  ○ Distinguishing sea ice from ocean is straightforward
  ○ Passive microwave works through clouds and in the dark
• Initial user base was cryospheric scientists

Arctic sea ice concentration in April 2004, calculated from data measured by the Special Sensor Microwave/Imager (SSM/I) on the Defense Meteorological Satellite Program (DMSP) satellite. The image is centered over the North Pole, with continents shown in green. - Image courtesy of Florence Fetterer and Ken Knowles, National Snow and Ice Data Center, University of Colorado, Boulder, CO.

Page 11:

Trending and audience – Audience confusion

• A search of the NSIDC catalog for “sea ice” data sets returns 132 products!

• The need to support these users led to the development of NSIDC’s Sea Ice FAQs, which attempt to discuss the pros and cons of each data set.

Page 12:

Trending and audience
• Audience stayed pretty stable for roughly 20 years
• That changed once the science community started reporting statistically significant trends in sea ice extent (in the early 2000s)
• At that point climatologists became interested in the data

Page 13:

Development of the Sea Ice Index

• But did NSIDC really have the data the climatologists needed?

• Not really, as they need:
  ○ Monthly averages
  ○ Climatologies
  ○ Anomaly maps and trends

• Thus the Sea Ice Index (2002) data set was born with support from NOAA NESDIS
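
The derived quantities listed on this slide (monthly averages, climatologies, anomalies, and trends) can be illustrated with a minimal sketch. This is not NSIDC's Sea Ice Index processing; the function names, the baseline period, and the synthetic daily extent values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(daily):
    """Average daily values into {(year, month): mean}."""
    buckets = defaultdict(list)
    for (year, month, _day), value in daily.items():
        buckets[(year, month)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def climatology(monthly, first_year, last_year):
    """Mean of each calendar month over a baseline period (inclusive)."""
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        if first_year <= year <= last_year:
            buckets[month].append(value)
    return {month: mean(values) for month, values in buckets.items()}


def anomalies(monthly, clim):
    """Departure of each monthly mean from its calendar-month climatology."""
    return {(y, m): v - clim[m] for (y, m), v in monthly.items()}


def linear_trend(points):
    """Ordinary least-squares slope of y against x for (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in points)
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical daily extents (million km^2): September loses 0.1 per year.
daily = {
    (year, 9, day): 7.0 - 0.1 * (year - 2000)
    for year in range(2000, 2005)
    for day in range(1, 31)
}

monthly = monthly_means(daily)
clim = climatology(monthly, 2000, 2004)  # assumed baseline period
anom = anomalies(monthly, clim)
september = sorted((y, v) for (y, m), v in anom.items() if m == 9)
slope = linear_trend(september)
print(f"September trend: {slope:.3f} million km^2 per year")
```

Each step feeds the next: daily values become monthly means, a baseline turns those into a climatology, and the anomalies against that climatology are what the trend is fitted to.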

Page 14:

Development of the Sea Ice Index

Evolution of the sea ice “designated community”, presented by Ruth Duerr, March 15, 2012, Foundations of Data Curation, GSLIS/UIUC

Page 15:

Development of the Arctic Sea Ice News & Analysis Site

• Also in 2002, NSIDC started crafting periodic press releases concerning Arctic sea ice conditions at the end of the summer melt season

• Media attention rose over the years, initially only from science reporters; eventually the sea ice minimum press release resulted in a barrage of queries from uninformed news reporters

• It became painfully obvious that NSIDC needed a site where basic information would be available, with answers to questions like “where is the Arctic?”

• Thus was born the Arctic Sea Ice News & Analysis Site (ASINA)

• Started as a “skunkworks” project with a bit of funding from a NASA outreach addendum to an existing grant


Page 17:

ASINA – What is it?
• A news blog
• Currently updated monthly, or more frequently if conditions warrant
• Expert analysis and commentary about Arctic sea ice conditions, consistently presented and written at a predictable level for returning visitors
• High-resolution satellite imagery of current conditions
• Additional high-resolution graphics and content as conditions warrant
• References and links to scientific work outside NSIDC as appropriate


Page 19:

ASINA – What happened next?
• The 2007 sea ice minimum was extreme
• Media attention was also extreme
• That led to the general public becoming one of NSIDC’s most vocal and consistent audiences
• ASINA became NSIDC’s most popular page
• This also led to a rise in requests by the science community for access to NSIDC data
• Because of ASINA, the audience for our data expanded into a community without a lot of background science knowledge but who are “loyal, tenacious, and perceptive”
• They ask a lot of questions, and often many of them ask the same or similar questions in a short period of time

Page 20:

Thus Icelights was born!

Page 21:

What is Icelights?
• A way to respond to user questions in a public forum
• Roughly monthly blog-like format with
  ○ The ability to “Ask Icelights” a question
  ○ Search for previous topics by tag, free text, etc.
  ○ Posts that have been reviewed by the science community
• “Crash course” content to provide
  ○ Arctic sea ice 101
  ○ Data 101
  ○ Reading list

Page 22:

Distribution for October, 2015


Page 23:

Lessons Learned


Page 24:

Data Curation: The Evolution of Data Products for Value and Reach

AGU 2015

Karen S. Baker (1), Ruth E. Duerr (2, 1), and Mark Parsons (3)

(1) Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

(2) Ronin Institute for Independent Scholarship, http://ronin.org

(3) Research Data Alliance, Rensselaer Polytechnic Institute

Page 25:

Data Products: Multi-cycle Trajectory

Figure 2. A simplified view of the continuing development of scientific data products. Each cycle is initiated by one or more events that create a new audience that leads to generation of a new data product in response to the needs of a recently identified designated user community.

Page 26:

Evolution of Sea Ice Data Products

Redrawn from original work (2010) by Donna Scott, who manages the NSIDC Passive Microwave Product Team.


Page 28:

Data Product Teams

Roles - Skill Sets
• Data Managers
• Programmers
• Technical Writers
• Scientists
• Science Communications
• Systems/Database Managers
• User Services Specialists

“This active human element of data management is not always recognized by funding agencies, nor is it explicit in the OAIS Reference Model …” - Parsons and Duerr, 2005

Page 29:

Continuing Development of Data Products

Page 30:

Sea Ice Data Products: Dependencies & Levels

Page 31:

Levels of Data

Page 32:

Diverse Audiences -> Diverse worldviews

To the GIS community, the world is:
• A collection of features (e.g., roads, lakes, plots of land) with geographic footprints on the Earth's surface
• The features are discrete objects described by a set of characteristics such as a shape/geometry (often 2-D)

To fluid-earth scientists, the world is:
• A set of observations/measurements described by parameters (e.g., velocity, temperature) that vary as continuous functions in (4-D) space-time
• Parameter behaviors are governed by a set of equations

To the public, the world is:
• The place within which their neighborhood is nested
• A place where decision-making is increasing in complexity due to the interdependencies of natural systems, human systems, and human-natural systems

* after Mark Parsons, Ben Domenico, and Stefano Nativi
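
The contrast between the GIS and fluid-earth worldviews can be made concrete with a small sketch of how each might represent data. Everything here is hypothetical and chosen only for illustration: the `Feature` class, the `Field` type alias, the example lake, and the toy temperature formula are assumptions, not part of any NSIDC product or GIS library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Feature:
    """GIS view: a discrete object with a geometry and a set of attributes."""
    name: str
    geometry: List[Tuple[float, float]]  # 2-D polygon vertices (x, y)
    attributes: Dict[str, str]


# Fluid-earth view: a parameter is a continuous function of 4-D space-time.
Field = Callable[[float, float, float, float], float]  # (lon, lat, z, t)


def polygon_area(vertices):
    """Planar shoelace area; ignores Earth curvature (illustration only)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def temperature(lon, lat, z, t):
    """Hypothetical field: cools toward the poles, with height, and over time."""
    return 30.0 - 0.5 * abs(lat) - 0.0065 * z - 0.01 * t


lake = Feature(
    name="Example Lake",
    geometry=[(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)],
    attributes={"type": "lake"},
)
field: Field = temperature

# A feature is asked about itself; a field is sampled at a point in space-time.
print(polygon_area(lake.geometry))
print(field(0.0, 60.0, 1000.0, 0.0))
```

The same sea ice data could be served either way: as an ice-edge polygon for a GIS user, or as a concentration field sampled on a grid for a modeler.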

Page 33:

Greenland Ice Sheet Melt Data Products

Page 34:

Questions?