DATAVERSE COMMUNITY MEETING 10 Years Sharing Data with Dataverse #dataverse2017

Post on 25-Jul-2020

TRANSCRIPT

Page 1: DATAVERSE COMMUNITY MEETING … · Dataverse > 70,000 datasets > 2.5 M downloads > 340,000 files < 2006 When we started, there were very few journals with data policies,

DATAVERSE COMMUNITY MEETING

10 Years Sharing Data with Dataverse

#dataverse2017

Page 2:

<2006

Once there was the VDC

2

Page 3:

2006

And then came the Dataverse Network

2015

2 14

Page 4:

2006

Now we have the Dataverse

2015 2017

2 14 23

Page 5:

RESEARCHERS ARE SHARING AND USING DATA

200 datasets/month

4,000 files/month

60,000 downloads/month

Harvard Dataverse

> 70,000 datasets

> 2.5 M downloads

> 340,000 files

Page 6:

< 2006

When we started, there were very few journals with data policies, no data requirements from funders

Page 7:

2006 2015 2017

weak = recommend; strong = require

Weak data sharing and strong data sharing vs. disciplines

Castro, Crosas, Garnett, Sheridan, Altman, 2017, Journal of Scholarly Publishing, Forthcoming

Now, journals across disciplines start supporting data policies

Disciplines shown: Genetics Journals, Biomedical Journals, Computational Sciences, Economics, Open Access Journals, Ecology

33%

4%

Page 8:

2006 2015 2017

And funders require data sharing

PRIVATE RESEARCH FUNDERS

Bill and Melinda Gates Foundation Information Sharing Approach
Sloan Foundation Data Sharing Policy
Wellcome Trust Data Sharing Policy
Arnold Foundation
Moore Foundation
Robert Wood Johnson Foundation
HHMI Policy on the Sharing of Publication-Related Materials, Data and Software

PUBLIC RESEARCH FUNDERS

Department of Agriculture
Department of Commerce
Department of Defense
Department of Education
Department of Energy
Department of Health and Human Services

Agency for Healthcare Research and Quality (AHRQ)
Assistant Secretary for Preparedness and Response (ASPR)
Centers for Disease Control and Prevention (CDC)
Food and Drug Administration (FDA)
National Institutes of Health (NIH)

Department of Homeland Security
Department of Housing and Urban Development
Department of Interior
Department of Labor
Department of Transportation
Department of Veterans Affairs
Environmental Protection Agency (EPA)

Page 9:

WE ARE EXPERIENCING A CULTURAL CHANGE

Page 10:

WE ARE EXPERIENCING A CULTURAL CHANGE

WE ARE THE CULTURAL CHANGE!

Page 11:

King, 1995, Replication, Replication

Altman and King, 2007, A Proposed Standard for the Scholarly Citation of Quantitative Data

Altman et al, 2001, A Digital Library for the Dissemination and Replication of Quantitative Social Science

King, 2007, An Introduction to the Dataverse Network as an Infrastructure for Data Sharing

Crosas, Honaker, King, Sweeney, 2015, Automating Open Science for Big Data

Crosas, 2012, The Dataverse Network: an open source application for sharing, discovering, and preserving research data

Altman and Crosas, 2013, The Evolution to Data Citation: from principles to implementation

Crosas, 2013, A Data Sharing Story

2014, Joint Declaration of Data Citation Principles

Pepe et al, 2014, How Do Astronomers Share Data?

Goodman et al, 2014, Ten Simple Rules for the Care and Feeding of Scientific Data

Castro et al, 2015, Achieving Human and Machine Accessibility of Cited Data

Sweeney, Crosas, Bar-Sinai, 2015, Sharing Sensitive Data with Confidence: The DataTags System

Meyer et al, 2016, Data Publication with the Structural Biology Data Grid Supports Live Analysis

Wilkinson et al, 2016, The FAIR Guiding Principles for Scientific Data Management and Stewardship

Bierer, Crosas, Pierce, 2017, Data Authorship as an Incentive to Data Sharing

The Dataverse project and team leading many aspects of data sharing

2017

Page 12:

METRICS FROM LAST YEAR, JUNE 2016 TO JUNE 2017

AN ACTIVE TEAM AND COMMUNITY

Page 13:

22 COMMUNITY CALLS

190 ATTENDEES
25 ORGANIZATIONS/UNIVERSITIES
10 COUNTRIES

Community

Page 14:

975 GOOGLE GROUP MESSAGES

Community

Page 15:

7,114 IRC MESSAGES

Community

245 UNIQUE USERS

Page 16:

12 SPRINTS (STARTED IN JANUARY 2017)

IQSS Dataverse Team

Page 17:

220 STANDUP MEETINGS

IQSS Dataverse Team

Page 18:

52,000 SLACK MESSAGES

IQSS Dataverse Team

Page 19:

43 GITHUB CONTRIBUTORS

Code

Page 20:

334 PULL REQUESTS

Code

Page 21:

8,335 GITHUB COMMITS

Code

Page 22:

1,153 SUPPORT TICKETS

Support

Page 23:

DATAVERSE CUP 2017

Page 24:

A VISION: DATAVERSE AS A KEY PART OF THE FULL RESEARCH DATA LIFECYCLE

Page 25:

TOWARDS A DATA-CENTRIC RESEARCH LIFECYCLE

Data Collection

Lab

E-Notebooks

Instruments

Surveys...

Assign DUA & metadata

Cloud Computing and Storage

Run data & code

Explore & Visualize data

Track Provenance

Journals & Funders

Data Citation

Work with Sensitive Data

FROM DATA COLLECTION, TO COMPUTING AND SHARING

Page 26:

RESEARCH COLLABORATIONS

Data Privacy

Big Data

Data Policies

Replication

...

COMMUNITY

STANDARDS AND BEST PRACTICES

INSTITUTIONS REQUIREMENTS

JOURNALS REQUIREMENTS

FUNDERS REQUIREMENTS

TECHNOLOGY ADVANCES