

Building Support for a Discipline-Based Data Repository

Ryan Scherle¹, Sarah Carrier², Jane Greenberg², Hilmar Lapp¹, Abbey Thompson², Todd Vision¹,³, Hollie White²

Use-case studies
We are conducting detailed interviews with a small number of researchers in the field. These results will provide great depth, and allow us to better interpret results from the survey. Some preliminary results:
• Data is sometimes lost due to poor preservation habits.
• There is surprising agreement about the proper levels of credit for dataset creators (citation, acknowledgement, or co-authorship).
• Authors agree that data should be shared after results are published, but there is great variety in the amount of data shared and the circumstances under which it is shared.
• Existing data deposition policies need to be enforced better. GenBank submissions are typically not verified.

Workshops
Two workshops were held to introduce the repository to the community and gather initial requirements. Results of these workshops include:
• Immediate data collection is top priority. Detailed metadata curation and analysis can be introduced later.
• Metadata collection must be automated as much as possible.
• To ensure credibility, there must be a focus on data associated with published articles.
• The repository must not dictate any specific data format.
• Journals must be responsible for encouraging deposition and verifying the quality of deposits.
• All data should be openly available.
• It is critical to develop good practices for data citation.

Survey
A survey is currently underway to assess trends and attitudes within the community. Preliminary results are shown below.

Summary
Dryad is a repository of scientific data from evolutionary biology and related fields. To be certain the repository meets the community’s needs, we are gathering information on community practices and attitudes through a variety of methods. To further build community support, we are integrating the repository into the normal research workflow via partnerships with journals, exchange agreements with other repositories, and the solicitation of high-use datasets.

Partner Journals
[Figure: partner journals. Legend: full partners; agreed to archiving policy; expressed interest.]

Joint Archiving Policy
<<Journal title>> requires, as a condition for publication, that data used in the paper should be archived in an appropriate public archive, such as GenBank, TreeBase, or Dryad. The data should be given with sufficient details that, together with the contents of the paper, allows each result in the published paper to be re-created.

[Figure: GenBank, TreeBase, and Dryad, shown with a sample nucleotide sequence: ccaattggct gttcttcgat tctggcgagt]

¹ National Evolutionary Synthesis Center (NESCent), Durham, NC
² School of Information and Library Science, University of North Carolina, Chapel Hill, NC
³ Department of Biology, University of North Carolina, Chapel Hill, NC

Dryad data repository
http://datadryad.org
[email protected]

NESCent is a collaborative effort of Duke University, The University of North Carolina at Chapel Hill and North Carolina State University.

NESCent is supported by the National Science Foundation (Grant # EF-0423641).