Summary of Data Citation Synthesis Activity & Review

Prepared for Data Citation Synthesis Group Open Workshops, Sept 2013. Summary of data citation synthesis activity & Next steps for review <bit.ly/dsynthrev>. Dr. Micah Altman <[email protected]>, MIT Libraries; Joan Starr <[email protected]>, California Digital Library.

Upload: micah-altman

Post on 06-May-2015

TRANSCRIPT

Page 1: Summary of data citation synthesis activity & Review

Prepared for
Data Citation Synthesis Group Open Workshops
Sept 2013

Summary of data citation synthesis activity & Next steps for review <bit.ly/dsynthrev>

Dr. Micah Altman <[email protected]>
MIT Libraries

Joan Starr <[email protected]>
California Digital Library

Page 2: Summary of data citation synthesis activity & Review

Summary of synthesis activity & Next steps for review

What has been done?

Page 3: Summary of data citation synthesis activity & Review

Refining Approaches to Data Citation

2000-2004
Example systems: NESSTAR; Virtual Data Center
Core recommendations: Cite research data in publications; use persistent identifiers; facilitate direct access to data through URIs
Key references: [Ryssevik & Musgrave 2001] [Altman, et al. 2001]

2005-2009
Example systems: Dataverse Network System; TIB Data DOI Registration
Core recommendations: Include versioning, fixity, and granularity for verification; use permanent institutions; facilitate attribution
Key references: [Buneman 2006] [Altman & King 2007] [OECD 2009]

2010-
Example systems: DataCite; Thomson-Reuters Data Citation Index; FigShare; Data Dryad
Core recommendations: Include data citations in standard locations; index data citations in catalogs; facilitate machine understanding
Key references: [NAS 2012] [DCC 2012] [Force11 2013] [CODATA 2013]

Page 4: Summary of data citation synthesis activity & Review

Integrating Current Recommendations

Disciplinary Practices; Repository Practices

Summative Recommendations

Synthesis Principles

Page 5: Summary of data citation synthesis activity & Review

Synthesis Group Activity

• Hosted by Force11
  – Charter here: http://www.force11.org/node/4432
• Formed early summer
• Meeting weekly
• Reviewed current key recommendations & engaged lead authors:
  – Force11/Amsterdam Manifesto [Force11 2013]
  – CODATA/“Out of Cite” Recommendations [CODATA 2013]
  – DCC Guide [DCC 2012]
  – DataCite/Metadata Core [Datacite 2012]
  – Research Data Alliance
• Identified core principles that are consistent across recommendation groups
• Formulated a draft synthesis of principles
• Agreed to use key documents above for definitions of terms, detailed explanation of issues
• Out of scope: specific detailed standards, protocols, infrastructure, tools

Page 6: Summary of data citation synthesis activity & Review

Yesterday

• Open Workshop
• Line-by-line review of draft
• Open editing of document
  – In shared document
  – Using revision control
• Convergence on principles
  – 8 principles revised and approved by consensus
  – 1 recommendation struck
  – 1 recommendation tabled for discussion today
• Summary
  – Substantial core of agreement: need for citation; use of persistent identifiers; support for human and machine access; facilitation of verification, attribution
  – Maintain conceptual boundaries among data citation, publication & evaluation
  – Recognize that terminology cannot always be aligned with colloquial or disciplinary usage

Page 7: Summary of data citation synthesis activity & Review

The principles

1. Importance
2. Credit and attribution
3. Evidence
4. Unique identification
5. Access
6. Persistence
7. Versioning and granularity
8. Interoperability and flexibility

Page 8: Summary of data citation synthesis activity & Review

Open Question: Data Repository Recommendations

6. Persistence
Metadata describing the data, and unique identifiers, should persist, even beyond the lifespan of the data they describe.

Data citations should be resolvable to data stored in repositories with a commitment and demonstrated capability to maintain long-term access. Data stored in such repositories may not always be publicly accessible. Although such repositories should be committed to long-term maintenance and preservation of data, the nature of digital data is such that they may not persist indefinitely.

Page 9: Summary of data citation synthesis activity & Review

Review Process

• Synthesis group will supplement today’s consensus principles with background:
  – Illustrative examples for each recommendation
  – References with each principle to detailed discussion of embedded issues in prior reports
  – Glossary
• Public release of draft for open online commentary
• Integration of commentary and release of final draft

Page 10: Summary of data citation synthesis activity & Review

Questions for Review & Decisions

• Nomination of additional members to synthesis group for preparation of summary material (glossary, references, example, preamble)?
  – Decision: anyone in attendance who can substantively (if not officially) represent a group
  – Decision: Identify additional key organizations for commentary
• Public release of draft – when, to whom?
  – Decision: Available for open public commentary mid November
  – Decision: Will specifically request comments from key organizations, including:
    • Organizations listed earlier (Force11, DCC, CODATA, ESIP, RDA, Dataverse, Data-PASS, DataCite)
    • Additional suggested organizations: NLM, ARL
    • Additional organizations identified by synthesis group
• Open commentary via mailing list & Force11 website. Period for commentary?
  – Decision: 6-8 weeks for public commentary
• Integration of commentary by synthesis group and release of updated draft. Number of drafts necessary? When to declare “done”?
  – Decision: Single round of revisions by synthesis group. Will then seek endorsements.

Page 11: Summary of data citation synthesis activity & Review

Additional References

• [Ryssevik & Musgrave 2001] J. Ryssevik, S. Musgrave. 2001. The Social Science Dream Machine. Social Science Computer Review.
• [Altman, et al. 2001] M. Altman, et al. 2001. A Digital Library for the Dissemination and Replication of Quantitative Social Science Research: The Virtual Data Center. Social Science Computer Review.
• [Buneman 2006] P. Buneman. 2006. How to Cite Curated Databases and Make them Citable. SSDBM ’06.
• [Altman & King 2007] M. Altman, G. King. 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib.
• [OECD 2009] T. Green. 2009. We need publishing standards for datasets and data tables. OECD.
• [NAS 2012] P. Uhlir (ed.). 2011. For Attribution -- Developing Data Attribution and Citation Practices and Standards. National Academies of Sciences.

Page 12: Summary of data citation synthesis activity & Review

Synthesis Group Contacts

About the synthesis group: http://www.force11.org/node/4432

Questions for the synthesis group: [email protected]

Consensus document, with revision history: https://docs.google.com/document/d/1KosNqBPgE8ziWDuJgBIrk20KxcOXoZdAt_TdJV3xoz8/edit?usp=drive_web

Page 13: Summary of data citation synthesis activity & Review

Key Recommendations

• [Force11 2013] M. Crosas, T. Carpenter, C. Borgman, D. Shotton. 2013. The Amsterdam Manifesto on Data Citation Principles. Force11.
• [CODATA 2013] CODATA-ICSTI Task Group on Data Citation. 2013. Out of Cite, Out of Mind: The Current State of Practice, Policy, and Technology for the Citation of Data. Data Science Journal.
• [DCC 2012] A. Ball, M. Duke. 2012. Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre.

Page 14: Summary of data citation synthesis activity & Review

Additional References (continued)

• [Datacite 2012] DataCite Metadata Schema, v3.0. http://schema.datacite.org/

Page 15: Summary of data citation synthesis activity & Review

Appendix: Principles

Page 16: Summary of data citation synthesis activity & Review

The principles

1. Importance
Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.

Page 17: Summary of data citation synthesis activity & Review

The principles

2. Credit and attribution
Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.

3. Evidence
Where a specific claim rests upon data, the corresponding data citation should be provided.

Page 18: Summary of data citation synthesis activity & Review

The principles

4. Unique identification
A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.

5. Access
Data citations should facilitate access to the data themselves and to such associated metadata, documentation, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
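
Illustration (not part of the consensus text): principles 4 and 5 could be sketched as a small citation record that carries one persistent identifier and renders both a human-readable string and a machine-actionable link. This is a minimal sketch under stated assumptions: the identifier is assumed to be a DOI resolved through https://doi.org/, and the class name, field names, and the DOI "10.1234/example-dataset" are hypothetical, not taken from any of the reviewed recommendations.

```python
from dataclasses import dataclass
from typing import Tuple
from urllib.parse import quote

# Assumption: the persistent identifier is a DOI, resolved via the public DOI resolver.
DOI_RESOLVER = "https://doi.org/"


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical record holding the elements of a data citation."""
    authors: Tuple[str, ...]
    year: int
    title: str
    repository: str
    doi: str  # globally unique, persistent identifier (hypothetical value below)

    def resolver_url(self) -> str:
        # Machine-actionable form: a URL that software can follow to reach the data.
        return DOI_RESOLVER + quote(self.doi, safe="/")

    def human_readable(self) -> str:
        # Human-readable form: the same identifier embedded in a textual citation.
        names = "; ".join(self.authors)
        return (f"{names} ({self.year}). {self.title}. "
                f"{self.repository}. {self.resolver_url()}")


# Hypothetical example values, for illustration only.
example = DataCitation(
    authors=("A. Researcher", "B. Curator"),
    year=2013,
    title="Example Survey Dataset",
    repository="Example Repository",
    doi="10.1234/example-dataset",
)
print(example.human_readable())
```

The same identifier serves both audiences: a person reads it in the citation string, and software follows the resolver URL to reach the data and its associated metadata.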

Page 19: Summary of data citation synthesis activity & Review

The principles

6. Persistence
Metadata describing the data, and unique identifiers, should persist, even beyond the lifespan of the data they describe.

[more to be decided upon]

Page 20: Summary of data citation synthesis activity & Review

The principles

7. Versioning and granularity
Data citations should facilitate identification and access to different versions and/or subsets of data. Citations should include sufficient detail to verifiably link the citing work to the portion and version of data cited.

8. Interoperability and flexibility
Data citation methods should be sufficiently flexible to accommodate the variant practices among communities but should not differ so much that they compromise interoperability of data citation practices across communities.
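
Illustration (not part of the consensus text): one way to read principle 7 is that a citation records a version label, an optional subset description, and a fixity value, so a reader can check that the data in hand are the data that were cited. The sketch below assumes a plain SHA-256 digest of the file bytes as the fixity value; this is a stand-in chosen for simplicity, not a method prescribed by the principles or the reviewed reports, and all names and sample data are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionedCitation:
    """Hypothetical citation fields supporting versioning, granularity, and fixity."""
    identifier: str        # persistent identifier of the dataset (hypothetical)
    version: str           # which version of the data was cited
    subset: Optional[str]  # which portion was cited, if not the whole dataset
    sha256: str            # fixity digest of the cited bytes (assumed scheme)


def fingerprint(data: bytes) -> str:
    # SHA-256 hex digest of the raw bytes; any change to the data changes the digest.
    return hashlib.sha256(data).hexdigest()


def matches_citation(citation: VersionedCitation, data: bytes) -> bool:
    # True only if the retrieved bytes are exactly the bytes that were cited.
    return fingerprint(data) == citation.sha256


# Hypothetical sample data: two versions differing in a single value.
version_1 = b"id,value\n1,10\n2,20\n"
version_2 = b"id,value\n1,10\n2,25\n"

citation = VersionedCitation(
    identifier="doi:10.1234/example-dataset",
    version="1.0",
    subset="rows 1-2",
    sha256=fingerprint(version_1),
)

print(matches_citation(citation, version_1))  # cited version
print(matches_citation(citation, version_2))  # later, modified version
```

A byte-level digest is sensitive to formatting changes that leave the values untouched, so a community might prefer a content-based fingerprint; the sketch only shows where a fixity value would sit in a citation.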