data exchange, data citation: an overview of some community work

Data Exchange, Data Citation: An overview of some community work Todd Carpenter Executive Director, NISO May 20, 2012 Council of Science Editors

Page 1: Data Exchange, Data Citation: An overview of some community work

Data Exchange, Data Citation:

An overview of some community work

Todd Carpenter, Executive Director, NISO

May 20, 2012 Council of Science Editors

Page 2: Data Exchange, Data Citation: An overview of some community work

May 20, 2012 CSE: Data Standards and Data Citation Issues

2

National Information Standards Organization

Non-profit industry trade association accredited by ANSI

110+ Organizational Members

Divided roughly in thirds among publishers, libraries and support service providers

Mission of developing and maintaining technical standards related to information, documentation, discovery and distribution of published materials and media

Volunteer-driven organization: 400+ contributors spread out across the world

Page 3: Data Exchange, Data Citation: An overview of some community work

Standards are familiar, even if you don’t notice

Image: DanTaylor Image: Joel Washing

Page 4: Data Exchange, Data Citation: An overview of some community work

Data Publishing

Page 5: Data Exchange, Data Citation: An overview of some community work

Large Scale Data-Driven Science

Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets.

Page 6: Data Exchange, Data Citation: An overview of some community work

Science data isn’t what it used to be

Image: Walters Art Museum Image: Domenico, Caron, Davis, et al.

Page 7: Data Exchange, Data Citation: An overview of some community work

Challenges of Data Publishing/Sharing

Data
– is massive in scale.
– can be constantly changing.
– is difficult to describe/catalog.
– is often difficult to integrate with other data.
– can be complex to analyze (tools required).
– is hard to peer review.
– is challenging to curate.
– is challenging to preserve.

Page 8: Data Exchange, Data Citation: An overview of some community work

Data – Complexities

• Storage
• Provenance
• Legal Issues
• Technical Interoperability
• Metadata
• Privacy
• Identification
• Citation

Page 9: Data Exchange, Data Citation: An overview of some community work

Problem with data, e.g.: mutability

The fact that data sets are constantly changing poses the following problems (not an exhaustive list):

– How does someone rerun your experiment when the data set is no longer what it was when you ran it?
– For citation: you have to cite the subset of the data that existed on XXX date when you ran your analysis.
– How can you preserve a snapshot of a dataset?
– How do you describe the dataset at that point in time?
– Problems of managing the metadata of each data deposit
– Problems of synchronization across repositories
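
One way to make the snapshot questions above concrete is to record a fingerprint of the data alongside the date it was used. The following is a minimal Python sketch, not a description of any repository's actual practice. The function names, the record fields, and the sample data are all hypothetical, and it assumes the whole dataset fits in memory as bytes.

```python
import hashlib
import json
from datetime import date


def snapshot_record(data: bytes, name: str, accessed: date) -> dict:
    """Describe a dataset as it existed on a given date (hypothetical fields)."""
    return {
        "name": name,
        "accessed": accessed.isoformat(),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def matches_snapshot(data: bytes, record: dict) -> bool:
    """Return True only if the bytes are identical to the recorded snapshot."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


# Hypothetical dataset contents, for illustration only.
original = b"station,temp_c\nA,11.2\nB,9.8\n"
updated = original + b"C,10.1\n"  # the dataset after a later deposit

record = snapshot_record(original, "example-observations", date(2012, 5, 20))
print(json.dumps(record, indent=2))
print(matches_snapshot(original, record))  # True
print(matches_snapshot(updated, record))   # False
```

A record like this does not preserve the data itself. It only lets a later reader detect that the dataset they retrieved is no longer the one that was analyzed.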

Page 10: Data Exchange, Data Citation: An overview of some community work

Data Complexity – Provenance

• How can you be certain that the data set you are looking at hasn’t been altered or changed?

• Can you trust the data source?

• Has the data been updated, added to, or appended since the analysis was initially run?

• Dealing with issues of abstraction

Page 11: Data Exchange, Data Citation: An overview of some community work

Data Citation

Page 12: Data Exchange, Data Citation: An overview of some community work

We all know what a citation looks like, right?

Page 13: Data Exchange, Data Citation: An overview of some community work

But what about citing a data set?

Source: Citations for SEER Databases

Source: Global Land Cover Facility

Source: International Polar Year

Source: ICPSR

Source: The Economist
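
The sources listed above each format a dataset citation differently. As a rough illustration of what a consistent form could look like, the Python sketch below assembles a citation from a fixed set of elements in the order creator, year, title, version, publisher, identifier, similar to the ordering that groups such as DataCite have suggested. The class name, the field names, and every value in the example (including the DOI) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the elements of a dataset citation."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    identifier: str
    version: Optional[str] = None
    accessed: Optional[str] = None  # date the data was retrieved

    def format(self) -> str:
        parts = [f"{'; '.join(self.creators)} ({self.year}): {self.title}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(f"{self.publisher}.")
        parts.append(self.identifier)
        text = " ".join(parts)
        if self.accessed:
            text += f" (accessed {self.accessed})"
        return text


# All values below are invented for illustration.
citation = DatasetCitation(
    creators=["Example, A.", "Sample, B."],
    year=2012,
    title="Hypothetical Survey Data",
    publisher="Example Data Archive",
    identifier="doi:10.1234/example",
    version="2",
    accessed="2012-05-20",
)
print(citation.format())
```

Carrying the version and access date in the citation is one way to address the mutability problem: the reader can tell which state of the dataset was used.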

Page 14: Data Exchange, Data Citation: An overview of some community work

CODATA Group on Data Citation

• Approved by the CODATA 27th General Assembly in Cape Town in October 2010

TASKS
– Survey existing literature and existing data citation initiatives.
– Obtain input from stakeholders in library, academic, publishing, and research communities.
– Hold at least one meeting and a workshop to help establish a solid foundation of the state of the art and practices in this area.
– Work with the ISO and major regional and national standards organizations to develop formal data citation standards and good practices.

Page 15: Data Exchange, Data Citation: An overview of some community work

Issues Requiring Attention

Page 16: Data Exchange, Data Citation: An overview of some community work

Accomplishments so far

• Compiled a significant bibliography of 200+ references

• Conducted a survey for repository managers, publishers, and funding organizations.
– More than 60 narrative responses

• Upcoming meeting in Copenhagen in June

• Final meeting in Taipei in October with final report to follow.

Page 17: Data Exchange, Data Citation: An overview of some community work

BRDI Meeting in Berkeley

• Two-day meeting in Berkeley, CA on data citation issues in August 2011

• Covered topics critical to data citation
– Administration
– Publishing
– Identification
– Funding support
– Legal
– Policy
– P&T
– Provenance

National Academies report to be published in June

Page 18: Data Exchange, Data Citation: An overview of some community work

What does the future hold?

Page 19: Data Exchange, Data Citation: An overview of some community work

Standards for Data: Work areas?

Many potential areas for work in sharing of scientific data, including:

• Author/Contributor disambiguation & other issues

• Data Equivalence – How does one know that this thing and that are equivalent (i.e., contain the same data)?

• Systemic metadata – What is the form of this information? What are its structural components?

• Archival issues – Storage, physical level, but also migration issues

• Bibliographic information for discovery, delivery and reuse

• Rights issues – Ownership, recognition, sharing, privacy

Page 20: Data Exchange, Data Citation: An overview of some community work

ORCID – Author disambiguation

Founded by CrossRef, Thomson-Reuters, Nature in 2009

Now 328 participant organizations, 50 of which have provided sponsorship funding

Prototype technology

Full launch in Fall 2012

Page 21: Data Exchange, Data Citation: An overview of some community work

International Standard Name ID

• ISO 27729:2012 Information & Documentation -- International Standard Name Identifier

• Another identifier for public identity of parties

• Application for institutions (NISO I2)

• Launched in Spring ‘12
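
An ISNI is a 16-character identifier whose last character is a check character computed with ISO 7064 MOD 11-2, and ORCID iDs share that structure. The Python sketch below shows how such a check character can be verified. The function names are hypothetical. The sample value is the example iD used in ORCID's own documentation. The check only catches transcription errors and says nothing about whether an identifier has actually been assigned.

```python
def mod11_2_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed(identifier: str) -> bool:
    """Check length, digit content, and check character of an ISNI/ORCID-style ID."""
    compact = identifier.replace("-", "").replace(" ", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not all(c in "0123456789" for c in base):
        return False
    return mod11_2_check_char(base) == check


print(is_well_formed("0000-0002-1825-0097"))  # True
print(is_well_formed("0000-0002-1825-0098"))  # False: last character altered
```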

Page 22: Data Exchange, Data Citation: An overview of some community work

Data Equivalence

• Creation of a high-level conceptual model of data description

• A “FRBR” for data

• Defines the distinctions between states & transformations of data

• Basis for identification & description
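
To illustrate why a model of states and transformations matters, the Python sketch below compares two serializations of the same table by reducing each to a canonical form before hashing. It is only one possible notion of equivalence. It assumes small CSV data with a header row, and it assumes that row order, column order, and surrounding whitespace carry no meaning. The function name and sample data are hypothetical.

```python
import csv
import hashlib
import io


def canonical_fingerprint(csv_text: str) -> str:
    """Hash a CSV table after normalizing column order, row order, and whitespace."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    # Reorder columns alphabetically by (stripped) column name.
    order = sorted(range(len(header)), key=lambda i: header[i].strip())
    norm_header = [header[i].strip() for i in order]
    # Apply the same column order to every row, then sort the rows.
    norm_rows = sorted([row[i].strip() for i in order] for row in body)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(norm_header)
    writer.writerows(norm_rows)
    return hashlib.sha256(out.getvalue().encode("utf-8")).hexdigest()


# Hypothetical data: b is a reshuffled copy of a; c has one changed value.
a = "id,value\n1,10\n2,20\n"
b = "value, id\n20,2\n10,1\n"
c = "id,value\n1,10\n2,21\n"

print(canonical_fingerprint(a) == canonical_fingerprint(b))  # True
print(canonical_fingerprint(a) == canonical_fingerprint(c))  # False
```

This treats values as text, so "10" and "10.0" would count as different. Deciding which transformations preserve identity is exactly the kind of question a conceptual model would need to settle.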

Page 23: Data Exchange, Data Citation: An overview of some community work

Too many players, doing too many things?

Source: XKCD

Page 24: Data Exchange, Data Citation: An overview of some community work

Other Initiatives & Links

• CODATA/ICSTI – Task Group on Data Citation Standards and Practices
• DataCite – datacite.org
• NISO/NFAIS Group on Supplemental Journal Materials
• National Academies Board on Research Data and Information
– For Attribution: Developing Data Attribution and Citation Practices and Standards, in Berkeley, CA, August 2011. Report due in January
• Cite Datasets and Link to Publications by Digital Curation Centre
• ORCID - about.orcid.org
• International Standard Name ID – www.isni.org
• ScienceCommons - Scholar’s Copyright Project
• International Council for Scientific and Technical Information (ICSTI) – Multimedia Search and Retrieval, and Interactive Journal Articles Projects
• Dublin Core Metadata Initiative - Science and Metadata Community
• DataNet projects funded by NSF, launched in 2009
• NISO/OAI ResourceSync project funded by Alfred P. Sloan Foundation

Page 25: Data Exchange, Data Citation: An overview of some community work

Information Standards Quarterly (ISQ)

Summer 2010 Issue of ISQ

Special issue on Enhanced Journal Articles

• Article-level Enhancements in the Humanities and Social Sciences
• Hosting Supplemental Materials
• Archiving Supplemental Materials
• DOIs - Linking and Beyond
• Supplemental Materials Survey
• Report on NISO/NFAIS project

NOW AVAILABLE OPEN ACCESS

www.niso.org/publications/isq

Page 26: Data Exchange, Data Citation: An overview of some community work

Questions?

Todd Carpenter, Executive Director

[email protected]

National Information Standards Organization (NISO)
One North Charles Street, Suite 1905
Baltimore, MD 21201 USA
+1 (301) 654-2512
www.niso.org