shareable by design: making better use of your research
DESCRIPTION
Presentation on data sharing that outlines five layers that must be addressed to enable data to be located, obtained, accessed, understood and used, and cited.
TRANSCRIPT
Shareable by Design
Make better use of your research
Gareth Knight
This work is licensed under a Creative Commons Attribution 2.0 UK: England & Wales License
Gateways to Research
23 September 2013
Data Sharing in the News
Funder Requirement
Expectation that projects will take steps to manage & share data
• DM Plan to accompany research proposal
• Data created & managed in accordance with relevant standards
• Minimum retention period
• Strategy for sharing data, indicating when, where and how
Funder Requirements for Data Management and Sharing http://researchonline.lshtm.ac.uk/208596/
Research Impact
Open Data citation benefit (Craig et al. 2007):
• Increased visibility through promotion of paper and data as distinct entities, as well as additional cross-linking
• Research considered to be more credible
• Papers with available datasets can be used in ways that papers without data cannot
Piwowar & Vision (2013):
• Papers with associated data receive more citations (9%) than those without, in a study of 10,555 gene expression papers
Piwowar H, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ PrePrints 1:e1v1 http://dx.doi.org/10.7287/peerj.preprints.1v1
Craig et al. (2007) Do open access articles have greater citation impact? A critical review of the literature. Journal of Informetrics 1 (3) 239-248 http://dx.doi.org/10.1016/j.joi.2007.04.001
Enabling Better Research
1. Recognise all sources
Greater clarity on contribution by data creator to research activity
2. Supports research validation
Save time when locating and reproducing research findings
3. Supports new research
Data can be analysed in new ways & for different purposes:
• Piwowar & Vision gene expression study identified increase in no. of datasets analysed in secondary studies
Data Sharing Pyramid
Usable
Accessible
Available
Locatable
Citable
Ability to locate
• Publish information in digital form that enables a user to discover:
– Existence of dataset
– Content & origin of dataset
• Referred to as discovery/descriptive metadata
• Descriptive information may include:
– Creator, institution, creation date, description, temporal coverage, geographic coverage, rights
• Should be possible to find through search engines and relevant portals
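A minimal sketch of the descriptive metadata listed above, assuming a simple Python dictionary as the record format. The field names mirror the slide's list; the `missing_fields` helper and all example values are hypothetical and do not follow any particular metadata standard.

```python
import json

# Field names taken from the slide's list of descriptive information.
REQUIRED_FIELDS = [
    "creator", "institution", "creation_date", "description",
    "temporal_coverage", "geographic_coverage", "rights",
]

def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

# Hypothetical record with invented placeholder values.
record = {
    "creator": "A. Researcher",
    "institution": "Example University",
    "creation_date": "2013-09-23",
    "description": "Placeholder survey dataset",
    "temporal_coverage": "2010/2012",
    "geographic_coverage": "England and Wales",
    "rights": "",  # left empty on purpose to show the check
}

print(missing_fields(record))
print(json.dumps(record, indent=2))
```

A check like this could run before a record is submitted to a portal, so that gaps such as the empty rights statement are caught early.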
Ability to obtain
• Entire dataset or subset, e.g. anonymised/non-confidential data
• Different access methods:
– Public: Available to anyone
– Registered: Time-limited access to registered users
– Approved: User must state intended use
– Contract: User must complete a formal agreement to access & use
Make research data available for access to interested parties
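The four access methods above could be modelled roughly as follows. This is an illustrative sketch only: the `Access` enum and `may_obtain` function are hypothetical names, and the time limit on registered access is deliberately left out to keep the example short.

```python
from enum import Enum

class Access(Enum):
    PUBLIC = "public"          # available to anyone
    REGISTERED = "registered"  # registered users (time limit not modelled here)
    APPROVED = "approved"      # user must state intended use
    CONTRACT = "contract"      # user must complete a formal agreement

def may_obtain(level, registered=False, stated_use=False, signed_agreement=False):
    """Return True if the user meets the condition attached to the access level."""
    if level is Access.PUBLIC:
        return True
    if level is Access.REGISTERED:
        return registered
    if level is Access.APPROVED:
        return stated_use
    return signed_agreement  # Access.CONTRACT

print(may_obtain(Access.PUBLIC))
print(may_obtain(Access.APPROVED, registered=True))
```

Treating each level as a single condition is a simplification; a real repository may combine conditions, for example requiring registration before an approval request.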
To Share or not to Share
1. Is the sharing justified?
• What benefits will it provide?
• What are the risks associated with sharing data?
2. Do you have the ability to share?
• Ability to anonymise
• Intellectual Property Rights (IPR)
• Participant consent
• Other obligations, e.g. confidentiality
3. Are there any conditions associated with sharing?
• What measures need to be in place to protect data? (e.g. record access requests, specific use only)
Information Commissioner's Office: Data Sharing Code of Practice http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
Information Commissioner's Office: Anonymisation Code of Practice http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
Data Rights
• Informed consent
– Obtain authorisation from project participants
• Clarify rights holders at early stage of project
– Ownership rights
– Roles & responsibilities, e.g. for long-term management
– Permitted access and use
• Document in an appropriate form
– Standard data licence (e.g. Creative Commons, Open Data Commons License)
– Tailored licence form, e.g. Data Transfer Agreement
• Particularly important when performing collaborative research involving multiple rights holders
Rights issues must be addressed to enable access and use
Ability to access
Accessible using available software
• Is it held in a format that can be imported into a wide range of software tools?
Stored in a form that enables analysis
Use published standards to encode data, rather than inventing your own approach
• Simplify cross-analysis of studies
• Save time on documentation
Address issues that will limit access to and use of data
“turning [a] PDF into XML is like turning a hamburger into a cow”
Peter Murray-Rust on the challenges of extracting data from published research papers
UK Data Service: Formatting Data http://ukdataservice.ac.uk/manage-data/format.aspx
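As one illustration of a widely importable format, the sketch below writes a small table to CSV with Python's standard `csv` module and reads it back. The column names and values are invented, and CSV is used here only as an example of an open format, not as a recommendation made in the slides.

```python
import csv
import io

# Invented example rows; a real dataset would be read from the study's files.
rows = [
    {"participant_id": "P001", "age_years": 34, "region": "North"},
    {"participant_id": "P002", "age_years": 51, "region": "South"},
]

def to_csv(rows):
    """Serialise a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def from_csv(text):
    """Parse CSV text back into a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

csv_text = to_csv(rows)
restored = from_csv(csv_text)
print(csv_text)
print(restored[0])
```

Note that the values come back as strings: CSV carries no type information, which is one reason accompanying documentation matters.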
Ability to Use
Provide documentation sufficient to interpret content
• How is this value calculated?
• What are the boundaries of this measurement?
• What does this column refer to?
• Where and how was this data captured?
• What processing has been performed?
• What are the permitted uses of this data?
Provide documentation (e.g. codebook, research protocol) to accompany data
1. Check requirements in your field (e.g. clinical trials)
2. Look at other collections for inspiration (e.g. via UK Data Service)
3. Consider questions that may be raised by a user unfamiliar with the research
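A codebook can answer several of these questions in machine-readable form. The sketch below assumes a hypothetical codebook held as a Python dictionary; the variable names, descriptions, and valid range are invented for illustration.

```python
# Hypothetical codebook: one entry per column, describing meaning, type and bounds.
codebook = {
    "participant_id": {
        "description": "Pseudonymised participant identifier",
        "type": "string",
    },
    "age_years": {
        "description": "Age at interview",
        "type": "integer",
        "unit": "years",
        "valid_range": (18, 99),
    },
}

def undocumented_columns(columns, codebook):
    """Return the columns that have no codebook entry."""
    return [column for column in columns if column not in codebook]

def out_of_range(column, values, codebook):
    """Return the values that fall outside the documented valid range."""
    low, high = codebook[column]["valid_range"]
    return [value for value in values if not low <= value <= high]

print(undocumented_columns(["participant_id", "age_years", "bmi"], codebook))
print(out_of_range("age_years", [34, 17, 120], codebook))
```

The first check flags columns a new user could not interpret; the second shows how documented measurement boundaries can double as a basic validation rule.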
Ability to Cite
1. References, without broken links
• Web-based data changes location over time, resulting in broken URLs (‘link rot’)
• Many persistent identifier schemes exist. Digital Object Identifier (DOI) is the preferred standard
2. Research data and citation standards
• No formal standard for citing data using Harvard or Vancouver, but many different guides
– Author names. Title of resource [medium type]. Host institution name: Physical location; Year of publication [Date accessed]. Available from: Identifier
3. Citation granularity
• Referencing intra-object components (e.g. tables, paragraphs) remains a concern
Provide structure to identify and acknowledge sources used in research paper
http://www.esds.ac.uk/orderingData/citing.asp
http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
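The citation pattern quoted on this slide can be turned into a small formatting helper. This is a sketch under stated assumptions: `format_data_citation` is a hypothetical function, the comma separator between authors is a choice made here, and the example dataset and DOI are invented placeholders.

```python
def format_data_citation(authors, title, medium, host, location,
                         year, accessed, identifier):
    """Build a citation string following the pattern quoted on the slide:
    Author names. Title [medium]. Host: Location; Year [Date accessed].
    Available from: Identifier
    """
    author_text = ", ".join(authors)
    return (
        f"{author_text}. {title} [{medium}]. {host}: {location}; "
        f"{year} [{accessed}]. Available from: {identifier}"
    )

# Invented example values; the DOI below is a placeholder, not a real dataset.
citation = format_data_citation(
    authors=["Researcher A", "Colleague B"],
    title="Example Survey Dataset",
    medium="dataset",
    host="Example University",
    location="London",
    year=2013,
    accessed="23 Sep 2013",
    identifier="http://dx.doi.org/10.1234/example",
)
print(citation)
```

Using a persistent identifier such as a DOI in the final field keeps the reference resolvable even if the hosting location changes.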
Things to remember
• To increase research visibility: Publish metadata on research data through an appropriate research portal
• To enable access: Evaluate benefits and risks and address barriers to data sharing
• To manage access: Consider the conditions that need to be met
• To encourage uptake and citation: Ensure that your data is easy to access and use
• To recognise your sources and credit data creators: Cite the datasets that you use in research papers
A Few Useful References
• Funder Requirements for Data Management and Sharing: http://researchonline.lshtm.ac.uk/208596/
• Information Commissioner's Office. Data Sharing Code of Practice: http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
• Information Commissioner's Office. Anonymisation Code of Practice: http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
• MANTRA – Data Management training for PhD students: http://datalib.edina.ac.uk/mantra/
• UK Data Archive – Managing and Sharing Data: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
• LSHTM Information Management support material: http://intra.lshtm.ac.uk/infoman/
• Guidelines on good research practice: Implementing research governance: http://www.lshtm.ac.uk/research/ethicscommittees/good_research_practice.pdf
• Information Management and Security Policy: http://intra.lshtm.ac.uk/infoman/security/index.html
Thank You for your attention!
Gareth Knight
RDM Support Service Project Manager
Questions