
Page 1: Digital Curation Centre - Sharing of experimental clinical research … · 2020. 5. 12. · Guidance for publishing descriptions of non-public clinical datasets Iain Hrynaszkiewicz,
Page 2:

Horizon 2020 Coordination and Support Action

GARRI-3-2014 Scientific Information in the Digital Age: Text and Data Mining (TDM)

Project number: 665940

To help the Text and Data Mining community

Let's discuss (best) practices

Page 3:

Guidance for publishing descriptions of non-public clinical datasets

Iain Hrynaszkiewicz, Varsha Khodiyar, Andrew L. Hufton, Mathias Astell and Susanna-Assunta Sansone

The problem:

• Sharing of experimental clinical research data usually happens between individuals or research groups rather than via public repositories

• It is difficult to connect journal articles with their underlying clinical datasets even when they are “available on request”

Scientific Data workflow for clinical datasets

Our suggested solutions:

• New scholarly journal and article types to increase accessibility to non-public research data and provide case studies

• Journals to develop stronger links with specialist data repositories

• Use and promote voluntary data sharing services to increase accessibility to clinical datasets for secondary uses while protecting patient privacy and the legitimacy of secondary analyses

• Increase collaboration between journals, data repositories, researchers, funders, and voluntary data sharing services

• Use the journal Scientific Data as an example of changes to article format and peer-review process that can be made to journal articles to more robustly link them to data that are only available on request

• Assess and promote features of data repositories to better accommodate non-public clinical datasets, including Data Use Agreements (DUAs)

Page 4:

6 services in 1 afternoon: JOINING UP RESEARCH SUPPORT ACROSS UCL

Our “6-in-1” course: ‘Introduction to Research Support & Integrity’

• Coordinated by the RDM & Research Integrity teams, with the support of the Doctoral School
• 3 hours long; 6 convenors
• Up to 100 PhD students (all disciplines & PhD years)

Drivers & enablers

• Researchers’ need for help: during the project rather than at the end
• Inter-services collaboration
• 6 services that support the research lifecycle at various stages

Benefits

• For students - for speakers - for coordinators

www.ucl.ac.uk/research-data-management

Page 5:

Amber Leahey & Grant Hurley, Scholars Portal, Ontario Council of University Libraries (Canada)

Page 6:

Data Management Outreach Efforts @ University of Florida (UF) - USA

• 1st Data Management Planning (DMP) Workshop: 9/22/16
• 2nd & 3rd DMP Workshops: 10/24/16
• Associate Deans of Research Luncheon: 12/14/16
• University-wide data survey (preview): 1/3/17
• Associate Dean and STRIDE Director Mtg.: 3/9/17

Plato L. Smith II, Data Management [email protected]

These activities were made possible through collaborations between the Data Management and Curation Working Group, UF Research Computing, UF Informatics Institute, and UF Division of Sponsored Programs.

Page 7:
Page 8:

What happens when 15 different people curate data? Do they do the same curation activities?

• Inspect Files (15)
• Inspect Metadata (15)
• Quality Assurance (14)
• Activity? (n = ?)

Page 9:

www.dpoc.ac.uk #DP0C

Parallel Auditing of the University of Oxford and Cambridge’s Institutional Repositories

Page 10:

How, why?

Research Software in RDM?

Page 11:

Scientific Data Management within the Brazilian Information Science Community

Problem

Scientific data management is a necessary practice to add value to the researcher’s data. This reality is becoming evident to the Brazilian Information Science research community; in this regard, we ask what their practices are concerning the management of scientific data.

Hypothesis

We assume as a hypothesis that only a minority of the researchers in the Brazilian Information Science community effectively perform scientific data management.

Objectives

• Characterize the scientific data formats used by the researchers in the Information Science field;

• Identify the average (weekly) time spent by the researchers in the management of scientific research data;

• Investigate the types of scientific data collected by the researchers;

• Identify the data sharing practices employed by the researchers;

• Identify the actions taken by the researchers related to the storage and preservation of research data;

• List the scientific data repositories used by the researchers.

Methodological Characteristics

• Exploratory research

• Survey research

• Quantitative analysis

Current Research Status

The survey is being deployed to the Brazilian Information Science community through a Free Open Source Software survey tool.

Guilherme Ataíde Dias, Universidade Federal da Paraíba - MPGOA

Adriana Alves Rodrigues, Universidade Federal da Paraíba - PPGCI

Renata Lemos dos Anjos, Universidade Federal da Paraíba - DCI

Page 12:

Reusability, Digital Reunification and Analysis of US Overseas Pension Records

Students – Mary KENDIG | Jen PROCTOR | Paridhi MATHUR | Scott HARKLESS | Anne DEMPSEY | Rosemary HALL

Staff – Dr. Kenneth HEGER, Richard MARCIANO, Michael KURTZ

Genealogy

Human Migration

Health Informatics

Economics

Follow our blog at http://dcicblog.umd.edu/overseas-pension/

History

• US Civil War
• US Spanish American War
• Trans Atlantic Family Connections

Records

• Letters
• Reports
• Health Files
• Statistical Tables

Page 13:

CURE: A consortium of academic institutions that support data quality review, a framework that includes research data curation and code review.

FOUNDING MEMBERS:

http://cure.web.unc.edu

Page 14:

TILDA - A solution for publishing, e-archiving and long term preservation of research and environmental data at SLU (Swedish University of Agricultural Sciences)

1. Fruit of co-operation between different units

2. Joint business process for archiving and publishing

3. Long term preservation aspects at the beginning of the process

4. Quality assured data and metadata

5. Integration of CKAN & Archivematica

6. SLU, first university in Sweden to launch a solution for research data

Page 15:

Methods and metrics for the assessment of research data management maturity, adoption of software tools, and data sharing outcomes in neuroimaging

Ana Van Gulick, Carnegie Mellon University, Pittsburgh, PA, USA

John Borghi, California Digital Library, Oakland, CA, USA

What tools are neuroimaging researchers *actually* using? How are they managing and sharing data?

Are they using open science tools?

Pyles et al. 2013, PLOS One

Page 16:

DATA MANAGEMENT PLAN AS A UNIVERSITY REQUIREMENT

Funding agency approves grant application

Administrator creates project record in in-house system

PI receives alert to file DMP

PI creates and submits DMP in in-house system

Research fund is released to PI
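
The five steps above form a strictly ordered workflow in which funds are released only after the DMP has been submitted. A minimal sketch of that ordering follows. It assumes a simple linear state machine; the class and stage names are hypothetical illustrations and do not describe the university's actual in-house system.

```python
from enum import Enum, auto


class Stage(Enum):
    """Workflow stages, in the order shown on the slide (names are illustrative)."""
    GRANT_APPROVED = auto()
    PROJECT_RECORD_CREATED = auto()
    PI_ALERTED = auto()
    DMP_SUBMITTED = auto()
    FUNDS_RELEASED = auto()


# Enum members iterate in definition order, which matches the slide's sequence.
ORDER = list(Stage)


class Project:
    """Hypothetical project record that can only move one stage forward at a time."""

    def __init__(self, title):
        self.title = title
        self.stage = Stage.GRANT_APPROVED

    def advance(self):
        """Move to the next stage; raise if the workflow is already complete."""
        idx = ORDER.index(self.stage)
        if idx == len(ORDER) - 1:
            raise ValueError("workflow already complete")
        self.stage = ORDER[idx + 1]
        return self.stage

    def funds_released(self):
        """Funds count as released only once the final stage is reached."""
        return self.stage is Stage.FUNDS_RELEASED
```

Because each call to `advance()` moves exactly one step, a project cannot reach the funds-released stage without passing through DMP submission, which is the point of making the DMP a requirement.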

Ms GOH Su Nee, Ms Lavanya ASOKAN

Page 17:

Establishing data management services for multi-disciplinary, long-term collaborative research centres

Constanze Curdt and Dirk Hoffmeister

CRC / Transregio 32: Patterns in Soil-Vegetation-Atmosphere Systems
www.tr32db.de

12th International Digital Curation Conference | Royal College of Surgeons of Edinburgh, Scotland | 20 - 23 February 2017

funded by:

CRC 1211: Earth – Evolution at the Dry Limit
www.crc1211db.uni-koeln.de

Page 18:

Agile Data Curation

Values and Principles

Case Studies

Design Patterns

Karl Benedict, W. Christopher Lenhardt, Joshua Young

Community Engagement for Developing the Principles and Practices of Agile Data Curation

Page 19:


Heuristic model for climate information validation made available via Linked Open Data

João José Barbosa Ferreira | Guilherme Ataíde Dias | Universidade Federal de Minas Gerais - Brazil

Problem / Question

What is the degree of commitment of the organizations participating in the Linked Open Data project to maintaining and updating the available data?

Hypothesis

• The increasing volume of data being attached to the Linked Open Data project draws attention to the validity of the available information, since obsolete information can significantly compromise data quality.

Project Overview

• This study aims to investigate the metadata related to Linked Open Data, using heuristics to analyze the frequency of data updates, assess the degree of reliability, and indicate possible points of attention to be corrected.

• An example is the exchange of information among organizations that monitor and assess the Earth's climate status in real time and provide decision-makers at all levels of the public and private sectors with data and information on trends such as climate variability, for civil defense disaster prediction or for planning agricultural actions.

• The development of a software prototype is planned, based on a model that analyzes the research data updates published by climate agencies which have joined the Linked Open Data project, in order to identify possible non-conformities in the updating of these data. Its output is statistical information for composing quantitative indicators of the reliability of the data produced.

Analysis cycle

Operational Steps

Step 1: Choose a database that is a member of the Linked Open Data project

Step 2: Establish the identification criteria for data updates

Step 3: Model validation

Step 4: Results presentation
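
The poster does not specify the heuristic itself, so the following is only a minimal sketch of one way an update-frequency check could work. It assumes each dataset's metadata exposes a last-modified date and an expected update interval; the field names `last_modified` and `expected_interval_days`, and the `tolerance` factor, are hypothetical and not taken from the poster or from any Linked Open Data vocabulary.

```python
from datetime import date


def is_stale(last_modified, expected_interval_days, today, tolerance=2.0):
    """Flag a dataset whose last update is older than tolerance x its expected interval.

    The tolerance factor is an assumed parameter, not part of the authors' model.
    """
    age_days = (today - last_modified).days
    return age_days > tolerance * expected_interval_days


def stale_share(datasets, today):
    """Return the fraction of datasets flagged as stale (0.0 for an empty list)."""
    flags = [
        is_stale(d["last_modified"], d["expected_interval_days"], today)
        for d in datasets
    ]
    return sum(flags) / len(flags) if flags else 0.0


# Hypothetical metadata records, for illustration only.
example = [
    {"last_modified": date(2017, 2, 19), "expected_interval_days": 1},
    {"last_modified": date(2016, 1, 1), "expected_interval_days": 30},
]
share = stale_share(example, today=date(2017, 2, 20))
```

A share computed this way is one possible quantitative indicator of the kind the project overview mentions; a real model would need to derive the expected interval from each publisher's stated update policy.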

Data / Observations

• Data use effectiveness for the prevention of climatic incidents.

• Identification of possible data consumers.

• Amount of periodic database access.

• Effective use of controlled vocabularies and ontologies in the metadata design.

Methodological Approach

• From the standpoint of scientific method, this research intends to use the statistical method to achieve its goals.

Initial conclusions

• Creating a model for analyzing the quality of the data available in a Linked Open Data project is an important resource in the audit process for certification of the data.

• The use of this model allows information producers to increase data quality and gives users greater assurance in their decision-making process.


Page 20:

ORDA (figshare)

archiveUS (Ex Libris Rosetta)

Page 21:

How to make your data valuable via data sharing?

Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e1001779. doi:10.1371/journal.pbio.1001779, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=30978545

Incentives

Data reuse

Research reproducibility

Data organization and management

Sharing platform infrastructure

Licenses

POSTER: Data Sharing in a Complex Computational Study: Easier Said than Done!

Page 22:

Cost and value

22/02/2017 22

It’s undeniable that research data is valuable…

Outputs from the ‘Research at Risk’ Business case and costing project

So why is it so hard to make the business case…?

Page 23:

Keep the wheels turning - Advocating Data Stewardship at TU Delft

Alastair C. Dunning @alastairdunning

Jasmin K. Böhmer @JasminBoehmer

Page 24:

Chung-Yi (Sophie) Hou¹ ([email protected]), Michael Twidale², Steven Worley¹, Matthew S. Mayernik¹

1 – National Center for Atmospheric Research, University Corporation for Atmospheric Research

2 – School of Information Sciences, University of Illinois at Urbana-Champaign

Case Studies of Selected Usability Evaluation Techniques and Their Applications to Improve Data Repositories

Poster #6

Without Usability Evaluations | With Usability Evaluations

Page 25:

Demonstration of a Humanities Data Library

Tuesday at 14:00 in GB Ong (JHU)

Page 26:

Charles Booth’s London

https://booth.lse.ac.uk/

Page 27:

Hybrid Provenance Overview

Page 28:

Demo: CISER Setup Files Creator

Florio Arguillas, William Block

[Diagram. Labels: ASCII Data or Column Binary Data; Data Files: SPSS, SAS, Stata; Setup Files: SPSS, SAS, Stata; DDI 2.5 Codebook; Codebook; SFC_FW; SFC_CB. Captions: “You want:”, “You only have: + -”, “Don’t worry, CISER has:”]

Demo Room: Tausend

Time: 14:00-14:25, Tuesday, 21 February 2017

Page 29: