Manage your data: why and how? Luis Martinez Uribe, Luis.Martinez-Uribe@oerc.ox.ac.uk. OULS WISER Trinity 2009, 22 May 2009

Page 1: Wiser2009 Luis Martinez

Manage your data: why and how?

Luis Martinez Luis.Martinez-Uribe@oerc.ox.ac.uk

OULS WISER Trinity 2009, 22 May 2009

Page 2: Wiser2009 Luis Martinez

• Background

• What are research data?

• What is research data management and curation?

• Oxford activities

• How to manage data and services available to researchers

Summary

Page 3: Wiser2009 Luis Martinez

Background

New tools and infrastructures available to researchers

A key characteristic is the generation of digital research data

Page 4: Wiser2009 Luis Martinez

How much data? A data deluge!

“More digital data will be produced in the next 5 years than in whole human history” (Australian DEST)

2007 is the “crossover year” where the amount of digital information is greater than the amount of available storage

Source: “The Expanding Digital Universe: A forecast of Worldwide Information Growth through 2010” IDC Whitepaper, March 2007

Page 5: Wiser2009 Luis Martinez

What are research data? http://www.flickr.com/photos/iscjorgegarcia/2359144636/*

Page 6: Wiser2009 Luis Martinez

“Research data is the evidence base on which academic researchers build their analytic or other work.

It includes the widest possible range of data volumes from relatively small data sets up to vast data volumes generated by research in fields such as particle physics. It also includes great variety and heterogeneity of data and its accompanying metadata and documentation to make it usable and understood, or the digital representations and records for physical research data.” (UKRDS final report)

What are research data?

Page 7: Wiser2009 Luis Martinez

Examples http://www.flickr.com/photos/piet_musterd/2231850447/

Page 8: Wiser2009 Luis Martinez

From Dr David Shotton presentation

Page 9: Wiser2009 Luis Martinez

http://www.flickr.com/photos/djmccrady/1883226927/

http://www.flickr.com/photos/oliastro/2987657532/

Page 10: Wiser2009 Luis Martinez

http://ecrystals.chem.soton.ac.uk/604/

http://www.rcsb.org/pdb/explore.do?structureId=1BA4

Page 11: Wiser2009 Luis Martinez

http://www.flickr.com/photos/wrowlands/2270729405/

From Building a VRE for the Humanities poster presented at All Hands Meeting 2007

http://www.beazley.ox.ac.uk/XDB/ASP/recordDetails.asp?recordCount=37&start=0

Page 12: Wiser2009 Luis Martinez

http://www.flickr.com/photos/piper/22584430/ http://www.flickr.com/photos/althouse/273160052/

Page 13: Wiser2009 Luis Martinez

http://www.flickr.com/photos/thivierr/540241947/

EUROBAROMETER 69 PUBLIC OPINION IN THE EUROPEAN UNION FIRST RESULTS

Page 14: Wiser2009 Luis Martinez

Data Management and Curation

http://www.flickr.com/photos/hanan_cohen/455238557/

Page 15: Wiser2009 Luis Martinez

Research data management and curation

• Takes from knowledge/information management

• “…is understanding the current data needs and future ones” (US Department of Defense)

• A means to an end

• Not just technical infrastructure but also procedures and policies

• Preservation http://www.youtube.com/watch?v=pbBa6Oam7-w

• Digital Curation: “maintaining and adding value to a trusted body of digital information for current and future use; it encompasses the active management of data throughout the information lifecycle” (DCC Charter and Statement of Principles)

Page 16: Wiser2009 Luis Martinez
Page 17: Wiser2009 Luis Martinez

Why?

• Ensuring data quality and authenticity of research results

• Not re-inventing the wheel - data collection can be expensive!

• Better access to information (which in many cases is publicly funded) will produce high quality research

• Future access (preservation) http://www.youtube.com/watch?v=pbBa6Oam7-w

• Added value from data mining or combining datasets

• and …

Page 18: Wiser2009 Luis Martinez

Comply with requirements of funding agencies

“the outputs from current and future research must be preserved and remain accessible for future generations”

“expects research data generated as part of BBSRC support to be made available… data should be retained for a period of 10 years after completion of the project”

“requires all grant holders to offer for deposit copies of data to the UK Data Archive”

“require that the applicants provide a data management and sharing plan as part of their application”

SHERPA JULIET SERVICE http://tinyurl.com/datapolicies

Page 19: Wiser2009 Luis Martinez

Data management and curation activities in Oxford

Page 20: Wiser2009 Luis Martinez

Scoping Digital Repository Services for Research Data Management

Page 21: Wiser2009 Luis Martinez

Interviews with researchers

Page 22: Wiser2009 Luis Martinez

I COULDN’T MAKE SENSE OF THE DATA I COLLECTED FOR MY PhD 5 YEARS AGO

I WANT TO PUBLISH THE DATA AS AN ADDITIONAL RESOURCE FOR READERS OF MY PUBLISHED BOOK/ARTICLE

HELP! I AM REQUIRED TO PRODUCE A DATA MANAGEMENT PLAN

WHEN RESEARCHERS LEAVE THE DEPARTMENT WE LOSE ALL THE DATA THEY CREATED

WE HAD TO MIGRATE DATA TO NEW FORMATS SO AS NOT TO LOSE THEM. IT TOOK US MONTHS!!

CLINICAL TRIALS DATA COLLECTED 30 YEARS AGO CAN BE USED TO IDENTIFY THE DAUGHTERS OF THOSE WOMEN WHO WERE ADMINISTERED A DRUG THAT CAUSES CANCER IN THEIR DAUGHTERS

TO SHARE OUR DATA WE HAD TO PHYSICALLY TRANSPORT THE SERVER

WE COLLECTED DATA AS PART OF AN INTERNATIONAL COLLABORATION BUT WE DON’T KNOW WHO OWNS THE DATA

Researcher’s data - the challenges

Page 23: Wiser2009 Luis Martinez

Top requirements for services

Page 24: Wiser2009 Luis Martinez

Consultation with service units

• Aiming to

– Validate the researchers’ requirements for services

– Determine the data management services available to researchers in Oxford

– Identify gaps in service provision

Page 25: Wiser2009 Luis Martinez

Findings

• Widespread expertise in data management and curation amongst service units in Oxford

• Support provided on an ad hoc basis, but services not made explicit

• Overall, the majority of the services in the data management and curation framework are not offered fully or at all.

• There is a need for a university-wide policy on data management and curation

Page 26: Wiser2009 Luis Martinez

Research data management and curation services

Page 27: Wiser2009 Luis Martinez

Embedding Institutional Data Curation Services in Research (EIDCSR)

Page 28: Wiser2009 Luis Martinez

Where to start? A data management plan

*With details about:

• the need for access to existing data sources

• the data to be produced by the research project

• the planned quality assurance and back-up procedures for data

• the plans for management and archiving of collected data

• any expected difficulties in making data available for secondary research (through data archiving) and measures to overcome such difficulties

• who holds copyright and Intellectual Property Rights of the data

• data management responsibility roles within the research team

[Support from Departments’ IT or research facilitators or Research Services]

* RELU Data Management Plans
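
The plan sections listed on this slide can be kept as a simple checklist. The sketch below is only an illustration: the class name `DataManagementPlan`, its field names, and the helper `missing_sections` are hypothetical and are not taken from the RELU template or from any funder's form.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text field per section named on the slide (field names are assumptions)."""
    existing_data_sources: str = ""
    data_to_be_produced: str = ""
    quality_assurance_and_backup: str = ""
    management_and_archiving: str = ""
    sharing_difficulties_and_measures: str = ""
    copyright_and_ipr: str = ""
    responsibilities: str = ""


def missing_sections(plan: DataManagementPlan) -> list:
    """Return the names of sections that are still empty, in slide order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


if __name__ == "__main__":
    draft = DataManagementPlan(data_to_be_produced="Interview transcripts")
    for name in missing_sections(draft):
        print("still to write:", name)
```

Running the check before submitting an application is one way to spot sections that were left blank.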

Page 29: Wiser2009 Luis Martinez

File handling

• Use open file formats if possible (ODF, PNG, TIFF, JPEG)

[Training providers (OUCS, OULS, departmental…)]

• Use a clear directory structure

• Name files consistently (http://mst.nerc.ac.uk/file_naming_conventions.html)

• Use version control tools

[OUCS Subversion Repositories]
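
One way to name files consistently is to generate the names rather than type them. The sketch below assumes a made-up convention of `project_description_YYYYMMDD_vNN.ext`; it is not the NERC convention linked above, and the function name `build_filename` is hypothetical.

```python
import re
from datetime import date


def build_filename(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a name following the assumed pattern project_description_YYYYMMDD_vNN.ext."""

    def slug(text: str) -> str:
        # Lower-case the text and turn every run of other characters into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lower().lstrip(".")
    return f"{slug(project)}_{slug(description)}_{created:%Y%m%d}_v{version:02d}.{ext}"


if __name__ == "__main__":
    print(build_filename("EIDCSR", "Interview notes", date(2009, 5, 22), 2, "CSV"))
```

Putting the date in year-month-day order and zero-padding the version number means that an alphabetical listing is also a chronological one.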

Page 30: Wiser2009 Luis Martinez

Collect metadata: “data about data”

• Different types

– descriptive metadata: describing the intellectual content of the object

Simple DC: Title/Creator/Subject/Description/Publisher/Contributor/Date/Type/Format/Identifier/Source/Language/Relation/Coverage/Rights

– administrative metadata: information used to manage the object or control access to it.

– structural metadata: information that ties each object to others.

[OUCS Research Technology Service or the Digital Curation Centre (DCC) may be able to help]
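
As an illustration of descriptive metadata, the sketch below writes a Simple Dublin Core record as XML using the Python standard library. The fifteen element names and the `http://purl.org/dc/elements/1.1/` namespace are those of Simple DC; the wrapping `metadata` element, the function name `dublin_core_xml`, and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = ("title", "creator", "subject", "description", "publisher",
               "contributor", "date", "type", "format", "identifier",
               "source", "language", "relation", "coverage", "rights")


def dublin_core_xml(record: dict) -> str:
    """Serialise a dict of Simple DC fields; reject keys that are not DC elements."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Simple DC elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name in DC_ELEMENTS:
        if name in record:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = str(record[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(dublin_core_xml({
        "title": "Manage your data: why and how?",
        "creator": "Luis Martinez Uribe",
        "date": "2009-05-22",
        "format": "text/csv",
    }))
```

Rejecting unknown keys catches misspelt element names early, before a record is deposited anywhere.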

Page 31: Wiser2009 Luis Martinez

Storage

• Check with your departmental IT

• Need a back-up strategy

– How often/stored for how long/who will be responsible?

[Hierarchical File Server for backing up your files and long-term storage]

[OUCS Research Technology Service may be able to help]

• Ethics and confidentiality

[Research Ethics Committee http://www.admin.ox.ac.uk/curec/]

– http://www.data-archive.ac.uk/sharing/confidential.asp
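
For the back-up strategy point on this slide, a back-up is only useful if the copy matches the original. The sketch below compares SHA-256 checksums of a working directory and its back-up copy. It is a generic illustration and not a description of how the Hierarchical File Server works; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large data files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {p.relative_to(root).as_posix(): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def compare(original: dict, backup: dict) -> dict:
    """List files absent from the back-up and files whose contents differ."""
    return {
        "missing": sorted(set(original) - set(backup)),
        "changed": sorted(name for name in original.keys() & backup.keys()
                          if original[name] != backup[name]),
    }


if __name__ == "__main__":
    import sys
    working, copy = Path(sys.argv[1]), Path(sys.argv[2])
    print(compare(build_manifest(working), build_manifest(copy)))
```

Saving the manifest alongside the data also gives a later check that files have not silently changed during long-term storage.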

Page 32: Wiser2009 Luis Martinez

Data sharing and long-term preservation

• Sharing through:

– Papers, local repositories, national repositories or web tools

– Informally at conferences, blogs or email

• Be aware of IP and copyright issues

– [http://www.data-archive.ac.uk/sharing/copyright.asp] [Isis Innovation] [Legal Services can help]

• Long-term preservation and sharing at national data centres

– UK Data Archive

– NERC data centres

– Archaeology Data Service

– European Bioinformatics Institute (EBI)

– Many more like this at: http://tinyurl.com/globaldatarepo

Page 33: Wiser2009 Luis Martinez

Services available in Oxford

• ORA for research articles and other grey literature

• Hierarchical File Server for backing up your files

• OUCS Research Technology Service

• Departmental support through IT or research facilitators

• Departmental storage

• Legal Services

• Research Services

• Central University Research Ethics Committee

• Different training providers

Page 34: Wiser2009 Luis Martinez

Basic Data Management Principles

1. Plan before producing data

2. When possible, choose the right standards and open formats

3. Document your data

4. Store your data securely and always back up

5. Use trusted repositories to deposit your data for sharing and long-term preservation

Page 35: Wiser2009 Luis Martinez

Other useful resources

• UK Data Archive Manage and Share guidelines

– http://tinyurl.com/datamanage

• Research Data Management Services: Findings of the Consultation with Service Providers

– http://tinyurl.com/Oxdataservices

• MIT Data Management and Publishing guide

– http://tinyurl.com/qjz6ay

• Australian National University data management planning

– http://ilp.anu.edu.au/dm/

Page 36: Wiser2009 Luis Martinez

Thanks