rdm staff development event presentation 140310

Research data management | Chris Awre, Library and Learning Innovation | Staff Development event, 10th March 2014

Upload: chris-awre

Post on 06-May-2015

DESCRIPTION

A presentation used for a staff development event on research data management at the University of Hull, 10th March 2014

TRANSCRIPT

Page 1: RDM staff development event presentation 140310

Research data management

Chris Awre

Library and Learning Innovation

Staff Development event, 10th March 2014

Page 2: RDM staff development event presentation 140310

Agenda

• 9:15 – Introduction / structure for the session

• 9:20 – Research data – what is it and why manage it?

• 9:50 – Data management lifecycle and planning

• 10:15 – Data management during the research

• 10:45 – Break

• 11:00 – Data management after the research

• 11:30 – Trends and external developments

• 11:50 – Summary and wrap-up

Research Data Management @ Hull | 10 March 2014 | 2

Page 3: RDM staff development event presentation 140310

Aims for the day

• To show how data can be managed throughout the research lifecycle

• To highlight good practice in research data management

• To explore data sharing and data preservation

• To highlight how data management can be embedded

Page 4: RDM staff development event presentation 140310

Research data

Page 5: RDM staff development event presentation 140310

Starting points

• Management of research data happens

– Existing activity is acknowledged

• Current research data management initiatives are based on three trends

– The amount of data is growing

– Data management is required by multiple disciplines

– Increasing perception of the value of data

Page 6: RDM staff development event presentation 140310

Research Data Management @ Hull

• What?

• Why?

• Where?

• When?

• How?

Page 7: RDM staff development event presentation 140310

RDM @ Hull – What?

• Research data takes many forms, for example:

– Computer-generated data from experiments

– Survey data

– Compilations of historical facts

• The scope of research data encompasses the materials and/or information that are created or gathered to underpin research analysis

• Completed work / work in progress

– Data management can, and probably should, encompass both

Page 8: RDM staff development event presentation 140310

What is data? Exercise

• Please list what data you work with or need to manage

– Use the flipchart sheets provided

• Include anything you feel fits the definition:

– Research data encompasses the materials and/or information that are created or gathered to underpin research analysis

• Next, list issues and concerns you have in managing this data

Page 9: RDM staff development event presentation 140310

RDM @ Hull – Why?

• Data as research output

– Data itself can be a valid (REF) research output and needs to be well managed for presentation and assessment

• Transparency of research

– Good data management allows the process of research to be transparent, adding validity and integrity

• Data security and accuracy

– Data management is not just for outputs, but can support research practice itself

• Data sharing

– Foster collaboration and increase the value of the data through making it available for others to use

Page 10: RDM staff development event presentation 140310

RDM @ Hull – Where?

• Local research data management does not necessarily mean local provision of storage

– Local management, making use of services as required wherever they may be

• Options for storage

– Local

– Cloud

– Disciplinary, national or international data centre

– Publishers (traditional journals, but also data journals)

• All have associated services

– What criteria determine which option(s)?

Page 11: RDM staff development event presentation 140310

RDM @ Hull – When?

• When to manage data?

– At all stages of the research lifecycle

• Key to this is making it easy to embed

– Minimise effort, whilst demonstrating benefits of effort undertaken

• Timetable?

– Your decision

– Starting with data management at the start of a research process alleviates issues in the future

Page 12: RDM staff development event presentation 140310

RDM @ Hull – How?

• Research has a lifecycle

– Data management can also follow this cycle

– Helps to identify when data management activities are required

– Helps to plan data management within a project

• Data management planning

– Living document that acts as a guide to data management throughout a project

– Helps define specific actions to undertake, and how they get acted upon

Page 13: RDM staff development event presentation 140310

Questions…?

Page 14: RDM staff development event presentation 140310

Data management guides

Page 15: RDM staff development event presentation 140310

Data management lifecycle - UKDA

Page 16: RDM staff development event presentation 140310

Data management planning

• Data management plan template

– Can be used as a full guide

– Can be used as a checklist

– It is a tool to support data management and a prompt for the issues you need to consider

• Online data management planning

– DMPOnline tool: http://dmponline.dcc.ac.uk (generic questions / funder requirements)

– The importance of data management planning: https://www.youtube.com/watch?v=PXr14Urf268

Page 17: RDM staff development event presentation 140310

Questions…?

Page 18: RDM staff development event presentation 140310

During the research

Page 19: RDM staff development event presentation 140310

Creating data

• Design research

• Plan data management

• Plan consent (if required)

• Plan storage

• Locate existing data

• Collect data

– Experiment, observe, measure, etc.

• Capture and/or create metadata

Page 20: RDM staff development event presentation 140310

Organising your data

• Make sure the files are named and structured in a way that is meaningful, to you and colleagues

– Effort at the time can save time when you need to retrieve data

– How do you manage version control?

– Be consistent in practice

• Important for collaborative research

• Would you know what you had if returning to it in a year’s time?

Page 21: RDM staff development event presentation 140310

Benefits of consistent file naming and organisation

• Data files are not accidentally overwritten or deleted

• Data files are distinguishable from each other within their containing folder

• Data file naming prevents confusion when multiple people are working on shared files

• Data files are easier to locate and browse

• Data files can be retrieved both by creator and by other users

• Data files can be sorted in logical sequence

• Different versions of data files can be identified

• If data files are moved to another storage platform their names will retain useful context
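As an illustration of the naming points above, here is a minimal Python sketch of one possible convention. The pattern (project, description, ISO date, zero-padded version) and the names `slug` and `build_filename` are assumptions made for this example, not a convention recommended in the presentation.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lower-case text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a name of the form project_description_YYYY-MM-DD_vNN.ext."""
    stamp = created.isoformat()          # ISO dates sort chronologically as text
    ext = extension.lstrip(".").lower()  # accept ".CSV" or "csv"
    return f"{slug(project)}_{slug(description)}_{stamp}_v{version:02d}.{ext}"


# Hypothetical project and file description, built out of order on purpose
names = [
    build_filename("Coastal Survey", "Interview notes", date(2014, 3, 10), v, "csv")
    for v in (10, 2, 1)
]
print(sorted(names))  # zero-padded versions sort in numerical order
```

Because the date and version are fixed-width, a plain alphabetical sort in a file browser gives the logical sequence, and the name still carries context if the file is moved elsewhere.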

Page 22: RDM staff development event presentation 140310

File formats

• Be aware of the file formats your data exists in

– Does this format require a specific type of software?

– Can others access the data in this format?

– Can alternative formats be used?

• Would transfer between formats lead to any loss of data?

• Using widely available or open formats maximises the chances of your data being stable and usable

• Watch out for backwards compatibility if software is upgraded

Page 23: RDM staff development event presentation 140310

Ethics and consent

• Where people or animals are involved, ethical consent will be required

• Data management requirements can be embedded within ethical/research approval

– Single workflow

– Ensures data management is at the heart of research planning

• Consent needs to be carefully considered

– What is being requested?

– Think ahead, beyond the research, for implications

Page 24: RDM staff development event presentation 140310

Processing data

• Manage and store data

• Check, validate, clean data

• Anonymise data (if applicable)

• Describe data

– Distinct from metadata

– Covers more detail, e.g., methodology of creation and structure

– Aim is to enable a third party to make sense of it

Page 25: RDM staff development event presentation 140310

Metadata/data description

• Information that says what the data file is

• Metadata is a brief summary

– Title, brief description, date, creator, etc.

• Data description is a fuller explanation of the data

– Origins of the data, methodology used, structure, why it was captured, etc.

• Metadata aids future retrieval, to give assurance the correct file has been found

• Data description communicates the purpose of the data to others
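The split between brief metadata and a fuller data description can be sketched as a small record stored alongside a data file. The field names follow the bullets above; the class name `DatasetRecord`, the function `to_json` and the choice of JSON are assumptions for illustration only.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DatasetRecord:
    # Brief metadata: enough to confirm the correct file has been found
    title: str
    description: str
    date: str
    creator: str
    # Fuller data description: enough for a third party to make sense of the data
    origin: str = ""
    methodology: str = ""
    structure: str = ""
    purpose: str = ""


def to_json(record: DatasetRecord) -> str:
    """Serialise the record as JSON text, e.g. for a file kept next to the data."""
    return json.dumps(asdict(record), indent=2, ensure_ascii=False)


# Hypothetical values, for illustration only
example = DatasetRecord(
    title="Interview notes, batch 1",
    description="Transcribed interviews",
    date="2014-03-10",
    creator="A. Researcher",
    methodology="Semi-structured interviews",
)
print(to_json(example))
```

Keeping both parts in one machine-readable record means the summary can drive retrieval while the description travels with the data.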

Page 26: RDM staff development event presentation 140310

Storage and security

• Data storage

– How much storage will you need? Can you access this?

• Data security

– Data confidentiality

– Data corruption and loss

– How do you protect against these?

• Data backup

– Good practice to have three copies of the data in different locations

– Backup policy: What needs backing up? Who has access to backups? When can backups be deleted?
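One way to guard against silent corruption across backup copies is to compare checksums. The following is a minimal sketch using Python's standard `hashlib`; the function names and the idea of passing in a list of copy locations are assumptions for this example, and it checks content only, not where the copies are held.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: List[Path]) -> Dict[str, bool]:
    """Report, per copy, whether it exists and is byte-identical to the original."""
    expected = sha256_of(original)
    return {
        str(copy): copy.exists() and sha256_of(copy) == expected
        for copy in copies
    }
```

A call such as `copies_match(Path("data.csv"), [Path("/backup1/data.csv"), Path("/backup2/data.csv")])` (hypothetical paths) returns a dictionary flagging any copy that is missing or has changed.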

Page 27: RDM staff development event presentation 140310

Anonymisation

• If data includes personal data, it may need to be anonymised to protect individuals

– Office of the Information Commissioner guidance: http://ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation

• Data Protection Act

– Useful to know about the principles of this Act

– Research exemptions

• Data can be kept for the long-term to aid ongoing research, so long as there is no impact on the individuals concerned
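As a first step towards protecting individuals, direct identifiers can be dropped and participant IDs replaced with random codes. The sketch below shows this pseudonymisation step only; the column names and function name are hypothetical, and removing direct identifiers may not by itself amount to anonymisation, because combinations of the remaining fields can still identify someone.

```python
import secrets
from typing import Dict, List, Tuple

# Hypothetical column names holding direct identifiers
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymise(rows: List[Dict[str, str]],
                 id_field: str = "participant_id"
                 ) -> Tuple[List[Dict[str, str]], Dict[str, str]]:
    """Return (cleaned rows, key table mapping original IDs to random codes)."""
    key_table: Dict[str, str] = {}
    cleaned = []
    for row in rows:
        original_id = row[id_field]
        if original_id not in key_table:
            code = "P-" + secrets.token_hex(4)
            while code in key_table.values():  # avoid the unlikely duplicate code
                code = "P-" + secrets.token_hex(4)
            key_table[original_id] = code
        # Drop direct identifiers, keep every other field
        kept = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        kept[id_field] = key_table[original_id]
        cleaned.append(kept)
    return cleaned, key_table
```

The returned key table links codes back to individuals, so it would need to be stored separately from the cleaned data and under tighter access control, or destroyed if re-identification is never required.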

Page 28: RDM staff development event presentation 140310

Analysing data

• Interpret data

• Derive data

• Produce research outputs

• Author publications based on data

Page 29: RDM staff development event presentation 140310

Data as output

• Supplement to journal article

– Many journal publishers now insist on this

• If the data is going to be accessed, is the metadata/data description in place?

• Data journals

– Journals where the paper is just about the production of the dataset, e.g., Geoscience Data Journal and Biodiversity Data Journal, amongst others

• Developing field of data science

Page 30: RDM staff development event presentation 140310

Questions…?

Page 31: RDM staff development event presentation 140310

Break

Page 32: RDM staff development event presentation 140310

After the research

Page 33: RDM staff development event presentation 140310

Preserving data

• Migrate data to best format

• Migrate data to suitable medium

• Back-up and store data

• Create metadata and documentation

• Archive data

Page 34: RDM staff development event presentation 140310

Preservation starts at the beginning

• By carrying out the actions recommended during the research, preservation has already started

– Preservation is not a post-research activity, but a continuous one

• Any actions at this stage affect both preservation and sharing/access

– Good management during the research provides a good platform for actions now

Page 35: RDM staff development event presentation 140310

Preservation considerations

• Is the file format that was used during the research the most appropriate for long-term preservation?

• What is the best medium for preservation?

• What metadata and documentation will aid long-term preservation of the data?

– Is this different to that already created?

• What lifespan are you planning for?

– When can data be deleted?

Page 36: RDM staff development event presentation 140310

Long-term storage and management

• Local

– Discuss early with ICTD

– Hydra digital repository (http://hydra.hull.ac.uk)

• Cloud

– Who manages this? Who would act as proxy/backup?

• National/international

– What disciplinary data centres could you use? E.g., NERC data centres, re3data.org

• Offline

Page 37: RDM staff development event presentation 140310

Giving access to data

• Distribute data

• Share data

• Control access

• Establish copyright

• Promote data

Page 38: RDM staff development event presentation 140310

Sharing data

• Reasons to share data (other than the funder telling you to)

– Scientific integrity

– Impact

– Collaboration

– Innovation

– Preservation (for your own use)

– Teaching

– Public record

• “The coolest thing to do with your data will be thought of by someone else.” (Rufus Pollock, Open Knowledge Foundation)

Page 39: RDM staff development event presentation 140310

RCUK data principles

• Data are a public good

• Data management plans should adhere to standards and best practice. Data should be preserved where appropriate

• Data should have metadata to facilitate discovery and understanding

• Researchers should assess barriers preventing sharing

• Data should be exploitable by the creators prior to sharing (within reason)

• Sources of data should always be acknowledged

• Public funds can be requested to assist with the management of data

Page 40: RDM staff development event presentation 140310

Hydra digital repository

• A University service to aid management of digital content, as required by staff

– Theses, exam papers, committee minutes, data, etc.

– http://hydra.hull.ac.uk

• Holds, presents and preserves files

• Access can be open or controlled

• Data can be structured to demonstrate links between files

• Data catalogue

– Data stored elsewhere can also be recorded here as a University output (an EPSRC requirement)

Page 41: RDM staff development event presentation 140310

Copyright and licensing

• Who owns the copyright in the data?

• For data created by employees, the University owns copyright by default

– But check the funding agreement for any claims

• Beware third-party copyright within data

• Datasets can have database copyright

– Copyright in the whole, not just the parts

• Data licensing

– Open Data Commons, http://opendatacommons.org

– Creative Commons, http://creativecommons.org

– Open Government Licence

Page 42: RDM staff development event presentation 140310

Citation

• Data can be cited

• No standard format

– Different journals/repositories will have different recommendations

– Be clear about the information you provide: title, creator, date, link, etc.

• Example

– M. Haines, ed. 'Newfoundland, 1698-1833' in M.G Barnard and J.H Nicholls (comp.), 2010, HMAP Data Pages (www.hull.ac.uk/hmap)
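Since there is no standard format, the practical point is to supply the same core elements every time. The sketch below assembles title, creator, date and link in one fixed order; the ordering, punctuation and function name are assumptions for illustration and do not follow any particular journal's or repository's style.

```python
def format_data_citation(creator: str, date: str, title: str, link: str) -> str:
    """Join the core citation elements in a fixed order, rejecting empty ones."""
    fields = {"creator": creator, "date": date, "title": title, "link": link}
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError("missing citation fields: " + ", ".join(missing))
    return f"{creator.strip()} ({date.strip()}). {title.strip()}. {link.strip()}"


# Hypothetical dataset, for illustration only
print(format_data_citation(
    "A. Researcher", "2014", "Example survey dataset", "https://example.org/data/1"
))
```

Refusing to build a citation with an empty field is a simple way to make sure the information provided is always complete.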

Page 43: RDM staff development event presentation 140310

Re-using data

• Follow-up research

• New research

• Undertake research reviews

• Scrutinise findings

• Teach and learn

Page 44: RDM staff development event presentation 140310

Building on work already done

• Easy access to data for further work

• Ability to examine data from others

• Have your own data validated

• Research-informed teaching

– Ability to use data within teaching

Page 45: RDM staff development event presentation 140310

Questions…?

Page 46: RDM staff development event presentation 140310

Quick exercise

• What services / help would you value having access to internally?

– Please take 5 minutes to list priorities

– Data will be used to help shape future assistance

Page 47: RDM staff development event presentation 140310

Trends and external developments

Page 48: RDM staff development event presentation 140310

Why the emphasis on data management?

• Data as research output

– Data itself can be a valid (REF) research output and needs to be well managed for presentation and assessment

• Transparency of research

– Good data management allows the process of research to be transparent, adding validity and integrity

• Data security and accuracy

– Data management is not just for outputs, but can support research practice itself

• Data sharing

– Foster collaboration and increase the value of the data through making it available for others to use

Page 49: RDM staff development event presentation 140310

Government driver

• “The Government, in line with our overarching commitment to transparency and open data, is committed that publicly-funded research should be accessible free of charge. Free and open access to taxpayer-funded research offers significant social and economic benefits by spreading knowledge, raising the prestige of UK research and encouraging technology transfer”

Innovation and Research Strategy for Growth, Department for Business, Innovation & Skills, 2011

Page 50: RDM staff development event presentation 140310

The Royal Society

• Science as an Open Enterprise

– Scientists should communicate data, using open access where possible, and a repository where justifiable

– Universities should support an open data culture, and be rewarded for this

– Funders should support data management financially

– Publishers should insist on data availability

– Government should foster policies for open data

– What is shared should be proportionate to the research and in the public interest, and using shared protocols

– http://royalsociety.org/policy/projects/science-public-enterprise/report/

Page 51: RDM staff development event presentation 140310

OECD / Horizon 2020

• OECD principles and guidelines for access to research data from public funding

– http://www.oecd.org/science/sci-tech/38500813.pdf

• Horizon 2020

– All data outputs must be open access

– http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf (needs UKRO account for access)

– Funding applications must address data management

Page 52: RDM staff development event presentation 140310

EPSRC data management roadmap

Diagram of the nine expectations: Awareness; Policies and processes; Data curation; Resourcing; Preserving; Non-digital data; Metadata; Publication link to data; Manage access

Page 53: RDM staff development event presentation 140310

RDM @ Hull: What?

• This, and other, events

• EPSRC roadmap

• University Research Data Management Working Group

• Assessing current practice

• Research data management website

• Engagement in research bid process

• Data management planning

• Hydra repository

• Research Storage Service

Page 54: RDM staff development event presentation 140310

DIY RDM? We are not alone

• Research data management is happening across HE

– Experience and expertise are growing

• The initiatives underway are:

– Making use of available tools and adapting/exploiting them for Hull

– Taking us down a path that many others are following: we can all learn from each other

• Need to map local needs against available external support, and fill in the gaps locally

– Develop tailored University of Hull RDM support

Page 55: RDM staff development event presentation 140310

Link to open access

• Open access publication refers largely to document outputs

– Articles, conference contributions, books, reports, etc.

• Open access to data is being treated separately

– To avoid confusion

– Because datasets need specific attention

• Now seeing them being brought together more

– Horizon 2020

– The ‘open’ agenda will see this continuing

Page 56: RDM staff development event presentation 140310

Thank you

Chris Awre