Long-term storage – will it fill up with the good stuff, or the big, bad, and ugly? Can checklists make a difference? Angus Whyte, DCC ‘Research Data Storage and Preservation Strategies’ University of Edinburgh 27 October 2014 [email protected]

Upload: dcc-info

Post on 25-May-2015

DESCRIPTION

Presentation by Angus Whyte at DCC-Arkivum event 'Data Storage & Preservation Strategies for Research Data Management' at University of Edinburgh 27 October 2014

TRANSCRIPT

Page 1: Long-term storage – will it fill up with the good stuff, or the big, bad, and ugly? Can checklists make a difference?

Long-term storage – will it fill up with the good stuff, or the big, bad, and ugly?

Can checklists make a difference?

Angus Whyte, DCC ‘Research Data Storage and Preservation Strategies’

University of Edinburgh 27 October 2014

[email protected]

Page 2

Long-term storage – will it fill up with the good stuff, or just the big, bad, and ugly? Will checklists encourage researchers to decide?

Page 3

Page 4

RDM Service Components

www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

Page 5

RDM Service ‘Components’

www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services

Page 6

But more support needed!

Top 3 support needs for institutions*

1. Defining what to retain
2. Specifying tools/infrastructure
3. Supporting metadata creation for research data discovery

*DCC 2014 RDM Survey of 61 institutions, March 2014. Data available at: zenodo.org/collection/user-dcc-rdm-2014

Page 7

Data Asset Surveys

About your data and its lifecycle…?

1. File type
2. Volumes
3. Density
4. Update frequency
5. Usage frequency
6. Availability required
7. Sensitivity

Data Asset Framework Implementation guide: www.data-audit.eu/docs/DAF_Implementation_Guide.pdf

Active storage
Archival storage

Some institutions have estimated storage requirements from these
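As a rough illustration of how survey answers might feed a storage estimate, the sketch below sums reported volumes per storage tier. The field names, the tiering rule (anything updated or used at least monthly counts as active) and the sample figures are all assumptions made for this example; none of them come from the Data Asset Framework guide.

```python
from dataclasses import dataclass


@dataclass
class AssetResponse:
    """One survey answer about a data asset (hypothetical field names)."""
    name: str
    file_type: str
    volume_gb: float
    updates_per_year: int
    uses_per_year: int
    sensitive: bool


def is_active(asset: AssetResponse, threshold_per_year: int = 12) -> bool:
    # Assumed rule: frequently updated or used data stays on active storage.
    return max(asset.updates_per_year, asset.uses_per_year) >= threshold_per_year


def estimate_storage(assets):
    """Return the total reported volume in GB for each storage tier."""
    totals = {"active": 0.0, "archival": 0.0}
    for asset in assets:
        tier = "active" if is_active(asset) else "archival"
        totals[tier] += asset.volume_gb
    return totals


# Invented sample responses, for illustration only.
responses = [
    AssetResponse("survey-2013", "csv", 2.0, 0, 3, True),
    AssetResponse("microscopy", "tiff", 800.0, 52, 100, False),
    AssetResponse("interviews", "wav", 40.0, 0, 1, True),
]
print(estimate_storage(responses))
```

An estimate like this only reflects what researchers report; it says nothing about whether they would actually move data to the archival tier once it is offered.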

Page 8

But if you provide it will researchers use it, at what cost?

Page 9

Practical checklists at key points in the research cycle

Active storage
Archival storage

Data Mgmt Plan
1. Collection
2. Documentation
3. Ethics & legal
4. Storage & backup
5. Selection & preservation
6. Data sharing
7. Responsibilities

Data Selection: 5 steps to decide what to keep
1. Could - benefit
2. Must - risks
3. Should - value
4. Cost factors
5. Weigh-up 1-4

Catalogue Metadata
1. Name
2. Description
3. Identifier
4. Subject
5. URL
6. Date
7. Creator
8. Rights
9. Spatial
10. Publisher

Repository selection
1. Policy & legal
2. Discoverable
3. Preservation
4. Reports
5. Trust

Start

Write-up
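The ten catalogue metadata fields listed on this slide could be held in a simple record that reports what is still missing. This is a minimal sketch: the class name, the helper function and the sample values are hypothetical, and treating every field as optional text is an assumption rather than anything the slide specifies.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CatalogueRecord:
    """The ten catalogue fields from the slide, all optional free text here."""
    name: Optional[str] = None
    description: Optional[str] = None
    identifier: Optional[str] = None
    subject: Optional[str] = None
    url: Optional[str] = None
    date: Optional[str] = None
    creator: Optional[str] = None
    rights: Optional[str] = None
    spatial: Optional[str] = None
    publisher: Optional[str] = None


def missing_fields(record: CatalogueRecord) -> List[str]:
    """Return the names of fields that are empty or unset, in slide order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Invented sample record, for illustration only.
record = CatalogueRecord(
    name="Example dataset",
    description="Hypothetical sample record",
    date="2014-10-27",
    creator="A. Researcher",
)
print(missing_fields(record))
```

A check like this could run at write-up, when the researcher is best placed to fill the gaps.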

Page 11

Straightforward steps to guide researchers

① Could this data be re-used
② Must it be kept to manage compliance risk
③ Should it be kept for its potential value and…
④ Considering costs
⑤ Will ✔ or won’t ✗ it be kept, shared on what terms

Institution or external repository

Data selection checklist

Page 12

Step 1 (?) What ‘must’ be kept?

Research record includes data as evidence for e.g. …
• Audit purposes
• Health & Safety (Lab book)
• Contractual requirement

Compliance is also about data that won’t be kept, or may only be shared with approved researchers… Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics & Registration Service Act. UK Data Archive: http://www.data-archive.ac.uk/create-manage/consent-ethics/legal

Jisc Infonet Guidance on Managing Research Records: tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf

Page 13

Available choices depend on what purposes the data serves

Page 14

Step 1 (?) What ‘must’ be kept?

But what about funder & journal data policies?

“Data with acknowledged long-term value” RCUK Common Principles on Data Policy

“Data, information and other electronic resources of long-term interest” ESRC UK Data Archive Collections Development Policy

“Where data underpins published research there is much greater expectation that it will be kept” Ben Ryan, EPSRC

“An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims.” Nature

Page 15

Still researchers’ judgement: what purposes the data may serve

Page 16

So make thinking about that the first step

Page 17

Step 1 (previously Step 2): What could it be reused for?

1. Verification
2. Further analysis
3. Reputation building
4. Resource development
5. Further publications inc. data articles
6. Learning and teaching materials
7. Private reference

Any angles the researcher has not already considered?

Page 18

Then, relative to these, which data must be kept

Page 19

Step 3: What data should have value?

1. Good quality data and description: complete, accurate, reliable, valid, representative etc.
2. High demand: known users, integration potential, reputation, recommendation, appeal
3. High effort to replicate: difficult, costly, or impossible to reproduce
4. Low barriers to reuse: legal/ethical, copyright, non-restrictive terms and conditions
5. Rarity value: unique copy or other copies at risk

Then what else, e.g. software, does it depend on?

Any two of these fit?
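The "any two of these" test can be sketched as a simple count over the five value criteria. The criterion keys and the function name are hypothetical labels for the items above, and making the threshold a parameter is an assumption; the slide only poses two as a question.

```python
# Hypothetical short keys for the five value criteria listed on the slide.
VALUE_CRITERIA = (
    "good_quality",
    "high_demand",
    "high_effort_to_replicate",
    "low_barriers_to_reuse",
    "rarity_value",
)


def should_have_value(answers: dict, minimum: int = 2) -> bool:
    """Return True when at least `minimum` criteria are answered True."""
    unknown = set(answers) - set(VALUE_CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    met = sum(1 for criterion in VALUE_CRITERIA if answers.get(criterion, False))
    return met >= minimum


print(should_have_value({"good_quality": True, "rarity_value": True}))
print(should_have_value({"high_demand": True}))
```

The count is only a prompt for the researcher's judgement; it does not weigh one criterion against another.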

Page 20

Step 4: Cost factors

Why?
• Costs incurred during project may add to value
• Post-project costs must be covered

1. Creation, collection & cleaning

2. Short-term storage & backup

3. Short-term access & security

4. Team communication & development

5. Preservation & long-term access

So what action is needed to stay on budget?
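One way to check the budget question is to total the five cost factors and compare the post-project part with the funds set aside for it. In this sketch the factor keys, the function name and the sample amounts are hypothetical, and treating only "Preservation & long-term access" as a post-project cost is an assumption made for illustration.

```python
# Hypothetical keys for the five cost factors listed on the slide.
COST_FACTORS = (
    "creation_collection_cleaning",
    "short_term_storage_backup",
    "short_term_access_security",
    "team_communication_development",
    "preservation_long_term_access",
)

# Assumption: only the last factor falls after the project ends.
POST_PROJECT = {"preservation_long_term_access"}


def cost_summary(costs: dict, post_project_budget: int) -> dict:
    """Split costs into in-project and post-project and report any shortfall."""
    in_project = sum(costs.get(f, 0) for f in COST_FACTORS if f not in POST_PROJECT)
    post_project = sum(costs.get(f, 0) for f in COST_FACTORS if f in POST_PROJECT)
    return {
        "in_project": in_project,
        "post_project": post_project,
        "shortfall": max(0, post_project - post_project_budget),
    }


# Invented sample figures in whole currency units.
sample_costs = {
    "creation_collection_cleaning": 12000,
    "short_term_storage_backup": 1500,
    "short_term_access_security": 500,
    "team_communication_development": 1000,
    "preservation_long_term_access": 4000,
}
print(cost_summary(sample_costs, post_project_budget=2500))
```

A non-zero shortfall is the signal that some action is needed before the project ends, such as keeping less or finding other funding.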

Page 21

Step 5: Bring it all together

Balance risks, costs and value

Document the choices made
1. Name, contributors, description, sensitivity (metadata)
2. Reuse purposes and value – the ‘reuse case’
3. Risk of non-compliance and costs shortfall
4. Justification to keep or dispose
5. Actions to prepare for preservation or disposal
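The five documented items above could be captured in a single decision record, with a small check for gaps. This is only a sketch: the class, its field names and the three checks in `problems` are assumptions made for illustration, not part of the DCC checklist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SelectionDecision:
    """A documented keep-or-dispose choice (hypothetical field names)."""
    name: str
    contributors: List[str]
    description: str
    sensitivity: str
    reuse_case: str
    compliance_risk: str
    cost_shortfall: int
    keep: bool
    justification: str
    actions: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List gaps that would leave the choice poorly documented."""
        issues = []
        if not self.justification.strip():
            issues.append("no justification recorded")
        if not self.actions:
            issues.append("no preparation actions recorded")
        if self.keep and self.cost_shortfall > 0:
            issues.append("kept despite an uncovered cost shortfall")
        return issues


# Invented sample decision, for illustration only.
decision = SelectionDecision(
    name="Example dataset",
    contributors=["A. Researcher"],
    description="Hypothetical sample record",
    sensitivity="contains personal data",
    reuse_case="verification of published results",
    compliance_risk="funder expects retention",
    cost_shortfall=1500,
    keep=True,
    justification="Underpins a published article",
)
print(decision.problems())
```

Recording the justification alongside the decision keeps the weigh-up visible to whoever later manages the archival store.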

Page 22

But will this work?

From a research perspective, will active selection mean bureaucracy?

Page 23

But will it work?

Active storage
Archival storage

Data Mgmt Plan: enough to identify which project this data relates to

“The bad”: can’t share as nobody knows its sensitivity

The “too big for anywhere else”

“The ugly”: “don’t know its value or where else to put it”

Easier to avoid selecting the good and let someone else deal with de-allocation?

Page 24

Thank you