Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, Hannes Kulovits and...


Upload: jisc-keepit-project

Post on 17-Nov-2014


Category: Technology



DESCRIPTION

This presentation, part of an extensive practical tutorial on logical and bit-stream preservation using Plato (a preservation planning tool) and EPrints (software for creating digital repositories), presents a new storage controller for EPrints providing selectable storage options locally and in the cloud. The presentation was given as part of module 4 of a 5-module course on digital preservation tools for repository managers, presented by the JISC KeepIt project. For more on this and other presentations in this course, look for the tag ’KeepIt course’ in the project blog: http://blogs.ecs.soton.ac.uk/keepit/

TRANSCRIPT

Page 1: Physical preservation with EPrints: 1 Storage, by Adam Field, David Tarrant, Hannes Kulovits and Andreas Rauber

Digital Preservation: Logical and bit-stream preservation using Plato and EPrints

Physical preservation with EPrints: 1 Storage

Hannes Kulovits, Andreas Rauber
Department of Software Technology and Interactive Systems
Vienna University of Technology
[email protected]@ifs.tuwien.ac.at

David Tarrant, Adam Field
School of Electronics and Computer Science
University of Southampton, UK
[email protected]@ecs.soton.ac.uk

Page 2

What is EPrints For?

EPrints offers a safe, open and useful place to store, share and manage material in the pursuit of research and educational agendas.

administrative reporting, collaboration, data sharing, digital profile enhancement, e-learning, e-publishing, e-research, marketing, open access, preservation, publicity, research assessment, research management, scholarly collections

Page 3

An EPrints repository is

a valuable part of the researcher’s information environment
- directly integrating with the research desktop
- offering sustainable storage and open access

a competent and mature component of the institution’s information environment
- providing management and curation support for core business research data
- leveraging information about research outputs to inform management strategy

Page 4

A repository needs to interoperate with management information systems

Create reports based on research project activities as well as research outputs

EPrints will support the CERIF standard for Current Research Information Systems

Research Information Systems

Page 5

Summary

1. Storage Ecosystem
- Environmental study

2. Storage Controller
- Interacting with your environment

3. Managing Stored Assets
- Ensuring the future of your data

Page 6

STORAGE ECOSYSTEM

Where can we store data?

Page 7

Local Disk Storage

- No local bandwidth costs
- Hard to expand
- Locally managed
- High overhead costs
- Requires space and cooling
- Tied closely to the software

Page 8

Local Archival Storage

- Specialist
- Expensive to purchase
- Locally managed
- Space and running costs
- Expandable

Page 9

Cloud Storage

- Scalable
- Externally controlled
- Known costings
- Unclear retention policy
- Reusable (using simple APIs)
- Global scale

Page 10

But Clouds Blow Away

In the last 24 months:
- Yahoo Briefcase
- XDrive
- AOL Pictures
- HP Upline
- Sony Image Station

Source: Tom Spring - PCWorld

Page 11

Why use Hybrid Storage

Use the best features of each storage type

Performance
- Scaling-up bandwidth

Optimisation
- Large-file handling
- Multimedia streaming

Localised Delivery
- Local delivery from the cloud

Page 12

STORAGE CONTROLLER

Which storage should we use?

Page 13

EPrints Storage Controller

• The storage controller decides where to put a file.
• It uses a rule-based policy defined by a simple configuration file (XML).
• Examples:
  • Large binary files of scientific data (raw machine result data) can be stored in a large disk (slower access) system and sent to a tape company for long-term storage.
  • Processed results can be stored locally and in the cloud, ready for rapid delivery to end points.
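The two examples above can be read as ordered rules, each mapping a kind of file to one or more storage backends. The following is a minimal Python sketch of that idea, not the EPrints implementation: the `StoredFile` fields, the backend names (`BulkDisk`, `TapeArchive`, `Local`, `Cloud`) and the size threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StoredFile:
    name: str
    size_bytes: int
    kind: str  # hypothetical label, e.g. "raw" or "processed"


# Each rule pairs a predicate with the backends that should receive the file.
Rule = Tuple[Callable[[StoredFile], bool], List[str]]

LARGE_FILE_BYTES = 1_000_000_000  # assumed threshold, not taken from the slides

RULES: List[Rule] = [
    # Large raw machine output: slower bulk disk plus off-site tape.
    (lambda f: f.kind == "raw" and f.size_bytes >= LARGE_FILE_BYTES,
     ["BulkDisk", "TapeArchive"]),
    # Processed results: a local copy plus a cloud copy for rapid delivery.
    (lambda f: f.kind == "processed", ["Local", "Cloud"]),
]

DEFAULT_BACKENDS = ["Local"]


def choose_backends(f: StoredFile) -> List[str]:
    """Return the backends named by the first matching rule."""
    for predicate, backends in RULES:
        if predicate(f):
            return list(backends)
    return list(DEFAULT_BACKENDS)
```

Calling `choose_backends(StoredFile("run.dat", 5_000_000_000, "raw"))` would select the bulk disk and tape backends, while any file that matches no rule falls back to local storage.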

Page 14

Hybrid Storage Policies

Page 15

Desktop & Cloud Integration

Part 1: Hybrid Storage Policies

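<choose>
  <when test="datasetid = 'document'">
    <choose>
      <!-- Volatile versions of a document stay on local disk only -->
      <when test="$parent{relation_type} = 'isVolatileVersionOf'">
        <plugin name="Local"/>
      </when>
      <!-- All other documents go to both cloud services -->
      <otherwise>
        <plugin name="SunCSS"/>
        <plugin name="AmazonS3"/>
      </otherwise>
    </choose>
  </when>
  <!-- Anything that is not a document is stored locally -->
  <otherwise>
    <plugin name="Local"/>
  </otherwise>
</choose>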
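To make the policy's behaviour concrete, here is a small Python sketch that walks the XML from this slide and reports which plugins it selects for a given file. It is an illustration only, not how EPrints evaluates its policies: the function names, the use of plain dictionaries for the file and its parent, and the tiny condition parser (which handles only the two test forms shown on the slide) are assumptions.

```python
import re
import xml.etree.ElementTree as ET

POLICY = """
<choose>
  <when test="datasetid = 'document'">
    <choose>
      <when test="$parent{relation_type} = 'isVolatileVersionOf'">
        <plugin name="Local"/>
      </when>
      <otherwise>
        <plugin name="SunCSS"/>
        <plugin name="AmazonS3"/>
      </otherwise>
    </choose>
  </when>
  <otherwise>
    <plugin name="Local"/>
  </otherwise>
</choose>
"""

# Matches only the two forms used on the slide:
#   field = 'value'    and    $parent{field} = 'value'
CONDITION_RE = re.compile(r"^\s*(\$parent\{(\w+)\}|(\w+))\s*=\s*'([^']*)'\s*$")


def condition_holds(expr, obj, parent):
    """Evaluate one test attribute against the file and its parent."""
    m = CONDITION_RE.match(expr)
    if m is None:
        raise ValueError(f"unsupported test: {expr!r}")
    _, parent_field, own_field, expected = m.groups()
    if parent_field is not None:
        return parent.get(parent_field) == expected
    return obj.get(own_field) == expected


def select_plugins(node, obj, parent):
    """Walk a <choose> element and return the plugin names it selects."""
    for branch in node:
        if branch.tag == "when":
            if not condition_holds(branch.get("test"), obj, parent):
                continue
        elif branch.tag != "otherwise":
            continue
        # First matching <when>, or <otherwise>: collect its plugins,
        # descending into any nested <choose>.
        plugins = []
        for child in branch:
            if child.tag == "plugin":
                plugins.append(child.get("name"))
            elif child.tag == "choose":
                plugins.extend(select_plugins(child, obj, parent))
        return plugins
    return []


def plugins_for(obj, parent=None):
    return select_plugins(ET.fromstring(POLICY.strip()), obj, parent or {})
```

Under these assumptions, a document whose parent has `relation_type` set to `isVolatileVersionOf` resolves to `Local` only, any other document resolves to `SunCSS` and `AmazonS3`, and a non-document object falls through to `Local`.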

Page 16

MANAGING STORED ASSETS

How do I move data around?

Page 17

EPrints Storage Manager

Page 18

Amazon S3 Localisation (1)

Page 19

Amazon S3 Localisation (2)

Page 20

Recap

1. Storage Ecosystem
- There are a great number of products and services available designed to protect your resources. Each is aimed at a market with different needs based on the type of content.

2. Storage Controller
- Allows you to utilise a diverse range of storage services simultaneously. Take advantage of the current ecosystem.

3. Managing Stored Assets
- If the ecosystem changes, moving resources to a new service is a seamless operation.
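One way to picture moving resources to a new service is a copy-then-verify loop. The sketch below is a hypothetical illustration in Python and does not describe the EPrints Storage Manager: each storage service is modelled as a plain dictionary from object name to bytes, and the `migrate` function and its `delete_source` flag are invented for this example.

```python
import hashlib
from typing import Dict


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def migrate(source: Dict[str, bytes],
            target: Dict[str, bytes],
            delete_source: bool = False) -> Dict[str, str]:
    """Copy every object from source to target, verifying each copy by checksum.

    Returns a mapping from object name to its SHA-256 digest.
    """
    migrated = {}
    for key in list(source):
        data = source[key]
        expected = sha256(data)
        target[key] = data
        # Read the object back from the target and compare before trusting it.
        if sha256(target[key]) != expected:
            raise IOError(f"checksum mismatch for {key!r}")
        migrated[key] = expected
    if delete_source:
        # Remove originals only after every copy has been verified.
        for key in migrated:
            del source[key]
    return migrated
```

Keeping the originals until all checksums match means a failed migration leaves the old service untouched, which matters when the reason for moving is that a provider is about to disappear.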