SCAPE Information Day at BL - Flint, a Format and File Validation Tool

Flint – a format and file validation tool. Alecs Geuder, SCAPE Information Day, British Library, UK, 14th July 2014

Upload: scape-project

Post on 05-Dec-2014


DESCRIPTION

Alecs Geuder from the British Library presented a new SCAPE-developed tool called ‘Flint’ at the ‘SCAPE Information Day at the British Library’ on 14 July 2014. Flint is a format and file validation tool that can be used to validate your files and/or formats against a policy. At the British Library, Flint is used to deal with non-print legal deposit. The information day introduced the EU-funded project SCAPE (Scalable Preservation Environments) and its tools and services to the participants.

TRANSCRIPT

Page 1: SCAPE Information day at BL - Flint, a Format and File Validation Tool

Flint – a format and file validation tool

Alecs Geuder

SCAPE Information Day

British Library, UK, 14th July 2014

Page 2

Introducing Flint: Presentation Structure

• Introduction

• What does Flint do?

• Flint-the-API

• Policy-focused Validation

• Flint-the-toolbox

• Format-specific Implementations

• How we are using it

• Mini-demo

Page 3

Introduction

• Flint facilitates file/format validation against a policy

• The code centres on individual file format modules (PDF, EPUB, ...)

• Comes with a command line interface, GUIs and a Hadoop MapReduce program

Page 4

Flint – core features

Input file of specific format

Schematron Policy (configuration)

• categoryA – three tests

• categoryB – two tests

• categoryC – two tests

Format specific Implementation (code)

• canCheck

• validationResult

• ..

PolicyAware (uses schematron-utils)

Set of internal & third party tools

Result:

<checkresult file="input file" result="passed">
  <categoryA result="passed"/>
  <categoryB result="failed"/>
  <testB.1 result="failed"/>
  <testB.2 result="failed"/>
  <categoryC result="passed"/>
</checkresult>
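To make the result format on this slide more concrete, here is a minimal sketch of how a consumer might read such a check result and list the entries that failed. The XML is copied from the slide (with straight quotes so that it parses); the function name failed_entries is hypothetical, and the sketch only assumes that each entry carries a result attribute, as shown above. It is not Flint's own API.

```python
import xml.etree.ElementTree as ET

# Check result as shown on the slide, with straight quotes so it parses.
CHECK_RESULT = """\
<checkresult file="input file" result="passed">
  <categoryA result="passed"/>
  <categoryB result="failed"/>
  <testB.1 result="failed"/>
  <testB.2 result="failed"/>
  <categoryC result="passed"/>
</checkresult>
"""


def failed_entries(xml_text: str) -> list[str]:
    """Return the tag names of all direct child entries marked 'failed'.

    Hypothetical helper for illustration; it assumes every entry has a
    'result' attribute, as in the slide's example.
    """
    root = ET.fromstring(xml_text)
    return [child.tag for child in root if child.get("result") == "failed"]


if __name__ == "__main__":
    root = ET.fromstring(CHECK_RESULT)
    print(root.get("file"), "->", root.get("result"))
    for name in failed_entries(CHECK_RESULT):
        print("failed:", name)
```

The sketch reads the overall result straight from the root element rather than recomputing it, because the slide does not state how Flint aggregates category results into the overall one.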

Page 5

The Flint ecosystem

Entry points

• CLI

• GUIs

• Hadoop

CORE

• config

• code

Format/Feature specific Implementations

• PDF

• EPUB

• Geospatial data

• DRM-detection PDF/EPUB

Input file

<checkResult>

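The relationship between the core and the format-specific implementations can be sketched roughly as follows. This is an illustration only, under stated assumptions: the method names can_check and validation_result mirror the canCheck and validationResult labels from the core features slide, but their signatures, the class names FormatModule, PdfModule and EpubModule, the function check, and the toy checks themselves are all hypothetical and not taken from Flint.

```python
from typing import Optional, Protocol


class FormatModule(Protocol):
    """Hypothetical shape of a format-specific implementation."""

    def can_check(self, data: bytes) -> bool: ...

    def validation_result(self, data: bytes) -> dict[str, bool]: ...


class PdfModule:
    """Toy PDF module; the single check is illustrative only."""

    def can_check(self, data: bytes) -> bool:
        # PDF files begin with the marker "%PDF-".
        return data.startswith(b"%PDF-")

    def validation_result(self, data: bytes) -> dict[str, bool]:
        # Made-up category: is an end-of-file marker present?
        return {"hasEofMarker": b"%%EOF" in data}


class EpubModule:
    """Toy EPUB module; EPUB is a ZIP container with a 'mimetype' entry."""

    def can_check(self, data: bytes) -> bool:
        # ZIP local file header signature, plus the EPUB media type nearby.
        return data.startswith(b"PK\x03\x04") and b"application/epub+zip" in data[:128]

    def validation_result(self, data: bytes) -> dict[str, bool]:
        # Made-up category: the first entry name (after the 30-byte
        # ZIP local file header) is 'mimetype'.
        return {"mimetypeFirst": data[30:38] == b"mimetype"}


MODULES: list[FormatModule] = [PdfModule(), EpubModule()]


def check(data: bytes, modules: list[FormatModule]) -> Optional[dict[str, bool]]:
    """Core dispatch: use the first module that says it can check the input."""
    for module in modules:
        if module.can_check(data):
            return module.validation_result(data)
    return None  # no module recognises this input


if __name__ == "__main__":
    print(check(b"%PDF-1.4\n...\n%%EOF\n", MODULES))
    print(check(b"plain text", MODULES))
```

In this arrangement an entry point (CLI, GUI or a Hadoop job) would only need to call the core's check function, and supporting a new format would mean adding one more module to the list.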
Page 6

How we are using it

• To deal with non-print legal deposit

What’s next

• Add additional format/feature modules (geospatial, etc.)

Page 7

Mini-demo