
Metadata Normalisation in Europeana

The Hague, 13 & 14 January 2009

Julie Verleyen

Scientific Coordinator, Europeana Office

EuropeanaLocal Knowledge Sharing Workshop

Session

A. Workflow

B. Metadata normalisation with ESE

C. Approach in practice: Demo of tools used

D. Knowledge Sharing Workshop: Discussion of the practice for EuropeanaLocal

A. Workflow

[Figure: workflow overview diagram; surviving label: Content survey, #0]

Stage #0: Content survey

Input: Questionnaire

Output: Specifications of content contribution (Excel specs)


Stage #1: Harvesting and package creation

Input: Excel specs

Output:
• Harvested data in XML (XML raw data)
• Collection-specific analysis tool (HTML analysis tool)
• Sample of source data: 1000 records (XML sample raw data)
• Mapping specifications template (TXT mapping template)


Stage #2: Analysis and mapping specifications

Input:
• Excel specs
• HTML analysis tool
• XML sample raw data
• TXT mapping template

Output:
• TXT mapping specs


Stage #3: Mapping and normalisation

Input:
• XML raw data
• TXT mapping specs

Output:
• XML normalised, mapped data

XML profile

Quality check

[Figure: normaliser, stage 3]


Stage #4: Database storage and indexing

Input: XML normalised, mapped data

Output: DB, INDEX

B. Metadata normalisation with ESE

Europeana Semantic Elements (ESE)

• Europeana “Schema” for the Prototype

• Based on Dublin Core Metadata Element Set (DCMES) (ISO )

• 49 Elements (26 Elements & 23 Refinements)

• Created through discussions in July/August 2008


ESE specialities

• europeana:country
• europeana:provider (dc:source)
• europeana:language (dc:language)
• europeana:type (dc:type, dc:format)
• europeana:year (dc:date)
• europeana:isShownBy (dc:relation)
• europeana:isShownAt (dc:relation)
• europeana:object
• europeana:uri (dc:identifier)

ESE specialities

All normalised:
• Syntax
• Value

Let’s examine their characteristics

Normalised ESE terms: Country

Definition: Country of content provider. If several countries: Europe

Format: String, ex: switzerland, germany,…

Reference: TEL controlled list. Supports TEL interface translation mechanism

Mechanism: Manual

In portal: Facet browsing of search results

Normalised ESE terms: Provider

Definition: Organisation sending the data to Europeana

Format: String, ex: Musées lausannois, Nasjonalbiblioteket,…

Reference: Europeana controlled list of content providers: <original_name>

Mechanism: Manual but potentially can be automated

In portal: Facet browsing of search results

Normalised ESE terms: Language

Definition: Language of provider’s country (ESE: languages of the metadata)

Format: 2 letters, ex: it, no, fr, en, es,…

Reference: ISO 639-1 language codes

Exception: If several languages: “mul”

Mechanism: Manual but potentially can be automated

In portal: Facet browsing of search results
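The Language rule (a 2-letter ISO 639-1 code, or “mul” when several languages apply) could be automated roughly as follows. This is only a sketch: the lookup table and the function name `normalise_language` are hypothetical, and a real pipeline would carry the full ISO 639-1 list.

```python
# Hypothetical lookup table (not from the slides); only five languages shown.
ISO_639_1 = {
    "italian": "it",
    "norwegian": "no",
    "french": "fr",
    "english": "en",
    "spanish": "es",
}


def normalise_language(values):
    """Return one ISO 639-1 code, or "mul" when several languages apply."""
    codes = set()
    for value in values:
        key = value.strip().lower()
        if key in ISO_639_1.values():
            codes.add(key)                 # already a 2-letter code
        elif key in ISO_639_1:
            codes.add(ISO_639_1[key])      # language name mapped to its code
        else:
            raise ValueError(f"unknown language: {value!r}")
    if not codes:
        raise ValueError("no language given")
    return codes.pop() if len(codes) == 1 else "mul"
```

Unknown values raise an error rather than being guessed, which keeps the manual review step in the loop.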

Normalised ESE terms: Type

Definition: Type of the original object

Format: String

Reference: 4 Europeana types: IMAGE, TEXT, SOUND, VIDEO

Mechanism: Manual: Mapping specified by content provider

In portal:
• Categorisation display
• Facet browsing of search results

Normalised ESE terms: Year

Definition: Date of creation of the original object (analog or born digital)

Format: 4 digits [YYYY], ex: 1950

Reference: Europeana year

Mechanism: Automatic extraction with “YearExtractor” converter

In portal:
• Facet browsing of search results
• Browsing by time (timeline)
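The slides do not show how the “YearExtractor” converter works. As an illustration only, a 4-digit year could be pulled out of a free-text dc:date value with a regular expression; the function name `extract_years` and the matching rule are assumptions, not the actual converter.

```python
import re

# Four consecutive digits that are not part of a longer run of digits.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_years(date_value):
    """Return every distinct 4-digit year in a free-text date, in order."""
    years = []
    for match in YEAR_PATTERN.finditer(date_value):
        year = match.group(1)
        if year not in years:
            years.append(year)
    return years
```

Under this rule a compact value such as `19500312` yields nothing, so such dates would need a separate pattern.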

Normalised ESE terms: isShownBy

Definition: URL to the digital object

Format: URL (http://...)

Mechanism: Automatic or manual

In portal: Linking

Normalised ESE terms: isShownAt

Definition: URL to the digital object with context

Format: URL (http://...)

Mechanism: Automatic or manual

In portal: Linking

Normalised ESE terms: Object

Definition: URL to the digital object as thumbnail

Format: URL (http://...)

Mechanism: Automatic or manual

In portal: Display

Normalised ESE terms: URI

Definition: Record identifier for Europeana system

Format: URI

Mechanism: Automatic: special algorithm guaranteeing uniqueness (and integrity) of records

http://www.europeana.eu/resolve/record/91101/0BAF44EDF8B98F1322DEEAD4AB989778E6394418

In portal:
• MyEuropeana
• Full digital object view in Europeana
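The slides do not name the algorithm behind the URI. The example path ends in 40 hexadecimal digits, which is the length of a SHA-1 digest, so the sketch below assumes a SHA-1 hash of the provider's own record identifier. The choice of hash, the hashed input, and the reading of the `91101` segment as a collection number are all assumptions.

```python
import hashlib

RESOLVE_BASE = "http://www.europeana.eu/resolve/record"


def make_record_uri(collection_id, source_identifier):
    """Build a deterministic record URI (assumed scheme, see lead-in)."""
    # Assumption: hash the provider's identifier; the slides do not say what is hashed.
    digest = hashlib.sha1(source_identifier.encode("utf-8")).hexdigest().upper()
    return f"{RESOLVE_BASE}/{collection_id}/{digest}"
```

Because the digest depends only on its input, re-ingesting the same record gives the same URI, while distinct identifiers collide only with negligible probability.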

C. Approach in practice: Demo of tools used

Metadata normalisation in practice

Demo of the stage #3 workflow (Mapping & normalisation):

1. Go through data of example collection #1

2. Practical exercise: let’s normalise example collection #2 for Europeana!

3. Two examples of known issues


SUBVERSION (SVN)


COLLECTION FOLDER

SOURCE XML

MAPPING SPECS TXT

OUTPUT XML

MAPPING/NORM. SPECS XML


Example 1: “Midas” collection

83 moving image records from the Association des Cinémathèques Européennes

• Harvested data
• Fields mapping/Type values mapping specs
• Analysis file (source data)
• Mapping file
• Profile file
• Analysis file + sample (normalised data)

Example 2: “Outsider Art Museum” collection

4142 records from the Musées Lausannois

Known issues with mapping/profile files

1. Wrong syntax in mapping file causes errors in profile.xml:

If “=>” is used in a comment in mapping.txt, this creates a mapping entry in profile.xml!

Ex: ………


BEFORE


AFTER


Known issues with mapping/profile files

2. Wrong syntax in mapping file causes errors in profile.xml:

There should be 2 blanks between “=>” and “N/A”, not one; otherwise the mapping specification is not well formatted as XML in profile.xml:

Ex: ………………….
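Both known issues could be caught before profile.xml is generated with a small check over mapping.txt. The sketch below is hypothetical: the slides do not show the file's syntax, so the `#` comment marker and the sample line layout are assumptions, and only whole-line comments are handled.

```python
import re

COMMENT_MARKER = "#"  # assumption: the real comment syntax is not shown in the slides
NA_PATTERN = re.compile(r"=>([ \t]*)N/A")


def lint_mapping(text):
    """Return (line_number, message) pairs for the two known issues."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(COMMENT_MARKER):
            # Issue 1: "=>" inside a comment would create a mapping entry.
            if "=>" in line:
                problems.append((number, 'comment contains "=>"'))
            continue
        # Issue 2: exactly two blanks are required between "=>" and "N/A".
        match = NA_PATTERN.search(line)
        if match and match.group(1) != "  ":
            problems.append(
                (number, 'expected exactly 2 blanks between "=>" and "N/A"')
            )
    return problems
```

For example, `lint_mapping("# title => dc:title\ndc:rights => N/A\n")` reports line 1 for the first issue and line 2 for the second.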


MAPPING.TXT

PROFILE.XML

MAPPING.TXT

PROFILE.XML

profile.xml with error: 2 white spaces!


Documentation in Europeana context

Europeana Semantic Elements (ESE) v3.1

“Europeana – Data Offline Preparation”

Commented version of “profile.xml”

“Quality Control Checklist”

D. Knowledge Sharing Workshop: Discussion of the practice for EuropeanaLocal

Discussion

Questions about the Europeana metadata ingestion/normalisation process?

Integration and/or compatibility of this process with the EuropeanaLocal content strategy:

• Where will normalisation take place?
• By whom?

Discarding factors during normalisation

• Duplicated records
• Records without URLs to digital object
• Records without Europeana type (SOUND, TEXT, IMAGE, VIDEO)
• Records pointing to copyright-protected digital objects
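The four discarding factors can be read as a filter applied to each record. The following is a minimal sketch under stated assumptions: records are plain dictionaries, and the keys `id`, `isShownBy`, `isShownAt`, `type` and `copyright_protected` are hypothetical. How duplicates and copyright status are actually detected is not described in the slides.

```python
EUROPEANA_TYPES = {"IMAGE", "TEXT", "SOUND", "VIDEO"}


def discard_reason(record, seen_ids):
    """Return why a record would be discarded, or None if it is kept."""
    if record["id"] in seen_ids:
        return "duplicate"
    if not (record.get("isShownBy") or record.get("isShownAt")):
        return "no URL to digital object"
    if record.get("type") not in EUROPEANA_TYPES:
        return "no Europeana type"
    if record.get("copyright_protected"):
        return "copyright-protected"
    return None


def filter_records(records):
    """Split records into kept records and (id, reason) pairs for discards."""
    kept, discarded, seen_ids = [], [], set()
    for record in records:
        reason = discard_reason(record, seen_ids)
        seen_ids.add(record["id"])
        if reason is None:
            kept.append(record)
        else:
            discarded.append((record["id"], reason))
    return kept, discarded
```

Returning the reason alongside each discarded identifier makes it easy to report back to the content provider which records were dropped and why.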