Linked Data from a Digital Object Management System
DESCRIPTION
Lightning talk about generating Linked Data from a digital object management system at the National Library of Latvia. Conference: http://swib.org/swib12/programme.php
TRANSCRIPT
Linked Data from
Digital Object Management System
@ the National Library of Latvia
Uldis Bojārs - SWIB12 – 28-Nov-2012
Uldis Bojārs
• [email protected]
• @CaptSolo
• National Library of Latvia
• Semantic Web expert
• PhD in Computer Science, DERI Galway (National University of Ireland, Galway)
Photo: Ligita Ieviņa / Latvijas Nacionālā bibliotēka (National Library of Latvia)
DOM2
Digital Object Management System (DOM2)
Work in progress
• Feedback, suggestions = very welcome :)
• Core functionality
– digital object management and preservation
– production system (not a pilot)
• Development: custom, outsourced
• Linked Data functions (added on)
Context
• Core functionality
– must be reliable and perform well
• Linked Data functions (added on)
– Aim: bootstrap linked data at NLL
– Linked data interface (URIs, HTTP conneg, RDF data)
– SPARQL endpoint
• Developers
– few developers have experience building production-level systems based on RDF stores
https://twitter.com/nichtich/status/273460676222152704
Architecture
• Core system (MSSQL, C#, .NET)
– Ingest, object management, …
– (DB allows adding links to other objects and web pages)
– (new Digital Object metadata fields can be added)
• RDF / Linked Data adaptor module
– URIs, HTTP content negotiation
– HTML, RDF, XML
– (for each new Digital Object field, the RDF export can be specified)
• Separate RDF / SPARQL server
– SPARQL endpoint
– (no impact on core system)
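The content negotiation step in the adaptor module could look roughly like the sketch below. It is not the DOM2 implementation (which is C#/.NET); the media types, the format names and the fallback to HTML are assumptions made for illustration, and wildcard subtypes such as text/* are deliberately not handled.

```python
# Hypothetical mapping; the slides only say that HTML, RDF and XML are served.
SUPPORTED = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_format(accept_header, default="html"):
    """Pick a representation for a digital object URI.

    Simplification: falls back to `default` instead of answering 406
    when nothing in the header matches.
    """
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return SUPPORTED[media_type]
        if media_type == "*/*":
            return default
    return default
```

A browser sending text/html would get the HTML page, while a Linked Data client sending application/rdf+xml would be routed to the RDF export of the same object.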
Synchronisation
• Named graphs
• Push-sync
– core system knows when something is updated and sends changes to the RDF store
– updates at object level (named graphs)
  • SPARQL CLEAR, INSERT
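A minimal sketch of the object-level push described above: the update for one digital object clears its named graph and re-inserts the current triples. The function names, the triple representation and the choice of the object URI as graph name are assumptions; only the use of SPARQL CLEAR and INSERT comes from the slide.

```python
def escape_literal(value):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_sync_update(graph_uri, triples):
    """Build a SPARQL 1.1 Update request that replaces one named graph.

    triples: list of (subject_uri, predicate_uri, (kind, value)) where
    kind is "uri" or "literal" (a hypothetical, simplified representation).
    """
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            obj = f'"{escape_literal(value)}"'
        lines.append(f"    <{subject}> <{predicate}> {obj} .")
    body = "\n".join(lines)
    # SILENT keeps the request from failing if the graph does not exist yet
    return (
        f"CLEAR SILENT GRAPH <{graph_uri}> ;\n"
        f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"
    )


update = build_sync_update(
    "http://example.org/data/obj/11",
    [
        (
            "http://example.org/data/obj/11",
            "http://purl.org/dc/elements/1.1/title",
            ("literal", "Garās magones"),
        )
    ],
)
```

Because each object lives in its own graph, a change to one object never touches the triples of another, and the core system only has to send one request per updated object.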
Data
• Digital object packages (XML)
– from various sources
– mapped to RDF (mix of various vocabs)
• Authority records
– from ALEPH: ~170 k records
– may use DOM2 to expose authority data as RDF
– in RDF: SKOS
  • via https://github.com/kefo/marcauth-2-madsrdf
• Classifiers
– digital object types, access rights, languages, …
– in RDF: SKOS
<http://example.org/data/obj/11>
    dc:creator <http://example.org/data/auth/104168> ;
    dc:rights <http://example.org/data/clas/copyright#Public> ;
    dc:title "Garās magones" ;
    dc:type <http://example.org/data/subtype#Postcard>,
            <http://example.org/data/type#Image> ;
    dct:accessRights <http://example.org/data/clas/accessright#AllowPublic> ;
    dct:captured "2012-07-04"^^xsd:dateTime ;
    dct:modified "2012-07-04"^^xsd:dateTime ;
    a ore:Aggregation .

<http://example.org/data/auth/104168>
    rdfs:label "Губайдуллин, Г. С., (Газиз Салихович)" ;
    rdfs:seeAlso "http://example.org/data/auth/104168.rdf" .
<file id="91" mimeType="image/jpeg" name="junijs15-16_040.jpg" size="2112976" … >
  <fileMetadata>
    <field name="Type">JPEG image</field>
    <field name="Name">91.jpg</field>
    <field name="Size">2.01 MB</field>
    <field name="Title">OLYMPUS DIGITAL CAMERA</field>
    <field name="Subject">OLYMPUS DIGITAL CAMERA</field>
    <field name="Content created">30.10.2012 11:37:52</field>
    <field name="Date last saved">30.10.2012 11:37:52</field>
    <field name="Program name">Version 1.1</field>
    <field name="Width">2736 pixels</field>
    <field name="Height">3648 pixels</field>
    <field name="Horizontal resolution">96 dpi</field>
    <field name="Vertical resolution">96 dpi</field>
…
What about modeling file metadata (for various content types)?
The source XML data is not very useful.
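One reason the source XML is awkward to model is that the field values are display strings ("2736 pixels", "30.10.2012 11:37:52") rather than typed data. The sketch below shows one possible normalisation step before any mapping to RDF. The sample is a trimmed copy of the slide's XML with the elided attributes dropped and the closing tags added; the function names, the unit list and the day.month.year date format are assumptions.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed, closed version of the slide's example (assumption: real packages
# carry more attributes and fields than shown here).
SAMPLE = """<file id="91" mimeType="image/jpeg" name="junijs15-16_040.jpg" size="2112976">
  <fileMetadata>
    <field name="Type">JPEG image</field>
    <field name="Content created">30.10.2012 11:37:52</field>
    <field name="Width">2736 pixels</field>
    <field name="Height">3648 pixels</field>
    <field name="Horizontal resolution">96 dpi</field>
  </fileMetadata>
</file>"""

NUMBER_WITH_UNIT = re.compile(r"^(\d+)\s+(pixels|dpi)$")


def normalise(value):
    """Turn a display string into an int or datetime where the pattern is clear."""
    match = NUMBER_WITH_UNIT.match(value)
    if match:
        return int(match.group(1))
    try:
        return datetime.strptime(value, "%d.%m.%Y %H:%M:%S")
    except ValueError:
        return value  # leave anything unrecognised as plain text


def read_file_metadata(xml_text):
    """Collect file attributes and normalised fields into one dict."""
    root = ET.fromstring(xml_text)
    record = {"mimeType": root.get("mimeType"), "size": int(root.get("size"))}
    for field in root.iter("field"):
        record[field.get("name")] = normalise(field.text or "")
    return record


record = read_file_metadata(SAMPLE)
```

Fields such as "Title" and "Subject" (both "OLYMPUS DIGITAL CAMERA" in the example) would still need a decision about whether they are worth exporting at all.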
Issues / Questions
• Technical issues
– how to work reliably with RDF stores
• Modeling
– Digital object metadata
  • using a mix of vocabs. Can BIBFRAME help?
– File metadata (for various file types)
  • https://answers.semanticweb.com/questions/19810/file-metadata-ontology
– Classifiers
  • Existing vocabs that can be reused? (for digital object types, …)
• Best practices
– Have you done something similar?
– What choices did you make?
Looking for:
Suggestions and feedback:
– modeling, technical decisions, …
– … anything else that comes to mind …
Collaboration ideas, projects:
– to do useful things with this information
  • (re digital objects, authority data, …)
– further research and development
[email protected] / @CaptSolo