eXtensible Catalog: Tools for the creation and use of RDA, FRBRized and linked data
David Lindahl
eXtensible Catalog Organization
University of Rochester, River Campus Libraries
Rochester, NY
LITA National Forum
September 30, 2011
Funders and Sponsors
Major Funding
• Andrew W. Mellon Foundation
Sponsors
• Consortium of Academic and Research Libraries in Illinois (CARLI)
• Kyushu University
• University of North Carolina at Charlotte
• University of Rochester
User Research
Problem:
• User research is of limited value if a library doesn’t have control over its discovery environment
• Our solution:
– Develop our own software (eXtensible Catalog)
– Offer a modular architecture (4 “toolkits”)
– Build in tons of configurability
– Use established standards and protocols
– Give it away (open source)
XC User Research Approach
• What articles, books and other resources had researchers used most recently?
– How did they know the items existed?
– How did they obtain them?
– How did they use them?
• How do they keep current in their fields?
User Research Findings
• Users want to choose between versions of a resource, see relationships between resources
– Underlying XC metadata is based on FRBR model: works, expressions, manifestations, etc.
– Use some RDA data elements in FRBR structure
– Metadata services to aggregate/group FRBR entities in the User Interface
User Research Findings
• Users have preferred material and format types, depending upon their projects
– Show online materials only
– Exclude microforms
• Users want to know why items appear on a search result list
– Show keywords in context
Acting on User Research Findings
XC: “Taking Control” of metadata
More Control over Metadata
More Options for Customizing the User Interface
XC Schema
• Dublin Core terms (all)
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to contain MARC vocabularies, linking fields, etc.
[Diagram labels: DCMI, RDA, XC]
Discovery Interface
Translating User Research Findings into XC Functionality
FRBR Structure - Pyramid
[Diagram: one Work at the top; two Expressions below it; three Manifestations below those; four Holdings at the base]
FRBR Structure - Hourglass
[Diagram: three Works and three Expressions at the top, a single Manifestation at the narrow middle, and three Holdings at the base]
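The pyramid and hourglass shapes can be modeled with a few linked record types. The following is a minimal Python sketch, not XC's actual data model: every class and field name here is an illustrative assumption. A Manifestation holds a list of Expressions so that the hourglass case (one volume embodying several works) fits the same structure as the pyramid.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Work:
    title: str


@dataclass
class Expression:
    work: Work      # uplink to the Work this Expression realizes
    language: str


@dataclass
class Manifestation:
    # A list, so one Manifestation can embody several Expressions (hourglass).
    expressions: List[Expression]
    publisher: str


@dataclass
class Holdings:
    manifestation: Manifestation  # uplink to the Manifestation that is held
    location: str


def works_for(holdings: Holdings) -> List[str]:
    """Follow the uplinks from a Holdings record to the titles of its Works."""
    return [e.work.title for e in holdings.manifestation.expressions]


# Hourglass case: one anthology (Manifestation) containing two Works.
poems = Expression(Work("Poems"), "eng")
essays = Expression(Work("Essays"), "eng")
anthology = Manifestation([poems, essays], "Example Press")
copy = Holdings(anthology, "Main Library")
print(works_for(copy))
```

In the pyramid case the same classes apply with a single Expression per Manifestation, and several Manifestations pointing at the same Expression.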
Software Overview
Discovery, Metadata Management, and Connectivity
XC Software
• OAI Toolkit (ILS Connectivity): Synchronize data with XC
• NCIP Toolkit (ILS Connectivity): Circ. status, Account info
• MST Toolkit (Metadata Services): Cleanup, Format Convert
• Drupal Toolkit (User Interface): Search, Browse
Each toolkit is eXtensible with add-on packages
• User Interface Features
• More Metadata Services
• ILS Export Scripts
• XSLT Scripts
• ILS connectors
XC Software
[Diagram: the four toolkits above connected to a Voyager ILS, with the labels Metadata, Live Circ. Data, User Interface and two Voyager “Driver” boxes]
Drupal Toolkit
Drupal Toolkit Features
• Search/Browse
• Customization and theming
• Platform for applications
– Library website
– Modules add functionality
Drupal Toolkit In Use
Cute.Catalog @ Kyushu University
“Creating Communities” @ Denver Public Library
Metadata Services Toolkit (MST)
MST Features
• Collect metadata from repositories
• Process metadata with services:
– Normalize
– Convert
– Merge
– Add identifiers
• Platform for building new services
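One way to picture "process metadata with services" is as a chain of small functions, each taking a record and returning a changed copy. The sketch below is an assumption-laden illustration in Python, not the MST's API: the function names, the record layout, the tag-to-element mapping and the `example:record/` identifier format are all hypothetical. Merge is left out because it needs more than one record.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, str]
Service = Callable[[Record], Record]


def normalize(record: Record) -> Record:
    """Clean-up: collapse runs of whitespace and trim both ends of every value."""
    return {k: " ".join(v.split()) for k, v in record.items()}


def convert(record: Record) -> Record:
    """Format conversion: rename source field tags to target element names."""
    mapping = {"245a": "title", "100a": "author"}  # hypothetical mapping
    return {mapping.get(k, k): v for k, v in record.items()}


def add_identifier(record: Record) -> Record:
    """Add identifiers: attach an ID derived from the source record number."""
    return {**record, "id": "example:record/" + record["001"]}


def run_services(record: Record, services: List[Service]) -> Record:
    """Apply each service in turn; the list order is the processing order."""
    return reduce(lambda rec, service: service(rec), services, record)


raw = {"001": "10081", "100a": "  Cummings,  E. E. ", "245a": "Selected poems"}
result = run_services(raw, [normalize, convert, add_identifier])
print(result)
```

Because each service has the same signature, a new service can be added to the list without touching the others, which is the sense in which a toolkit like this can act as a platform.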
MST In Use
Demonstration Server @ Rochester
Perseus Digital Library @ Tufts University (dev.)
Union Catalogue @ Ministerio de Cultura, Madrid, Spain
Metadata Services Toolkit
[Diagram: XC Metadata Services Toolkit. Inputs: ILS, ILS, IR, Digital Repository. Services: MARC Normalization, DC Normalization, MARC to XC Transformation, DC to XC Transformation, XC Aggregation, XC Authority. Service categories: Clean-up, Format conversion, Merge, Add Identifiers. Output: Discovery Service. OAI-PMH is labeled on two connections]
MST decides which services process incoming records, and in which order
Creating XC Schema data from MARC
• Parse MARCXML records into linked FRBR-based records
• Holdings can be separate or embedded
• Manage uplinks
[Diagram: a MARCXML Bibliographic record yields XC Work, XC Expression and XC Manifestation records; a MARCXML Holdings record yields an XC Holdings record. Link labels: 004 “Uplink”, Manifestation Held, Expression Manifested, Work Expressed. Also shown: Other XC records, with the labels MMW, MME and MMM]
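To make the idea of parsing one MARCXML record into linked FRBR-based records concrete, here is a small Python sketch using only the standard library. It is not the XC transformation service. The sample record, the identifier format, and the choice of which MARC field feeds which entity (100 and 245 to the work, 041 to the expression, 260 to the manifestation) are simplifying assumptions. The uplink names `workExpressed` and `expressionManifested` simply echo the diagram labels "Work Expressed" and "Expression Manifested".

```python
import xml.etree.ElementTree as ET

MARC_NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, kept tiny for illustration.
MARCXML = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">10081</controlfield>
  <datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield>
  <datafield tag="100" ind1="1" ind2=" "><subfield code="a">Example, Author</subfield></datafield>
  <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Example title</subfield></datafield>
  <datafield tag="260" ind1=" " ind2=" "><subfield code="b">Example Press</subfield></datafield>
</record>
"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"m:datafield[@tag='{tag}']/m:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def frbrize(record):
    """Split one bibliographic record into linked work, expression and manifestation dicts."""
    number = record.find("m:controlfield[@tag='001']", MARC_NS).text
    work = {"id": f"work/{number}",
            "author": subfield(record, "100", "a"),
            "title": subfield(record, "245", "a")}
    expression = {"id": f"expression/{number}",
                  "language": subfield(record, "041", "a"),
                  "workExpressed": work["id"]}                  # uplink to the work
    manifestation = {"id": f"manifestation/{number}",
                     "publisher": subfield(record, "260", "b"),
                     "expressionManifested": expression["id"]}  # uplink to the expression
    return work, expression, manifestation


work, expression, manifestation = frbrize(ET.fromstring(MARCXML.strip()))
print(manifestation["expressionManifested"], "->", expression["workExpressed"])
```

Each lower-level record carries only the identifier of the record above it, so the three records can be stored, merged and re-linked independently.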
Following one MARC record through XC
Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)
[Diagram: MARC → 1. Convert → MARCXML (dirty) → 2. Normalize → MARCXML (clean) → 3. Transform → XC W, E and M records → 4. Aggregate (match and merge with other XC records) → 5. Index → WEMs]
Data is ready for search and faceted browse
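Step 5 creates one WEM for each unique Manifestation. A rough sketch of that join in Python follows, assuming records are plain dicts linked by uplink fields. The `workExpressed` and `expressionManifested` names are borrowed from the diagram labels "Work Expressed" and "Expression Manifested"; the other field names and the sample data are hypothetical, and this is not the XC indexing code.

```python
def build_wems(works, expressions, manifestations):
    """Return one flattened search document per manifestation."""
    works_by_id = {w["id"]: w for w in works}
    expressions_by_id = {e["id"]: e for e in expressions}
    wems = []
    for m in manifestations:
        e = expressions_by_id[m["expressionManifested"]]  # follow uplink to expression
        w = works_by_id[e["workExpressed"]]               # follow uplink to work
        wems.append({
            "id": m["id"],
            "title": w["title"],
            "language": e["language"],
            "format": m["format"],
        })
    return wems


works = [{"id": "w1", "title": "Example title"}]
expressions = [{"id": "e1", "workExpressed": "w1", "language": "eng"}]
manifestations = [
    {"id": "m1", "expressionManifested": "e1", "format": "print"},
    {"id": "m2", "expressionManifested": "e1", "format": "online"},
]

wems = build_wems(works, expressions, manifestations)
# A facet such as "show online materials only" becomes a simple filter.
online_only = [doc["id"] for doc in wems if doc["format"] == "online"]
print(len(wems), online_only)
```

Flattening at index time keeps the stored records separate at each FRBR level while giving the search layer one document per manifestation to filter and facet on.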
Connectivity Tools
• OAI Toolkit
– Synchronizes metadata with XC
– Cleans up MARC data
– Uses export scripts
• NCIP 2 Toolkit
– Looks up circulation status
– Places requests (renew, hold)
– Retrieves user account information
– Enables resource sharing
• Evergreen ILS, OCLC WorldCat Navigator
• SirsiDynix Symphony, PALCI’s EZBorrow
– Test bed available now!
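The OAI Toolkit's job of synchronizing metadata can be pictured as an incremental OAI-PMH harvest: a client asks a repository for records changed since a given date. The Python sketch below only builds such a request URL. `verb`, `metadataPrefix` and `from` are standard OAI-PMH request parameters, and `oai_dc` is the one metadata format every OAI-PMH repository must support; the base URL is a hypothetical placeholder, and nothing here is taken from the OAI Toolkit's own configuration.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     since: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords URL, optionally limited to records changed since a date."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since is not None:
        params["from"] = since  # OAI-PMH datestamp, for example YYYY-MM-DD
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, used only to show the shape of the request.
url = list_records_url("http://localhost:8080/oai", "oai_dc", since="2011-09-30")
print(url)
```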
NCIP 2 Toolkit: Testbed
RDA and FRBR
Helping libraries make the transition
U.S. RDA Test Coordinating Committee
Overall Recommendation:
“…the Coordinating Committee recommends that RDA should be implemented by LC, NAL, and NLM no sooner than January 2013.”
Bottom line…by January 2013…
Libraries will be able to use RDA in MARC and RDA in a non-MARC environment at the same time.
XC provides one option for doing this
U.S. RDA Test Coordinating Committee
Recommended Tasks and Action Item:
“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...” Timeframe for completion: within 18 months.
Breaking down the Recommendation
“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
What XC Provides
• prototype: XC is near production-ready
• input: MARC data (bulk)
• discovery: XC has a discovery interface
• RDA element set: Uses subsets of RDA elements and roles to date
• including relationships: Primary relationships between work, expression and item so far
XC: Facilitating the Transition
XC enables risk-free experimentation with RDA data while the library community develops a successor to MARC
XC can serve as a “bridge” between using RDA in MARC-based systems and in emerging applications
Linked Data in XC
Library of Congress statement, May 13, 2011
Transforming our Bibliographic Framework
“Experiment with Semantic Web and linked data technologies to see what benefits to the bibliographic framework they offer our community and how our current models need to be adjusted to take fuller advantage of these benefits.”
Semantic Web and Linked Data
• The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web
• Linked data is a mechanism for exposing, sharing and connecting data on the web
Semantic Web and Linked Data
• If everything has a unique identifier, then information from one website can be related to information from another via a computer program
• Everything includes people, places, things, vocabularies, metadata elements, web documents, …
Getting Started
To create Linked Data, we need:
– Software to transform legacy data
– Analysis: mapping of legacy metadata to Linked Data properties
Converting MARC to Linked Data
• What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
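As a small illustration of the first three capabilities in the list above, the Python sketch below converts a coded value to a readable one and tidies a title string. The lookup table is a tiny excerpt of the MARC Code List for Languages. The function names and the specific clean-up rules are assumptions made for the example, not a description of what the XC normalization service actually does.

```python
# Partial lookup table; the codes come from the MARC Code List for Languages.
LANGUAGE_CODES = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "jpn": "Japanese",
}


def language_name(code: str) -> str:
    """Convert a MARC language code to a readable value, tolerating case and padding."""
    cleaned = code.strip().lower()  # normalize inconsistencies in the input
    return LANGUAGE_CODES.get(cleaned, "Unknown")


def clean_title(title: str) -> str:
    """Remove surrounding whitespace and trailing ISBD punctuation from a title."""
    return title.strip().rstrip(" :/;,")


print(language_name(" ENG "), "|", clean_title("E.E. Cummings :"))
```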
Best Practices for Linked Data
– Unique identifiers for XC metadata records
– Data elements from registered schemas
– Registered vocabularies
By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.
RDF Triple
• Subject: This resource
• Predicate: has subject
• Object: Poets, American
URIs for each?
RDF Triple – Record identifiers
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: has subject
• Object: Poets, American
Identifiers for XC Schema records
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:rdvocab="http://rdvocab.info/Elements"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:rdarole="http://rdvocab.info/roles">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
    <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>
A persistent, globally unique identifier for each XC Schema record
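Because the identifier is an ordinary XML attribute, any XML parser can read it. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the record above (two of the subject elements are kept, the rest dropped for brevity) to pull out the record identifier and the LCSH subject. It is meant only to show how a client might read such a record, not how XC itself does so.

```python
import xml.etree.ElementTree as ET

NS = {
    "xc": "http://www.extensiblecatalog.info/Elements",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}

# Shortened copy of the work record shown on the slide.
RECORD = """<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>"""

root = ET.fromstring(RECORD)
entity = root.find("xc:entity", NS)
record_id = entity.get("id")

# ElementTree stores the namespaced attribute xsi:type as "{namespace-uri}type".
xsi_type = "{%s}type" % NS["xsi"]
lcsh = [s.text for s in entity.findall("xc:subject", NS)
        if s.get(xsi_type) == "dcterms:LCSH"]
print(record_id, lcsh)
```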
RDF Triple - Registered Data Elements
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
• Object: Poets, American
XC Schema Elements
• Dublin Core terms (DCMI) – all
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to enable XC system functionality
XC Schema “work” record: data elements
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:rdvocab="http://rdvocab.info/Elements"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:rdarole="http://rdvocab.info/roles">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
    <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>
Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC
RDF Triple - Registered Vocabularies
• Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
• Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
• Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
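Once all three parts of the triple have identifiers, the statement can be written down in a standard RDF syntax. The Python sketch below formats it as one line of N-Triples, a W3C serialization in which each IRI is wrapped in angle brackets and the statement ends with a period. This is an illustration only: the slides express a hope of eventually outputting XC metadata in RDF, not that XC emits N-Triples, and the helper function is hypothetical and does no escaping or validation of the IRIs.

```python
def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format three IRIs as a single N-Triples statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."


line = ntriple(
    "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081",  # record identifier
    "http://www.extensiblecatalog.info/Elements/subject",      # registered data element
    "http://id.loc.gov/authorities/sh85103735#concept",        # registered vocabulary term
)
print(line)
```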
<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" … xmlns:subjid="id.loc.gov/authorities">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    …
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
    <xc:temporal>20th century</xc:temporal>
    <xc:type>Biography</xc:type>
  </xc:entity>
…
XC Work record with embedded URI for LCSH “Poets, American”
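A consumer of such a record could rebuild the full vocabulary URI from the embedded value. The Python sketch below does this for a single subject element. Two assumptions are made and should be read as such: `subjid` is treated as a plain attribute, as it appears on the element in the slide, and the `http://id.loc.gov/authorities/` prefix (including the scheme) is taken from the URI shown on the RDF triple slide instead of from the record's namespace declaration. The fragment is a simplified, self-contained stand-in for the elided record.

```python
import xml.etree.ElementTree as ET

# Prefix assumed from the object URI shown on the RDF triple slide.
AUTHORITY_BASE = "http://id.loc.gov/authorities/"

# Simplified stand-in for one subject element of the work record.
FRAGMENT = ('<xc:subject xmlns:xc="http://www.extensiblecatalog.info/Elements" '
            'subjid="sh85103735#concept">Poets, American</xc:subject>')

element = ET.fromstring(FRAGMENT)
uri = AUTHORITY_BASE + element.get("subjid")
print(element.text, "->", uri)
```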
XC Software is “Linked Data Ready”
• Converts metadata to FRBR entities with RDA elements and roles
• Adds identifiers for “things”
• Provides a platform for service development
• Synchronizes with existing tools
– Cataloging staff client
– Institutional repository
Download XC software at
eXtensibleCatalog.org