eXtensible Catalog: Tools for the creation and use of RDA, FRBRized and linked data

David Lindahl
eXtensible Catalog Organization
University of Rochester, River Campus Libraries
Rochester, NY

LITA National Forum
September 30, 2011

Funders and Sponsors

Major Funding
• Andrew W. Mellon Foundation

Sponsors
• Consortium of Academic and Research Libraries in Illinois (CARLI)
• Kyushu University
• University of North Carolina at Charlotte
• University of Rochester

User Research

Problem:
• User research is of limited value if a library doesn’t have control over its discovery environment
• Our solution:
– Develop our own software (eXtensible Catalog)
– Offer a modular architecture (4 “toolkits”)
– Build in tons of configurability
– Use established standards and protocols
– Give it away (open source)

XC User Research Approach

• What articles, books and other resources had researchers used most recently?
– How did they know the items existed?
– How did they obtain them?
– How did they use them?
• How do they keep current in their fields?

User Research Findings

• Users want to choose between versions of a resource, see relationships between resources
– Underlying XC metadata is based on FRBR model: works, expressions, manifestations, etc.
– Use some RDA data elements in FRBR structure
– Metadata services to aggregate/group FRBR entities in the User Interface

User Research Findings

• Users have preferred material and format types, depending upon their projects
– Show online materials only
– Exclude microforms
• Users want to know why items appear on a search result list
– Show keywords in context

Acting on User Research Findings

XC: “Taking Control” of metadata

More Control over Metadata

More Options for Customizing the User Interface

XC Schema

• Dublin Core terms (all)
• RDA – subset of elements and role designators
• XC elements (newly-defined) – when necessary to contain MARC vocabularies, linking fields, etc.

[Diagram labels: DCMI, RDA, XC]

Discovery Interface: Translating User Research Findings into XC Functionality

FRBR Structure - Pyramid

Work

Expression Expression

Manifestation Manifestation Manifestation

Holdings Holdings Holdings Holdings

FRBR Structure - Hourglass

Work Work Work

Expression Expression Expression

Manifestation

Holdings Holdings Holdings
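The two shapes can be modeled with one small set of record types. The sketch below is only an illustration: the class names follow the FRBR levels on the slides, but the fields, the helper `works_for`, and all sample titles are assumptions, not the XC Schema itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    title: str


@dataclass
class Expression:
    label: str
    works: List[Work] = field(default_factory=list)  # uplinks to Work records


@dataclass
class Manifestation:
    label: str
    expressions: List[Expression] = field(default_factory=list)  # uplinks to Expressions


@dataclass
class Holdings:
    location: str
    manifestation: Manifestation  # uplink to the Manifestation that is held


def works_for(holdings: Holdings) -> List[str]:
    """Follow uplinks from a Holdings record up to the titles of its Works."""
    titles = []
    for expression in holdings.manifestation.expressions:
        for work in expression.works:
            if work.title not in titles:
                titles.append(work.title)
    return titles


# Pyramid: one Work fans out to several Expressions and Manifestations.
hamlet = Work("Hamlet")
english = Expression("English text", [hamlet])
german = Expression("German translation", [hamlet])
paperback = Manifestation("Paperback edition", [english])
translated = Manifestation("Translated edition", [german])

# Hourglass: one Manifestation (an anthology) carries several Works.
macbeth = Work("Macbeth")
anthology = Manifestation(
    "Collected tragedies",
    [Expression("English text", [hamlet]), Expression("English text", [macbeth])],
)
shelf_copy = Holdings("Main stacks", anthology)
```

Calling `works_for(shelf_copy)` walks the hourglass upward and returns both titles, while a Holdings record attached to `paperback` resolves to a single Work.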

Software Overview

Discovery, Metadata Management, and Connectivity

XC Software

OAI Toolkit
ILS Connectivity
- Synchronize data with XC

NCIP Toolkit
ILS Connectivity
- Circ. status
- Account info

MST Toolkit
Metadata Services
- Cleanup
- Format Convert

Drupal Toolkit
User Interface
- Search
- Browse

Each toolkit is eXtensible with add-on packages:
• User Interface Features
• More Metadata Services
• ILS Export Scripts
• XSLT Scripts
• ILS connectors

XC Software

[Diagram: the four toolkits connected to a Voyager ILS. Labels: Metadata, Live Circ. Data, User Interface, and two Voyager “Driver” components.]

Drupal Toolkit

Drupal Toolkit Features

• Search/Browse
• Customization and theming
• Platform for applications
– Library website
– Modules add functionality

Drupal Toolkit In Use

Cute.Catalog @ Kyushu University

Drupal Toolkit In Use

“Creating Communities” @ Denver Public Library

Metadata Services Toolkit (MST)

MST Features

• Collect metadata from repositories
• Process metadata with services:
– Normalize
– Convert
– Merge
– Add identifiers
• Platform for building new services

MST In Use

Demonstration Server @ Rochester

MST In Use

Perseus Digital Library @ Tufts University (dev.)

MST In Use

Union Catalogue @ Ministerio de Cultura, Madrid, Spain

Metadata Services Toolkit

XC Metadata Services Toolkit services:
• Clean-up: MARC Normalization, DC Normalization
• Format conversion: MARC to XC Transformation, DC to XC Transformation
• Merge: XC Aggregation
• Add Identifiers: XC Authority

Sources (via OAI-PMH): ILS, ILS, IR, Digital Repository
Output (via OAI-PMH): Discovery Service

MST decides which services to run on incoming records, and in which order
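The slide does not say how the MST picks a service order. One plausible way to think about it is that each service accepts one format and emits another, so the order falls out of chaining formats. The sketch below illustrates that idea only; the `Service` class, the format labels, and `plan` are all hypothetical and are not the MST's actual API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Service:
    name: str
    consumes: str  # hypothetical label for the format the service accepts
    produces: str  # hypothetical label for the format the service emits


# Service names come from the slide; the format labels are assumptions.
SERVICES = [
    Service("MARC Normalization", "marcxml", "marcxml-normalized"),
    Service("MARC to XC Transformation", "marcxml-normalized", "xc"),
    Service("DC Normalization", "dc", "dc-normalized"),
    Service("DC to XC Transformation", "dc-normalized", "xc"),
    Service("XC Aggregation", "xc", "xc-aggregated"),
]


def plan(record_format: str, services: List[Service] = SERVICES) -> List[str]:
    """Chain services by matching each output format to the next input format."""
    chain: List[str] = []
    current = record_format
    while True:
        nxt = next((s for s in services if s.consumes == current), None)
        if nxt is None or nxt.name in chain:
            return chain  # no further service applies (or a loop was detected)
        chain.append(nxt.name)
        current = nxt.produces
```

With these assumed labels, a MARCXML record is routed through normalization, transformation and aggregation, while a Dublin Core record takes the DC branch before joining the same aggregation step.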

Creating XC Schema data from MARC

• Parse MARCXML records into linked FRBR-based records
• Holdings can be separate or embedded
• Manage uplinks

[Diagram: a MARCXML Bibliographic record becomes XC Work, XC Expression and XC Manifestation records; a MARCXML Holdings record becomes an XC Holdings record. Uplinks: 004 “Uplink” (MARCXML Holdings), Manifestation Held, Expression Manifested, Work Expressed. Other XC records are also shown.]

Following one MARC record through XC

Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)
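Steps 4 and 5 can be sketched in a few lines. This is a simplified illustration, not XC's matching logic: the match rule (same author and title, ignoring case and trailing punctuation), the dictionary layout, and the function names are assumptions, and the Expression level is left out to keep the example short.

```python
def match_key(work):
    """Hypothetical match rule: same author and title, ignoring case and edge punctuation."""
    return (work["author"].strip().lower(), work["title"].strip(" :/.").lower())


def aggregate(works):
    """Step 4: merge Work records that share a match key, pooling their Manifestations."""
    merged = {}
    for work in works:
        target = merged.setdefault(
            match_key(work),
            {"author": work["author"], "title": work["title"].strip(" :/."), "manifestations": []},
        )
        target["manifestations"].extend(work["manifestations"])
    return list(merged.values())


def build_wems(works):
    """Step 5: emit one flattened index document per Manifestation."""
    return [
        {"author": w["author"], "title": w["title"], "manifestation": m}
        for w in works
        for m in w["manifestations"]
    ]


# Two transformed records for the same Work, with invented Manifestation labels.
records = [
    {"author": "Sawyer-Lauçanno, Christopher, 1951-", "title": "E.E. Cummings :",
     "manifestations": ["hardcover"]},
    {"author": "Sawyer-Lauçanno, Christopher, 1951-", "title": "E.E. Cummings",
     "manifestations": ["paperback"]},
]
merged_works = aggregate(records)
wems = build_wems(merged_works)
```

Here the two input records collapse into one Work, and indexing still yields two documents, one per Manifestation.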

[Diagram: 1. Convert (MARC to dirty MARCXML), 2. Normalize (to clean MARCXML), 3. Transform (to XC W, E and M records), 4. Aggregate (match and merge XC records), 5. Index (WEMs).]

Data is ready for search and faceted browse

Connectivity Tools

• OAI Toolkit
– Synchronizes metadata with XC
– Cleans up MARC data
– Uses export scripts

• NCIP 2 Toolkit
– Looks up circulation status
– Places requests (renew, hold)
– Retrieves user account information
– Enables resource sharing
   • Evergreen ILS, OCLC WorldCat Navigator
   • SirsiDynix Symphony, PALCI’s EZBorrow
– Test bed available now!

NCIP 2 Toolkit: Testbed

RDA and FRBR: Helping libraries make the transition

U.S. RDA Test Coordinating Committee

Overall Recommendation:

“…the Coordinating Committee recommends that RDA should be implemented by LC, NAL, and NLM no sooner than January 2013.”

Bottom line…by January 2013…

Libraries will be able to use RDA in MARC and RDA in a non-MARC environment at the same time.

XC provides one option for doing this

Recommended Tasks and Action Item:

“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...” Timeframe for completion: within 18 months.

U.S. RDA Test Coordinating Committee

Breaking down the Recommendation

“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”

What XC Provides
• prototype: XC is near production-ready
• input: MARC data (bulk)
• discovery: XC has a discovery interface
• RDA element set: Uses subsets of RDA elements and roles to date
• including relationships: Primary relationships between work, expression and item so far

XC: Facilitating the Transition

XC enables risk-free experimentation with RDA data while the library community develops a successor to MARC

XC can serve as a “bridge” between using RDA in MARC-based systems and in emerging applications

Linked Data in XC

Library of Congress statement, May 13, 2011

Transforming our Bibliographic Framework

“Experiment with Semantic Web and linked data technologies to see what benefits to the bibliographic framework they offer our community and how our current models need to be adjusted to take fuller advantage of these benefits.”

Semantic Web and Linked Data

• The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web

• Linked data is a mechanism for exposing, sharing and connecting data on the web

Semantic Web and Linked Data

• If everything has a unique identifier, then information from one website can be related to information from another via a computer program

• Everything includes people, places, things, vocabularies, metadata elements, web documents, …

Getting Started

To create Linked Data, we need:
– Software to transform legacy data
– Analysis: mapping of legacy metadata to Linked Data properties

Converting MARC to Linked Data

• What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
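As a small illustration of the first bullet, the sketch below turns a coded MARC value into a readable one. It relies on the MARC 21 convention that positions 35-37 of the bibliographic 008 field hold a three-letter language code; the lookup table is a tiny hand-picked sample, and the function name and the choice of English labels as "vocabulary values" are assumptions rather than XC's actual mapping.

```python
# A few MARC language codes and readable labels (sample only, not a full list).
MARC_LANGUAGES = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "jpn": "Japanese",
}


def language_from_008(field_008: str) -> str:
    """Read positions 35-37 of a bibliographic 008 field and map the code to a label."""
    code = field_008[35:38]
    # Unknown codes are passed through unchanged rather than guessed at.
    return MARC_LANGUAGES.get(code, code)


# A placeholder 008 value: only the language positions are filled in.
sample_008 = " " * 35 + "eng" + " d"
label = language_from_008(sample_008)
```

Passing unknown codes through untouched keeps the conversion lossless, which matters when the source data is inconsistent.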

Best Practices for Linked Data

- Unique identifiers for XC metadata records
- Data elements from registered schemas
- Registered vocabularies

By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.

RDF Triple

Subject: This resource
Predicate: has subject
Object: Poets, American

URIs for each?

RDF Triple – Record identifiers

Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: has subject
Object: Poets, American

Identifiers for XC Schema records

<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:rdvocab="http://rdvocab.info/Elements"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:rdarole="http://rdvocab.info/roles">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
    <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>

A persistent, globally unique identifier for each XC Schema record

RDF Triple - Registered Data Elements

Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: Poets, American
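To make the mapping from record to triple concrete, the sketch below reads a trimmed XC “work” record with Python's standard `xml.etree.ElementTree` and emits (subject, predicate, object) tuples. The record id becomes the subject and each element's namespace plus local name becomes the predicate, joined with "/" as in the slide's example URI. The function name and the joining rule for namespaces other than XC's are assumptions, not XC's own RDF output.

```python
import xml.etree.ElementTree as ET

XC_NS = {"xc": "http://www.extensiblecatalog.info/Elements"}

# Trimmed copy of the slide's XC "work" record (XML declaration omitted).
XC_RECORD = """\
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>
"""


def record_triples(xml_text):
    """Return one (subject, predicate, object) tuple per data element in each entity."""
    root = ET.fromstring(xml_text)
    triples = []
    for entity in root.findall("xc:entity", XC_NS):
        subject = entity.get("id")  # the record's globally unique identifier
        for element in entity:
            # ElementTree tags look like "{namespace-uri}localname".
            namespace, local_name = element.tag[1:].split("}")
            predicate = namespace.rstrip("/") + "/" + local_name
            triples.append((subject, predicate, (element.text or "").strip()))
    return triples


triples = record_triples(XC_RECORD)
```

At this stage the object is still a text string; replacing it with a URI depends on registered vocabularies.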

XC Schema Elements

Dublin Core terms (DCMI) - all

RDA – subset of elements and role designators

XC elements (newly-defined) – when necessary to enable XC system functionality

[Diagram labels: DCMI, RDA, XC]

XC Schema “work” record: data elements

<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:rdvocab="http://rdvocab.info/Elements"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:rdarole="http://rdvocab.info/roles">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
    <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
    <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
    <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
  </xc:entity>
</xc:frbr>

Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC

RDF Triple - Registered Vocabularies

Subject: oai:mst.rochester.edu:MST/MARCToXCTransformation/10081 (This resource)
Predicate: http://www.extensiblecatalog.info/Elements/subject (has subject)
Object: http://id.loc.gov/authorities/sh85103735#concept (Poets, American)
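Once all three parts have identifiers, the triple can be written out in a standard RDF syntax. The sketch below uses N-Triples, which is a choice made for this example and not something the slides specify; the one-entry lookup table and the function name are likewise hypothetical. Headings without a known URI fall back to a plain literal.

```python
# Hypothetical lookup from an LCSH heading to its id.loc.gov URI (one entry, from the slide).
LCSH_URIS = {
    "Poets, American": "http://id.loc.gov/authorities/sh85103735#concept",
}


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format one triple as an N-Triples line, using a URI object when one is known."""
    if obj in LCSH_URIS:
        object_term = "<" + LCSH_URIS[obj] + ">"
    else:
        # Escape backslashes and double quotes so the literal stays valid.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        object_term = '"' + escaped + '"'
    return "<" + subject + "> <" + predicate + "> " + object_term + " ."


line = to_ntriple(
    "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081",
    "http://www.extensiblecatalog.info/Elements/subject",
    "Poets, American",
)
```

The fallback to a literal means records can be published before every heading has been matched to a registered vocabulary.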

<?xml version="1.0" encoding="UTF-8"?>
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" … xmlns:subjid="id.loc.gov/authorities">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    …
    <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
    <xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
    <xc:temporal>20th century</xc:temporal>
    <xc:type>Biography</xc:type>
  </xc:entity>
…

XC Work record with embedded URI for LCSH “Poets, American”

XC Software is “Linked Data Ready”

• Converts metadata to FRBR entities with RDA elements and roles

• Adds identifiers for “things”
• Provides a platform for service development
• Synchronizes with existing tools
– Cataloging staff client
– Institutional repository

Download XC software at eXtensibleCatalog.org