The Future of Cataloging and Catalogers
TRANSCRIPT
Part 1: Whither Cataloging?
Libraries are no longer the first place people come for information
The Internet has changed the way people (including us) behave when seeking information
Our former “granularity consensus” is coming apart
To compete effectively for user attention, we must:
Join the larger world of information, where our users are
Learn how the competition attracts users, draws them in, and takes good advantage of their interest in participating
Find a better balance between protecting privacy and capturing usage behavior
12/5/08 2NELINET Seminar
And Why Must We Do This?
The comfortable certainties we know are coming undone, whether we’re ready or not
We have much experience and insight to offer the larger information world (but not everything we’ve learned is relevant)
We are collectively about the size of the Queen Mary, unable to turn on a dime—this change will take time
Resistance is futile—we are not in charge of this new world, and our options are two: adapt or retire
The Map of Change
Charting Our Course
What We Must Leave Behind
A view of metadata based on catalog cards
Library software that can’t sort search results better than “random” or “alphabetic”
Search interfaces even librarians hate (and we know the data)
Clunky static HTML pages that don’t attract our users’ interest, or guide them well
One silo for books, others for journal articles, images, digitized books, etc. (explain that to a user!)
Starting to Move Forward
A Starting Point: The Working Group on the Future of Bibliographic Control (Library of Congress)
“On the Record”: final report, January 2008
http://www.loc.gov/bibliographic-future/
A good, comprehensive overview of our new world and what we need to do
Recommendations for LC, OCLC, ALA, library educators and all of us
Extensively discussed at the Library of Congress and within the profession at large
“The Web is our platform”
1.2.4.2 All: Explore tools and techniques for sharing bibliographic data at the network level using both centralized and non-centralized techniques (e.g., OAI-PMH).
3.1.2.1 All: Express library standards in machine-readable and machine-actionable formats, in particular those developed for use on the Web.
3.1.2.2 All: Provide access to standards through registries or Web sites so that the standards can be used by any and all Web applications.
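To make recommendation 1.2.4.2 a little more concrete: an OAI-PMH harvest is an ordinary HTTP GET with a few query parameters. The sketch below only builds the request URL and never touches the network. The repository address and the helper name are made up for illustration; the verb ListRecords and the metadata prefix oai_dc come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict the harvest to one set
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL, from_date="2008-01-01")
print(url)
```

A harvester would fetch this URL, parse the XML response, and keep following any resumption token the server returns until the list is complete.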
A New Look at Library Systems
4.1.1.1 All: Encourage and support development of systems capable of relating evaluative data, such as reviews and ratings, to bibliographic records.
4.1.1.2 All: Encourage the enhancement of library systems to provide the capability to link to appropriate user-added data available via the Internet (e.g., Amazon.com, LibraryThing, Wikipedia). At the same time, explore opportunities for developing mutually beneficial partnerships with commercial entities that would stand to benefit from these arrangements.
Enriching Library Data
4.1.2.1 All: Develop library systems that can accept user input and other non-library data without interfering with the integrity of library-created data.
4.1.2.2 All: Investigate methods of categorizing creators of added data in order to enable informed use of user-contributed data without violating the privacy obligations of libraries.
4.1.2.3 All: Develop methods to guide user tagging through techniques that suggest entry vocabulary (e.g., term completion, tag clouds).
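One way to picture the “term completion” idea in 4.1.2.3 is a prefix lookup against an entry vocabulary, offered to the user while they type a tag. This is a minimal sketch, not a description of any existing library system; the vocabulary terms and the function name are invented for the example.

```python
from bisect import bisect_left

# Hypothetical entry vocabulary; a real system might load subject headings
# or tags that other users have already applied.
VOCABULARY = sorted([
    "latvia",
    "latvian fiction",
    "latvian language",
    "law",
    "translations",
])


def suggest(prefix, vocabulary=VOCABULARY, limit=5):
    """Return up to `limit` vocabulary terms that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(vocabulary, prefix)   # first term >= prefix in sorted order
    matches = []
    for term in vocabulary[start:]:
        if not term.startswith(prefix):
            break                             # sorted list: no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


print(suggest("latv"))
```

The user is still free to type something that is not in the list; the suggestions only nudge tagging toward shared vocabulary.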
Exploring Our New World
Avoiding the Traps of Wrongovia
Taking a Look Around
What’s this Semantic Web thingy all about, and why do we care?
Is RDA really going to happen? Is it that different from AACR2? Why can’t we use RDA with MARC?
How will RDA implementation affect cataloging?
How can we best prepare for all this?
Acronymia, We Are Here
RDA: Resource Description and Access
FRBR: Functional Requirements for Bibliographic Records
FRBRoo: Object Oriented FRBR (harmonized with CIDOC CRM)
FRAD: Functional Requirements for Authority Data
FRASAR: Functional Requirements for Subject Authority Records
Standards Upgrade!
Type of Standard | Old Standard | New Standard(s)?
Bibliographic Model | None | FRBR, FRBRoo
Metadata Content | AACR2 | RDA
Metadata Structure | MARC21 Bibliographic | RDAVocab
Name Authority | MARC21 Authority | FRAD
Subject Authority | MARC21 Authority | FRASAR, SKOS
Encoding | MARC21 | XML, XML/RDF
The RDA You’ve Heard About …
4th quarter calendar 2008 – Full draft of RDA available for constituency review (ending in early February 2009) http://www.collectionscanada.ca/jsc/rdafulldraft.html
2nd quarter calendar 2009 – RDA content is finalized
3rd quarter calendar 2009 – RDA is released
3rd and 4th quarters calendar 2009, possibly into 1st quarter calendar 2010 – Testing by national libraries
1st and 2nd quarters calendar 2010 – Analysis and evaluation of testing by national libraries
3rd-4th quarters calendar 2010 – RDA implementation ?
What You Might Not Have Heard
JSC has gradually backed away from their original stance that RDA could be expressed easily in MARC21
Full integration of FRBR entities into RDA has made that problematic
RDA has been developed explicitly to take advantage of the Semantic Web (although there are still residues of past practice)
Well-supported rumors indicate that LC is considering discontinuing updates to MARC21 sometime in 2010
Under the RDA Hood
A FRBR-based approach to structuring bibliographic data
More explicitly machine-friendly linkages (preferably with URIs)
More emphasis on relationships and roles
Less reliance on cataloger-created notes and text strings (particularly for identification)
Less reliance on transcription
JSC Scenarios
Scenario 1: separate records for all FRBR entities with linked identifiers
Scenario 2: composite bibliographic records (with authority records representing each entity)
Scenario 3: one flat record, with all Group 1 entities on a single record
This is the only scenario that MARC can handle
Not really a viable option, and as far as I know, no one is explicitly planning for it
The Rest of the Story
RDA elements, roles and vocabularies have been provisionally registered
The vocabularies and the text will be tied together in the RDA online tool
Some efforts have begun to consider how MARC21 data can be parsed into FRBR entities and RDA
eXtensible Catalog Project moving strongly in this direction
Unfortunately, we don’t know what OCLC is planning
Discussions about long term maintenance of both RDA and the vocabularies have yet to occur
The push is already on for a multi-language RDA Vocabulary
RDA & FRBR: Registered!
RDA Elements: http://metadataregistry.org/schema/show/id/1.html
RDA Roles: http://metadataregistry.org/schema/show/id/4.html
RDA Vocabulary: Base Material http://metadataregistry.org/vocabulary/show/id/35.html
FRBR Relationships (Sandbox version) http://sandbox.metadataregistry.org/vocabulary/show/id/90.html
Who’s Doing This?
DCMI/RDA Task Group
See: http://dublincore.org/dcmirdataskgroup/
Set up during the London meeting between JSC and DCMI
Gordon Dunsire and Diane Hillmann, co-chairs
Karen Coyle & Alistair Miles, consultants
IFLA Classification and Indexing Section
Gordon Dunsire, Centre for Digital Library Research, University of Strathclyde, will be registering FRBR entities and relationships
Possible inclusion of ISBDs, FRAD, etc., in future
How Soon Will All This Happen?
The bad news: This isn’t like 1981, when there was a “start date” and we knew exactly when to change gears
More bad news: This transition is likely to be a pretty messy one, and last longer than we would like
One unknown is OCLC’s role—at present they seem to be focused on consolidating control over library data and promoting WorldCat
The good news: library vendors are starting to wake up and smell the coffee!
What Are the Challenges?
Coordination with JSC (or its successor, given the need to move beyond “Anglo-American”) on long-term maintenance planning
Need for lightweight process, where change is not a multi-year marathon
Continuing development towards a more Semantic Web-friendly RDA (less reliance on transcription, for instance)
Tool development (at all levels, including ILS vendors)
Yet More Challenges
Application profiles that express more than one notion of “Work” and more than one point of view
JSC still seeing the process through the lens of a text cataloger
Their “core elements” only make sense for traditional books, serials, and other text-based objects
Moving the MARC legacy data into RDA
OCLC’s silence is worrisome, makes planning difficult
Multi-lingual and specialized extensions
Non-Anglo-American communities eager to participate
Multi-lingual RDA
The DCMI Registry approach:
Translations of labels, definitions and comments
URIs stay the same, as do relationships
Responsibility for updating translations rests with translation “owner”
Disadvantages
Translations tend to become outdated over time without sophisticated notification services to flag new areas needing attention
Communication with translation “owners” is managed loosely by a committee; support needs still unknown
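The registry approach above can be pictured as one identifier carrying several language-specific labels. The sketch below is only an illustration of that idea, not how the registry stores data: the element URI, the version numbers, and the German label are all invented. It also shows the kind of simple check a notification service might run to flag translations that have fallen behind the source text.

```python
# Hypothetical element: one stable URI, labels per language.
# "version" records which revision of the English source a label was made from.
ELEMENT = {
    "uri": "http://example.org/rda/elements/placeOfProduction",
    "labels": {
        "en": {"text": "Place of production", "version": 2},
        "de": {"text": "Herstellungsort", "version": 1},
    },
}


def label_for(element, lang, fallback="en"):
    """Return the label in `lang`, falling back to the source language."""
    entry = element["labels"].get(lang) or element["labels"][fallback]
    return entry["text"]


def stale_translations(element, source_lang="en"):
    """List languages whose label was made from an older source revision."""
    current = element["labels"][source_lang]["version"]
    return sorted(
        lang
        for lang, entry in element["labels"].items()
        if entry["version"] < current
    )


print(label_for(ELEMENT, "de"), stale_translations(ELEMENT))
```

Because the URI never changes, data that points at the element stays valid no matter which label a given user interface chooses to display.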
Part 2: Whither Catalogers?
What Happens When The Revolution Comes?
Focus on Catalogers
What do we anticipate will be different about our changed working environment?
How will workflow change?
How will the data look?
What will the library vendor systems do with it?
How will we integrate user data? What kinds of user data?
What do we need to know to operate in this new environment?
Approaching Change
Catalogers will need to separate what they know about information based on their current systems
Much of the knowledge is portable, but needs updating
The new environment is not as well organized (yet), so much learning will need to be self-directed
Catalogers’ role may become closer to that of Metadata Librarian
Managing data at a more abstract level (not yet a stable structure to fit data into)
Understanding of the goals of changes anticipated and new requirements will be essential
Walking through a concrete example …
From the Cataloger Scenarios
A Cataloger Scenario
Jane Cataloger is assigned to work on a gift collection. Her first selection is a Latvian translation of Kurt Vonnegut's "Bluebeard: a novel." She searches the library database for the original work, and finds:
*Author: Kurt Vonnegut
*Title of the Work: Bluebeard: a novel
*Form of Work: Novel
*Original Language: English
with links to the following expression information:
*Language of Expression: English
*Content Type: Text
and one manifestation:
*Edition: 1st trade edition
*Place of Production: New York
*Publisher’s Name: Delacorte Press
*Date of Production: 1987
*Number of Units: 300 pages
*Resource Identifier: [ISBN]0385295901
Jane begins her description by linking to the existing Work entity. She then creates an expression description:
*Language of Expression: Latvian
*Translator: Arvida Grigulis
She creates an authority record for the translator since none yet existed. She continues by creating a fuller description for the new manifestation, linking to the authority record for the Latvian publisher (what luck, it already existed!).
*Title: [in Latvian]
*Place of Production: Riga
*Publisher’s Name: Liesma
*Date of Production: 1997
A Cataloger Scenario: Updated
Jane Cataloger is assigned to work on a gift collection. Her first selection is a Latvian translation of Kurt Vonnegut's "Bluebeard: a novel." She searches the library database for the original work, and finds:
*Author: http://lcnaf.info/79062641
*Title of the Work: Bluebeard: a novel
*Form of Work: http://RDVocab.info/genre/1008
*Original Language: http://marclang.info/eng
with links to the following expression information:
*Language of Expression: http://marclang.info/eng
*Content Type: http://purl.org/dc/dcmitype/Text
and one manifestation:
*Edition: 1st trade edition
*Place of Production: http://www.getty.edu/tgn/7007567
*Publisher’s Name: http://onixpub.info/2039987
*Date of Production: 1987
*Number of Units: 300 pages
*Resource Identifier: urn:ISBN:0385295901
Jane begins her description by linking to the existing Work entity. She then creates an expression description:
*Language of Expression: http://marclang.info/lav
*Translator: http://lcnaf.info/88007685
She creates an authority record for the translator since none yet existed. She continues by creating a fuller description for the new manifestation, linking to the authority record for the Latvian publisher (what luck, it already existed!).
*Title: [in Latvian]
*Place of Production: http://www.getty.edu/tgn/7006484
*Publisher’s Name: http://onixpub.info/6770094
*Date of Production: 1997
A Dublin Core View of the World
DCMI Abstract Model: http://dublincore.org/documents/abstract-model/
Anatomy of a Statement
Place of Production: New York
[Diagram labels: Property, Value, Value String]
Anatomy of a Statement
Place of Production: http://www.getty.edu/tgn/7007567
[Diagram labels: Property, Value, Related Description]
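The two statement slides differ only in what sits on the value side: a plain string, or an identifier that leads to a further description. A minimal sketch of that difference follows, assuming a small home-made Statement class. The class and its field names are invented for this example and are not part of the DCMI Abstract Model; the property label and both values come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One property/value pair; the value is either a string or a URI."""
    prop: str
    value_string: Optional[str] = None   # e.g. "New York"
    value_uri: Optional[str] = None      # e.g. a thesaurus entry for the place

    def is_linkable(self) -> bool:
        # A URI can be followed to a related description; a bare string cannot.
        return self.value_uri is not None


as_text = Statement("Place of Production", value_string="New York")
as_link = Statement("Place of Production", value_uri="http://www.getty.edu/tgn/7007567")

print(as_text.is_linkable(), as_link.is_linkable())
```

Software can do little with the string except display or match it, while the URI lets it fetch whatever else is known about the place.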
A Related Description
Description Sets: a Key Concept!
Description Set = “A set of one or more descriptions, each of which describes a single resource.”*
*DCAM Definition
A Description Set “Package”
Work
Manifestation
Expression
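A description set “package” for the Bluebeard example might look roughly like the sketch below: three descriptions, each about one resource, tied together by identifiers. The local ids and the linking keys ("realizes", "embodies") are invented for illustration; the URIs are the sample values used in the cataloger scenario.

```python
# Three descriptions, one per resource, packaged as a single description set.
description_set = [
    {"id": "work-1", "type": "Work",
     "title": "Bluebeard: a novel", "author": "http://lcnaf.info/79062641"},
    {"id": "expr-1", "type": "Expression", "realizes": "work-1",
     "language": "http://marclang.info/eng"},
    {"id": "man-1", "type": "Manifestation", "embodies": "expr-1",
     "date": "1987", "identifier": "urn:ISBN:0385295901"},
]


def chain(descriptions, start_id):
    """Follow the links upward from one description and list the types visited."""
    by_id = {d["id"]: d for d in descriptions}
    current = by_id[start_id]
    types = [current["type"]]
    while True:
        parent_id = current.get("embodies") or current.get("realizes")
        if parent_id is None:
            break                      # reached the Work: nothing further up
        current = by_id[parent_id]
        types.append(current["type"])
    return types


print(chain(description_set, "man-1"))
```

Adding the Latvian translation would mean adding a new Expression and Manifestation to the package while linking to the existing Work description rather than copying it.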
New Tools, New Knowledge
Getting There From Here
What’s This Semantic Web?
RDF: Resource Description Framework
Statements about Web resources in the form of subject-predicate-object expressions, called triples
E.g. “This presentation” – “has creator” – “Diane Hillmann”
RDF Schema
Vocabulary description language of RDF
SKOS: Simple Knowledge Organization System
Expresses the basic structure and content of concept schemes such as thesauri and other types of controlled vocabularies
An RDF application
OWL (Web Ontology Language)
Explicitly represents the meaning of terms in vocabularies and the relationships between them
Semantic Web Building Blocks
Each component of an RDF statement (triple) is a “resource”
RDF is about making machine-processable statements, requiring
A machine-processable language for representing RDF statements
Extensible Markup Language (XML)
A system of machine-processable identifiers for resources (subjects, predicates, objects)
Uniform Resource Identifier (URI)
For full machine-processing potential, an RDF statement is a set of three URIs
Things Requiring Identification
Subject “This presentation”
e.g. its electronic location (URL)
Predicate “has creator”
e.g. http://purl.org/dc/terms/creator
Object “Diane Hillmann”
e.g. URI of entry in Library of Congress Name Authority File (real soon now?)
NAF: nr2001015786
Declaring vocabularies/values in SKOS and OWL provides URIs, which are essential for the Semantic Web
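The “This presentation / has creator / Diane Hillmann” statement can be written out as a triple in a few lines. In this sketch the subject URL and the person URI are made-up placeholders (the name authority file had no public URIs to borrow at the time of these slides); only the predicate is a real URI, taken from the slide. The helper name is ours, and the output follows the N-Triples line format.

```python
# Hypothetical identifiers, for illustration only.
SUBJECT = "http://example.org/slides/future-of-cataloging"
PERSON = "http://example.org/names/nr2001015786"
CREATOR = "http://purl.org/dc/terms/creator"   # predicate URI from the slide


def to_ntriples(subject, predicate, obj):
    """Serialize one triple as an N-Triples line; non-URI objects become literals."""
    if obj.startswith(("http://", "https://", "urn:")):
        rendered = f"<{obj}>"
    else:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {rendered} ."


with_string = to_ntriples(SUBJECT, CREATOR, "Diane Hillmann")
with_uri = to_ntriples(SUBJECT, CREATOR, PERSON)
print(with_string)
print(with_uri)
```

The second line is the “set of three URIs” form: every part of the statement can be looked up and linked, where the first line ends in a string that a machine can only compare character by character.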
What Happened to XML?
Nothing: XML (eXtensible Markup Language) is most likely how library systems will evolve after MARC
It makes sense to use XML to exchange data between libraries, and some external services
But RDF is gaining ground, and libraries will need to be able to accommodate it, and understand it
An XML record is essentially an aggregation of property = value statements about the same resource
RDF triples can also be aggregated using XML, but this isn’t necessarily the best way to realize the potential of RDF
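To see the “aggregation of property = value statements” point, here is a small sketch that wraps a few statements about one resource in an XML record and then reads them back. The element names (record, title, language, date) are invented for the example and do not follow any particular library schema.

```python
import xml.etree.ElementTree as ET

# Property/value statements about a single resource.
statements = [
    ("title", "Bluebeard: a novel"),
    ("language", "eng"),
    ("date", "1987"),
]

# Aggregate them: one child element per statement, all under one record.
record = ET.Element("record")
for name, value in statements:
    ET.SubElement(record, name).text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# Reading the record back recovers the same property/value pairs.
recovered = [(child.tag, child.text) for child in ET.fromstring(xml_text)]
```

Notice that the resource being described is only implied by the enclosing record, whereas an RDF triple names its subject explicitly in every statement.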
User Participation
Bringing Users (and Usage) Into the Circle
User Data “R” Us
Sources of ‘active’ user data
Tagging, etc.
Review and rating systems
Courseware systems
Sources of ‘passive’ user data
Logs of user activity
Circulation or download data
“Making data work harder …” – Lorcan Dempsey
Collaborative filtering
Data mining
Active User Data
User tagging and description
Ex.: The LC Flickr Project
Ex.: LibraryThing
Review and rating systems
Ex.: Penn Tags
Ex.: Amazon
Courseware Systems
Making connections so that courseware can reuse catalog information; catalogs can know what has been used in courses, when, and who assigned it
LC-Flickr Project
Library of Congress and Flickr: “In a very elegant way, Flickr solves the authority conundrum of exposing collections content to social process. No need to worry if some comments or tags are misleading, arbitrary or incorrect - it’s not happening on your site, but in a space where people know and expect a wide variety of contributions. On the other hand, LC selectively reaps the benefit of these contributions.”
(http://hangingtogether.org/?p=401)
Passive User Data
Logs of user activity
Usually locally maintained and analyzed
Services like Google Analytics can provide important aggregate information
Circulation or download data
Tricky in library settings, where user privacy is an important value
Anonymized data can be stored and used for relevance ranking
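One simple reading of “anonymized data … used for relevance ranking” is to throw away who borrowed what and keep only how often each item circulated. The sketch below works under that assumption; the events, item names, and function names are invented. Bare counts are not a complete privacy safeguard (very small counts can still point to individuals), so treat this as an illustration of the idea only.

```python
from collections import Counter

# Hypothetical raw circulation events: (patron_id, item_id).
events = [
    ("p1", "bluebeard"),
    ("p2", "bluebeard"),
    ("p1", "cats-cradle"),
    ("p3", "bluebeard"),
]


def anonymize(raw_events):
    """Keep only per-item totals; patron identifiers are discarded."""
    return Counter(item for _patron, item in raw_events)


def rank(results, counts):
    """Order search results by circulation count, most borrowed first."""
    return sorted(results, key=lambda item: (-counts[item], item))


counts = anonymize(events)
ranked = rank(["cats-cradle", "galapagos", "bluebeard"], counts)
print(ranked)
```

Only the counts would need to be stored long term; the raw events can be deleted once they have been tallied.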
Hard Working Data
Collaborative filtering
Wikipedia: “ … the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc.”
Ex.: Amazon (people who bought “X” also bought “Y”)
Data mining
Wikipedia: “ … statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making.”
Ex.: LibraryThing Zeitgeist
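The “people who bought X also bought Y” pattern can be approximated by counting which items turn up together. This is a toy co-occurrence sketch with invented baskets; it is not a description of how Amazon or LibraryThing actually compute recommendations.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical baskets: the set of items one anonymous user bought or borrowed.
baskets = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X", "Y"},
    {"Z"},
]


def co_occurrence(all_baskets):
    """For every item, count how often each other item appears alongside it."""
    also = defaultdict(Counter)
    for basket in all_baskets:
        for a, b in permutations(basket, 2):
            also[a][b] += 1
    return also


def also_bought(item, also, n=2):
    """Return the n items most often seen together with `item`."""
    ranked = sorted(also[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _count in ranked[:n]]


also = co_occurrence(baskets)
print(also_bought("X", also))
```

A library could feed the same counting routine with loans or downloads instead of purchases, provided the baskets are stripped of anything that identifies the user.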
User Data Issues
Privacy
Being able to use information about a contributing user without violating personal privacy
Complicated by differences in generational ideas about what privacy is
Authority (who said?)
Librarians have traditionally valued “objectivity,” but there’s no evidence that users see this as a value
Management
Keeping spammers out
Filtering language and malicious intent
Sharing User Contributions
Note how LibraryThing pulls Amazon descriptions
Amazon has an API that allows other services to use its data
Positioning Amazon data in other sites drives users back to Amazon
As libraries move more of their unique data to the Web, they need to be aware of the marketing value of sharing data and allowing other services to combine it in new ways
To do this, libraries will need to be able to package the data in ways that let others capture it
Ex.: XC Project is planning to share Courseware information
Preparing Ourselves
Figuring Out What We Need To Know
Learning Strategies
Group Learning
Seminars (like this one!)
Conference presentations
Local study groups
Self-directed learning
Tutorials
Blogs
Keeping up with the discussion: You need a plan!
Self-directed Learning
Web tutorials:
http://www.w3schools.com/
Blogs
Get a Bloglines account (free)
Start with a few, and expand:
Lorcan Dempsey (http://orweblog.oclc.org/)
Karen Coyle (http://kcoyle.blogspot.com/)
The FRBR Blog (http://www.frbr.org/)
Catalogablog (http://catalogablog.blogspot.com/)
Cataloging Futures (http://www.catalogingfutures.com/)
Metadata Matters (http://managemetadata.org/blog/)
Mailing lists
Evaluate your current reading habits
Are you spending too much time on lists that focus on MARC and AACR2 problem solving?
Do you hear too much whining about change?
Migrate to some of the lists discussing newer ideas
[email protected]@lists.monarchos.com [email protected]@JISCMAIL.AC.UK
Ask questions! Network!
Thanks & Acknowledgements
Thanks for your attention!
Slides and ideas from Karen Coyle, Gordon Dunsire, and too many others to count!
Contact for Diane: Email: [email protected] Website: http://managemetadata.com/