reenvisioning e-resource holdings management

Envisioning E-Resource Holdings Management. Marlene van Ballegooie, University of Toronto Libraries. NASIG 2015.


Envisioning E-Resource Holdings Management

Marlene van Ballegooie

University of Toronto Libraries

NASIG 2015

Outline

• Flashback to the dawn of NASIG – What were we thinking about e-resource holdings management then?
• Current state of ERM
• OCLC’s automated holdings management services
• The study results
• Benefits/challenges of the service
• A look to the future of e-resource holdings management

Predicting the Future of E-Resource Holdings Management

The Hits…

“In ten years, the library that we know today will be augmented by virtual libraries... Resources that seem to be locally available will actually be held at remote locations…A library’s holdings will be defined by access, not by possession.”

Lucy Seifert Wegner, “The Research Library and Emerging Information Technology.” (1992)

“Staff will need to change from pointers and retrievers to organizers and facilitators. They must accept that the library must change from a fortress to a pipeline and realize that the collections must be dealt with “en masse” rather than one at a time.”

Kenneth E. Dowlin, “The Neographic Library: A 30-Year Perspective on Public Libraries.” (1993)

“As in-house technical processing recedes into the afterglow of shared-cataloging nirvana, catalogers and other technical processing staff will move toward being managers – rather than producers – of online records.”

Richard D. Hacken, “Tomorrow’s research library: vigor or rigor mortis?” (1988)

“Providing cataloging descriptions for ‘moving targets’ will soon become a familiar problem.”

Karen L. Horny, “New Turns for a New Century: Library Services in the Information Age.” (1987)

“Cataloging may not take place entirely within libraries. Publishers of electronic manuscripts may have their own staffs provide standardized bibliographic records with a variety of subject access points.”

And the Misses…

“Few of these new kinds of journals will come from existing journal publishers, at least not if the new journals would compete with existing products.”

“Librarians’ favorite media after print will continue to be microform…”

Brett Butler, “Scholarly Journals, Electronic Publishing, and Library Networks: From 1986 to 2000.” (1986)

“Primary research – journals articles, proceedings, reports, and other published literature – that is the province of today’s research library does not have a good channel for distribution of electronic information.”

“It would be a mistake, however, to believe that electronic journals are going to replace present printed journals, anymore than television replaced motion pictures … While a few new electronic journals have appeared, they are being created at the very margins of scholarship.”

Harold Billings, “Romancing the information flow: solving the information crisis.” (1991)

“If one assumed that the number of electronic journals would grow to 100 by 1995 and 1,000 by the year 2000, they will still account for only a small proportion of the estimated 7,000 to 15,000 scholarly journals in existence. This is not something … that is going to inundate us anytime soon.”

Martin J. Dillon cited by Kim McDonald. “Despite benefits, electronic journals will not replace print, experts say.” (1991)

Fast Forward to 2015

Proliferation of E-Content in Libraries

• University of Toronto Libraries
– $29 million acquisition budget
– $17.5 million devoted to electronic resources (60% of total acquisition budget)
– Ongoing electronic subscriptions (serials, databases, etc.): $15 million (86% of e-resource budget)

• Libraries are making substantial investments in electronic resources

• Several players in providing access to e-resources
– Libraries
– Content providers
– Knowledgebase vendors / Link resolver vendors
– Subscription agents

• More interdependencies than ever…all based on…

A Changing Environment

E-Resource Data Supply Chain

[Diagram: Content Provider, Knowledgebase Provider, Library]

• Content provider supplies knowledgebase provider with metadata for all electronic content available for purchase
• Library purchases electronic resources. Content provider supplies library with title list of purchased materials. (hopefully!)
• Library activates purchased content in KB to make content available for discovery

Current ERM Shortcomings

Manual Processing
• Holdings maintenance is a time consuming and manual process
• Constant ‘tweaking’ of metadata in ERM
– Serial coverage dates
– Individual title purchases
– Non-standard packages

Problematic Metadata
Metadata supplied by content providers is often incomplete or erroneous
• Title changes
• Title transfers
• Ceased titles

Time Lags
• Getting content provider metadata into knowledgebase
• Getting title list from content provider
• Getting holdings registered in ERM
• The more time goes by, the greater chance it will get neglected

Electronic resources exist in remote locations, yet we rely on people in libraries to pass around information about their holdings.

Metadata is passed through many hands…Sometimes, the baton gets dropped…

Too Many Intermediaries

To overcome current shortcomings in ERM, we need to change the way the data flows.

How should data travel?

…As the crow flies

Automated Holdings Management

[Diagram: Content Provider, Knowledgebase Provider, Library]

• Content provider supplies knowledgebase provider with metadata for all electronic content available for purchase
• Content provider supplies knowledgebase provider with metadata for institution-specific holdings.
• Knowledgebase provider activates institution-specific holdings in content packages.
• Electronic resources are available for discovery without library intervention.

Behind the Curtain

• To enable autoload, providers supply OCLC with the following files:
– Collections File: KBART format file for each collection/package offered by the content provider
– If applicable, KBART format file for PDA e-books
– Collections Description File: Listing of all collections being transferred
– Holdings Data File: Includes the institution holdings by collection/title with customer identifier
– Customer Map: Includes the provider’s customer identifier and the corresponding OCLC cataloging symbol
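To illustrate how the Holdings Data File and the Customer Map could fit together, here is a minimal Python sketch that joins the two on the provider's customer identifier and groups titles by OCLC symbol and collection. The column names (`customer_id`, `collection`, `title_id`, `oclc_symbol`), the tab-separated layout, and all sample values are assumptions made for illustration; the slides do not describe the actual file layouts or how OCLC processes them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample files; real column names and layouts may differ.
HOLDINGS_DATA = (
    "customer_id\tcollection\ttitle_id\n"
    "C001\tExample Collection A\tT100\n"
    "C001\tExample Collection A\tT101\n"
    "C002\tExample Collection B\tT200\n"
)
CUSTOMER_MAP = (
    "customer_id\toclc_symbol\n"
    "C001\tAAA\n"
    "C002\tBBB\n"
)


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def holdings_by_symbol(holdings_text, customer_map_text):
    """Group title identifiers by OCLC symbol and collection name."""
    symbol_for = {
        row["customer_id"]: row["oclc_symbol"]
        for row in read_tsv(customer_map_text)
    }
    grouped = defaultdict(lambda: defaultdict(set))
    for row in read_tsv(holdings_text):
        symbol = symbol_for.get(row["customer_id"])
        if symbol is None:
            continue  # customer not in the map: nothing to activate
        grouped[symbol][row["collection"]].add(row["title_id"])
    return grouped


holdings = holdings_by_symbol(HOLDINGS_DATA, CUSTOMER_MAP)
for symbol, collections in holdings.items():
    for collection, titles in collections.items():
        print(symbol, collection, sorted(titles))
```

The point of the sketch is only that the Customer Map is the link between a provider's own account numbers and the library's identity in the knowledgebase.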


The Players…So Far…

Research Questions

• How well do automated loads reflect the library’s purchased electronic content?
• What types of collections are ideal for automated holdings maintenance?
• How quickly do titles get in the system using the automated service?
• How is the loaded content organized in relation to the library’s licensing agreements?
• Does the service provide adequate reporting to enable libraries to monitor their collections?

The Study

• Study duration: September 2014 – May 2015
• Signed up for as many automated feeds as possible, no matter how big or small
• Each time a file was uploaded in WorldCat knowledge base, a corresponding access report was retrieved from the content provider site
• Data uploaded to a MySQL database and manipulated to make it suitable for comparison
• Custom scripting to determine matched and non-matched titles
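The matching step can be pictured with a small Python sketch. This is not the study's script (which worked against a MySQL database and is not shown in the slides); it is an illustration, using set operations and made-up example.org URLs, of how titles in a provider access report might be split into matched and non-matched against a knowledgebase load.

```python
def compare_urls(provider_urls, kb_urls):
    """Split provider URLs into those found in the KB load and those not found."""
    provider = {url.strip() for url in provider_urls if url.strip()}
    kb = {url.strip() for url in kb_urls if url.strip()}
    matched = provider & kb      # in the access report and in the KB
    unmatched = provider - kb    # in the access report but missing from the KB
    return matched, unmatched


# Hypothetical data for illustration only.
access_report = [
    "https://example.org/book/1",
    "https://example.org/book/2",
    "https://example.org/book/3",
]
kb_load = [
    "https://example.org/book/1",
    "https://example.org/book/3",
]

matched, unmatched = compare_urls(access_report, kb_load)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

Comparing on URLs rather than titles avoids false mismatches from differing title strings, at the cost of assuming both sources record the same URL form.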

ebrary

• Service Profile
– Collection in KB: ebrary All Purchased
– Frequency: Every two weeks
– OCLC number coverage: 95%
– Available for PDA: Yes

ebrary Results

Upload date   All ebrary URLs   Matched URLs   Unmatched URLs
9/11/2014     12418             12120          298
10/1/2014     12417             12121          296
10/31/2014    12417             12116          301
11/11/2014    12424             12110          314
11/26/2014    12427             12117          310
12/26/2014    12435             12127          308
1/23/2015     12447             12435          12
2/4/2015      14524             12441          2083
2/26/2015     14571             14533          38
3/3/2015      14571             14571          0

ebrary Observations

• Irregular frequency (between Sept 2014 and May 2015, only 10 uploads)

• Single title orders are often the most anxiously awaited…monthly load too long to wait

• Majority of missing titles showed up in the subsequent upload

• KB initially represented a fraction of our ebrary titles…later additional collections were added to the knowledgebase
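The irregular frequency can be checked directly from the ten upload dates in the ebrary results. The short Python sketch below computes the number of days between consecutive uploads and counts the gaps longer than the advertised two-week cycle; the 14-day threshold is an assumption taken from the "Every two weeks" service profile.

```python
from datetime import date

# Upload dates from the ebrary results (month/day/year in the slides).
load_dates = [
    date(2014, 9, 11), date(2014, 10, 1), date(2014, 10, 31),
    date(2014, 11, 11), date(2014, 11, 26), date(2014, 12, 26),
    date(2015, 1, 23), date(2015, 2, 4), date(2015, 2, 26),
    date(2015, 3, 3),
]
EXPECTED_DAYS = 14  # assumption: "Every two weeks" read as 14 days

# Days between each upload and the next one.
gaps = [(later - earlier).days for earlier, later in zip(load_dates, load_dates[1:])]
late_gaps = [gap for gap in gaps if gap > EXPECTED_DAYS]

print("Gaps in days:", gaps)
print("Longest gap:", max(gaps), "days")
print("Gaps longer than expected:", len(late_gaps), "of", len(gaps))
```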

MyiLibrary

• Service Profile
– Collection in KB: MyiLibrary Collection
– Frequency: Weekly
– OCLC number coverage: 96%
– Available for PDA: No

MyiLibrary Results

Upload date   All MyiLibrary URLs   Matched URLs   Unmatched URLs
9/24/2014     30037                 30034          3
10/31/2014    30037                 30035          2
11/5/2014     30037                 30036          1

MyiLibrary Observations

• Load frequency does not live up to expectations (between Sept 2014 and May 2015 there were 3 uploads)

• List provided by content provider missing a large number of purchased titles (approximately 30,000 titles uploaded; 39,636 titles available on website)

• All MyiLibrary content in one collection. Does not account for separately licensed content.

Postscript to MyiLibrary Story

• After contacting MyiLibrary about the missing titles, a list was produced containing ALL 39,636 titles we subscribe to on the platform.

…for the MyiLibrary collection to be updated in the WorldCat Knowledge Base…

EBL Ebook Library

• Service Profile
– Collection in KB: Ebook Library Catalogue
– Frequency: Once a week
– OCLC number coverage: 99.8%
– Available for PDA: Yes

EBL Ebook Library Results

Upload date   All EBL URLs   Matched URLs   Unmatched URLs
2/28/2015     8              8              0
3/8/2015      10             10             0
3/24/2015     12             12             0

EBL Ebook Library Observations

• New content provider for University of Toronto Libraries

• Perfect results, though sample was extremely small

• Close to weekly uploads (three loads in a one-month span, though nothing since end of March)

Elsevier ScienceDirect

• Service Profile
– Collections in KB:
  • Elsevier ScienceDirect Journals
  • ScienceDirect Book Series
  • ScienceDirect All Books
– Frequency: Weekly
– OCLC number coverage:
  • Elsevier ScienceDirect Journals – 91.6%
  • ScienceDirect Book Series – 96.7%
  • ScienceDirect All Books – 98.9%
– Available for PDA: No

ScienceDirect Access Report

• The ScienceDirect access report includes:
– Subscribed titles
– Complimentary titles
– Free-to-read titles
– Non-Subscribed titles

• Much duplication in report, mainly attributed to differing access types.

• All categories, except for the non-subscribed titles, are represented in the data feed to OCLC.

Six Publication Types – Three Collections

Publication types:
• Journal
• Book
• Book Series
• Book Series Volume
• Reference Work
• Handbooks Series

Collections:
• Books
• Book Series
• Journals

ScienceDirect Analysis: A Game of Hide and Seek

• Over the course of the study, some content was missing or moved from one collection to another.
– Many book series volumes missing from collections
– Handbook series moved from book series collection to serials collection
– E-books were often contained in more than one package

Changing Directions

• Due to difficulties in data matching through time, a new approach was needed
• Treat ScienceDirect as a single collection and compare distinct URLs
• Led to a more accurate picture of the uploaded content
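A small Python sketch shows why pooling the collections helps. The collection names come from the service profile, but the URLs and the two loads are hypothetical, and the study's own scripts are not shown in the slides. When a title moves from one collection to another, a per-collection comparison reports it as missing, while a comparison of distinct URLs across all collections does not.

```python
def distinct_urls(collections):
    """Flatten a {collection name: [urls]} mapping into one set of distinct URLs."""
    return {url for urls in collections.values() for url in urls}


# Hypothetical loads: a handbook moves from the book series collection
# to the journals collection between two uploads.
earlier_load = {
    "ScienceDirect Book Series": ["https://example.org/handbook/1"],
    "Elsevier ScienceDirect Journals": ["https://example.org/journal/1"],
}
later_load = {
    "ScienceDirect Book Series": [],
    "Elsevier ScienceDirect Journals": [
        "https://example.org/journal/1",
        "https://example.org/handbook/1",
    ],
}

# Per-collection view: the handbook appears to have gone missing.
per_collection_missing = (
    set(earlier_load["ScienceDirect Book Series"])
    - set(later_load["ScienceDirect Book Series"])
)

# Single-collection view: nothing is missing overall.
overall_missing = distinct_urls(earlier_load) - distinct_urls(later_load)

print("Missing per collection:", per_collection_missing)
print("Missing overall:", overall_missing)
```

Using a set also removes the duplication that arises when the same e-book is contained in more than one package.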

Elsevier ScienceDirect Results

Upload date   All Elsevier URLs   Matched URLs   Unmatched URLs
9/17/2014     14178               13049          1129
10/25/2014    14299               13091          1208
11/2/2014     14327               13093          1234
11/16/2014    14390               14321          69
11/30/2014    14437               14321          116
12/10/2014    14506               14324          182
12/14/2014    14538               14449          89
12/22/2014    14557               14449          108
1/11/2015     14535               14489          46
1/19/2015     14568               14496          72
1/25/2015     14578               14419          159
2/8/2015      14615               14569          46
2/15/2015     14626               14607          19
3/2/2015      14859               14766          93
3/8/2015      14876               14766          110
3/16/2015     14888               14765          123
3/22/2015     14935               14869          66
3/29/2015     14954               14883          71
4/5/2015      14978               14927          51
4/12/2015     15043               14946          97
4/18/2015     15052               15010          42
4/29/2015     15097               15044          53

Elsevier ScienceDirect Results

[Chart: Unmatched URLs per upload, 9/17/2014 to 4/29/2015. This is the same Unmatched URLs series as in the preceding Elsevier ScienceDirect Results chart, shown on its own.]

What we really want to know is how many titles DID NOT get into the knowledgebase.

ScienceDirect Observations

• In early uploads, many book series volumes did not get loaded into the knowledgebase

• Change in definition of ‘ScienceDirect Book Series’ collection largely resolved missing title issue

• In most cases, e-resources that were missing in one load showed up in the subsequent load

• Frequency is generally consistent, with a few minor hiccups

Of all the titles NOT matched throughout the study…

…there were only 20 titles not represented in the KB…

…That’s only 0.1% of all titles in our Elsevier account…
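As a rough check of that percentage, the arithmetic can be written out in Python. The slides do not state the denominator; the sketch assumes it is the final "All Elsevier URLs" count of 15,097 from the 4/29/2015 upload.

```python
titles_not_in_kb = 20        # from the slide
all_elsevier_urls = 15097    # assumption: last "All Elsevier URLs" count (4/29/2015)

# Share of titles that never reached the knowledgebase, as a percentage.
share_percent = titles_not_in_kb / all_elsevier_urls * 100
print(f"{share_percent:.2f}% of titles were not represented in the KB")
```

Under that assumption the share is about 0.13%, which rounds to the 0.1% quoted on the slide.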

Autoload vs. ‘Traditional’ ERM Techniques

• Comparison between UTL’s ‘subscribed’ ScienceDirect titles in ERM and Elsevier entitlements

• Misalignment between selected packages and actual purchases
– 879 titles we are entitled to were not represented in subscribed content packages
– 247 titles in the subscribed packages were titles we did not have access to

Autoload Reports


An ERM Promise Fulfilled?

• Time saving for librarians
• Well suited for “cherry-picked” collections where manual selection is necessary (i.e. aggregator platforms)
• Increased accuracy
• Excellent compatibility with PDA programs

Some Remaining Challenges

• Completely reliant on accuracy of content provider metadata
– Any problems need to be addressed by the content provider
– Manual corrections will be overwritten each time data is reloaded
• Length of time between uploads can be long (monthly or more)
• Difficult to spot when things do go wrong and content does not get loaded.

Autoload Wishlist

Seamless Updates

• Will there ever be a time when activation on content provider site and knowledgebase is synched daily?

Better Reporting Capabilities

• Increased reporting capabilities
– Alerts/notifications when uploads occur
– Libraries need to know what content could not be loaded
• Feedback loop
– Ability to analyze data and report inconsistencies leads to better product development

Help With Single Journal Subscriptions

• Managing single e-journals is like trying to herd cats
– Consolidation of registration/activation
  • Do I really need to activate a title on the vendor site AND in the ERM system?
– New opportunity for subscription agents?

Concurrent Users

• Ability to determine concurrent user limit
• Particularly important for aggregator packages that have multiple purchasing options
– i.e. ebrary MUPO and SUPO collections

Greater Participation

• This is only the tip of the iceberg

• Libraries need to advocate for autoloaded collections …LOUDLY!

How Do We Get There From Here?

• Standardization
• Technological Sophistication
• Co-operation
• Customer Feedback
• Progressive Licensing Terms
• Data Integrity

A Common Purpose

and knowledgebase providers

Above all, perhaps, librarians and publishers should sit down at a table of common purpose and join again in what has always been a necessary partnership: to publish and make available the ideas and creative works of authors.

Harold Billings, “Supping with the devil: new library alliances in the information age.” (1993)