taming the wilde

Post on 24-Jun-2015

DESCRIPTION

2014 Charleston Conference Thursday, Nov 6, 2:15 PM

TRANSCRIPT

Taming the Wilde

Collaborating with Expertise for Faster, Better, Smarter Collection Analysis

Jackie Bronicki, Collections and Online Resources Coordinator

Cherie Turner, Chemical Sciences Librarian

Shawn Vaillancourt, Education Librarian

Frederick Young, Systems Analyst

Outline of Presentation

Research Question 1: What are the best measurements for evaluating the current scope of the collection?

Research Question 2: What subject areas are not adequately covered in the current collection?

Research Questions

• Influenced by the Cornell University Library Print Collection usage report

• No language analysis

• No patron analysis

• Limited formats

Methodology

• 889,825 total monograph items in final dataset

• 425,865 titles that have not circulated (48%)

• 787,590 titles circulated 5 or fewer times (88%)

• 861,910 titles that have not circulated in the last year (97%)

Results

[Figure: Distribution by LC Class. Bar chart across LC classes A, B, C, D, E, F, G, H, J, K, L, M, N, P, Q, R, S, T, U, V, Z; vertical axis from 0 to 250,000.]
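A distribution by LC class implies grouping each record by the leading letters of its call number. The sketch below shows one way that grouping could be done; it is not the presenters' code. The function names, the regular expression, and the sample call numbers are all illustrative assumptions.

```python
import re
from collections import Counter

# Assumption: an LC call number starts with one to three letters,
# e.g. "BF" in "BF408 .R56 2009". Real catalog data may need more rules.
_LC_PREFIX = re.compile(r"^\s*([A-Z]{1,3})")


def lc_subclass(call_number):
    """Return the leading LC letters (e.g. "BF"), or None if there are none."""
    match = _LC_PREFIX.match(call_number.upper())
    return match.group(1) if match else None


def lc_class(call_number):
    """Return the single-letter LC class (e.g. "B"), or None."""
    subclass = lc_subclass(call_number)
    return subclass[0] if subclass else None


# Hypothetical sample records; the last one has no usable LC prefix.
sample = ["BF408 .R56 2009", "QD31.3 .C43", "pr5823 .A1", "1234 no class"]

# Count records per LC class, skipping call numbers that cannot be parsed.
counts = Counter(c for c in map(lc_class, sample) if c is not None)
```

Records whose call number yields no class are skipped here rather than guessed at, which is one possible policy among several.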

$$\mathrm{PEU} = \frac{\text{Percent Usage}}{\text{Percent of Holdings}}$$

$$\mathrm{PEU}_{B} = \frac{1.43\%}{1.32\%} = 1.08$$

If PEU > 1: Overused

If PEU < 1: Underused

$$\mathrm{RBH} = \frac{\text{Percent of ILL Borrowing}}{\text{Percent of Holdings}}$$

$$\mathrm{RBH}_{B} = \frac{0.79\%}{1.32\%} = 0.60$$

Mean RBH = 1.54 ± 5.18

If RBH > Mean RBH: Overused

If RBH < Mean RBH: Underused

Comparing Circulation to ILL Usage
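As a worked check of the two ratios, the sketch below recomputes PEU and RBH for subclass B, taking 1.32% as its percent of holdings, 1.43% as its percent usage, and 0.79% as its percent of ILL borrowing. The function names are illustrative. The slides do not say how a value exactly equal to its threshold is labeled, so treating a tie as "Underused" is an assumption.

```python
MEAN_RBH = 1.54  # mean RBH reported in the slides


def peu(percent_usage, percent_holdings):
    """Percent usage divided by percent of holdings."""
    if percent_holdings <= 0:
        raise ValueError("percent_holdings must be positive")
    return percent_usage / percent_holdings


def rbh(percent_ill_borrowing, percent_holdings):
    """Percent of ILL borrowing divided by percent of holdings."""
    if percent_holdings <= 0:
        raise ValueError("percent_holdings must be positive")
    return percent_ill_borrowing / percent_holdings


def label(value, threshold):
    """Label a ratio against its threshold (ties count as Underused: an assumption)."""
    return "Overused" if value > threshold else "Underused"


# Subclass B, using the percentages quoted in the slides.
peu_b = peu(1.43, 1.32)
rbh_b = rbh(0.79, 1.32)

print(f"PEU_B = {peu_b:.2f} -> {label(peu_b, 1.0)}")
print(f"RBH_B = {rbh_b:.2f} -> {label(rbh_b, MEAN_RBH)}")
```

PEU is compared against 1, while RBH is compared against the mean RBH, so the two labels use different thresholds.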

LC Subclass  Percent of Holdings  Percent Usage  PEU   Holdings Usage  Percent of ILL Borrowing  RBH   ILL Usage
B            1.32%                1.43%          1.08  Overused        0.79%                     0.60  Underused
BC           0.09%                0.08%          0.82  Underused       0.05%                     0.51  Underused
BD           0.24%                0.20%          0.84  Underused       0.24%                     1.01  Underused
BF           1.22%                1.78%          1.46  Overused        2.00%                     1.64  Overused
BH           0.07%                0.09%          1.29  Overused        0.05%                     0.68  Underused
BJ           0.22%                0.27%          1.21  Overused        0.18%                     0.79  Underused
BL           0.42%                0.65%          1.56  Overused        0.69%                     1.65  Overused
BM           0.10%                0.07%          0.67  Underused       0.09%                     0.95  Underused
BP           0.13%                0.26%          1.95  Overused        0.34%                     2.57  Overused
BQ           0.04%                0.10%          2.63  Overused        0.32%                     8.05  Overused
BR           0.36%                0.33%          0.91  Underused       0.70%                     1.96  Overused
BS           0.22%                0.16%          0.73  Underused       0.36%                     1.62  Overused
BT           0.16%                0.13%          0.85  Underused       0.40%                     2.53  Overused
BV           0.18%                0.15%          0.86  Underused       0.44%                     2.49  Overused
BX           0.52%                0.29%          0.56  Underused       1.69%                     3.23  Overused

If PEU > 1: Overused; if PEU < 1: Underused

If RBH > Mean RBH: Overused; if RBH < Mean RBH: Underused

Mean RBH = 1.54 ± 5.18

Comparing Circulation to ILL Usage

LC Subclass  Holdings Usage  ILL Usage  Action
B            Overused        Underused  No Changes
BC           Underused       Underused  Ease Off
BD           Underused       Underused  Ease Off
BF           Overused        Overused   Growth Opportunity
BH           Overused        Underused  No Changes
BJ           Overused        Underused  No Changes
BL           Overused        Overused   Growth Opportunity
BM           Underused       Underused  Ease Off
BP           Overused        Overused   Growth Opportunity
BQ           Overused        Overused   Growth Opportunity
BR           Underused       Overused   Change Purchasing
BS           Underused       Overused   Change Purchasing
BT           Underused       Overused   Change Purchasing
BV           Underused       Overused   Change Purchasing
BX           Underused       Overused   Change Purchasing

Comparing Circulation to ILL Usage
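The action column follows directly from the pair of labels for holdings usage and ILL usage. A minimal sketch of that lookup is below; the four combinations and their actions are taken from the table, while the names `ACTIONS` and `recommend` are illustrative.

```python
# (holdings usage, ILL usage) -> action, as listed in the slides' table.
ACTIONS = {
    ("Overused", "Underused"): "No Changes",
    ("Underused", "Underused"): "Ease Off",
    ("Overused", "Overused"): "Growth Opportunity",
    ("Underused", "Overused"): "Change Purchasing",
}


def recommend(holdings_usage, ill_usage):
    """Return the action for a pair of usage labels; KeyError on unknown labels."""
    return ACTIONS[(holdings_usage, ill_usage)]


# A few subclasses from the table, keyed by their two labels.
examples = {
    "B": ("Overused", "Underused"),
    "BF": ("Overused", "Overused"),
    "BM": ("Underused", "Underused"),
    "BX": ("Underused", "Overused"),
}

for subclass, labels in examples.items():
    print(subclass, "->", recommend(*labels))
```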

The More Important Question…..

• Sierra Infrastructure
  – What data existed where?
  – Title vs. Item
  – Call Number

• Defining Input/Output Variables
  – What we could output (circulation)

• MARC

• Scope of Project

• Building a proper sample

Initial Challenges – Research Team

Challenges to Possibilities

• Understanding the question

• Does the System Provide an Answer?

• What can we do?

• High Expectations

• Inconsistency of Data
  – Bad input
  – Batch overlay
  – Doesn’t exist

Data Mining Challenges – Research Team

• Scaled Expectations

• Learning curve

• Piecing the Data Together

Data Mining Challenges – Systems Team

Research Question 1: What are the best measurements for evaluating the current scope of the collection?

Research Question 2: What subject areas are not adequately covered in the current collection?

Research Questions

Initial Output Criteria

Bibliographic Record

• Call Number
• Subject Headings
• Publication/Copyright Date
• ISBN
• Record Number
• Title

Item Record

• Copy Number
• Total Number of Checkouts
• Status

Order Record

• Order Date

Final Output Criteria

Bibliographic Record

• Call Number
• Publication/Copyright Date
• Record Number
• Title
• Publisher
• Catalog Date
• ISBN

Item Record

• Call Number
• Total Checkouts
• Last Year Checkouts
• Year to Date Checkouts
• Location

• Fields for our analysis
  – Call Number
  – Request Date
  – Filled Date
  – Format

• Fields for later analysis
  – Lending Library
  – Title
  – Author
  – Publication Date
  – Publisher
  – Language
  – Library Type
  – ISBN
  – OCLC Number

ILL Output Criteria

Except….

• What was MARC telling us?

• How were fields used?

Got Data?

• ISBN?

• Location: 143,823 records deleted

• Call numbers: 14,894 records deleted

Data Cleaning
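The slides report how many records were deleted for location and call number problems but not the exact criteria. The sketch below assumes the simplest case, dropping records whose location or call number is missing or blank. The field names `location` and `call_number` and the sample records are hypothetical.

```python
def clean(records):
    """Split records into kept rows and per-reason drop counts.

    Assumption: a record is dropped when its location or call number is
    missing or blank. The project's real criteria are not given in the slides.
    """
    kept = []
    dropped = {"location": 0, "call_number": 0}
    for record in records:
        if not (record.get("location") or "").strip():
            dropped["location"] += 1
        elif not (record.get("call_number") or "").strip():
            dropped["call_number"] += 1
        else:
            kept.append(record)
    return kept, dropped


# Hypothetical sample: one clean record, one without location, one without call number.
sample = [
    {"location": "stacks", "call_number": "BF408 .R56 2009"},
    {"location": "", "call_number": "QD31.3 .C43"},
    {"location": "stacks", "call_number": None},
]

kept, dropped = clean(sample)
print(len(kept), dropped)
```

Counting drops by reason makes it possible to report deletions per field, in the way the slide does.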

• Understanding the infrastructure
  – Order records
  – Bib records
    • MARC
  – Item records

• Understanding local practice

• Experts provide guidance and practical solutions!

Lessons Learned

Print and Electronic Serials

• Challenges
  – Different systems store records
  – Different kinds of usage information available
  – Holdings based analysis
  – Subscription or Subscription + Aggregated
  – Vendor supplied records

Aguilar, W. (1986). The application of relative use and interlibrary demand in collection development. Collection Management, 8(1), 15-24.

Knievel, J. E., Wicht, H., & Connaway, L. S. (2006). Use of circulation statistics and interlibrary loan data in collection management. College & Research Libraries, 67(1), 35-49.

Ochola, J. N. (2003). Use of circulation statistics and interlibrary loan data in collection management. Collection Management, 27(1), 1-13. DOI:10.1300/J105v27n01_01

Mills, T. R. (1982). The University of Illinois Film Center Collection Use Study. http://files.eric.ed.gov/fulltext/ED227821.pdf

Report of the Collection Development Executive Committee Task Force on Print Collection Usage. (2012). Cornell University Library. http://staffweb.edu/system/files/CollectionUsageTF_ReportFinal11-22-10.pdf
