Endeca @ NCSU Libraries
Andrew Pace & Emily Lynema
NCSU Libraries
May 24, 2006
Technical Overview
Endeca Information Access Platform co-exists with SirsiDynix Unicorn ILS and Web2 online catalog.
Endeca indexes MARC records exported from Unicorn.
Index is refreshed nightly with records added/updated during previous day.
Endeca IAP Overview
[Diagram, shown in three builds] Raw MARC data -> NCSU exports and reformats -> Flat text files -> Data Foundry (parse text files) -> Indices -> MDEX Engine -> HTTP -> NCSU Web Application -> HTTP -> Client browser
Build labels: Endeca Information Access Platform; Offline - Nightly; Always Online
Integrating Endeca
Endeca doesn’t understand MARC data / MARC-8 character encoding – translate to UTF-8 text files
Each night a script updates the data indexed by Endeca:
– Exports updated or new MARC records from Unicorn.
– Reformats and merges these records with those already indexed.
– Starts Endeca re-index – completely rebuilding index for the catalog.
Process requires about 7 hours.
Retain Web2 OPAC for some functionality
– Authority searching - known items and cross-references
– Detailed record pages – how to make Endeca -> Web2 link?
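The merge step of the nightly script could be sketched roughly as follows. This is a minimal illustration, not the NCSU script: it assumes each reformatted record is one tab-delimited line that starts with a record ID, and the function name merge_records, the IDs, and the sample fields are all hypothetical.

```python
def merge_records(indexed_lines, updated_lines):
    """Merge new/updated flat-text records into the previously indexed set.

    Assumption: each line is 'record_id<TAB>rest of record'. A record in
    updated_lines replaces any indexed record that has the same ID.
    """
    merged = {}
    for line in indexed_lines:
        record_id, _, _ = line.partition("\t")
        merged[record_id] = line
    for line in updated_lines:
        record_id, _, _ = line.partition("\t")
        merged[record_id] = line  # the updated copy wins over the older one
    # Sort by ID so the output file is stable from night to night.
    return [merged[record_id] for record_id in sorted(merged)]


# Hypothetical sample data: two records already indexed, one update, one new.
indexed = ["b1001\tTitle: Cat health", "b1002\tTitle: Gene therapy"]
updated = ["b1002\tTitle: Gene therapy (2nd ed.)", "b1003\tTitle: Faceted search"]
merged = merge_records(indexed, updated)
for line in merged:
    print(line)
```

The merged list is what a full re-index would then read, which matches the "completely rebuilding" step: nothing is patched in place, the whole file set is regenerated.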
Integrating Endeca - Future
MarcAdapter plugin for raw MARC data.
– Create local field mappings and special handlers in Java.
– Eliminate need for external MARC 21 translation and file merging.
Partial Updates
– Update circulation data multiple times throughout the day.
Some Search Statistics (March 2006)
Requests by Search Type
Search               55%
Navigation           15%
Search -> Navigation 30%
Searches by Search Key (bar chart; x-axis: Search Key; y-axis: Requests, 0 to 80000)
Search keys: Keyword, ISBN, Title, Author, Subject, Multi-Field
Bar values: 74971 32776 135639872 58381141
Some Navigation Statistics (March 2006)
Navigation by Dimensions (bar chart; y-axis: Dimension; x-axis: Requests, 0 to 60000)
Dimension            Requests
Author               17939
Language             8653
Subject: Era         7451
Subject: Region      13607
Library              23291
Format               20867
Subject: Genre       17720
Subject: Topic       44197
LC Classification    49931
Availability         6790
Navigation Statistics (II) (March 2006)
Dimension            Requests   Order (on page)
LC Classification    49931      2
Subject: Topic       44197      3
Library              23291      6
Format               20867      5
Author               17939      10
Subject: Genre       17720      4
Subject: Region      13607      7
Language             8653       9
Subject: Era         7451       8
Availability         6790       1
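To make the comparison between usage and on-page position easier to read, the table above can be turned into shares and usage ranks. This is only a small sketch over the figures on the slide; the variable and function names are illustrative.

```python
# (dimension, requests, order on page) copied from the table above.
dimensions = [
    ("LC Classification", 49931, 2),
    ("Subject: Topic", 44197, 3),
    ("Library", 23291, 6),
    ("Format", 20867, 5),
    ("Author", 17939, 10),
    ("Subject: Genre", 17720, 4),
    ("Subject: Region", 13607, 7),
    ("Language", 8653, 9),
    ("Subject: Era", 7451, 8),
    ("Availability", 6790, 1),
]

total = sum(requests for _, requests, _ in dimensions)


def usage_rank(rows):
    """Rank dimensions by request count, 1 = most used."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return {name: rank for rank, (name, _, _) in enumerate(ordered, start=1)}


ranks = usage_rank(dimensions)
for name, requests, page_order in dimensions:
    share = 100 * requests / total
    print(f"{name:18} {share:5.1f}%  usage rank {ranks[name]:2}  page order {page_order:2}")
```

Read this way, Availability is listed first on the page but is the least-used dimension, while Author is listed last but ranks fifth by requests.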
Other interesting tidbits… (March 2006)
Authority searching decreased 45%.
Keyword searching increased 230%.
– Caveat: default catalog search changed from title authority to keyword.
~6% of keyword searches offered spelling correction or suggestion
– 3.6% - automatic spell correction
– 2.6% - “Did you mean…” suggestion
Usability Testing
10 undergraduate students
– 5 with Endeca catalog
– 5 with old Web2 OPAC
Endeca performed as well as OPAC for known-item searching in usability test
– 89% Endeca tasks completed ‘easily’ (8/9)
– 71% OPAC tasks completed ‘easily’ (15/21)
Endeca performed better than OPAC for topical searching in usability test.
Topical Searching Tasks
Topical Task Success: Web2
Easy 36%, Medium 7%, Hard 23%, Failed 34%
Topical Task Success: Endeca
Easy 58%, Medium 17%, Hard 3%, Failed 22%
Average Topical Task Duration (bar chart; Task 5 to Task 10; series: Web2, Endeca; time axis 00:00.0 to 03:36.0)
Usability Testing Trends
Relevance *most* important
– “Once I scroll through a page, I get pretty discouraged about the results...”
  Web2 OPAC participant looking for resources on cat health
‘Keyword’ term less intuitive / trusted than ‘Subject’ and ‘Title’
– “[I used] Keyword in Title because that’s what I want the book to be mainly referring to. But I also could’ve went Keyword in Subject. But if I’d have went Keyword Anywhere it would have had too big of a field to look through.”
  Web2 OPAC participant looking for resources on gene therapy
When found, dimensions seem intuitive and useful
‘Did you mean’ seems intuitive
A study in relevance
Are search results in Endeca more likely to be relevant to a user’s query than search results in Web2 OPAC?
100 topical user searches from 1 month in fall 2005
How many of top 5 results relevant?
– 40% relevant in Web2 OPAC
– 68% relevant in Endeca catalog
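The "how many of top 5 results relevant" figure can be computed along these lines. This is a sketch only: the slide does not say how judgments were pooled, so the code assumes a simple pooled share, and the function name and sample judgments are hypothetical rather than data from the study.

```python
def relevant_share_top_n(judgments, n=5):
    """Share of the top-n results judged relevant, pooled over all searches.

    judgments: one list per search, holding True/False relevance
    judgments for that search's results in ranked order.
    """
    top = [flag for per_search in judgments for flag in per_search[:n]]
    return sum(top) / len(top) if top else 0.0


# Hypothetical judgments for three searches (not data from the study).
sample = [
    [True, True, False, True, False],
    [False, True, True, True, True],
    [True, False, False, False, True],
]
share = relevant_share_top_n(sample)
print(f"{share:.0%} of top-5 results judged relevant")
```

Running the same function over the judgments for each catalog would give two directly comparable percentages, as on the slide.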
Relevance defined
Relevance ranking in Endeca – select from a variety of modules and order them based on importance.
Relevance most important in Keyword Anywhere - searches all fields.
At NCSU…
1. Original query term(s) (no thesaurus, stemming, spell correction)
2. Exact phrase match
3. Field ranking (Title higher than Author higher than Table of Contents)
4. Number of fields that contain term(s) …
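One way to picture "ordering modules by importance" is as a sort key compared left to right, so that a later module only breaks ties left by earlier ones. The sketch below is a toy illustration under that assumption; it is not Endeca's API or its actual modules, and the field names, the helper relevance_key, and the sample records are hypothetical.

```python
# Field priority from the slide: Title above Author above Table of Contents.
FIELD_RANK = {"title": 0, "author": 1, "toc": 2}


def relevance_key(record, query):
    """Build a sort key; smaller tuples sort first (more relevant)."""
    terms = query.lower().split()
    phrase = " ".join(terms)
    fields = {name: text.lower() for name, text in record.items()}

    # 1. All original query terms present as typed (no stemming, thesaurus).
    all_words = " ".join(fields.values()).split()
    has_terms = all(term in all_words for term in terms)
    # 2. Exact phrase match in any field.
    has_phrase = any(phrase in text for text in fields.values())
    # 3. Best-ranked field that contains any query term.
    matching = [name for name, text in fields.items()
                if any(term in text.split() for term in terms)]
    best_field = min((FIELD_RANK.get(name, len(FIELD_RANK)) for name in matching),
                     default=len(FIELD_RANK))
    # 4. Number of fields that contain a term (more fields sorts first).
    return (not has_terms, not has_phrase, best_field, -len(matching))


# Hypothetical records for illustration.
records = [
    {"title": "Veterinary medicine", "author": "Smith", "toc": "Cat health and nutrition"},
    {"title": "Cat health handbook", "author": "Jones", "toc": "Feeding; cat health checks"},
    {"title": "Health of the domestic cat", "author": "Lee", "toc": "Diet"},
]
ranked = sorted(records, key=lambda r: relevance_key(r, "cat health"))
for record in ranked:
    print(record["title"])
```

In this toy data the record with the exact phrase in its title comes first, the one with the phrase only in its table of contents second, and the one that contains both terms but not the phrase last.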
Future Plans
Ongoing tweaks:
– Continued usability testing
– Relevance ranking algorithms & spell correction thresholds
– Additional browsing options
Endeca 2.0 ideas
– FRBR-ized display
– Discussions with OCLC regarding FAST (Faceted Application of Subject Terminology) and FRBR
– Patron-generated refinements (folksonomies?)
– Enrich records with supplemental Web Services content – more usable TOCs, book reviews, etc.
– The death of authority searching (?)
– More integration with QuickSearch, other data repositories, and third-party discovery tools
Thanks
http://www.lib.ncsu.edu/endeca
Andrew Pace, Head, IT
[email protected]
Emily Lynema, Systems Librarian for Digital Projects