TRANSCRIPT
NISO Two Part Webinar: Is Granularity the Next Discovery Frontier?
Part 1: Supporting Direct Access to Increasingly Granular Chunks of Content
Wednesday, March 11, 2015
Speakers:
Myung-Ja (MJ) Han, Metadata Librarian, University of Illinois at Urbana-Champaign
Tito Sierra, Director of Product Management, EBSCO Information Services
Daniel Mayer, Vice President of Product & Marketing, TEMIS
http://www.niso.org/news/events/2015/webinars/granularity_pt1/
Metadata with Levels of Description
Myung-Ja (MJ) Han
Metadata Librarian
University of Illinois at Urbana-Champaign
3/11/2015 NISO Webinar on Supporting Direct Access to Increasingly Granular Chunks of Content
Metadata is …
Prime sources of information to:
• Organize
• Manage
• Provide access to resources
• Preserve library resources
Access and Management
00667cam a2200229Ii 4500001000800000005001700008008004100025035002300066040002300089035001300112035001600125090002200141049000900163100003500172245010800207260003800315300001900353500003200372910001200404994001200416910000900428262410820030807121329.0900530s1841 enk 000 1 eng d a(OCoLC)ocm03535290 aSUCcSUCdOCLdUIU 9ARY-4606 9UC 12037257 aPR4058b.S65 1841 aUIUU1 aIngoldsby, Thomas,d1788-1845.10aSome account of my cousin Nicholas /cby Tomas Igoldsby, esq. ; to which is added, The rubber of life. aLondon :bRichard Bentley,c1841. a3 v. ;c19 cm. aFirst edition. Sadleir 157. arcp3356 a02bUIU aMARS
Localized card cataloging
Library of Congress' card service
MARC
MARC
• MARC Format Record
- has been a cataloging tool in the library community since the 1960s
- has more than 1,900 fields
- has made libraries move towards the Computer Age (Tennant, 2004)
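The raw record shown on the "Access and Management" slide illustrates how MARC packs its fields: a 24-character leader, then a directory of 12-character entries (tag, field length, starting position). The following is a minimal sketch that reads only the leader and directory of that record; the function names are illustrative, and the field data itself is not parsed because the transcript has lost its delimiters.

```python
LEADER_LEN = 24   # a MARC leader is always 24 characters
ENTRY_LEN = 12    # directory entry: 3-char tag, 4-char length, 5-char start

# Leader and directory copied from the record on the slide.
LEADER = "00667cam a2200229Ii 4500"
DIRECTORY = (
    "001000800000" "005001700008" "008004100025" "035002300066"
    "040002300089" "035001300112" "035001600125" "090002200141"
    "049000900163" "100003500172" "245010800207" "260003800315"
    "300001900353" "500003200372" "910001200404" "994001200416"
    "910000900428"
)


def parse_leader(record):
    """Return the record length and the base address of the field data."""
    return {
        "record_length": int(record[0:5]),
        "base_address": int(record[12:17]),
    }


def parse_directory(record):
    """Return (tag, length, start) tuples for every directory entry."""
    base = parse_leader(record)["base_address"]
    # The directory ends one character before the base address
    # (that last character is the field terminator).
    directory = record[LEADER_LEN:base - 1]
    return [
        (directory[i:i + 3], int(directory[i + 3:i + 7]), int(directory[i + 7:i + 12]))
        for i in range(0, len(directory), ENTRY_LEN)
    ]


entries = parse_directory(LEADER + DIRECTORY)
for tag, length, start in entries:
    print(tag, length, start)
```

Each entry points at one field, so the title statement (tag 245) can be located without scanning the whole record.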
Changes - Environments
• Libraries have lost their place as primary information providers (Coyle & Hillmann, 2007)
• Users’ search behaviors have changed
• Printed book is no longer the only major vehicle for scholarly communication (Sandler, 2005)
• Increase in diverse formats/web-based information resources
Changes - Metadata
• Metadata - contains more than descriptive information
- is harvested and converted
- is provided by vendors and others
- is enhanced by users
- should support discovery service
“Virtually any content we digitize and make available to our clientele requires metadata for
discovery and access.”
(Tennant, 2002, p. 32)
Metadata should be created at
different levels of granularity
to support granular levels of access!
Different Levels of Access
• Volume on a shelf
• Chapter of a book
• Article of a journal issue
• Special unit of a book
Different Levels of Metadata
• Book/Journal title
• Chapter
• Article
• Special unit of a book
• And more…
Granularity of Metadata I
• Emerging needs
– Meet users’ needs to find and use resources
– Support the library’s discovery services
– Respond to increasing interest in, and development of, digital humanities
Need granular levels of metadata!
Riley, Jennifer. (2010). “Seeing Standards: A Visualization of the Metadata Universe.”
http://www.dlib.indiana.edu/~jenlrile/metadatamap/
Granularity of Elements
• MARC
100 1 _ $a Last name, First name. $d 1111-1222, $e role.
• TEI
<name type="person">First name and Last name</name> or <author>Last name, First name.</author>
• Dublin Core
<dc:creator>Last name, First name. Date. </dc:creator>
• MODS
<name type="personal">
<namePart type="given">First name</namePart>
<namePart type="family">Last name</namePart>
<role>
<roleTerm type="code" authority="marcrelator">aut</roleTerm>
<roleTerm type="text" authority="marcrelator">author</roleTerm>
</role>
</name>
<dc:title>1818-1918, a hundred years of Sunday school history in Illinois; a mosaic.</dc:title>
<dc:creator>Mills, Andrew H.,1851- </dc:creator>
<dc:type>text</dc:type>
<dc:publisher>Decatur, Ill., The author</dc:publisher>
<dc:date>[1918?]</dc:date>
<dc:language>eng</dc:language>
<dc:subject>International Sunday school association.</dc:subject>
<dc:subject>Sunday schools</dc:subject>
<dc:relation/>
<dc:identifier>http://www.archive.org/details/18181918hundredy00mill</dc:identifier>
<mods:mods>
<mods:titleInfo>
<mods:title>1818-1918, a hundred years of Sunday school history in Illinois..</mods:title>
</mods:titleInfo>
<mods:name>
<namePart type="given">Andrew H.</namePart>
<namePart type="family">Mills</namePart>
<namePart type="date">1851- </namePart>
<mods:role>
<mods:roleTerm mods:type="text">author</mods:roleTerm>
</mods:role>
</mods:name>
…..
</mods:mods>
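The two records above carry the same creator at different levels of granularity: Dublin Core holds one string, while MODS separates given name, family name, and date. As a rough sketch, assuming the creator string follows the "Family, Given, date" pattern of the example (real headings are far less regular), the string can be split and re-expressed as MODS-style name parts. The function names are illustrative and the MODS namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def split_creator(creator):
    """Split a 'Family, Given, date' heading into its parts (heuristic)."""
    parts = [p.strip() for p in creator.split(",") if p.strip()]
    return {
        "family": parts[0] if parts else "",
        "given": parts[1] if len(parts) > 1 else "",
        "date": parts[2] if len(parts) > 2 else "",
    }


def to_mods_name(creator, role="author"):
    """Build a MODS-like <name> element from a Dublin Core creator string."""
    fields = split_creator(creator)
    name = ET.Element("name", {"type": "personal"})
    for part_type in ("given", "family", "date"):
        if fields[part_type]:
            part = ET.SubElement(name, "namePart", {"type": part_type})
            part.text = fields[part_type]
    role_el = ET.SubElement(name, "role")
    term = ET.SubElement(role_el, "roleTerm", {"type": "text"})
    term.text = role
    return name


name = to_mods_name("Mills, Andrew H.,1851- ")
print(ET.tostring(name, encoding="unicode"))
```

Going in the other direction (MODS to Dublin Core) is a simple concatenation, which is why the finer-grained record is the safer one to create first.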
Extensibility of Element
“Schemas allow users to extend the
elements set to meet the local use.”
Granularity of Values
How to add subject headings?
• 20% rule
• Rule of three
• Rule of four
Does this serve well in resource discovery and retrieval in the digital age?
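To make the question concrete, here is a small sketch of how these rules cap the granularity of subject values. It assumes a simplified reading of the conventions (a topic needs at least 20% of the work to earn a heading; more than three subtopics are replaced by one broader heading, or more than four under the rule of four). Cataloging manuals are more nuanced, and the book and headings below are invented for illustration.

```python
def apply_20_percent_rule(topic_shares, threshold=0.20):
    """Keep only topics that make up at least `threshold` of the work."""
    return [topic for topic, share in topic_shares.items() if share >= threshold]


def apply_rule_of_three(subtopics, broader_heading, limit=3):
    """Keep specific headings up to `limit`; otherwise use the broader one."""
    if len(subtopics) <= limit:
        return list(subtopics)
    return [broader_heading]


# Hypothetical book with four chapters of equal length.
shares = {"Sparrows": 0.25, "Finches": 0.25, "Warblers": 0.25, "Thrushes": 0.25}

significant = apply_20_percent_rule(shares)
print(apply_rule_of_three(significant, "Songbirds"))           # rule of three
print(apply_rule_of_three(significant, "Songbirds", limit=4))  # rule of four
```

Under the rule of three the four chapter topics collapse into a single broader heading, so a chapter-level search on any one of them would not retrieve the book.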
Granularity of Metadata II
• Creating/extending metadata schemas
– *Assess the granularity of access points
– *Develop a new set of elements (Application profile)
– Identify available semantics
– Create a new metadata schema (Application Profile)
*Metadata record creation
Book Emblem Pictura
<biblioDesc>
<mods>
<mods:titleInfo>
<mods:title>XL [i.e. Quadraginta] emblemata miscella nova</mods:title>
</mods:titleInfo>
<mods:physicalDescription>
<mods:digitalOrigin>reformatted digital</mods:digitalOrigin>
<mods:form authority="marcform">print</mods:form>
<mods:extent>[8], xxxx p. : 41 ill. ; 20 cm.</mods:extent>
</mods:physicalDescription>
...
</mods>
<copyDesc>
<copyID>uiu2895515</copyID>
<owner countryCode="US">University of Illinois</owner>
<digDesc comp="complete" scope="all" xml:id="xliequadragintae00mure" globalID="http://hdl.handle.net/10111/UIUCOCA:xliequadragintae00mure">
<copyID>10111/UIUCOCA:xliequadragintae00mure</copyID>
<owner countryCode="US">University of Illinois</owner>
</digDesc>
…
</copyDesc>
<emblem xmlns:rdf="http://…" xmlns:skos="http://…" xml:id="E000944" citeNo="I." globalID="http://... :E000944">
<motto>
<transcription xml:lang="de">Alchimisterey: <normalisation xml:lang="de">Alchemie:</normalisation></transcription>
</motto>
<pictura xml:id="E000944_P1">
<iconclass rdf:about="http://www.iconclass.org/rkd/31A247">
<skos:notation>31A247</skos:notation>
<skos:prefLabel>looking over the shoulder</skos:prefLabel>
</iconclass>
…
</pictura>
</emblem>
</biblioDesc>
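Because the record nests book, emblem, and pictura, each level can become its own access point. The sketch below walks a simplified copy of the record and emits one index row per Iconclass heading, keyed by emblem and pictura identifiers. The sample drops the rdf and skos prefixes (their namespace URIs are elided on the slide), so the element names `notation` and `prefLabel` and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# xml:id lives in the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, well-formed excerpt of the emblem record (prefixes removed).
SAMPLE = """<biblioDesc>
  <emblem xml:id="E000944" citeNo="I.">
    <motto><transcription xml:lang="de">Alchimisterey:</transcription></motto>
    <pictura xml:id="E000944_P1">
      <iconclass about="http://www.iconclass.org/rkd/31A247">
        <notation>31A247</notation>
        <prefLabel>looking over the shoulder</prefLabel>
      </iconclass>
    </pictura>
  </emblem>
</biblioDesc>"""


def index_emblems(xml_text):
    """Return one row per Iconclass heading, with emblem and pictura ids."""
    root = ET.fromstring(xml_text)
    rows = []
    for emblem in root.iter("emblem"):
        motto = emblem.findtext("motto/transcription", default="").strip()
        for pictura in emblem.iter("pictura"):
            for heading in pictura.iter("iconclass"):
                rows.append({
                    "emblem_id": emblem.get(XML_ID),
                    "pictura_id": pictura.get(XML_ID),
                    "motto": motto,
                    "notation": heading.findtext("notation"),
                    "label": heading.findtext("prefLabel"),
                })
    return rows


rows = index_emblems(SAMPLE)
for row in rows:
    print(row)
```

A search on an Iconclass notation can then land on the individual pictura rather than on the whole digitized book.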
Metadata
• Involves designing a data model
• Is a highly collaborative effort
– Scholars/users
– Domain specialists
– Metadata/Cataloging librarians
(Cole, Han, and Vannoy, 2012)
New Metadata Schema
Catalogers/Metadata Librarians
Users
Domain Specialists
New discovery services
Images
• Building Blocks (Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0)) http://www.flickr.com/photos/bdesham/2432400623/
• Lego Parts (Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)) http://www.flickr.com/photos/starstreak007/3882987034/in/photostream/
• Lego Pencil and Notebook (Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)) http://www.flickr.com/photos/starstreak007/3882191947/in/photostream/
References
• Coyle, K., & Hillmann, D. (2007). Resource Description and Access (RDA). D-Lib Magazine, 13(1/2). doi:10.1045/january2007-coyle
• Cole, T. W., Han, M-J, & Vannoy, J. (2012). Descriptive Metadata, Iconclass, and Digitized Emblem Literature. Proceedings of the 12th Annual Joint Conference on Digital Libraries. New York: Association for Computing Machinery. 111-120.
• Sandler, M. (2005). Disruptive Beneficence. Internet Reference Services Quarterly, 10(3-4), 5-22.
• Tennant, R. (2002). The Importance of Being Granular. Library Journal, 127(9), 32.
• Tennant, R. (2004). A Bibliographic Metadata Infrastructure for the Twenty-First Century. Library Hi Tech, 22(2), 175-181.
Is Granularity the Next
Discovery Frontier?
Granular Discovery: User
Challenges and Opportunities
Tito Sierra
Director of Product Management, Search
EBSCO Information Services
EBSCO Discovery Service
Agenda
• The Evolution of Library Discovery
• Discovery in the Age of Google
• Metadata and Granular Discovery
– Metadata-centered approaches
– User-centered approaches
• Concluding Thoughts
The Evolution of Library Discovery
Library search has come a long way
• OPACs
• Online Research Databases
• Federated Search
• Next Generation Catalogs
• Web-scale Discovery Services
The Evolution of Library Discovery
From siloed discovery to unified discovery
• Articles / e-journal content
• Print collection (catalog)
• E-books
• Magazines / trade publications / news sources
• General reference / specialized reference
• Multimedia (images, audio, videos)
The Evolution of Library Discovery
Supporting a variety of discovery needs
• Known-item book and article citation search
• Exploratory search / topic exploration
• Discipline-specific research / literature reviews
• Curriculum support / course assignments
• General reference / specialized reference
• Specialized content discovery
User Expectations for Search
Shaped by popular search engines
• Support basic keyword search
• Deep content coverage
– Diverse content types
– Diverse content sources
• Predictive relevance ranking
• Intelligent search features
– Autocomplete, Did-you-mean
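As an illustration of the "Did-you-mean" feature named above, the following sketch corrects misspelled query words against a known vocabulary using the standard library's `difflib`. The vocabulary, the cutoff value, and the function name are assumptions; production search engines typically draw suggestions from query logs and index statistics instead.

```python
import difflib


def did_you_mean(query, vocabulary, cutoff=0.8):
    """Replace each unknown query word with its closest vocabulary match."""
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        close = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        # Keep the original word when nothing is similar enough.
        corrected.append(close[0] if close else word)
    return " ".join(corrected)


# Hypothetical vocabulary, e.g. terms harvested from indexed metadata.
VOCABULARY = ["metadata", "granularity", "discovery", "library"]

print(did_you_mean("metdata granularty", VOCABULARY))
```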
Findings from EDS User Research
• Observation: Keyword search most common
  Implication: Discovery service cannot assume users will pre-coordinate their search. Discovery service needs to anticipate user intent based on limited input.
• Observation: Faceted search used sparingly
  Implication: Discovery service cannot assume users will post-coordinate their search. Discovery service needs to provide more user-friendly narrowing options.
• Observation: Search query length often short (1-2 words)
  Implication: Discovery service needs to anticipate user intent based on limited input. Search features needed to help users clarify their search intent.
• Observation: Broad and imprecise queries common
  Implication: Discovery service needs to help users narrow their search based on limited input. Many users are looking for a topical overview on a subject.
• Observation: User focus on top results
  Implication: Relevance ranking crucial for delivering a quality search experience. Need to optimize search to display most relevant results on first page.
The Discovery Service Challenge
Despite major evolutions in library search,
libraries, publishers, and discovery service
providers need to work harder to meet
evolving user expectations for search.
The Discovery Service Advantage
One advantage that library discovery
services have over general purpose
search engines is access to high-quality,
well structured metadata.
Metadata, Metadata, Metadata
• Citation metadata
– Title, Authors, Publication Date, Volume, Issue…
• Descriptive metadata
– Publication Type, Document Type, Subject Terms, Abstract, Author-supplied keywords, Lexile level…
• Source metadata
– Publisher, Content Provider, Database…
• Identifiers
– ISBN, ISSN, LCCN, Call Number…
Advanced Search
Advanced search options:
• Single fielded search
– Title
– Author
– Subject Terms
• Multi-fielded search
• Search modes
– Boolean/Phrase
– Find all search terms
– Find any search terms
• Search limiters
– Full Text
– Scholarly Journals
– Catalog Only
– Language
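The options listed above can be read as three independent choices: which field to search, how to combine terms, and which limiters to apply. A toy sketch of that combination follows; the records, field names, and the `search` function are invented for illustration and do not represent the EBSCO Discovery Service API.

```python
# Hypothetical records, for illustration only.
RECORDS = [
    {"title": "Metadata for digital collections", "author": "Smith, A.",
     "subjects": ["Metadata", "Digital libraries"], "full_text": True, "language": "eng"},
    {"title": "Discovery services in academic libraries", "author": "Jones, B.",
     "subjects": ["Discovery services"], "full_text": False, "language": "eng"},
    {"title": "Granular metadata and discovery", "author": "Lee, C.",
     "subjects": ["Metadata", "Discovery services"], "full_text": True, "language": "ger"},
]


def field_text(record, field):
    """Return the lower-cased text of one field (lists are joined)."""
    value = record.get(field, "")
    if isinstance(value, list):
        value = " ".join(value)
    return value.lower()


def search(records, field, query, mode="all", **limiters):
    """Fielded search with 'all', 'any', or 'phrase' mode plus exact limiters."""
    terms = query.lower().split()
    hits = []
    for record in records:
        text = field_text(record, field)
        if mode == "phrase":
            matched = query.lower() in text
        elif mode == "any":
            matched = any(term in text for term in terms)
        else:
            matched = all(term in text for term in terms)
        if matched and all(record.get(key) == value for key, value in limiters.items()):
            hits.append(record["title"])
    return hits


print(search(RECORDS, "title", "metadata discovery", mode="all"))
print(search(RECORDS, "title", "metadata discovery", mode="any", full_text=True))
print(search(RECORDS, "subjects", "metadata", language="eng"))
```

Every call requires the user to know which field, mode, and limiter fit the need, which is the translation burden the next slide describes.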
Metadata-centered Approaches
Advanced Search and Faceted Search are
powerful tools—for those who know how
to effectively use them.
They require users to translate their discovery
need into the language of metadata.
Consequently they are underutilized.
User-centered Approaches
Anticipate end user intent using clues from
the user query and user context.
Deliver targeted content at the top of the
results list where users expect.
Publication Title Placard (Beta)
Journal name searched,
exact match found.
Publication match based
on customer’s holdings.
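The placard behaviour described here (an exact match between the query and a journal name, restricted to the customer's holdings) can be sketched in a few lines. This is only an illustration of the idea under stated assumptions: the holdings dictionary, the example links, and the function names are hypothetical, and the actual matching logic of the product is not documented in the slides.

```python
def normalize(text):
    """Lower-case and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def publication_placard(query, holdings):
    """Return placard data when the whole query equals a held publication title."""
    wanted = normalize(query)
    for title, link in holdings.items():
        if normalize(title) == wanted:
            return {"title": title, "link": link}
    return None  # no exact match: show the ordinary result list only


# Hypothetical customer holdings: publication title -> access link.
HOLDINGS = {
    "Library Journal": "https://example.org/journals/library-journal",
    "Library Hi Tech": "https://example.org/journals/library-hi-tech",
}

print(publication_placard("  library   journal ", HOLDINGS))
print(publication_placard("library", HOLDINGS))
```

The user types a plain keyword query and never selects a "publication title" field; the service infers that intent from the exact match.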
Metadata Necessary But Insufficient
Supporting access to increasingly granular
chunks of content will require capture and
management of finer-grained metadata.
Metadata needs to be coupled with search
intelligence to have a broad impact.
Areas for Future Investment
• Deepen analysis of usage data to better
understand user context and expressed
granular discovery needs
• Develop intelligent bridging between user
queries and granular metadata
• Build in adaptive learning capabilities to
automatically refine intelligent search
approaches over time
Copyright © 2012 TEMIS - All Rights Reserved - Slide 65
Pioneer of Textual Big Data since 2000
• 20 languages
• In production for 12 years
• 8 Billion+ pages processed
Key customers in the Information Industry
STM, Legal, B2B, Trade, Media, Public
Textual Big Data Challenge
• News
• Market Studies
• Economic Data
• Financial Publications
• Scientific Literature
• Patents
• Books
• Conference Proceedings
• Clinical Research
• Patient Health Records
• Regulations
• Legal Decisions
• Customer Emails
• Tweets
• Blog posts
+50% per year
Textual Big Data Opportunities
• Save time with quick & easy access to the most relevant
• Improve your insight into your interest area
• Take better decisions
Textual Big Data
How to seize these opportunities?
By adding structure
• Save time with quick & easy access to the most relevant
• Improve your insight into your interest area
• Take better decisions
How do you structure?
By extracting granular & domain-specific information
Example sentence: We report a 52 year-old man presenting an acute hair loss induced by carbamazepine (CBZ) in concentration of 8.6 microg/ml.
• Terms (part-of-speech tags): Pro Verb Num Art N-P Noun Verb Art Adj Nn Nn Verb Pp PropNn Pp Noun Pp Num Unit Abbr
• Entities: Drug Name
• Relations: Verb Patient Verb Symptom Verb Dosage information Subj
• Attributes / Roles
– Adverse Event / Side Effect: Alopecia
– Cause: Carbamazepine
– Dosage: 8.6 microg/ml
– Patient: 52 year old male
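To show what filling these roles from raw text involves, here is a deliberately small pattern-based sketch applied to the example sentence. It is not how Luxid or any TEMIS product works (the slides do not describe the implementation); the regular expressions, the two tiny dictionaries, and the function name are assumptions chosen to fit this one sentence.

```python
import re

SENTENCE = ("We report a 52 year-old man presenting an acute hair loss "
            "induced by carbamazepine (CBZ) in concentration of 8.6 microg/ml.")

# Tiny illustrative dictionaries; a real system would use large terminologies.
DRUGS = {"carbamazepine": "Carbamazepine"}
SYMPTOMS = {"hair loss": "Alopecia"}

PATIENT = re.compile(r"(\d+)[ -]year[ -]old (man|woman)")
DOSAGE = re.compile(r"(\d+(?:\.\d+)?)\s*(microg/ml|mg/ml)")
INDUCED = re.compile(r"induced by (\w+)")


def extract_roles(sentence):
    """Fill the adverse-event roles found in one sentence."""
    text = sentence.lower()
    roles = {}
    match = PATIENT.search(text)
    if match:
        sex = "male" if match.group(2) == "man" else "female"
        roles["patient"] = {"age": int(match.group(1)), "sex": sex}
    match = DOSAGE.search(text)
    if match:
        roles["dosage"] = f"{match.group(1)} {match.group(2)}"
    match = INDUCED.search(text)
    if match and match.group(1) in DRUGS:
        roles["cause"] = DRUGS[match.group(1)]
    for phrase, label in SYMPTOMS.items():
        if phrase in text:
            roles["adverse_event"] = label
    return roles


roles = extract_roles(SENTENCE)
print(roles)
```

The mapping from "hair loss" to "Alopecia" is the domain-specific step: the normalized label, not the surface wording, is what ends up in the metadata.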
A case of hair loss induced by carbamazepine
Kohno Y, Ishii A, Shoji S, Department of Clinical Neurology, Tsukuba University.
We report a 52 year-old man presenting with an acute considerable hair loss induced by carbamazepine (CBZ). The remarkable scalp hair loss started within a week after CBZ administration. There was no evidence of dermatitis or allergic reaction, or other cause for the hair loss. The serum concentration of CBZ was 8.6 microg/ml (therapeutic range 8-12 microg/ml). CBZ was discontinued, and the hair loss stopped within several days with new hair growth. Medication-induced hair loss is an occasional adverse effect of many drugs used for neuropsychological diseases. CBZ also induces hair loss and its frequency was reported below 2%. Only a limited number of detailed case reports describing CBZ-induced hair loss were available, and we found these cases could divide into two groups with regard to a delay in starting hair loss after administration of CBZ. In one group, the hair loss started within a week suggesting anagen effluvium and in another it started after two or three months suggesting telogen effluvium. This finding suggests the causative mechanism of CBZ-induced hair loss is not unitary.
How does this help? Enrich document metadata
• Diseases: Neuropsychological diseases
• Dosage information: 8.6 microg/ml; 8-12 microg/ml
• Patient information: man, 52 year old
• Organizations: Dept of Clinical Neurology, Tsukuba University
• People: Kohno Y; Ishii A; Shoji S
• Drugs: Carbamazepine; CBZ
• Symptoms: Alopecia; Dermatitis; Allergic reaction; Anagen effluvium; Telogen effluvium
• Side-effect relationships: Drug-induced alopecia
How does this help? Exploiting enriched metadata
• Extraction: Luxid® Annotation Server
• Metadata / Triples
• Indexing: Search Engine index
• Applications: Search, Analysis, Visualization, Facets, Recommendations, Portals, Topic Pages, Decision Support
• Linked Data, Knowledge Bases
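The flow from extraction to applications can be sketched generically: annotations become (document, predicate, value) triples, and the triples are indexed so that facets and counts come for free. The code below is an assumption-laden illustration, not the Luxid pipeline; "doc-1" reuses entities from the abstract above, while "doc-2", the predicate names, and the function names are hypothetical.

```python
from collections import defaultdict


def to_triples(doc_id, annotations):
    """Flatten one document's annotations into (doc, predicate, value) triples."""
    return [
        (doc_id, predicate, value)
        for predicate, values in annotations.items()
        for value in values
    ]


def build_facet_index(triples):
    """Index triples as predicate -> value -> set of document ids."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, predicate, value in triples:
        index[predicate][value].add(doc_id)
    return index


ANNOTATIONS = {
    "doc-1": {"drug": ["Carbamazepine"], "symptom": ["Alopecia", "Dermatitis"]},
    "doc-2": {"drug": ["Carbamazepine"], "symptom": ["Telogen effluvium"]},  # hypothetical
}

triples = []
for doc_id, annotations in ANNOTATIONS.items():
    triples.extend(to_triples(doc_id, annotations))

index = build_facet_index(triples)
for value, docs in sorted(index["symptom"].items()):
    print(f"symptom = {value}: {len(docs)} document(s)")
```

The same index answers a faceted query ("drug = Carbamazepine") and supplies the counts displayed next to each facet value.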
Faceted search/navigation
• Sources/Content Types
• Scan Types
• Anatomical Regions
• Pathologies / Indications
Faceted search/navigation
• Nature of the offense
• Nature of the sentence
• Numerical data regarding the sentence
• Numerical data regarding the context
Topic Page
• Corresponding Scans & Illustrations
• Differential Diagnosis Support
• Relevant Content
Topic Page
Semantic Analytics
Analytics – Competitive Intelligence - Product/Supplier map
Semantic Analytics
Analytics – Corporate Relationships
NISO Webinar • March 11, 2015
Questions?
All questions will be posted with presenter answers on the NISO website following the webinar:
http://www.niso.org/news/events/2015/webinars/granularity_pt1/