NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider World: Successful...
Smart Content applications at Elsevier
Michael Lauruhn, Disruptive Technology Director, Elsevier
Michael Lauruhn
December 3, 2014
Smart Content applications at Elsevier
NISO/NFAIS Virtual Conference:
Connecting the Library to the Wider World - Successful Applications of Linked Data
Introduction & Agenda
Smart Content & Linked Data at Elsevier
Background
Key Components of Smart Content
Current Examples
Project Planning Considerations
Introduction: Smart Content & Linked Data
Elsevier Content
Componentized text
Data
Multimedia
3rd Party Linked data
Web Open data
Vocabulary
Smart Content infrastructure in practice
Trial: NCT00623103
Serious Adverse events:
Atrial fibrillation
med:drugs Rivastigmine
Elsevier
Delirium treatment: An unmet challenge
Rivastigmine, a cholinesterase inhibitor, has been used to
treat delirium in elderly patients with stroke. 1 A biologically
plausible premise—that impaired cholinergic transmission
might either cause or worsen delirium—led to a
randomised, placebo-controlled, double-blind trial by
Maarten van Eijk and colleagues 2 in The Lancet in which
they added rivastigmine or placebo to usual treatment of
patients in intensive care. The trial was halted at 104
patients by the drug safety and monitoring board (DSMB)
because of increased mortality (12/54 in the rivastigmine
group, 4/50 in the placebo group; p=0·07) and a worse
outcome. The rivastigmine group …
foaf:page
owl:sameAs
owl:sameAs
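The links in the diagram above can be read as subject, predicate, object statements. The following is a minimal sketch of that idea in plain Python, not Elsevier's implementation. The `ex:`, `vocab:`, `trial:` and `art:` prefixes, the `ex:mentions` and `ex:seriousAdverseEvent` predicates, and the exact choice of which nodes are linked are assumptions made for illustration; only `med:drugs`, `foaf:page`, `owl:sameAs`, the trial number and the drug name come from the slide.

```python
# Hypothetical triples: prefixes and the precise links are assumptions.
TRIPLES = {
    ("trial:NCT00623103", "med:drugs", "ex:Rivastigmine"),
    ("trial:NCT00623103", "ex:seriousAdverseEvent", "ex:AtrialFibrillation"),
    ("ex:Rivastigmine", "owl:sameAs", "vocab:Rivastigmine"),
    ("art:DeliriumTreatment", "ex:mentions", "vocab:Rivastigmine"),
    ("art:DeliriumTreatment", "foaf:page", "ex:article-landing-page"),
}


def same_as_closure(node, triples):
    """Return node plus every node reachable via owl:sameAs, in either direction."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != "owl:sameAs":
                continue
            # owl:sameAs is symmetric, so follow the link both ways.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def related_subjects(node, triples):
    """Subjects that point at node or at any of its owl:sameAs equivalents."""
    equivalents = same_as_closure(node, triples)
    return sorted(
        {s for s, p, o in triples if o in equivalents and p != "owl:sameAs"}
    )


related = related_subjects("ex:Rivastigmine", TRIPLES)
```

Under these assumptions, the identity link is what lets a clinical trial record and a journal article be found from the same drug concept even though they use different identifiers for it.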
Smart Content as Infrastructure
Product Development & Enhancement
• More accurate search results
• Faceted navigation
• Improved content discoverability
Content Analytics
• New insights and abilities to take inventory about what we publish
• Identification of co-occurring terms
• Link to related external content & data
Personalization
• Individual content recommendations
• Targeted individual marketing
Editorial Productivity
• Flexible product types – new collections, image banks, etc.
• Increased speed to market
Key Components of Smart Content
Vocabulary Example: EMMeT
EMMeT
UMLS
SNOMED
ICD9
ICD10
MeSH
LOINC
Gold Standard (Drugs)
Elsevier Custom Resources
Multi-language taxonomy:
>1 million concepts
>3 million synonyms
Classes include:
Anatomy
Diseases
Drugs
Symptoms
Procedures
Sourced from several standardized vocabularies
Medical Name
Malignant Neoplasm of the Breast
Consumer Friendly Name
Breast Cancer
Synonyms
Malignant Tumor of Breast
Malignant Breast Neoplasm
Breast Ca
Codes
ICD9 – 174.9
MeSH – D001943
SNOMED-CT – 190121004
Semantic Type/Group
Neoplastic Process/Disease
• Breast Disorders
• Cancer of the Thorax
• Mammary Neoplasms
• More….
• Breast Sarcoma
• Familial Breast Cancer
• Malignant lymphoma of the Breast
• Malignant Neoplasm of the breast outer quadrant
• More…
Semantic Relationships
Symptoms: Breast Lump, Nipple Retraction, …
Diagnostic Procedures: Mammography, Breast Biopsy, …
Treatment Procedures: Chemotherapy, Mastectomy, …
Medications: Tamoxifen, Doxorubicin, …
Risk Factors: Family History, Genetics, Predisposition, …
Prevention: Screening, Preemptive Mastectomy, …
Complications: Metastatic Cancer, …
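The concept record above can be sketched as a small data structure. This is an illustrative model only, not the EMMeT schema: the `Concept` class, its field names and the `lookup` helper are hypothetical. The names, synonyms, codes and a subset of the relationship values are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """Hypothetical record shape for one taxonomy concept."""
    medical_name: str
    consumer_name: str
    synonyms: List[str] = field(default_factory=list)
    codes: Dict[str, str] = field(default_factory=dict)
    relationships: Dict[str, List[str]] = field(default_factory=dict)

    def labels(self) -> List[str]:
        """All strings under which this concept should be findable."""
        return [self.medical_name, self.consumer_name, *self.synonyms]


breast_cancer = Concept(
    medical_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    relationships={
        "Symptoms": ["Breast Lump", "Nipple Retraction"],
        "Medications": ["Tamoxifen", "Doxorubicin"],
    },
)


def build_label_index(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every case-folded label to its concept."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for label in concept.labels():
            index[label.casefold()] = concept
    return index


INDEX = build_label_index([breast_cancer])


def lookup(term: str) -> Optional[Concept]:
    """Resolve a search term to a concept, ignoring case and outer whitespace."""
    return INDEX.get(term.strip().casefold())
```

With an index like this, a query for a consumer term or an abbreviation such as "Breast Ca" resolves to the same concept as the formal medical name, which is the behavior that synonym-rich vocabularies are meant to support in search.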
Vocabulary Example: EMMeT
EMMeT
UMLS
SNOMED
ICD9
ICD10
MeSH
LOINC
Gold Standard (Drugs)
Elsevier Custom Resources
FrEMMeT
SpEMMeT
Linked Data Repository
• Knowledgebase of semantic data
• Large scale integration of related sources of medical and scientific content and data
• High performance service layer APIs for integration into end-user products and internal platforms
Editorial & Author Keywords
Classic subject metadata
Componentized text
Robust Data models
Entity extraction
Linked Data Environment
Full-text Indexing
Semantic Annotations
Current Examples
ClinicalKey search
SciVal Funders Vocabulary
• Supports the FundRef initiative, facilitated by CrossRef, which provides a standard way of reporting funding sources for published scholarly research.
• SciVal Funding is an online solution that provides targeted recommendations on grants, making it easier for researchers to discover funding opportunities related to their area of research.
Similar Methods for Neuroscience
System to extract and index the Methods sections of articles from 100 Elsevier neuroscience journals
Built a comparison and recommendation system so readers can find and evaluate articles with “Similar Methods” to the ones presented in the current article
Similar Methods for Neuroscience
Search process targets factors:
what brain regions are being studied
what organism is being used
what methodologies are being employed
what disease model is the focus of the study
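One way to picture a comparison across the four factors listed above is a per-factor set overlap. The sketch below is an assumption, not the method used in the Elsevier system: it averages Jaccard similarity over the four facets, and the facet keys, the equal weighting and the sample articles are all hypothetical.

```python
# Facet names mirror the four factors on the slide; the keys are hypothetical.
FACETS = ("brain_regions", "organisms", "methodologies", "disease_models")


def jaccard(a, b):
    """Overlap of two term collections: |intersection| / |union|, 0.0 if both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def methods_similarity(x, y):
    """Unweighted mean of the per-facet Jaccard scores (an assumed scoring rule)."""
    return sum(jaccard(x.get(f, ()), y.get(f, ())) for f in FACETS) / len(FACETS)


def rank_similar(current, candidates):
    """Candidate ids ordered by descending similarity; zero-score candidates are dropped."""
    scored = [(methods_similarity(current, c), c["id"]) for c in candidates]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ordered if score > 0]


# Made-up articles with terms already extracted from their Methods sections.
article_a = {
    "id": "A",
    "brain_regions": {"hippocampus"},
    "organisms": {"rat"},
    "methodologies": {"patch clamp"},
    "disease_models": {"epilepsy"},
}
article_b = {
    "id": "B",
    "brain_regions": {"hippocampus"},
    "organisms": {"rat"},
    "methodologies": {"fMRI"},
    "disease_models": set(),
}
article_c = {
    "id": "C",
    "brain_regions": {"cerebellum"},
    "organisms": {"mouse"},
    "methodologies": {"EEG"},
    "disease_models": {"ataxia"},
}

ranking = rank_similar(article_a, [article_b, article_c])
```

Here article B shares two of the four facets with article A and article C shares none, so only B is recommended. A production system would likely weight facets differently and use vocabulary hierarchies rather than exact term matches.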
Leveraging Wikipedia for Neuroscience
• Pilot project that identifies concepts from a Neuroscience topics vocabulary
• Provides Wikipedia definitions to add context around the article’s significant concepts
Additional context for Energy terms
• A ‘dictionary app’ using portions of the Encyclopedia of Energy (1818 terms)
• Available for articles from Applied Energy and Energy Conversion and Management; additional pilots planned.
Example: http://www.sciencedirect.com/science/article/pii/S0306261913001888
Terms from the dictionary are highlighted in the article. When the reader clicks on a term, its definition from the dictionary is shown in the feature (right-hand pane).
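The highlighting behavior described above can be approximated with dictionary-driven matching. This is a minimal sketch, assuming whole-word, case-insensitive matching with longer terms taking priority. The `[[...]]` marker syntax, the function names and the two sample entries are hypothetical and are not taken from the Encyclopedia of Energy.

```python
import re


def build_pattern(terms):
    """Compile one alternation, longest terms first so multi-word terms win."""
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def highlight(text, dictionary):
    """Wrap dictionary terms in [[...]] and return the definitions that were hit."""
    if not dictionary:
        return text, {}
    pattern = build_pattern(dictionary)
    definitions = {term.casefold(): d for term, d in dictionary.items()}
    hits = {}

    def mark(match):
        key = match.group(0).casefold()
        hits[key] = definitions[key]
        return "[[" + match.group(0) + "]]"

    return pattern.sub(mark, text), hits


# Hypothetical dictionary entries for illustration only.
sample_dictionary = {
    "heat pump": "A device that transfers heat from a cooler to a warmer space.",
    "heat": "Energy transferred because of a temperature difference.",
}

marked_text, shown_definitions = highlight("A heat pump moves heat.", sample_dictionary)
```

Sorting terms by length before building the pattern keeps "heat pump" from being highlighted as just "heat". The returned definitions are what a side pane would display when a highlighted term is clicked.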
Project Planning Considerations
Get to a Use Case early
• Get stakeholders invested
• Think about what users currently do… and what they can do better
• Focus on Use Cases to stay centered and to identify priorities when making decisions.
Particularly helpful when introducing a new infrastructure to an organization
Quality & Reliability of Resources
• Integration with third party content, data models and vocabularies requires a vetting process:
Are they accurate?
Are they trustworthy?
Are they current?
Are they sustainable?
Warning: Some of the more attractive resources on the web are one-off projects that are no longer maintained
Ongoing Maintenance
• As knowledge models and vocabularies grow, resources are needed to keep them current
• Governance policy should account for sources for new concepts, terminology and relations:
New content types
Search logs
New trends & discoveries
These require resources (people’s time) that need to be factored into the total cost of ownership
Quality & Testing
• Applying semantic web technologies to applications is not exclusively an IT solution:
Sponsors, stakeholders and subject experts need to contribute to and shape the vocabularies and the application functionality
The fine tuning for some of these applications can be surprisingly manual
It’s important not to get distracted by the outliers and corner cases
Installing and implementing these technologies out of the box (OOTB) is getting easier… Quality is where it gets hard
Quality & Testing
• Test sets are essential
Real content
Real use cases
Scores that show accuracy and measure improvement
Our SMEs: Our Application:
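A common way to produce scores like those mentioned above is to compare the application's annotations with annotations made by subject matter experts on the same documents. The sketch below computes micro-averaged precision, recall and F1. It illustrates one standard approach rather than the scoring used in the talk, and the document ids and concept labels are made up.

```python
def score(gold, predicted):
    """Micro-averaged precision, recall and F1 over per-document concept sets."""
    true_pos = false_pos = false_neg = 0
    for doc_id in gold.keys() | predicted.keys():
        expert = gold.get(doc_id, set())
        system = predicted.get(doc_id, set())
        true_pos += len(expert & system)   # concepts both agree on
        false_pos += len(system - expert)  # concepts only the application assigned
        false_neg += len(expert - system)  # concepts the application missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical test set: expert annotations versus application output.
sme_annotations = {
    "doc1": {"Breast Cancer", "Tamoxifen"},
    "doc2": {"Delirium"},
}
application_output = {
    "doc1": {"Breast Cancer"},
    "doc2": {"Delirium", "Stroke"},
}

results = score(sme_annotations, application_output)
```

Re-running the same test set after each vocabulary or rule change shows whether a change improved the scores or made them worse, which is the "measure improvement" part of the slide.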
Other lessons & observations
• Don’t forget to look at opportunities for internal applications
Consider internal workflows
Look for efficiency enhancements
Look for discovery opportunities
• Start small
Get some early proof of concepts that you can share with stakeholders before tackling bigger challenges