TRANSCRIPT
NDLI: LOOKING BEYOND THE HORIZON
Development of National Digital Library of India
Towards Building a National Asset
Dr. Plaban Kumar Bhowmick
Co-PI, NDL Project, NME-ICT, MHRD
Indian Institute of Technology, Kharagpur
LINE 2017 - LIBRARIES IN NEXT ERA, KALYANI UNIVERSITY
16-DEC-2017
NMEICT: National Mission on Education Through Information and Communication Technology
16/12/2017 IIT, Kharagpur
The current Vision
Technology to Realize current Vision
What’s there beyond the horizon…..?
Initiatives to realize the future
BUILD UP
NATIONAL DIGITAL LIBRARY OF INDIA
AS A NATIONAL KNOWLEDGE ASSET –
THE KEY DRIVING FORCE FOR EDUCATION, RESEARCH, INNOVATION, AND
KNOWLEDGE ECONOMY IN INDIA
NDL Vision
TO CREATE A 24X7-ENABLED INTEGRATED NDL
AS A UBIQUITOUS DIGITAL KNOWLEDGE SOURCE OF THE NATION – CATERING TO
IMMERSIVE E-LEARNING FOR
ALL LEARNERS AT ALL LEVELS IN ALL AREAS
TO INITIATE A MOVEMENT FOR INTEGRATED DIGITAL LEARNING ACROSS INDIA
NDL Mission
INCLUSIVE
&
OPEN
NDL Motto
National Digital Library: Issues
User-side
• Wide geographic expanse & large population
• Huge number of students
• Large number of institutions
• Varied linguistic diversity
• Severe lack of teachers
Provider-side
• Wealth of digital content: Books and Articles, ETD, Question Papers and Solutions, Video Lectures - MOOCs, Simulations & Animations, NMEICT Projects, Data, …
• No single-window search
• Google search uses keyword – no metadata search
• Widely varied DL technology
• Lack of interactivity, vernacular support
• Low integration between content and learning system
• Weak ecosystem between learners and teachers
Presentation Model
Not a new library – an umbrella
Collects and ingests metadata only
Presents full-text from source view
Provides:
Search
Browse
NDLIndia Live: https://ndl.iitkgp.ac.in/
NDL Project: http://www.ndlproject.iitkgp.ac.in/ndl/
NDLIndia on Facebook: https://web.facebook.com/NDLIndia/
NDLIndia on YouTube: https://www.youtube.com/watch?v=LEwAyHGKeLw
https://www.youtube.com/watch?v=qIZB-G9ywF0
https://www.youtube.com/watch?v=UCoJwfPrQFs&t=115s
TARGETS
CONTENTS,
STAKEHOLDERS,
CONTRIBUTORS,
USERS,
ARCHITECTURE, AND
THE BIG PICTURE
Objective and Scope
• Books are for use
• Every reader his [or her] book
• Every book its reader
• Save the time of the reader
• The library is a growing organism
Objectives
Create a 24X7-enabled Infrastructure for NDL with single window search facility – To include h/w systems, networks, s/w tools, applications and interoperability standards
Harvest IDRs (Institutional Digital Repository) across institutions of the nation to provide integrated access
Facilitate select institutes to disseminate existing content and create new digital content
Provide support for immersive E-learning environments at multiple levels spanning across
• All academic levels – school to college to university to life-long learning
• All disciplines – Science, Arts, Humanities, Engineering, Medical, Law, and
• All languages (vernacular) used as medium of instruction.
Support interfaces in vernacular & for differently abled users
Digital Contents
Digital, Surrogate Digital, Metadata Digital, etc.
Content at NDL
Born-digital object
Digital surrogate of a physical object
Digital metadata of physical object
Metadata at NDL
NDL does not store contents
NDL only ingests metadata for Search & Browse
Content (Full-text) is delivered from Source
A content item is included in NDL (that is, its metadata is ingested) if it is expected to have educational value
Range of Contents
Institutional Digital Repository of Contributing Institutes
• Faculty Publications, ETD (Electronic Thesis & Dissertation): DSc-PhD-Masters-Undergrad, Research Projects
• Books & Periodicals, Open Access Journals, E-Books & Subscribed E-Resource
• Annual Reports, Project Reports, Convocation, Working Papers, Others
• Encyclopaedia, Dictionaries, Directories, Others
• Lecture Slides, Videos, Class Notes, Courseware
• Term Papers, Assignments, Solutions
• Lab Experiments, Manuals, Case Studies
• Datasets, Benchmarks, Models, Maps, Software
• Audio & Video Content
• Manuscripts, Painting, Sculpture, Music, Dance, Drama
• Question Banks (JEE / GATE / NET / CAT), Model Answers
Institutions of School & Higher Education, Boards
Research and Professional Institutions, Central / State University
Institutional and Open Contributions. Multi-modal, Multi-faceted
Content View Architecture
[Figure: content view architecture. Labels: Content Baseline; Generic Interface and Search; Vertical-Specific Custom Interface and Search; School Vertical; Domain Vertical (Medical/Legal/…); Competitive Exam Vertical; Data Vertical; Application Vertical; …; App Launcher; MCQ / MSQ / …; Textbook, Lesson View]
Stakeholders
Roles and Responsibilities
Stakeholder: Roles and Responsibility
• Government (Ministries / Departments, R & D Labs): 1. Sponsor and facilitator 2. Content Contributor
• Institutions (Public / Private; Academic / R & D / Educational): 1. Host Institution – IIT Kharagpur 2. Contributing Institution – Supporting IDRs 3. Participating Institution – Providing Users & Feedback
• Public (NGOs, Individuals): 1. Use and Feedback 2. Metadata by Crowd Sourcing 3. Content by Crowd Sourcing
• Industry: 1. Technology Providers
• Publishers: 1. Metadata Provider 2. Content Provider (under various licensing schemes)
Contributors
CFTI, State and Central Universities, R & D Labs, Govt. Depts, Free Portals, Publishers, etc.
130 Contributors and counting
Users & Access
Individual
Institutional
Registration is Open to all
Registration Types:
Individual
Registers directly
Institutional
By request from Institution
Managed by authenticated Nodal Person
Convenient for bulk upload of users
College Life-long Learner
The Big Picture
NDL:
Content Repository
LMS:
Content Delivery for Learning
VUC:
Certification & Credit Transfer
National Digital Library of India
MOOCs
LMS
Virtual University
and Certification
The current Vision
Technology to Realize current Vision
What’s there beyond the horizon…..?
Initiatives to realize the future
METADATA ENGINEERING
SOFTWARE ARCHITECTURE
MULTI-LINGUAL INTERFACE
EXPERIENCE TRACKING
Technology of NDL
NDL Data Model
Challenges in Metadata Engineering for NDL
Wide category of resources
Generic metadata or domain specific?
Openness of repository
Closed metadata standard may fail to describe a new resource
Scale is enormous
Manual annotation is infeasible
Automatic annotation guided by crowd sourcing?
Metadata Specification Requirement
To describe any digital resource
• Generic content metadata: Contributor, Description, Language, Format, etc.
To describe domain specific resources
• Educational content metadata: Educational level, ToC, Type of learning material, etc.
• Medical domain: Disease, Patient condition, case studies, etc.
• Thesis metadata: Institution, advisor, degree, researcher
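One way to picture this requirement is a record with a generic section plus optional domain-specific sections. The sketch below is illustrative only: the class name `MetadataRecord`, the `extensions` field, the key names, and all sample values are assumptions, not the actual NDL schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical record: generic fields plus per-domain extensions."""
    contributor: str
    description: str
    language: str
    format: str
    # Domain-specific blocks keyed by domain name, e.g. "educational", "thesis".
    extensions: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def domains(self) -> List[str]:
        """Return the domains this record carries extra metadata for."""
        return sorted(self.extensions)

    def flatten(self) -> Dict[str, str]:
        """Merge generic and domain fields into one dict with prefixed keys."""
        flat = {
            "contributor": self.contributor,
            "description": self.description,
            "language": self.language,
            "format": self.format,
        }
        for domain, fields in self.extensions.items():
            for key, value in fields.items():
                flat[f"{domain}.{key}"] = value
        return flat


# Sample values only; not real catalogue data.
thesis = MetadataRecord(
    contributor="Example University",
    description="A sample doctoral thesis",
    language="en",
    format="application/pdf",
    extensions={
        "thesis": {"institution": "Example University", "degree": "PhD"},
        "educational": {"educationalLevel": "postgraduate"},
    },
)
```

Keeping the generic fields fixed and the domain blocks open-ended is one possible answer to the "generic or domain specific?" question: a new resource type adds a new extension block without changing the generic part.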
Metadata Envelope
Shodhganga (thesis)
pedagogicObjective
keyword
NDL Metadata
http://www.ndlproject.iitkgp.ac.in/ndl/header.php?mname=Metadata%20Schema
NDL Metadata Envelope
Acquisition Scenarios
Locate Content
Acquire Metadata
• Harvest Institutional IDRs
• Crawl Websites
• In Bulk – from Publishers
• Donated by Source
• Source-supported API
• …
Creation: Manual, Automated
Translation: Format, Standard / Schema
Curation: Manual, Assisted
Ingestion
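The translation step between acquisition and curation can be sketched as a field crosswalk from a source schema to a target schema. This is a minimal illustration under assumptions: the `CROSSWALK` table, the target field names, and the sample record are hypothetical; only the DSpace-style `dc.*` source names follow a common repository convention.

```python
from typing import Dict, List, Tuple

# Hypothetical crosswalk: source field name -> target field name.
CROSSWALK: Dict[str, str] = {
    "dc.title": "title",
    "dc.contributor.author": "author",
    "dc.language.iso": "language",
}


def translate(source: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Map known source fields to the target schema.

    Returns the translated record and the list of unmapped source fields,
    which a curator (manual or assisted) would review before ingestion.
    """
    target: Dict[str, str] = {}
    unmapped: List[str] = []
    for key, value in source.items():
        if key in CROSSWALK:
            target[CROSSWALK[key]] = value.strip()
        else:
            unmapped.append(key)
    return target, sorted(unmapped)


# Sample harvested record; values are placeholders.
harvested = {
    "dc.title": " Sample Title ",
    "dc.contributor.author": "Sample Author",
    "dc.rights": "sample rights statement",
}
record, needs_review = translate(harvested)
```

Returning the unmapped fields rather than silently dropping them gives the curation stage something concrete to act on.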
Smart Metadata Curation Workflow
Software Architecture
Experience Tracking and UI
Experience Tracking
To offer customized search results
Multi-lingual Support
Reduce cognitive load for native use
Personalization
Customized UI to suit user grade
Source: http://www.brandquarterly.com/tracking-customer-experience-essential
Experience Tracking Technology
Record:
1. Partha searched ‘tiger’
2. Partha navigated to Tiger Wiki
3. Partha studied Wiki (3 min)
4. Partha downloaded tiger image (4 images)
5. Partha checked tiger map (2 min)
6. Partha enlarged map at Sunderbans (twice)
7. Partha searched ‘national animal of India’
Infer:
• Partha learnt – tiger is the national animal (of India)
• … possibly
Experience API (xAPI) / Tin Can API
Connects learning content and learning systems to record and track all types of learning experiences
Learning Record Store (LRS)
Stores learning experiences
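The recorded steps map naturally onto xAPI's actor-verb-object statements. The sketch below builds such statements as plain dictionaries; the helper name `make_statement` and every IRI under example.org are placeholders chosen for illustration, not registered xAPI verbs or anything NDL is known to use.

```python
import json
from typing import Dict, Optional


def make_statement(actor: str, verb: str, obj: str,
                   duration: Optional[str] = None) -> Dict:
    """Build an actor-verb-object statement shaped like an xAPI statement.

    The IRIs under example.org are placeholders, not registered xAPI verbs.
    """
    statement = {
        "actor": {"name": actor, "mbox": f"mailto:{actor.lower()}@example.org"},
        "verb": {
            "id": f"http://example.org/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {"id": f"http://example.org/activities/{obj}"},
    }
    if duration is not None:
        # ISO 8601 duration, e.g. PT3M for three minutes.
        statement["result"] = {"duration": duration}
    return statement


records = [
    make_statement("Partha", "searched", "tiger"),
    make_statement("Partha", "studied", "tiger-wiki", duration="PT3M"),
    make_statement("Partha", "searched", "national-animal-of-india"),
]
# Serialized form, as it might be sent to a Learning Record Store.
payload = json.dumps(records, indent=2)
```

An inference step such as "Partha learnt that the tiger is the national animal" would then run over the stored statements rather than over raw click logs.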
Source: https://tincanapi.com/overview/
UI Technology: Experience Tracking
[Figure: experience tracking architecture. Components: LRS, NDL Repository, LMS, LMS Front End, Search/Browsing, Visualization & Analytics, LMS Tracker, Search Tracker]
UI Technology: Web Interface
API End Points:
• RESTful API endpoint
• CMS API, Index API and LMS API
• Parameterized access to the index
• API Sandbox
Web interface is an app using NDL API
Extended Search Features
• Facet based search result refinement
• DDC Topic tree based content browsing
• Tag, comment on a content
• Target group specific interface
• Bookshelf
• Rating and sharing
NDL-Specific Features
• Multi-lingual enablement
• Multi-lingual query interface powered by Google transliteration
• Cross-lingual search
• Personalization
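Facet based refinement, listed among the extended search features, can be illustrated over an in-memory result list. This is a generic sketch, not the NDL API: the function names `facet_counts` and `refine`, the facet names, and the sample hits are all assumptions.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]


def facet_counts(records: List[Record], facet: str) -> Dict[str, int]:
    """Count how many records fall under each value of one facet."""
    return dict(Counter(r[facet] for r in records if facet in r))


def refine(records: List[Record], **selected: str) -> List[Record]:
    """Keep only records matching every selected facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in selected.items())]


# Placeholder search hits.
hits = [
    {"title": "Sample A", "language": "Bengali", "type": "book"},
    {"title": "Sample B", "language": "Hindi", "type": "video"},
    {"title": "Sample C", "language": "Bengali", "type": "video"},
]
```

A front end would show `facet_counts(hits, "language")` next to the results and call `refine(hits, language="Bengali", type="video")` when the user ticks both boxes.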
The current Vision
Technology to Realize current Vision
What’s there beyond the horizon…..?
Initiatives to realize the future
Development of Large Scale Repository
Current Status:
• A single search index caters to all the domains
• A single metadata schema caters to all
• e.g., same metadata schema used to index education, medical, cultural domain
Issues:
• Inefficiency in retrieval (relevance and retrieval time)
• Domain specific information is lost
Development of Large Scale Repository
Solution:
• Distributed indices each targeting a specific domain
[Figure: User Query, Intelligent Query Forwarding, domain indices (Education, Medical, Culture), Search Result Aggregation, Search Results]
Development of Large Scale Repository
Intelligent Query Forwarding or Selective Search
• Identify domain of query
• Challenging as query contexts tend to be brief
Search Result Aggregation
• Aggregating facets that are domain specific
• Subject Categorization (MeSH or DDC)
• On-the-fly mapping without compromising response time
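Query forwarding and result aggregation over per-domain indices can be sketched in a few lines. Everything here is a toy under stated assumptions: the cue-word table `DOMAIN_CUES`, the in-memory `INDICES`, and the scores are invented for illustration, and the toy indices return all their entries regardless of the query text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical cue words per domain; a real system would learn these.
DOMAIN_CUES: Dict[str, Set[str]] = {
    "education": {"lecture", "syllabus", "exam"},
    "medical": {"disease", "patient", "symptom"},
    "culture": {"music", "dance", "manuscript"},
}

# Toy per-domain indices: domain -> list of (title, score).
INDICES: Dict[str, List[Tuple[str, float]]] = {
    "education": [("Sample lecture notes", 0.9)],
    "medical": [("Sample case study", 0.8), ("Sample disease atlas", 0.6)],
    "culture": [("Sample manuscript scan", 0.7)],
}


def route(query: str) -> List[str]:
    """Pick domains whose cue words overlap the query; fall back to all."""
    words = set(query.lower().split())
    matched = [d for d, cues in DOMAIN_CUES.items() if words & cues]
    return matched or list(DOMAIN_CUES)


def search(query: str) -> List[Tuple[str, str, float]]:
    """Query only the routed indices and merge results by score."""
    merged = [(domain, title, score)
              for domain in route(query)
              for title, score in INDICES[domain]]
    return sorted(merged, key=lambda hit: hit[2], reverse=True)
```

The fallback to all domains reflects the difficulty noted above: a brief query such as a single word often carries no domain cue, so the router cannot safely prune any index.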
Semantic Search
Books where the author is Satyajit Ray
Books where the subject is Satyajit Ray
Semantic Search
[Figure: two annotated example queries]
• "video lectures" (Learning Resource Type), "IIT Kharagpur" (Source Organization), "Computer Science" (Subject Domain)
• "video tutorial" (Learning Resource Type), "A. Basu" (Author), "NPTEL" (Source), "Data structure" (Keyword/Subject)
Connecting words shown in the figure: on, by, of, from
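Tagging query phrases with metadata fields, as in the annotated examples, can be approximated with a greedy longest-match lookup. This is a hedged sketch: the `GAZETTEER` table and the `annotate` function are hypothetical, and the word order of any test query is an assumption, since the slide shows only the tagged phrases.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteer: phrase -> metadata field it most likely refers to.
GAZETTEER: Dict[str, str] = {
    "video lectures": "Learning Resource Type",
    "video tutorial": "Learning Resource Type",
    "iit kharagpur": "Source Organization",
    "computer science": "Subject Domain",
    "a. basu": "Author",
    "nptel": "Source",
    "data structure": "Keyword/Subject",
}


def annotate(query: str, max_len: int = 3) -> List[Tuple[str, str]]:
    """Greedy longest-match tagging of query phrases against the gazetteer."""
    tokens = query.lower().split()
    tags: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in GAZETTEER:
                tags.append((phrase, GAZETTEER[phrase]))
                i += n
                break
        else:
            i += 1  # untagged connective such as "on", "by", "from"
    return tags
```

A lookup like this cannot tell an author from a subject when the same name fits both, which is exactly the "author is Satyajit Ray" versus "subject is Satyajit Ray" ambiguity; resolving it needs the connecting words or a richer model.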
Semantic Search
A Complex Query: Works of Indian authors about European Colonization of South East Asia
[Figure: knowledge graph with a reified statement. Node labels: CreativeWork, Novel, NonFiction, Article, Author, Indian, Statement, Europe, Indonesia, SouthEastAsia. Edge labels: isA, type, nationality, isAuthorOf, asserts, subject, predicate, object. The reified Statement uses the predicate "colonized".]
Result: OCEAN OF CHURN
Semantic Search
[Figure: semantic search components. Labels: Core Knowledge Graph, Library Resource Description, External Knowledge Graph, Unstructured Text, Semantic Analysis of Query, Query]
Semantic Search
Indonesia isPartOf SouthEastAsia
Semantic Search
Linked Open Data (LOD) Cloud
The data is out there; we have to use it.
http://lod-cloud.net/versions/2017-08-22/lod.png
Semantic Search: Research Challenges
Linking Data to LOD Cloud
• Linking entities to LOD entities (entity disambiguation)
• dbr:Indonesia sameAs http://dbpedia.org/page/Indonesia
Unstructured Data to Structured Knowledge Graph Entry
• Extract entities and relationships from free text, video, image
• Indonesia experienced a long colonial history under Dutch rule (https://www.indonesia-investments.com/culture/politics/colonial-history/item178?)
• dbr:dutch dbp:colonized dbr:indonesia
Semantic Search: Research Challenges
Semantic Analysis of Query
• Identifying meaningful phrases in query
• Mapping phrases to Knowledge Graph vocabulary
• Colonization → conquer
Inference over Knowledge Graph
• Graph traversal-based inference
• Ocean_of_Churn type NonFiction, NonFiction isA CreativeWork → Ocean_of_Churn type CreativeWork
• Rule-based Inference
• X type Author AND X nationality India → X type IndianAuthor
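Both kinds of inference can be sketched as forward chaining over a small set of triples. This is a minimal illustration, assuming a plain Python set as the triple store; the function name `infer` and the placeholder individual `X` are not from any particular reasoner.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

graph: Set[Triple] = {
    ("Ocean_of_Churn", "type", "NonFiction"),
    ("NonFiction", "isA", "CreativeWork"),
    ("X", "type", "Author"),            # X is a placeholder individual
    ("X", "nationality", "India"),
}


def infer(triples: Set[Triple]) -> Set[Triple]:
    """Apply two rules until no new triple appears (forward chaining)."""
    known = set(triples)
    while True:
        new: Set[Triple] = set()
        for s, p, o in known:
            if p == "type":
                # Traversal rule: s type C and C isA D  =>  s type D.
                for c, q, d in known:
                    if q == "isA" and c == o:
                        new.add((s, "type", d))
                # Domain rule: Author with nationality India => IndianAuthor.
                if o == "Author" and (s, "nationality", "India") in known:
                    new.add((s, "type", "IndianAuthor"))
        if new <= known:
            return known
        known |= new


closed = infer(graph)
```

After the call, `closed` contains the two derived facts alongside the four original ones, so a query for IndianAuthor or CreativeWork can be answered by simple lookup.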
Crowd Sourcing for User Engagement
https://pro.europeana.eu/post/writing-the-past-transcribing-handwritten-documents-from-world-war-one
Crowd Sourcing for User Engagement
https://www.nla.gov.au/content/many-hands-make-light-work-public-collaborative-ocr-text-correction-in-australian-historic
User Engagement: Research Challenges
Effective Crowdsourcing Strategy
• Designing Hackathons that are interesting
• Incentive and motivation
• Strategy for moderation
OCR Technology
• Indian Language OCR technology still has a long way to go
Enhancing User Experience
Video Lectures
Assessment Items
Simulation
Game
Integrated Resource Presentation
Enhancing User Experience
Tools
Dataset
Experts
Research Groups
Integrated Resource Presentation: Research Challenges
Selecting Candidate Resource for Integration
• Hybrid similarity model based on Knowledge graph and unstructured text
Judging Supplementarity
• Given one resource, to what extent another resource is a supplement
• Diversity in resource modalities
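A hybrid similarity of this kind could combine entity overlap from the knowledge graph with a text similarity. The sketch below is one possible formulation, not the project's model: Jaccard overlap and bag-of-words cosine are swapped in as simple stand-ins, and the mixing weight `alpha` is an assumed tuning parameter.

```python
import math
from collections import Counter
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of knowledge-graph entities linked to two resources."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity of bag-of-words term counts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
        math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def hybrid_similarity(entities_a: Set[str], text_a: str,
                      entities_b: Set[str], text_b: str,
                      alpha: float = 0.5) -> float:
    """Weighted mix; alpha is an assumed tuning weight, not from the talk."""
    return alpha * jaccard(entities_a, entities_b) + \
        (1 - alpha) * cosine(text_a, text_b)
```

Similarity alone selects candidates; judging whether a candidate supplements a resource rather than duplicates it would need an extra signal, such as rewarding a different modality.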
Personalized and Context-based Notification
I am a traveler. I want to visit Kolkata to enjoy and have fun
Personalized and Context-based Notification
I am a research scholar on an exploration tour for my research on Kolkata
Context-based Notification: Research Challenges
Context Representation
• Demographic data: age range, ethnicity, etc.
• Dynamic data: location, current interest, tour plan, etc.
Context Profiling
• Demographic data: survey, prediction model over user behavior data
• Dynamic data: physical sensors (location, time), social sensors (Facebook, Twitter feeds)
Notification Generation
• Context profile based retrieval model
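A context profile based retrieval model can be sketched as tag overlap between a user profile and candidate notifications. This is an illustrative toy under assumptions: the `ContextProfile` class, the tag vocabulary, the `dynamic_weight` parameter, and the sample candidates are all invented here.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class ContextProfile:
    """Hypothetical profile: demographic tags plus dynamic tags."""
    demographic: Set[str]
    dynamic: Set[str]


def rank_notifications(profile: ContextProfile,
                       candidates: Dict[str, Set[str]],
                       dynamic_weight: float = 2.0) -> List[str]:
    """Score each candidate by tag overlap, weighting dynamic context more."""
    def score(tags: Set[str]) -> float:
        return (len(tags & profile.demographic)
                + dynamic_weight * len(tags & profile.dynamic))
    scored = [(score(tags), name) for name, tags in candidates.items()]
    # Highest score first; drop candidates with no overlap at all.
    return [name for s, name in sorted(scored, key=lambda p: (-p[0], p[1]))
            if s > 0]


# Placeholder notifications and two users visiting the same city.
candidates = {
    "Sample festival guide": {"kolkata", "leisure"},
    "Sample archive catalogue": {"kolkata", "research"},
    "Sample cricket schedule": {"mumbai", "leisure"},
}
traveler = ContextProfile({"leisure"}, {"kolkata"})
scholar = ContextProfile({"research"}, {"kolkata"})
```

With these profiles the traveler and the research scholar share the same location context but receive differently ordered notifications, which mirrors the two Kolkata scenarios.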