DL.Org (Digital Library Interoperability, Best Practices and Modeling Foundations)
Functionality Working Group Mtg, 29-30 June 2009, Athens
“Functionality modeling and functionality interoperability, Session 1”
Functionality and Interoperability with 5S
by Edward A. Fox
• [email protected] http://fox.cs.vt.edu
• Dept. of Computer Science, Virginia Tech
• Blacksburg, VA 24061 USA
Acknowledgements
• Mentors (Licklider, Kessler, Salton)
• Virginia Tech, CS, Digital Library Research Laboratory
• NSF and other sponsors, e.g., grants
– DUE-0840719, CCF-0722259, IIS-0535057, IIS-0325579
• Students, colleagues, co-investigators
• Robert France, Marcos André Gonçalves, Doug Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein Suleman, Ricardo da Silva Torres, ...
• Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang
Theses and Dissertations
• Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-04252007-161736/
• Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
• Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semi-automatic Mapping-based Integration of Heterogeneous Collections into Archaeological Digital Libraries: The ETANA-DL Case Study", May 2005, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
• Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
• Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June 2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
• Hussein Suleman, "Open Digital Libraries", Nov. 2002, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-11222002-155624/
• Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
• Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/
Other Selected References
• Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible Interoperability for Federated Digital Libraries. ECDL 2001, 173-186, 2001
• Hussein Suleman and Edward Fox. The Open Archives Initiative: Realizing Simple and Effective Digital Library Interoperability. J. Library Administration, 35(1/2):125-145, 2002
• Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification and Generation of Digital Libraries. JCDL 2002, 263-272
• Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
• Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
• Hussein Suleman, Edward A Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building digital libraries from simple building blocks, Online Information Review 27(5): 301-310, 2003
• M. Goncalves, E. Fox, L. Watson, N. Kipp. Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312 , 2004
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, E. A. Fox. Exploring Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful Digital Library? ECDL 2006, 208-219
Other Selected References - 2
• Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M. Wildemuth. The Core: Digital Library Education in Library and Information Science Programs. D-Lib Magazine, 12(11), Nov. 2006
• Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What is a good digital library?" - A quality model for digital libraries. Information Processing and Management, 43(5): 1416-1437, 2007
• Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
• Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox, Evaluating Digital Libraries with 5SQual. ECDL 2007: pp. 466-470
• Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
• Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries 8(2): 91-114, 2008
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems. 33(7-8): 699-723, 2008
• Barbara L. Moreira, Marcos Andre Goncalves, Alberto H.F. Laender, Edward A. Fox. Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123, 2009
Outline
• Contextual Background
– DL Definitions, Scope
– DL Curricula Efforts
– Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics
DL Definitions
• Issues and Spectra
– Collection vs. Institution
– Content vs. System
– Access vs. Preservation
– “Free” vs. Quality
– Managed vs. Comprehensive
– Centralized vs. Distributed
Information Life Cycle
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/
• Authoring, Modifying
• Organizing, Indexing
• Storing, Retrieving
• Distributing, Networking
• Retention / Mining
• Accessing, Filtering
• Using, Creating
Digital Libraries Shorten the Chain from
[Figure: publication chain through Editor, Publisher, A&I, Consolidator, Library, Reviewer]
DLs Shorten the Chain to
[Figure: Digital Library connecting Author, Reader, Editor, Reviewer, Teacher, Learner, Librarian]
DL Curric. Project
• NSF awards to VT and UNC-CH
• CS and LIS
• http://curric.dlib.vt.edu/
• http://curric.dlib.vt.edu/wiki/index.php/Main_Page
• http://curric.dlib.vt.edu/modDev/modDev.html
DL Curriculum Framework
Course structure
• Semester 1: DL collections: development/creation
• Semester 2: DL services and sustainability
Core DL topics
• Digitization, Storage, Interchange
• Digital objects, Composites, Packages
• Metadata, Cataloging, Author submission
• Naming, Repositories, Archives
• Spaces (conceptual, geographic, 2/3D, VR)
• Architectures (agents, buses, wrappers/mediators), Interoperability
• Services (searching, linking, browsing, etc.)
• Intellectual property rights mgmt., Privacy, Protection (watermarking)
• Archiving and preservation, Integrity
Related topics
• Documents, E-publishing, Markup
• Info. Needs, Relevance, Evaluation, Effectiveness
• Thesauri, Ontologies, Classification, Categorization
• Bibliographic information, Bibliometrics, Citations
• Routing, Filtering, Community filtering
• Search & search strategy, Info seeking behavior, User modeling, Feedback
• Info summarization, Visualization
• Multimedia streams/structures, Capture/representation, Compression/coding
• Content-based analysis, Multimedia indexing
• Multimedia presentation, rendering
DL Curric. Modules - 1
• Module 1-b: History of digital libraries and library automation
• Module 2-c: File Formats, Transformation, and Migration
• Module 3-b: Digitization
• Module 4-b: Metadata
• Module 5-a: Architecture overviews
DL Curric. Modules - 2
• Module 5-b: Application software
• Module 5-d: Protocols
• Module 6-a: Information needs/relevance
• Module 6-b: Online information seeking behaviors and search strategies
• Module 6-d: Interaction design and usability assessment
DL Curric. Modules - 3
• Module 7-b: Reference Services
• Module 7-g: Personalization
• Module 8-b: Web Archiving
• Module 9-c: Digital library evaluation, user studies
Interoperability Approaches
• Browsers (Mosaic)
• Federation
• Heterogeneous, Homogeneous
• Protocols (OAI-PMH)
• Repositories
• Content Standards (XML), Mapping
• Integration (ETANA)
• Services (Superimposed Information)
Integration: Challenges
• “Semantic Web” is vision, not reality.
• How can we integrate without a theory?
• How can we interoperate without a common framework?
• How can we have a science of DLs if we lack agreement on definitions (so we can reason and discuss) and measures of quality (so we can compare and improve)?
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
5S Layers
Societies
Scenarios
Spaces
Structures
Streams
5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them
5S Overview
• 5S and Generating DLs
– 5S Framework
– 5S definitions, services taxonomy, ontology
– 5SL
– 5SGraph
– 5SGen (and DL development)
– DL development of union DL, DL integration
– 5SGen into DSpace
• 5S Metamodels
– Minimal DL
– Archaeology DL
– CBIR DL
– Union DL
Streams
[Figure: Digital Library Content, by content type]
• Text Documents: Articles, Reports, Books
• Video, Audio: Speech, Music
• Geographic Information: (Aerial) Photos
• Software, Programs: Models, Simulations
• Bio Information: Genome (human, animal, plant)
• Images and Graphics: 2D, 3D, VR, CAT
Structure (Degrees, Terminology)
• Chaotic: Web
• Organized: DLs
• Structured: DBs
Digital Objects (DOs)
• Born digital
• Digitized version of “real” object
– Is the DO version the same, better, or worse?
– Decision for ETDs: structured + rendered
• Surrogate for “real” object
– Not covered explicitly in metamodel for a minimal DL
– Crucial in metamodel for archaeology DL
Databases
• 5S perspective: structures, streams, scenarios
• Extending database technology
• Structured and unstructured info
• Multimedia databases
• Link databases
• Performance, transaction processing
• Replicated storage, rollback/recovery
Spaces
User interfaces and visualization
• 2D interfaces
• 3D interfaces
• GIS
• Other paradigms
Scenarios
• Services (see later)
• Scenario based design, use cases
• Functionality
• Representation and processing for humans and machines
Societies
• User communities
– Authors, editors, teachers, students, readers
– Personal(ization), group(ware), community, global
– Accessibility, universal access
• Librarians: reference, acquisition, operations
• Research community
– Associations, conferences, publications, labs, projects
• Economics
– Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints)
– Publishers, catalogers, distributors, sustainability
– Open source, commercial, hybrid
Higher DL Constructs
• Collections
• Catalogs
• Repositories and Archives
• Services
• Systems
• Case Studies
Collections
• Terminology: set, “database”
• Distributed: basis, efficiency/effectiveness
• Parallelism: federation, harvesting
• Scale: object size, compression, replication, stream splitting
• Intelligence/processing granularity: object, cluster, collection, repository
NSDL Collections
• Discovery of content
• Classification and cataloguing
• Acquisition and/or linking; referencing
• Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged
• Access to massive real-time or archived datasets
• Software tool suites for analysis, modeling, simulation, or visualization
• Reviewed commentary on learning materials and pedagogy
Catalogs
• OPACs
• Distributed vs. centralized
• Coverage, breadth
• Specificity, depth
• Management: versioning, works
Repositories and Archives
• Naming, identifiers
• Architectures, interoperability
– OAI: harvesting
– SRU/SRW: federation
• Preservation, archives
– LOCKSS, UVC, emulation/migration
• Scalability, storage
• Institutional repositories, Open Access
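As a rough illustration of the "OAI: harvesting" item above, the sketch below builds OAI-PMH request URLs. The `ListRecords` verb and the `metadataPrefix` and `from` arguments come from the OAI-PMH protocol; the endpoint URL and the helper name `build_request` are hypothetical and only serve the example.

```python
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical endpoint


def build_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for one verb and its arguments."""
    params = {"verb": verb}
    # Skip arguments that were not supplied.
    params.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(params)


# Harvest all records as unqualified Dublin Core.
list_records_url = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Incremental harvest: "from" is a Python keyword, so pass it via a dict.
incremental_url = build_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", **{"from": "2009-06-01"}
)
```

A harvester would fetch such URLs repeatedly and follow resumption tokens; that loop is left out to keep the sketch short.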
Services
• NSDL Services
• Taxonomy of services
• Ontology, composition, reuse
• Evaluation
• Key services in-depth:
– Crawling, indexing
– Clustering, classifying
– Recommending, using social networks
– Logging
NSDL Services
• Help services, frequently asked questions, etc.
• Synchronous/asynchronous collaborative learning environments using shared resources
• Mechanisms for building personal annotated digital information spaces
• Reliability testing for applets or other digital learning objects
• Audio, image, and video search capability
• Metadata system translation
• Community feedback mechanisms
[Figure: taxonomy of DL services]
Infrastructure Services
• Repository-Building
– Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
– Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
Services Ontology: Applications
Ontology: Applications
• Expand definition of minimal DL by characterizing
– typical DL services
– in the context of “employs” and “produces” relationships
• Use characterization to:
– Reason about how DL services can be built from other DL components
– As well as be composed with other services through extension or reuse
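The “employs” and “produces” relationships above lend themselves to a small data-structure sketch. The version below is only an illustration: the class `Service`, the helpers `feeds` and `composition_chain`, and the specific concepts attached to each service are assumptions made for the example, not the ontology's actual definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    name: str
    employs: frozenset = field(default_factory=frozenset)   # concepts consumed
    produces: frozenset = field(default_factory=frozenset)  # concepts generated


def feeds(producer, consumer):
    """Concepts produced by one service that another service employs."""
    return producer.produces & consumer.employs


def composition_chain(services):
    """True if every service in the list feeds the next one."""
    return all(feeds(a, b) for a, b in zip(services, services[1:]))


# Illustrative assignments only (assumed for this sketch).
indexing = Service("Indexing", frozenset({"collection"}), frozenset({"index"}))
searching = Service("Searching", frozenset({"index", "query"}), frozenset({"digital objects"}))
rating = Service("Rating", frozenset({"digital objects", "actor"}), frozenset({"ratings"}))
recommending = Service("Recommending", frozenset({"ratings", "user model"}), frozenset({"digital objects"}))
```

With these assumed assignments, `composition_chain([indexing, searching, rating, recommending])` holds, while reversing Indexing and Searching does not, which is the kind of reasoning about composition the slide refers to.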
[Figure: services ontology. Society (actor) interacts with Information Satisfaction Services (Searching, Browsing, Requesting, Recommending, Filtering, Binding, Visualizing, Expanding query) and Infrastructure Services (Add_Value: Indexing, Rating, Training); services are marked fundamental or composite. Edges labeled e and p connect services to concepts: query, anchor, handle, Collection, {digital object}, user model, query/category, binder, space, query’, Index, classifier, transformer, {(digital object, actor, rate)}.]
5S and DL formal definitions and compositions (April 2004 TOIS)
[Figure: dependency graph of definitions; d.N is the definition number]
• Mathematical foundations: relation (d.1), function (d.2), sequence (d.3), tuple (d.4)*, language (d.5), graph (d.6), grammar (d.7)
• 5S: streams (d.9), structures (d.10), spaces (d.18), scenarios (d.21), societies (d.24)
• Spaces: measurable (d.12), measure (d.13), probability (d.14), vector (d.15), topological (d.16)
• Scenario-related: event (d.10), state (d.18), services (d.22), transmission (d.23)
• DL constructs: structural metadata specification (d.25), descriptive metadata specification (d.26), structured stream (d.29), digital object (d.30), collection (d.31), metadata catalog (d.32), repository (d.33), indexing service (d.34), searching service (d.35), hypertext (d.36), browsing service (d.37), digital library (minimal) (d.38)
[Figure: 5S ontology]
• Streams: text, audio, image, video
• Structures: digital object, Collection, Catalog, Repository, Index, metadata specifications
• Spaces: Measurable, Measure, Probabilistic, Metric, Topological, Vector
• Scenarios: Service, Scenario, event, operation
• Societies: ServiceManager, Actor
• Relationships: describes, stores, contains, is_version_of/cites/links_to, is_a, extends, reuses, redefines, invokes, precedes, happens_before, executes, participates_in, recipient, runs, uses, employs, produces, inherits_from/includes, association
XML-based DL Log Standard
• Log analysis
– is a source of information on:
• How patrons really use DL services
• How systems behave while supporting user information seeking activities
• Used to:
– Evaluate and enhance services
– Guide allocation of resources
• Common practice in the web setting
– Supported by web servers, proxy caches
• DL Logging can be more detailed
The XML Log Format
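[Figure: element hierarchy of the XML log format, listed top-down as drawn]
• Log
• Transaction, SessionId, MachineInfo, Statement, Timestamp
• Event, SessionInfo, RegisterInfo, Statement, Timestamp
• Action
• Search, Browse, Update, StoreSysInfo
• SearchBy, QueryString, Catalog, Collection, PresentationInfo
• StatusInfo
• Timeout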
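To make the log format more concrete, here is a minimal sketch that writes one search event as XML. The element names are taken from the figure, but the exact nesting used here (Log > Transaction > Statement > Event > Action > Search) and the helper name `search_log_entry` are assumptions for illustration; the published log standard should be consulted for the real schema.

```python
import xml.etree.ElementTree as ET


def search_log_entry(session_id, timestamp, collection, query):
    """Build a Log element holding one search transaction (assumed nesting)."""
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log


entry = search_log_entry("s-001", "2009-06-29T10:15:00", "ETD", "digital libraries")
xml_text = ET.tostring(entry, encoding="unicode")
```

Because the entry is structured, later analysis (for example, counting queries per collection) can use element paths instead of parsing free-text web server logs.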
Systems
• Architectures
– Client-server, service-oriented
– P2P, Grid
• System descriptions and comparisons
– Personal DLs; Institutional to global
– DSpace, Eprints, Fedora, Greenstone, Kepler
• ODL
• 5S Suite: language, visualization, generation, logging
Architectural Issues
• Independent system vs. part of federation
• Centralized vs. distributed vs. open services
• Monolithic vs. modular vs. componentized
• Topologies: bus vs. star vs. hierarchical vs. network
• Decompositions vary
– search engine, browser, DBMS, MM support
– repository, handle server, client
– information resources + mediators, bus or agent collection + client with workspace/environment
NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
[Figure: layered architecture around a Core NSDL “Bus”]
• Layers: User Interfaces, Usage Enhancement, Collection Building
• Portals & Clients
• NSDL Services, Other NSDL Services
• CI Services: annotation, discussion, personalization, authentication, browsing
• Core Services: information retrieval, metadata gathering
• Core Collection-Building Services: harvesting, protocols
• NSDL Collections, Special Databases, referenced items & collections
5S Modeling -> Systems
[Figure: Domain Concepts (theory), Modeling Language (Meta-Model), Model, DL Architecture, Running DL, DL Actors, and the “Real” World (“real” world object), connected by the relations: instance of, used to compose, abstracted from, represented by, interpreted as]
Tools/Applications
[Figure: 5S tool chain]
• Roles: DL Expert, DL Designer, Practitioner, Researcher, Teacher
• Flow: 5S MetaModel -> 5SGraph -> 5SL DL Model -> 5SLGen -> Tailored DL
• Component pool: ODLSearch, ODLBrowse, ODLRate, ODLReview, …….
• Logging Module, XMLLog
[Figure: development phases (Requirements, Analysis, Design, Implementation, Test) aligned with 5S artifacts and tools: Formal Theory/Metamodel, 5S, 5SGraph, 5SL, OO Classes, Workflow, Components, 5SLGen, DL Evaluation, DL XMLLog]
5SL: a DL design language
• Domain specific languages
– Address a particular class of problems by offering specific abstractions and notations for the domain at hand
– Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping.
• XML-based realization of 5S
– Interoperability
– Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)
5SL – The Minimal DL Metamodel
[Figure: meta-models and their primitives]
• Stream (Meta-)Model: Text, Audio, Video, Image
• Structural (Meta-) Model: Document, Collection, Catalog, Metadata, Index
• Spatial (Meta-) Model: User Interface, Retrieval Model
• Scenarios (Meta-) Model: Service, Scenario, Event
• Societal (Meta-) Model: Community, Actor, Service Manager (Search Manager, Index Manager, Browsing Manager, Interface Manager, Repository Manager)
• Relations: uses, runs, receiver
<document name='ETD'>
  <stream_enumeration>
    <stream value='ETDText'/>
    <stream value='ETDAudio'/>
    ...
  </stream_enumeration>
  <structured_stream>
    %XMLSchema%
  </structured_stream>
</document>
Example of Document declaration in the Structures Model
<Society>
  <Actor>
    <Community name='Patron'>
      <Attribute name='name' type='String'/>
      <Attribute name='ID' type='Integer'/>
    </Community>
    <Community name='Student'>
      <Service>Converting</Service>
    </Community>
    <Community name='ETDReviewer'>
      <Service>Reviewing</Service>
    </Community>
    <Community name='ETDCataloguer'>
      <Service>Cataloguing</Service>
    </Community>
  </Actor>
………
Example of Actors declaration in the Societies Model
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>SearchManager</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <PARAMETER name='Results'>WtdSet</PARAMETER>
    </EVENT> ….
Example of Service declaration in the Scenario Model
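A scenario declared this way is essentially a sequence of events between actors and service managers. The sketch below shows one way such a declaration could be read into that sequence; it uses a closed, well-formed copy of the first two events from the example, and the function name `event_sequence` is hypothetical rather than part of any 5S tool.

```python
import xml.etree.ElementTree as ET

# Well-formed excerpt of the scenario above (closing tags added for parsing).
SCENARIO_XML = """
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
  </SCENARIO>
</SERVICE>
"""


def event_sequence(xml_text):
    """Return (sender, receiver, operation, parameters) for each EVENT, in order."""
    service = ET.fromstring(xml_text)
    steps = []
    for event in service.iter("EVENT"):
        operation = event.find("OPERATION")
        steps.append((
            event.findtext("SENDER"),
            event.findtext("RECEIVER"),
            operation.get("name") if operation is not None else None,
            [p.text for p in event.findall("PARAMETER")],
        ))
    return steps


steps = event_sequence(SCENARIO_XML)
```

A generator could walk `steps` to derive states and transitions, since each event's receiver becomes the sender of the next event in this scenario.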
5SGraph: A DL Modeling Tool
• Help users model their own instances of a digital library (DL) in the 5S language (5SL).
• A simple modeling process which enables rapid generation of digital libraries
• Features
– 5SGraph loads and displays a metamodel in a structured toolbox.
– The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
– 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.
Overview of 5SGraph
[Figure: 5SGraph screenshot showing the Workspace (instance model) and the Structured toolbox (metamodel)]
5SGen
• Version 1 -- MARIAN as the target system
– Focused on rich structures: semantic networks
– Behavior attached to nodes/links
• Version 2 -- Shifted for later work to componentized (ODL) approach
– Focused on scenarios/societies
– Structures/Spaces encapsulated within components (e.g., relational tables, indexes)
– Only textual streams supported
• Version 3 -- Practical DL (w. DSpace), Doug Gorton
5SLGen – Version 2: ODL, Services, Scenarios
[Figure: 5SLGen generation pipeline; numbers follow the figure]
• 5SL-Societies Model (1) -> XPath/JDOM Transform (2) -> XMI: Class Model (3) -> Xmi2Java (4) -> Java Classes: Model (5)
• 5SL-Scenario Model (6) -> XPath/JDOM Transform (7) -> StateChart Model (8) -> Scenario Synthesis (9) -> Deterministic FSM (10) -> SMC (11) -> Java Finite State Machine Class: Controller (12)
• JSP User Interface: View (13)
• Component Pool: ODLSearch, ODLBrowse, ... (each with Java wrapping)
• Edge labels: superclass, import, binds
• The DL Designer provides the models; 5SLGen produces the Generated DL Services
5SSuite
[Figure: 5SSuite tool chain across Requirements (1), Analysis (2), Design (3), Implementation (4)]
• Roles: DL Expert, DL Designer, Practitioner, Researcher, Teacher
• Tools and artifacts: 5S MetaModel, 5SGraph, Mapping Tool, 5SL DL Model, 5SGen/5SLGen, Tailored DL Services
• Component pool: ODLSearch, ODLBrowse, ODLRate, ODLReview, …….
Describing Quality in Digital Libraries
• What’s a “good” digital library?
– Central Concept: Quality!
– Hypotheses of this work:
• Formal theory can help to define “what’s a good digital library” by:
• New formalizations of quality indicators for DLs within our 5S framework
• Contextualizing these measures within the Information Life Cycle
Quality and the Information Life Cycle
[Figure: Information Life Cycle annotated with quality dimensions]
• Phases: Creation, Distribution, Seeking, Utilization
• Activities: Authoring, Modifying, Describing, Organizing, Indexing, Storing, Archiving, Networking, Accessing, Filtering, Searching, Browsing, Recommending, Retention, Mining, Discard
• States: Active, Semi-Active, Inactive
• Quality dimensions: Significance, Similarity, Pertinence, Accuracy, Completeness, Conformance, Relevance, Timeliness, Accessibility, Preservability
Quality Dimensions
DL Concept | Dimensions of Quality
Digital object | Accessibility, Pertinence, Preservability, Relevance, Similarity, Significance, Timeliness
Metadata specification | Accuracy, Completeness, Conformance
Collection | Completeness, Impact Factor
Catalog | Completeness, Consistency
Repository | Completeness, Consistency
Services | Composability, Efficiency, Effectiveness, Extensibility, Reusability, Reliability
Services: Efficiency / Effectiveness
• Effectiveness
– Very common measures: Precision, Recall, F1, 10-precision, R-Precision
– Other services may have different measures: e.g., Recommending, etc.
• Efficiency
– let $t(e)$ be the time of an event $e$
– let $e_{ix}$ and $e_{fx}$ be the initial and the final event of service $se_x$.
– For service $se_x$, efficiency is defined as:
• $\mathrm{Efficiency}(se_x) = t(e_{fx}) - t(e_{ix})$
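The measures above can be sketched in a few lines. This is an illustrative sketch, not the 5SQual implementation: the function names are hypothetical, effectiveness is computed over sets of document identifiers, and efficiency is taken as the time of the final event minus the time of the initial event of one service execution, as defined on the slide.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def efficiency(event_times):
    """t(final event) - t(initial event) for one service execution.

    event_times: timestamps of the service's events, in order of occurrence.
    """
    return event_times[-1] - event_times[0]


p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
elapsed = efficiency([10.0, 10.5, 12.5])
```

Note that a smaller value of this efficiency measure means a faster service, since it is an elapsed time.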
DL Integration
• What is “DL Integration”
– Hide distribution
– Hide heterogeneity
– Enable autonomy of individual component
• Why Integration
– island-DLs
– inability to seamlessly and transparently access knowledge across DLs
Utilize various autonomous DLs in concert
Integration: Urgency, Longevity
• If we collect, capture, acquire, or produce information, will it be usable in 100 years?
• NSF Digital Archiving Program
• Library of Congress National Digital Information Infrastructure and Preservation Program
[Figure: concept map of DL interoperability approaches]
• DL interoperability approach: intermediary-based, mapping-based
• Intermediary-based consists of: mediator, wrapper, agent
• Two architectures: federation, union archiving
• Mapping-based consists of: hybrid mapper, composite mapper
• Other nodes: schema mapping, GA, DL integration formalization
• Edge labels: consists of, use, used in, interrelated with, trained by, based on
Union DL Definitions
• A Minimal Union Digital Library integrated from n DLs is given as a four-tuple: MinUnionDL=(Union Repository, Union Catalog, Minimal Union Services, Union Society).
• DL Integration Problem Definition: Given n individual digital libraries (DL1, DL2, …, DLn), each defined as described above, to integrate the n DLs is to create a Union DL.
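The four-tuple above can be mirrored directly in a small data structure. The sketch below is a deliberately naive reading: `MinDL` and `integrate` are hypothetical names, services and societies are merged by plain set union, and object identifiers are prefixed with the member DL's name to avoid clashes. Real integration also needs schema mapping, which this sketch ignores.

```python
from typing import NamedTuple


class MinDL(NamedTuple):
    repository: frozenset  # digital object identifiers
    catalog: dict          # object identifier -> descriptive metadata
    services: frozenset
    society: frozenset


def integrate(dls):
    """Build a union DL from a mapping of member name -> MinDL."""
    repository, catalog, services, society = set(), {}, set(), set()
    for name, dl in dls.items():
        for obj in dl.repository:
            repository.add(f"{name}:{obj}")
        for obj, metadata in dl.catalog.items():
            catalog[f"{name}:{obj}"] = metadata
        services |= dl.services
        society |= dl.society
    return MinDL(frozenset(repository), catalog, frozenset(services), frozenset(society))


lahav = MinDL(frozenset({"bone-1"}), {"bone-1": {"type": "bone"}},
              frozenset({"searching", "browsing"}), frozenset({"archaeologists"}))
madaba = MinDL(frozenset({"bone-1"}), {"bone-1": {"type": "bone"}},
               frozenset({"searching"}), frozenset({"archaeologists"}))
union = integrate({"Lahav": lahav, "Madaba": madaba})
```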
Union Catalog Quality Measurement
• Complete
– All the catalogs to be integrated are complete.
• Consistent
– All the catalogs to be integrated are consistent.
– Each descriptive metadata specification in the union catalog describes only one digital object.
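These two properties can be checked mechanically once a catalog is represented explicitly. In the sketch below, a catalog is assumed to be a list of (metadata specification id, digital object id) pairs; that representation, the function names, and the reading of "complete" as "every repository object has at least one specification" are assumptions made for illustration.

```python
from collections import defaultdict


def is_consistent(catalog):
    """True if each metadata specification describes exactly one digital object."""
    described = defaultdict(set)
    for spec_id, object_id in catalog:
        described[spec_id].add(object_id)
    return all(len(objects) == 1 for objects in described.values())


def is_complete(catalog, repository):
    """True if every object in the repository has at least one specification."""
    covered = {object_id for _, object_id in catalog}
    return set(repository) <= covered


good_catalog = [("m1", "o1"), ("m2", "o2"), ("m3", "o2")]
bad_catalog = [("m1", "o1"), ("m1", "o2")]
```

An object may have several specifications (as `o2` does above) without breaking consistency; the violation is one specification pointing at two objects.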
Member DLs of ETANA-DL
[Figure: each member DL (Lahav, Madaba, Megiddo, Umayri, …) comprises a Repository, a Catalog, a Service (Database Searching and Browsing), and a Society (Archaeologists)]
Architecture of ETANA-DL, with centralized catalog and partially decentralized repository
• Union Catalog
• Union Repository
• Union Society: Archaeologists, General Public
• Union Services: Harvesting, Mapping, Searching, Browsing, Recommendation, Annotation, Object Comparison, Object Sharing, Binding, Visualization
Mapping confirmation
Mapping history
Union Catalog Integration
[Figure: Union ArchDL]
• Virtual Nimrin (VN): VN Catalog in VN Metadata Format, with Wrapper and Mapping Tool
• Halif DigMaster (HD): HD Catalog in HD Metadata Format, with Wrapper and Mapping Tool
• Union Catalog in Global Metadata Format
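The wrapper-plus-mapping idea in this figure can be sketched as field renaming into a global format. Everything specific below is hypothetical: the field names, the two mapping tables, and the functions `wrap` and `build_union_catalog` are invented for illustration and do not reflect the actual VN, HD, or ETANA-DL metadata formats.

```python
# Hypothetical local-to-global field mappings, as a mapping tool might record them.
VN_TO_GLOBAL = {"objType": "object_type", "locus": "context"}
HD_TO_GLOBAL = {"artifact_kind": "object_type", "field_locus": "context"}


def wrap(record, mapping, source):
    """Translate one local metadata record into the global format.

    Returns the mapped record and the sorted list of fields with no mapping.
    """
    mapped = {mapping[name]: value for name, value in record.items() if name in mapping}
    unmapped = sorted(name for name in record if name not in mapping)
    mapped["source"] = source
    return mapped, unmapped


def build_union_catalog(catalogs):
    """catalogs: iterable of (source name, mapping, list of local records)."""
    union = []
    for source, mapping, records in catalogs:
        for record in records:
            mapped, _ = wrap(record, mapping, source)
            union.append(mapped)
    return union


union_catalog = build_union_catalog([
    ("VN", VN_TO_GLOBAL, [{"objType": "bone", "locus": "L12", "note": "burnt"}]),
    ("HD", HD_TO_GLOBAL, [{"artifact_kind": "figurine", "field_locus": "F3"}]),
])
```

Reporting unmapped fields, as `wrap` does, is one simple way to support incremental, semi-automatic mapping: a designer can review them and extend the mapping table.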
[Figure: generating ETANA-DL with the 5S tools]
• Roles: ArchDL Expert, ArchDL Designer
• Modeling: 5S Archaeology MetaModel, 5SGraph, Structure Sub-model, Scenario Sub-model
• ETANA-DL Union Services Descriptions: Harvesting, Mapping, Searching, Browsing, …
• Mapping: VN Metadata Format, HD Metadata Format, ETANA-DL Metadata Format, Mapping Tool, Wrapper4VN, Wrapper4HD
• Harvesting: VN Catalog and HD Catalog, XOAI, Union Catalog
• Generation: 5SGen, Component Pool (Browsing…)
• Generated system: Search Service, Browse Service, Other ETANA-DL Services, Web Interface, Index, Inverted Files, Browse DB, Services DB
5S definitional structure
[Figure: concepts of the minimal DL built on the 5Ss]
• Streams, Structures, Spaces, Scenarios, Societies
• Structured Stream, Structural Metadata Specification, Descriptive Metadata Specification
• Digital Object, Metadata Catalog, Collection, Repository
• services: indexing, browsing, searching
• hypertext
• Minimal DL
Minimal archaeological DL in the 5S framework
(A.i is from minimal DL, j is new)
[Figure: minimal DL concepts (labeled A.1 to A.12) extended with new, numbered archaeology concepts]
• From the minimal DL: Streams, Structures, Spaces, Scenarios, Societies, Structured Stream, Descriptive Metadata specification, services (indexing, browsing, searching), hypertext
• Archaeology-specific: ArchObj, ArchColl, SpaTemOrg, StraDia, ArchDO, ArchDColl, ArchDR, Arch Descriptive Metadata specification, Arch Metadata catalog, Minimal ArchDL
Minimal CBIR DL
[Figure: concepts arranged by Stream, Structure, Space, Service, Society]
• ImageStream, FeatureVector, Image Descriptor, StructuredFeatureVector, ImageContent Description, Composite Descriptor
• ImageObject, ImageDigitalObject, ImageCollection, Image Descriptor Metadata Catalog
• User InfoNeed, VisualizationOperation, Content-based Image Searching Service, KNNQ, RQ
DL Ref. Model Concepts - 5S (see II.4.2)
• User -> Societies
– Human and machine actors
– End-users, Designers, Administrators, Application Developers + Librarians (DL curric)
• Content -> Streams, Structures
• Functionality -> Services -> Scenarios
• Quality -> Services (recall 5SQual)
• Policy -> Scenarios, Societies
• Architecture -> Scenarios, Structures, Spaces (components, protocols, standards, specs)
International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• How can we strengthen the infrastructure for repositories: key solvable problems:
• Citation services - making citation data more easily available from repositories
• Repository handshake – talking to each other, user deposit into several at once
• Interoperable identification infrastructure – unambiguous people, documents (FRBR)
International Repository Infrastructure Workshop – and DL.org
• How are these 2 related?
• Can we learn from the Amsterdam meeting and focus on some important and solvable issues immediately?
Discussion Topics
• Faced in MARIAN, NCSTRL, CITIDEL, Ensemble, NSDL, ETANA
• Already solved: OAI-PMH
• Focus
– Superimposed information / annotation
– Citation information
• Approaches
– 5S: 5SL, 5SGen, 5SQual
– XML representations
– Protocols (VIDI)
Summary
• Contextual Background
– DL Definitions, Scope
– DL Curricula Efforts
– Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics
Questions?
Discussion?
Thank You!