Interoperability of Collections Brandon Muramatsu On behalf of MERLOT and SMETE


Interoperability of Collections

Brandon Muramatsu

On behalf of MERLOT and SMETE


March 29, 2003, Federated Search at MERLOT and SMETE

Underlying Question

How could widespread distribution of a collection’s item-level metadata enhance rather than dilute the value of the collection?

Andy Dong, Director of Technology, SMETE


Enhancing Value

• (Focusing on educational digital libraries, not on Z39.50 providers such as traditional digital libraries)
• Providing additional related resources to end-users of a collection
  – Example: ENC mathematics resources available to users of MathDL
• Providing additional value-added services on top of multiple collections for end-users
  – Example: Providing MERLOT “peer review” to resources in Math Tools
• Wider-scale discovery and use of resources
  – Ideally leading to increased use of the host collection (especially if it provides value-added services)


Outline

• Approach
  – Federated Search
  – Metadata Harvesting
• Policy Issues
  – Integrity
  – Identity
• Technical Issues
  – SOAP and WSDL
  – Prototype Implementation
• Discussion


Approaches: Harvesting

• Metadata from distributed collections is “harvested” or “gathered”
  – Typically using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
• Item-level metadata is stored “permanently” by a third party that creates a composite index of all item-level metadata it has stored
• Searches are conducted against the “permanent” index of harvested item-level metadata
• Pros:
  – Common approach used by commercial indexing and abstracting services and Web search engines
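As a rough illustration of the harvesting approach, the sketch below builds an OAI-PMH `ListRecords` request URL and folds harvested records into one composite index that is searched locally. The base URL, collection names, record fields, and helper names are hypothetical; only the `verb`, `metadataPrefix`, and `resumptionToken` request arguments come from OAI-PMH itself.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for one collection."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def add_to_index(index, collection, records):
    """Store harvested records "permanently" in a composite keyword index."""
    for record in records:
        for word in record["title"].lower().split():
            index.setdefault(word, []).append((collection, record["id"]))
    return index


def search(index, term):
    """Search the stored index; the live collections are not contacted."""
    return index.get(term.lower(), [])


# Hypothetical harvested records; a real harvester would parse them
# out of the XML returned for each ListRecords request.
index = {}
add_to_index(index, "collection-a", [{"id": "a1", "title": "Linear Algebra Applet"}])
add_to_index(index, "collection-b", [{"id": "b7", "title": "Algebra Tutorial"}])
url = list_records_url("http://example.org/oai")
hits = search(index, "algebra")
```

The point of the design is that `search` never touches the providing collections: once harvested, the metadata lives in the third party's index.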


Approaches: Federated Search

• Metadata is searched synchronously from multiple, distributed collections

• Item-level metadata is held “temporarily” during a user’s session

• Search returns results of distributed search
  – May be with or without integrated lists of results
• Pros:
  – Common approach used by libraries when they query multiple databases
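By contrast with harvesting, a federated search sends the query to each collection at search time and keeps the returned metadata only for the user's session. A minimal sketch, assuming two stand-in collections implemented as local functions (the names, catalogs, and session dictionary are all hypothetical):

```python
def search_collection_a(query):
    # Stand-in for a live remote search; a real client would call the collection's service.
    catalog = ["Algebra Applet", "Calculus Notes"]
    return [title for title in catalog if query.lower() in title.lower()]


def search_collection_b(query):
    catalog = ["Algebra Tutorial", "Physics Lab"]
    return [title for title in catalog if query.lower() in title.lower()]


COLLECTIONS = {"A": search_collection_a, "B": search_collection_b}


def federated_search(query, session, integrated=False):
    """Query every collection at search time and keep results only in the session."""
    per_collection = {name: fn(query) for name, fn in COLLECTIONS.items()}
    # Held "temporarily": the results disappear when the session does.
    session["last_results"] = per_collection
    if integrated:
        # One merged list, each hit labeled with its source collection.
        return sorted(
            (title, name) for name, hits in per_collection.items() for title in hits
        )
    return per_collection


session = {}
grouped = federated_search("algebra", session)
merged = federated_search("algebra", session, integrated=True)
```

The `integrated` flag mirrors the choice on the slide: results can be shown per collection or as a single merged list.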


Concerns

Applies to Either Approach

Protect the Value and Integrity of the Providing Collection, Which Bear on Its Sustainability

– Protect integrity of the providing collection

– Protect identity of the providing collection


Policy Principles

• Protect Integrity of the Providing Collection
  – Ensures Providing Collection maintains “control” of its metadata to ensure quality
    • Enables use of the “name” of the providing collection to indicate quality
  – Allow federated searches after “formal” agreement
  – Prevent unauthorized access (either harvesting or federated search) and redistribution
    • May use “keys” for authentication coupled with log analysis
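One way to picture “keys” coupled with log analysis is sketched below. The key values, partner names, and log format are invented for illustration and do not describe how MERLOT or SMETE actually issue or check keys.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("federated_search_access")

# Hypothetical registry: keys issued to partners after a formal agreement.
AUTHORIZED_KEYS = {"key-001": "partner-a", "key-002": "partner-b"}


def authorize(key, operation):
    """Return the partner name for a valid key, or None; log every attempt."""
    partner = AUTHORIZED_KEYS.get(key)
    if partner is None:
        log.warning("DENIED operation=%s key=%s", operation, key)
        return None
    log.info("ALLOWED operation=%s partner=%s", operation, partner)
    return partner


def count_requests(log_lines):
    """Tiny log analysis: count allowed requests per partner."""
    counts = {}
    for line in log_lines:
        if "ALLOWED" in line:
            partner = line.rsplit("partner=", 1)[1].strip()
            counts[partner] = counts.get(partner, 0) + 1
    return counts
```

A request without a registered key is refused, and the log gives the providing collection a record of who used its metadata and how often.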


Policy Issues

• Protect Identity of the Providing Collection
  – Attribute providing collection as the source of the metadata
    • Potentially need to acknowledge provider of the metadata and the original cataloger of the metadata
  – Ensure “branding” of the providing collection
    • Typically through using a logo
  – Enables use of the “name” of the providing collection to indicate quality


Technology Issues

• Using SOAP and WSDL to Simplify Process
  – Gives service providers a mechanism to publish available services, including the semantics and syntax for accessing and consuming the service (WSDL).
  – Gives service consumers the ability to discover services and configure software clients to access remote services.


Technologies: SOAP

• Simple Object Access Protocol
  – W3C Spec: www.w3.org/TR/SOAP
• XML-based Protocol for Exchanging Information in a Distributed Environment
  – Envelope describing what is in the message
  – Set of encoding rules for expressing instances of application-defined datatypes
  – Convention for representing remote procedure calls and responses
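To make the envelope and remote-procedure-call convention concrete, here is a minimal sketch that wraps a search call in a SOAP 1.1 envelope using only the Python standard library. The operation name `doSearch`, the service namespace, and the parameter layout are assumptions for illustration, not the message format of either service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:federated-search"  # hypothetical service namespace


def build_search_request(operation, **params):
    """Wrap a remote procedure call in a SOAP 1.1 Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The call is an element named after the operation; its children are the arguments.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_search_request("doSearch", query="algebra", maxResults=10)
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint; the response comes back in the same Envelope/Body shape.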


Technologies: WSDL

• Web Services Description Language
  – W3C Spec: www.w3.org/TR/WSDL.html
• XML-based grammar for describing network services as collections of communication endpoints capable of exchanging messages
  – (XML-formatted description of network-based services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information.)
  – Abstract definition tied to concrete network protocol and message format at each endpoint
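The sketch below shows how a client might read the abstract part of a WSDL 1.1 description to discover which operations a service offers. The WSDL text is a hypothetical, heavily trimmed example (no types, bindings, or service endpoint), and `doSearch` is an invented operation name.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, trimmed WSDL: only messages and a portType (the abstract definition).
WSDL_DOC = f"""
<definitions name="SearchService" xmlns="{WSDL_NS}"
             targetNamespace="urn:example:federated-search"
             xmlns:tns="urn:example:federated-search">
  <message name="SearchRequest"/>
  <message name="SearchResponse"/>
  <portType name="SearchPortType">
    <operation name="doSearch">
      <input message="tns:SearchRequest"/>
      <output message="tns:SearchResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {operation name: (input message, output message)} from a WSDL document."""
    root = ET.fromstring(wsdl_text.strip())
    ns = {"w": WSDL_NS}
    operations = {}
    for op in root.findall("w:portType/w:operation", ns):
        operations[op.get("name")] = (
            op.find("w:input", ns).get("message"),
            op.find("w:output", ns).get("message"),
        )
    return operations


operations = list_operations(WSDL_DOC)
```

A full WSDL file would add `binding` and `service` elements that tie each abstract operation to a concrete protocol and endpoint address.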


Prototype Implementation Architecture

[Figure: prototype implementation architecture. Labels recovered from the diagram:]

• MERLOT Service (GLUE toolkit): doMerlotSearch(···); WSDL published at MERLOT Service deployment
• SMETE Service (Axis toolkit): doSmeteSearch(···); WSDL published at SMETE Service deployment
• Interface Discovery
• Federated Search Client (GLUE toolkit): MERLOT client stub, SMETE client stub, two Service Access Threads
• Client stages: Search Input, Search Input Transformation, Search Dispatch, Results Processing
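The client side of the diagram can be sketched roughly as follows: the user's query is transformed per service, dispatched to each client stub on its own thread, and the returned lists are then combined. This is an illustrative reading of the diagram labels, not the prototype's code; the stubs, the query transformations, and the result fields are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical client stubs; real stubs would be generated from each service's WSDL.
def merlot_stub(query):
    return [{"title": "Algebra Applet", "source": "MERLOT", "query": query}]


def smete_stub(query):
    return [{"title": "Algebra Tutorial", "source": "SMETE", "query": query}]


# Search input transformation: each service may expect a different query syntax.
# Both transformations are placeholders, not the services' real syntaxes.
SERVICES = {
    "MERLOT": (merlot_stub, lambda q: q.lower()),
    "SMETE": (smete_stub, lambda q: f'"{q}"'),
}


def dispatch(user_query):
    """Transform the input per service, then call each stub on its own thread."""
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {
            name: pool.submit(stub, transform(user_query))
            for name, (stub, transform) in SERVICES.items()
        }
        return {name: future.result() for name, future in futures.items()}


def process_results(per_service):
    """Results processing: flatten the per-service lists into one list."""
    return [hit for hits in per_service.values() for hit in hits]


results = process_results(dispatch("Algebra"))
```

Running the service calls on separate threads means the total wait is governed by the slowest service rather than by the sum of all response times.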


Prototype Implementation

• Separate implementations (server-side code)
  – Different WSDL files
• Similar input parameters
  – Key, query, start, maxResults, language
• Different search syntaxes supported
  – Google API, Lucene, full XML IEEE LOM
• Different ranking methodologies but common agreement on scale
  – Convert to 1-100 scale to integrate results
• Implementations still need better documentation

www.smete.org/?path=/public/about_smete/activities/technology/federated_search/smetesearchapiv2.jhtml
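The slide says only that scores are converted to a 1-100 scale before results are integrated. One plausible way to do that is a linear rescaling of each service's native score range, sketched below; the linear mapping, the native ranges, and the sample scores are assumptions, not the prototype's documented method.

```python
def to_common_scale(score, low, high):
    """Linearly map a native score in [low, high] onto the shared 1-100 scale."""
    if high == low:
        return 100.0
    return 1 + 99 * (score - low) / (high - low)


def integrate(result_sets):
    """Merge ranked lists from several services after rescaling their scores."""
    merged = []
    for source, (low, high, hits) in result_sets.items():
        for title, score in hits:
            merged.append((to_common_scale(score, low, high), source, title))
    # Highest common-scale score first.
    return sorted(merged, reverse=True)


# Hypothetical native score ranges: one service scores 0-1, the other 0-5.
result_sets = {
    "service-a": (0.0, 1.0, [("Algebra Applet", 0.5), ("Calculus Notes", 1.0)]),
    "service-b": (0.0, 5.0, [("Algebra Tutorial", 4.0)]),
}
ranked = integrate(result_sets)
```

Agreeing on the scale lets each service keep its own ranking method while still allowing a single merged list.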


Relation to Standards and Specifications Activities

• No item-level metadata element set is prescribed
  – Though both MERLOT and SMETE use variants of IMS/IEEE LOM
  – Presumably most educational digital libraries will be familiar with Dublin Core and IEEE LOM
• Not using IMS Digital Repositories Spec
• No “standard” for query languages
  – Z39.50 query and query-type is most widely adopted
  – Not using XML-based query languages XQuery or XPath because of adoption issues and evolving specifications

• No widely adopted “standard” for federated search


Context of Use?

• Talking Primarily about Item-Level Metadata
  – Interpret as generally the non-pedagogical/context items of IEEE 1484.12.1 Learning Object Metadata
• What about Context?
  – “Assignments”
  – “Comments”
  – “Reviews”