L C S

Haystack

Dennis Quan

Oxygen Workshop, January, 2002

Introduction

• Personalized information store

• Semistructured data with arbitrary metadata

• Unified ontology

• Standards-based components and infrastructure

• Compatible with existing systems

• Example user interface

• Integration with mail and groupware concepts

• Collaboration possibilities

What is an Ontology?

• “The branch of metaphysics that deals with the nature of being.” – American Heritage Dictionary

• Describes relationships between different objects in a system

• Like schemata or class hierarchies

Resource Description Framework (RDF)

• Standard defined by W3C in 1999 (http://www.w3.org/RDF/)

• Models statements of the form:

<subject> <predicate> <object>

• Can be expressed as a labeled, directed graph

• For example, statements “Bob likes Alice” and “Bob likes Jane”:

[Figure: directed graph with node Bob and two edges labeled “likes”, one to node Alice and one to node Jane]

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="Bob">
    <likes rdf:resource="Alice" />
    <likes rdf:resource="Jane" />
  </rdf:Description>
</rdf:RDF>
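The subject/predicate/object model and its graph reading can be sketched in a few lines of Python. This is only an illustration, not Haystack code: plain strings stand in for URIs, and the helper name `to_graph` is an assumption made for this example.

```python
from collections import defaultdict

# Each RDF statement is a (subject, predicate, object) triple.
# Plain strings stand in for the URIs a real RDF graph would use.
triples = [
    ("Bob", "likes", "Alice"),
    ("Bob", "likes", "Jane"),
]

def to_graph(statements):
    """Group triples into a labeled, directed graph: node -> [(edge label, target)]."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)

graph = to_graph(triples)
```

Here `graph["Bob"]` holds the two outgoing “likes” edges, mirroring the diagram on the slide.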

RDF Store

• RDF Store used by Haystack to store all information

• Runs on top of a standard SQL database

• Provides querying facility

• Example: who likes Jane?

(?x likes Jane); return ?x
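A query such as the one above amounts to matching a triple pattern against the store, with `?`-prefixed terms acting as variables. The sketch below assumes an in-memory list instead of the SQL-backed store the slide describes; the function `match` and the extra statement about John are hypothetical and exist only for this example.

```python
def match(store, pattern):
    """Return one binding dict per triple in `store` that matches `pattern`.

    Terms in `pattern` starting with "?" are variables; all others must match exactly.
    """
    results = []
    for triple in store:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated in the pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No term failed to match, so this triple satisfies the pattern.
            results.append(bindings)
    return results

store = [
    ("Bob", "likes", "Alice"),
    ("Bob", "likes", "Jane"),
    ("John", "likes", "Jane"),  # hypothetical extra statement
]

# (?x likes Jane); return ?x
who_likes_jane = [b["?x"] for b in match(store, ("?x", "likes", "Jane"))]
```

A real store would translate the pattern into a database query rather than scanning every statement.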

Belief

• With multitude of information, how much is believable?

• Annotate who said what

• Also can describe belief network using RDF

• Example: John says that Bob likes Jane, and Bob believes John

• Belief Server—component of Haystack that evaluates belief network and “filters” the store for information believed by the user

[Figure: graph in which the statement “Bob likes Jane” has an “assertedBy” edge to John, and Bob has a “believes” edge to John]
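The filtering idea can be sketched as follows. The slides do not describe how the Belief Server evaluates a belief network, so this assumes the simplest possible rule: a user sees a statement if the user asserted it or directly believes its asserter. The name `believed_by` and the second assertion are hypothetical.

```python
# Each entry pairs a statement with the party that asserted it.
asserted = [
    (("Bob", "likes", "Jane"), "John"),
    (("Bob", "likes", "Alice"), "Mallory"),  # hypothetical extra assertion
]

# Who believes whom; this relation could itself be stored as RDF statements.
believes = {("Bob", "John")}

def believed_by(user, asserted, believes):
    """Keep statements the user asserted or whose asserter the user believes.

    Assumption: direct belief only, with no transitive trust and no degrees of belief.
    """
    return [
        statement
        for statement, source in asserted
        if source == user or (user, source) in believes
    ]

bob_view = believed_by("Bob", asserted, believes)
```

With this data, Bob's filtered view contains only the statement asserted by John.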

Collections

• Basic means of aggregation

• Difference from “folders”: containment versus membership

• Categorization and subcategories
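One way to picture membership as opposed to containment: an item can belong to several collections at once, and dropping it from one collection does not delete the item. The sketch below is an assumption about how this might look as triples; the predicate name `hasMember` and the sample items are hypothetical.

```python
# Membership expressed as triples; "hasMember" is a hypothetical predicate name.
membership = {
    ("Work", "hasMember", "budget.xls"),
    ("Urgent", "hasMember", "budget.xls"),
    ("Work", "hasMember", "Projects"),  # a collection as a member: a subcategory
}

def collections_of(item, membership):
    """List every collection the item is a member of."""
    return sorted(c for c, _, m in membership if m == item)

def remove_member(collection, item, membership):
    """Drop one membership statement; the item and its other memberships remain."""
    return membership - {(collection, "hasMember", item)}

before = collections_of("budget.xls", membership)
after = collections_of("budget.xls", remove_member("Urgent", "budget.xls", membership))
```

With folder-style containment the item would live in exactly one place; here it simply has one fewer membership statement after removal.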

Queries

• One possible means for constructing a collection (result set)

• Can use all possible metadata fields to construct query

• Natural language

• Multiple query sources—the Web, other people’s Haystacks, etc.

• Automatic update of query result sets

• Possibilities for machine learning (e.g., when a user removes an item from a result set—a message to Haystack that an object does not belong)

Services

• Callable services in Haystack

• Also, automatic agents that respond to events

• Available methods described in metadata

• Haystack service initialization script also described in metadata

• Services mainly written in Java, but can be written in any language

SOAP, WSDL and UDDI

• Relationship to Web Services standards:

– Simple Object Access Protocol (SOAP): http://www.w3.org/TR/SOAP/

– Web Services Description Language (WSDL): http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/wsdl.asp

– Universal Description, Discovery and Integration (UDDI): http://www.uddi.org/

• SOAP and HTTP/PUT used as protocols for communication between services, including the RDF Store

• RDFized version of WSDL used to describe services’ interfaces

• UDDI query functionality easily modeled in RDF query

Inference Layer

• The semantics defined in RDF often permit deduction

• Example: Fido is a dog and dogs are mammals ⇒ Fido is a mammal

• Deduced knowledge is useful and should be stored

• Inference Layer recognizes patterns and triggers agents/services to perform deduction
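The Fido example corresponds to a single rule applied until no new statements appear, with the deduced statements kept alongside the original ones. This is a minimal forward-chaining sketch, not the Inference Layer's actual mechanism; the predicate names `type` and `subClassOf` and the extra “Animal” fact are assumptions for illustration.

```python
def infer_types(triples):
    """Apply until nothing new appears: (x type C), (C subClassOf D) => (x type D)."""
    known = set(triples)
    while True:
        derived = {
            (x, "type", d)
            for (x, p1, c) in known if p1 == "type"
            for (c2, p2, d) in known if p2 == "subClassOf" and c2 == c
        }
        if derived <= known:
            # Fixpoint reached: every deducible statement is already stored.
            return known
        known |= derived

facts = {
    ("Fido", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),  # hypothetical extra fact
}
closure = infer_types(facts)
```

After the call, `closure` contains the original facts plus the deduced statements that Fido is a Mammal and an Animal.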

Views

• May be several different ways of looking at an object

• Example: appointment book can be viewed as a sortable list of appointments or a calendar

• Views are a distinct type of object used to model these different ways of looking at objects

User Interface Ontology

• UI components (e.g. JavaBeans, ActiveX controls) rich sources of metadata

• Form descriptions also describable with metadata

• Possible to construct a directed graph that models a user interface

• Similar in concept to XUL

• Permits dynamic deduction of user interface similar to XSLT, except semantic rather than syntactic

• Part: a Haystack UI component

• ViewPart: a kind of part specially designed to display a specific kind of View

SWT

• Cross-platform Java widget toolkit

• Part of Eclipse project (http://www.eclipse.org/)

• Uses each operating system’s native widgets, avoiding performance problems

• Used for Part framework

• Integrates with Mozilla web browser

• Also possible to use ActiveX controls and GTK widgets

Ozone

• Haystack experimental user interface

• Modeled after a web browser

• Uses parts to describe user interface

Browse/Query Paradigm

• Browsing: going through nested folders/categories to locate sought item(s)

• Query: giving an explicit set of conditions to locate sought item(s)

• Ozone adopts hybrid Browse/Query paradigm

• Traditional subcategories still present in Collection view

• Also, parameterized categories similar to queries

• Previously issued queries persist as subcategories

Mail

• E-mail a good source of metadata-rich documents

• Messages, e-mail addresses, people and groups can be modeled in RDF

• Haystack agents can be used to filter e-mail to make it more manageable

• Many e-mail management techniques applicable to documents in general and vice versa

Storage Model

• Objects in Haystack named by Uniform Resource Identifiers (URIs)

• URLs are a subclass of URIs

• Documents and web pages can be named by URLs

• HTTP/FTP/WebDAV servers can then be used to store documents

• Inefficient to store terabytes of “data” in RDF when existing storage solutions are effective

Collaboration

• Allow Haystack-Haystack and Haystack-Semantic Web information exchange

• Filtration of imported data

• The “Who’s the expert?” problem

• Privacy concerns

• Different ways of organizing information between different parties

• Can be used to model mailing lists, newsgroups, and groupware

Ontological Conversion

• Unlikely that everyone will agree on the same schemata

• Ontological conversion converts from one schema to another

• Can be implemented as Haystack agents that respond to metadata with “foreign” schemata
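In its simplest form, such a conversion rewrites predicates from the foreign schema into their local equivalents. The sketch below assumes a one-to-one predicate mapping, which real schemata will not always allow; every predicate name and the sample document URI are hypothetical.

```python
# Hypothetical mapping from a "foreign" schema's predicates to local ones.
PREDICATE_MAP = {
    "foreign:author": "local:creator",
    "foreign:heading": "local:title",
}

def convert(triples, predicate_map):
    """Rewrite known foreign predicates; pass everything else through unchanged."""
    return [(s, predicate_map.get(p, p), o) for s, p, o in triples]

imported = [
    ("doc:1", "foreign:author", "Alice"),
    ("doc:1", "foreign:heading", "Trip report"),
    ("doc:1", "local:size", "12"),
]
converted = convert(imported, PREDICATE_MAP)
```

An agent built along these lines could run whenever statements using a foreign schema arrive in the store.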

Implementation

• Written for Java 2 platform (JDK 1.3.1)

• SWT (Eclipse) used for user interface components

• Mozilla web browser

• HSQL open source SQL database written in Java

• Lucene (Apache Jakarta project) search engine written in Java

• Tomcat (Apache Jakarta project) web server written in Java

• Parts written in Jython, a Java-based Python interpreter