

Page 1: Haystack

L C S

Haystack

Dennis Quan

Oxygen Workshop, January, 2002

Page 2: Haystack

Introduction

• Personalized information store
• Semistructured data with arbitrary metadata
• Unified ontology
• Standards-based components and infrastructure
• Compatible with existing systems
• Example user interface
• Integration with mail and groupware concepts
• Collaboration possibilities

Page 3: Haystack

What is an Ontology?

• “The branch of metaphysics that deals with the nature of being.” – American Heritage Dictionary

• In an information system: describes the relationships between different objects in the system

• Comparable to schemata or class hierarchies

Page 4: Haystack

Resource Description Framework (RDF)

• Standard defined by W3C in 1999 (http://www.w3.org/RDF/)
• Models statements of the form: <subject> <predicate> <object>
• Can be expressed as a labeled, directed graph
• For example, the statements “Bob likes Alice” and “Bob likes Jane”:

    Bob --likes--> Alice
    Bob --likes--> Jane

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="Bob">
        <likes rdf:resource="Alice" />
        <likes rdf:resource="Jane" />
      </rdf:Description>
    </rdf:RDF>
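The two example statements can also be sketched in Python as (subject, predicate, object) triples and read back as a labeled, directed graph. This is only an illustration; the names `triples` and `outgoing_edges` are made up here and are not part of RDF or Haystack.

```python
from collections import defaultdict

# Each RDF statement is a (subject, predicate, object) triple.
triples = [
    ("Bob", "likes", "Alice"),
    ("Bob", "likes", "Jane"),
]

def outgoing_edges(statements):
    """Group triples by subject: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)

graph = outgoing_edges(triples)
for label, target in graph["Bob"]:
    # One line per directed edge leaving the node "Bob".
    print(f"Bob --{label}--> {target}")
```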

Page 5: Haystack

RDF Store

• RDF Store used by Haystack to store all information
• Runs on top of a standard SQL database
• Provides a querying facility
• Example: who likes Jane? (?x likes Jane); return ?x
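A minimal sketch of how a triple store might sit on top of a SQL database and answer the pattern (?x likes Jane). It uses Python's built-in sqlite3 module purely for illustration; the table layout, the helper `who`, and the extra statement "John likes Jane" are assumptions, not Haystack's actual schema or data.

```python
import sqlite3

# In-memory database with one table holding all triples (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("Bob", "likes", "Alice"),
        ("Bob", "likes", "Jane"),
        ("John", "likes", "Jane"),  # illustrative extra statement
    ],
)

def who(predicate, obj):
    """Answer the pattern (?x predicate obj) and return the bindings of ?x."""
    rows = conn.execute(
        "SELECT subject FROM triples "
        "WHERE predicate = ? AND object = ? ORDER BY subject",
        (predicate, obj),
    )
    return [row[0] for row in rows]

print(who("likes", "Jane"))
```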

Page 6: Haystack

Belief

• With a multitude of information, how much is believable?
• Annotate who said what
• A belief network can also be described using RDF
• Example: John says that Bob likes Jane, and Bob believes John
• Belief Server: component of Haystack that evaluates the belief network and “filters” the store for information believed by the user

[Figure: graph in which the statement “Bob likes Jane” is assertedBy John, and Bob believes John]
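The filtering idea can be sketched as follows. This is a simplified illustration, not the Belief Server itself: it follows only direct "believes" links (no chains of trust), assumes users believe their own assertions, and the source "Mallory" is a hypothetical addition.

```python
# Each entry pairs a statement with the party that asserted it.
asserted = [
    (("Bob", "likes", "Jane"), "John"),
    (("Bob", "likes", "Alice"), "Mallory"),  # hypothetical second source
]

# Direct belief links: user -> set of sources that user believes.
believes = {"Bob": {"John"}}

def believed_by(user):
    """Return the statements asserted by the user or by someone the user believes."""
    trusted = believes.get(user, set()) | {user}
    return [statement for statement, source in asserted if source in trusted]

print(believed_by("Bob"))
```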

Page 7: Haystack

Collections

• Basic means of aggregation
• Difference from “folders”: folders use containment, collections use membership (so an object can belong to more than one collection)
• Categorization and subcategories
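One way to read the containment/membership distinction, sketched in Python. The data structures and file name are illustrative assumptions, not Haystack's representation.

```python
# Containment: each item maps to exactly one folder.
folder_of = {}

def move_to_folder(item, folder):
    # Placing an item in a folder replaces its previous location.
    folder_of[item] = folder

# Membership: each collection is a set of items; an item may be in many sets.
collections = {}

def add_to_collection(item, name):
    collections.setdefault(name, set()).add(item)

def collections_of(item):
    """List every collection the item is a member of, sorted by name."""
    return sorted(name for name, members in collections.items() if item in members)

move_to_folder("budget.xls", "Work")
move_to_folder("budget.xls", "Finance")
add_to_collection("budget.xls", "Work")
add_to_collection("budget.xls", "Finance")

print(folder_of["budget.xls"])       # a single folder
print(collections_of("budget.xls"))  # every collection it belongs to
```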

Page 8: Haystack

Queries

• One possible means for constructing a collection (result set)
• Can use all possible metadata fields to construct a query
• Natural language
• Multiple query sources: the Web, other people’s Haystacks, etc.
• Automatic update of query result sets
• Possibilities for machine learning (e.g., when a user removes an item from a result set, that tells Haystack the object does not belong)
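A rough sketch of a query-defined collection whose result set updates as the store changes and which remembers items the user removed. It only records the removal signal; no learning is performed. The class `QueryCollection` and the sample metadata are hypothetical.

```python
class QueryCollection:
    """A collection whose members are computed from a predicate over metadata."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.excluded = set()  # items the user removed by hand

    def remove(self, item_id):
        # Record the user's signal that this item does not belong.
        self.excluded.add(item_id)

    def members(self, store):
        # Re-evaluated on every call, so new matching items appear automatically.
        return sorted(
            item_id
            for item_id, meta in store.items()
            if self.predicate(meta) and item_id not in self.excluded
        )

store = {
    "msg1": {"type": "mail", "from": "alice"},
    "msg2": {"type": "mail", "from": "bob"},
    "doc1": {"type": "document"},
}

mail = QueryCollection(lambda meta: meta.get("type") == "mail")
first_result = mail.members(store)

store["msg3"] = {"type": "mail", "from": "carol"}  # new item arrives
mail.remove("msg2")                                # user rejects an item
second_result = mail.members(store)

print(first_result, second_result)
```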

Page 9: Haystack

Services

• Callable services in Haystack
• Also, automatic agents that respond to events
• Available methods described in metadata
• Haystack service initialization script also described in metadata
• Services mainly written in Java, but can be written in any language

Page 10: Haystack

SOAP, WSDL and UDDI

• Relationship to Web Services standards:
  – Simple Object Access Protocol (SOAP): http://www.w3.org/TR/SOAP/
  – Web Services Description Language (WSDL): http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/wsdl.asp
  – Universal Description, Discovery and Integration (UDDI): http://www.uddi.org/
• SOAP and HTTP/PUT used as protocols for communication between services, including the RDF Store
• RDFized version of WSDL used to describe services’ interfaces
• UDDI query functionality easily modeled as RDF queries

Page 11: Haystack

Inference Layer

• The semantics defined in RDF often permit deduction
• Example: Fido is a dog and dogs are mammals, therefore Fido is a mammal
• Deduced knowledge is useful and should be stored
• Inference Layer recognizes patterns and triggers agents/services to perform deduction
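The Fido example can be sketched as a small forward-chaining loop that stores each deduced statement alongside the original ones. The predicate names "type" and "subClassOf" and the extra class "Animal" are assumptions for illustration; this is not the Inference Layer's actual mechanism.

```python
def infer_types(triples):
    """Apply (x type C) and (C subClassOf D) => (x type D) until nothing new appears."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for x, p1, c in list(known):
            if p1 != "type":
                continue
            for c2, p2, d in list(known):
                if p2 == "subClassOf" and c2 == c and (x, "type", d) not in known:
                    known.add((x, "type", d))  # store the deduced statement
                    changed = True
    return known

facts = {
    ("Fido", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),  # illustrative extra fact
}

closure = infer_types(facts)
print(("Fido", "type", "Mammal") in closure)
```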

Page 12: Haystack

Views

• May be several different ways of looking at an object
• Example: an appointment book can be viewed as a sortable list of appointments or as a calendar
• Views are a distinct type of object used to model these different ways of looking at objects
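The appointment-book example, sketched as two view functions over the same underlying data. The function names and sample appointments are hypothetical and only illustrate the separation of object and view.

```python
from collections import defaultdict
from datetime import date

# The underlying object: an appointment book.
appointments = [
    {"title": "Dentist", "day": date(2002, 1, 15)},
    {"title": "Workshop", "day": date(2002, 1, 10)},
    {"title": "Lunch", "day": date(2002, 1, 15)},
]

def list_view(items, key="day"):
    """View 1: a list sorted by the chosen field, with title as tie-breaker."""
    return sorted(items, key=lambda a: (a[key], a["title"]))

def calendar_view(items):
    """View 2: appointment titles grouped under each day."""
    by_day = defaultdict(list)
    for appointment in items:
        by_day[appointment["day"]].append(appointment["title"])
    return dict(by_day)

print([a["title"] for a in list_view(appointments)])
print(calendar_view(appointments))
```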

Page 13: Haystack

User Interface Ontology

• UI components (e.g. JavaBeans, ActiveX controls) are rich sources of metadata
• Form descriptions also describable with metadata
• Possible to construct a directed graph that models a user interface
• Similar in concept to XUL
• Permits dynamic deduction of user interface similar to XSLT, except semantic rather than syntactic
• Part: a Haystack UI component
• ViewPart: a kind of part specially designed to display a specific kind of View

Page 14: Haystack

SWT

• Cross-platform Java widget toolkit
• Part of Eclipse project (http://www.eclipse.org/)
• Uses the native operating system’s widgets, avoiding performance problems
• Used for Part framework
• Integrates with Mozilla web browser
• Also possible to use ActiveX controls and GTK widgets

Page 15: Haystack

Ozone

• Haystack experimental user interface
• Modeled after a web browser
• Uses parts to describe user interface

Page 16: Haystack

Browse/Query Paradigm

• Browsing: going through nested folders/categories to locate sought item(s)
• Query: giving an explicit set of conditions to locate sought item(s)
• Ozone adopts hybrid Browse/Query paradigm
• Traditional subcategories still present in Collection view
• Also, parameterized categories similar to queries
• Previously issued queries persist as subcategories

Page 17: Haystack

Mail

• E-mail is a good source of metadata-rich documents
• Messages, e-mail addresses, people and groups can be modeled in RDF
• Haystack agents can be used to filter e-mail to make it more manageable
• Many e-mail management techniques are applicable to documents in general and vice versa

Page 18: Haystack

Storage Model

• Objects in Haystack named by Uniform Resource Identifiers (URIs)
• URLs are a subclass of URIs
• Documents and web pages can be named by URLs
• HTTP/FTP/WebDAV servers can then be used to store documents
• Inefficient to store terabytes of “data” in RDF when existing storage solutions are effective

Page 19: Haystack

Collaboration

• Allow Haystack-Haystack and Haystack-Semantic Web information exchange
• Filtration of imported data
• The “Who’s the expert?” problem
• Privacy concerns
• Different ways of organizing information between different parties
• Can be used to model mailing lists, newsgroups, and groupware

Page 20: Haystack

Ontological Conversion

• Unlikely that everyone will agree on the same schemata
• Ontological conversion converts from one schema to another
• Can be implemented as Haystack agents that respond to metadata with “foreign” schemata
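In its simplest form, such a conversion could be a predicate-by-predicate rewrite, sketched below. The mapping table and the "foreign:" and "local:" prefixes are invented for illustration; real conversions may need to restructure statements rather than just rename predicates.

```python
# Hypothetical mapping from a "foreign" schema's predicates to local ones.
PREDICATE_MAP = {
    "foreign:author": "local:creator",
    "foreign:heading": "local:title",
}

def convert(triples, mapping=PREDICATE_MAP):
    """Rewrite predicates found in the mapping; pass all others through unchanged."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]

incoming = [
    ("doc1", "foreign:author", "Dennis Quan"),
    ("doc1", "foreign:heading", "Haystack"),
    ("doc1", "local:date", "2002-01"),
]

converted = convert(incoming)
for triple in converted:
    print(triple)
```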

Page 21: Haystack

Implementation

• Written for Java 2 platform (JDK 1.3.1)
• SWT (Eclipse) used for user interface components
• Mozilla web browser
• HSQL open source SQL database written in Java
• Lucene (Apache Jakarta project) search engine written in Java
• Tomcat (Apache Jakarta project) web server written in Java
• Parts written in Jython, a Java-based Python interpreter