L C S
Haystack
Dennis Quan
Oxygen Workshop, January, 2002
Introduction
• Personalized information store
• Semistructured data with arbitrary metadata
• Unified ontology
• Standards-based components and infrastructure
• Compatible with existing systems
• Example user interface
• Integration with mail and groupware concepts
• Collaboration possibilities
What is an Ontology?
• “The branch of metaphysics that deals with the nature of being.” – American Heritage Dictionary
• Describes relationships between different objects in a system
• Like schemata or class hierarchies
Resource Description Framework (RDF)
• Standard defined by W3C in 1999 (http://www.w3.org/RDF/)
• Models statements of the form: <subject> <predicate> <object>
• Can be expressed as a labeled, directed graph
• For example, the statements “Bob likes Alice” and “Bob likes Jane”:
[Figure: directed graph in which node Bob points to nodes Alice and Jane, each edge labeled “likes”]
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="Bob">
    <likes rdf:resource="Alice" />
    <likes rdf:resource="Jane" />
  </rdf:Description>
</rdf:RDF>
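The subject/predicate/object model can be sketched in a few lines of Python. This is only an illustration of the data model, not Haystack code: the tuple layout and the `to_graph` helper are assumptions made for the example.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
triples = [
    ("Bob", "likes", "Alice"),
    ("Bob", "likes", "Jane"),
]


def to_graph(statements):
    """Group triples into adjacency lists: {subject: [(predicate, object), ...]}.

    Hypothetical helper: subjects are nodes, predicates are edge labels.
    """
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_graph(triples)
for subject, edges in graph.items():
    for predicate, obj in edges:
        print(f"{subject} --{predicate}--> {obj}")
```

Each printed line corresponds to one labeled edge of the directed graph.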
RDF Store
• RDF Store used by Haystack to store all information
• Runs on top of a standard SQL database
• Provides a querying facility
• Example: who likes Jane?
  (?x likes Jane); return ?x
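A minimal sketch of how a triple store might sit on a SQL database, using Python's built-in `sqlite3` module. The table layout and the `match` function are assumptions for illustration; the slides do not describe Haystack's actual schema or query syntax beyond the example pattern.

```python
import sqlite3

# Assumed layout: one row per RDF statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [("Bob", "likes", "Alice"), ("Bob", "likes", "Jane")],
)


def match(conn, subject=None, predicate=None, obj=None):
    """Return all triples matching a pattern; None plays the role of a ?variable."""
    clauses, params = [], []
    for column, value in (("subject", subject), ("predicate", predicate), ("object", obj)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = ("SELECT subject, predicate, object FROM triples" + where
           + " ORDER BY subject, predicate, object")
    return conn.execute(sql, params).fetchall()


# (?x likes Jane); return ?x
who_likes_jane = [s for s, _, _ in match(conn, predicate="likes", obj="Jane")]
print(who_likes_jane)
```

Leaving a position as `None` mirrors the `?x` variable in the query pattern: that column is simply left unconstrained.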
Belief
• With a multitude of information, how much is believable?
• Annotate who said what
• The belief network can also be described using RDF
• Example: John says that Bob likes Jane, and Bob believes John
• Belief Server: component of Haystack that evaluates the belief network and “filters” the store for information believed by the user
[Figure: the statement “Bob likes Jane” carries an “assertedBy” edge to John, and a “believes” edge links Bob to John]
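The filtering idea can be sketched as follows. This is a simplified illustration, not the Belief Server's algorithm: it follows only direct "believes" edges, it assumes users believe their own assertions, and the second assertion (by "Mallory") is hypothetical data added to show something being filtered out.

```python
# Each entry pairs a statement with the person who asserted it.
asserted = [
    (("Bob", "likes", "Jane"), "John"),
    (("Bob", "likes", "Alice"), "Mallory"),  # hypothetical extra assertion
]

# Direct belief edges: (believer, believed person).
believes = {("Bob", "John")}


def believed_by(user, asserted, believes):
    """Keep only the statements asserted by someone the user believes."""
    trusted = {source for believer, source in believes if believer == user}
    trusted.add(user)  # assumption: users believe what they assert themselves
    return [statement for statement, source in asserted if source in trusted]


bobs_view = believed_by("Bob", asserted, believes)
print(bobs_view)
```

A fuller treatment would have to decide how belief propagates along chains and how conflicting assertions are resolved; the slides leave both open.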
Collections
• Basic means of aggregation
• Difference from “folders”: membership rather than containment
• Categorization and subcategories
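One way to read the membership/containment distinction is that membership is just another statement about an item, so the same item can appear in several collections without being copied or moved. The sketch below assumes that reading; the `hasMember` predicate and the collection names are hypothetical.

```python
# Membership is recorded as (collection, "hasMember", item) statements.
membership = set()


def add_member(collection, item):
    """Record that an item is a member of a collection."""
    membership.add((collection, "hasMember", item))


def collections_of(item):
    """List every collection the item belongs to, in alphabetical order."""
    return sorted(c for c, _, i in membership if i == item)


add_member("Oxygen Workshop", "haystack-slides")
add_member("RDF reading list", "haystack-slides")
print(collections_of("haystack-slides"))
```

With folder-style containment, the item would have exactly one parent; here it has as many as there are membership statements.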
Queries
• One possible means of constructing a collection (result set)
• Can use all possible metadata fields to construct a query
• Natural language
• Multiple query sources: the Web, other people’s Haystacks, etc.
• Automatic update of query result sets
• Possibilities for machine learning (e.g., when a user removes an item from a result set, that signals to Haystack that the object does not belong)
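A rough sketch of a query-backed collection whose result set updates automatically and remembers items the user removed. The `QueryCollection` class, the dictionary store, and the sample messages are all assumptions for illustration; a learning component could use the recorded removals as negative examples, but none is implemented here.

```python
class QueryCollection:
    """A collection defined by a query rather than by explicit membership."""

    def __init__(self, condition):
        self.condition = condition  # function: metadata dict -> bool
        self.excluded = set()       # items the user removed by hand

    def results(self, store):
        # Re-evaluated on every call, so the result set tracks the store.
        return sorted(
            name for name, meta in store.items()
            if name not in self.excluded and self.condition(meta)
        )

    def remove(self, name):
        # User feedback: this item does not belong in the result set.
        self.excluded.add(name)


store = {
    "msg1": {"type": "mail", "from": "Bob"},
    "msg2": {"type": "mail", "from": "Jane"},
}
from_bob = QueryCollection(lambda meta: meta.get("from") == "Bob")
before = from_bob.results(store)

store["msg3"] = {"type": "mail", "from": "Bob"}  # new item arrives
after_update = from_bob.results(store)

from_bob.remove("msg1")                          # user removes an item
after_removal = from_bob.results(store)
```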
Services
• Callable services in Haystack
• Also, automatic agents that respond to events
• Available methods described in metadata
• Haystack service initialization script also described in metadata
• Services mainly written in Java, but can be written in any language
SOAP, WSDL and UDDI
• Relationship to Web Services standards:
  – Simple Object Access Protocol (SOAP) http://www.w3.org/TR/SOAP/
  – Web Services Description Language (WSDL) http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/wsdl.asp
  – Universal Description, Discovery and Integration (UDDI) http://www.uddi.org/
• SOAP and HTTP/PUT used as protocols for communication between services, including the RDF Store
• RDFized version of WSDL used to describe services’ interfaces
• UDDI query functionality easily modeled in RDF query
Inference Layer
• The semantics defined in RDF often permit deduction
• Example: Fido is a dog and dogs are mammals ⇒ Fido is a mammal
• Deduced knowledge is useful and should be stored
• Inference Layer recognizes patterns and triggers agents/services to perform deduction
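The Fido example can be sketched as a single forward-chaining rule applied until no new statements appear. This is an illustration of the deduction, not the Inference Layer itself: the predicate names `type` and `subClassOf` are plain strings chosen for the example, and the `infer` function is hypothetical.

```python
def infer(triples):
    """Apply one rule to a fixed point and return the enlarged statement set.

    Rule: (x type C) and (C subClassOf D)  =>  (x type D)
    """
    known = set(triples)
    while True:
        new = set()
        for x, p1, c in known:
            if p1 != "type":
                continue
            for c2, p2, d in known:
                if p2 == "subClassOf" and c2 == c and (x, "type", d) not in known:
                    new.add((x, "type", d))
        if not new:
            return known
        known |= new  # store the deduced knowledge alongside the original


facts = {
    ("Fido", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
}
closure = infer(facts)
print(("Fido", "type", "Mammal") in closure)
```

Returning the deduced statements together with the originals reflects the point that deduced knowledge should be stored rather than recomputed on each query.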
Views
• There may be several different ways of looking at an object
• Example: an appointment book can be viewed as a sortable list of appointments or as a calendar
• Views are a distinct type of object used to model these different ways of looking at objects
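The appointment-book example can be sketched as two functions over the same underlying data. The sample appointments, dates, and function names are hypothetical; in Haystack, Views are objects in the store rather than plain functions.

```python
from collections import defaultdict
from datetime import date

# The underlying object: an appointment book (hypothetical sample data).
appointments = [
    (date(2002, 1, 15), "Oxygen Workshop"),
    (date(2002, 1, 14), "Prepare slides"),
    (date(2002, 1, 15), "Dinner"),
]


def list_view(book):
    """View 1: a flat list of appointments sorted by date, then title."""
    return sorted(book)


def calendar_view(book):
    """View 2: the same appointments grouped by day."""
    days = defaultdict(list)
    for day, title in sorted(book):
        days[day].append(title)
    return dict(days)


as_list = list_view(appointments)
as_calendar = calendar_view(appointments)
```

Neither view changes the appointment book; each is a different presentation of the same object.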
User Interface Ontology
• UI components (e.g., JavaBeans, ActiveX controls) are rich sources of metadata
• Form descriptions are also describable with metadata
• Possible to construct a directed graph that models a user interface
• Similar in concept to XUL
• Permits dynamic deduction of a user interface, similar to XSLT but semantic rather than syntactic
• Part: a Haystack UI component
• ViewPart: a kind of Part specially designed to display a specific kind of View
SWT
• Cross-platform Java widget toolkit
• Part of the Eclipse project (http://www.eclipse.org/)
• Uses the native operating system’s widgets, avoiding performance problems
• Used for the Part framework
• Integrates with the Mozilla web browser
• Also possible to use ActiveX controls and GTK widgets
Ozone
• Haystack experimental user interface
• Modeled after a web browser
• Uses Parts to describe the user interface
Browse/Query Paradigm
• Browsing: going through nested folders/categories to locate sought item(s)
• Query: giving an explicit set of conditions to locate sought item(s)
• Ozone adopts a hybrid Browse/Query paradigm
• Traditional subcategories still present in Collection view
• Also, parameterized categories similar to queries
• Previously issued queries persist as subcategories
Mail
• E-mail is a good source of metadata-rich documents
• Messages, e-mail addresses, people and groups can be modeled in RDF
• Haystack agents can be used to filter e-mail to make it more manageable
• Many e-mail management techniques are applicable to documents in general and vice versa
Storage Model
• Objects in Haystack named by Uniform Resource Identifiers (URIs)
• URLs are a subclass of URIs
• Documents and web pages can be named by URLs
• HTTP/FTP/WebDAV servers can then be used to store documents
• Inefficient to store terabytes of “data” in RDF when existing storage solutions are effective
Collaboration
• Allow Haystack-Haystack and Haystack-Semantic Web information exchange
• Filtration of imported data
• The “Who’s the expert?” problem
• Privacy concerns
• Different ways of organizing information between different parties
• Can be used to model mailing lists, newsgroups, and groupware
Ontological Conversion
• Unlikely that everyone will agree on the same schemata
• Ontological conversion converts from one schema to another
• Can be implemented as Haystack agents that respond to metadata with “foreign” schemata
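In its simplest form, converting between schemata could be a predicate-renaming pass over incoming statements. The sketch below assumes exactly that; the `fancies` predicate and the mapping table are hypothetical, and real conversions may need to restructure statements rather than rename them one for one.

```python
# Hypothetical mapping from a "foreign" schema's predicates to local ones.
PREDICATE_MAP = {"fancies": "likes"}


def convert(triples, mapping):
    """Rewrite foreign predicates to local ones; unknown predicates pass through."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


imported = [
    ("Bob", "fancies", "Jane"),
    ("Bob", "likes", "Alice"),
]
converted = convert(imported, PREDICATE_MAP)
print(converted)
```

An agent built along these lines would run `convert` whenever statements using a foreign schema arrive in the store.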
Implementation
• Written for Java 2 platform (JDK 1.3.1)
• SWT (Eclipse) used for user interface components
• Mozilla web browser
• HSQL open source SQL database written in Java
• Lucene (Apache Jakarta project) search engine written in Java
• Tomcat (Apache Jakarta project) web server written in Java
• Parts written in Jython, a Java-based Python interpreter