Giovanni Maria Sacco
![Page 1: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/1.jpg)
Guided Interactive Discovery of e-Government Services
Giovanni Maria Sacco
Dipartimento di Informatica, Università di Torino
Corso Svizzera 185, 10149 Torino, [email protected]
Where is the knowledge we have lost in information?
T.S. Eliot, The Rock
![Page 2: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/2.jpg)
e-Government Services for citizens represent one of the most frequent and critical points of contact between citizens and public administrations.
THE PUBLIC FACE OF GOVERNMENT
e-services represent the only practical way of providing incentives and support to specific classes of citizens.
THE FRIENDLIER FACE OF GOVERNMENT
![Page 3: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/3.jpg)
DISCOVERY of e-services
rather than plain RETRIEVAL
is a critical functionality in e-government systems
But it is managed by search rather than explorative technology
![Page 4: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/4.jpg)
TRADITIONAL SEARCH TECHNIQUES
DO NOT WORK
![Page 5: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/5.jpg)
Since the vast majority of information is essentially textual and unstructured in nature
information retrieval techniques are extensively used both in pull and push strategies
BUT…
![Page 6: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/6.jpg)
1. almost 80% of relevant documents are not retrieved
2. extremely wide semantic gap between the user model (concepts) and the system model (words)
3. users have no or very little assistance in formulating queries
4. results are presented as a flat list with no systematic organization: browsing is difficult or impossible.
![Page 7: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/7.jpg)
RICH SEMANTIC SCHEMATA (ONTOLOGIES)
• End-users do not understand them
• Agent mediators required: costly to implement, not transparent, hard to understand what they do
• Schemata hard to design and maintain
![Page 8: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/8.jpg)
Traditional research has focussed on
RETRIEVAL OF INFORMATION
BUT
The most common task is BROWSING:
FIND RELATIONSHIPS
THIN ALTERNATIVES OUT
![Page 9: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/9.jpg)
Finding opportunities/services
Finding a job
Finding the laws and regulations that apply
BUT ALSO
Buying a digital camera
Finding a restaurant for tonight
Finding the cause of a malfunction
Selecting a photo
Finding a suspect/missing person from a photobank
….
![Page 10: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/10.jpg)
REQUIRE
A DIFFERENT INFORMATION ACCESS PARADIGM
GUIDED EXPLORATION
AND
INFORMATION THINNING
![Page 11: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/11.jpg)
Dynamic Taxonomies:
the first model to fully exploit multidimensional and faceted classifications
Sacco, G.M., “Dynamic taxonomies: a model for large information bases”, IEEE Trans. on Knowledge and Data Engineering, May/June 2000
US Patent n. 6,763,349 (EU pending)
DYNAMIC TAXONOMIES
![Page 12: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/12.jpg)
Representation
Intension: The infobase is described by a taxonomy designed by an expert (the schema)
Extension: Documents can be classified at any level of abstraction and each document is classified under n concepts (n>1)
No relationships other than subsumptions (IS-A, PART-OF) need to be represented in the schema.
![Page 13: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/13.jpg)
What is a concept?
A concept is a label which identifies a set of documents (classified under that concept)
A nominalistic approach: concepts are described by instances rather than by properties
Subsumptions require that an inclusion constraint is maintained:
If D(C) denotes the set of documents classified under C and C’ is a descendant of C in the hierarchy, D(C’)⊆D(C)
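The inclusion constraint can be illustrated with a minimal sketch. The taxonomy, concept names, and document identifiers below are hypothetical and are not taken from the slides; the sketch assumes that D(C) is computed as the documents classified directly under C plus those classified under any of its descendants, which makes D(C’)⊆D(C) hold by construction.

```python
# Hypothetical toy taxonomy: parent concept -> child concepts.
CHILDREN = {
    "Services": ["Health", "Employment"],
    "Health": ["Vaccination"],
    "Employment": [],
    "Vaccination": [],
}

# Hypothetical direct classification: concept -> documents classified at that level.
DIRECT = {
    "Services": {"d1"},
    "Health": {"d2"},
    "Employment": {"d3"},
    "Vaccination": {"d2", "d4"},
}


def extension(concept):
    """Return D(concept): documents under the concept or any of its descendants."""
    docs = set(DIRECT.get(concept, ()))
    for child in CHILDREN.get(concept, ()):
        docs |= extension(child)
    return docs


def inclusion_holds(concept):
    """Check that D(C') is a subset of D(C) for every descendant C' of concept."""
    return all(
        extension(child) <= extension(concept) and inclusion_holds(child)
        for child in CHILDREN.get(concept, ())
    )
```

Here a document classified under "Vaccination" is automatically part of D("Health") and D("Services"), so documents can be classified at any level of abstraction without breaking the constraint.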
![Page 14: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/14.jpg)
How do concepts relate?
By subsumptions (IS-A, PART-OF)
By the Extensional Inference Rule:
Two concepts C and C’ are related if there is at least one document d that is classified under both C and C’, either directly or under one of their descendants
Because of the inclusion constraint, IS-A, PART-OF relationships are a special case of the Extensional Inference Rule.
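The Extensional Inference Rule can be sketched as a set intersection test. The two-facet taxonomy and the service identifiers below are hypothetical illustrations, not data from the slides: two concepts are treated as related when their extensions (documents classified under the concept or its descendants) share at least one document.

```python
# Hypothetical two-facet taxonomy: parent concept -> child concepts.
CHILDREN = {
    "Topic": ["Health", "Employment"],
    "Audience": ["Students", "Seniors"],
    "Health": [],
    "Employment": [],
    "Students": [],
    "Seniors": [],
}

# Hypothetical classification: each service is classified under several concepts.
DIRECT = {
    "Health": {"s1", "s2"},
    "Employment": {"s3"},
    "Students": {"s3"},
    "Seniors": {"s1"},
}


def extension(concept):
    """Documents classified under the concept or any of its descendants."""
    docs = set(DIRECT.get(concept, ()))
    for child in CHILDREN.get(concept, ()):
        docs |= extension(child)
    return docs


def related(c1, c2):
    """Extensional inference: related if at least one document is shared."""
    return bool(extension(c1) & extension(c2))


print(related("Health", "Seniors"))   # s1 is classified under both
print(related("Health", "Students"))  # no shared document
print(related("Topic", "Health"))     # subsumption as a special case
```

Because the test is a plain intersection, it is symmetric, and a parent is related to any non-empty child without the schema having to say so explicitly.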
![Page 15: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/15.jpg)
Concepts extensionally related to G have a yellow background
[Figure: taxonomy tree, as extracted: top level A; second level B, C, D; third level E, F, G, H, I, L, M; documents a, b, c, d, e]
![Page 16: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/16.jpg)
![Page 17: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/17.jpg)
Important consequence:
Relationships among concepts need not be anticipated but can be inferred from the actual classification
Advantages:
a simpler schema
adapts to new relationships (dynamic)
finds unexpected relationships (discovery)
![Page 18: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/18.jpg)
Putting it all together…
The browsing system
AN EXAMPLE
![Page 19: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/19.jpg)
1. Initial step: Tree picture of the entire infobase
The infobase schema is used for browsing
The initial focus is the entire infobase
![Page 20: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/20.jpg)
2. Zoom on a concept and see related concepts
This is the central operation:
1. The new focus is ANDed with the previous focus
2. The entire infobase is reduced to the documents in the current focus
3. The taxonomy is reduced in order to show all and only those concepts which are extensionally related to the selected focus (filtering)
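The three steps of the zoom operation can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine described in these slides: the taxonomy, the service identifiers, and the function names `zoom` and `reduced_taxonomy` are all hypothetical.

```python
# Hypothetical three-facet taxonomy: parent concept -> child concepts.
CHILDREN = {
    "Topic": ["Health", "Employment"],
    "Audience": ["Students", "Seniors"],
    "Channel": ["Online", "In person"],
    "Health": [], "Employment": [],
    "Students": [], "Seniors": [],
    "Online": [], "In person": [],
}

# Hypothetical classification of four services under leaf concepts.
DIRECT = {
    "Health": {"s1", "s2"},
    "Employment": {"s3", "s4"},
    "Students": {"s3"},
    "Seniors": {"s1", "s4"},
    "Online": {"s1", "s3"},
    "In person": {"s2", "s4"},
}

ALL_DOCS = set().union(*DIRECT.values())


def extension(concept):
    """Documents classified under the concept or any of its descendants."""
    docs = set(DIRECT.get(concept, ()))
    for child in CHILDREN.get(concept, ()):
        docs |= extension(child)
    return docs


def zoom(focus, concept):
    """Steps 1 and 2: AND the selected concept with the previous focus."""
    return focus & extension(concept)


def reduced_taxonomy(focus):
    """Step 3: keep only concepts with at least one document in the focus."""
    counts = {}
    for concept in CHILDREN:
        shared = len(focus & extension(concept))
        if shared:
            counts[concept] = shared
    return counts


focus = zoom(ALL_DOCS, "Seniors")   # initial focus is the entire infobase
print(sorted(focus), reduced_taxonomy(focus))
focus = zoom(focus, "Online")       # second zoom is ANDed with the first
print(sorted(focus), reduced_taxonomy(focus))
```

After zooming on "Seniors", the concept "Students" disappears from the reduced taxonomy because no document in the focus is classified under it; every concept that remains is guaranteed to lead to a non-empty result.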
![Page 21: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/21.jpg)
3. Iterate until the number of documents is sufficiently small
3 zoom operations are sufficient to select an average of 10 documents from infobases with 1,000,000 documents, described by a compact taxonomy with 1,000 concepts.
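As a rough reading of these figures (the slide does not state the model behind them), assume for illustration that every zoom shrinks the focus by the same factor r. The numbers then imply how selective a single zoom must be:

```latex
N_k = N_0 \, r^{k}, \qquad N_0 = 10^{6}, \quad N_3 = 10
\;\Longrightarrow\;
r^{3} = \frac{10}{10^{6}} = 10^{-5},
\qquad
r = 10^{-5/3} \approx 0.0215 \approx \frac{1}{46}
```

Under this equal-reduction assumption, each zoom would keep roughly one document in 46.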
![Page 22: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/22.jpg)
• Simple and familiar interface (the only new operation is the Zoom, which is easily understood)
• The user is effectively guided to reach his goal: at each stage he has a complete list of all related concepts (i.e. a complete taxonomic summary of his current focus)
• Completely symmetric interaction: if A and B are related, the user will find B if he zooms on A, and A if he zooms on B (most systems are asymmetric)
• Discovery of unexpected relationships
BENEFITS
![Page 23: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/23.jpg)
• TRANSPARENCY: the user is in charge and knows exactly what’s happening
• EXCELLENT CONVERGENCE: very few iterations needed
![Page 24: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/24.jpg)
• Easy multilingual support (just translate concept labels)
• Easy to unobtrusively gather user interests
• Easy to accommodate reviews, popularity, etc.
• Effective push strategies
dbworldx.di.unito.it
![Page 25: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/25.jpg)
• Simple integration with other retrieval techniques (IR, DB):
dynamic taxonomies as a prefilter: they establish the context for the query
dynamic taxonomies as a conceptual summary: they summarize long result lists
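Both integration modes can be sketched in a few lines. Everything here is a hypothetical stand-in: `keyword_search` is a naive substring match used in place of a real IR engine, and the service texts and concept extensions are invented for illustration.

```python
# Hypothetical service descriptions, indexed by document identifier.
TEXT = {
    "s1": "home care allowance for seniors",
    "s2": "vaccination booking",
    "s3": "student job placement allowance",
    "s4": "retraining allowance for older workers",
}

# Hypothetical concept extensions (descendants already included).
CONCEPTS = {
    "Health": {"s1", "s2"},
    "Employment": {"s3", "s4"},
    "Students": {"s3"},
    "Seniors": {"s1", "s4"},
}


def keyword_search(word, within=None):
    """Naive stand-in for an IR engine; 'within' restricts the search context."""
    candidates = TEXT if within is None else within
    return {doc for doc in candidates if word in TEXT[doc]}


def conceptual_summary(results):
    """Summarize a result list as counts per concept, omitting empty concepts."""
    return {
        concept: len(results & docs)
        for concept, docs in CONCEPTS.items()
        if results & docs
    }


# Prefilter: the taxonomy focus establishes the context for the query.
print(sorted(keyword_search("allowance", within=CONCEPTS["Seniors"])))

# Conceptual summary: a flat result list is organized by concept.
print(conceptual_summary(keyword_search("allowance")))
```

In the second mode the summary can itself be used as a reduced taxonomy, so the user can continue by zooming instead of scanning the flat list.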
![Page 26: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/26.jpg)
CONCLUSIONS
Dynamic taxonomies provide a single and simple access model that solves the vast majority of the information dissemination needs of public administrations
In fact, they are so versatile that they can be used for:
laws and regulations, e-commerce, medical guidelines, human resource management, multimedia information bases…
![Page 27: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/27.jpg)
Universal Knowledge Processor
High-performance dynamic taxonomy engine
• Microsoft Windows environment
• A set of high performance multithreaded COM objects
• Intension and extension in RAM even for large databases (20Mb for 1M documents)
• Extremely fast operation: 327 reduced taxonomies per second on an 800K-item infobase
![Page 28: Giovanni Maria Sacco](https://reader034.vdocument.in/reader034/viewer/2022052400/55a10fb51a28ab6a508b47ee/html5/thumbnails/28.jpg)
THE SYSTEM IS AVAILABLE AT
www.knowledgeprocessors.com
Thank you!