ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text
DESCRIPTION
Studying, understanding and exploiting the content of a digital library, and extracting useful information from it, require automatic techniques that can effectively support users. Concept taxonomies can play a relevant role to this aim. Unfortunately, such resources are scarce, and building and maintaining them manually is costly and error-prone. This work presents ConNeKTion, a tool for conceptual graph learning and exploitation. It learns conceptual graphs from plain text and enriches them by finding concept generalizations. The resulting graph can be used for several purposes: finding relationships between concepts (if any), filtering the concepts from a particular perspective, keyword extraction and information retrieval. A suitable control panel is provided for the user to comfortably carry out these activities.
TRANSCRIPT
Università degli studi di Bari “Aldo Moro”, Dipartimento di Informatica
ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text
F. Leuzzi, S. Ferilli, F. Rotella
{fabio.leuzzi, stefano.ferilli, fulvio.rotella}@uniba.it
9th Italian Research Conference on Digital Libraries
Università la Sapienza - Rome, Italy
January 31 - February 1, 2013
L.A.C.A.M. http://lacam.di.uniba.it
Overview
● Introduction & Objectives
● Tool overview
● Knowledge Representation Formalism
● Relevant concepts
● Information Retrieval
● Reasoning by Association
● Exploiting Tool
● Conclusions & Future Work
ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from TextF. Leuzzi, S. Ferilli, F. Rotella 2
Introduction
Some repositories leave the responsibility for quality to the authors.
+ Anybody can produce and distribute documents.
= Possibly low average quality of the repository contents.
Need: the study, understanding and exploitation of the content of a digital library,
with the aim of easily exploring the semantic content of huge amounts of text.
Introduction
Possible solution:
● Natural Language Processing systems
  ● Provide the grammatical structures contained in text
● Knowledge Representation formalisms
  ● Semantic networks
● Graph learning techniques
  ● To obtain a semantic network starting from the text
● In order to satisfy the information needs, the knowledge base can be exploited:
  ● To make summarizations
  ● To reason with it
  ● ...
Objectives
Improving the fruition of a digital library (DL)
● Use of a tool providing advanced functionalities
● Mixed strategy for relevant concept recognition
● Semantic approach to information retrieval
● Automatic inference over the acquired knowledge
● Reasoning by association
Tool overview
Knowledge representation formalism
Only subject, verb and complement have been considered.
● Subjects and complements → concepts
● Verbs → relations between them
The frequency of arcs between the concepts in positive and negative
sentences has been taken into account.
● Enrich the representation formalism
● Give robustness to our solution through a statistical approach
[Figure labels: (subject, verb, complement) and (subject, complement)]
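The representation described on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_graph`, the tuple layout, and the toy triples are hypothetical, and sentences are assumed to have already been parsed into (subject, verb, complement) triples with a flag marking negated sentences. It is not ConNeKTion's actual data structure.

```python
from collections import defaultdict


def build_graph(triples):
    """Build a concept graph from (subject, verb, complement, is_positive) tuples.

    Concepts (subjects and complements) are the nodes; each arc is keyed by
    (subject, complement) and stores, per verb, a [positive, negative] count.
    """
    graph = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for subject, verb, complement, is_positive in triples:
        counts = graph[(subject, complement)][verb]
        # Index 0 counts positive sentences, index 1 counts negative ones.
        counts[0 if is_positive else 1] += 1
    return graph


# Hypothetical toy input: the last field is False for a negated sentence.
triples = [
    ("dog", "eat", "bone", True),
    ("dog", "eat", "bone", True),
    ("dog", "eat", "bone", False),
    ("dog", "bury", "bone", True),
]
graph = build_graph(triples)
print(dict(graph[("dog", "bone")]))
```

Keeping the positive and negative counts separate is what later allows frequencies to be turned into ratios or probabilities.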
Relevant Concepts
● Relevant nodes are sought in the graph
● Mixed strategy
● Semantic network structure
● EM clustering provided by Weka
● Keyword Extraction
● Quantitative approach based on co-occurrences
● Qualitative approach exploiting WordNet
● Psychological approach based on the principles of effective
presentation
● Components empirically weighted
Information Retrieval
● Word Sense Disambiguation
● One Domain per Discourse assumption: many uses of a word in a
coherent portion of text tend to share the same domain
● Prevalent domain identification
● Extraction of all synsets for each term
● Extraction of all domains for each synset
● Choice of the synset belonging to the prevalent domain
● Pairwise Complete Link Agglomerative Clustering
● Each synset generates a singleton cluster
● For each pair of clusters
● If the complete link property holds
● Merge the involved clusters
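The two steps on this slide can be sketched as below. Everything here is an assumption made for illustration: the function names, the synset identifiers, the domain labels, the similarity values and the threshold are hypothetical toy data, not taken from WordNet or from the tool. The complete link property is read here as "every cross-cluster pair of synsets has similarity at or above a threshold".

```python
from collections import Counter


def disambiguate(term_synsets, synset_domains):
    """Pick, for each term, a synset that belongs to the prevalent domain."""
    # Count how often each domain occurs over all synsets of all terms.
    counts = Counter(
        domain
        for synsets in term_synsets.values()
        for synset in synsets
        for domain in synset_domains.get(synset, ())
    )
    prevalent = counts.most_common(1)[0][0]
    chosen = {}
    for term, synsets in term_synsets.items():
        in_domain = [s for s in synsets if prevalent in synset_domains.get(s, ())]
        if in_domain:
            chosen[term] = in_domain[0]
    return chosen


def complete_link_clustering(items, similarity, threshold):
    """Merge clusters while some pair satisfies the complete link property."""
    clusters = [[item] for item in items]  # each item starts as a singleton
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete link: ALL cross-cluster pairs must be similar enough.
                if all(similarity(a, b) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] = clusters[i] + clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical toy data.
term_synsets = {"bank": ["bank.n.01", "bank.n.02"], "loan": ["loan.n.01"],
                "interest": ["interest.n.01", "interest.n.04"]}
synset_domains = {"bank.n.01": ["geography"], "bank.n.02": ["economy"],
                  "loan.n.01": ["economy"], "interest.n.01": ["psychology"],
                  "interest.n.04": ["economy"]}
senses = disambiguate(term_synsets, synset_domains)

toy_similarity = {frozenset(("cat.n.01", "dog.n.01")): 0.8,
                  frozenset(("cat.n.01", "pet.n.01")): 0.7,
                  frozenset(("dog.n.01", "pet.n.01")): 0.3}


def sim(a, b):
    return toy_similarity.get(frozenset((a, b)), 0.0)


clusters = complete_link_clustering(
    ["cat.n.01", "dog.n.01", "pet.n.01", "car.n.01"], sim, threshold=0.5)
```

In the toy clustering, `pet.n.01` stays separate because it is close to `cat.n.01` but not to `dog.n.01`, which is exactly what the complete link requirement rules out.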
Information Retrieval
● Multi-strategy Similarity Measure on WordNet
● 3 components summed and normalized in ]0,1[
● depth (ancestors)
● breadth (direct neighbors)
● breadth (inverse neighbors)
● Document Partitioning
● For each document
● Each synset votes for a cluster
● User Query Processing
● Brute-force WSD to find the best combination of synsets
● Best combination used to return a ranked list of clusters
● Each cluster has a list of related documents obtained by the Document
Partitioning phase
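A rough sketch of the document partitioning and query processing steps follows. The slide does not say how a synset votes or how a combination is scored, so two assumptions are made here: a synset votes for the cluster that contains it, and a combination is scored by the sum of its pairwise similarities. All names and data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations, product


def partition_documents(doc_synsets, clusters):
    """Assign each document to the cluster that receives most synset votes."""
    synset_to_cluster = {s: idx for idx, cluster in enumerate(clusters)
                         for s in cluster}
    cluster_docs = defaultdict(list)
    for doc_id, synsets in doc_synsets.items():
        # Assumption: a synset votes for the cluster it belongs to.
        votes = Counter(synset_to_cluster[s] for s in synsets
                        if s in synset_to_cluster)
        if votes:
            cluster_docs[votes.most_common(1)[0][0]].append(doc_id)
    return dict(cluster_docs)


def best_synset_combination(candidates, similarity):
    """Brute-force WSD: try every combination, one synset per query term."""
    best, best_score = None, float("-inf")
    for combo in product(*candidates):
        # Assumption: score a combination by its summed pairwise similarity.
        score = sum(similarity(a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best, best_score = combo, score
    return best


# Hypothetical toy data.
clusters = [["bank.n.02", "loan.n.01"], ["river.n.01", "bank.n.01"]]
doc_synsets = {"doc1": ["bank.n.02", "loan.n.01", "river.n.01"],
               "doc2": ["river.n.01", "bank.n.01"],
               "doc3": ["unknown.n.01"]}
partition = partition_documents(doc_synsets, clusters)

toy_similarity = {frozenset(("bank.n.02", "loan.n.01")): 0.9,
                  frozenset(("bank.n.01", "loan.n.01")): 0.1}


def sim(a, b):
    return toy_similarity.get(frozenset((a, b)), 0.0)


best = best_synset_combination([["bank.n.01", "bank.n.02"], ["loan.n.01"]], sim)
```

The brute-force search grows exponentially with the number of query terms, which is usually acceptable only because user queries tend to be short.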
Reasoning ‘by association’
Breadth-First Search
Given two nodes (concepts), a Breadth-First Search starts from
both nodes; each search looks for the other's frontier, until
the two frontiers meet. Then the path is reconstructed by
going backward to the roots in both directions.
We also provide the number of positive/negative instances, and
the corresponding ratios over the total, to help understand
different gradations (permitted, prohibited, typical, rare, etc.) of
actions between two objects.
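The search described above can be sketched as a bidirectional Breadth-First Search. This is a minimal illustration, assuming an undirected graph stored as an adjacency dictionary; the function name and the toy graph are hypothetical, and the positive/negative counts on the arcs are left out for brevity.

```python
from collections import deque


def bidirectional_bfs(graph, start, goal):
    """Return a path between start and goal, or None if they are not connected."""
    if start == goal:
        return [start]
    parents_fwd, parents_bwd = {start: None}, {goal: None}
    queue_fwd, queue_bwd = deque([start]), deque([goal])

    def expand(queue, parents, other_parents):
        # Expand one full level; return the node where the frontiers meet.
        for _ in range(len(queue)):
            node = queue.popleft()
            for neighbor in graph.get(node, ()):
                if neighbor not in parents:
                    parents[neighbor] = node
                    if neighbor in other_parents:
                        return neighbor
                    queue.append(neighbor)
        return None

    while queue_fwd and queue_bwd:
        meeting = expand(queue_fwd, parents_fwd, parents_bwd)
        if meeting is None:
            meeting = expand(queue_bwd, parents_bwd, parents_fwd)
        if meeting is not None:
            # Walk back to the start root, then forward to the goal root.
            path, node = [], meeting
            while node is not None:
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meeting]
            while node is not None:
                path.append(node)
                node = parents_bwd[node]
            return path
    return None


# Hypothetical toy graph (undirected, so each arc is listed in both directions).
graph = {
    "young": ["television"],
    "television": ["young", "facebook"],
    "facebook": ["television", "schoolwork"],
    "schoolwork": ["facebook"],
    "island": [],
}
path = bidirectional_bfs(graph, "young", "schoolwork")
no_path = bidirectional_bfs(graph, "young", "island")
```

Searching from both ends keeps each frontier shallow, which usually means fewer visited nodes than a single search from one concept.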
Reasoning ‘by association’
Breadth-First Search
The table below shows a sample of possible outcomes. E.g., an interpretation of case 1 can be: “young people watch television, which talks about (and criticizes) Facebook, because it typically does not help (rather, it distracts from) schoolwork”.
Reasoning ‘by association’
Probabilistic approach
Real-world data are typically noisy and uncertain → need for strategies
that soften classical rigid logical reasoning.
A formalism based on the ProbLog language has been defined: p_i :: f_i
● f_i : ground literal of the form link(subject, verb, complement)
● p_i : ratio between the sum of all examples for which f_i holds and
the sum of all possible links between subject and complement
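As an illustration of the p_i :: f_i formalism, the sketch below turns observed triples into ProbLog-style probabilistic facts. It assumes that p_i is the number of occurrences of a (subject, verb, complement) link divided by the number of all observed links between the same subject and complement. The function name and the toy triples are hypothetical.

```python
from collections import Counter


def problog_facts(triples):
    """Turn (subject, verb, complement) observations into 'p::link(...).' facts."""
    link_counts = Counter(triples)
    # Total number of observed links for each (subject, complement) pair.
    pair_totals = Counter((s, c) for s, _, c in triples)
    facts = []
    for (s, v, c), n in sorted(link_counts.items()):
        p = n / pair_totals[(s, c)]
        facts.append(f"{p:.2f}::link({s},{v},{c}).")
    return facts


# Hypothetical toy observations.
triples = (
    [("dog", "eat", "bone")] * 3
    + [("dog", "bury", "bone")]
    + [("cat", "chase", "mouse")]
)
facts = problog_facts(triples)
for fact in facts:
    print(fact)
```

With these toy counts, "eat" accounts for three of the four links between dog and bone, so it receives probability 0.75, while "bury" receives 0.25.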
ConNeKTion
Conclusions
ConNeKTion learns conceptual graphs from plain text and
enriches them by finding concept generalizations.
The resulting graph can be used for several purposes:
● finding relationships between concepts (if any)
● filtering the concepts from a particular perspective
● relevant concept recognition and information retrieval
A suitable control panel is provided for the user to comfortably carry out
these activities.
Future Work
We plan to improve the natural language text pre-processing using anaphora
resolution in order to replace, where possible, pronouns with the explicit concept
they express.
All functionalities have parameters set empirically. A criterion for automatically
setting suitable parameters is needed.
The presented functionalities are based on the exploitation of WordNet. A strategy
to make the operators WordNet-free would be desirable.
We also wish to extend the reasoning operators by adding an argumentation
operator that could exploit probabilistic weights, intended as a rate of reliability,
to provide support for or attack on a given statement.