ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Università degli studi di Bari “Aldo Moro” Dipartimento di Informatica ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text F. Leuzzi, S. Ferilli, F. Rotella {fabio.leuzzi, stefano.ferilli, fulvio.rotella}@uniba.it 9th Italian Research Conference on Digital Libraries Università la Sapienza - Rome, Italy January 31 - February 1, 2013 L.A.C.A.M. http://lacam.di.uniba.it

DESCRIPTION

Studying, understanding and exploiting the content of a digital library, and extracting useful information from it, require automatic techniques that can effectively support users. To this end, concept taxonomies can play a relevant role. Unfortunately, such resources are scarce, and building and maintaining them manually is costly and error-prone. This work presents ConNeKTion, a tool for conceptual graph learning and exploitation. It learns conceptual graphs from plain text and enriches them by finding concept generalizations. The resulting graph can be used for several purposes: finding relationships between concepts (if any), filtering the concepts from a particular perspective, keyword extraction and information retrieval. A suitable control panel is provided for the user to comfortably carry out these activities.

TRANSCRIPT

Page 1: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Università degli studi di Bari “Aldo Moro”
Dipartimento di Informatica

ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

F. Leuzzi, S. Ferilli, F. Rotella
{fabio.leuzzi, stefano.ferilli, fulvio.rotella}@uniba.it

9th Italian Research Conference on Digital Libraries
Università la Sapienza - Rome, Italy

January 31 - February 1, 2013

L.A.C.A.M. http://lacam.di.uniba.it

Page 2: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Overview

● Introduction & Objectives

● Tool overview

● Knowledge Representation Formalism

● Relevant concepts

● Information Retrieval

● Reasoning by Association

● Exploiting Tool

● Conclusions & Future Works

Page 3: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Introduction

Some repositories leave the responsibility of quality to the authors.

+ Anybody can produce and distribute documents.

= Possible low average quality of the repository contents.

The study, understanding and exploitation of the content of a digital library require automatic techniques, with the aim of easily exploring the semantic content of huge amounts of text.

Page 4: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Introduction

Possible solution:

● Natural Language Processing systems
  ● Provide the grammatical structures contained in text

● Knowledge Representation formalisms
  ● Semantic networks

● Graph learning techniques
  ● To obtain a semantic network starting from the text

● In order to satisfy the information needs, the knowledge base can be exploited:
  ● To make summarizations
  ● To reason with it
  ● ...

Page 5: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Objectives

Improving the use of a digital library (DL)

● Use of a tool providing advanced functionalities

● Mixed strategy for relevant concept recognition

● Semantic approach to information retrieval

● Automatic inference over the acquired knowledge

● Reasoning by association

Page 6: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Tool overview

Page 7: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Knowledge representation formalism

Only subject, verb and complement have been considered.

● Subjects and complements → concepts

● Verbs → relations between them

The frequency of arcs between the concepts in positive and negative sentences has been taken into account.

● Enrich the representation formalism

● Give robustness to our solution through a statistical approach

[Figure labels: “subject, verb, complement” and “subject, complement”]
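The representation described above can be illustrated with a minimal Python sketch. It assumes that sentences have already been parsed into (subject, verb, complement) triples, each flagged as coming from a positive or a negative sentence; the function name `build_graph`, the data layout and the sample triples are hypothetical and are not taken from the tool itself.

```python
from collections import defaultdict


def build_graph(triples):
    """Build a concept graph from (subject, verb, complement, is_positive) tuples.

    Hypothetical layout: graph[(subject, complement)][verb] is a two-item list
    [positive_count, negative_count].
    """
    graph = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for subject, verb, complement, is_positive in triples:
        counts = graph[(subject, complement)][verb]
        # Index 0 counts positive sentences, index 1 counts negative ones.
        counts[0 if is_positive else 1] += 1
    return graph


# Illustrative input only: these triples are invented for the example.
triples = [
    ("dog", "chase", "cat", True),
    ("dog", "chase", "cat", True),
    ("dog", "chase", "cat", False),
    ("cat", "eat", "fish", True),
]
graph = build_graph(triples)
```

Keeping separate positive and negative counters per verb is one simple way to retain the frequency information that the statistical approach relies on.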

Page 8: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Relevant Concepts

● Relevant nodes are sought in the graph

● Mixed strategy

● Semantic network structure

● EM clustering provided by Weka

● Keyword Extraction

● Quantitative approach based on co-occurrences

● Qualitative approach exploiting WordNet

● Psychological approach based on principles of an effective presentation

● Components empirically weighted

Page 9: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Information Retrieval

● Word Sense Disambiguation

● One Domain per Discourse assumption: many uses of a word in a coherent portion of text tend to share the same domain

● Prevalent domain identification

● Extraction of all synsets for each term

● Extraction of all domains for each synset

● Choice of prevalent domain synset

● Pairwise Complete Link Agglomerative Clustering

● Each synset generates a singleton cluster

● For each pair of clusters

● If the complete link property holds

● Merge the involved clusters
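The clustering steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the complete link property is taken to mean that every cross-cluster pair of items has similarity at or above a threshold, and both the threshold and the toy similarity function are hypothetical stand-ins for the WordNet-based measure.

```python
from itertools import combinations


def complete_link_holds(cluster_a, cluster_b, similarity, threshold):
    """True if every cross-cluster pair is at least `threshold` similar (assumed definition)."""
    return all(similarity(x, y) >= threshold for x in cluster_a for y in cluster_b)


def complete_link_clustering(items, similarity, threshold):
    # Each item starts in its own singleton cluster.
    clusters = [frozenset([item]) for item in items]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(clusters, 2):
            if complete_link_holds(a, b, similarity, threshold):
                # Merge the two clusters and rescan the pairs from scratch.
                clusters.remove(a)
                clusters.remove(b)
                clusters.append(a | b)
                merged = True
                break
    return clusters


# Toy similarity table, invented for the example.
TOY_SIMILARITY = {
    frozenset(("cat", "dog")): 0.9,
    frozenset(("car", "truck")): 0.8,
}


def toy_similarity(x, y):
    if x == y:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((x, y)), 0.1)


result = complete_link_clustering(["cat", "dog", "car", "truck"], toy_similarity, 0.5)
```

Restarting the pair scan after each merge keeps the sketch short; a real implementation would likely cache pairwise similarities instead.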

Page 10: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Information Retrieval

● Multi-strategy Similarity Measure on WordNet

● 3 components summed and normalized in ]0,1[

● depth (ancestors)

● breadth (direct neighbors)

● breadth (inverse neighbors)

● Document Partitioning

● For each document

● Each synset votes for a cluster

● User Query Processing

● Brute force WSD to find the best synset combination

● Best combination used to return a ranked list of clusters

● Each cluster has a list of related documents obtained in the Document Partitioning phase
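The Document Partitioning step can be sketched as a simple vote count. The slides only state that each synset votes for a cluster; assigning the document to the cluster with the most votes is an assumption made here for illustration, and the function name, identifiers and sample data are hypothetical.

```python
from collections import Counter


def partition_documents(doc_synsets, synset_cluster):
    """Assign each document to the cluster that collects the most synset votes.

    doc_synsets:    {document id: list of synset ids found in the document}
    synset_cluster: {synset id: cluster id}
    Winner-takes-all assignment is an assumption, not stated in the slides.
    """
    assignment = {}
    for doc_id, synsets in doc_synsets.items():
        # Each synset that belongs to a known cluster casts one vote for it.
        votes = Counter(synset_cluster[s] for s in synsets if s in synset_cluster)
        if votes:
            assignment[doc_id] = votes.most_common(1)[0][0]
    return assignment


# Invented sample data.
synset_cluster = {"cat.n.01": "animals", "dog.n.01": "animals", "car.n.01": "vehicles"}
doc_synsets = {
    "doc1": ["cat.n.01", "dog.n.01", "car.n.01"],
    "doc2": ["car.n.01"],
    "doc3": [],
}
assignment = partition_documents(doc_synsets, synset_cluster)
```

Documents with no recognized synsets are left unassigned in this sketch; how the tool handles them is not described in the slides.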

Page 11: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Reasoning ‘by association’

Breadth-First Search

Given two nodes (concepts), a Breadth-First Search starts from both nodes: each search looks for the other's frontier, until the two frontiers meet. The path is then reconstructed by going backward to the roots in both directions.

We also provide the number of positive/negative instances, and the corresponding ratios over the total, to help in understanding different gradations (permitted, prohibited, typical, rare, etc.) of actions between two objects.
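A bidirectional Breadth-First Search of the kind described above might look like the following Python sketch. It assumes an undirected graph given as an adjacency dictionary; the function names and the sample graph are hypothetical, and the sketch returns a connecting path without claiming it is the one the tool would report.

```python
from collections import deque


def walk_back(parents, node):
    """Follow parent links from `node` back to the root of one search."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path


def expand(frontier, parents, other_parents, adjacency):
    """Expand one BFS level; return the meeting node if the other search was reached."""
    for _ in range(len(frontier)):
        node = frontier.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor in other_parents:
                    return neighbor
                frontier.append(neighbor)
    return None


def bidirectional_bfs(adjacency, start, goal):
    if start == goal:
        return [start]
    parents_a, parents_b = {start: None}, {goal: None}
    frontier_a, frontier_b = deque([start]), deque([goal])
    while frontier_a and frontier_b:
        meeting = expand(frontier_a, parents_a, parents_b, adjacency)
        if meeting is None:
            meeting = expand(frontier_b, parents_b, parents_a, adjacency)
        if meeting is not None:
            # Rebuild the path backward to both roots and join the halves.
            left = walk_back(parents_a, meeting)[::-1]
            right = walk_back(parents_b, meeting)[1:]
            return left + right
    return None


# Illustrative undirected graph; the edges are invented for the example.
adjacency = {
    "young": ["television"],
    "television": ["young", "facebook"],
    "facebook": ["television", "schoolwork"],
    "schoolwork": ["facebook"],
}
path = bidirectional_bfs(adjacency, "young", "schoolwork")
```

Each arc on the returned path could then be annotated with its positive and negative counts to express the gradations mentioned above.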

Page 12: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Reasoning ‘by association’

Breadth-First Search

The table below shows a sample of possible outcomes. E.g., an interpretation of case 1 can be: “the young look at television, which talks about (and criticizes) facebook, because it typically does not help (rather, it distracts from) schoolwork”.

Page 13: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Reasoning ‘by association’

Probabilistic approach

Defined a formalism based on the ProbLog language: p_i :: f_i

● f_i : ground literal of the form link(subject, verb, complement)

● p_i : ratio between the sum of all examples for which f_i holds and the sum of all possible links between subject and complement

Real-world data are typically noisy and uncertain → need for strategies that soften the classical rigid logical reasoning
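The probability labels of the formalism above can be sketched in Python as follows. The sketch assumes that "all possible links between subject and complement" means all observed triples sharing that subject and complement, which is an interpretation rather than something the slides state; the function name and the sample triples are hypothetical. The output strings follow ProbLog's `p::fact.` syntax for probabilistic facts.

```python
from collections import Counter


def problog_facts(triples):
    """Turn observed (subject, verb, complement) triples into ProbLog-style facts.

    p = (occurrences of the triple) / (occurrences of any link between the
    same subject and complement). This reading of the ratio is an assumption.
    """
    link_counts = Counter(triples)
    pair_totals = Counter((subject, complement) for subject, _, complement in triples)
    facts = []
    for (subject, verb, complement), count in sorted(link_counts.items()):
        probability = count / pair_totals[(subject, complement)]
        facts.append(f"{probability:.2f}::link({subject},{verb},{complement}).")
    return facts


# Invented sample observations.
triples = [("young", "look", "television")] * 3 + [("young", "buy", "television")]
facts = problog_facts(triples)
```

The resulting lines could be loaded into a ProbLog program as probabilistic facts, on top of which path queries between concepts would be defined.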

Page 14: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

ConNeKTion

Page 15: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

ConNeKTion

Page 16: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

ConNeKTion

Page 17: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

ConNeKTion

Page 18: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

ConNeKTion

Page 19: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Conclusions

ConNeKTion learns conceptual graphs from plain text and enriches them by finding concept generalizations.

The resulting graph can be used for several purposes:

● finding relationships between concepts (if any)

● filtering the concepts from a particular perspective

● relevant concept recognition and information retrieval

A suitable control panel is provided for the user to comfortably carry out these activities.

Page 20: ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text

Future Works

We plan to improve the natural language text pre-processing using anaphora resolution in order to replace, where possible, pronouns with the explicit concept they express.

All functionalities have parameters set empirically. A criterion for automatically setting suitable parameters is needed.

The presented functionalities are based on the exploitation of WordNet. A strategy to make the operators WordNet-free would be desirable.

We also wish to extend the reasoning operators by adding an argumentation operator that could exploit probabilistic weights, intended as a rate of reliability, to provide support for or attack against a given statement.