inference and reconciliation in a crowdsourced lexical-semantic network · inference and...

Inference and Reconciliationin a Crowdsourced Lexical-Semantic Network

Manel Zarrouk, Mathieu Lafourcade, and Alain Joubert

LIRMM, Montpellier,161, rue Ada,34392 Montpellier Cedex,

France

{manel.zarrouk, lafourcade, joubert}@lirmm.fr

Abstract. Lexical-semantic network construction andvalidation is a major issue in NLP. No matter theconstruction strategies used, automatically inferring newrelations from already existing ones is a way to improvethe global quality of the resource by densifying thenetwork. In this context, the purpose of an inferenceengine is to formulate new conclusions (i.e. relationsbetween terms) from already existing premises (alsorelations) on the network. In this paper we devise aninference engine for the JeuxDeMots lexical networkwhich contains terms and typed relations betweenterms. In the JeuxDeMots project, the lexical networkis constructed with the help of a game with a purposeand thousands of players. Polysemous terms may berefined in several senses (bank may be a bank-financialinstitution or a bank-river) but as the network isindefinitely under construction (in the context of aNever Ending Learning approach) some senses may bemissing. The approach we propose is based on thetriangulation method implementing semantic transitivitywith a blocking mechanism for avoiding proposingdubious new relations. Inferred relations are proposedto contributors to be validated. In case of invalidation, areconciliation strategy is undertaken to identify the causeof the wrong inference : an exception, an error in thepremises or a transitivity confusion due to polysemy withthe identification of the proper word senses at stake.

Keywords. Inference, reconciliation, lexical networks.

Inferencia y reconciliación en la redléxica y semántica basada en la

técnica crowdsourced

Resumen. Construcción y validación de una redléxica y semántica es un reto en el procesamiento delenguaje natural. Para todas las estrategias usadasde construcción, un método de mejorar la calidadgeneral del recurso es la inferencia automática derelaciones nuevas a partir de las existentes, lo cualresulta en el aumento de la densidad de la red.En este contexto un motor de inferencia tiene elobjetivo de deducir las conclusiones nuevas, es decir,

relaciones entre términos, a partir de las premisasexistentes (también relaciones) en la red. En esteartículo se diseña un motor de inferencia para lared léxica JeuxDeMots, la cual contiene términos yrelaciones definidas entre términos. En el proyectoJeuxDeMots la red léxica se construye mediante unjuego con propósito y miles de jugadores. Términospolisémicos pueden ser refinados en varios significados(un banco puede ser una institución financiara y unamueble) pero como la red se está construyendo demanera infinita (en el contexto del enfoque “Aprendizajeque Nunca Termina”), algunos significados puedenfaltar. El enfoque propuesto se basa en el método detriangulación implementando la transitividad semánticacon el mecanismo de bloqueo para evitar las propuestasde relaciones nuevas dudosas. Las relaciones inferidasse proponen a los contribuyentes para validarlas. En elproceso de validación se puede emplear la estrategiade reconciliación con el fin de identificar la causa deuna inferencia incorrecta: una excepción, un erroren las premisas o una confusión de transitividadpor la polisemia cuando es necesario identificar lossignificados apropiados de palabras.

Palabras clave. Inferencia, reconciliación, redesléxicas.

1 Introduction

Developing lexico-semantic network for NLP inone the the major issue of the field. Mostof the existing resources have been constructedby hand, like for instance the famous WordNet.Of course some tools are generally designedfor consistency checking, but nevertheless thetask remains time consuming and costly. Fullyautomated approaches are generally limited toterm coocurrences as extracting precise semanticrelations between terms from text is really difficult.New approaches involving crowdsourcing areflowering in NLP especially with the advent ofAmazon Mechanical Turk or in a broader scope

Susana

Cuadro de texto

Computación y Sistemas Vol. 17 No.2, 2013 pp.147-159 ISSN 1405-5546

Wikipedia and Wiktionnary, to cite the most wellknown examples. WordNet ([8] and [2]) is sucha lexical network based on synsets which canbe roughly considered as concepts. [14] withEuroWordnet a multilingual version of WordNetand [10] for WOLF, a French version of WordNet,applied automated crossing of WordNet and otherlexical resources with some manual checking.[9] constructed automatically BabelNet a largemultilingual lexical network from the Wikipediaencyclopedia, but mostly with term coocurrences.Hownet [1] is another example of a large bilingualknowledge base (English and Chinese) containingsemantic relations between word form, conceptsand attributes. In Hownet there are much moredifferent relations than in WordNet, although bothprojects started in the 80’s as manually constructedby linguists and psychologists.

A highly lexicalized lexico-semantic networkcan contain concepts but also plain words (andmulti-word expressions) as entry points (nodes)along with word senses. The idea itself ofword senses in the lexicographic tradition may bedebatable in the case of resources for semanticanalysis, and we generally prefer to consider wordusages. By word usages we mean refinementsof a given word which are clearly identified bylocutors but might not be always separated formother word usages of the same entry. A wordusage put the emphasis on how and whichcontext the term is actually used by locutors. Apolysemic term has several usages that mightdiffer substantially for word senses as classicallydefined. A given usage can also in turn haveseveral refinements. For example, frigate can be abird or a ship. A frigate>boat can be distinguishedas a modern ship of an ancient vessel. In thecontext of a collaborative construction, such alexical resource should be considered as beingconstantly under construction. For a polysemicterm, some refinements might be just missing.As a general rule, we have no definite certitudeabout the state of an entry. There is no way(unless by close inspection) to know if a given entryrefinements are fully completed, and even if thisquestion is really relevant.

The building of a collaborative lexical network (orany similar resource in general) can be devisedaccording to two strategies. First, as a contributivesystem like Wikipedia where people willingly addand complete entries (like for Wiktionary). Second,contributions can be made indirectly thanks to

games (better known as GWAP [13]) and inthis case players do not need to be aware thatwhile playing they are helping building a lexicalresource. In any case, the built lexical network isnot free of errors which are corrected along theirdiscovery. Past experience shows that playersand contributors complete the resource on termsthat interest them. Thus a large number ofobvious relations are not contained in the lexicalnetwork but are indeed necessary for a high qualityresources usable in various NLP application andsemantic analysis. For example, contributorsseldom indicate that a particular bird type can fly,as it is considered an obvious generality. Onlynotable facts which are not easily deductible arecontributed. Well known exceptions are alsogenerally contributed and take the form of anegative weight for the relation (for example, flyagent:−100−−−−−−→ ostrich).

In order to consolidate the lexical network,we adopt a strategy based on a simple (if notsimplistic) inference mechanism to propose newrelations from those existing. The approach isstrictly endogenous as it doesn’t rely on anyother external resources. Inferred relations aresubmitted either to contributors for voting or toexpert for direct validation/invalidation. A largepercentage of the inferred relations has been foundto be correct. However, a non negligible part ofthem are found to be wrong and understandingwhy is both relevant and useful. The explanationprocess can be viewed as reconciliation betweenthe inference engine and the validator who isguided through a dialog to explain why he found theconsidered relation incorrect. The possible causesfor a wrong inferred relation may come from threepossible origins : false premises that were used bythe inference engine, exception or confusion due tosome polysemy.

In this article, we first present the principlesbehind lexical network construction with crowd-sourcing and games with a purpose (also knowas human-based computation game) and exem-plify them with the JeuxDeMots project. Then, wepresent the outline of an elicitation engine basedon an inference engine and a reconciliation engine.Then we describe the system performance basedon our experiments.

Susana

Cuadro de texto

148 Manel Zarrouk, Mathieu Lafourcade, and Alain Joubert

Susana

Cuadro de texto

Computación y Sistemas Vol. 17 No.2, 2013 pp. 147-159 ISSN 1405-5546

2 Lexical Network and Crowdsourcing

There are many ways for building a lexical networkconsidering some crucial factors as the quality ofdata, cost and time. Beside manual or automatedstrategies, contributive approaches are more andmore popular as they are both cheap to set upand efficient in quality. More specifically, thereis an increasing trend of using on-line GWAPs(game with a purpose [12]) method for feedingsuch resource.

The JDM lexical network is constructed througha set of on-line associative games. In thesegames, players are appealed to contribute onlexical and semantic relations between terms orverbal expressions which are presented in thenetwork by the arcs interconnecting nodes in agraph. The informations in the JDM network aregathered by an unnegotiated crowd agreement(classical contributive systems rely on a negotiatedcrowd agreement).

2.1 JeuxDeMots: a GWAP for Building aLexico-semantic Network

JeuxDeMots1 is a two player GWAP, launchedin September 2007, that aims to build a largelexico-semantic network [4]. The network iscomposed of terms (as vertices) and typedrelations (as links between vertices). It containsterms and possible refinements in a similar wayto the WordNet synset [8]. There are more than50 types for relations and relation occurrences areweighted. The weight of a relation is interpreted asa strength, but not directly as a probability of beingvalid neither a confidence level.

When Player A begins a game, instructionsconcerning the type of lexical relation (synonyms,antonym, domain, etc.) are displayed, as well asa term T chosen from the database. Player A hasa limited time to enter terms which, to his mind,correspond to term T and the lexical relation. Ascreenshot of the user interface during a game isshown in Figure 1 and of the outcome of the gamein Figure 2.

The maximum number of terms a player canenter is limited, thus encouraging the player to thinkcarefully about his choices. The same term T,along with the same instructions, are later givento another player, Player B, for whom the process

1http://jeuxdemots.org

is identical. To make the game more fun, the twoplayers score points for words they both choose.Score calculation is explained in [3] and wasdesigned to increase both precision and recall inthe construction of the database. Answers given byboth players are displayed, those common to bothplayers are highlighted, as are their scores.

For a target term T, common answers from bothplayers are inserted into the database. Answersgiven by only one of the two players are notinserted into the database, thus reducing noise (i.e.mistake of players, in this case) and the chances ofdatabase corruption (voluntary erroneous answersfrom a malicious player). The semantic networkis therefore constructed by connecting terms bytyped and weighted relations, validated by pairs ofplayers. These relations are labelled according tothe instructions given to the players and weightedaccording to the number of pairs of players whochoose them. Initially, prior to putting the gameonline, the database was populated with nodes,however if a pair of players suggest a non-existingterm, the new node is added to the database.

In the interest of quality and consistency, it wasdecided that the validation process would involveanonymous players playing together. A relationis considered valid if and only if it is given by atleast one pair of players. This validation processis similar to that used by [13] for the indexingof images, and by [5] to collect common senseknowledge and [11] for knowledge extraction. Asfar as we know, this technique has never beenused for building semantic networks. In NLP,other Web-based systems exist, such as OpenMind Word Expert [7], which aims at creating largesense-tagged corpora with the help of Web users,and SemKey [6] which makes use of WordNet andWikipedia to disambiguate lexical forms referring toconcepts, thus identifying semantic keywords.

More than 1200000 games of JeuxDeMots havebeen played since its launch in September 2007corresponding of around 20000 hour of cumulatedplay (with 1 minute per game).

2.2 Diko: a Contributive Tool for the JDMNetwork

Diko2 is a web based tool for displaying theinformation contained in the JDM lexical networkbut also can be used as a contributive tool. The

2http://www.jeuxdemots.org/diko.php

Susana

Cuadro de texto

Inference and Reconciliation in a Crowdsourced Lexical-Semantic Network 149

Susana

Cuadro de texto


Fig. 1. Snapshot of an ongoing game, where the player is asked to give words he/she associates to the target term(Willy Wonka). So far, 6 words have been given

Fig. 2. Snapshot of the result of the game, where three words were in common which are linked to the target term (WillyWonka) in the JDM lexical network

necessity to not rely only on the JeuxDeMots gamefor building the lexical network comes from the factthat many relation types of JDM are either difficultto grasp for a casual player or not very productive(not many terms can be associated). Furthermore,the need of such a contributive tool historicallycame from the players themselves as they wantedto become direct contributors of JDM.

The principle of the contribution process is that aproposition made by a user will be voted pro or conby other users. When a given number of votes havebeen casted, an expert validator is notified andfinally includes (or excludes) the proposed relationin the network. The expert can reject altogether arelation proposition or to include it with a negative

weight if this is found to be relevant (for example,the relation bird has−part:−100−−−−→ fin may to be worthyas present in the network). Contributions can alsobe made by automatic processes to be scrutinizedand voted by users. What we propose in this paperfalls under this type of scenario or contributions /validations.

2.3 Some JDM Network Characteristics

At the moment of writing this article, JDM networkcontained 251358 terms and over 1500000 relations(amongst them 14658 being negative). Morethan 4500 terms have some refinements (varioususages) for a total of around 15000 usages.

Susana

Cuadro de texto


Susana

Cuadro de texto


Fig. 3. Snapshot of a Diko screen for the term lapin (rabbit) as animal. In French, lapin can also refers to the fur, themeat, etc.

Although JDM has ontological relations, it isnot an ontology per se with some clean andwell-though hierarchy of concepts or terms. A giventerm can have a substantial set of hypernyms thatcovers a large part of the ontological chain to upperconcepts. For example, hyperonyme(cat) = {feline,mammal, living being, pet, vertebrate, ...}. In theprevious hyperonyms set, we omitted weights forsimplification, but in all generality, heavier termsare those felt by users as being the most relevant.

3 Inference and Reconciliation for anElicitation Engine

We designed a system for augmenting the numberof relations in the JDM lexical network having twomain components: (a) an inference engine and(b) a reconciliator. The inference engine proposesrelations as if it were a contributor, to be validatedby other human contributors or experts. In case ofinvalidation of an inferred relation, the reconciliatoris invoked to try to assess why the inferred relationwas found wrong. Elicitation here should beunderstood as the process to make some implicit

knowledge of the user into explicit relations in thelexical network.

3.1 Inference Engine

The main ideas about inferences in our system arethe following:

— inferring is for the engine to derive logicalconclusions (under the form of relationsbetween terms) from previously knownpremises, which are existing relations;

— candidate inferences may be logically blockedon the basis of the presence or absence ofsome other relations;

— candidate inferences can be filtered out on thebasis of a strength evaluation;

— conclusions made by the inference engine aresupposed to be correct but may turn out tobe correct, correct but irrelevant or incorrectwhen proposed to a human validator.

Susana

Cuadro de texto


Susana

Cuadro de texto


In this paper, the type of inference we areworking with, is based on the transitivity of theontological relation is-a (hypernym). If a term A isa kind of B and B holds some relation R with C,then we can expect that A holds the same relationwith C. The schema for the inference is as followsin Figure 4.

B

A C

1

is-a

3R’?

2R

Fig. 4. Simple inference triangular schema applied tothe transitivity of hyperonymy (is-a relation). Relation(1) and (2) are the premises, and the relation (3) is thelogical conclusion proposed in the lexical network and tobe validated

More formally, we can write:

∃ A is−a−−−−→ B ∧ ∃ B R−−→ C ⇒ A R−−→ C

For example,

cat is−a−−−−−−→ feline ∧ feline has−part−−−−−−→ claw⇒ cat has−part−−−−→ claw

The inference engine is applied on terms havingat least one hypernym (anyway the schema couldnot be applied otherwise). Let us consider aterm T with a set of weighted hypernyms. Fromeach hypernym, the inference engine deducesa set of inferences. Those inference sets arenot disjoint in the general case, and the weightof an inference proposed in several sets is theincremental geometric mean of each occurrence.

For example, as mentioned before, we have thefollowing weighted hypernyms for cat:

{felina, living being,mammal,pet, vertebrate, ...}.

The inference cat has−parts−−−−→ skeleton maycome from several hypernyms but strongly fromvertebrate. The inference cat location−−−−→ house maycome only from pet.

3.1.1 Logical Filtering

Of course, this schema given above is far too naive,especially considering the resource we are dealingwith. In effect, B is possibly a polysemous term andways to block inferences that are certainly wrongcan be devised. If there are two distinct meaningsof the term B that hold respectively the first and thesecond relation, then most probably the inferenceis wrong. This can be formalized (in a positive way)as follows:

∃ A is−a−−−−→ B ∧ ∃ B R−−−−→ C∧( ∃ Bi

mean−of−−−−→ B ∧ ∃ Bjmean−of−−−−→ B )∧

( 6 ∃ A is−a−−−−→ Bi ∨ 6 ∃ BjR−−−−→ C )

⇒ A R−−−−→ C

B

Bi Bj

A C

1

is-a

3R?

4

is-a

2R

5R

Fig. 5. Triangular inference schema with logical blockingbased on the polysemy of the middle term B

Moreover, if one of the premises is tagged astrue but irrelevant, then the inference in blocked.

3.1.2 Statistical Filtering

It is possible to evaluate a confidence level (onan open scale) for each produced inference, insuch a way that dubious inferences can be filteredout. The weight W of an inferred relation is thegeometric mean of the weight of the premises(relations (1) and (2) in Figure 5). If the secondpremise has a negative value, the weight is nota number and the proposal is discarded. As thegeometric mean is less tolerant to small valuesthan the arithmetic mean, inferences not based ontwo rather certain relations (premises) are unlikelyto pass.

W(A R−−→ C) =( W(A is−a−−→ B) * W(B R−−→ C) )1/2

Susana

Cuadro de texto


Susana

Cuadro de texto


3.2 Reconciliation Engine

Inferred relations are presented to the validatorto make a decision of their status: true, truebut irrelevant or false. In case of invalidation,the reconciliator tries to diagnose the reasons:error as one of the premises (previously existingrelations is false), exception or confusion due topolysemy (the inference has been made on apolysemous central term) and initiates a dialog withthe user. The dialog should be succinct as tofind more information with the fewest questions,without bothering the user but nevertheless tryingto evaluate if the user is of good faith. Toknow in which order to proceed, the reconciliatordetermines if the weights of the premises are ratherstrong or weak. This confidence is evaluated bycomparing the relation weight against the thresholdwhere the integrative value of the distribution ofthe relations separates the distribution in two equalparts (see Figure 6). For example, suppose wehave:

term A: schoolboyterm B: humanterm C: facerelation (1): schoolboy is−a−−−−→ human

Figure 6 presents the distribution curve for allrelations having a source term A (schoolboy) with apike separating evenly the surface under the curve.The threshold is at 60 where the pike intersects thedistribution curve.

The confidence threshold for the relation (1) isthe intersection of the distribution curve and thearea/2 curve:

— If W(A is−a−−−−→ B) >= confidence-threshold(A)⇒ A is−a−−−−→ B is a trusted relation;

— If W(A is−a−−−−→ B) < confidence-threshold(A)⇒ A is−a−−−−→ B is a dubious relation.

In the case if we treat both relations (1) and(2) as trusted, the reconciliator tries, by initiatinga dialog with the validator (3.2), to check at firstif the relation inferred is an exception. If not, itproceeds by checking if term B is polysemous andfinally checks if it is an error case. We check theerror case in the final step because the confidencelevel of relations (1) and (2) made them trusted. Forexample, suppose we have:

A:ostrich ; B:bird ; C:fly ; R:carac(1): ostrich is−a−−−−→ bird(2): bird carac−−−−−−→ fly⇒ (3): ostrich carac−−−−−−→ fly

In this case, it’s true that an ostrich is a birdand that a bird can fly, but the inferred relation anostrich can fly is a wrong one and it is consideredas an exception.

In the case of having a dubious relation either for(1) and (2), the reconciliator suspect that it is anerror case and this relation was the cause of wronginferences. So it asks the validator to confirm orto disprove it. In case of disapproval of one of therelations we have an error. If not, we proceed withchecking if it’s an exception case or a polysemy.For example, suppose we have:

A:kid ; B:human ; C:wing ; R:has-part(1): kid is−a−−−−→ human(2): human has−parts−−−−−−→ wing⇒ (3): kid has−parts−−−−−−→ wing

Obviously the relation kid has−parts−−−−−−→ wing iswrong and the relation human has−parts−−−−−−→ wing isthe cause of the wrong inference.

3.2.1 Case of Error in Premises

In this case, suppose that relation (1) (in Figure 5)has a relatively low weight. The reconciliator asksthe validator if the relation (1) is true .

— If the answer is negative, a negative weightis attributed to (1) and the reconciliation iscompleted; as such, this relation will notbe used later on as premises on furtherinferences.

— If the answer is positive, ask if (2) is true andproceed as above if the answer is negative;

— Otherwise, move to the other cases(exception, polysemy).

Susana

Cuadro de texto


Susana

Cuadro de texto


Fig. 6. Distribution of the outgoing relations for ecolier (Eng. schoolboy ). The distribution appears to follow a power law.The pike is the frontier where the surface under the curve on the left is equal to the surface on the right. The intersectionof the pike and the curve can be used as a threshold value for relations as being trustable or not. In the case of the termecolier, relations below a weight of 60 might be dubious

3.2.2 Errors as Exceptions

In the case we have two trusted relations, thereconciliator asks the validator if the inferredrelation A R−−→ C is a kind of exception. Ifit is the case, the relation is stored in thelexical network with a negative weight along witha meta-information which indicates that it is anexception. Relations that are exceptions do notparticipate as premises.

3.2.3 Errors due to Polysemy

In this case, if the middle term (B) presentinga polysemy is mentioned as polysemous in thenetwork, the refinement terms B1, B2, ..., Bn arepresented to the validator so he can choose theappropriate one. The validator can propose anew term as a refinement if he is not satisfiedwith the listed ones (inducing the creation of anew refinement). If there is no meta informationthat indicates the term is polysemous, we ask firstthe validator if it is indeed the case. After this

procedure, two new relations A is−a−−−−→ Bi andBj

R−−−−→ C will be included in the network withsome positive value and the inference engine willuse them later on (Bi and Bj being refinements ofB choosen by the user).

4 Experimentation

We made an experiment with a unique run of theengine over the lexical network. The purpose isto access the production of the inference enginealong with the blocking and filtering. Then fromthe set of supposedly valid inferred relations, wetook a random sample of 300 propositions foreach relation type and undertook the validation /reconciliation process. The experiment conductedis for evaluation purpose only, as actually thesystem is running iteratively along with contributorsand games.

Susana

Cuadro de texto


Susana

Cuadro de texto


Is A R−−→ C true ?

Store the relation

End ofreconciliation

Is A R−−→ Cfalse an

exception?

Store A R−−→ Cwith a W<0.

annotedas"exception"

for A is−a−−−−→ Band B R−−→ C

choose the mostappropriate term todisambiguate therelation or enter anew one if needed

Check if B prop−−→ Polysemous

Is the term Bpolysemous?

Is A is−a−−→ B true?

Establichnew

assignedrelations

A is−a−−→ Bi

and Bjis−a−−→

C and storethem in the

network

W(A is−a−−→ B)= -W(Ais−a−−→ B) and relation

stored in the network

Is B R−−→ C true?

W(B R−−→ C)= -W(BR−−→ C) and relation

stored in the network

Is A R−−→ Ca general

truth?

Store A R−−→ C inthe network with

W<0 annotated as"general truth"

Have youany remark

or clue?

rather true/possible/true irrelevant

Don’t know

mostly false

rather true/possible/true irrelevant

(rather true/possible/true irrelevant)

mostly false

rather true mostly false

No

No

mostly false

mostly false

Yes

Yes

rather true/possible

Fig. 7. Schema of the validation / reconciliation procedure. If an infered candidate relation is found as false then theprocedure aims at indentifying the source of the problem through a dialog with the user. The relation can be an exceptionand stored in the network with such an annotation. If the relation is not an exception, then the central term B is checkedfor being polysemous and if it is the case the user is invited to select a proper usage of B for each premise

4.1 Unleashing the Inference Engine

We applied the inference engine on around 20000randomly selected terms having at least onehypernym and thus produced 1484209 inferences(77089 more were blocked). The threshold forfiltering was set to a weight of 25 (that is to say onlyinferences with a weight equal to or above 25 havebeen considered). This value is relevant as whena human contributor proposed relation is validated

by an expert, it is introduced with a default weightof 25. In Figure 8, the distribution appears to befollowing a power law, which is not totally surprisingas the relation distribution in the lexical network isby itself governed by such a distribution.

Table 1 presents the number of relationsproposed by the inference engine and Table 2the various status of the proposed inferences.The different types for the second premise (thegeneric R relation in the inference triangulation)

Susana

Cuadro de texto


Susana

Cuadro de texto


0,1

1

10

100

1000

10000

100000

1000000

0 100 200 300 400 500 600 700 800

Nu

mb

er

of

occ

ure

nce

s

Inference weight

nb

Puissance (nb)

Fig. 8. Proposed inferences distribution according to weights. Weight bellow 25 are not displayed

Table 1. Productivity of relation types when exploited by the inference engine

Relation type nb proposed nb existing productivityis-a 91 037 91 799 99.16%has-parts 37 2688 21 886 1702.86%holo 108 191 13 124 824.37%lieu 271 717 26 346 1031.34%carac 203 095 24 180 839.92%agent-1 198 359 6 820 2908.48%instr-1 24 957 4 797 520.26%patient-1 14 658 3 930 372.97%lieu-1 145 159 8 835 1642.99%lieu-action 50 035 4 559 1097.49%object mater 4 313 3 097 139.26%

are variously productive. Of course, this is mainlydue to the number of existing relations and thedistribution of their type in the network. Theproductivity of a relation type is the ratio betweenthe number of the proposed inferences and thenumber of occurrences of this relation type in thenetwork.

The transitive inference on is-a is the lessproductive which might seem surprising at firstglance. In fact, the is-a relation is already quitepopulated in the network, and as such, fewernew relations can be inferred. The figures areinverted for some other relations that are not sowell populated in the lexical network but still are

Susana

Cuadro de texto


Susana

Cuadro de texto


Table 2. Status of the inferences proposed by the inference engine

Relation types nb proposed % nb bloqued % nb flitered %hypernym 91 037 6.13 4 034 5.23 53 586 26.32has-parts 372 688 25.11 31 421 40.76 100 297 49.26holonyme (inverse of has-parts) 108 191 7.28 17 944 23.27 26 818 13.17typical location 271 717 18.30 11 502 14.92 14 174 6.96caracteristics 203 095 13.68 2 647 3.43 6 576 3.23agent-1 (what the subject cando)

198 359 13.36 9052 11.74 1122 0.55

instr-1 (the suject can be aninstrument for what)

24 957 1.68 127 0.16 391 0.19

patient-1 (what can be done tothe subject)

14 658 0.98 7 0.01 13 0.00

typical location-1 (what can befound in/on the subject)

145 159 9.78 129 0.17 206 0.10

lieu-action (what can be donein/on the subject)

50 035 3.379 91 0.12 132 0.06

object mater (mater/substance ofthe subject)

4 313 0.29 135 0.17 262 0.12

Total 1 484 209 100 77 089 100 203 577 100

potentially valid. The agent semantic role (theagent-1 relation) is by far the most productive, with30 more propositions than what currently exists inthe lexical network.

4.2 Figures on Reconciliation

In Table 3 is presented some evaluation of thestatus of the inferences proposed by the inferenceengine. Inferences are valid for an overall of80-90% with around 10% valid but not relevant (likefor instance dog has−parts−−−−−−→ proton). We observethat error number in premises is quite low, andnevertheless errors can be easily corrected. Ofcourse, not all possible errors are detected throughthis process. More interestingly, the reconciliationallows in 5% of the cases to identify polysemousterms and refinements. Globally false negatives(inferences voted false but are true) and falsepositives (inferences voted true but are false) areevaluated to less than 0, 5%.

5 Conclusion

In this paper we presented some issues in buildinga lexico-semantic network with games and usercontributions and inferring new relations fromexisting ones. Such a network is highly lexicalized

and word usages are discovered incrementallyalong with its construction. Errors are naturallypresent in the resources as they might comefrom games played for difficult relations, but theyare usually discovered by contributors for termsthey are interested in. The same observationis generally done on what contributors contribute.To be able to enhance the network quality andcoverage, we proposed an elicitation engine basedon inferences and reconciliations. Inferencesare here conducted on the basis of a simpletriangulation based on the hypernymy transitivity,along a logical blocking and statistical filtering. Areconciliation process is conducted in case theinferred relation is proven wrong, in order to identifythe underlying cause. Concerning global figures,we can conclude that inferred relations are correctand relevant in about 78% of the cases and correctbut not relevant in 10% of the cases. Overallwrong inferences are about 12% with at least oneerror in the premises of about 2%, exceptions ofabout 5% and polysemy confusion of about 5%.Beside being just a tool for increasing the numberof relations in a lexical network, the elicitationengine is both an efficient error detector andpolysemy identifier. The actions taken during thereconciliation forbid an inference proven wrong tobe inferred again and again. Such an approach

Susana

Cuadro de texto


Susana

Cuadro de texto


Table 3. Results of the validation / reconciliation according to inference types

% valid % errorRelation types relevant not relevant in premises as exception due to polysemyis-a 76% 13% 2% 0% 9%has-parts 65% 8% 4% 13% 10%holonyme 57% 16% 2% 20% 5%typical location 78% 12% 1% 4% 5%caracteristics 82% 4% 2% 8% 4%agent-1 81% 11% 1% 4% 3%instr-1 62% 21% 1% 10% 6%patient-1 47% 32% 3% 7% 11%typical location-1 72% 12% 2% 10% 6%lieu-action 67% 25% 1% 4% 3%object mater 60% 3% 7% 18% 12%

should be pushed forward with other type ofinference schema and possibly with the evaluationof the distribution of the semantic classes of theterms on which inferences are conducted. Indeedsome semantic classes like concrete objects orliving beings may be substantially more productivefor certain relation types than for example abstractnouns of processes or events. Anyway, suchdiscrepancies of inference productivity betweenclasses are worthy to investigate further.

References

1. Dong, Z. & Dong, Q. (2006). HowNet and theComputation of Meaning. WorldScientific, London.

2. Fellbaum, C. & Miller, G. (1998). (eds) WordNet.The MIT Press.

3. Joubert, A. & Lafourcade, M. (2008). Jeuxdemots: un prototype ludique pour l’ï¿ 1

2mergence de

relations entre termes. In proc of JADT’2008, Ecolenormale supï¿ 1

2rieure Lettres et sciences humaines

, Lyon, France, 12-14 mars 2008, 8 p.

4. Lafourcade, M. (2007). Making people playfor lexical acquisition. In Proc. SNLP 2007,7th Symposium on Natural Language Processing.Pattaya, Thailande, 13-15 December 2007, 8 p.

5. Lieberman, H., Smith, D. A., & Teeters, A.(2007). Common consensus: a web-based gamefor collecting commonsense goals. In Proc. of IUI,Hawaii., 12 p.

6. Marchetti, A., Tesconi, M., Ronzano, F., Mosella,M., & Minutoli, S. (2007). Semkey: Asemantic collaborative tagging system. in Procs ofWWW2007, Banff, Canada, 9 p.

7. Mihalcea, R. & Chklovski, T. (2003). Openmindword expert: Creating large annotated datacollections with web users help. In Proceedingsof the EACL 2003, Workshop on LinguisticallyAnnotated Corpora (LINC 2003), 10 p.

8. Miller, G., Beckwith, R., Fellbaum, C., Gross, D.,& Miller, K. (1990). Introduction to wordnet: anon-line lexical database. International Journal ofLexicography, 3(4), 235–244.

9. Navigli, R. & Ponzetto, S. (2012). Babelnet:Building a very large multilingual semantic network.In Proceedings of the 48th Annual Meeting of theAssociation for Computational Linguistics, Uppsala,Sweden, 11-16 July 2010, 216–225.

10. Sagot, B. & Fier, D. (2008). Construction d’unwordnet libre du franï¿ 1

2ais ï¿ 1

2partir de ressources

multilingues. TALN 2008, Avignon, France, 2008.,12.

11. Siorpaes, K. & Hepp, M. (2008). Games with apurpose for the semantic web. In IEEE IntelligentSystems, 23(3), 50–60.

12. Thaler, S., Siorpaes, K., Simperl, E., & Hofer,C. (2011). A survey on games for knowledgeacquisition. STI Technical Report, May 2011, 19.

13. von Ahn, L. & Dabbish, L. (2008). Designinggames with a purpose. Communications of theACM, 51(8), 58–67.

14. Vossen, P. (1998). Eurowordnet: a multilingualdatabase with lexical semantic networks, 200.

Susana

Cuadro de texto


Susana

Cuadro de texto


Manel Zarrouk is a PhDstudent at University Montpellier2 (France) and makesresearch in Natural LanguageProcessing at the TEXTEteam of LIRMM (Laboratoired’Informatique, de Robotique,et de Microélectronique de

Montpellier). Her research interests are inferencesof schemas and rules in large lexico-semanticnetworks, consolidation and evaluation of lexicalresources for semantic analysis.

Mathieu Lafourcade isassociated professor atUniversity Montpellier 2(France). He conductsresearch in Natural LanguageProcessing at the TEXTEteam of LIRMM (Laboratoired’Informatique, de Robotique,

et de Microélectronique de Montpellier). Hisresearch interests are semantic analysis withbioinspired appraoches (Ant Colony Algorithms),lexical semantics with large knowledge networksand vectors models, lexical acquisition throughcontributive systems like games with a purpose.M. Lafourcade is the inventor of the JeuxDeMotsproject.

Alain Joubert is associatedprofessor at University ofMontpellier 2 (France).He makes researchwithin the TEXTE teamof LIRMM (Laboratoired”Informatique, de Robotiqueet de Microélectronique de

Montpellier). Although he was an astrophysicist,his major research interests are now in the fieldof Natural Language Processing, particularly insemantics, lexical networks, GWAPs. He has beenworking for five years in the JeuxDeMots project.

Article received on 18/12/2012; accepted on 11/01/2013.

Susana

Cuadro de texto


Susana

Cuadro de texto


inference and reconciliation in a crowdsourced lexical-semantic network · inference and...

Documents