Exploiting Distributional Semantic Models in Question Answering

Piero Molino(1,2), Pierpaolo Basile(1,2), Annalina Caputo(1), Pasquale Lops(1), Giovanni Semeraro(1)
{[email protected]}
(1) Dept. of Computer Science, University of Bari Aldo Moro
(2) QuestionCube s.r.l.

6th IEEE International Conference on Semantic Computing, September 19-21, 2012, Palermo, Italy

DESCRIPTION

This paper investigates the role of Distributional Semantic Models (DSMs) in Question Answering (QA), and specifically in a QA system called QuestionCube. QuestionCube is a framework for QA that combines several techniques to retrieve passages containing the exact answers to natural language questions. It exploits Information Retrieval models to seek candidate answers and Natural Language Processing algorithms for the analysis of questions and candidate answers, both in English and Italian. The data source for the answers is an unstructured text document collection stored in search indices. In this paper we propose to exploit DSMs in the QuestionCube framework. In DSMs, words are represented as mathematical points in a geometric space, also known as a semantic space; words are similar if they are close in that space. Our idea is that DSM approaches can help to compute the relatedness between users' questions and candidate answers by exploiting paradigmatic relations between words. Results of an experimental evaluation carried out on the CLEF 2010 QA dataset prove the effectiveness of the proposed approach.

TRANSCRIPT

Page 1: Exploiting Distributional Semantic Models in Question Answering

Exploiting Distributional Semantic Models in Question Answering

Piero Molino(1,2), Pierpaolo Basile(1,2), Annalina Caputo(1), Pasquale Lops(1), Giovanni Semeraro(1)

{[email protected]}
(1) Dept. of Computer Science, University of Bari Aldo Moro

(2) QuestionCube s.r.l.

6th IEEE International Conference on Semantic Computing, September 19-21, 2012, Palermo, Italy


Page 3

Background

A bottle of Tesgüino is on the table.

Everyone likes Tesgüino.

Tesgüino makes you drunk.

We make Tesgüino out of corn.

It is a corn beer.

Page 4

Background

"You shall know a word by the company it keeps!" (J. R. Firth)

The meaning of a word is determined by its usage.


Page 5

Background

Distributional Semantic Models (DSMs)

– represent words as points in a geometric space

– do not require specific text operations

– corpus/language independent

Widely used in IR and Computational Linguistics

– semantic text similarity, synonym detection, query expansion, …

Not directly used in Question Answering (QA)



Page 7

Our idea

Use DSMs in a QA system
– QuestionCube: a framework for QA

Exploit DSMs
– compare the semantic similarity between the user's question and each candidate answer

Q: Which beverages contain alcohol?
A: Tesgüino makes you drunk.

Page 8

Outline

• Introduction to the QA problem

• Distributional Semantic Models

– semantic composition

• DSMs into QuestionCube

• Evaluation and results

• Final remarks

Page 9

QA vs. IR

                          Question Answering                 Information Retrieval
Information need          Natural language question          Keywords
Result                    Factoid answer or short passages   Whole documents
Information acquisition   Reading answers                    Searching manually inside documents

Page 10

DISTRIBUTIONAL SEMANTIC MODELS

Page 11

Distributional Semantic Models…

Defined as a tuple <T, C, R, W, M, d, S>

– T: target elements (words)

– C: contexts

– R: relation between T and C

– W: weighting schema

– M: geometric space T × C

– d: space reduction M → M'

– S: similarity function defined in M'

Page 12

…Distributional Semantic Models

Term-term co-occurrence matrix: each cell contains the co-occurrences between two terms within a prefixed distance

            dog   cat   computer   animal   mouse
dog           0     4          0        2       1
cat           4     0          0        3       5
computer      0     0          0        0       3
animal        2     3          0        0       2
mouse         1     5          3        2       0
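As a minimal sketch, such a matrix can be accumulated by sliding a fixed window over each sentence; the toy corpus below is hypothetical, and the window size of 4 matches the co-occurrence distance used later in the evaluation:

```python
from collections import defaultdict

def cooccurrence_counts(sentences, window=4):
    """Count co-occurrences of term pairs within a fixed window."""
    counts = defaultdict(int)
    for tokens in sentences:
        for i, term in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(term, tokens[j])] += 1
    return counts

# Hypothetical toy corpus; the real model is built from a document collection
corpus = [["the", "dog", "chases", "the", "cat"],
          ["the", "cat", "chases", "the", "mouse"]]
counts = cooccurrence_counts(corpus)
print(counts[("dog", "cat")])    # 1
print(counts[("cat", "mouse")])  # 1
```

By construction the resulting counts are symmetric, as in the matrix above.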

Page 13

…Distributional Semantic Models

TTM: term-term co-occurrence matrix

Latent Semantic Analysis (LSA): relies on the Singular Value Decomposition (SVD) of the co-occurrence matrix

Random Indexing (RI): based on Random Projection

Latent Semantic Analysis over Random Indexing (LSARI)

Page 14

Latent Semantic Analysis

Given the TTM matrix M:

– apply the SVD decomposition M = UΣVᵀ

• Σ is the matrix of singular values

– choose the k largest singular values from Σ and approximate the matrix M

• M* = U*Σ*V*ᵀ

– SVD:

• discovers high-order relations between terms

• reduces the sparseness of the original matrix
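The rank-k approximation above can be sketched with NumPy; the toy matrix reuses the term-term counts from the earlier slide, and `truncated_svd` is a hypothetical helper name:

```python
import numpy as np

def truncated_svd(M, k):
    """Rank-k approximation M* = U_k Σ_k V_kᵀ, keeping the k largest singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)  # s is sorted in descending order
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Toy term-term matrix (dog, cat, computer, animal, mouse)
M = np.array([[0, 4, 0, 2, 1],
              [4, 0, 0, 3, 5],
              [0, 0, 0, 0, 3],
              [2, 3, 0, 0, 2],
              [1, 5, 3, 2, 0]], dtype=float)

M_approx = truncated_svd(M, k=2)
print(M_approx.shape)  # (5, 5): same shape, but rank at most 2
```

The approximation keeps the matrix dimensions while smoothing the original counts, which is how the latent relations between terms emerge.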

Page 15

Random Indexing

1. Generate and assign a context vector to each context element (e.g. document, passage, term, …)

2. Term vector is the sum of the context vectors in which the term occurs

– the context vector can be weighted by a score (e.g. term frequency, PMI, …)


Page 16

Context Vector

• sparse

• high dimensional

• ternary {-1, 0, +1}

• small number of randomly distributed non-zero elements

0 0 0 0 0 0 0 -1 0 0 0 0 1 0 0 -1 0 1 0 0 0 0 1 0 0 0 0 -1


Page 17

Random Indexing (formal)

B_{n,k} = A_{n,m} R_{m,k}, with k << m

B nearly preserves the distances between points (Johnson-Lindenstrauss lemma): dr = c × d

RI is a locality-sensitive hashing method which approximates the cosine distance between vectors.

Page 18

Random Indexing (example)

John eats a red apple

CV_john = (0, 0, 0, 0, 0, 0, 1, 0, -1, 0)
CV_eat  = (1, 0, 0, 0, -1, 0, 0, 0, 0, 0)
CV_red  = (0, 0, 0, 1, 0, 0, 0, -1, 0, 0)

TV_apple = CV_john + CV_eat + CV_red = (1, 0, 0, 1, -1, 0, 1, -1, -1, 0)
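The slide's example can be reproduced directly; `context_vector` is a hypothetical helper showing how the sparse ternary vectors of the previous slide would be generated, while the three context vectors are hard-coded to match the example:

```python
import numpy as np

def context_vector(dim, nonzero, rng):
    """Sparse, high-dimensional, ternary vector: a few random +1/-1 entries."""
    v = np.zeros(dim, dtype=int)
    idx = rng.choice(dim, size=nonzero, replace=False)
    v[idx[: nonzero // 2]] = 1
    v[idx[nonzero // 2:]] = -1
    return v

# Context vectors hard-coded as in the slide's example "John eats a red apple"
cv_john = np.array([0, 0, 0, 0, 0, 0, 1, 0, -1, 0])
cv_eat  = np.array([1, 0, 0, 0, -1, 0, 0, 0, 0, 0])
cv_red  = np.array([0, 0, 0, 1, 0, 0, 0, -1, 0, 0])

# Term vector of "apple" = sum of the context vectors of the words it occurs with
tv_apple = cv_john + cv_eat + cv_red
print(tv_apple.tolist())  # [1, 0, 0, 1, -1, 0, 1, -1, -1, 0]
```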

Page 19

Latent Semantic Analysis over Random Indexing

1. Reduce the dimension of the co-occurrence matrix using RI

2. Perform LSA over the RI matrix (LSARI)

– reduces LSA computation time: the RI matrix has fewer dimensions than the co-occurrence matrix

Page 20

QUESTIONCUBE FRAMEWORK

Page 21

QuestionCube framework…

Natural Language pipeline for Italian and English

– PoS-tagging, lemmatization, Named Entity Recognition, Phrase Chunking, Dependency Parsing, Word Sense Disambiguation (WSD)

IR model based on classical VSM or BM25

– several query expansion strategies

Passage retrieval

Page 22

…QuestionCube framework

[Architecture diagram] Question → Question analysis → Question expansion and categorization → Document retrieval (over the Index) → Candidate passages → Passage Filter (Term filter, N-gram filter, Density filter, Sequence filter, Score norm) → Ranked list of passages

Page 23

DSMS INTO QUESTIONCUBE

Page 24

Integration

[Architecture diagram] The same pipeline, with a new Distributional filter based on DSMs added to the Passage Filter chain: Question → Question analysis → Question expansion and categorization → Document retrieval (over the Index) → Candidate passages → Passage Filter (Term filter, N-gram filter, Density filter, Sequence filter, Distributional filter, Score norm) → Ranked list of passages

Page 25

Distributional filter…

Compute the similarity between the user's question and each candidate answer

Both the user's question and the candidate answers are represented by more than one term

– a method to compose word vectors is necessary

Page 26

…Distributional filter…

[Diagram] The Question and the candidate answers 1…n enter the Distributional filter, which applies a DSM with a compositional approach and outputs a re-ranking of the candidate answers.

Page 27

…Distributional filter (compositional approach)

Addition (+): pointwise sum of components

– represent complex structures by summing words which compose them

Given q = q1 q2 … qn and a = a1 a2 … am:

– question: q = q1 + q2 + … + qn

– candidate answer: a = a1 + a2 + … + am

– similarity: sim(q, a) = (q · a) / (‖q‖ ‖a‖)
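The addition operator and the cosine similarity between the composed vectors can be sketched as follows; the 4-dimensional word vectors are made up for illustration:

```python
import numpy as np

def compose(word_vectors):
    """Addition operator (+): pointwise sum of the word vectors."""
    return np.sum(word_vectors, axis=0)

def sim(q_vec, a_vec):
    """Cosine similarity between the composed question and answer vectors."""
    return float(q_vec @ a_vec / (np.linalg.norm(q_vec) * np.linalg.norm(a_vec)))

# Hypothetical word vectors for a 2-word question and a 2-word answer
q = compose([np.array([1.0, 0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0, 1.0])])
a = compose([np.array([1.0, 0.0, 1.0, 0.0]), np.array([0.0, 1.0, 1.0, 0.0])])
print(round(sim(q, a), 3))  # 0.333
```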

Page 28

EVALUATION AND RESULTS

Page 29

Evaluation…

Dataset: 2010 CLEF QA Competition

– 10,700 documents, from European Union legislation and European Parliament transcriptions

– 200 questions

– Italian and English

DSMs

– 1,000 vector dimensions (TTM/LSA/RI/LSARI)

– 50,000 most frequent words

– co-occurrence distance: 4

Page 30

...Evaluation

1. Effectiveness of DSMs in QuestionCube

2. Comparison between the several DSMs adopted

Metrics

– a@n: accuracy taking into account only the first n answers

– MRR: Mean Reciprocal Rank, based on the rank of the correct answer:

MRR = (1/N) · Σ_{i=1..N} 1/rank_i
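Both metrics can be computed from the rank of the correct answer per question; a minimal sketch, assuming rank 0 encodes "no correct answer returned":

```python
def mrr(ranks):
    """Mean Reciprocal Rank over N questions: mean of 1/rank_i (rank 0 contributes 0)."""
    return sum(1.0 / r for r in ranks if r > 0) / len(ranks)

def accuracy_at(ranks, n):
    """a@n: fraction of questions whose correct answer appears within the first n results."""
    return sum(1 for r in ranks if 0 < r <= n) / len(ranks)

# Hypothetical ranks of the correct answer for four questions
ranks = [1, 3, 0, 2]
print(accuracy_at(ranks, 1))  # 0.25
print(accuracy_at(ranks, 5))  # 0.75
print(mrr(ranks))             # (1 + 1/3 + 0 + 1/2) / 4
```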

Page 31

Evaluation (alone scenario)

Only the distributional filter is adopted: no other filters (Term filter, N-gram filter, Density filter, Sequence filter, Score norm) in the pipeline

Page 32

Evaluation (combined)

The distributional filter is combined with the other filters (Term filter, N-gram filter, Density filter, Sequence filter, Score norm)

– using the CombSUM function

– baseline: distributional filter removed
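CombSUM simply sums, per passage, the normalized scores assigned by each filter; a sketch with hypothetical filter scores:

```python
def comb_sum(score_lists):
    """CombSUM: the final score of each passage is the sum of its filter scores."""
    return [sum(scores) for scores in zip(*score_lists)]

# Hypothetical normalized scores for three passages from three filters
term_filter    = [0.5, 0.25, 0.5]
ngram_filter   = [0.25, 0.5, 0.0]
distributional = [0.75, 0.25, 0.25]

combined = comb_sum([term_filter, ngram_filter, distributional])
print(combined)  # [1.5, 1.0, 0.75]: passage 1 ranks first
```

Summing assumes the filter scores are normalized to a comparable range, which is the role of the Score norm step in the pipeline.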

Page 33

Results (English)…

Run                  a@1    a@5    a@10   a@30   MRR
alone     TTM        .060   .145   .215   .345   .107
          RI         .180   .370   .425   .535   .267
          LSA        .205   .415   .490   .600   .300
          LSARI      .190   .405   .490   .620   .295
combined  baseline*  .445   .635   .690   .780   .549
          TTM        .535   .715   .775   .810   .614 (1)
          RI         .550   .730   .785   .870   .637 (1,2)
          LSA        .560   .725   .790   .855   .637 (1)
          LSARI      .555   .730   .790   .870   .634 (1)

(1) significant wrt. the baseline; (2) significant wrt. TTM; * without the distributional filter

Page 34

…Results (Italian)

Run                  a@1    a@5    a@10   a@30   MRR
alone     TTM        .060   .140   .175   .280   .097
          RI         .175   .305   .385   .465   .241
          LSA        .155   .315   .390   .480   .229
          LSARI      .180   .335   .400   .500   .254
combined  baseline*  .365   .530   .630   .715   .441
          TTM        .405   .565   .645   .740   .539 (1)
          RI         .465   .645   .720   .785   .555 (1)
          LSA        .470   .645   .690   .785   .551 (1)
          LSARI      .480   .635   .690   .785   .557 (1,2)

(1) significant wrt. the baseline; (2) significant wrt. TTM; * without the distributional filter

Page 35

Final remarks…

Alone: all the proposed DSMs perform better than the TTM

– in particular LSA and LSARI

Combined: all the combinations outperform the baseline

– English: +16% (RI/LSA); Italian: +26% (LSARI)

Slight difference in performance between LSA and LSARI

Page 36

…Final remarks

The distributional filter shows a higher improvement in the alone scenario than in the combined scenario

– find more effective ways to combine filters (learning-to-rank approach)

Exploit more complex semantic compositional strategies

Page 37

Thank you for your attention!

Questions?<[email protected]>