
Answer Extraction as Sequence Tagging with Tree Edit Distance

Xuchen Yao, Benjamin Van Durme (Hopkins), Chris Callison-Burch (UPenn) and Peter Clark (Vulcan)

6/11/2013 NAACL 2013, ATLANTA 2

Treating QA as Sequence Tagging

• 3 types of hidden states:
– B-ANS (beginning of answer)
– I-ANS (inside of answer)
– O (outside of answer, i.e., not an answer)

Question: What sport does Jennifer Capriati play?

Tennis  player  Jennifer  Capriati  is  23
B-ANS   O       O         O         O   O


A Sequence Tagging Task: linear-chain Conditional Random Field (CRF)

• a conditional model p(y|x)
• θ: feature weights to learn
• f(yt, yt-1, xt): feature function
– first order (only looks back at the previous state)
– inspects the whole sequence of x

$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\Big\{ \sum_{t=1}^{T} \sum_{k=1}^{K} \theta_k f_k(y_t, y_{t-1}, x_t) \Big\}$$

[Figure: linear-chain graphical model. The hidden states Y (Yt-1, Yt, Yt+1) form a chain above the observations X (Xt-1, Xt, Xt+1).]
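As a rough illustration of the conditional model p(y|x), the sketch below scores a tag sequence with weighted feature functions and normalizes by brute-force enumeration of all tag sequences. The two feature functions, their weights and the "START" pseudo-state are hypothetical, chosen only for this toy example; a real CRF toolkit would use dynamic programming rather than enumeration.

```python
import itertools
import math

LABELS = ["B-ANS", "I-ANS", "O"]

def score(y, x, theta, features):
    """Sum over positions t and features k of theta_k * f_k(y_t, y_{t-1}, x_t)."""
    total = 0.0
    for t in range(len(x)):
        y_prev = y[t - 1] if t > 0 else "START"  # assumed start pseudo-state
        for k, f in enumerate(features):
            total += theta[k] * f(y[t], y_prev, x[t])
    return total

def crf_prob(y, x, theta, features):
    """p(y|x) = exp(score(y, x)) / Z(x); Z(x) is computed by enumeration."""
    z = sum(math.exp(score(list(c), x, theta, features))
            for c in itertools.product(LABELS, repeat=len(x)))
    return math.exp(score(y, x, theta, features)) / z

# Hypothetical feature functions and weights, for illustration only.
features = [
    lambda yt, yp, xt: 1.0 if yt == "B-ANS" and xt == "Tennis" else 0.0,
    lambda yt, yp, xt: 1.0 if yt == "I-ANS" and yp == "O" else 0.0,
]
theta = [2.0, -3.0]
x = ["Tennis", "player", "is"]

print(crf_prob(["B-ANS", "O", "O"], x, theta, features))
```

Enumeration costs 3^T sequences, so this is only usable on very short inputs; it is meant to make the formula concrete, not to be efficient.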


It's the Features That Matter

• We aim at:
– question/answer template-free
– easy to design
– fast to extract
• We end up with:
– Chunking-like features
– Question type features
– Edit script features
– Alignment distance features


Chunking-like Features

• Each word t comes with pos[t], dep[t], ner[t]
• Design unigram, bigram and trigram features (window size of 3) over positions t-2 to t+2:
– pos[t], pos[t-1], pos[t-2], pos[t+1], pos[t+2]
– pos[t-2]pos[t-1], pos[t-1]pos[t], pos[t]pos[t+1], pos[t+1]pos[t+2]
– pos[t-2]pos[t-1]pos[t], pos[t-1]pos[t]pos[t+1], pos[t]pos[t+1]pos[t+2]
• likewise for dep[t] and ner[t]

word      tag    pos  dep   ner
Tennis    B-ANS  nnp  nmod  GAME-B
player    O      nn   nmod  PER_DESC-B
Jennifer  O      nnp  nmod  PERSON-B
Capriati  O      nnp  subj  PERSON-I
is        O      vbz  root  -
23        O      cd   prd   CARDINAL-B
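A minimal sketch of how such chunking-like features could be generated for one token, assuming each annotation layer is a plain list of tags. The function names, the feature string format and the "<pad>" boundary symbol are illustrative assumptions, not the format used by the authors' system.

```python
def ngram_features(tags, t, name, max_n=3, radius=2):
    """Unigram to trigram features over tags[t-radius .. t+radius]."""
    feats = []
    for n in range(1, max_n + 1):
        # every n-gram that fits inside the window around position t
        for start in range(t - radius, t + radius - n + 2):
            parts = []
            for i in range(start, start + n):
                value = tags[i] if 0 <= i < len(tags) else "<pad>"
                parts.append(f"{name}[{i - t:+d}]={value}")
            feats.append("|".join(parts))
    return feats

def chunk_features(pos, dep, ner, t):
    """Concatenate the n-gram features of all three annotation layers."""
    return (ngram_features(pos, t, "pos")
            + ngram_features(dep, t, "dep")
            + ngram_features(ner, t, "ner"))

# Annotations for "Tennis player Jennifer Capriati is 23".
pos = ["nnp", "nn", "nnp", "nnp", "vbz", "cd"]
dep = ["nmod", "nmod", "nmod", "subj", "root", "prd"]
ner = ["GAME-B", "PER_DESC-B", "PERSON-B", "PERSON-I", "-", "CARDINAL-B"]

for feature in chunk_features(pos, dep, ner, 0)[:5]:
    print(feature)
```

Each layer yields 5 unigrams, 4 bigrams and 3 trigrams per token, so a token receives 36 features from the three layers.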


Chunking-like Features + Question Type Features

• Combining chunking-like features with question types
– who, whom, when, where, how many, how much, how long

feature                                               example     learned weight
qword=when | pos[t]=in | pos[t+1]=cd | pos[t+2]=nns   in 90 days  high
qword=who | pos[t-1]=nnp | pos[t]=nnp                 capriati    high
qword=when | pos[t-1]=nnp | pos[t]=nnp                capriati    low
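The conjunction of a question word with token-level features might be sketched as follows. The question-word list is the one given on the slide; the "other" fallback, the function names and the feature string format are assumptions made for this illustration.

```python
# Longer phrases first so that "how many" wins over a shorter match.
QWORDS = ["how many", "how much", "how long", "whom", "who", "when", "where"]

def question_word(question):
    """Return the first listed question word found in the question."""
    text = " " + question.lower().strip(" ?") + " "
    for q in QWORDS:
        if f" {q} " in text:
            return q
    return "other"  # assumed fallback for unlisted question types

def conjoin_with_qword(qword, feats):
    """Prefix every token-level feature with the question word."""
    return [f"qword={qword}|{f}" for f in feats]

qword = question_word("Who won the match?")
print(conjoin_with_qword(qword, ["pos[t-1]=nnp|pos[t]=nnp"]))
```

Conjoining lets the model learn, for instance, that a proper-noun bigram is a good answer for "who" but a poor one for "when".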

Chunking-like and Question-type Features capture a lot of traditional intuitions of question/answer templates

But wait a minute...

How do we know my Q and S are talking about the same thing?


We don't know yet!


Motivation for Tree Edit Distance

• What sport does Jennifer Capriati play?
– "Play" is a song by recording artist Jennifer Lopez.
– Tennis player Capriati is 23.
– Jennifer Lopez played softball in high school.
• Answer-bearing sentence validation as early as during answer extraction
– TED provides a means to extract knowledge of shared structure between question and candidate sentences.


Edit Script Features (New)

Are Q and S really talking about the same thing?

S: Tennis  player  Jennifer  Capriati  is  23
   B-ANS   O       O         O         O   O

Q: What sport does Jennifer Capriati play


Tree Edit Model

direction: answer tree → question tree, since the answer tree usually contains much more info

[Figure: dependency tree of the answer sentence "Tennis player Jennifer Capriati is 23." and of the question "What sport does Jennifer Capriati play?", related by TreeEditDist. Nodes are labeled lemma/POS (tennis/nn, player/nn, jennifer/nn, capriati/nnp, be/vbz, 23/cd in the answer tree; what/wp, sport/nn, do/vbz, jennifer/nnp, capriati/nnp, play/vbz in the question tree) and edges carry dependency relations (nmod, subj, prd, vmod).]

TED Alignment

[Figure: the answer tree and the question tree with aligned nodes connected. Operations shown: align(capriati/nnp/subj), renamePos(nn, nnp).]

TED Alignment with WordNet

[Figure: the answer tree and the question tree with additional node alignments through WordNet relations, labeled Hypernym and Derived Form.]

TED Deletion (w/o WordNet)

delLeaf/delSubTree/del

[Figure: nodes deleted from the answer tree of "Tennis player Jennifer Capriati is 23." on the way to the question tree of "What sport does Jennifer Capriati play?"]

TED Insertion

insLeaf/insSubTree/ins

[Figure: starting from the remaining subtree "Jennifer Capriati" (jennifer nmod, capriati/nnp subj), nodes are inserted to build the question tree of "What sport does Jennifer Capriati play?"]

insSubTree: ins(play/vbz/vmod), ins(do/vbz/root)

Cost Design (only allowing renaming when lemmas match)

• Simple Cost Function:
– If lemmas of two nodes are different:
  • Deletion and insertion are encouraged
  • A single Del/Ins cost is 3 since lemma/pos/dep are deleted/inserted
– If lemmas of two nodes are the same:
  • Renaming is encouraged
  • Rename cost is 1 if renaming only POS or Dep, or 2 if both
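The cost design above can be written down in a few lines. This is a sketch under the assumption that a node is a (lemma, pos, dep) triple and that a forbidden rename is modeled as infinite cost; the `Node` type and the function names are hypothetical.

```python
from collections import namedtuple

Node = namedtuple("Node", "lemma pos dep")
INF = float("inf")

def delete_cost(node):
    # lemma, POS and dependency label are removed together
    return 3

def insert_cost(node):
    # lemma, POS and dependency label are added together
    return 3

def rename_cost(a, b):
    """Renaming is only allowed when the lemmas match."""
    if a.lemma != b.lemma:
        return INF  # forces a delete plus an insert instead
    # 1 if only POS or only the dependency label differs, 2 if both do
    return int(a.pos != b.pos) + int(a.dep != b.dep)

print(rename_cost(Node("jennifer", "nn", "nmod"), Node("jennifer", "nnp", "nmod")))
```

With these costs an identical node pair aligns for free, which is what makes alignment the preferred operation whenever the two trees share a lemma.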


Search Method

• Heilman & Smith 2010:
– greedy local search with tree kernels
– slow
• This work:
– polynomial time dynamic programming (Zhang & Shasha, 1989)
– walks in the post-order traversal of two trees and compares them
– much faster: 10,000 tree pairs per second
• for fast feature extraction on the go
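For readers unfamiliar with the dynamic program, here is a compact sketch of the Zhang & Shasha (1989) ordered tree edit distance. It is not the authors' implementation: the `Tree` class, the function names and the unit-cost defaults are assumptions, and it returns only the distance, not the edit script.

```python
class Tree:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def _annotate(root):
    """Post-order labels, leftmost-leaf index of each node, and keyroots."""
    labels, lmd = [], []
    def visit(node):
        first = None
        for child in node.children:
            idx = visit(child)
            if first is None:
                first = idx
        labels.append(node.label)
        me = len(labels) - 1
        lmd.append(lmd[first] if first is not None else me)
        return me
    visit(root)
    last = {}
    for i, leaf in enumerate(lmd):
        last[leaf] = i  # highest-numbered node sharing this leftmost leaf
    return labels, lmd, sorted(last.values())

def tree_edit_distance(a, b, delete=lambda x: 1, insert=lambda x: 1,
                       rename=lambda x, y: 0 if x == y else 1):
    la, l1, keys1 = _annotate(a)
    lb, l2, keys2 = _annotate(b)
    td = [[0] * len(lb) for _ in la]  # distances between subtrees
    for i in keys1:
        for j in keys2:
            ioff, joff = l1[i] - 1, l2[j] - 1
            m, n = i - ioff + 1, j - joff + 1
            fd = [[0] * n for _ in range(m)]  # forest distances
            for x in range(1, m):
                fd[x][0] = fd[x - 1][0] + delete(la[x + ioff])
            for y in range(1, n):
                fd[0][y] = fd[0][y - 1] + insert(lb[y + joff])
            for x in range(1, m):
                for y in range(1, n):
                    u, v = x + ioff, y + joff
                    dele = fd[x - 1][y] + delete(la[u])
                    inse = fd[x][y - 1] + insert(lb[v])
                    if l1[u] == l1[i] and l2[v] == l2[j]:
                        # both forests are whole trees: allow a rename
                        fd[x][y] = min(dele, inse,
                                       fd[x - 1][y - 1] + rename(la[u], lb[v]))
                        td[u][v] = fd[x][y]
                    else:
                        # reuse the already computed subtree distance
                        p, q = l1[u] - 1 - ioff, l2[v] - 1 - joff
                        fd[x][y] = min(dele, inse, fd[p][q] + td[u][v])
    return td[-1][-1]

t1 = Tree("a", [Tree("b"), Tree("c")])
t2 = Tree("a", [Tree("b"), Tree("d")])
print(tree_edit_distance(t1, t2))
```

The three cost callbacks are where a lemma/POS/dependency cost design would plug in; with node labels that are richer than strings, only the callbacks need to change.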


A Complete Tree Edit Script

insLeaf(sport), ins(what), delLeaf(tennis), delLeaf(player), renamePos(Jennifer), insLeaf(play), del(23), del(is), ins(does)

[Figure: the answer tree of "Tennis player Jennifer Capriati is 23." is transformed by TreeEditDist into the question tree of "What sport does Jennifer Capriati play?"]

Applying the edit script step by step (operation, then the resulting word sequence):

insLeaf(sport): sport Tennis player Jennifer Capriati is 23.
ins(what): What sport Tennis player Jennifer Capriati is 23.
delLeaf(tennis): What sport player Jennifer Capriati is 23.
delLeaf(player): What sport Jennifer Capriati is 23.
renamePos(Jennifer/nn, Jennifer/nnp): What sport Jennifer Capriati is 23.
ins(play): What sport Jennifer Capriati play is 23.
del(23): What sport Jennifer Capriati play is
del(be): What sport Jennifer Capriati play?
ins(do): What sport does Jennifer Capriati play?


Edit Script Features

answers are more likely to be in deleted info

feature           example   learned weight
del_pos=cd        90        medium
del_ner=PERSON_B  Jennifer  medium
align_pos=nnp     Capriati  low

• Could've combined these features with question-type features

• But adding them alone already boosted the performance quite a bit
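One way to turn an edit script into per-token features like del_pos or align_pos is sketched below. The tuple layout of an edit (token index, operation, POS, NER) and the rule of reusing the operation name as a prefix for non-deletions are assumptions for illustration; the feature names for renames are not given in the slides.

```python
def edit_script_features(edits):
    """Map each answer-sentence token index to features from its edit operation.

    edits: iterable of (token_index, operation, pos, ner) tuples.
    """
    feats = {}
    for index, op, pos, ner in edits:
        # del, delLeaf and delSubTree all count as a deletion
        prefix = "del" if op.startswith("del") else op
        feats[index] = [f"{prefix}_pos={pos}", f"{prefix}_ner={ner}"]
    return feats

# Hypothetical edits for "Tennis player Jennifer Capriati is 23".
edits = [
    (0, "delLeaf", "nn", "GAME-B"),
    (3, "align", "nnp", "PERSON-I"),
    (5, "del", "cd", "CARDINAL-B"),
]
print(edit_script_features(edits))
```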

System Architecture

[Figure: pipeline. TED, POS, DEP and NER feed the feature extractor, whose features go to CRFsuite, which outputs the sequence tags.]

Feature extractor:
1 chunking-like
2 question type
3 edit script based
4 align distance

CRFsuite output seq. tags:

B-ANS   O       O         O         O   O
Tennis  player  Jennifer  Capriati  is  23

CRF

Reference  Model: Marginal Prob  Lexicon
B-ANS      B-ANS: 0.821060       Tennis
O          O: 0.981168           player
O          O: 0.928007           Jennifer
O          O: 0.935270           Capriati
O          O: 0.998608           is
O          O: 0.897305           23
O          O: 0.998293           .

Question: What sport does Jennifer Capriati play?

Dataset

set        source    # questions  # pairs  # positive  length  clean?
TRAIN-ALL  TREC8-12  1229         53417    6403        full    No
TRAIN      TREC8-12  94           4718     348         ≤40     Yes
DEV        TREC13    82           1148     222         ≤40     Yes
TEST       TREC13    89           1517     284         ≤40     Yes

• Based on the dataset from Wang et al. (2007)
– training set contains about half of TREC8-12 data
– test set contains roughly half of TREC13 (2004) data
• trained on only positive examples but tested on all examples
• majority vote as the final answer


Performance

Naive Baseline

force alignment

[Figure: the answer tree and the question tree with forced node alignments. Operations shown: align(capriati/nnp/subj), renamePos(nn, nnp), align(tennis, what).]


CRF Forced

• If the CRF gives no answer, mark the tokens whose marginal probability lies 50 times the median absolute deviation (MAD) away from the median probability.
• MAD = median(|X − median(X)|)

Reference  Model: Marginal Prob     Lexicon
O          O: 0.921060              Conant
O          O: 0.991168              had
O          O: 0.997307              been
O          O: 0.998570              a
O          O: 0.998608              photographer
O          O: 0.999005              for
O          O: 0.877619              Adm
O          O: 0.988293              .
O          O: 0.874101              Chester
O          O: 0.924568              Nimitz
O          O: 0.970045              during
B-ANS      O: 0.464799 -> B-ANS     World
I-ANS      O: 0.493715 -> I-ANS     War
I-ANS      O: 0.449017 -> I-ANS     II
O          O: 0.915448              .

Question: During what war did Nimitz serve?
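The forcing rule can be sketched with the standard library alone. The sketch assumes the input is the per-token marginal probability of the O tag, that only unusually low probabilities are marked, and that the first token of each marked run becomes B-ANS; the multiplier is left as a parameter and the probabilities in the example are made up.

```python
import statistics

def force_answer(probs_o, multiplier=50.0):
    """Re-tag tokens whose O-probability is an outlier below the median."""
    med = statistics.median(probs_o)
    mad = statistics.median(abs(p - med) for p in probs_o)
    tags, prev_marked = [], False
    for p in probs_o:
        marked = (med - p) > multiplier * mad
        if marked:
            # continue an answer span or open a new one
            tags.append("I-ANS" if prev_marked else "B-ANS")
        else:
            tags.append("O")
        prev_marked = marked
    return tags

# Hypothetical marginal probabilities of O for a seven-token sentence.
probs = [0.9990, 0.9991, 0.9992, 0.4648, 0.4937, 0.9993, 0.9994]
print(force_answer(probs))
```

Because MAD is a robust spread estimate, a handful of low-confidence tokens barely moves it, so they stand out clearly when most tokens are confidently tagged O.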

How do the new TED features help?


Ablation Test

if the ablation test hasn't convinced you


The QA Sentence Ranking Task

• Introduced by Wang et al. 2007
– Mengqiu Wang, Noah A. Smith and Teruko Mitamura, What is the Jeopardy Model? A Quasi-Synchronous Grammar for Question Answering, In Proceedings of EMNLP '07, 2007 (Nominated for Best Paper Award)
• Rank whether a candidate sentence contains an answer to the question
– What sport does Jennifer Capriati play?
– YES: Tennis player Jennifer Capriati is 23
– NO: Capriati, 23, was beaten in the second round on Wed.

• Essentially an IR ranking task


Using TED for Ranking QA Pairs

• Extract features from the edit script that transforms a candidate answer tree into the question tree

• Treat it as a binary classification task and train a logistic regression model on these features.


Performance (quantitatively better and state-of-the-art)

System                           Approach                   Mean Average Prec.  Mean Reciprocal Rank
Wang et al. 07 (Jeopardy paper)  Quasi-Synchronous Grammar  0.6029              0.6852
Heilman & Smith 10               TED + Local Search         0.6091              0.6917
Wang & Manning 10                CRF/TED                    0.5951              0.6951
This Work                        TED + Dynamic Programming  0.6319              0.7270
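For reference, the two ranking metrics in the table can be computed as below. This is a generic sketch of Mean Average Precision and Mean Reciprocal Rank over ranked relevance labels (1 for an answer-bearing sentence, 0 otherwise), not the evaluation script used for the reported numbers; the sample rankings are invented.

```python
def average_precision(ranked_labels):
    """Mean of precision@k taken at each relevant position k."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def reciprocal_rank(ranked_labels):
    """1 / rank of the first relevant item, or 0 if there is none."""
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0

def mean_over_questions(metric, rankings):
    return sum(metric(r) for r in rankings) / len(rankings)

# Invented rankings for two questions, best-scored sentence first.
rankings = [[1, 0, 1], [0, 1, 0]]
print(mean_over_questions(average_precision, rankings))
print(mean_over_questions(reciprocal_rank, rankings))
```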


Conclusion on Answer Extraction

• First effort on extracting answers
– TED + Conditional Random Field
– simple features, template-free
– fast (200 QA pairs per second if pre-parsed)
• State-of-the-art on ranking QA pairs
– TED + logistic regression
– fast (less than 1 minute to train/test)


Release and Preview

• jacana-qa
– http://code.google.com/p/jacana/
• ACL 2013
– IR: Automatic Coupling of Answer Extraction and Information Retrieval
– discourse align: PARMA: A Predicate Argument Aligner
– word align: A Lightweight and High Performance Monolingual Word Aligner


thank you
