SALSA-WS 09/05
Approximating Textual Entailment with LFG and FrameNet Frames
Aljoscha Burchardt, Anette Frank
Computational Linguistics Department, Saarland University, Saarbrücken
Second Pascal Challenge Workshop, Venice, April 2006




Page 2: Outline of this Talk

• Frame Semantics
• A baseline system for approximating Textual Entailment
  – LFG syntactic analyses with
  – Frame semantics
  – Statistical decision: entailed?
• Walk-through example from RTE 2006
• RTE 2006 results / brief conclusions

Page 3: Frame Semantics (Fillmore 1976, Fillmore et al. 2003)

• Lexical semantic classification of predicates and their argument structure
• A frame represents a prototypical situation (e.g. Commercial_transaction, Theft, Awareness)
• A set of roles identifies the participants or propositions involved
• Frames are organized in a hierarchy
• Berkeley FrameNet Project db: 600 frames, 9,000 lexical units, 135,000 annotated sentences

Page 4: Linguistic Normalizations (Frame: Commerce_buy)

Roles: Seller, Buyer, Goods, Money

• BMW bought Rover from British Aerospace.
• Rover was bought by BMW, which financed [...] the new Range Rover.
• BMW, which acquired Rover in 1994, is now dismantling the company.
• BMW's purchase of Rover for $1.2 billion was a good move.

Normalizations:
• Voice: active / passive
• POS: verb / noun
• Lexicalization
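The normalization idea on this slide can be sketched in a few lines of Python. This is only an illustration: the lexicon `LEXICON`, the helper names, and the hand-assigned role fillers are assumptions made for the sketch, not output of the authors' system or data from FrameNet.

```python
# Toy lexicon (assumption): three surface predicates that evoke the same frame.
LEXICON = {
    "bought": "Commerce_buy",
    "acquired": "Commerce_buy",
    "purchase": "Commerce_buy",
}

def annotate(predicate, roles):
    """Return a frame instance for a predicate using the toy lexicon."""
    return {"frame": LEXICON[predicate], "roles": dict(roles)}

# Role fillers assigned by hand for the four example sentences.
examples = [
    annotate("bought", {"Buyer": "BMW", "Goods": "Rover",
                        "Seller": "British Aerospace"}),    # active verb
    annotate("bought", {"Buyer": "BMW", "Goods": "Rover"}),   # passive verb
    annotate("acquired", {"Buyer": "BMW", "Goods": "Rover"}), # other lexicalization
    annotate("purchase", {"Buyer": "BMW", "Goods": "Rover",
                          "Money": "$1.2 billion"}),        # noun
]

def shared_core(instances):
    """Return the frames used and the role/filler pairs common to all instances."""
    frames = {inst["frame"] for inst in instances}
    common = set.intersection(*(set(inst["roles"].items()) for inst in instances))
    return frames, dict(common)

frames, core = shared_core(examples)
print(frames, core)
```

Despite differences in voice, part of speech, and lexicalization, all four sentences end up with the same frame and the same Buyer and Goods fillers, which is the kind of abstraction the slide describes.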

Page 5: Frame Semantics for RTE

Focusing on lexical semantic classes and role-based argument structure
– Built-in normalizations help to determine semantic similarity at a high level of abstraction
– Disregarding aspects of “deep” semantics: negation, modality, quantification, ...
– Open for deeper modeling on demand (e.g. our treatment of modality)

Page 6: A Baseline System for Approximating Textual Entailment

• Fine-grained LFG-based syntactic analysis
  – English LFG grammar (Riezler et al. 2002)
  – Wide coverage with high-quality probabilistic disambiguation
• Frame Semantics
  – Shallow lexical-semantic classification of predicate-argument structure
  – Extensions: WordNet senses, SUMO concepts
• Computing structural and semantic overlap of t (text) and h (hypothesis)
  – Hypothesis: large overlap ≈ entailment
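The working hypothesis "large overlap ≈ entailment" can be illustrated with a minimal Python sketch. The function names, the 0.6 threshold, and the hand-picked item sets (taken from the talk's walk-through pair 716) are assumptions for illustration; the actual system compares f-structure graphs with frames and concepts rather than flat sets.

```python
def overlap_ratio(text_items, hyp_items):
    """Fraction of hypothesis items that also occur in the text."""
    hyp = set(hyp_items)
    if not hyp:
        return 0.0
    return len(hyp & set(text_items)) / len(hyp)

def approx_entails(text_items, hyp_items, threshold=0.6):
    """Approximate entailment as high hypothesis coverage (threshold is made up)."""
    return overlap_ratio(text_items, hyp_items) >= threshold

# Hand-picked content items for pair 716 (assumption, not system output).
t_items = {"direct", "Aki Kaurismäki", "1983", "first", "full-time", "feature"}
h_items = {"direct", "Aki Kaurismäki", "film"}

ratio = overlap_ratio(t_items, h_items)
decision = approx_entails(t_items, h_items)
print(ratio, decision)
```

The ratio is taken relative to the hypothesis, since the text may contain extra material that the hypothesis does not need to cover.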

Page 7: A Baseline System for Approximating Textual Entailment

1. Linguistic analyses
   – text: LFG f-structure graph w/ frames & concepts
   – hypothesis: LFG f-structure graph w/ frames & concepts
2. Computing semantic overlap
   – text-hypothesis match graph: different types of matches (aspects of similarity)
3. Statistical decision: entailment?
   – Feature extraction: lexical, syntactic, semantic structure & overlap measures
   – Model training & classification

Page 8: Linguistic Components

• XLE parsing: LFG f-structure
• Fred / Detour / Rosy: frames & roles
• WordNet-based WSD: WordNet & SUMO
• F-structure w/ semantics projection
• Rule-based: extend & refine sem. proj.
  – NEs, Locations
  – Co-reference
  – Modality, etc.
• Using XLE term rewriting system (Crouch 2005)

Page 9: Example from RTE 2006, Pair 716

Text: In 1983, Aki Kaurismäki directed his first full-time feature.
Hypothesis: Aki Kaurismäki directed a film.

Page 10: LFG F-Structures

Page 11: Automatic Frame Annotation for Text (SALTO Viewer)

• Fred & Rosy: frames & roles (statistical)
• Detour System: frames (via WordNet)
• Collins Parse

Page 12: Automatic Frame Annotation for Hypothesis

716_h: Aki Kaurismäki directed a film.

Page 13: LFG + Frames for Hypothesis (FEFViewer)

Aki Kaurismäki directed a film.

Rule-based (LFG-NER)

Page 14: Hypothesis-Text Match Graphs
Computing structural and semantic overlap

Match graph bundles overlapping partial graphs marked by match types

• Aspects of similarity
  – Syntax-based (i.e. lexical and structural): Identical predicates (attributes) trigger node (edge) matches.
  – Semantics-based: Identical frames/concepts (roles) trigger node (edge) matches.
• Degrees of similarity
  – Strict matching
  – Weak matching conditions for non-identical predicates:
    • “Structurally related”, e.g. via coreference (relative clauses, appositives, pronominals)
    • “Semantically related” via WordNet, Frame-Relations
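A rough Python sketch of strict versus weak node matching follows. The `RELATED` table is a hypothetical stand-in for WordNet and frame relations, and treating "film" and "feature" as related is an assumption made for the example; edge matches and structural relatedness are left out to keep the sketch short.

```python
# Hypothetical relatedness table standing in for WordNet / frame relations.
RELATED = {frozenset({"film", "feature"})}

def match_type(h_pred, t_pred):
    """Classify a hypothesis/text predicate pair as strict, weak, or no match."""
    if h_pred == t_pred:
        return "strict"
    if frozenset({h_pred, t_pred}) in RELATED:
        return "weak:semantically_related"
    return None

def node_matches(h_nodes, t_nodes):
    """Return (h_node, t_node, match_type) triples for all matching node pairs."""
    matches = []
    for h in h_nodes:
        for t in t_nodes:
            kind = match_type(h, t)
            if kind is not None:
                matches.append((h, t, kind))
    return matches

h_nodes = ["direct", "Kaurismäki", "film"]
t_nodes = ["direct", "Kaurismäki", "feature", "first"]
matches = node_matches(h_nodes, t_nodes)
print(matches)
```

Keeping the match type on every triple lets later feature extraction count strict and weak matches separately.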

Page 15: Example match (pair 716)

h: Aki Kaurismäki directed a film.
t: In 1983, Aki Kaurismäki directed his first full-time feature.

Match types shown: WordNet related, grammatically related

Page 16: Statistical Modeling

• Feature extraction on the basis of
  – Syntactic, semantic matches (of different types)
  – Matching clusters’ sizes
  – Ratio (matched vs. hypothesis)
  – (Non-)matching modality
  – RTE task, fragmentary (parse), …
• Training/classification with WEKA tool
  – Feature selection
    1. Predicate matches
    2. Frame overlap
    3. Matching cluster size
  – Model 1: Conjunctive rule (Feat. 1, 2)
  – Model 2: LogitBoost (Feat. 1, 2, 3)
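To make Model 1 concrete, here is a minimal Python sketch of a conjunctive rule over features 1 and 2. The `PairFeatures` class, the value ranges, and both thresholds are hypothetical; in the system described here the rule is learned with WEKA from training data, and the learned conditions are not given on the slide.

```python
from dataclasses import dataclass

@dataclass
class PairFeatures:
    predicate_matches: float  # feature 1, assumed here to be a ratio in [0, 1]
    frame_overlap: float      # feature 2, assumed here to be a ratio in [0, 1]
    cluster_size: int         # feature 3, unused by the conjunctive rule

def conjunctive_rule(f, min_pred=0.5, min_frame=0.5):
    """Predict entailment only if both conditions hold (thresholds are made up)."""
    return f.predicate_matches >= min_pred and f.frame_overlap >= min_frame

high_overlap = PairFeatures(predicate_matches=0.67, frame_overlap=1.0, cluster_size=3)
low_frames = PairFeatures(predicate_matches=0.67, frame_overlap=0.2, cluster_size=3)
print(conjunctive_rule(high_overlap), conjunctive_rule(low_frames))
```

A rule of this shape only tests for sufficient similarity, so it has no direct way to reject a pair on evidence of dissimilarity.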

Page 17: RTE 2006 Results

Accuracy (%) by task:

          all tasks   IE     IR     QA     SUM
Model 1   59.0        49.5   59.5   54.5   72.5
Model 2   57.8        48.5   58.5   57.0   67.0

• SUM (and IR) are natural tasks for Frame Semantics; IE and QA need deeper modeling (aboutness vs. factivity)
• Error analysis
  – True positives: high semantic overlap
  – True negatives: 27% involve modality mismatches
  – False examples: poor modeling of dissimilarity
    • Many high-frequency features measuring similarity
    • Few low-frequency features measuring dissimilarity

Page 18: Brief Conclusions

• Good approximation of semantic similarity
  – Deep LFG syntactic analyses integrated with
  – Shallow lexical Frame Semantics (plus other lexical resources)
  – Match graph measuring overlap
• Need better model for semantic dissimilarity
  – Too few rejections (false positives >> false negatives)
• Towards deeper modeling
  – Treatment of modal contexts
  – Integration of lexical inferences
  – Open for collaborations

Page 19: LFG + Frames for Hypothesis (FEF)

stmt_type(f(0),declarative). tense(f(0),past). pred(f(0),direct). mood(f(0),indicative).
dsubj(f(0),f(7)). dobj(f(0),f(2)).
pred(f(2),film). num(f(2),sg). det_type(f(2),indef).
proper(f(7),name). pred(f(7),'Kaurismaki'). num(f(7),sg). mod(f(7),f(10)).
proper(f(10),name). pred(f(10),'Aki'). num(f(10),sg).
sslink(f(0),s(41)). sslink(f(2),s(42)). sslink(f(7),s(45)). sslink(f(10),s(59)).

frame(s(41),'Behind_the_scenes'). artist(s(41),s(45)). production(s(41),s(42)).
frame(s(42),'Behind_the_scenes').
frame(s(45),'People'). person(s(45),s(59)). person(s(45),s(45)).

ont(s(41),s(48)). ont(s(42),s(49)). ont(s(45),s(56)).
wn_syn(s(48),'direct#v#11'). sumo_sub(s(48),'Steering'). milo_sub(s(48),'Steering').
wn_syn(s(49),'film#n#1'). sumo_sub(s(49),'MotionPicture'). milo_sub(s(49),'MotionPicture').
sumo_syn(s(56),'Human'). sumo_syn(s(58),'Human').
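The term-style facts on this slide can be read into a small lookup structure to see how syntactic nodes f(n) connect to semantic nodes s(n) via sslink. The following Python sketch is one possible way to do that; the regular expression and helper names are assumptions, and only a subset of the facts is used. It is not part of the XLE toolchain.

```python
import re
from collections import defaultdict

# Matches binary facts of the form rel(f(N),value). or rel(s(N),value).
FACT = re.compile(r"(\w+)\(([fs])\((\d+)\),\s*(.+?)\)\.")

FACTS = """
pred(f(0),direct).dsubj(f(0),f(7)).dobj(f(0),f(2)).
pred(f(2),film).pred(f(7),'Kaurismaki').
sslink(f(0),s(41)).sslink(f(2),s(42)).sslink(f(7),s(45)).
frame(s(41),'Behind_the_scenes').artist(s(41),s(45)).production(s(41),s(42)).
"""

def parse_facts(text):
    """Return (relation, node, value) triples, with quotes stripped from values."""
    return [(rel, f"{kind}({idx})", value.strip("'"))
            for rel, kind, idx, value in FACT.findall(text)]

def index(triples):
    """Group facts by node; keeps only one value per relation on each node."""
    nodes = defaultdict(dict)
    for rel, node, value in triples:
        nodes[node][rel] = value
    return nodes

triples = parse_facts(FACTS)
nodes = index(triples)
sem_of = {n: attrs["sslink"] for n, attrs in nodes.items() if "sslink" in attrs}
syn_of = {s: f for f, s in sem_of.items()}

event = sem_of["f(0)"]  # semantic node projected from the main predicate
frame = nodes[event]["frame"]
artist = nodes[syn_of[nodes[event]["artist"]]]["pred"]
production = nodes[syn_of[nodes[event]["production"]]]["pred"]
print(frame, artist, production)
```

Because `index` stores a single value per relation, nodes that carry the same relation twice (such as the two person facts on s(45)) would need a list-valued variant.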