GermaNet-WS II03-2005
A WordNet “Detour” to FrameNet
Aljoscha Burchardt, Katrin Erk, Anette Frank*
Saarland University, DFKI* Saarbrücken
{albu,erk,frank}@coli.uni-sb.de
Motivation
• Demand for semantic information access (IE, QA, …)
• Available resources
– Large-scale (statistical) parsing systems
– WordNet(s)
  • Modeling approximate lexical semantics
  • High coverage
– FrameNet, PropBank
  • Modeling predicate-argument structure
  • Limited coverage
• Aim: Combining methods to arrive at a high-coverage, variable-depth (lexical) semantic analysis
Outline
• FrameNet
• Using Frames for NLP applications
– Current architecture
– Coverage problems
• A WordNet detour to FrameNet
– First Evaluation
• Conclusion and Outlook
FrameNet
• Frame Semantics (Fillmore 1976, ...)
– Frame: a conceptual structure or prototypical situation
– Frame elements (roles): participants of the situation
– Frame evoking elements (FEEs; verbs, nouns, …)
• Example instances of Statement:
1. “[He Speaker] speaks [highly Manner] [of you Topic],” she said.
2. “Did [Dominic Speaker] ever make any comments [regarding Toby Topic] [to you Addressee]?”
• Berkeley FrameNet Project
– Database of frames for core lexicon of English
– Current release: 615 frames, ~ 8000 lexical units (LUs)
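The bracket notation in the example instances can be read mechanically. A minimal Python sketch, assuming every annotated span has the form [text Role] with a single-token role name; the function names are illustrative and not part of FrameNet:

```python
import re

# One annotated span: "[", filler text, a space, the role name, "]".
ROLE = re.compile(r"\[(.+?) (\w+)\]")

def roles(annotated):
    """Return (role, filler) pairs in the order they appear."""
    return [(role, text) for text, role in ROLE.findall(annotated)]

def plain(annotated):
    """Strip the annotation and keep only the filler text."""
    return ROLE.sub(r"\1", annotated)

sentence = "[He Speaker] speaks [highly Manner] [of you Topic],"
print(roles(sentence))
print(plain(sentence))
```

The frame evoking element ("speaks") is the part that stays outside the brackets.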
Saarbrücken SALSA (II) Project
• Manual frame-annotation of part of TIGER corpus
• Develop automatic methods for Frame/Role assignment
• Study metaphors, multi-word expressions
• Study frames in context
• Work out logical representation for heuristic inferences
• Funded by DFG
Using Frames for NLP applications
• LFG-based parsing and syntax-semantics interface
– ParGram grammars for German and English (Butt et al. 2002)
– Interfaces to statistical frame and role assignment (Baldewein et al. 2004, Erk 2004)
– Frame projection from f-structure (XLE transfer, Crouch 2005)
• Enriching Semantic Representation
– Rule-based refinement of semantic representation
– Automatic assignment of SUMO/MILO classes (using WordNet WSD)
• Logical Representation and Reasoning
– FEF (frame exchange format)
– Translation to logic programs (joint work with P. Baumgartner and F. Suchanek, MPI Saarbrücken)
– First scenario: RTE Challenge (PASCAL Network)
FEFViewer (by Alexander Koller)
FEF Example
F-Structure
string('Jessica Litman is a law professor.').
xcomp(f(0),f(13)). tense(f(0),pres). stmt_type(f(0),declarative). pred(f(0),be). mood(f(0),indicative). dsubj(f(0),f(1)).
proper(f(1),name). pred(f(1),'Litman'). num(f(1),sg). mod(f(1),f(4)).
proper(f(4),name). pred(f(4),'Jessica'). num(f(4),sg).
subj(f(13),f(1)). pred(f(13),professor). num(f(13),sg). mod(f(13),f(16)). det_type(f(13),indef).
pred(f(16),law). num(f(16),sg).
Semantics Projection
frame(s(93),'Education_teaching'). rel(s(93),professor). ont(s(93),s(154)).
wn_syn(s(154),'professor#n#1'). sumo_sub(s(154),'Position'). milo_syn(s(154),'Professor').
rel(s(157),law). ont(s(157),s(156)).
wn_syn(s(156),'law#n#3'). sumo_sub(s(156),'Proposition'). milo_sub(s(156),'Proposition').
rel(s(166),'Jessica'). ont(s(166),s(165)).
sumo_syn(s(165),'Human').
frame(s(168),'People'). person(s(168),s(168)). person(s(168),s(166)).
rel(s(168),'Litman'). ont(s(168),s(167)).
sumo_syn(s(167),'Human').
sslink(f(1),s(168)). sslink(f(4),s(166)). sslink(f(13),s(93)). sslink(f(16),s(157)).
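Facts of this shape are easy to read back in. A minimal Python sketch, assuming only binary facts of the form name(arg1,arg2). as in the excerpt; it is not a reader for the full FEF format, and the function names are illustrative:

```python
import re

# A few facts copied from the example above.
FEF = """
pred(f(13),professor). pred(f(1),'Litman').
frame(s(93),'Education_teaching'). frame(s(168),'People').
sslink(f(1),s(168)). sslink(f(13),s(93)).
"""

# An argument is a node such as f(13) or s(93), a quoted atom, or a bare atom.
ARG = r"(?:[fs]\(\d+\)|'[^']*'|\w+)"
FACT = re.compile(rf"(\w+)\(({ARG}),({ARG})\)\.")

def parse_facts(text):
    """Return (predicate, arg1, arg2) triples with quotes stripped."""
    return [(p, a.strip("'"), b.strip("'")) for p, a, b in FACT.findall(text)]

def frames_by_pred(facts):
    """Follow sslink from an f-structure node to the frame of its semantic node."""
    pred = {a: b for p, a, b in facts if p == "pred"}
    frame = {a: b for p, a, b in facts if p == "frame"}
    return {pred[f]: frame[s] for p, f, s in facts
            if p == "sslink" and f in pred and s in frame}

print(frames_by_pred(parse_facts(FEF)))
```

The sslink facts are what tie the syntactic layer (f-nodes) to the semantic layer (s-nodes), so the lookup goes through them.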
Statistical Frame Assignment - Example -
“The Royal Navy servicemen being held captive by Iran are expected to be freed today.”
statistical (79,83) Calendric_unit
statistical (58,65) Expectation
Statistical Frame Assignment - Issues -
Learning statistical frame assignment from annotated FrameNet data
– Coverage (often too few examples to learn)
– Too little ambiguity
  • Reason: frame-wise annotation
  • E.g. have only LU of Birth
  • 0.7% of the current 8000 LUs ambiguous at all
  • Baseline for assigning each word its most frequent frame at 93% f-score.
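A minimal Python sketch of the most-frequent-frame baseline mentioned above. The training pairs are invented toy data, not FrameNet annotations, and the function name is hypothetical:

```python
from collections import Counter, defaultdict

def train_baseline(annotated):
    """annotated: iterable of (lemma, frame) pairs; returns lemma -> most frequent frame."""
    counts = defaultdict(Counter)
    for lemma, frame in annotated:
        counts[lemma][frame] += 1
    return {lemma: c.most_common(1)[0][0] for lemma, c in counts.items()}

# Toy training data (assumption, for illustration only).
toy = [("expect", "Expectation"), ("expect", "Expectation"),
       ("hold", "Containing"), ("hold", "Detaining"), ("hold", "Containing")]
model = train_baseline(toy)
print(model["hold"])
print(model.get("serviceman"))
```

A lemma that never occurs in the training data gets no frame at all, which is the coverage problem the slide points to.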
Frame Assignment via WordNet “Detour”
• Assign frame(s) on the basis of WordNet related words
• Addresses coverage problem
• Requires WSD to WordNet
– SenseRelate system by Ted Pedersen et al. available, alternatively
– always take first (most frequent) synset
Frame Assignment via Detour - Example -
“The Royal Navy servicemen being held captive by Iran are expected to be freed today.”
statistical (79,83) Calendric_unit
statistical (58,65) Expectation
serviceman#n#1 (16,26) People
hold#v#20 (33,37) Containing
captive#a#1 (38,45) Prison
expect#v#1 (58,66) Expectation
free#v#6 (73,78) Emitting, Use_firearm
FN-Detour Algorithm
Input: a target word (synset)
1. Use WordNet
   Search words = target word, synonyms, antonyms, hypernyms
2. Look up FrameNet
   Candidate frames = all frames that list any search word as LUs
3. Select and return best frame(s) from candidate frames
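The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the Detour system's code: WORDNET and LU_INDEX are small hand-written stand-ins (filled with words from the serviceman example) for real WordNet and FrameNet lookups, and step 3 uses a placeholder ranking by number of evoking words where the system uses weights.

```python
# Toy stand-in for WordNet relations of one synset (assumption).
WORDNET = {
    "serviceman#n#1": {
        "synonyms": ["military man", "man", "military personnel"],
        "antonyms": [],
        "hypernyms": ["skilled worker", "worker", "person", "individual",
                      "organism", "object", "entity", "cause"],
    },
}
# Toy stand-in for the FrameNet lexical-unit index: word -> frames (assumption).
LU_INDEX = {"man": ["People"], "individual": ["People"], "person": ["People"],
            "cause": ["Causation"], "object": ["Goal"]}

def search_words(synset, lemma):
    """Step 1: target word plus synonyms, antonyms and hypernyms."""
    entry = WORDNET[synset]
    return [lemma] + entry["synonyms"] + entry["antonyms"] + entry["hypernyms"]

def candidate_frames(words):
    """Step 2: frame -> search words listed as its LUs."""
    candidates = {}
    for word in words:
        for frame in LU_INDEX.get(word, []):
            candidates.setdefault(frame, []).append(word)
    return candidates

def best_frames(candidates):
    """Step 3 (placeholder): keep the frame(s) evoked by the most search words."""
    if not candidates:
        return []
    top = max(len(ws) for ws in candidates.values())
    return sorted(f for f, ws in candidates.items() if len(ws) == top)

def detour(synset, lemma):
    return best_frames(candidate_frames(search_words(synset, lemma)))

print(detour("serviceman#n#1", "serviceman"))
```

Because "serviceman" itself is not in the toy LU index, the frame is found only through its WordNet relatives, which is the point of the detour.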
Detour Example, Step 1: WordNet
Target: serviceman#n#1
serviceman, military man, man, military personnel
   => skilled worker, trained worker
      => worker
         => person, individual, someone, somebody, mortal, human, soul
            => organism, being
               => living thing, animate thing
                  => object, physical object
                     => entity
            => causal agent, cause, causal agency
               => entity
Detour Example, Step 2: Candidate Frames
Word(s)                    Frame(s)
cause                      Causation
object                     Goal
man, individual, person    People
Detour Example, Step 3: Weights
Word(s)                    Frame(s)     Weight
man, individual, person    People       1.68
cause                      Causation    0.06
object                     Goal         0.03
Weighting
• Factors
1. WordNet distance of FEE from target word (similarity)
2. “Spreading factor”, i.e. the number of frames a word evokes
3. Matching vs. LU lookup (boost)
• $\dfrac{\mathrm{similarity}(\mathit{wordnet\_relative},\ \mathit{target\_word}) \cdot \mathrm{BoostFactor}}{\mathrm{Spreading\_factor}(\mathit{wordnet\_relative})}$
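A minimal Python sketch of how the three factors listed above could combine. Several parts are assumptions made for illustration: similarity is taken as 1 / (1 + WordNet distance), a frame's weight is the sum of the contributions of all words that evoke it, and the toy LU index is invented. The slides do not specify these details, so the numbers will not match the example weights.

```python
from collections import defaultdict

def frame_weights(relatives, lu_index, boost=1.0):
    """relatives: word -> WordNet distance from the target word.
    lu_index: word -> list of frames the word evokes."""
    weights = defaultdict(float)
    for word, distance in relatives.items():
        frames = lu_index.get(word, [])
        if not frames:
            continue
        similarity = 1.0 / (1 + distance)  # assumed similarity measure
        spreading = len(frames)            # number of frames the word evokes
        for frame in frames:
            # Assumed aggregation: contributions of different words add up.
            weights[frame] += similarity * boost / spreading
    return dict(weights)

# Toy inputs (assumption, not FrameNet data).
relatives = {"man": 0, "person": 3, "cause": 9}
lu_index = {"man": ["People"], "person": ["People"],
            "cause": ["Causation", "Reason"]}
print(frame_weights(relatives, lu_index))
```

Close relatives that evoke a single frame dominate; distant relatives that evoke several frames contribute little to each.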
Special: Matching Frame Names
• E.g. Research does not (yet) list the noun researcher as LU
• If there is no LU for a given word, Detour system looks for matching frame names
• Lower weighting for match
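A minimal Python sketch of the fallback described above. The matching rule used here (the word starts with the lower-cased frame name) is an assumption chosen so that "researcher" matches Research; the slides do not say how matching is done, and the function name is hypothetical.

```python
def frames_for(word, lu_index, frame_names):
    """Return (frames, kind): LU lookup first, frame-name match as fallback."""
    if word in lu_index:
        return lu_index[word], "LU"
    # Assumed matching rule: the word starts with the lower-cased frame name.
    matches = [name for name in frame_names
               if word.lower().startswith(name.lower())]
    return matches, "Match"

# Toy LU index and frame list (assumption, for illustration only).
lu_index = {"scientist": ["People_by_vocation"]}
frame_names = ["Research", "People", "People_by_vocation"]
print(frames_for("researcher", lu_index, frame_names))
print(frames_for("scientist", lu_index, frame_names))
```

The returned kind ("LU" or "Match") is what a weighting step would need in order to treat the two cases differently.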
Matching Example
• Target: researcher#n#1
research worker, researcher, investigator
   => scientist, man of science
      => person, individual, someone, somebody, …
         => organism, being
            => living thing, animate thing
               => object, physical object
                  => entity
         => causal agent, cause, causal agency
            => entity
Matching Example (ctd)
• Target: researcher#n#1
Word(s)                        Type     Frame(s)              Weight
researcher, research worker    Match    Research              2
scientist                      LU       People_by_vocation    0.38
individual, person             LU       People                0.33
...
Evaluation
• Problem: no ready-made gold standard
• FrameNet data (100,000 annotated instances)
– All annotated words are LUs of some frame
– Detour not really necessary
• Solution: detour-only version of our system must not look up target word
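A minimal Python sketch of how such an evaluation could be tallied. The function `assign` stands for a detour-only frame assigner (one that skips the LU lookup of the target word itself); it and the toy instances are hypothetical.

```python
from collections import Counter

def tally(instances, assign):
    """instances: iterable of (target, gold_frame) pairs.
    assign: function mapping a target to a list of frames.
    Returns how many targets got none / 1 / >1 frames, and how many
    of those results contain the gold frame."""
    buckets, hits = Counter(), Counter()
    for target, gold in instances:
        frames = assign(target)
        key = "none" if not frames else "1" if len(frames) == 1 else ">1"
        buckets[key] += 1
        if gold in frames:
            hits[key] += 1
    return buckets, hits

# Toy assigner and gold data (assumption, for illustration only).
table = {"a": ["F1"], "b": ["F1", "F2"], "c": []}
gold = [("a", "F1"), ("b", "F2"), ("c", "F3")]
buckets, hits = tally(gold, lambda t: table.get(t, []))
print(dict(buckets), dict(hits))
```

Dividing each count by the number of instances gives percentages in the layout of a results table with columns none, 1 and >1.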
First Evaluation Results (detour-only)
Frames assigned per synset       none    1      >1
Total instances                  13%     71%    16%
Gold standard frame contained    -       38%    7%
Table 1: Frame assignment of detour-only system (FrameNet corpus). 80,000 frame instances (60,000 verb, 20,000 noun, 20,000 adj./adv.)
Inspection of Misses
Gold standard frame | Frames assigned by system | Instances
Manufacturing
Invention
Intentionally_create
Building
Cause_to_start
Getting
Transformation
1912
11541
Recent Evaluation Results (detour-only)
• “Return best frame(s)” condition may be too strict (ambiguity is there)
• Take first and second best result frame(s)
– Gold standard contained: +10%
– Number of returned frames rises from 1.3 to 3
• Does the WSD system help?
– “Always take first synset” slightly better: +4%
Evaluation (full system)
• Coverage: 96%
• Gold standard in (best) result: 83%
– WSD not always optimal
– Ambiguity leads to a higher weighting of another frame
Issues
• Just to mention: frames only (no roles)
• Weighting hand-crafted, improvement possible?
• Threshold needed (“Is there a frame that fits?”)
• What about German?
– Access to GermaNet
  • Available Perl packages for WordNet 2.0
  • WSD system as well
  • Encoding problems (“Period of transition”)
– German FrameNet data not (yet) in Berkeley format
– Coverage?
Conclusion and Outlook
• Detour via WordNet allows assignment of FrameNet frames in many “unknown” cases
• Still: this is the beginning of a journey
• Web interface (link on my homepage): http://www.coli.uni-saarland.de/~albu/cgi-bin/FN-Detour.cgi
• Student project to
– Prepare release
– More evaluation
– Learning of weighting?
– Transfer to German?