Combining Approaches for Identifying Metonymy Classes of Named Locations
Sven Hartrumpf and Johannes Leveling
Intelligent Information and Communication Systems (IICS)
University of Hagen (FernUniversität in Hagen)
58084 Hagen, Germany
Talk given at EPIA 2007, December 4, 2007, Guimarães, Portugal
Identifying Metonymy Classes of Named Locations
S. Hartrumpf and J. Leveling
Introduction
Metonymy Classes for Location Names
Corpus Annotation with Metonymy Information
Metonymy Classifiers
Classifier Combination
Evaluation Results
Conclusion and Outlook
References
Outline
1 Introduction
2 Metonymy Classes for Location Names
3 Corpus Annotation with Metonymy Information
4 Metonymy Classifiers
5 Classifier Combination
6 Evaluation Results
7 Conclusion and Outlook
Figurative Speech
Definition: Metonymy is a figure of speech in which a speaker uses one entity to refer to another that is related to it (Lakoff and Johnson, 1980)
→ senses different from the normal reading
→ identifying metonymy can be seen as word sense disambiguation
→ classification task
• Levels of classification:
  • coarse (LITERAL / NON-LITERAL)
  • medium (LIT / MET / MIX)
  • fine
Metonymy
• Metonymy recognition experiments have typically been carried out on English texts
• Growing importance in research and applications:
  • SemEval-2007 task at ACL 2007 (Markert and Nissim, 2007): recognition of metonymic location and organization names
  • Question Answering (Stallard, 1993)
  • Machine Translation (Kamei and Wakao, 1992)
  • Geographic Information Retrieval (Leveling and Hartrumpf, 2006)
Metonymy Classes (Markert and Nissim, 2002)
Medium  Fine                        Description
LIT     literal                     literal, geographic sense
MET     place-for-event             → event
        place-for-people:
          place-for-gov(ernment)    → people in government
          place-for-off(icials)     → people in official administration
          place-for-org(anization)  → organization at location
          place-for-pop(ulation)    → population
        place-for-product           → product from place
        othermet                    metonymy not covered by regular pattern
MIX     mixed                       mixed literal and metonymic sense
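The three classification levels nest: each fine class belongs to one medium class, and each medium class to one coarse class. A minimal sketch of that mapping follows. The dictionary and function names are illustrative and not from the talk. Grouping MIX under NON-LITERAL is taken from the corpus counts, where the 1,515 nonliteral toponyms are the MET and MIX instances together.

```python
# Fine-grained classes mapped to their medium-level class, following the class table.
FINE_TO_MEDIUM = {
    "literal": "LIT",
    "place-for-event": "MET",
    "place-for-gov": "MET",
    "place-for-off": "MET",
    "place-for-org": "MET",
    "place-for-pop": "MET",
    "place-for-product": "MET",
    "othermet": "MET",
    "mixed": "MIX",
}

# Medium-level classes mapped to the coarse level (MET and MIX are both non-literal).
MEDIUM_TO_COARSE = {"LIT": "LITERAL", "MET": "NON-LITERAL", "MIX": "NON-LITERAL"}


def coarsen(fine_label):
    """Return the (medium, coarse) labels for a fine-grained label."""
    medium = FINE_TO_MEDIUM[fine_label]
    return medium, MEDIUM_TO_COARSE[medium]


print(coarsen("place-for-off"))  # ('MET', 'NON-LITERAL')
print(coarsen("mixed"))          # ('MIX', 'NON-LITERAL')
```

With such a mapping, predictions made on the fine level can be scored on the medium or coarse level without retraining.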
Example for literal:
Seit Beginn des Kosovo-Krieges rekrutiert die UCK in DEUTSCHLAND Kämpfer. – 9951
(Since the beginning of the Kosovo war, the UCK has been recruiting fighters in GERMANY.)
Example for place-for-event:
Nach dem KOSOVO geht es in Makedonien und Montenegro weiter. – 6336
(After KOSOVO, it will continue in Macedonia and Montenegro.)
Example for place-for-off:
... DEUTSCHLAND (wird) mehr Geschick haben als Clinton. – 2435
(... GERMANY will be more successful than Clinton.)
Example for place-for-product:
Politisch sollte die Unterschrift Belgrads unter RAMBOUILLET erzwungen werden. – 12087
(The signature of Belgrade under RAMBOUILLET should be forced politically.)
Example for othermet:
Dabei ist AFRIKA auch bei dieser Zusammenstellung von Musik eher eine ideelle Klammer. – 8415
(But AFRICA is an ideational cramp for this composition of music, too.)
Example for mixed:
Die Friedensfahrt gewinnt im Osten DEUTSCHLANDS wieder stark an Renommee. – 1498
(The peace tour makes a reputation in the eastern part of GERMANY again.)
Data and Annotation (1/2)
• TüBa-D/Z corpus containing articles from the German newspaper taz (27,067 sentences with 500,628 tokens)
• Annotation levels:
  • (PoS tags)
  • NE tags (LOC, PER, ORG, and MISC)
  • NE subclasses (e.g. first names, last names, and other parts of a name)
  • Label corresponding to medium and fine metonymy classification
  • Example: token Africa → (NE, LOC, region, MET, othermet)
→ 1,515 (18.5%) of all toponyms are used in a nonliteral sense
Data and Annotation (2/2)
Annotation checking:
• Applied the variation (or inconsistency) detection tool DECCA (http://decca.osu.edu/)
• Used corrections supplied by the TüBa-D/Z corpus publishers
• Identified additional spelling errors by frequency analysis
→ Errors found in the text and on the levels of PoS tags, NE tags, NE subclasses, and medium and fine metonymy classes
Frequency of Metonymy Classes
Coarse       Medium      Fine               Frequency
LITERAL      LIT         literal                 6672
NON-LITERAL  MET (1433)  place-for-event           55
                         place-for-gov             51
                         place-for-off            512
                         place-for-org            148
                         place-for-pop            340
                         place-for-product         10
                         othermet                 317
             MIX         mixed                     82
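The counts in the frequency table can be cross-checked against the figure of 1,515 nonliteral toponyms (18.5%) given earlier. The short script below only re-adds the numbers from the table. The variable names are illustrative.

```python
# Frequencies copied from the frequency table.
fine_counts = {
    "literal": 6672,
    "place-for-event": 55,
    "place-for-gov": 51,
    "place-for-off": 512,
    "place-for-org": 148,
    "place-for-pop": 340,
    "place-for-product": 10,
    "othermet": 317,
    "mixed": 82,
}

# MET covers every fine class except literal and mixed.
met_total = sum(n for label, n in fine_counts.items() if label not in ("literal", "mixed"))
non_literal = met_total + fine_counts["mixed"]
total = sum(fine_counts.values())
share = 100 * non_literal / total

print(met_total)        # 1433
print(non_literal)      # 1515
print(total)            # 8187
print(round(share, 1))  # 18.5
```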
Metonymy Classifiers
• All classifiers are based on a memory-based learner, TiMBL (supervised learning)
• The classifiers were implemented by different people
• Shallow classifier 1 (SC1): relies largely on features obtained from gazetteer lookup
• Shallow classifier 2 (SC2): includes features encoding ontological sorts from the context
• Deep classifier (DC): employs features from parse results (syntactico-semantic parsing with a semantically oriented computer lexicon)
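To illustrate what memory-based learning means here, the following is a minimal nearest-neighbour sketch. It is not TiMBL and does not reproduce its options or feature weighting, which the talk does not specify. Training instances are simply stored, and a new instance receives the majority label of its closest stored neighbours under a plain overlap distance. The three toy features and all names are invented for the example.

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, instance, k=1):
    """Majority label among the k stored instances closest to `instance`.

    `memory` is a list of (features, label) pairs kept verbatim from training.
    Ties in distance are resolved by storage order (sorted() is stable).
    """
    ranked = sorted(memory, key=lambda item: overlap_distance(item[0], instance))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, hand-made instances: (capitalized?, PoS tag, found in gazetteer?) -> label
memory = [
    (("yes", "NE", "yes"), "LIT"),
    (("yes", "NE", "no"), "MET"),
    (("no", "NN", "no"), "LIT"),
]

print(classify(memory, ("yes", "NE", "yes")))  # LIT
```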
Metonymy Classifier SC1
Main features for training instances:
• 109 features
• Character features (e.g. token starts with capital letter?)
• Semantic entities (entity classes for the token obtained from morpholexical analysis)
• PoS tags
• Gazetteer lookups (for cities, countries, etc.)
• Metonymy context (metonymy class of the token to the left)
Metonymy Classifier SC2
Main features for training instances:
• 269 features
• Sentence context (lemma and distance to the location token)
• Word context (the first three and the last three characters of the token, PoS tag, position in the sentence, upper/lower case information, and word length)
• Metonymy context (metonymy class of the two preceding tokens)
• Ontological sorts (for words in the context, using a bit vector representation of a sort hierarchy)
• Sentence length (number of tokens)
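Two of the SC2 feature groups can be sketched in a few lines: the word-context features and a bit vector over a sort hierarchy. This is an illustration under stated assumptions, not the SC2 implementation. The feature names, the miniature sort hierarchy, and the choice to set one bit for a sort plus each of its ancestors are all hypothetical.

```python
def word_context_features(tokens, pos_tags, index):
    """Word-context features for the token at `index`, plus the sentence length."""
    token = tokens[index]
    return {
        "prefix3": token[:3],                    # first three characters
        "suffix3": token[-3:],                   # last three characters
        "pos": pos_tags[index],                  # PoS tag
        "position": index,                       # position in the sentence
        "is_upper": token.isupper(),             # all upper case?
        "is_capitalized": token[:1].isupper(),   # starts with a capital letter?
        "length": len(token),                    # word length
        "sentence_length": len(tokens),          # number of tokens
    }


# Hypothetical miniature sort hierarchy: child -> parent (None marks the root).
SORT_PARENT = {
    "entity": None,
    "object": "entity",
    "situation": "entity",
    "concrete": "object",
    "abstract": "object",
}
SORT_ORDER = sorted(SORT_PARENT)  # fixes one bit position per sort


def sort_bit_vector(sort):
    """Set one bit for the sort itself and one for each of its ancestors."""
    active = set()
    while sort is not None:
        active.add(sort)
        sort = SORT_PARENT[sort]
    return [1 if name in active else 0 for name in SORT_ORDER]


tokens = ["Die", "UCK", "rekrutiert", "in", "Deutschland", "Kämpfer"]
pos_tags = ["ART", "NE", "VVFIN", "APPR", "NE", "NN"]
print(word_context_features(tokens, pos_tags, 4))
print(SORT_ORDER)
print(sort_bit_vector("concrete"))  # [0, 1, 1, 1, 0]
```

Encoding ancestors as well as the sort itself lets two related sorts share bits, so an overlap-based learner can treat them as partially similar rather than simply different.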
Metonymy Classifier DC (1/2)
Background:
• Syntactico-semantic parser (WOCADI) delivers features for the deep classifier
• Semantic result: MultiNet (multilayered extended semantic networks; Helbig, 2006); MultiNet nodes: disambiguated word readings (concepts)
• Syntactic result: dependency graph
• Important resource for the parser: semantically oriented lexicon (HaGenLex)
Metonymy Classifier DC (2/2)
• 13 features:
  • p-quality: quality of the parser result as a numerical value between 500 and 1000
  • token: name token
  • type: name type (i.e. lemma)
  • dep-rel: dependency relation leading to the governor (mother constituent)
  • role: semantic role filled by the name
  • appos-molec: is the name accompanied by a molecular apposition?
  • adjective: lemma of modifying adjective
  • csister-ctype: lemma of coordinated sister node with compound reduction
  • csister-entity: semantic entity value of coordinated sister node
  • mother-entity: semantic entity value of mother constituent
  • mother-sort: ontological sort of mother constituent
  • mother-type: type (i.e. lemma) of mother
  • mother-ctype: type (i.e. lemma) of mother with compound reduction
Classifier Combination
Features for training instances:
• 15 features
• Results for the location token (from SC1, SC2, DC)
• Results for tokens in the context (from SC1, SC2, DC)
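One way to picture the combination step is as stacking: the labels produced by SC1, SC2, and DC become the features of a new training instance. The sketch below assumes a layout of three classifier results for each of five token positions (the location token and two tokens on either side). That layout is chosen only because it yields 15 features. The talk does not state the actual layout, and the placeholder value and function name are hypothetical.

```python
CLASSIFIERS = ("SC1", "SC2", "DC")
WINDOW = 2       # assumed context size: two tokens on each side
NO_RESULT = "-"  # placeholder: no label available, or position outside the sentence


def combination_instance(predictions, index):
    """Build the combination feature vector for the token at `index`.

    `predictions` maps a classifier name to its per-token labels for one sentence.
    """
    features = []
    for offset in range(-WINDOW, WINDOW + 1):
        position = index + offset
        for name in CLASSIFIERS:
            labels = predictions[name]
            if 0 <= position < len(labels):
                features.append(labels[position])
            else:
                features.append(NO_RESULT)
    return features


# Toy predictions for a four-token sentence; the DC has no result for token 1.
predictions = {
    "SC1": ["-", "LIT", "-", "MET"],
    "SC2": ["-", "LIT", "-", "LIT"],
    "DC": ["-", "-", "-", "MET"],
}
instance = combination_instance(predictions, 3)
print(len(instance))  # 15
print(instance)
```

A placeholder for missing results matters for the deep classifier in particular, since it yields no label where the parser fails.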
Results on Coarse Level
(P = precision, R = recall, in %)
             SC1            SC2            DC             Combined
Class        P      R       P      R       P      R       P      R
LITERAL      89.16  93.74   93.36  93.71   94.01  36.71   95.13  94.83
NON-LITERAL  64.36  49.83   71.81  70.63   82.31  32.87   77.54  78.61
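Precision and recall can be folded into a single number with the F1 score, the harmonic mean of the two. The slides report only P and R, so the F1 values computed below are derived here for illustration and are not results from the talk. The input pairs are the NON-LITERAL row of the coarse-level table.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) for the NON-LITERAL class on the coarse level, copied from the table.
coarse_non_literal = {
    "SC1": (64.36, 49.83),
    "SC2": (71.81, 70.63),
    "DC": (82.31, 32.87),
    "Combined": (77.54, 78.61),
}

for name, (p, r) in coarse_non_literal.items():
    print(f"{name}: F1 = {f1(p, r):.2f}")
```

The harmonic mean penalizes a large gap between the two values, which is why the deep classifier's high precision does not offset its low recall.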
Results on Medium Level
(P = precision, R = recall, in %)
        SC1            SC2            DC             Combined
Class   P      R       P      R       P      R       P      R
LIT     88.97  94.18   93.35  93.68   93.80  36.75   94.75  95.23
MET     63.27  48.08   70.08  68.81   81.76  33.15   76.11  77.60
MIX     54.29  23.17   22.35  23.17   26.67   4.88   75.00  18.29
Results on Fine Level
(P = precision, R = recall, in %)
                   SC1            SC2            DC             Combined
Class              P      R       P      R       P      R       P      R
literal            87.71  96.36   92.55  94.08   93.35  36.84   89.78  97.80
mixed              55.88  23.17   17.72  17.07   22.22   4.88   85.71  14.63
othermet           41.03  20.19   34.75  30.91   34.62   8.52   52.55  22.71
place-for-event    37.50  10.91   12.50  12.73   46.67  12.73   30.00   5.45
place-for-gov      42.11  15.69   20.00  13.73   63.64  13.73   87.50  13.73
place-for-off      50.58  42.77   55.35  52.54   67.90  35.94   62.25  60.55
place-for-org      42.86  14.19   42.31  37.16   51.52  11.49   55.79  35.81
place-for-pop      30.87  13.53   45.29  43.82   52.35  22.94   61.15  28.24
place-for-product   0.00   0.00    0.00   0.00    0.00   0.00    0.00   0.00
Effect of Metonymy Support in the Lexicon
Metonymy  Sentence       #Sentences  Parse results (%)
support   constraint                 Full   Chunks  Failed
no        NON-LITERAL    1,124       47.15  37.46   15.39
no        no constraint  27,067      54.08  31.09   14.83
yes       NON-LITERAL    1,124       52.40  32.21   15.39
yes       no constraint  27,067      53.60  31.19   15.21
Conclusion and Outlook
• Classifiers differ in their strengths and weaknesses (for example, the deep method shows the highest precision values, but its recall is low because it is limited by the parser coverage)
→ Combined classifier outperforms each single classifier significantly
• Created a new resource about metonymy in German
• Metonymy support in the lexicon improves results of the syntactico-semantic parser
• Future work: investigate semantic representation of metonymic names; application to QA and GIR
Selected References
Helbig, Hermann (2006). Knowledge Representation and the Semantics of Natural Language. Berlin: Springer. URL http://www.springer.com/sgw/cda/frontpage/0,11855,1-40109-22-72041224-0,00.html.
Kamei, Shin-ichiro and Takahiro Wakao (1992). Metonymy: Reassessment, survey of acceptability, and its treatment in machine translation systems. In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics (ACL'92), pp. 309–311. Newark, Delaware.
Lakoff, George and Mark Johnson (1980). Metaphors We Live By. Chicago University Press.
Leveling, Johannes and Sven Hartrumpf (2006). On metonymy recognition for GIR. In Proceedings of GIR-2006, the 3rd Workshop on Geographical Information Retrieval (hosted by SIGIR 2006). Seattle, Washington. URL http://www.geo.unizh.ch/~rsp/gir06/papers/individual/leveling.pdf.
Markert, Katja and Malvina Nissim (2002). Towards a corpus annotated for metonymies: The case of location names. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC 2002). Las Palmas, Spain.
Markert, Katja and Malvina Nissim (2007). Task 08: Metonymy resolution at SemEval-07. In Proceedings of SemEval 2007.
Stallard, David (1993). Two kinds of metonymy. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (ACL'93), pp. 87–94. Columbus, Ohio. URL http://www.aclweb.org/anthology/P93-1012.