SALSA

The Saarbrücken Lexical Semantics Annotation & Acquisition Project

Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Pado, Manfred Pinkal


Semantic Annotation in SALSA

Manual semantic annotation of 0.8 million words of syntactically annotated German newspaper text (TIGER Corpus, Releases 1 and 2)

Annotation with frames and frame elements, staying as close as possible to the Berkeley FrameNet database

SALSA: What's special?

SALSA is about German

Cross-lingual divergences?

Cross-lingual Divergences

Convincing cross-lingual portability results (ED) in general

Adaptation necessary because of:

Inappropriate granularity of distinctions between FEs

Missing FEs

(Rare cases of) inappropriate granularity of frames

SALSA: What's special?

SALSA is about German

Cross-lingual divergences?

Corpus-driven lexicon development through exhaustive full-text annotation

Difficult cases

Incompleteness of Berkeley FrameNet

Difficult cases

Metaphors

Support Verb Constructions

Idioms

Difficult phenomena: Some Figures

                     Sample of 246 lemmas     Sub-corpus nehmen ("take")
                     Number         %         Number         %
Standard readings      4638     85.7%             42     17.4%
Metaphor                369      6.8%             38     15.8%
Support                 326      6.0%            132     54.8%
Idiom                    79      1.5%             29     12.0%
Non-literal use         774     14.3%            199     82.6%
Total                  5412    100.0%            241    100.0%
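
The figures in this table can be cross-checked with a few lines of arithmetic. The following is a minimal sketch: the counts are copied from the table, while the names `counts`, `NON_LITERAL`, `column`, and `summarize` are illustrative assumptions and not part of any SALSA tool. It recomputes the totals, the "Non-literal use" row (metaphor + support + idiom), and the percentage shares.

```python
# Counts copied from the table: (sample of 246 lemmas, sub-corpus "nehmen").
counts = {
    "Standard readings": (4638, 42),
    "Metaphor": (369, 38),
    "Support": (326, 132),
    "Idiom": (79, 29),
}

# Categories that the table sums up as "Non-literal use".
NON_LITERAL = ("Metaphor", "Support", "Idiom")


def column(index):
    """Return one column of the table as {category: count}."""
    return {label: pair[index] for label, pair in counts.items()}


def summarize(col):
    """Return (total, non-literal count, {category: share in percent})."""
    total = sum(col.values())
    non_literal = sum(col[label] for label in NON_LITERAL)
    shares = {label: round(100 * n / total, 1) for label, n in col.items()}
    return total, non_literal, shares


sample_total, sample_non_literal, sample_shares = summarize(column(0))
nehmen_total, nehmen_non_literal, nehmen_shares = summarize(column(1))

print(sample_total, sample_non_literal, sample_shares)
print(nehmen_total, nehmen_non_literal, nehmen_shares)
```

The recomputed totals and shares agree with the table. The contrast is the point of the slide: non-literal uses are a minority in the overall sample but the large majority for nehmen, driven mainly by support verb constructions.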

SALSA corpus: Release I

Total size of 20,000 annotated instances

Consistent annotation through different verification steps

All occurrences/readings of > 400 German verbal predicates (different frequency bands)

Scheduled for Summer 2006

The SALTO Annotation Tool

SALSA II: Automatic Annotation and Acquisition

Fred, Rosy, and Shalmaneser: a tool-chain for shallow semantic analysis (talk by Katrin and Sebastian)

SALSA II: Automation

Fred, Rosy, and Shalmaneser: a tool-chain for shallow semantic analysis (talk by Katrin and Sebastian)

The Detour System, through WordNet to FrameNet (talk by Anette and Al)

[Figure: labels "Fred & Rosy" and "Fred, Detour & Rosy"]

SALSA II: Automation

Fred, Rosy, and Shalmaneser: a tool-chain for shallow semantic analysis (talk by Katrin and Sebastian)

The Detour System, through WordNet to FrameNet (talk by Anette and Al)

Cross-lingual projection of frame-semantic information (Katrin and Sebastian)

Cross-lingual Projection

SALSA II: Automation & Application

Fred, Rosy, and Shalmaneser: a tool-chain for shallow semantic analysis (talk by Katrin and Sebastian)

The Detour System, through WordNet to FrameNet (talk by Anette and Al)

Cross-lingual projection of frame-semantic information (Katrin and Sebastian)

Textual Entailment, RTE (Anette and Al)

t: In 1983, Aki Kaurismäki directed his first full-time feature.

h: Aki Kaurismäki directed a film.

WordNet related

Grammatically related

SALSA: Future Work

Bootstrapping frame information by data expansion techniques

Linking lexical semantic resources with upper-model ontologies

Analysis of non-compositional phenomena

A worked-out semantic lexicon

Application to textual entailment