csi: a coarse sense inventory for 85% wsd · csi labels are easy to use and have a high...

2
CSI: A Coarse Sense Inventory for 85% WSD https://sapienzanlp.github.io/csi/ English All-Words WSD Results on the ALL dataset in terms of F1 and perplexity/F1 trade-off (GTO). The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union’s Horizon 2020 research and innovation programme. This work was supported in part by the MIUR under grant “Dipartimenti di eccellenza 2018-2022” of the Department of Computer Science of Sapienza University. Zero- and Few-Shot Learning CSI increases the sample efficiency of a supervised WSD model. CSI enables BERT to attain the best performance on out-of-vocabulary words. Metrics ● We need to consider at the same time the difficulty of the task and the performance of the model. ● Geometric Trade-Off (GTO): geometric mean of F1 and the perplexity (PPL) of a random guessing model: Descriptiveness Task The street drowsed on an August afternoon in the shade of the curbside tree, and silence was a weight. Physics & astronomy Noun.state Factotum Acoustics Music, sound & dancing Noun.attribute Achievements Competitors Semi-automatic 200 labels All parts of speech included Factotum label covers more than 18% of WordNet synsets SuperSenses (SuS) Manual 45 labels Only nouns and verbs are meaningfully clustered WordNet Domains (WND) Experimental Setup ELMo contextualized embeddings + bi-directional LSTM BERT contextualized embeddings + bi-directional LSTM BERT contextualized embeddings + dense layer Training: SemCor Dev: SemEval-07 Test: Senseval-2 + Senseval-3 + SemEval-13 + SemEval-15 (ALL) CSI ●A Coarse Sense Inventory (CSI) obtained by manually grouping Roget’s categories into 45 labels and mapping them to WordNet synsets. Shared among lemmas, covering all parts of speech. ● CSI labels are easy to use and have a high descriptiveness. clothing fashion clothing textile FASHION AND CLOTHING {fabric, cloth, textile} {bobby pin, hair grip} {dress, frock} Roget’s Categories CSI Labels WordNet Synsets fragrance pungency odour OLFACTORY {olfactory property, smell, aroma} {odour, smell, olfactory perception} Label Identification Synset Mapping Building the Inventory 1 2 200 target words from SemCor annotated by 3 annotators. Caterina Lacerra*, Michele Bevilacqua*, Tommaso Pasini and Roberto Navigli {lacerra, bevilacqua, pasini, navigli}@di.uniroma1.it Ducks were swimming in the pool. Word Sense Disambiguation: the fine-granularity problem SPORT, GAMES AND RECREATION A small body of standing water. An excavation that is filled with water. A small lake. GEOGRAPHY AND PLACES BUSINESS, ECONOMICS AND FINANCE An association of companies. An organization of people that can be shared. Any of various games played on a pool table. Pool? Descriptive labels, that are easy to use. Trade-off between granularity and performance. Reduce the need for manually annotated data. Tn : Training corpus with n annotated sentences per target word Labels CSI 0.81 2.23 WND 0.74 1.80 SuS 0.69 2.04 WordNet 0.51 - K-IAA [0-1] Descriptiveness [1-3] POSSESSION COMPUTING NAUTICAL TIME HIStoRY OLFACTORY BIOLOGY VISUAL MEDIA CHORES & ROUTINE CHEMISTRY & MINERALOGY GEOGRAPHY & PLACES EDUCATION & SCIENCE SEX FARMING EVALUATION LAW & CRIME ENVIRONMENT LIQUID & GAS emotions GENERAL METEOROLOGY Space & touch MATHEMATICS Transport & travel FISHING & HUNTING Health & medicine Physics & astronomy FOOD, DRINK & TASTE

Upload: others

Post on 13-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CSI: A Coarse Sense Inventory for 85% WSD · CSI labels are easy to use and have a high descriptiveness. clothing fashion clothing textile FASHION AND CLOTHING {fabric, cloth, textile}

CSI: A Coarse Sense Inventory for 85% WSDhttps://sapienzanlp.github.io/csi/

English All-Words WSD

Results on the ALL dataset in terms of F1 and perplexity/F1 trade-off (GTO).

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union’s Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under grant “Dipartimenti di eccellenza 2018-2022” of the Department of Computer Science of Sapienza University.

Zero- and Few-Shot Learning

CSI increases the sample efficiency of a supervised WSD model.

CSI enables BERT to attain the best performance on out-of-vocabulary words.

Metrics● We need to consider at the same time

the difficulty of the task and the performance of the model.

● Geometric Trade-Off (GTO): geometric mean of F1 and the perplexity (PPL) of a random guessing model:

Descriptiveness Task

The street drowsed on an August afternoon in the shade of the curbside tree, and silence was a weight.

Physics & astronomyNoun.state Factotum

Acoustics Music, sound & dancingNoun.attribute

Achievements

Competitors

Semi-automatic 200 labels All parts of speech included

Factotum label covers more than 18% of WordNet synsets

SuperSenses (SuS) Manual 45 labels

Only nouns and verbs are meaningfully clustered

WordNet Domains (WND)

Experimental Setup

ELMo contextualized embeddings + bi-directional LSTM

BERT contextualized embeddings + bi-directional LSTM

BERT contextualized embeddings + dense layer

Training: SemCor

Dev: SemEval-07

Test: Senseval-2 + Senseval-3 + SemEval-13 + SemEval-15 (ALL)

CSI● A Coarse Sense Inventory (CSI)

obtained by manually grouping Roget’s categories into 45 labels and mapping them to WordNet synsets.

● Shared among lemmas, covering all parts of speech.

● CSI labels are easy to use and have a high descriptiveness.

clothing fashion

clothing textile

FASHION AND CLOTHING

{fabric, cloth, textile}

{bobby pin, hair grip}

{dress, frock}

Roget’s Categories

CSI Labels

WordNet Synsets

fragrancepungency

odour

OLFACTORY

{olfactory property, smell, aroma}

{odour, smell, olfactoryperception}

Label Identification

Synset Mapping

Building the Inventory

1

2

200 target words from SemCor annotated by 3 annotators.

Caterina Lacerra*, Michele Bevilacqua*, Tommaso Pasini and Roberto Navigli{lacerra, bevilacqua, pasini, navigli}@di.uniroma1.it

Ducks were swimming in the pool.

Word Sense Disambiguation: the fine-granularity problem

SPORT, GAMES AND RECREATION

A small body of standing water.

An excavation that is filled with water.

A small lake.

GEOGRAPHY AND PLACES

BUSINESS, ECONOMICS AND FINANCE

An association of companies.

An organization of people that can be shared.

Any of various games played on a pool table.

Pool?

Descriptive labels, that are easy to use.

Trade-off between granularity and performance.

Reduce the need for manually annotated data.Tn : Training corpus with n annotated sentences per target word

Labels

CSI 0.81 2.23

WND 0.74 1.80

SuS 0.69 2.04

WordNet 0.51 -

K-IAA[0-1]

Descriptiveness [1-3]

POSSESSION

COMPUTING

NAUTICALTIME

HIStoRY

OLFACTORY

BIOLOGY VISUAL

MEDIA

CHORES & ROUTINE

CHEMISTRY & MINERALOGY

GEOGRAPHY

& PLACES

EDUCATION & SCIENCESEX

FARMING

EVALUATION

LAW & CRIME

ENVIRONMENT

LIQUID & GAS

emotions

GENERALMETEOROLOGY

Space & touch

MATHEMATICSTransport & travel

FISHING &

HUNTING

Health & medicine

Physics &

astronomy

FOOD, DRINK & TASTE

Page 2: CSI: A Coarse Sense Inventory for 85% WSD · CSI labels are easy to use and have a high descriptiveness. clothing fashion clothing textile FASHION AND CLOTHING {fabric, cloth, textile}

CSI: A Coarse Sense Inventory for 85% WSDhttps://sapienzanlp.github.io/csi/

English All-Words WSD

Results on the ALL dataset in terms of F1 and perplexity/F1 trade-off (GTO).

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union’s Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under grant “Dipartimenti di eccellenza 2018-2022” of the Department of Computer Science of Sapienza University.

Zero- and Few-Shot Learning

CSI increases the sample efficiency of a supervised WSD model.

CSI enables BERT to attain the best performance on out-of-vocabulary words.

Metrics● We need to consider at the same time

the difficulty of the task and the performance of the model.

● Geometric Trade-Off (GTO): geometric mean of F1 and the perplexity (PPL) of a random guessing model:

Descriptiveness Task

The street drowsed on an August afternoon in the shade of the curbside tree, and silence was a weight.

Physics & astronomyNoun.state Factotum

Acoustics Music, sound & dancingNoun.attribute

Achievements

Competitors

Semi-automatic 200 labels All parts of speech included

Factotum label covers more than 18% of WordNet synsets

SuperSenses (SuS) Manual 45 labels

Only nouns and verbs are meaningfully clustered

WordNet Domains (WND)

Experimental Setup

ELMo contextualized embeddings + bi-directional LSTM

BERT contextualized embeddings + bi-directional LSTM

BERT contextualized embeddings + dense layer

Training: SemCor

Dev: SemEval-07

Test: Senseval-2 + Senseval-3 + SemEval-13 + SemEval-15 (ALL)

CSI● A Coarse Sense Inventory (CSI)

obtained by manually grouping Roget’s categories into 45 labels and mapping them to WordNet synsets.

● Shared among lemmas, covering all parts of speech.

● CSI labels are easy to use and have a high descriptiveness.

clothing fashion

clothing textile

FASHION AND CLOTHING

{fabric, cloth, textile}

{bobby pin, hair grip}

{dress, frock}

Roget’s Categories

CSI Labels

WordNet Synsets

fragrancepungency

odour

OLFACTORY

{olfactory property, smell, aroma}

{odour, smell, olfactoryperception}

Label Identification

Synset Mapping

Building the Inventory

1

2

200 target words from SemCor annotated by 3 annotators.

Caterina Lacerra*, Michele Bevilacqua*, Tommaso Pasini and Roberto Navigli{lacerra, bevilacqua, pasini, navigli}@di.uniroma1.it

Ducks were swimming in the pool.

Word Sense Disambiguation: the fine-granularity problem

SPORT, GAMES AND RECREATION

A small body of standing water.

An excavation that is filled with water.

A small lake.

GEOGRAPHY AND PLACES

BUSINESS, ECONOMICS AND FINANCE

An association of companies.

An organization of people that can be shared.

Any of various games played on a pool table.

Pool?

Descriptive labels, that are easy to use.

Trade-off between granularity and performance.

Reduce the need for manually annotated data.Tn : Training corpus with n annotated sentences per target word

Labels

CSI 0.81 2.23

WND 0.74 1.80

SuS 0.69 2.04

WordNet 0.51 -

K-IAA[0-1]

Descriptiveness [1-3]

POSSESSION

COMPUTING

NAUTICAL

TIME

HIStoRY

OLFACTORY

BIOLOGYVISUAL

MEDIA

CHORES & ROUTINE

CHEMISTRY & MINERALOGY

GEOGRAPHY & PLACES

EDUCATION & SCIENCE

SEX

FARMING

EVALUATION

LAW & CRIME

ENVIRONMENT

LIQUID & GAS

emotions

GENERALMETEOROLOGY

Space & touch MATHEMATICS

Transport & travel

FISHING & HUNTING

Health & medicine

Physics & astronomy

FOOD, DRINK & TASTE