Do we need lexicographers? Prospects for automatic lexicography Adam Kilgarriff Lexical Computing Ltd University of Leeds UK



Do we need lexicographers? Prospects for automatic lexicography

Adam Kilgarriff

Lexical Computing Ltd

University of Leeds

UK


Bolzano, May 2012 Adam Kilgarriff 2

Outline

Precision and recall
Between corpus and dictionary
Shopping list
Conclusions


Find me all the fat cats

a request for information


High recall

Lots of responses
Maybe not all good


High precision

Fewer hits
Higher confidence


Information-seeking

            Recall   Precision
Computers   good     bad
People      bad      good
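The two measures contrasted above can be made concrete with a small calculation. This is a minimal sketch, not taken from the talk: the function name `precision_recall` and the toy data (cat names, the two searches) are assumptions chosen purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: what share of the responses are good.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: what share of the good items were found.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: the cats that really are fat.
fat_cats = {"tom", "garfield", "felix", "salem"}

# A broad search returns everything, including many non-fat animals.
broad_search = ["tom", "garfield", "felix", "salem", "rex", "fido", "polly", "nemo"]
# A cautious search returns only the cases it is sure about.
narrow_search = ["tom", "garfield"]

print(precision_recall(broad_search, fat_cats))   # high recall, lower precision
print(precision_recall(narrow_search, fat_cats))  # high precision, lower recall
```

The broad search finds every fat cat but half of its responses are wrong; the narrow search makes no mistakes but misses half the fat cats.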


Cyborg: part-human, part-computer

Treat your computer with respect. You and it can do great things together.


Lexicography: finding facts about words

Shopping list

collocations
grammatical patterns
examples
synonyms
labels
– region
– domain
– register
translations
meanings


Szeged, Jan 2008 Kilgarriff, Global WordNet 9

What is a word sense (1)

SFIP
– Sufficiently frequent, insufficiently predictable

(a glass of) whisky x (a glass of) tequila


What is a word sense (2)

homonymy

analogy polysemy rules

collocation


What is a word sense (3)

A cluster
– Of instances of use
Operationalised as: corpus lines
– Clustered by lexicographers


What is a word sense (3)

A cluster
– Of instances of use
Operationalised as: corpus lines
– Clustered by lexicographers
Makes sense of
– Overlapping senses
– Different dictionaries, different senses
– Lumping and splitting


I don’t believe in word senses

Believe in:
– resurrection ghost witch vampire god miracle fairy
Philosophy:
– Ontological commitment
– (same meaning different register)

“good entities to build belief systems on”


But I’m an NLP person

Automatic clustering?
Inspiration:
– Hindle 1991, Schütze 1993, Grefenstette 1993, Lin 1999
– You can get semantic sense from corpora+stats
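To give a feel for what "automatic clustering" of corpus lines might involve, here is a deliberately naive sketch. It is not the method of any of the cited authors or of the attempts described in this talk: the greedy single-pass strategy, the Jaccard overlap measure, the threshold of 0.2, the function names and the four example lines are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Overlap between two word sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_lines(lines, target, threshold=0.2):
    """Greedily group corpus lines by the words that occur around `target`."""
    clusters = []  # each cluster: {"context": set of words, "lines": list of str}
    for line in lines:
        context = set(line.lower().split()) - {target}
        # Find the existing cluster whose context words overlap most.
        best = max(clusters, key=lambda c: jaccard(context, c["context"]), default=None)
        if best is not None and jaccard(context, best["context"]) >= threshold:
            best["lines"].append(line)
            best["context"] |= context
        else:
            # Nothing similar enough yet: start a new cluster.
            clusters.append({"context": set(context), "lines": [line]})
    return [c["lines"] for c in clusters]


# Hypothetical corpus lines for the word "bank".
lines = [
    "she sat on the river bank fishing",
    "the bank approved the loan",
    "fishing from the river bank at dawn",
    "the bank raised its loan rates",
]

for group in cluster_lines(lines, "bank"):
    print(group)
```

On this toy input the river lines and the loan lines end up in separate groups. Real corpus data is far noisier, and a different threshold or line order can change the grouping.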


First attempt

Longman 1994
Abject failure
– No grammar
– Corpus too small and noisy
– Naïve clustering
– Useless programmer


Second attempt

SENSEVALS 1998, 2001, 2004…
Mitigated failure
– Rarely over two thirds correct


Third attempt

SADD (semi-automatic dictionary drafting) 2008
With Pavel Rychly
I thought I knew what I was doing but
– Probably a failure


Collocations

Easy
– Most words don’t go with most other words
Then build on what we can do well
(metaphor, analogy, homonymy, rules: all much harder)
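Why collocations are the easy case can be illustrated with a simple association score over adjacent word pairs. The slides do not name a statistic; this sketch assumes logDice, defined as 14 + log2(2·f(x,y) / (f(x) + f(y))), as one possible choice. The function name `log_dice` and the toy token list are hypothetical.

```python
import math
from collections import Counter


def log_dice(tokens):
    """Score each adjacent word pair with logDice = 14 + log2(2*f_xy / (f_x + f_y))."""
    word_freq = Counter(tokens)
    pair_freq = Counter(zip(tokens, tokens[1:]))
    return {
        (x, y): 14 + math.log2(2 * f_xy / (word_freq[x] + word_freq[y]))
        for (x, y), f_xy in pair_freq.items()
    }


# Hypothetical toy corpus; a real corpus would have millions of tokens.
tokens = "fat cat sat fat cat ran the dog sat the dog ran fat dog".split()

scores = log_dice(tokens)
# List pairs from most to least strongly associated.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

Pairs that keep turning up together ("fat cat") score higher than pairs that co-occur once ("fat dog"), and most possible pairs never occur at all, so they receive no score.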


Lexicography: finding facts about words

Shopping list

collocations           Yes
grammatical patterns   Yes
examples               Yes
synonyms               Yes
labels                 Yes
– region               Yes
– domain               Yes
– register             Yes
translations           ?
meanings               No


Thank you

http://www.sketchengine.co.uk