Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations



Post on 19-Feb-2017


Page 1: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations

ELIS – Multimedia Lab

Fréderic Godin, Baptist Vandersmissen, Wesley De Neve & Rik Van de Walle

Multimedia Lab, Ghent University – iMinds

Find me at: @frederic_godin / www.fredericgodin.com

Named Entity Recognition for Twitter Microposts(only) using Distributed Word Representations

Page 2: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


NER in Twitter Microposts using distributed word representations

Fréderic Godin et al.

31 July 2015

Introduction

Goal: Recognizing 10 types of named entities (NEs) in noisy Twitter microposts

Problem: Tweets contain spelling mistakes, slang and lack uniform grammar rules

Page 3: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Traditional solutions

Typical features: orthographic features, gazetteers, corpus statistics, and the output of other parsing techniques (PoS tagging and chunking)

Typical machine learning techniques: CRF, HMM

Page 4: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Team           POS  Orthographic  Gazetteers  Brown clustering  Word embedding        ML                        F1 (%)
ousia          X    X             X           –                 GloVe                 entity linking using SVM  56.41
NLANGP         –    X             X           X                 word2vec & GloVe      CRF++                     51.40
nrc            –    –             X           X                 word2vec              semi-Markov MIRA          44.74
multimedialab  –    –             –           –                 word2vec              FFNN                      43.75
USFD           X    X             X           X                 –                     CRF L-BFGS                42.46
iitp           X    X             X           –                 –                     CRF++                     39.84
Hallym         X    –             –           X                 correlation analysis  CRFsuite                  37.21
lattice        X    X             –           X                 –                     CRF wapiti                16.47
Baseline       –    X             X           –                 –                     CRFsuite                  31.97

An overview of the used approaches

Page 5: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


A simple, general but effective neural network architecture

Use word2vec to generate good feature representations for words (=unsupervised learning)

Feed those word representations to another neural network (NN) for any classification task (=supervised learning)

Example → Feature representation → Machine learning → Label(s)

Feature representation: learn word2vec word representations once in advance

Machine learning: train a new NN for any task

Page 6: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Word2vec: automatically learning good features

2D projection of the 400-dimensional embedding space, showing the top 1,000 words used on Twitter. The model was trained on 400 million tweets containing 5 billion words.

Page 7: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


A simple, general but effective neural network architecture (1)

Example: W(t-1), W(t), W(t+1) (window = 3)

Feature representation: look up an N-dim vector for each word and concatenate them (3N-dim)

Machine learning: feed-forward neural network

Label(s): Tag(W(t))
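The look-up and concatenation step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `<pad>` token, the zero vector for padding and unknown words, and the toy 2-dim embedding values are all made up here and are not taken from the deck.

```python
from typing import Dict, List

PAD = "<pad>"  # hypothetical padding token for tweet boundaries


def window_features(
    tokens: List[str],
    embeddings: Dict[str, List[float]],
    dim: int,
    window: int = 3,
) -> List[List[float]]:
    """Return one concatenated (window * dim) vector per token."""
    half = window // 2
    zero = [0.0] * dim  # assumption: padding and unknown words map to zeros
    padded = [PAD] * half + tokens + [PAD] * half
    features = []
    for t in range(len(tokens)):
        vector: List[float] = []
        # Words W(t-1), W(t), W(t+1) for window = 3.
        for word in padded[t : t + window]:
            vector.extend(embeddings.get(word, zero))
        features.append(vector)
    return features


# Toy 2-dim embeddings with made-up values (the deck uses N = 400).
toy = {"from": [0.1, 0.2], "Beijing": [0.9, 0.8], "to": [0.3, 0.1]}
feats = window_features(["from", "Beijing", "to"], toy, dim=2)
```

Here `feats[1]` is the 3N-dim input for the centre word "Beijing"; a feed-forward network would map each such vector to one tag.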

Page 8: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


A simple, general but effective neural network architecture (2)

Example: "from Beijing to" (window = 3)

Feature representation: look up an N-dim vector for each word and concatenate them (3N-dim)

Machine learning: feed-forward neural network

Label(s): Location

Page 9: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Postprocessing (1)

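Feature representation (FR) and machine learning (ML): W(1), W(2), W(3) → Label(1), Label(2), Label(3)

Postprocessing: Label(1), Label(2), Label(3) → corrected Label(1), Label(2), Label(3)

Correct for inconsistencies:

- NE starting with an I-tag
- Multi-word expressions having different categories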

Page 10: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Postprocessing (2)

Feature representation (FR) and machine learning (ML): Manchester → B-Loc, United → I-sportsteam, is → O

Postprocessing: Manchester → B-sportsteam, United → I-sportsteam, is → O

Correct for inconsistencies:

- NE starting with an I-tag
- Multi-word expressions having different categories
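A minimal sketch of the two corrections follows. The slides do not spell out the exact rules, so two assumptions are made here: a span that starts with an I-tag simply gets its first tag turned into a B-tag, and a span with mixed categories takes the most frequent one, with ties going to the category that occurs last. The tie-break is chosen because it reproduces the "Manchester United" example.

```python
from collections import Counter
from typing import List


def postprocess(tags: List[str]) -> List[str]:
    """Make BIO tags consistent within each predicted entity span."""
    fixed: List[str] = []
    i = 0
    while i < len(tags):
        if tags[i] == "O":
            fixed.append("O")
            i += 1
            continue
        # A span is its first tag (B- or a stray I-) plus the I- tags after it.
        j = i + 1
        while j < len(tags) and tags[j].startswith("I-"):
            j += 1
        categories = [tag.split("-", 1)[1] for tag in tags[i:j]]
        counts = Counter(categories)
        best = max(counts.values())
        # Assumed tie-break: among the most frequent categories, the last wins.
        category = next(c for c in reversed(categories) if counts[c] == best)
        # Always open the span with a B-tag, which also repairs a stray I-tag.
        fixed.append("B-" + category)
        fixed.extend("I-" + category for _ in range(j - i - 1))
        i = j
    return fixed


print(postprocess(["B-Loc", "I-sportsteam", "O"]))
# → ['B-sportsteam', 'I-sportsteam', 'O']
```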

Page 11: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Experimental setup

Feature Learning

Word2vec Skipgram with negative sampling

400 million raw English tweets (limited preprocessing)

Neural Network

One hidden layer, with 500 hidden units

Word embeddings of size 400, vocabulary of 3 million words

Mini-batch SGD and Dropout

Experiments with Tanh and ReLU
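To make the sizes above concrete, here is a rough pure-Python sketch of a forward pass through such a network. Several details are assumptions and not from the slides: the 21 output tags (B- and I- for each of the 10 entity types, plus O), the weight initialisation range, a softmax output, and applying inverted dropout to the hidden layer with a rate of 0.5. ReLU is used for the hidden layer.

```python
import math
import random
from typing import List, Optional

Matrix = List[List[float]]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """Row o holds the n_in weights of output unit o followed by its bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def affine(weights: Matrix, x: List[float]) -> List[float]:
    # zip stops after len(x) weights; the last entry of each row is the bias.
    return [sum(w * v for w, v in zip(row, x)) + row[-1] for row in weights]


def forward(
    x: List[float],
    hidden: Matrix,
    output: Matrix,
    rng: Optional[random.Random] = None,
    drop_rate: float = 0.5,
) -> List[float]:
    """Return tag probabilities; pass rng to enable dropout (training mode)."""
    h = [max(0.0, a) for a in affine(hidden, x)]  # ReLU
    if rng is not None:
        keep = 1.0 - drop_rate
        # Inverted dropout: rescale kept units so inference needs no change.
        h = [v / keep if rng.random() < keep else 0.0 for v in h]
    logits = affine(output, h)
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
EMBED_DIM, WINDOW, HIDDEN = 400, 3, 500  # sizes from the slide
N_TAGS = 21  # assumption: BIO tags for 10 entity types plus O
hidden = init_layer(EMBED_DIM * WINDOW, HIDDEN, rng)
output = init_layer(HIDDEN, N_TAGS, rng)
x = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM * WINDOW)]
probs = forward(x, hidden, output)
n_params = sum(len(row) for row in hidden) + sum(len(row) for row in output)
```

Under these assumptions the classifier has (1200 + 1) × 500 + (500 + 1) × 21 = 611,021 trainable parameters, small next to the 3 million × 400 entries of the embedding table.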

Page 12: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Word2vec results

Slang

- Wrong capitalization
- Sometimes not in gazetteer

Spelling

Page 13: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Normalizing slang words/spelling
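One way to picture how embeddings group slang and spelling variants is a nearest-neighbour query by cosine similarity. The sketch below uses made-up 3-dim vectors and hypothetical example words purely for illustration; they are not values from the trained Twitter model.

```python
import math
from typing import Dict, List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(
    word: str, embeddings: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k words whose vectors are closest to that of `word`."""
    query = embeddings[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy vectors: spelling variants are placed close together.
toy = {
    "tomorrow": [0.9, 0.1, 0.0],
    "2morrow": [0.85, 0.15, 0.05],
    "tmrw": [0.8, 0.2, 0.0],
    "beijing": [0.0, 0.2, 0.9],
}
neighbours = nearest("tomorrow", toy, k=2)
```

With these toy values the two closest neighbours of "tomorrow" are its spelling variants, while "beijing" ranks last.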

Page 14: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Dealing with capitalization + gazetteer functionality

Page 15: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Results

Team           POS  Orthographic  Gazetteers  Brown clustering  Word embedding        ML                        F1 (%)
ousia          X    X             X           –                 GloVe                 entity linking using SVM  56.41
NLANGP         –    X             X           X                 word2vec & GloVe      CRF++                     51.40
nrc            –    –             X           X                 word2vec              semi-Markov MIRA          44.74
multimedialab  –    –             –           –                 word2vec              FFNN                      43.75
USFD           X    X             X           X                 –                     CRF L-BFGS                42.46
iitp           X    X             X           –                 –                     CRF++                     39.84
Hallym         X    –             –           X                 correlation analysis  CRFsuite                  37.21
lattice        X    X             –           X                 –                     CRF wapiti                16.47
BASELINE       –    X             X           –                 –                     CRFsuite                  31.97

Page 16: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Lessons learned

Feature Learning

A W2V window of 1 worked best

More syntax-oriented embeddings

Neural Networks

Multiple layers did not improve the F1-score

Dropout and ReLU worked best

Postprocessing

Multi-word expressions often have different categories

Page 17: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


Conclusion

End-to-end semi-supervised neural network architecture

No feature engineering needed

Reusable architecture

Beats traditional systems that only use hand-crafted features

Page 18: Named Entity Recognition for Twitter Microposts (only) using Distributed Word Representations


#Questions?

The word2vec Twitter model is available at:

http://www.fredericgodin.com/software/

@frederic_godin