Emotion Classification Using Massive Examples Extracted from the Web


1

Emotion Classification Using Massive Examples Extracted from the Web

Ryoko TOKUHISA, Kentaro INUI, Yuji MATSUMOTO

COLING’2008

Date: 2009-02-19

2

Outline

Introduction
Emotion Classification
Experiments
Conclusion

3

Introduction

Goal: proposing a data-oriented method for inferring the emotion of a speaker conversing with a dialog system.

Method:
  Obtaining a huge collection of emotion-provoking event instances from the Web.
  Decomposing the emotion classification task into two sub-steps:
    Coarse-grained: sentiment polarity classification.
    Fine-grained: emotion classification.

4

The Basic Idea

Classification problem: a given input sentence is to be classified into one of 10 emotion classes or the neutral class.

Basic idea: learning what emotion is typically provoked in what situation (emotion-provoking event).
  Ex.: “I traveled far to get to the shop, but it was closed” -> disappointing.

5

6

Building an EP Corpus

Taking ten emotions (happiness, fear, …) as emotion classes.

Building a handcrafted lexicon of emotion words (349 emotion words) classified into the ten emotions.

7

Building an EP Corpus cont.

Using the 349 emotion words to find sentences in the Web corpus that possibly contain emotion-provoking events.

A subordinate clause was extracted as an emotion-provoking event instance if:
  It was subordinated to a matrix clause headed by an emotion word.
  The relation between the subordinate and matrix clauses is marked by one of the eight connectives ( ので , から , ため , て , のは , のが , ことは , ことが ).

Ex.: “I was disappointed that it suddenly started raining.”
  Subordinate clause: “it suddenly started raining”; connective: “that”.
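A rough illustration of this extraction pattern follows. The actual system operates on Japanese text with the eight connectives above; the toy English lexicon, the single connective "that", and the class mapping below are illustrative assumptions only.

```python
# Minimal sketch of the pattern-based extraction described above.
# Toy English stand-in: emotion word in the matrix clause + connective "that"
# introducing the subordinate (emotion-provoking) clause.
import re

# Hypothetical miniature emotion lexicon (the paper uses 349 Japanese words, 10 classes)
EMOTION_LEXICON = {
    "disappointed": "disappointment",
    "happy": "happiness",
    "scared": "fear",
}

# Pattern: "I was <emotion word> that <subordinate clause>"
PATTERN = re.compile(r"I was (\w+) that (.+)", re.IGNORECASE)

def extract_ep_instance(sentence):
    """Return (emotion_class, event_clause) if the sentence matches, else None."""
    m = PATTERN.search(sentence)
    if not m:
        return None
    emotion_word, event = m.group(1).lower(), m.group(2).rstrip(".")
    if emotion_word in EMOTION_LEXICON:
        return EMOTION_LEXICON[emotion_word], event
    return None

print(extract_ep_instance("I was disappointed that it suddenly started raining."))
# -> ('disappointment', 'it suddenly started raining')
```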

8

Building an EP Corpus cont.

Applying the above emotion lexicon and extraction patterns to the Web collection yields 1.3 million event instances.

The EP corpus was evaluated by human annotators.

9

Sentiment Polarity Classification

Neutral sentences are not the majority in real Web texts: this was checked on 1000 sentences randomly sampled from the Web.

Using the positive and negative examples stored in the emotion-provoking corpus to train the classifier.

Assuming a sentence to be neutral if the output of the model is near the decision boundary.

10

Sentiment Polarity Classification cont.

SVMs with two kinds of features: word n-grams and the sentiment polarity of the words themselves.

The sentiment polarity of words comes from a sentiment dictionary (1880 positive words and 2490 negative words) built from the 50 thousand most frequent words sampled from the Web.
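A minimal sketch of this polarity step, assuming scikit-learn, a four-sentence toy training set, and an arbitrary neutral margin of 0.5; none of these specifics come from the paper.

```python
# Sketch: linear SVM over word n-gram features, with "neutral" assigned when
# the decision value falls close to the boundary.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

train_texts = [
    "I passed the exam",             # positive EP example (toy)
    "the weather was beautiful",     # positive EP example (toy)
    "it suddenly started raining",   # negative EP example (toy)
    "I missed the last train",       # negative EP example (toy)
]
train_labels = [1, 1, -1, -1]

vectorizer = CountVectorizer(ngram_range=(1, 2))   # word uni- and bi-grams
X = vectorizer.fit_transform(train_texts)
svm = LinearSVC().fit(X, train_labels)

def classify_polarity(sentence, neutral_margin=0.5):
    score = svm.decision_function(vectorizer.transform([sentence]))[0]
    if abs(score) < neutral_margin:          # near the decision boundary
        return "neutral"
    return "positive" if score > 0 else "negative"

print(classify_polarity("I missed my bus"))
```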

11

Emotion Classification

Applying the kNN (k-nearest-neighbor) approach over the EP corpus.

Similarity measure: cosine similarity between the bag-of-words vectors of the input instance I and an emotion-provoking event EP:

sim(I, EP) = \frac{I \cdot EP}{\|I\| \, \|EP\|}
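A minimal sketch of this kNN step, using a made-up three-event EP corpus in place of the 1.3-million-event corpus described earlier:

```python
# kNN emotion classification: cosine similarity between bag-of-words vectors
# of the input utterance and the stored emotion-provoking events, then a
# majority vote over the top-k neighbours.
from collections import Counter
import math

EP_CORPUS = [  # made-up illustration
    ("the shop was closed when I arrived", "disappointment"),
    ("it suddenly started raining", "sadness"),
    ("I passed the entrance exam", "happiness"),
]

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_emotion(utterance, k=1):
    q = bow(utterance)
    neighbours = sorted(EP_CORPUS, key=lambda ev: cosine(q, bow(ev[0])), reverse=True)
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

print(knn_emotion("the shop was closed"))  # -> 'disappointment'
```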

12

Experiment for Sentiment Classification

Two test sets:
  TestSet1: 31 positive, 34 negative, and 25 neutral utterances.
  TestSet2: of the 1140 samples judged Correct, 491 are positive and 649 are negative sentences, plus an additional 501 neutral sentences.

Testing classification in both the two-class and the three-class setting.

Metric: F-measure.
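As a reminder (standard definition, not specific to this work), the F-measure is the harmonic mean of precision P and recall R:

F = \frac{2 \, P \, R}{P + R}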

13

14

Experiment for Emotion Classification

Three test sets:
  TestSet1 (2p, best)
  TestSet1 (1p, acceptable)
  TestSet2: using the results of their correctness judgments.

15

Baseline vs. KNN

Baseline (Pointwise Mutual Information, PMI)

where e_i ∈ {angry, disgust, fear, joy, sadness, surprise, …} and cw_j is each content word of the input.

Emotion class decision: the class e_i with the highest Score(e_i) (formulas below).

KNN: 1-NN, 3-NN, and 10-NN.
  One-step: retrieve the top-k examples from the whole EP corpus.
  Two-step: retrieve the top-k examples from the corresponding sentiment pool (after sentiment polarity classification).

PMI(e_i, cw_j) = \log \frac{hits(e_i, cw_j)}{hits(e_i) \, hits(cw_j)}

Score(e_i) = \sum_j PMI(e_i, cw_j)
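A minimal sketch of this PMI baseline, with hypothetical hit counts standing in for the Web hit counts used by the actual system:

```python
# PMI baseline: score each emotion class by summing the pointwise mutual
# information between the class and every content word of the input, then
# pick the highest-scoring class.  The hit counts below are made up.
import math

HITS = {  # hypothetical co-occurrence hit counts
    ("sadness", "rain"): 40, ("joy", "rain"): 5,
    ("sadness", "exam"): 10, ("joy", "exam"): 30,
}
EMOTION_HITS = {"sadness": 1000, "joy": 1200}
WORD_HITS = {"rain": 500, "exam": 400}

def pmi(emotion, word):
    joint = HITS.get((emotion, word), 0)
    if joint == 0:
        return float("-inf")   # unseen pair: treat as maximally unassociated
    return math.log(joint / (EMOTION_HITS[emotion] * WORD_HITS[word]))

def classify(content_words):
    scores = {e: sum(pmi(e, w) for w in content_words) for e in EMOTION_HITS}
    return max(scores, key=scores.get)

print(classify(["rain"]))  # -> 'sadness'
```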

16

17

Conclusion

Decomposing the emotion classification task into two sub-steps.

Word n-gram features alone are more or less sufficient to classify sentences when a very large amount of training data is available.

Two-step classification was effective for fine-grained emotion classification and outperformed the baseline model.
