Construction of a Sentimental Word Dictionary

Eduard C. Dragut, Cyber Center, Purdue University

Weiyi Meng, Computer Science Department, Binghamton University

Clement Yu, Computer Science Department, University of Illinois at Chicago

Prasad Sistla, Computer Science Department, University of Illinois at Chicago

Sentimental Word Dictionary

The proposed dictionary has the majority sentiment property: each word with a given part of speech has polarity p (positive or negative) if the majority sense of the word with that part of speech has polarity p.

E.g., the word “bland” has 3 senses, of which 2 have negative polarity and 1 has positive polarity, so “bland” is negative.

Why is such a property important?
• Deduction of other sentimental words with the same property.
• Detection of inconsistencies in input dictionaries.
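
The majority sentiment property defined above can be sketched in a few lines of Python. This is only an illustration: the function name `majority_polarity` is made up for this example, and it counts senses equally, which is one reading of "majority sense" (weighting senses by their relative frequency is another).

```python
from collections import Counter
from typing import Optional, Sequence


def majority_polarity(sense_polarities: Sequence[str]) -> Optional[str]:
    """Return the polarity held by more than half of the senses, else None.

    Assumption: every sense counts equally (no frequency weighting).
    """
    if not sense_polarities:
        return None
    polarity, count = Counter(sense_polarities).most_common(1)[0]
    return polarity if count > len(sense_polarities) / 2 else None


# "bland": 3 senses, 2 negative and 1 positive (numbers from the slide).
bland_senses = ["negative", "negative", "positive"]
print(majority_polarity(bland_senses))  # negative
```

A word whose senses split evenly gets no polarity under this reading, since no polarity holds a strict majority.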

Motivation

Why? It facilitates opinion mining and opinion retrieval.

What? Reviews and comments about products, services, and government policies.

Where? The Web has plenty of reviews, comments and reports about products, services, etc.

Bottom line: Many approaches rely on lexicons of polar words.

Contribution

A deduction approach: a set of about 20 inference rules.

The resulting sentimental word dictionary contains approximately 50% more words than the seed dictionary.

The accuracy of the deduced polarities is comparable to that of human judgment: 3 students were asked to evaluate 100 randomly chosen words.

Building the Sentimental Dictionary

Construct the sentimental dictionary on top of the electronic dictionary WordNet

The idea:

1. Start from a small number of words whose polarities are known.

2. Propagate the polarities to the synsets, using inference rules.

3. Determine the polarities of new words, using the majority sentiment definition.

4. Go to step 2, until no more polarities are inferred.
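
The four steps above form a fixpoint loop, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single inference rule (a word with two equally frequent senses passes its polarity to both synsets), it reads "majority" as a sum of relative frequencies above 0.5, and the word `w2`, the synset ids, and all frequencies except those of "advance" are toy data invented for the example.

```python
from typing import Dict, List, Tuple

# word -> list of (synset id, relative frequency of that sense)
Senses = Dict[str, List[Tuple[str, float]]]


def propagate_to_synsets(word_pol: Dict[str, str], senses: Senses,
                         synset_pol: Dict[str, str]) -> bool:
    """Step 2 (one illustrative rule only): two senses at 0.5 each."""
    changed = False
    for word, pol in word_pol.items():
        entries = senses.get(word, [])
        if len(entries) == 2 and all(abs(f - 0.5) < 1e-9 for _, f in entries):
            for synset, _ in entries:
                if synset not in synset_pol:
                    synset_pol[synset] = pol
                    changed = True
    return changed


def infer_new_words(word_pol: Dict[str, str], senses: Senses,
                    synset_pol: Dict[str, str]) -> bool:
    """Step 3: assign a polarity when its senses weigh more than 0.5."""
    changed = False
    for word, entries in senses.items():
        if word in word_pol:
            continue
        for pol in ("positive", "negative"):
            weight = sum(f for s, f in entries if synset_pol.get(s) == pol)
            if weight > 0.5:
                word_pol[word] = pol
                changed = True
                break
    return changed


def build_dictionary(seed: Dict[str, str], senses: Senses):
    """Steps 1-4: start from seed words, repeat until nothing new is inferred."""
    word_pol = dict(seed)
    synset_pol: Dict[str, str] = {}
    while True:
        new_synsets = propagate_to_synsets(word_pol, senses, synset_pol)
        new_words = infer_new_words(word_pol, senses, synset_pol)
        if not (new_synsets or new_words):
            return word_pol, synset_pol


toy_senses: Senses = {
    "advance": [("s1", 0.5), ("s2", 0.5)],
    "w2": [("s1", 0.6), ("s3", 0.4)],  # hypothetical word sharing synset s1
}
words, synsets = build_dictionary({"advance": "positive"}, toy_senses)
print(words)    # {'advance': 'positive', 'w2': 'positive'}
print(synsets)  # {'s1': 'positive', 's2': 'positive'}
```

In this toy run the first pass labels s1 and s2 and then infers `w2`; the second pass finds nothing new and the loop stops. Synset s3 stays unlabeled because no rule in the sketch applies to it.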

Synset Polarity Inference

Example of an inference rule with one word

Hypothesis: w is a word with polarity p (positive or negative) and two synsets, s1 and s2, each with relative frequency 0.5.

Conclusion: both s1 and s2 have polarity p.

• Example:
– “advance” has positive polarity in General Inquirer;
– It has two senses with identical relative frequencies in WordNet;
– Hence, we deduce that both its synsets have positive polarities.
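
One way to see why this rule holds is to enumerate every polarity assignment for the two synsets and keep only those consistent with the word's polarity. The sketch below does that by brute force. It assumes that a word has polarity p when the relative frequencies of its senses with polarity p sum to more than 0.5, and that each synset is either positive or negative; the function name `consistent_assignments` is made up for this example.

```python
from itertools import product
from typing import List, Sequence, Tuple


def consistent_assignments(word_polarity: str,
                           frequencies: Sequence[float]) -> List[Tuple[str, ...]]:
    """List synset labelings under which the word keeps its polarity."""
    labels = ("positive", "negative")
    result = []
    for assignment in product(labels, repeat=len(frequencies)):
        # Total relative frequency of the senses that agree with the word.
        weight = sum(f for label, f in zip(assignment, frequencies)
                     if label == word_polarity)
        if weight > 0.5:
            result.append(assignment)
    return result


# "advance": positive word, two senses with relative frequency 0.5 each.
print(consistent_assignments("positive", [0.5, 0.5]))
# [('positive', 'positive')]
```

Neither sense alone exceeds 0.5, so the only labeling that leaves the word positive is the one in which both synsets are positive.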

Synset Polarity Inference

Example of an inference rule

Hypothesis: w is a word with polarity p (positive or negative) and synsets s1, s2, …, sn. Some of the synsets have known polarities ≠ p; the others have unknown polarities, and the sum of their relative frequencies is > 0.5.

Conclusion: the synsets with unknown polarities all have polarity p.

• Example:
– “consummate” has positive polarity in General Inquirer.

Experiments: Automatic Discovery

Data sets: WordNet [Fellbaum98] and 3 sentimental dictionaries: General Inquirer [Stone96], Appraisal Lexicon [Taboada04] and Opinion Finder [Wilson05].

Take the union of the 3 sentimental dictionaries.

POS         Input Words   Inferred Words   Inferred Synsets
Noun        2,315         1,460            1,683
Verb        1,617         844              1,079
Adjective   2,937         1,407            1,907
Adverb      925           364              430
Total       7,794         4,075            5,099
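
The totals in the table, and the size of the inferred vocabulary relative to the input, can be checked with a short calculation. The variable names below are chosen for this example only; the numbers are copied from the table.

```python
# POS -> (input words, inferred words, inferred synsets), from the table.
rows = {
    "Noun": (2315, 1460, 1683),
    "Verb": (1617, 844, 1079),
    "Adjective": (2937, 1407, 1907),
    "Adverb": (925, 364, 430),
}

total_input = sum(r[0] for r in rows.values())
total_inferred_words = sum(r[1] for r in rows.values())
total_inferred_synsets = sum(r[2] for r in rows.values())

print(total_input, total_inferred_words, total_inferred_synsets)
# 7794 4075 5099

# Inferred words as a fraction of the input (seed) words.
growth = total_inferred_words / total_input
print(f"{growth:.1%}")  # 52.3%
```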

Experiments: Accuracy

100 words were randomly chosen from the 4,075 inferred words according to their part-of-speech distribution: nouns (22.2%), adjectives (37.5%), adverbs (11.5%) and verbs (28.8%).

3 human judges evaluated the deduced polarities. The agreement between humans is 62%.

The agreement between humans and automatic deduction is 63.3%.

Reason for low agreement between humans: preconceived notions of the polarities of words/phrases, e.g., “eat at”.

Related Work

WordNet-based

• SentiWordNet [Esuli06] assigns degrees of polarities.

• Q-WordNet [Agerri10] starts from 6 synsets with “known” polarities:
– “positive”, “negative”, “good”, “bad”, “inferior” and “superior”.
– Propagates the polarities using the semantic relationships, e.g., antonym, hypernym, etc.

• Measuring relative distance of a term from exemplars [Kamps04], e.g., “good” and “bad”.

Corpora-based

References

[Fellbaum98] C. Fellbaum. WordNet: An on-line lexical database and some of its applications. 1998.

[Stone96] P. Stone, D. Dunphy, M. Smith, and J. Ogilvie. The General Inquirer: A computer approach to content analysis. MIT Press, 1996.

[Taboada04] M. Taboada and J. Grieve. Analyzing appraisal automatically. In AAAI Spring Symposium on Exploring Attitude and Affect in Text, 2004.

[Wilson05] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In HLT/EMNLP, 2005.

[Esuli06] A. Esuli and F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In LREC, 2006.

[Agerri10] R. Agerri and A. García-Serrano. Q-WordNet: Extracting polarity from WordNet senses. In LREC, 2010.

[Kamps04] J. Kamps, M. Marx, R. Mokken, and M. de Rijke. Using WordNet to measure semantic orientation of adjectives. In LREC, 2004.
