Construction of a Sentimental Word Dictionary
Eduard C. Dragut, Cyber Center, Purdue University
Weiyi Meng, Computer Science Department, Binghamton University
Clement Yu, Computer Science Department, University of Illinois at Chicago
Prasad Sistla, Computer Science Department, University of Illinois at Chicago
Sentimental Word Dictionary
The proposed dictionary has the majority sentiment property: each word with a given part of speech has polarity p (positive or negative) if the majority of the senses of the word with that part of speech have polarity p.
E.g., the word “bland” has 3 senses, of which 2 have negative polarity and 1 has positive polarity, so “bland” is negative.
Why is such a property important?
Deduction of other sentimental words with the same property.
Detection of inconsistencies in input dictionaries.
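The majority sentiment property can be sketched in a few lines of Python. This is only an illustration under stated assumptions: senses are weighted by relative frequency with a 0.5 threshold (borrowed from the inference-rule slides), the function name `word_polarity` is hypothetical, and the three senses of “bland” are given equal frequencies because the slide does not list them.

```python
def word_polarity(senses):
    """Return the polarity whose senses carry more than half of the
    relative-frequency mass, or None if no polarity has a majority.

    senses: list of (polarity, relative_frequency) pairs for one word
    with one part of speech.
    """
    mass = {}
    for polarity, freq in senses:
        mass[polarity] = mass.get(polarity, 0.0) + freq
    for polarity, total in mass.items():
        if total > 0.5:
            return polarity
    return None


# "bland": 2 negative senses and 1 positive sense.
# Equal frequencies are an assumption, not data from the talk.
bland = [("negative", 1 / 3), ("negative", 1 / 3), ("positive", 1 / 3)]
print(word_polarity(bland))  # negative
```

A word whose senses split evenly, such as one positive and one negative sense at 0.5 each, gets no polarity under this sketch.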
Motivation
Why? It facilitates opinion mining and opinion retrieval.
What? Reviews and comments about products, services, and government policies.
Where? The Web has plenty of reviews, comments and reports about products, services, etc.
Bottom line: Many approaches rely on lexicons of polar words.
Contribution
A deduction approach: a set of about 20 inference rules.
The resulting sentimental word dictionary contains approximately 50% more words than the seed dictionary.
The accuracy of the deduced polarities is comparable to that of human judgment: 3 students were asked to evaluate 100 randomly chosen words.
Building the Sentimental Dictionary
Construct the sentimental dictionary on top of the electronic dictionary WordNet.
The idea:
1. Start from a small number of words whose polarities are known.
2. Propagate the polarities to the synsets, using inference rules.
3. Determine the polarities of new words, using the majority sentiment definition.
4. Go to step 2, until no more polarities are inferred.
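The four steps above form a fixpoint loop, which might look roughly like the sketch below. It is not the authors' implementation: the single rule in `forced_synsets` is a simplified stand-in for the roughly 20 inference rules, all function names are hypothetical, and the toy data (the word `w2`, the synset ids, the frequencies of `w2`) is invented for illustration.

```python
def forced_synsets(word, polarity, senses, synset_pol):
    """Stand-in rule: a synset must share the word's polarity when the
    word could not keep a majority (> 0.5) without it."""
    # Synsets that are unknown or already `polarity` can still count toward it.
    candidates = [(s, f) for s, f in senses.get(word, [])
                  if synset_pol.get(s, polarity) == polarity]
    total = sum(f for _, f in candidates)
    return [s for s, f in candidates
            if total - f <= 0.5 and s not in synset_pol]


def majority_polarity(word, senses, synset_pol):
    """Majority sentiment definition over synsets with known polarity."""
    mass = {}
    for s, f in senses.get(word, []):
        p = synset_pol.get(s)
        if p is not None:
            mass[p] = mass.get(p, 0.0) + f
    for p, m in mass.items():
        if m > 0.5:
            return p
    return None


def propagate(seed, senses):
    word_pol = dict(seed)   # step 1: words with known polarities
    synset_pol = {}
    changed = True
    while changed:          # step 4: repeat until nothing new is inferred
        changed = False
        for w, p in list(word_pol.items()):   # step 2: words -> synsets
            for s in forced_synsets(w, p, senses, synset_pol):
                synset_pol[s] = p
                changed = True
        for w in senses:                      # step 3: synsets -> new words
            if w not in word_pol:
                p = majority_polarity(w, senses, synset_pol)
                if p is not None:
                    word_pol[w] = p
                    changed = True
    return word_pol, synset_pol


# Toy data: word -> [(synset id, relative frequency)]. Hypothetical values.
senses = {
    "advance": [("s1", 0.5), ("s2", 0.5)],
    "w2": [("s1", 0.75), ("s3", 0.25)],
}
seed = {"advance": "positive"}
word_pol, synset_pol = propagate(seed, senses)
print(word_pol)    # {'advance': 'positive', 'w2': 'positive'}
print(synset_pol)  # {'s1': 'positive', 's2': 'positive'}
```

The loop terminates because every pass that sets `changed` adds at least one new word or synset polarity, and both sets are finite.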
Synset Polarity Inference
Example of an inference rule with one word and two synsets
Hypothesis: w is a word with polarity p (positive or negative) and two synsets, s1 and s2, each with relative frequency 0.5.
Conclusion: both s1 and s2 have polarity p.
• Example:
– “advance” has positive polarity in General Inquirer;
– It has two senses with identical relative frequencies in WordNet;
– Hence, we deduce that both its synsets have positive polarities.
[Figure: word w with polarity p, linked to synsets s1 and s2 with relative frequencies 0.5 and 0.5 → s1 and s2 both receive polarity p]
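One way to see why this rule holds is to enumerate every polarity assignment for the two synsets and keep only those under which the word still has a majority for its own polarity. The sketch below assumes each synset is either positive or negative and that a majority means more than 0.5 of the relative-frequency mass; the function name `consistent_assignments` is hypothetical.

```python
from itertools import product


def consistent_assignments(word_polarity, freqs,
                           labels=("positive", "negative")):
    """List the synset polarity assignments under which the synsets
    agreeing with the word carry more than 0.5 of the frequency mass."""
    result = []
    for assignment in product(labels, repeat=len(freqs)):
        mass = sum(f for pol, f in zip(assignment, freqs)
                   if pol == word_polarity)
        if mass > 0.5:
            result.append(assignment)
    return result


# "advance": positive word, two senses with relative frequency 0.5 each.
print(consistent_assignments("positive", [0.5, 0.5]))
# [('positive', 'positive')]
```

With frequencies of 0.5 and 0.5, a single positive synset reaches exactly 0.5, which is not a majority, so the only consistent assignment makes both synsets positive.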
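Synset Polarity Inference
Example of an inference rule
Hypothesis: w is a word with polarity p (positive or negative) and synsets s1, s2, …, sn. Some of the synsets have known polarities ≠ p; the remaining synsets have unknown polarities, and their relative frequencies sum to more than 0.5.
Conclusion: all the synsets with unknown polarities have polarity p.
• Example:
– “consummate” has positive polarity in General Inquirer.
[Figure: word w with polarity p, linked to synsets s1, s2, …, sn, some with known polarities ≠ p and the rest with unknown polarities → the synsets with unknown polarities all receive polarity p]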
Experiments: Automatic Discovery
Data sets: WordNet [Fellbaum98] and 3 sentimental dictionaries: General Inquirer [Stone96], Appraisal Lexicon [Taboada04] and Opinion Finder [Wilson05].
Take the union of the 3 sentimental dictionaries.

POS        Input Words  Inferred Words  Inferred Synsets
Noun       2,315        1,460           1,683
Verb       1,617        844             1,079
Adjective  2,937        1,407           1,907
Adverb     925          364             430
Total      7,794        4,075           5,099
Experiments: Accuracy
100 words were randomly chosen from the 4,075 inferred words according to their part-of-speech distribution: nouns (22.2%), adjectives (37.5%), adverbs (11.5%) and verbs (28.8%).
3 humans judged their deduced polarities.
The agreement between humans is 62%.
The agreement between humans and automatic deduction is 63.3%.
Reason for low agreement between humans: preconceived notions of the polarities of words/phrases, e.g., “eat at”.
Related Work
WordNet-based
SentiWordNet [Esuli06] assigns degrees of polarities.
Q-WordNet [Agerri10] starts from 6 synsets with “known” polarities:
• “positive”, “negative”, “good”, “bad”, “inferior” and “superior”.
• Propagates the polarities using the semantic relationships, e.g., antonym, hypernym, etc.
Measuring the relative distance of a term from exemplars [Kamps04], e.g., “good” and “bad”.
Corpora-based
References
[Fellbaum98] C. Fellbaum. WordNet: An on-line lexical database and some of its applications. 1998.
[Stone96] P. Stone, D. Dunphy, M. Smith, and J. Ogilvie. The General Inquirer: A computer approach to content analysis. MIT Press, 1996.
[Taboada04] M. Taboada and J. Grieve. Analyzing appraisal automatically. In AAAI Spring Symposium on Exploring Attitude and Affect in Text, 2004.
[Wilson05] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In HLT/EMNLP, 2005.
[Esuli06] A. Esuli and F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In LREC, 2006.
[Agerri10] R. Agerri and A. García-Serrano. Q-WordNet: Extracting polarity from WordNet senses. In LREC, 2010.
[Kamps04] J. Kamps, M. Marx, R. Mokken, and M. de Rijke. Using WordNet to measure semantic orientation of adjectives. In LREC, 2004.