A Taste of Sentiment Analysis
Rob Zinkov
May 26th, 2011
Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 1 / 105
Outline
1 Introduction
2 Basics of NLP
3 Basic Techniques for Sentiment Analysis
4 Advanced Techniques for Sentiment Analysis
5 Further Questions
Introduction
What is Sentiment Analysis?
Sentiment Analysis is a subfield of Computational Linguistics concerned with extracting emotions from text
Applications
Applications - Political Blogs
• Tracking opinions on issues
• Tracking which issues are held emotionally
• Tracking subjectivity of bloggers
Political Blogs - Challenges
• Identifying opinion holder
• Associating opinions with issue
• Identifying public figures and legislation
Applications - Product Reviews
Product Reviews - Challenges
• Identifying aspects of product
• Associating opinions with aspects of product
• Identifying Fake Reviews
• No canonical form
Applications - Financial News
Financial News - Challenges
• Identifying the equity in the article (think commodities)
• Associating entities with market symbols
• Specialized financial terms with distinct sentiment
• Articles rarely only about one equity
Applications - Brand Tracking
Brand Tracking - Challenges
• Text likely to be unstructured
• Identifying Brand
• Identifying Opinion Holder/Demographic
Goals
• Give a broad overview of the field
• Showcase the best current tools and approaches
Caveats
• There is no good R code or library to do this (yet)
• This talk is biased towards my domains
• No one in this area really knows what they are doing
History
• Grew out of Web integration Field
• Started as extension of knowledge extraction
• This is why the field is sometimes called Opinion Mining
• Also why papers are as likely to appear in ACL as in WWW
• Many early algorithms are extraction patterns
• Field was still largely academic
Then something happened
Unique Challenges in Sentiment Analysis
Opinions are not Facts
Order Matters
• Sentences at end of article have stronger influence on sentiment
• Sentences at beginning of article have stronger influence on sentiment
• Irrelevant sentences influence sentiment of document.
Order Matters - Valence Shifts
The camera is reasonable, but there are far better ones at this price.
The meal could have been better, though still tasty.
Sentiment Orientation
• shifts in sentiment noted by special words
• special words usually have no sentiment of their own
• sentiment, though, is consistent within each phrase
Sentiment Orientation - continued
• Naive method misses these shifts
• Bag of Words model fails here
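
A minimal sketch of why a bag-of-words score misses these shifts. The tiny lexicon and its scores are made up for illustration and are not from the talk; the two sentences are the valence-shift examples from the earlier slide.

```python
from collections import Counter

# Hypothetical word-level sentiment scores, chosen only for illustration.
LEXICON = {"reasonable": 1.0, "better": 1.0, "tasty": 1.0}

def bag_of_words_score(text):
    """Sum lexicon scores over word counts, ignoring word order entirely."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    counts = Counter(words)
    return sum(LEXICON.get(word, 0.0) * n for word, n in counts.items())

camera = "The camera is reasonable, but there are far better ones at this price"
meal = "The meal could have been better, though still tasty."

camera_score = bag_of_words_score(camera)
meal_score = bag_of_words_score(meal)
print(camera_score, meal_score)
```

Both sentences get the same score because "but" and "though" carry no score of their own and word order is discarded, even though a reader would rate the camera sentence as the more negative one.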
Opinions polarize
Opinions have context
Small screen
Small carbon footprint
Opinions need to be normalized
People disagree on what words mean
Basics of NLP
Basics of Natural Language Processing
Introduction to NLP
• Computational Linguistics is centered on frequency counts
• Frequency counts become the statistic through which we reason
• This statistic has flaws but is still useful
Stemming
It is useful to combine words with a common root.
When counting terms, this groups words that denote the same term.
This is done by dropping the ending:
sleeping, sleeper, sleeps → sleep
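
As a rough illustration of dropping the ending, here is a toy suffix stripper. The suffix list and the minimum stem length are assumptions made for this sketch; it is far cruder than a real stemmer such as the Porter algorithm.

```python
# Suffixes to strip, tried in this order. A toy list for illustration only.
SUFFIXES = ("ing", "er", "s")

def crude_stem(word):
    """Drop the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [crude_stem(w) for w in ["sleeping", "sleeper", "sleeps", "sleep"]]
print(stems)
```

All four forms collapse to the same term, so their counts are pooled.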
Stopwords
It is important to remove common words, as they dominate all counts.
Common words in English:
a, the, an, is, be, could, there
Most NLP libraries are packaged with a list of stopwords
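
A small sketch of counting terms after stopword removal. The stopword set is just the seven words listed above and the whitespace tokenizer is a simplifying assumption; library lists are much longer.

```python
from collections import Counter

# The stopwords listed above; real libraries ship much longer lists.
STOPWORDS = {"a", "the", "an", "is", "be", "could", "there"}

def term_counts(text):
    """Count lowercase whitespace-separated tokens, skipping stopwords."""
    tokens = text.lower().split()
    return Counter(t for t in tokens if t not in STOPWORDS)

counts = term_counts("the camera is a good camera and the lens is good")
print(counts.most_common(2))
```

Without the filter, "the" and "is" would tie with the content words for the top counts in this sentence.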
Sometimes words will need to be more finely processed.
The following tools exist in most NLP packages.
I prefer the Stanford NLP software suite: http://nlp.stanford.edu/software/index.shtml
Parsing
• Structure also derivable by parsing sentences
• Treat text like programming language
• Algorithms can then convert text into Tree
• Algorithms exist to learn grammar
• Very Heavyweight
Shallow Parsing
• Less heavy to use than a full parser
• Processes words into phrases
• Training a chunking parser is significantly easier and faster
• Requires having words tagged with their part of speech
Part of Speech tagging
POS tagging
• Simplest operation to perform on words
• All NLP libraries support this operation
• Provides lightweight metadata
• Very common word feature
• Used by nearly all more complex NLP techniques
Dependency Parsing
• Traditional Treebank Parsing is a bit bureaucratic
• Hides the relations words have with each other in a sentence
• Dependency Parsing provides a lightweight alternative
• Alternative has looser representation, more language agnostic
• More readily captures which words modify each other
Wordnet
• Words can be related by how similar they are
• Words are similar if they mean similar things
• Words are similar if they are a type of another word
• Words can have many meanings
• Wordnet is a hand curated ontology that annotates these relations
Wordnet synsets
Wordnet concept network
Topic Modeling
Topic Modeling is a way to group and categorize documents.
It is usually an unsupervised approach.
CTM - Correlated Topic Models
• CTMs model the underlying topics within a document
• They differ from earlier approaches in capturing correlations between topics
• Give superior performance compared to other unsupervised models
• Available for use as an R package in CRAN (topicmodels)
Named Entity Recognition
The purpose of NER is to extract and label phrases in a sentence
Bill Clinton arrived at the United Nations Building in Manhattan.
Basic Techniques for Sentiment Analysis
Sentiment Definitions
Opinion
A vector representing an opinion, with positive, negative, or neutral gradings as its values
Opinion Holder
The agent an opinion belongs to.
This is mostly relevant in political blogs
Item Features
Facets of the object that are readily available
Sentiment Features
Facets of the object to which an opinion may be ascribed.
These are usually hard to tease out of the text
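
One way to picture how these four definitions fit together is as a pair of record types. This is only a sketch: the class names, field names, and example values are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Opinion:
    """One opinion with graded positive, negative and neutral components."""
    positive: float
    negative: float
    neutral: float
    holder: Optional[str] = None             # the agent the opinion belongs to
    sentiment_feature: Optional[str] = None  # facet the opinion is ascribed to

@dataclass
class Item:
    """An object under discussion, for example a product."""
    name: str
    item_features: dict = field(default_factory=dict)  # readily available facets
    opinions: list = field(default_factory=list)

camera = Item("camera", item_features={"price": 199})
camera.opinions.append(
    Opinion(0.1, 0.7, 0.2, holder="reviewer", sentiment_feature="battery life")
)
```

Item features such as price come with the data, while the sentiment feature ("battery life") has to be recovered from the text.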
1. Gather a Seed set
Opinion corpora available at:
• Wiebe’s corpora http://www.cs.pitt.edu/mpqa/
• Sentiwordnet: http://sentiwordnet.isti.cnr.it/
• Personal dictionaries (available on request)
Gathering initial seed words
• Wiebe’s work comes with subjectivity scores in addition to sentiment
• Sentiwordnet was autogenerated, quality could be better
• Personal dictionaries hand generated, small but good quality
2. Learn sentiment of unknown words
Learn sentiment - Supervised
• Get a large collection of words labeled
• Use this collection as is
Learn sentiment - Unsupervised - Turney
• Use Turney's Method
• Calculate Pointwise Mutual Information between every word and the seed words 'excellent' and 'poor'
$$\mathrm{SO}(w) = \lg\left(\frac{\mathrm{hits}(w \text{ NEAR excellent}) \,/\, \mathrm{hits}(\text{excellent})}{\mathrm{hits}(w \text{ NEAR poor}) \,/\, \mathrm{hits}(\text{poor})}\right)$$
where hits(w NEAR y) = number of times w is within 10 words of y
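
The sentiment orientation score can be sketched over a toy tokenized corpus. The corpus is invented, hit counts here are plain token counts rather than search-engine hits, and the small smoothing constant is an assumption added so that zero counts do not cause a division by zero.

```python
import math

WINDOW = 10       # "NEAR" means within 10 words, as in the definition above
SMOOTHING = 0.01  # assumption: small constant so zero counts stay finite

def hits(corpus, word):
    """Number of occurrences of `word` across all tokenized documents."""
    return sum(doc.count(word) for doc in corpus)

def hits_near(corpus, word, seed):
    """Number of occurrences of `word` with `seed` within WINDOW tokens."""
    total = 0
    for doc in corpus:
        seed_positions = [i for i, tok in enumerate(doc) if tok == seed]
        for i, tok in enumerate(doc):
            if tok == word and any(abs(i - j) <= WINDOW for j in seed_positions):
                total += 1
    return total

def sentiment_orientation(corpus, word, pos="excellent", neg="poor"):
    """Base-2 log of the ratio of hit rates near the two seed words."""
    pos_rate = (hits_near(corpus, word, pos) + SMOOTHING) / (hits(corpus, pos) + SMOOTHING)
    neg_rate = (hits_near(corpus, word, neg) + SMOOTHING) / (hits(corpus, neg) + SMOOTHING)
    return math.log2(pos_rate / neg_rate)

corpus = [
    "excellent phone with a sharp screen".split(),
    "sharp screen and excellent sound".split(),
    "poor battery and a blurry screen".split(),
]
so_sharp = sentiment_orientation(corpus, "sharp")
so_blurry = sentiment_orientation(corpus, "blurry")
print(so_sharp > 0, so_blurry < 0)
```

A positive score means the word keeps company with 'excellent' more than with 'poor', and a negative score means the reverse.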
Learn sentiment - Unsupervised - Twitter
• Use Turney’s Method with Twitter
• Calculate Pointwise Mutual Information between every word and whether it appears with a smiling or frowning emoticon within a tweet
• This method has the advantage of being multilingual, other kinds of smileys aside
Learn sentiment - Unsupervised - Wordnet
• Use wordnet to walk random paths from start word until arriving at a seed word
• Average across sentiments of all seed words arrived at
• This method is the fastest and most accurate
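
The random-walk idea can be sketched on a toy graph. The graph below is an invented stand-in for WordNet relations, and the number of walks, the step limit, and the fixed random seed are all assumptions of this sketch.

```python
import random

# Toy synonym graph standing in for WordNet; edges are invented.
GRAPH = {
    "superb": ["great"],
    "great": ["superb", "good"],
    "good": ["great", "okay"],
    "okay": ["good", "bad"],
    "bad": ["okay", "awful"],
    "awful": ["bad", "dreadful"],
    "dreadful": ["awful"],
}
SEEDS = {"good": 1.0, "bad": -1.0}

def walk_sentiment(start, walks=200, max_steps=50, rng=None):
    """Average the seed sentiment reached over many random walks from `start`."""
    rng = rng or random.Random(0)
    reached = []
    for _ in range(walks):
        word = start
        for _ in range(max_steps):
            if word in SEEDS:
                # Stop at the first seed word and record its sentiment.
                reached.append(SEEDS[word])
                break
            word = rng.choice(GRAPH[word])
    return sum(reached) / len(reached) if reached else 0.0

print(walk_sentiment("superb"), walk_sentiment("dreadful"))
```

In this graph every walk from "superb" must pass through the positive seed first and every walk from "dreadful" through the negative one, while "okay" sits between the two seeds and receives a mixed average.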
3. Apply rules to simplify document
• Rules make words more independent
• Rewrites make it less likely to misclassify a phrase
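
The talk does not list its rewrite rules, so the following is an assumed example of one commonly used rule, negation marking, rather than the rules the speaker had in mind. The negator list is hypothetical.

```python
# Hypothetical list of negators for this sketch.
NEGATORS = {"not", "never", "no"}

def mark_negation(tokens):
    """Fold a negator into the next token so each token stands on its own."""
    out = []
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        out.append("NOT_" + tok if negate else tok)
        negate = False
    return out

rewritten = mark_negation("the screen is not good".split())
print(rewritten)
```

After the rewrite, "NOT_good" can carry its own sentiment, so a word-level model no longer has to look at the neighboring "not" to avoid misclassifying the phrase.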
4. Identify opinion phrases
• Shallow Parse the document into chunks
• Remove chunks with mostly neutral words
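
A sketch of the filtering step, assuming a shallow parser has already produced non-empty chunks. The lexicon, the hand-written chunks, and the one-half threshold for "mostly neutral" are assumptions.

```python
# Hypothetical word polarities; words missing from the lexicon count as neutral.
LEXICON = {"great": 1.0, "terrible": -1.0, "sharp": 0.5}

def is_opinion_chunk(chunk, max_neutral_fraction=0.5):
    """Keep a chunk unless more than half of its words are neutral."""
    neutral = sum(1 for word in chunk if LEXICON.get(word, 0.0) == 0.0)
    return neutral / len(chunk) <= max_neutral_fraction

# Chunks as a shallow parser might return them (hand-written here).
chunks = [["the", "camera"], ["great", "sharp", "lens"], ["terrible", "battery"]]
opinion_chunks = [c for c in chunks if is_opinion_chunk(c)]
print(opinion_chunks)
```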
Alternatively, extract with some rules
5. Extend sentiment to phrases and sentences
• Ultimately, sentiment is for phrases and sentences
• Use sentiment on individual words as priors
• Sentiment is based on joint probability across words in phrase
• Use Naive Bayes or a Markov Model as needed
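
A minimal Naive Bayes sketch of combining word-level sentiment into a phrase-level probability. The per-word values are invented, and the sketch assumes they were estimated with balanced positive and negative classes, so that each word's odds equal its likelihood ratio.

```python
import math

# Hypothetical P(positive | word), assumed to be estimated on balanced classes.
# Unknown words default to 0.5 and therefore contribute no evidence.
WORD_PRIORS = {"tasty": 0.9, "bland": 0.1, "cold": 0.3}

def phrase_positive_probability(words, class_prior=0.5):
    """Naive Bayes: combine per-word evidence as if words were independent."""
    log_odds = math.log(class_prior / (1.0 - class_prior))
    for word in words:
        p = WORD_PRIORS.get(word, 0.5)
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))

print(phrase_positive_probability(["tasty", "cold"]))
print(phrase_positive_probability(["bland", "cold"]))
```

The independence assumption is what makes the joint probability a simple product; a Markov model would instead let each word's contribution depend on its predecessor.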
6. Aggregate sentiments for display
Group phrases by what you want the sentiment of:
• Entities
• Topics
• Sentiment Features
• Item Features
• Users
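
Grouping and averaging can be sketched as follows. The group keys and scores are invented, and taking a plain mean per group is an assumption; the talk does not say how sentiments are combined.

```python
from collections import defaultdict

def aggregate(phrases):
    """Average phrase sentiment per group key (entity, topic, user, ...)."""
    buckets = defaultdict(list)
    for key, score in phrases:
        buckets[key].append(score)
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}

# (group key, phrase sentiment) pairs; values are invented.
scored = [("battery", -0.5), ("screen", 0.5), ("battery", -1.0), ("screen", 1.0)]
by_feature = aggregate(scored)
print(by_feature)
```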
7. Generating Summary
• Largely only relevant when you are returning text
• Rate all sentences based on readability
• Return snippet of text for each group with sentiment vector attached
Summary
1 Gather a seed set
2 Learn sentiment of unknown words
3 Apply rules to simplify document
4 Identify opinion phrases
5 Extend sentiment to phrases and sentences
6 Aggregate sentiments for display
7 Generate summary
Advanced Techniques for Sentiment Analysis
Anaphora Resolution
• Many articles refer to entities by name only a few times
• Opinions will usually co-occur with an anaphora of the entity
• Simplest solution, replace all anaphora with their referent
• Trickier solution, aggregate all opinions associated with anaphora later
• Other options?
Sentiment Analysis is fundamentally a Discriminative Learning Task
Conditional Random Fields
• Sentiment is clearly affected by its surrounding context
• Sentiment is also affected by orientation shifting words
• Why not make these connections explicit in our model?
• Conditional Random Fields (CRFs) are a flexible way of representing these connections.
In a CRF, we represent the posterior probability of a set of sentiments given the underlying text. A is a collection of cliques in the graph of connections.
$$p(y \mid x) = \frac{1}{Z} \prod_{A} \Psi_A(x_A, y_A)$$
$$\Psi_A(x_A, y_A) = \exp\left\{ \sum_{k} \theta_{Ak} f_{Ak}(x_A, y_A) \right\}$$
Linear Chain CRFs
If we assume the sentiment of any given word only depends on the previous one, the formula simplifies to
$$p(y \mid x) = \frac{1}{Z} \prod_{t} \exp\left\{ \sum_{k} \theta_{k} f_{k}(x_t, y_t, y_{t-1}) \right\}$$
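
To make the linear-chain formula concrete, here is a brute-force sketch that scores every label sequence for a two-word input. The label set, the indicator features, and the weights are hypothetical. Enumerating all sequences to get Z is only feasible for tiny inputs; practical implementations compute Z with the forward algorithm.

```python
import itertools
import math

LABELS = ("pos", "neg")

# Hypothetical feature weights theta_k; each key names the indicator that fires.
THETA = {
    ("word", "good", "pos"): 2.0,
    ("word", "bad", "neg"): 2.0,
    ("trans", "pos", "pos"): 0.5,  # neighbouring labels tend to agree
    ("trans", "neg", "neg"): 0.5,
}

def score(words, labels):
    """Sum over positions t and features k of theta_k * f_k(x_t, y_t, y_{t-1})."""
    total = 0.0
    prev = None
    for word, label in zip(words, labels):
        total += THETA.get(("word", word, label), 0.0)
        if prev is not None:
            total += THETA.get(("trans", prev, label), 0.0)
        prev = label
    return total

def probability(words, labels):
    """p(y | x) = exp(score) / Z, with Z summed over every label sequence."""
    z = sum(math.exp(score(words, ys))
            for ys in itertools.product(LABELS, repeat=len(words)))
    return math.exp(score(words, labels)) / z

words = ["good", "screen"]
p_pos_pos = probability(words, ("pos", "pos"))
print(p_pos_pos)
```

The product over t of exponentials in the formula equals the exponential of the summed score, which is why `score` adds the per-position terms before exponentiating. The transition weight lets the positive label on "good" spread to the neutral word "screen".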
Linear Chain CRFs are best understood as a discriminative version of Hidden Markov Models
Skip-chain CRFs
But we can assume sentiment depends on words much further away
We can now connect entities to each other and connect phrases explicitly separated by a sentiment shifting word.
CRFs - Conclusions
• CRFs allow us to add context to opinion
• Properly used they can handle the connections between sentiments on phrases as well as words
• CRFs allow us to link arbitrary features of words and labels to each other
Extensions
Extensions - Time Series
Just order your documents in time, and you can plot changes in sentiment
• This one tends to get used with financial data and monitoring brands
• Requires having access to lots of articles to make sense
• There can be sparsity issues so apply proper shrinkage
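
One simple form of shrinkage pulls each time bucket's mean toward the overall mean, more strongly when the bucket holds few articles. The weighting scheme n / (n + strength) and the default strength are assumptions of this sketch, not a method named in the talk, and the data are invented.

```python
def shrunk_means(daily_scores, strength=5.0):
    """Shrink each day's mean sentiment toward the overall mean.

    A day with n articles puts weight n / (n + strength) on its own mean.
    """
    all_scores = [s for scores in daily_scores.values() for s in scores]
    overall = sum(all_scores) / len(all_scores)
    result = {}
    for day, scores in daily_scores.items():
        n = len(scores)
        day_mean = sum(scores) / n if n else overall
        weight = n / (n + strength)
        result[day] = weight * day_mean + (1.0 - weight) * overall
    return result

daily = {
    "day 1": [1.0],      # a single, very positive article
    "day 2": [0.0] * 9,  # nine neutral articles
}
smoothed = shrunk_means(daily)
print(smoothed)
```

The lone article on day 1 no longer produces a spike to 1.0 in the plot, while the well-populated day 2 stays close to its own mean.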
Beyond Positive and Negative
We can be more subtle
Sarcasm
If you deal with product reviews, this is helpful
Sarcasm is best detected through punctuation and capitalization features
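
A sketch of what such features might look like before they are handed to a classifier. The specific feature set and the example sentence are assumptions; the talk only names the two feature families.

```python
def surface_features(text):
    """Punctuation and capitalization counts for a downstream classifier."""
    words = text.split()
    return {
        "exclamations": text.count("!"),
        "question_marks": text.count("?"),
        "ellipses": text.count("..."),
        # Words of two or more characters written entirely in capitals.
        "all_caps_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
        "quoted_spans": text.count('"') // 2,
    }

features = surface_features('Oh GREAT, another "upgrade"... just what I needed!!!')
print(features)
```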
Detecting Fake Reviews
• Fake Reviews are best treated as a classification task
• Collect enough and use frequency counts for features
• This is useful in production deployments and simple to implement
Multilingual Sentiment Analysis
• Sentiment does not translate well
• Words that mean the same thing may not correspond with respect to sentiment
• Retrain for each new language you wish to support
![Page 93: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/93.jpg)
Advanced Techniques for Sentiment Analysis
Word-sense disambiguation
• This is largely not worth the effort
• Using the first sense of the word gives comparable performance to more sophisticated approaches
• Exception: domain specific corpus where word is unlikely to be the first sense. Use specialized dictionaries for this case
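As an illustration of the first-sense baseline with a domain override, here is a minimal sketch. The sense inventory, the polarity values, and the domain dictionary are all hypothetical stand-ins; a real system would draw senses from a lexical resource ordered by frequency.

```python
# Hypothetical general-purpose inventory: senses listed most frequent first,
# each paired with an illustrative polarity (-1, 0, or 1).
GENERAL_SENSES = {
    "cold": [("low temperature", 0), ("unfriendly", -1)],
    "sick": [("ill", -1), ("impressive (slang)", 1)],
}

# Hypothetical specialized dictionaries, keyed by domain.
DOMAIN_SENSES = {
    "skateboarding": {"sick": ("impressive (slang)", 1)},
}


def pick_sense(word, domain=None):
    """Return (gloss, polarity) using the domain dictionary if one applies,
    otherwise the first listed sense; None if the word is unknown."""
    if domain is not None:
        override = DOMAIN_SENSES.get(domain, {}).get(word)
        if override is not None:
            return override
    senses = GENERAL_SENSES.get(word)
    return senses[0] if senses else None
```

Calling `pick_sense("sick")` falls back to the first general sense, while `pick_sense("sick", domain="skateboarding")` takes the specialized entry.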
![Page 94: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/94.jpg)
Advanced Techniques for Sentiment Analysis
Comparisons
• Sometimes opinions are stated relative to two separate entities
• Superlatives are a special case of this
• Treat these as a ranking problem and handle them separately
• Merge sentiments during aggregation
Example: "R is much better than SPSS"
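A small sketch of treating comparisons as a ranking problem: extract (preferred, less preferred) pairs from comparative sentences, then rank entities by wins minus losses. The regular expression only covers a narrow "X is better/worse than Y" pattern and is an assumption for illustration; only the first sentence below comes from the slide, the other two are invented.

```python
import re
from collections import Counter

# Narrow illustrative pattern; real comparatives need far broader coverage.
COMPARATIVE = re.compile(
    r"(?P<left>\w+) is (?:much |far |a lot )?(?P<direction>better|worse) than (?P<right>\w+)",
    re.IGNORECASE,
)


def extract_preferences(sentence):
    """Return a list of (preferred, less_preferred) entity pairs."""
    prefs = []
    for m in COMPARATIVE.finditer(sentence):
        left, right = m.group("left"), m.group("right")
        if m.group("direction").lower() == "better":
            prefs.append((left, right))
        else:
            prefs.append((right, left))
    return prefs


def rank_entities(sentences):
    """Rank entities by wins minus losses, breaking ties alphabetically."""
    score = Counter()
    for sentence in sentences:
        for winner, loser in extract_preferences(sentence):
            score[winner] += 1
            score[loser] -= 1
    return sorted(score, key=lambda e: (-score[e], e))


sentences = [
    "R is much better than SPSS",  # from the slide
    "SPSS is worse than SAS",      # invented
    "R is better than SAS",        # invented
]
ranking = rank_entities(sentences)
```

The resulting ranking can then be merged with per-entity sentiment during aggregation, as the last bullet suggests.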
![Page 95: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/95.jpg)
Further Questions
Lingering Questions
![Page 96: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/96.jpg)
Further Questions
What keeps me from doing this in R?
![Page 97: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/97.jpg)
Further Questions
Further Questions - Large Data
• Text analysis is hard to do in R
• R has memory limits
• Using Hadoop or BigMemory usually means giving up many libraries
• tm.plugins.distributed helps a bit
• snow and OpenMPI give mixed results
![Page 98: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/98.jpg)
Further Questions
Further Questions - Metadata
Is there a lightweight metadata format?

| Index | Offset | Property | Value |
|---|---|---|---|
| 2 | 10 | POS | NP |
| 35 | 5 | Sentiment | Positive |
| 17 | 7 | POS | JJ |
| 51 | 20 | Chunk | NULL |
| 20 | 8 | Entity | Person |
| 2 | 45 | Sentence | NULL |
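One possible shape for such a format is a list of standoff annotations kept separate from the text. The sketch below parses the rows shown above; it assumes that Index is a start position in the text and Offset is a span length, which the slide does not state, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    index: int            # assumed: start position in the text
    offset: int           # assumed: length of the annotated span
    prop: str             # e.g. POS, Sentiment, Chunk, Entity, Sentence
    value: Optional[str]  # None when the row says NULL

    @property
    def end(self) -> int:
        return self.index + self.offset


def parse_rows(text: str) -> List[Annotation]:
    """Parse whitespace-separated rows of index, offset, property, value."""
    annotations = []
    for line in text.strip().splitlines():
        index, offset, prop, value = line.split()
        annotations.append(
            Annotation(int(index), int(offset), prop, None if value == "NULL" else value)
        )
    return annotations


def within(annotations: List[Annotation], outer: Annotation) -> List[Annotation]:
    """Annotations whose span lies entirely inside the outer annotation."""
    return [
        a for a in annotations
        if a is not outer and outer.index <= a.index and a.end <= outer.end
    ]


ROWS = """
2 10 POS NP
35 5 Sentiment Positive
17 7 POS JJ
51 20 Chunk NULL
20 8 Entity Person
2 45 Sentence NULL
"""
annotations = parse_rows(ROWS)
sentence = next(a for a in annotations if a.prop == "Sentence")
inside = within(annotations, sentence)
```

Because annotations only reference positions, several tools can add their own rows without rewriting the text or each other's output.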
![Page 99: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/99.jpg)
Further Questions
Further Questions - Model Files
• Not enough of the tools take model files
• Model files are needed for tokenization, sentence splitting, POS tagging, and chunking
• Without easy support for model files, multilingual support is difficult
• Without easy support, it is impossible to train better models as data becomes available
![Page 100: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/100.jpg)
Further Questions
Further Questions - Rule Files
• No standard for preprocessing rules
• A DSL is required to express them
• Is this something we need to provide?
• Until better techniques come along, rules are essential for acceptable performance
![Page 101: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/101.jpg)
Further Questions
Theoretical Formulation
• Can these techniques be made less hacky?
• Dependency parses provide much of the structure for tracking sentiment orientation
• Can structure be handled in a more unsupervised manner?
![Page 102: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/102.jpg)
Further Questions
References
Best starting point: Sentiment Analysis and Subjectivity by Bing Liu
http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf
![Page 103: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/103.jpg)
Further Questions
References (More)
• Joint Extraction of Entities and Relations for Opinion Recognition (Choi 2006)
• Mining Opinion Features in Customer Reviews (Liu 2004)
• A Holistic Lexicon-Based Approach to Opinion Mining (Ding 2008)
• I Can't Recommend This Paper Highly Enough (Dillard thesis)
• Entity Discovery and Assignment for Opinion Mining Applications (Ding 2009)
• Extracting Product Features and Opinions from Reviews (Popescu 2005)
![Page 104: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/104.jpg)
Further Questions
Conclusions
• Sentiment Analysis is a relatively young area
• Still plenty of ideas to be explored
• Widely applicable
• Really fun
![Page 105: Good Review of Sa](https://reader033.vdocument.in/reader033/viewer/2022050908/563db8be550346aa9a9683c2/html5/thumbnails/105.jpg)
Further Questions
Questions?