unsupervised sentiment analysis
Post on 05-Dec-2014
Taras Zagibalov© 2009
Taras Zagibalov
T.Zagibalov@sussex.ac.uk
PhD candidate at University of Sussex
Brighton, UK
Ford Foundation International Fellowship fellow
Natural languages: Russian, English, Mandarin
Programming: Java, Prolog
Unsupervised Sentiment Analysis
Listening to the Word of Mouth
What is it?
How does it work?
How can it be used?
Outline
What is Sentiment Analysis
Application of Sentiment Analysis
Who's in the business?
Unsolved Problems
Why unsupervised?
Is it effective?
Sentiment Analysis
Sentiment Analysis (or Opinion Mining) is a relatively new research area in Information
Retrieval and Natural Language Processing, which is concerned not with a document's topic,
but with what opinion it expresses
What is Sentiment Analysis
Subjectivity Classification: "A car has four wheels." vs "It's a good car."
Orientation Detection: "It's a good car." vs "It's a bad car."
Opinion Holder and Target Extraction: "Ian says it's a good car."
Feature-Based Opinion Mining: "The wheels are good, but all the rest is just unusable."
Application of Sentiment Analysis
Where can opinions be found?
News feeds (Google, Yahoo, Reuters, etc.)
Blogs (LJ, Technorati, etc.)
Social networks (Twitter, Facebook...)
Customer review sites (Amazon, eBay...)
Application of Sentiment Analysis
Marketing Research
Product Reviews Analysis
Brand Tracking
Influence Analysis
Public Opinion Tracking
Customer Correspondence Analysis
Application of Sentiment Analysis
What questions can a sentiment analysis system answer?
What do customers think about our product?
Which of our customers are unsatisfied?
What features of our product are the worst?
Who influences our image, and how?
What is the public reaction to some event or person?
And so on...
Example 1
Online monitoring (blogs, mass media) of product promotion campaigns
Promotional campaign A is successful, as most of the online reviews are positive.
Promotional campaign B needs immediate action, as most of the online reviews are negative.
[Chart: A and B marked; axis scale 0 to 10]
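The reading of Example 1 can be sketched as a simple count of review labels per campaign. This is only an illustration: the review labels, the `positive_share` helper, and the 0.5 threshold are assumptions made here, not data or code from the talk.

```python
from collections import Counter

def positive_share(labels):
    """Share of positive reviews among all positive and negative ones."""
    counts = Counter(labels)
    total = counts["pos"] + counts["neg"]
    return counts["pos"] / total if total else None

# Hypothetical, hand-made labels for two campaigns (illustrative only).
campaigns = {
    "A": ["pos", "pos", "neg", "pos"],
    "B": ["neg", "neg", "pos", "neg"],
}

report = {name: positive_share(labels) for name, labels in campaigns.items()}

# Assumed rule: a campaign needs attention when fewer than half of its reviews are positive.
needs_action = [name for name, share in report.items()
                if share is not None and share < 0.5]
```

In practice the labels would come from a sentiment classifier run over the collected reviews rather than from a hand-written list.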
Example 2
A new product release as mirrored in customers' online reviews
(A) The product release and ad campaign are quite effective, as public opinion is mostly positive. (B) The sentiment changes as sales grow: more people are unsatisfied, and this needs to be analysed (probably some quality-related issues).
[Chart: A and B marked; axis scale 0 to 8]
Example 3
Influence analysis by tracking blogs
(A) A negative review in a newspaper does not affect the generally positive sentiment towards a product, whereas (B) a positive review in a magazine is quite effective.
[Chart: A and B marked; axis scale 0 to 9]
Who's in the business?
BrandWatch
Istrategy Labs
Cataphora
Scoutlabs
Lexalytics
Infonic
Attensity
Open Dover
...
What's the technology?
Machine Learning
    Manually tagged training data sets
    User-tagged training data sets (“thumbs up” and “five stars”)
Knowledge-based Approaches
    Manually created word-lists
    Generic word-lists (like SentiWordNet or sentiment vocabularies)
Manual Processing
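A minimal sketch of the knowledge-based approach with manually created word-lists. The two word sets and the `classify` function are illustrative assumptions, not taken from any of the systems named in this talk.

```python
import re

# Tiny hand-written word-lists (assumed for illustration only).
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "unusable"}

def classify(text):
    """Count word-list hits and return the dominant orientation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no opinion words found, or the hits cancel out
```

With these lists, `classify("It's a good car.")` gives "positive" and `classify("A car has four wheels.")` gives "neutral". The quality of such a system depends entirely on how well the word-lists fit the data.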
Unsolved Problems
Domain-dependency: "The plot was unpredictable" vs "the steering was unpredictable"
Unpredictable evaluation language: “good” == “bad” in eBay; “3G” (technology for mobile phones) == “good”
Language-dependency: culture-related issues (“good” <> “好”); language-related issues (SVO vs SOV)
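The domain-dependency problem can be illustrated with a polarity table keyed by domain. This is a sketch under assumptions: the table, the domain names, and the `polarity` helper are hypothetical, and it assumes that an unpredictable plot counts as praise while unpredictable steering counts as a complaint.

```python
# Hypothetical per-domain polarity table; entries are illustrative only.
DOMAIN_POLARITY = {
    "film": {"unpredictable": +1},  # "The plot was unpredictable"
    "car": {"unpredictable": -1},   # "the steering was unpredictable"
}

def polarity(word, domain):
    """Return +1, -1, or 0 (unknown) for a word within a given domain."""
    return DOMAIN_POLARITY.get(domain, {}).get(word.lower(), 0)
```

A single generic word-list has to pick one value for "unpredictable" and is therefore wrong in at least one of the two domains.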
Why unsupervised?
Cross-Domain applicability: No expensive human annotation is needed, because all the information is found in the documents that need to be processed. All extracted information is domain-specific and free from the noise produced by “generic” word lists and wordnets.
Multi-Lingual applicability: Unsupervised systems, being data-independent, can be easily ported to almost any language.
Cheap Start: Once an unsupervised system is developed, it can be applied to new data almost immediately, saving the cost of labelling data and/or writing rules (word-lists).
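One way to find the information in the documents themselves is seed-word bootstrapping. The sketch below is a generic illustration of that idea, not the method evaluated in this talk; the seed words, the toy documents, and the `bootstrap` function with its `rounds` and `min_docs` parameters are all assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def label_docs(docs, pos, neg):
    """Label each document by counting lexicon hits; skip ties and misses."""
    labels = {}
    for i, doc in enumerate(docs):
        tokens = tokenize(doc)
        score = sum(t in pos for t in tokens) - sum(t in neg for t in tokens)
        if score:
            labels[i] = "pos" if score > 0 else "neg"
    return labels

def bootstrap(docs, pos_seeds, neg_seeds, rounds=2, min_docs=2):
    """Grow two word-lists from seed words using unlabelled documents only."""
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(rounds):
        labels = label_docs(docs, pos, neg)
        # Document frequency of every word within each provisional class.
        pos_df, neg_df = Counter(), Counter()
        for i, label in labels.items():
            target = pos_df if label == "pos" else neg_df
            target.update(set(tokenize(docs[i])))
        # Keep words seen in at least min_docs documents of one class only.
        new_pos = {w for w, n in pos_df.items()
                   if n >= min_docs and w not in neg_df and w not in neg}
        new_neg = {w for w, n in neg_df.items()
                   if n >= min_docs and w not in pos_df and w not in pos}
        pos |= new_pos
        neg |= new_neg
    return label_docs(docs, pos, neg), pos, neg

# Toy unlabelled documents (made up for this sketch).
docs = [
    "good phone, reliable battery",
    "good screen and reliable software",
    "bad phone, flimsy case",
    "bad speaker and flimsy buttons",
    "reliable and sturdy",
    "flimsy hinge",
]
labels, pos_words, neg_words = bootstrap(docs, {"good"}, {"bad"})
```

Starting from only "good" and "bad", the sketch picks up "reliable" and "flimsy" from the documents and can then label the last two documents, which contain neither seed word. The resulting labels could also serve as automatically produced training data for a supervised classifier.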
Is it effective?
The unsupervised approach was tested on corpora in several languages (English, Simplified Chinese, Traditional Chinese, Japanese) and in many cases compared reasonably well with supervised methods.
Results were presented at major international scientific conferences (ACL, IJCNLP, COLING, NTCIR).
Is it effective?
The approach can be easily combined with supervised techniques:
An unsupervised system can provide initial data for in-depth study of the data (building up word-lists and rule-sets).
Automatically extracted information can be used for training machine learning systems.
Conclusion
Unsupervised Sentiment Analysis is an efficient instrument for keeping track of public opinion in different domains and languages.
It can be used as an entry point to a new domain or language.
It can be combined with supervised methods to increase accuracy.