Naïve Multi-label classification of YouTube comments using comparative opinion mining By- Nidhi Baranwal MCA 5th sem

Upload: nidhi-baranwal

Post on 06-Jan-2017


Page 1: Naïve multi label classification of you tube comments using

Naïve Multi-label classification of YouTube comments

using comparative opinion mining

By- Nidhi Baranwal MCA 5th sem


Introduction

• People connect with each other in cyberspace and express their sentiments in the form of comments. YouTube is widely regarded as the leader in video sharing.

• Some opinions shared by users are comparative: a user watches a video comparing two options and states a preference, backed by some reasoning.

• In this paper, the Naïve Bayes machine learning algorithm is used to perform multi-label classification and determine the sentiments of the commenters.

• To reduce the computational requirements, it makes a naïve assumption: the words around the keywords related to a particular option are enough to understand the user's sentiment.
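
The naïve assumption about neighboring words can be illustrated with a small sketch. This is not the presenter's code: the keyword lists, the window size and the function name `context_windows` are all hypothetical, since the slides do not list the actual keywords or how wide the neighborhood was.

```python
import re

# Hypothetical keyword lists; the slides do not give the real ones.
KEYWORDS = {
    "iphone": {"iphone", "ios", "apple"},
    "android": {"android", "samsung"},
}

def context_windows(comment, window=2):
    """For each option, collect the words within `window` tokens of its keywords."""
    tokens = re.findall(r"[a-z0-9']+", comment.lower())
    result = {option: [] for option in KEYWORDS}
    for i, tok in enumerate(tokens):
        for option, words in KEYWORDS.items():
            if tok in words:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                # Keep the neighbors only, not the keyword itself.
                result[option].extend(tokens[lo:i] + tokens[i + 1:hi])
    return result

windows = context_windows("I love my iPhone but Android is too slow")
```

Only the words in each window would then be passed to the classifier for that option, which is what keeps the computation small.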


Classification?

• Classification is the task of predicting the class (label) of an instance based on data.

• Supervised learning example: Naïve Bayes
  - We give the system a set of labelled instances to learn from
  - The system builds knowledge of the structure in those instances
  - The system can then classify new instances


Types of Classification

• Binary classification: each instance belongs to exactly one of two classes

• Multiclass classification: each instance belongs to exactly one of more than two classes

• Multi-label classification: each instance can belong to several classes at the same time

• Hierarchical multi-label classification: multi-label classification in which the classes are organized in a hierarchy
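
The difference between a multiclass and a multi-label target can be shown with a 0/1 indicator vector. The sketch below is only illustrative; the label names and the helper `to_indicator` are assumptions and do not come from the slides.

```python
# Hypothetical label names, for illustration only.
LABELS = ["iphone_positive", "iphone_negative",
          "android_positive", "android_negative"]

def to_indicator(active, all_labels=LABELS):
    """Encode a set of active labels as a 0/1 vector over all_labels."""
    unknown = set(active) - set(all_labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in active else 0 for label in all_labels]

# Multiclass: exactly one entry is 1.
multiclass = to_indicator({"iphone_positive"})

# Multi-label: several entries may be 1 at the same time.
multilabel = to_indicator({"iphone_positive", "android_negative"})
```

Binary classification is the special case with two possible labels and exactly one of them active.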


Opinion Mining?

• Opinion mining, or sentiment analysis, is concerned with how people think about a particular thing, person or idea.

• It is the process of determining whether a piece of writing is positive, negative or neutral.

• In comparative sentiment analysis we have to deal with multi-aspect comments: the commenter compares more than one thing, person or idea on the basis of some aspects.


Tasks Involved

• Finding relevant comments involves the following tasks:

1. Gathering data (collecting comments).
2. Removing noisy and irrelevant data.
3. Manually assigning sentiments to the comments to build the training corpus.
4. Developing and evaluating the classification model.


Naïve Bayes Classifier

• Simple word-based classification built on Bayes' theorem.

• It is a "bag of words" approach to content analysis: text is represented as the collection of its words, discarding grammar and word order but keeping multiplicity.

• Applications: sentiment detection, email spam detection, document categorization, etc.

• Probabilistic analysis of Naïve Bayes: for a document d and a class c, by Bayes' theorem:

$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$
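
A minimal bag-of-words Naïve Bayes classifier can be sketched in a few lines. This is a generic textbook version, not the WEKA or MEKA implementation used in the study. It assumes whitespace tokenization and add-one (Laplace) smoothing, neither of which is stated in the slides. Because P(d) is the same for every class, the sketch compares log P(c) + Σ log P(w | c) and skips the denominator.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for doc, c in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[c].update(tokens)    # multiplicity is kept
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, doc):
        """Return log P(c) + sum of log P(w | c) for every class c."""
        scores = {}
        v = len(self.vocab)
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / self.total_docs)          # prior P(c)
            total = sum(self.word_counts[c].values())
            for w in doc.lower().split():
                if w in self.vocab:                          # ignore unseen words
                    # Add-one smoothing (an assumption of this sketch).
                    score += math.log((self.word_counts[c][w] + 1) / (total + v))
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)

# Toy training data, invented for illustration.
model = NaiveBayes().fit(
    ["great phone love it", "love the camera",
     "terrible battery hate it", "hate the screen"],
    ["positive", "positive", "negative", "negative"],
)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.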


Data Analysis

• The study worked on an iPhone vs. Android video, which had over 8000 comments.

• The comments were then filtered, and only the comparative comments were used in the research.

• The resulting dataset is about 400 comments, which is roughly 5% of the original dataset.


Methodology followed

• Data collection

• Class assignment (2 labels and 9 classes)

• Difficulties faced while assigning annotations:
  - handling symbols and short forms
  - ambiguity of various types in comments

• Finding the part of speech and the neighbor words of keywords in comments

• Using tools and steps for classification

• Finding better results
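
One possible way to handle symbols and short forms is to normalize each comment before annotation or classification. The sketch below is an assumption about how this could be done. The slides do not say how the problem was actually handled, and both mapping tables are invented for illustration.

```python
import re

# Hypothetical mappings, for illustration only.
SHORT_FORMS = {"u": "you", "gr8": "great", "luv": "love", "bcoz": "because"}
SYMBOLS = {":)": " happy ", ":(": " sad ", "<3": " love "}

def normalize(comment):
    """Lowercase a comment, replace symbols with words, expand short forms."""
    text = comment.lower()
    for symbol, word in SYMBOLS.items():
        text = text.replace(symbol, word)
    # Tokenizing here also drops leftover punctuation such as "!!".
    tokens = re.findall(r"[a-z0-9']+", text)
    return " ".join(SHORT_FORMS.get(tok, tok) for tok in tokens)

cleaned = normalize("I luv my iPhone :) gr8 camera!!")
```

Ambiguous comments cannot be resolved by a lookup table like this and would still need manual judgment during annotation.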


Tools and Steps used

• We used WEKA (single-label and joined-label classification) and MEKA (multi-label classification), both specialized machine learning software, to perform the machine learning tasks.

• The following steps were taken to develop the classification model:
  - Data processing and class balancing
  - Classification
  - Naïve Bayes probabilistic classifier
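
The slides do not define "joined label". A common reading, assumed here, is that the values of both labels are merged into one combined class so that a single-label tool such as WEKA can be used. The sketch also assumes that each of the two labels takes one of three sentiment values. Under that assumption there are 3 × 3 = 9 joined classes, which would be consistent with the "2 labels and 9 classes" mentioned earlier, though the slides do not confirm it.

```python
from itertools import product

OPTIONS = ["iphone", "android"]                    # the two labels
SENTIMENTS = ["positive", "negative", "neutral"]   # assumed label values

def join_labels(annotation):
    """Merge a per-option annotation into a single joined class name."""
    return "|".join(f"{opt}={annotation[opt]}" for opt in OPTIONS)

def split_label(joined):
    """Recover the per-option annotation from a joined class name."""
    return dict(part.split("=") for part in joined.split("|"))

# Every joined class a single-label classifier would have to predict.
JOINED_CLASSES = [
    join_labels(dict(zip(OPTIONS, combo)))
    for combo in product(SENTIMENTS, repeat=len(OPTIONS))
]

example = join_labels({"iphone": "positive", "android": "negative"})
```

A multi-label tool such as MEKA keeps the two labels separate instead, so it does not have to learn each combination as a class of its own.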


Results obtained

• The results, in terms of the different performance measures, are not satisfactory, but the naïve assumption about the neighborhood words of keywords performed well compared to the others.

• Single-label and joined-label classification give poorer results than multi-label classification.


THANKS