
Page 1: Ra’ai Analyzer An Arabic Sentiment Analyzer for Twitterimamsevenconference.weebly.com/uploads/1/8/3/1/... · Data has become the currency of this era as it is continuing its massive

Imam Mohammad Bin Saud Islamic University

College of Computer and Information Sciences

Department of Computer Science

Ra’ai Analyzer

An Arabic Sentiment Analyzer for Twitter

By:
ID          Name
431020432   Mawaheb Altuwijri
431042812   Tarfa Albuhairi
431020515   Wejdan Alohaideb

Supervisor: Dr. Sarah Alhumoud

Project Submitted in Partial Fulfillment for the Degree of B.Sc. in “Computer Science”
Semester 1-2014


Abstract

Data has become the currency of this era as it continues its massive increase in size and generation rate. Big data can be of great value to organizations when analyzed properly. This research presents an implementation of sentiment analysis on Twitter’s tweets, one of the biggest public and freely available big data sources. It analyzes Arabic (Saudi dialect) tweets to extract their sentiment toward a specific topic, using a small dataset of 1000 tweets collected from Twitter. The collected tweets were analyzed using two approaches: a supervised approach based on machine learning and an unsupervised approach based on a lexicon. The supervised approach used three algorithms: Support Vector Machine (SVM), Naive Bayes (NB) and K-Nearest Neighbor (KNN). The results obtained with cross validation on the same dataset confirm the superiority of the supervised approach, with SVM, NB and KNN reaching accuracies of 98%, 95% and 94%, respectively, while the unsupervised classifier scored an accuracy of 78%.


Arabic Abstract (الملخص العربي), translated into English

Data has become the currency of this era because of its continuous growth in volume and velocity. Data that is produced in enormous quantities within a short time is called big data. Big data can be of high value to organizations, governments and researchers if it is analyzed and interpreted properly. This research, named "Ra’ai", analyzes the opinions expressed in tweets on the social network Twitter, which is considered one of the largest sources of big data. Ra’ai analyzes Arabic tweets written in the colloquial Saudi dialect to extract people's opinions about a single topic. A set of 1000 tweets collected from Twitter was analyzed. Two approaches were used in the analysis: the supervised approach, which relies on machine learning, and the unsupervised approach, which relies on a dictionary of words. Three algorithms were used in the supervised approach to build the classifiers: SVM, NB and KNN. For the unsupervised approach, a dedicated algorithm was built, and four dictionaries constructed to serve the colloquial Saudi dialect were used. The results of the supervised approach under cross validation testing showed that the best classification accuracy was achieved by the SVM classifier at 98%, while NB achieved 95% and KNN came last at 94%. The accuracy of the unsupervised classifier was 78%, which confirms the superiority of the supervised approach over the unsupervised approach.


1. Introduction

Online data is doubling in size every two years [1]. The amount of online data generated in 2013 was 4.4 zettabytes (ZB), and it is expected to reach 44 ZB in 2020 [1]. Individual users are the main source, contributing 75% of all produced data [2]. Big data is described by the 3Vs model (variety, velocity and volume). Variety covers both structured and unstructured data such as emails, videos, audio, images, click streams, logs, posts and search queries. Velocity refers to the speed needed to process and store huge and complex data in order to respond to increasing and continuous requests. Volume indicates the massive size of the generated data [3].

Social networks such as Twitter and Facebook, which are popular means of communication, are important sources of big data. Twitter, a microblogging social network founded in 2006, enables users to freely, easily and instantaneously express, reach and share opinions and feelings in public in SMS-style texts called tweets. Each tweet has 140 characters or fewer [4]. A 2014 study shows that there are more than 5.8 million Arab users [5] out of 255 million users worldwide [6]. According to figures published by Twitter in 2013, 500 million tweets are posted per day [7], of which 10.8 million are written in Arabic [5], as shown in Figure 1. Saudi users produced 40% of all tweets in the Arab world [5].

Figure 1: Twitter Arab usage


A combination of techniques such as data mining, text mining, machine learning and Natural Language Processing (NLP) is used to extract valuable information from big data streams. Data mining is the exploration and analysis of large amounts of data [8]. Extracting sentiments relies first on data mining techniques to find patterns; Sentiment Analysis (SA) techniques are then applied to these patterns.

SA, also called opinion mining, is one of the NLP concepts [9]. This field of computer science extracts sentiment from text, giving useful information about the author and his or her tendency toward a specific topic. Two approaches can be used in SA. The first is the supervised approach, also known as the corpus-based approach. It uses machine learning algorithms such as Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (D-Tree) and K-Nearest Neighbor (KNN) to build a classifier. Before a classifier is built, features are extracted as either Unigrams or Bigrams. Unigram treats each word as an independent unit, while Bigram treats a pair of consecutive words as one unit. The second is the unsupervised approach, also known as the lexicon-based approach. Here the polarity of a word is looked up in a dictionary in which each word is associated with a polarity value: +1, -1 or 0 for positive, negative or neutral, respectively [10].

The problem is that organizations need to know their clients' tendencies, feedback and opinions in order to improve their services and products. They usually resort to interviewing clients directly or distributing questionnaires to collect feedback, which costs a lot of time, effort and money. In addition, the result may not be a precise indication of customers' actual behavior and preferences, as the questionnaire may not cover all needs or may not be answered thoughtfully and accurately. Moreover, clients may not express their immediate feedback as openly in a questionnaire as they do on Twitter. This hides valuable information that would benefit the organization.

This research presents Ra’ai Analyzer (RA), which performs sentiment analysis on Arabic content in Twitter. Ra’ai (رأي) is an Arabic word meaning opinion or sentiment. RA extracts the sentiment from Arabic tweets (Saudi dialect) to determine users' tendencies toward a specific topic, and presents the results through a user interface. The interface displays a bar plot of the overall sentiment, showing the number of tweets classified as positive and negative. It also offers a word cloud of the most frequent words in tweets containing the requested keyword, where each word's size represents its frequency relative to other words. In addition, the user can review the collected tweets and their classification through the interface.
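As a rough illustration of the word cloud sizing described above, the following Python sketch counts word frequencies across tweets and scales a font size linearly with each word's frequency relative to the most frequent word. The function name, the font size range and the linear scaling are assumptions made for illustration only; the source does not state how RA computes word sizes, and RA itself was not written in Python.

```python
from collections import Counter


def word_cloud_sizes(tweets, min_size=10, max_size=48):
    """Map each word to a font size that grows with its relative frequency.

    min_size and max_size are hypothetical display parameters.
    """
    # Count every whitespace-separated word over all tweets.
    counts = Counter(word for tweet in tweets for word in tweet.split())
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; rarer words shrink toward min_size.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


if __name__ == "__main__":
    sample = ["لعبهم جميل", "جميل مره"]  # illustrative tweets only
    for word, size in word_cloud_sizes(sample).items():
        print(word, size)
```

A linear scale is the simplest choice; a logarithmic scale is a common alternative when a few words dominate the counts.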

The objectives of RA are to give organizations the ability to understand clients’ tendencies toward a product or a service, and to find the highest classification accuracy by analyzing and comparing the algorithms used in the supervised approach.

This research is organized as follows: related work, methodology, results and discussion, and finally the conclusion.

2. Related Work

Work on Arabic SA in Twitter is limited. Papers can be categorized by the classification approach used. The studies in [10], [11], [12], [13], [14], [15] and [16] used a supervised approach. In paper [10], two algorithms were used: SVM and NB. The experiment showed that SVM gave better results than NB, while the use of Bigram did not enhance the accuracy.

Paper [11] proposed a system for sentiment analysis using an SVM with a presence vector. Their experiment shows that SA accuracy was not affected by adding Part Of Speech (POS) information.

Paper [12] discussed the effect of n-grams and word frequency on the accuracy of the classifier. They concluded that using Unigram enhanced the accuracy. Regarding word frequency, their experiments show that the frequency vector enhanced SVM accuracy.

Researchers in [13] have built the Kuwaiti Dialect Opinion Extraction

System from Twitter (KDOEST) using SVM and D-Tree algorithms to

classify tweets. Their results show that SVM accuracy is superior to D-Tree.

RapidMiner was used in paper [14] to classify the collected tweets using both NB and D-Tree algorithms. They studied the impact of considering emoticons (emotion faces), which are widely used by Twitter users. Their approach showed that classification with emoticons raised the accuracy from 58% to 64%.

Other work on the Jordanian dialect was done in paper [15], which used SVM, NB and KNN classifiers. NB achieved the best results among them. Moreover, their negation handling did not enhance accuracy.

Paper [16] examined both the supervised and the unsupervised approaches. For the unsupervised approach, the authors built their lexicon manually and enhanced it by adding synonyms of its words. They used SVM, NB, KNN and D-Tree for the supervised approach. The test results confirm that the supervised approach achieved better results than the unsupervised approach, with SVM ranking first among the classifiers.

The authors of [17] and [18] applied an unsupervised approach. Paper [17] used a lexicon of 200 words, reduced in size by a stemmer, and a small dataset of 100 tweets. The accuracy was 73%, which was affected by the size of the lexicon.

In paper [18], the authors built their lexicon from a seed lexicon of 380 words and extended it, and used two methods to calculate tweet sentiment. The first is the sum method, which sums the polarity of each word in the text. The second is the double polarity method, in which each word in the tweet has both a positive and a negative polarity. The second method gave better results than the first.

Table 1 lists the algorithms with the best accuracy in the related work, and shows the effect of each enhancement factor on the accuracy. The table shows that SVM and NB achieved the best results among the algorithms used. As for the enhancement factors, Bigram had no effect on the accuracy, while Unigram increased it.

Reference of Paper   Algorithm with Best Accuracy   Accuracy Enhancement Factor   Effect of Enhancement Factor
[10]                 SVM                            Bigram                        No effect
[15]                 NB                             Unigram                       Increased
[15]                 NB                             Negation handling             No effect
[12]                 NB                             Unigram                       Increased with NB
[12]                 SVM                            Frequency vector              Increased with SVM
[13]                 SVM                            None                          None
[16]                 SVM                            None                          None
[11]                 SVM                            POS                           No effect

Table 1: Best result algorithms and the enhancement factor used in the related work.


5. Methodology

Two SA approaches were applied: supervised and unsupervised. Both were developed on a dataset from a single domain. The system contains four main components: collecting tweets, preprocessing, filtering and classifying, as shown in Figure 2.

5.1 Collecting Tweets Component

Tweets were collected on a specific topic using three approaches: the search function of the Twitter Application Program Interface (API), Twitter Archivist, and manually favoriting tweets and then retrieving them using the twitteR package. More than 30,000 tweets were collected, including advertisements, retweets and unrelated tweets. After these were removed manually, 1000 tweets remained: 600 positive and 400 negative.

Figure 2: System architecture.

The twitteR package is an R interface to the Twitter API that retrieves tweets and provides other functions. We found that the package worked only if the entered keyword was among the highest trending topics, with more than 300,000 tweets. For that reason, the football domain was selected, as it was the most trending Saudi topic in the region at the time of this study, which coincided with the Asian and Gulf cups.

5.2 Preprocessing Component

Tweets were preprocessed to remove unrelated content such as URLs, user names, non-Arabic letters, numbers and punctuation marks.
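The preprocessing step can be sketched with a few regular expressions. This is a minimal Python illustration, not the authors' implementation: the patterns, the function name and the choice to keep the Unicode range U+0621 to U+0652 (Arabic letters and diacritics, which the filtering step handles later) are assumptions.

```python
import re

# URLs starting with http(s):// or www.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Twitter user names: "@" followed by ASCII letters, digits or underscores.
MENTION_RE = re.compile(r"@[A-Za-z0-9_]+")
# Anything that is neither an Arabic letter/diacritic (U+0621..U+0652)
# nor whitespace: Latin letters, digits, punctuation, symbols.
NON_ARABIC_RE = re.compile(r"[^\u0621-\u0652\s]")


def preprocess(tweet):
    """Strip URLs, user names, non-Arabic letters, numbers and punctuation."""
    tweet = URL_RE.sub(" ", tweet)
    tweet = MENTION_RE.sub(" ", tweet)
    tweet = NON_ARABIC_RE.sub(" ", tweet)
    # Collapse the runs of spaces left behind by the removals.
    return " ".join(tweet.split())


if __name__ == "__main__":
    print(preprocess("@user1 لعبهم جميل!!! 123 http://t.co/abc good"))
```

URLs and user names are removed first so that their Latin characters are not left behind as fragments when the non-Arabic filter runs.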

5.3 Filtering Component

The third component filters the tweets in three processes. The first removes all stop words from the collected tweets. The available stop words list [19] was not suitable for SA for two reasons. First, it contained negation words such as (لن, ما, لم, لا) that may change the sentiment of a tweet. Second, the tweets are written mostly in Saudi dialect, whose stop words differ from those in the available list. For SA purposes, the list was therefore modified to combine Modern Standard Arabic (MSA) stop words with Saudi dialect stop words, and the negation words were removed from it. The stop words list modified by the research team is available at [20]. The second process normalizes each of the letters (آ, إ, أ) to (ا) and (ة) to (ه), and removes any diacritics (short vowels). This gives all words the same letter shapes as the entries in the dictionary used in the final step, classification. The reason for this process is that many Arabic Twitter users confuse these similar letters and use them interchangeably. The third process manually corrects misspellings and removes repeated letters, for example changing (زيييييين, "gooooood") to (زين, "good").
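The filtering processes can be approximated in Python as follows. This is only a sketch under stated assumptions: the stop word set is a tiny hypothetical sample rather than the list in [20], and collapsing repeated letters with a regular expression is an automated stand-in for the correction that the research team performed manually.

```python
import re

# Arabic diacritics (short vowels, tanween, shadda, sukun): U+064B..U+0652.
DIACRITICS_RE = re.compile(r"[\u064B-\u0652]")
# A character repeated three or more times in a row.
REPEAT_RE = re.compile(r"(.)\1{2,}")
# Alef with madda / hamza above / hamza below -> bare alef; teh marbuta -> heh.
LETTER_MAP = str.maketrans({
    "\u0622": "\u0627",
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0629": "\u0647",
})

# Hypothetical sample only; negation words such as "ما" are deliberately absent.
STOP_WORDS = {"في", "من", "على", "اليوم"}


def normalize(text):
    """Remove diacritics, unify similar letters, collapse repeated letters."""
    text = DIACRITICS_RE.sub("", text)
    text = text.translate(LETTER_MAP)
    return REPEAT_RE.sub(r"\1", text)


def filter_tweet(tweet):
    """Normalize a preprocessed tweet and drop its stop words."""
    words = normalize(tweet).split()
    return " ".join(word for word in words if word not in STOP_WORDS)


if __name__ == "__main__":
    print(filter_tweet("لعبهم اليوم جميييييل أعجبني مرة"))
```

Collapsing only runs of three or more letters leaves legitimately doubled letters alone, but it can still miss or over-correct some spellings, which may be one reason a manual pass was preferred.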

5.4 Classification Component

This component includes two subcomponents:

5.4.1 Supervised Approach

The supervised approach used a training dataset of 1000 tweets, represented as both Unigrams and Bigrams. The frequency with which a word appears affects the classifier’s accuracy. For that reason, feature selection was applied to remove the word (هلال) from all tweets, since it has the highest frequency and is not a sentiment word.

The dataset is transformed into a numeric feature vector so that it can be processed by the classifier. Feature extraction uses a Term Frequency – Inverse Document Frequency (TF-IDF) transformer. The feature vector is then passed to three machine learning algorithms: SVM, NB and KNN. Each algorithm produces a model (classifier) that is used in the evaluation step. The evaluation step uses the training dataset as the testing dataset and applies the produced model to obtain the accuracy percentage of each classifier. The machine learning algorithms were implemented in Java with the WEKA packages [21].
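To make the TF-IDF transformation concrete, the sketch below builds TF-IDF vectors for tokenized tweets in plain Python. It uses a common textbook variant, raw term count multiplied by log(N / document frequency); this is an assumption, since the exact weighting options used in WEKA are not stated in the source, and the function name is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Weight of word w in document d: count(w in d) * log(N / df(w)),
    where N is the number of documents and df(w) is the number of
    documents that contain w.
    """
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: count each word once per document.
    df = Counter(word for doc in docs for word in set(doc))
    idf = {word: math.log(n_docs / df[word]) for word in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[word] * idf[word] for word in vocab])
    return vocab, vectors


if __name__ == "__main__":
    docs = [["هلال", "جميل"], ["هلال", "ضعيف"]]  # illustrative tokens only
    vocab, vectors = tfidf_vectors(docs)
    for word, weight in zip(vocab, vectors[0]):
        print(word, round(weight, 3))
```

Under this variant a word that occurs in every document receives a weight of zero, because log(N / N) = 0, so very common words contribute little to the feature vector.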


5.4.2 Unsupervised Approach

The unsupervised approach uses two word dictionaries: a positive words dictionary and a negative words dictionary. The dictionaries contain both MSA words and Saudi dialect words. The MSA word dictionaries were built beforehand [22], while the Saudi dialect words were extracted from a part of the collected dataset of about 840 tweets. Each word was labeled as positive or negative by two human experts. If the experts did not agree on the label of a word, a third expert was consulted to break the tie. The negation words that were removed from the stop words list were added to the negative words dictionary. As a result, the positive words dictionary was expanded from 1294 to 1451 words, and the negative words dictionary from 2245 to 2460 words. A sample of the expanded dictionaries is available at [20].

Each tweet is split into a list of words. The tweet’s words are matched against the dictionaries’ words, and the tweet is scored according to the number of its words found in each dictionary.

To increase the accuracy, two more dictionaries were added to the classifier, containing positive and negative phrases such as (كما تدين تدان), which indicates negative sentiment, and (ما شاء الله), which indicates positive sentiment. The following algorithm explains the process. It has a time complexity of O(N²). The lexicon-based algorithm was implemented in R.

INPUT: Tweets T, Positive words Lexicon PL, Negative words Lexicon NL, Positive Phrases lexicon PP, Negative Phrases lexicon NP. Each tweet Ti is split into its words W.
OUTPUT: For each tweet Ti, a polarity Label(Ti) in {Pos, Neg, Net}, where Pos: Positive, Neg: Negative, Net: Neutral.
INITIALIZATION (for each tweet): P = 0, N = 0, Score = 0, where P accumulates the positive matches, N accumulates the negative matches, and Score subtracts N from P to get the tweet score.

Begin
  For each Ti ϵ T
    P = 0, N = 0
    For each PPj ϵ PP
      If PPj ϵ Ti then P = P + 1
    End for
    For each NPj ϵ NP
      If NPj ϵ Ti then N = N + 1
    End for
    For each Wk ϵ Ti
      If Wk ϵ PL then P = P + 1
      If Wk ϵ NL then N = N + 1
    End for
    Score = P – N
    If Score > 0 then Label(Ti) ← Pos
    If Score < 0 then Label(Ti) ← Neg
    If Score == 0 then Label(Ti) ← Net
  End for
End
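The algorithm above can be sketched in Python as shown below. The authors implemented it in R; this translation, its function name and the tiny sample lexicons are illustrative assumptions, not the project's code or dictionaries.

```python
def classify_tweet(tweet, pos_words, neg_words, pos_phrases=(), neg_phrases=()):
    """Label one filtered tweet as "Pos", "Neg" or "Net" (neutral).

    Counts phrase matches and word matches, then compares the totals.
    """
    # Phrase lexicons: a phrase matches if it occurs anywhere in the tweet text.
    positives = sum(1 for phrase in pos_phrases if phrase in tweet)
    negatives = sum(1 for phrase in neg_phrases if phrase in tweet)
    # Word lexicons: every occurrence of a listed word counts once.
    for word in tweet.split():
        if word in pos_words:
            positives += 1
        if word in neg_words:
            negatives += 1
    score = positives - negatives
    if score > 0:
        return "Pos"
    if score < 0:
        return "Neg"
    return "Net"


if __name__ == "__main__":
    # Hypothetical mini-lexicons for demonstration only.
    pos_words = {"جميل", "اعجبني"}
    neg_words = {"ضعيف", "شين"}
    print(classify_tweet("لعبهم جميل اعجبني مره", pos_words, neg_words))
```

Storing the word lexicons as Python sets makes each lookup constant time on average, so this sketch may run faster than a nested-loop implementation; the counters are reset for every tweet because they are local to the function.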

6. Results and Discussions

The supervised approach was evaluated using the precision, recall and accuracy equations, while the unsupervised approach was evaluated using the accuracy equation only. Equations 1, 2 and 3 are presented below:

Precision = TP / (TP + FP)    (1)

Recall = TP / (TP + FN)    (2)

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)

where TP, FP, TN and FN are the numbers of true positives, false positives, true negatives and false negatives, respectively.
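Equations 1 to 3 translate directly into code. The sketch below is a minimal Python illustration; the function name and the "Pos"/"Neg" label strings are assumptions, and the labels in the usage example are made up rather than taken from the experiments.

```python
def evaluate(actual, predicted, positive="Pos"):
    """Return (precision, recall, accuracy) for two equal-length label lists."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    total = tp + tn + fp + fn
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    actual = ["Pos", "Pos", "Pos", "Neg"]      # made-up gold labels
    predicted = ["Pos", "Pos", "Neg", "Neg"]   # made-up classifier output
    print(evaluate(actual, predicted))
```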


6.1 Supervised Approach

Three experiments were conducted using three classifiers: SVM, NB and KNN. These classifiers were selected because of the superior performance they showed in previous related studies [23], [10], [16]. For testing and validation, 10-fold cross validation was used, since cross validation is more suitable for small datasets.
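The idea behind 10-fold cross validation can be sketched as follows: the dataset is split into ten folds, and each fold serves once as the test set while the other nine are used for training. The Python generator below only produces the index splits; its name, the shuffling seed and the round-robin fold assignment are assumptions, and WEKA's own fold assignment may differ (for instance by stratifying on the class label).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    # Shuffle once so that folds do not follow the collection order.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(1000, k=10):
        print(len(train), len(test))
```

With 1000 tweets and k = 10, every round trains on 900 tweets and tests on the remaining 100, and the reported accuracy is typically the average over the ten rounds.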

Table 2 shows the outcomes of the experiments, which confirm that SVM has the best accuracy on the full dataset. With Unigram on the full dataset, all classifiers reach accuracies above 90%. Four reasons could explain why the accuracy of the classifiers is high compared with those in the related work discussed in this research. First, the dataset is limited to a single domain. Second, 10-fold cross validation was used; the value ten was chosen based on related work that shows its suitability for small datasets. Third, the dataset is rich in common sentiment words. Finally, feature selection was used to remove the word (هلال), which has the highest frequency and is not a sentiment word.

Table 2 also shows the accuracy results for Unigram and Bigram. Figure 3 represents SVM’s performance and clarifies the impact of Unigram and Bigram on its accuracy. The test was repeated ten times, each time with a different dataset. For SVM, Unigram achieved higher accuracy than Bigram at every dataset size except 1000 tweets, where both scored 98%. The reason behind these results is the presence of negation words in negative tweets. In Bigram, a negation word appears differently to the classifier depending on how the tweet is divided. Consider the tweet (أحمد ما يحب الكورة), which means (Ahmed does not like football). The Bigram technique breaks the tweet into units of two words, so that the tweet’s units are (أحمد ما) and (يحب الكورة). The sentiment-bearing words (ما) and (يحب) are separated: the unit (أحمد ما) has no meaning, while the unit (يحب الكورة) has a positive meaning. The tweet is therefore classified as positive rather than negative, which decreases the accuracy of the results.
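The effect of the split on the negation word can be reproduced with a short Python sketch. The first function pairs words without overlap, which matches the division described above; the second slides a two-word window over the tweet, which is how bigrams are often generated in other toolkits. Both function names are hypothetical, and the source does not state which variant the WEKA setup produced.

```python
def paired_units(words):
    """Split a word list into consecutive, non-overlapping two-word units."""
    return [" ".join(words[i:i + 2]) for i in range(0, len(words), 2)]


def sliding_bigrams(words):
    """Build overlapping two-word units with a sliding window."""
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]


tweet = "أحمد ما يحب الكورة".split()

if __name__ == "__main__":
    # Non-overlapping pairs separate the negation "ما" from the verb "يحب".
    print(paired_units(tweet))
    # The sliding window also yields the unit "ما يحب", keeping them together.
    print(sliding_bigrams(tweet))
```

With a sliding window the negated verb survives as one unit, so the choice of splitting scheme may influence how strongly negation affects Bigram accuracy.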

Figure 4 shows the effects of Unigram and Bigram on the classifiers’ accuracy when the dataset size is 1000 tweets. Table 2 shows the average test results.

Ngram          Unigram               Bigram
Classifier     SVM    NB     KNN     SVM    NB     KNN
100 tweets     91%    92%    86%     73%    73%    71%
250 tweets     96%    95%    89%     79%    78%    74%
400 tweets     96%    96%    90%     83%    77%    78%
650 tweets     98%    97%    96%     95%    90%    92%
1000 tweets    98%    95%    94%     98%    79%    82%

Table 2: Comparison between classifiers accuracy according to Unigram and Bigram

Figure 3: The effect of using Unigram and Bigram on SVM (x-axis: number of tweets, 0 to 1000; y-axis: accuracy percentage, 60 to 100; series: Unigram and Bigram)


The accuracy of the supervised algorithms can be improved by increasing the size of the training dataset and by using a word stemmer.

6.2 Unsupervised Approach

The lexicon-based test was applied to 160 tweets. The achieved accuracy was 78%, much lower than that of the corpus-based approach. The accuracy of the unsupervised approach depends on how well the dictionaries’ words cover the words of the tweets. Table 3 shows examples of tweets classified by the unsupervised classifier.

The accuracy of the unsupervised approach is high compared with the related work mentioned in this research. The reason is that the dataset is rich in common sentiment words.

Figure 4: The effect of Unigram and Bigram on classifiers’ accuracy (x-axis: classifiers SVM, NB and KNN; y-axis: accuracy percentage, 60 to 100; series: Unigram and Bigram)

Tweet's Polarity: Positive
Original Tweet: لعبهم اليوم جميييييل, أعجبني مرة.
(The team has played weeeell today, I liked it much.)
Tweet after Preprocessing and Filtering: لعبهم جميل اعجبني مره
(The team has played well I liked it much)

Tweet's Polarity: Negative
Original Tweet: المدرب شين, واللاعبين كان أداؤهم ضعيف !!!
(Bad coach, and the players played poorly !!!)
Tweet after Preprocessing and Filtering: المدرب شين اللاعبين اداؤهم ضعيف
(Bad coach, and the players played poorly)

Table 3: Example of classifying tweets

The accuracy of the lexicon-based tool can be improved by expanding the dictionaries with more words and by using a word stemmer.

Figure 5, parts (a) and (b), shows parts of the Ra'ai Analyzer interface. Part (a) shows the overall sentiment bar plot displayed when a user enters a keyword. Part (b) shows the word cloud function, which displays the frequency of the words: the higher a word's frequency, the bigger its size.

Figure 5. a: Overall sentiment bar plot
Figure 5. b: Words cloud function

7. Conclusion and Future Work

SA is still in its early stages and is considered one of the main research trends among data scientists, especially for the Arabic language. This research has presented an implementation of SA on the Arabic content of Twitter. Section 7.1 presents the challenges of this research, section 7.2 the conclusion, and section 7.3 the future work.


7.1 Challenges

Some challenges prevented the completion of some functions and necessitated modifications to the plan, while others were overcome by the team members. The challenges included:
- Limited support for the Arabic language in programming languages.
- Difficulty of collecting Arabic tweets from Twitter.
- Time cost of filtering unrelated tweets.
- Complexity of the Arabic language structure.
- Limited Arabic resources that provide an Arabic word dictionary, stemmer and stop word list for SA purposes.
- Difficulty of obtaining information on Arabic SA because of limited resources.

7.2 Conclusion

Nowadays it is not enough to own data; being able to understand it efficiently and analyze it in a timely manner gives its owner knowledge and power. SA is one of the techniques used to analyze data. Ra’ai Analyzer has implemented SA on Arabic (Saudi dialect) tweets. The results showed the superiority of the supervised approach using SVM over the unsupervised approach, which scored 98% and 78% accuracy, respectively. The NB classifier achieved 95% and KNN achieved 94%. The results could be improved through further work, which will be carried out in the near future.


7.3 Future Work

The system can be enhanced by adding more functions. Some functions were left as future work because of the limited time available for studying the Arabic language. Future directions are:
- Classify live streams of Arabic tweets.
- Provide an approach to deal with word negation.
- Use a word stemmer to enhance the performance of the supervised and unsupervised classifiers.

References

[1] J. Gantz and D. Reinsel, "Digital Universe Study: Extracting Value from Chaos," EMC2, June 2011. [Online]. Available: http://www.emc.com/leadership/programs/digital-universe.htm. [Accessed 6 Nov 2014].
[2] "The 2011 IDC Digital Universe study sponsored by EMC," [Online]. Available: http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf. [Accessed 6 Nov 2014].
[3] S. Sagiroglu and D. Sinanc, "Big data: A review," in Proc. CTS, 2013, pp. 42-47.
[4] "About," Twitter, [Online]. Available: https://about.twitter.com/what-is-twitter. [Accessed 6 Nov 2014].
[5] S. Media, "Twitter in the Arab Region," Dubai School of Government, 1 March 2014. [Online]. Available: http://www.arabsocialmediareport.com/Twitter/LineChart.aspx?&PriMenuID=18&CatID=25&mnu=Cat. [Accessed 6 Jan 2015].
[6] "Twitter Reports First Quarter 2014 Results," Twitter, 29 Apr 2014. [Online]. Available: https://investor.twitterinc.com/releasedetail.cfm?releaseid=843245. [Accessed 2 Feb 2015].
[7] Twitter, "ANNUAL REPORT 2013," Twitter, San Francisco, 2013.
[8] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Waltham: Morgan Kaufmann, 2012, pp. 24-16.


[9] B. Liu, "Sentiment Analysis and Opinion Mining," (1st edition), 22 Apr 2012. [Online]. Available: http://www.cs.uic.edu/~liub/FBS/SentimentAnalysis-and-OpinionMining.pdf. [Accessed 22 Dec 2014].
[10] A. Shoukry and A. Rafea, "Sentence Level Arabic Sentiment Analysis," in Proc. CTS, 2012, pp. 546-550.
[11] M. Abdul-Mageed, S. Kübler and M. Diab, "SAMAR: A System for Subjectivity and Sentiment Analysis of Arabic Social Media," in Proc. WASSA, 2012, pp. 19-28.
[12] S. Ahmed and G. Qadah, "Key Issues in Conducting Sentiment Analysis on Arabic Social Media Text," in Proc. IIT, 2013, pp. 72-77.
[13] J. Salamah and A. Elkhlifi, "Microblogging Opinion Mining Approach for Kuwaiti Dialect," in Proc. ICCTIM, Dubai, 2014.
[14] S. Al-Osaimi and K. Badruddin, "Role of Emotion icons in Sentiment classification of Arabic Tweets," in Proc. MEDES '14, 2014, pp. 167-171.
[15] R. Duwairi, R. Marji, N. Sha'ban and S. Rushaidat, "Sentiment Analysis in Arabic Tweets," in Proc. ICICS, 2014, pp. 1-6.
[16] N. Abdulla, N. Ahmed, M. Shehab and M. Al-Ayyoub, "Arabic Sentiment Analysis: Lexicon-Based and Corpus-Based," in Proc. AEECT, 2013, pp. 1-6.
[17] L. Albraheem and H. Al-Khalifa, "Exploring the problems of Sentiment Analysis in Informal," in Proc. IIWAS '12, 2012, pp. 415-418.
[18] S. El-Beltagy and A. Ali, "Open Issues in the Sentiment Analysis of Arabic," in Proc. IIT, 2013, pp. 215-220.
[19] "Stop words list," [Online]. Available: https://code.google.com/p/stop-words/. [Accessed 25 Dec 2014].
[20] Appendices, "Dropbox," 1 Jan 2015. [Online]. Available: https://www.dropbox.com/s/ipq6ann2yex8wn0/Appendices.docx?dl=0. [Accessed 1 Jan 2015].
[21] "Weka 3: Data Mining Software in Java," WEKA, The University of Waikato, [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/. [Accessed 25 Dec 2014].
[22] "Arabic MPQA subjective lexicon & Arabic opinion holder corpus," 23 May 2012. [Online]. Available: http://nlp4arabic.blogspot.com/2012/05/arabic-mpqa-subjective-lexicon-arabic.html. [Accessed 25 Dec 2014].
[23] R. Khasawneh, H. Wahsheh, M. Al Kabi and I. Alsmadi, "Sentiment analysis of arabic social media content: a comparative study," in Proc. ICITST, 2013, pp. 101-106.