Imam Mohammad Bin Saud Islamic University
College of Computer and Information Sciences
Department of Computer Science
Ra’ai Analyzer
An Arabic Sentiment Analyzer for Twitter
By:
Supervisor: Dr. Sarah Alhumoud
Project Submitted in Partial Fulfillment of the Requirements for the Degree of B.Sc. in Computer Science, Semester 1, 2014

ID          Name
431020432   Mawaheb Altuwijri
431042812   Tarfa Albuhairi
431020515   Wejdan Alohaideb
Abstract
Data has become the currency of this era as it continues its massive increase in size and generation rate. Big data can be of great value to organizations when analyzed properly. This research presents an implementation of sentiment analysis on Twitter's tweets, one of the biggest publicly and freely available big data sources. It analyzes Arabic (Saudi dialect) tweets to extract their sentiment toward a specific topic, using a small dataset of 1000 tweets collected from Twitter. The collected tweets were analyzed using two approaches: a supervised approach based on machine learning and an unsupervised approach based on a lexicon. The supervised approach used three algorithms: Support Vector Machine (SVM), Naive Bayes (NB) and K-Nearest Neighbor (KNN). The results obtained using cross validation on the same dataset clearly confirm the superiority of the supervised approach, with SVM, NB and KNN achieving accuracies of 98%, 95% and 94%, respectively, while the unsupervised classifier scored 78% accuracy.
Arabic Abstract (translated to English)

Data has become the currency of this era because of its continuous growth in volume and velocity; data generated in huge quantities in a short time is called big data. Big data can be of high value to organizations, governments, and researchers if analyzed and interpreted properly. This research, named "Ra'ai", analyzes the opinions expressed in tweets on the social network Twitter, which is considered one of the largest sources of big data. Ra'ai analyzes Arabic tweets in the colloquial Saudi dialect to extract people's opinions about a single topic. A set of 1000 tweets collected from Twitter was analyzed using two approaches: a supervised approach, "Supervised", based on machine learning, and an unsupervised approach, "Unsupervised", based on a word dictionary. Three algorithms were used to build the classifiers in the supervised approach: SVM, NB and KNN. For the unsupervised approach, a dedicated algorithm was built, using four dictionaries constructed to serve the colloquial Saudi dialect. The supervised results using cross validation show that the best classification accuracy was achieved by the SVM classifier at 98%, while NB achieved 95% and KNN came last at 94%. The unsupervised classifier's accuracy was 78%, confirming the superiority of the supervised approach over the unsupervised approach.
1. Introduction
Online data is doubling in size every two years [1]. The amount of online data generated in 2013 was 4.4 Zettabytes (ZB), and by 2020 it is expected to reach 44 ZB [1]. Individual users are the main source, contributing 75% of the overall produced data [2]. Big data is described by the 3Vs model (variety, velocity and volume). Data variety covers both structured and unstructured data such as emails, videos, audios, images, click streams, logs, posts, and search queries. Velocity refers to the speed needed to process and store the huge and complex data in response to increasing and continuous requests. Volume indicates the massive size of the generated data [3].
Social networks such as Twitter and Facebook, which are popular means of communication, are important sources of big data. Twitter, a microblogging social network founded in 2006, enables users to freely, easily, and instantaneously express, reach, and share opinions and feelings in public in SMS-style texts called tweets. Each tweet has 140 characters or fewer [4]. A 2014 study showed that there are more than 5.8 million Arab users [5] out of 255 million users worldwide [6]. A study by Twitter in 2013 shows that there are 500 million tweets per day [7], of which 10.8 million are written in Arabic [5], as shown in Figure 1. Saudi users produced 40% of all tweets in the Arab world [5].
Figure 1: Twitter Arab usage
A combination of techniques such as Data Mining, Text Mining, Machine Learning and Natural Language Processing (NLP) is used to extract valuable information from big data streams. Data mining is the exploration and analysis of large amounts of data [8]. The process of extracting sentiments relies first on data mining techniques to find patterns; Sentiment Analysis (SA) techniques are then applied to these patterns.
SA is one of the NLP concepts and is also called opinion mining [9]. This field of computer science is used to extract sentiment from text, giving useful information about the author and his/her tendency toward a specific topic. Two approaches can be used in SA. The first is the supervised approach, also known as the corpus-based approach, which uses machine learning algorithms such as Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (D-Tree) and K-Nearest Neighbor (KNN) to build a classifier. Feature extraction is applied before building a classifier, using either Unigrams or Bigrams: a Unigram treats each word as an independent unit, while a Bigram treats a pair of consecutive words as one unit. The second is the unsupervised approach, also known as lexicon-based, where the polarity of a word is based on a dictionary in which each word is associated with a polarity value: +1, -1 or 0 for positive, negative or neutral, respectively [10].
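To make the Unigram/Bigram distinction concrete, the following is a minimal illustrative sketch (not code from this project; the English example sentence is hypothetical) that extracts both kinds of features from a tokenized tweet:

```python
def ngrams(tokens, n):
    """Return the list of n-grams (tuples of n consecutive tokens)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tweet = "ahmed does not like football".split()
unigrams = ngrams(tweet, 1)  # each word is an independent unit
bigrams = ngrams(tweet, 2)   # each pair of consecutive words is one unit
```

With Unigrams a negation word such as "not" survives as its own feature, while with Bigrams it is fused into pairs such as ("not", "like"), which matters for sentiment classification as discussed later in this research.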
The problem is that organizations need to know clients' tendencies, feedback and opinions in order to improve their services and products. They usually resort to conducting interviews directly with clients or distributing questionnaires to collect clients' feedback, which drains a lot of time, effort and cost. In addition, questionnaires may not give a precise indication of actual customers' behavior and preferences, as the questions may not cover all needs or may not be answered thoughtfully and accurately. Moreover, clients may not express their immediate feedback as openly in a questionnaire as they do on Twitter. This hides valuable information that would benefit the organization.
This research presents Ra'ai Analyzer (RA), which performs sentiment analysis on Arabic content in Twitter. Ra'ai (رأي) is an Arabic word meaning opinion or sentiment. RA extracts the sentiment from Arabic tweets (Saudi dialect) to obtain users' tendencies toward a specific topic, and provides a user interface. The interface displays a bar plot of the overall sentiment, representing the numbers of positively and negatively classified tweets. It also offers a word cloud showing the most frequent words appearing in tweets containing the requested keyword, where a word's size represents its occurrence frequency relative to other words. In addition, through the interface the user can review the collected tweets and their classification.
The objectives of RA are to give organizations the ability to understand clients' tendencies toward a product or a service, and to find the highest classification accuracy by analyzing and comparing the algorithms used in the supervised approach.
This research is organized as follows: related work, methodology, results and discussion, and finally the conclusion.
2. Related Work
The work that has been done on Arabic SA in Twitter is limited. Papers can be categorized by the classification approach used. The studies in [10], [11], [12], [13], [14], [15] and [16] used a supervised approach. In paper [10], two algorithms were used, SVM and NB. The experiment showed that SVM achieves better results than NB, while the use of Bigrams did not enhance the accuracy.
Paper [11] proposed a system for sentiment analysis using an SVM with a presence vector. Their experiments show that SA accuracy was not affected by adding Part Of Speech (POS) tags.
Paper [12] discussed the effect of Ngrams and word frequency on the accuracy of the classifier. They concluded that using Unigrams enhanced the accuracy. Regarding word frequency, their experiments show that the frequency vector enhanced SVM accuracy.
Researchers in [13] built the Kuwaiti Dialect Opinion Extraction System from Twitter (KDOEST), using SVM and D-Tree algorithms to classify tweets. Their results show that SVM's accuracy is superior to D-Tree's.
RapidMiner was used in paper [14] to classify their collected tweets using both NB and D-Tree algorithms. They studied the impact of considering emoticons, which are used widely by Twitter users. Their approach showed that classification with emoticons raised the accuracy from 58% to 64%.
Another work, for the Jordanian dialect, is paper [15], where SVM, NB and KNN classifiers were used, with NB achieving the best results among them. Moreover, their negation handling did not enhance accuracy.
In paper [16], both the supervised and unsupervised approaches were examined. For the unsupervised approach, they built their lexicon manually and enhanced it by adding synonyms of its words. They used SVM, NB, KNN and D-Tree for the supervised approach. The tests' results clearly confirm that the supervised approach got better results than the unsupervised approach, with SVM ranking first among the classifiers.
The authors of [17] and [18] applied an unsupervised approach. Paper [17] used a lexicon of 200 words, reduced in size using a stemmer, and a small dataset of 100 tweets. Their accuracy, affected by the size of the lexicon, was 73%.
In paper [18], the lexicon was built from a seed lexicon of 380 words and then extended, and two methods were used to calculate tweet sentiment. The first is the sum method, which sums each word's polarity in the text; the second is the double polarity method, where each word in the tweet has both a positive and a negative polarity. The latter gave better results than the former.
Table 1 lists the algorithms with the best accuracy found in the related work and displays the effect of each enhancement factor on the accuracy. The table shows that, among the used algorithms, SVM and NB got the best results. Regarding the enhancement factors, Bigram had no effect while Unigram increased the accuracy.
Reference   Algorithm with Best Accuracy   Enhancement Factor    Effect of Enhancement Factor
[10]        SVM                            Bigram                No effect
[15]        NB                             Unigram               Increased
[15]        NB                             Negation handling     No effect
[12]        NB                             Unigram               Increased with NB
[12]        SVM                            Frequency vector      Increased with SVM
[13]        SVM                            None                  None
[16]        SVM                            None                  None
[11]        SVM                            POS                   No effect

Table 1: Best result algorithms and the enhancement factor used in the related work.
5. Methodology
Two SA approaches were applied: supervised and unsupervised. Both approaches were applied to a single-domain dataset. In general, the system contains four main components: collecting tweets, preprocessing, filtering and classifying, as shown in Figure 2.
5.1 Collecting Tweets Component
Tweets were collected on a specific topic using three approaches: the Twitter Application Program Interface (API) search function, Twitter Archivist, and manually favoriting tweets then retrieving them with the TwitteR package. More than 30,000 tweets were collected, containing advertisements, retweets, and unrelated tweets. After removing these manually, 1000 tweets remained: 600 positive and 400 negative.

Figure 2: System architecture.
The TwitteR package is an R interface to the Twitter API for retrieving tweets, among other functions. We found that the package returned results only when the entered keyword was among the highest trending topics, with more than 300,000 tweets. For that reason, the football domain was selected, as it was the most trending Saudi topic in the region at the time of this study, coinciding with the Asian and Gulf cups.
5.2 Preprocessing Component
Tweets were preprocessed to remove unrelated content such as URLs, user names, non-Arabic letters, numbers, and punctuation.
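A minimal sketch of this preprocessing step (an illustration, not the project's actual code, which was written in R and Java; the Unicode range used to detect Arabic characters is an assumption):

```python
import re

def preprocess(tweet):
    """Strip URLs, @user names, and everything that is not an Arabic
    character (Latin letters, digits, punctuation), then tidy spaces."""
    tweet = re.sub(r"https?://\S+", " ", tweet)        # URLs
    tweet = re.sub(r"@\w+", " ", tweet)                # user names
    tweet = re.sub(r"[^\u0600-\u06FF\s]", " ", tweet)  # keep Arabic block only
    return re.sub(r"\s+", " ", tweet).strip()          # collapse whitespace
```

The order matters: URLs and user names are removed while their Latin characters are still present, and only then is the text reduced to Arabic characters.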
5.3 Filtering Component
The third component filtered the tweets through three processes. The first is removing all stop words from the collected tweets. The available stop words list [19] was not suitable for SA for two reasons. First, it contained negation words such as (لا، لم، ما، لن) that may change the sentiment of a tweet. Second, the tweets used are written mostly in Saudi dialect, which has different stop words than the available list. For SA purposes, the stop words list was modified to combine the Modern Standard Arabic (MSA) stop words with Saudi dialect stop words, and the negation words were removed from the list. The stop words list modified by the research team is available at [20]. The second process is normalizing each of the letters (أ، إ، آ) to (ا) and (ة) to (ه), and removing any diacritics (short vowels). This normalizes all words to the same letter shapes as those in the dictionary used in the final step, classification. This process is applied because many Arabic Twitter users confuse these similar letters and use them interchangeably. The third process is manually correcting misspellings and removing repeated letters, for example (زييييين = "gooooood") becomes (زين = "good").
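The normalization and repeated-letter steps can be sketched as follows (an illustration only; in this project the repeated-letter correction was done manually, and the diacritic range is an assumption about which marks were stripped):

```python
import re

DIACRITICS = re.compile(r"[\u064B-\u0652]")  # fathatan .. sukun

def normalize(word):
    """Normalize alef variants to bare alef, taa marbuta to haa,
    strip diacritics, and collapse letters repeated three or more times."""
    for alef in "أإآ":
        word = word.replace(alef, "ا")
    word = word.replace("ة", "ه")
    word = DIACRITICS.sub("", word)
    word = re.sub(r"(.)\1{2,}", r"\1", word)  # e.g. زييييين -> زين
    return word
```

Applying the same normalization to both tweets and dictionary entries ensures that a tweet word and its dictionary counterpart share identical letter shapes at matching time.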
5.4 Classification Component
This component includes two subcomponents:
5.4.1 Supervised Approach
The supervised approach used a training dataset of 1000 tweets, represented as Unigrams and Bigrams. Because the frequency of a word's presence affects the classifier's accuracy, feature selection was applied to remove the word (هالل) from all tweets, since it has the highest frequency and is not a sentimental word.
The dataset is transformed into a numeric feature vector so it can be understood by the classifier, using a Term Frequency - Inverse Document Frequency (TF-IDF) transformer for feature extraction. The feature vectors were then fed to three machine learning algorithms: SVM, NB, and KNN. Each algorithm produces a model (classifier) that is used in the evaluation step, which applies the produced model to the training dataset as a testing dataset to obtain the accuracy percentage of each classifier. The machine learning algorithms were implemented in Java using the WEKA packages [21].
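As a rough illustration of the TF-IDF transformation, here is a self-contained sketch of one common textbook variant (tf = raw term count, idf = ln(N/df)); WEKA's transformer may use different smoothing and normalization, so this is not the project's implementation:

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each tokenized document to {term: tf * ln(N / df)} weights,
    where df is the number of documents containing the term."""
    N = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    return [{t: c * math.log(N / df[t]) for t, c in Counter(doc).items()}
            for doc in docs]
```

Terms that occur in every document get weight zero (idf = ln(1) = 0), which is why removing a ubiquitous, non-sentimental word such as the one described above mainly helps classifiers that use raw counts or presence features.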
5.4.2 Unsupervised Approach
The unsupervised approach uses two word dictionaries: a positive words dictionary and a negative words dictionary. Both contain MSA words and Saudi dialect words. The MSA word dictionaries were built beforehand [22], while the Saudi dialect words were collected with the help of part of the dataset, about 840 tweets. Each word was labeled as positive or negative by two human experts; if the experts did not agree on the label of a word, a third expert was consulted to break the tie. The negation words that were removed from the stop words list were added to the negative words dictionary. In total, the positive words dictionary was expanded from 1294 to 1451 words, and the negative words dictionary from 2245 to 2460 words. A sample of the expanded dictionaries is available at [20].
Each tweet is split into a list of words, and these words are matched against the dictionaries' words to score the tweet according to the number of matches found in each dictionary.
To increase the accuracy, two further dictionaries were added to the classifier, containing positive and negative phrases such as (كما تدين تدان), which indicates negative sentiment, and (ما شاء الله), which indicates positive sentiment. The following algorithm explains the process; it has a time complexity of O(N²). The lexicon-based algorithm was implemented in the R language.
INPUT: Tweets T; positive words lexicon PL; negative words lexicon NL; positive phrases lexicon PP; negative phrases lexicon NP; W, the words of a tweet.
OUTPUT: Polarity = {Pos, Neg, or Net}, where Pos: positive, Neg: negative, Net: neutral.
INITIALIZATION: Score = 0, P = 0, N = 0, where P accumulates the positive matches, N accumulates the negative matches, and Score = P - N is the tweet score.
Begin
  For each Ti ∈ T
    For each PPj ∈ PP
      If PPj ∈ Ti then P = P + 1
    End for
    For each NPj ∈ NP
      If NPj ∈ Ti then N = N + 1
    End for
    For each Wj ∈ Ti
      If Wj ∈ PL then P = P + 1
      If Wj ∈ NL then N = N + 1
    End for
    Score = P - N
    If Score > 0 then Polarity ← Pos
    Else if Score < 0 then Polarity ← Neg
    Else Polarity ← Net
  End for
End
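The algorithm above could be rendered in Python roughly as follows (a sketch only; the project's actual implementation was in R, and the example lexicons in the test are hypothetical):

```python
def classify(tweet_words, pos_words, neg_words, pos_phrases, neg_phrases):
    """Score one preprocessed tweet against word and phrase lexicons and
    return 'Pos', 'Neg', or 'Net', following the pseudocode above."""
    text = " ".join(tweet_words)
    p = sum(1 for ph in pos_phrases if ph in text)  # positive phrase matches
    n = sum(1 for ph in neg_phrases if ph in text)  # negative phrase matches
    p += sum(1 for w in tweet_words if w in pos_words)
    n += sum(1 for w in tweet_words if w in neg_words)
    score = p - n
    return "Pos" if score > 0 else "Neg" if score < 0 else "Net"
```

Using Python sets for the word lexicons makes each word lookup effectively constant-time, so the cost per tweet is dominated by the phrase scan.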
6. Results and Discussions
The supervised approach was evaluated using the precision, recall and accuracy equations, while the unsupervised approach was evaluated using the accuracy equation only. Equations (1), (2) and (3) are presented below:
𝐏𝐫𝐞𝐜𝐢𝐬𝐢𝐨𝐧 = 𝐓𝐏 / (𝐓𝐏 + 𝐅𝐏) (1)
𝐑𝐞𝐜𝐚𝐥𝐥 = 𝐓𝐏 / (𝐓𝐏 + 𝐅𝐍) (2)
𝐀𝐜𝐜𝐮𝐫𝐚𝐜𝐲 = (𝐓𝐏 + 𝐓𝐍) / (𝐓𝐏 + 𝐓𝐍 + 𝐅𝐏 + 𝐅𝐍) (3)
Where TP, FP, TN, and FN are true positive, false positive, true
negative and false negative, respectively.
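Equations (1)-(3) can be computed directly from the counts of a confusion matrix; the following is a small self-contained sketch (illustrative only, with made-up labels in the test):

```python
def confusion(y_true, y_pred, positive="Pos"):
    """Count (TP, FP, TN, FN) for binary labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn

def precision(tp, fp, tn, fn):
    return tp / (tp + fp)                   # Equation (1)

def recall(tp, fp, tn, fn):
    return tp / (tp + fn)                   # Equation (2)

def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)  # Equation (3)
```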
6.1 Supervised Approach
Three different experiments were conducted using three classifiers: SVM, NB and KNN. The motivation for selecting these classifiers was the superior performance they showed in previous related studies [23], [10], [16]. For testing and validation purposes, the 10-fold cross validation technique was used, since cross validation is more suitable for small datasets.
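10-fold cross validation splits the dataset into ten roughly equal parts, training on nine and testing on the held-out part in turn; a minimal index-splitting sketch (an illustration, not the WEKA internals used in the project):

```python
def kfold(n, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation
    over n examples; fold sizes differ by at most one."""
    indices = list(range(n))
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

The reported accuracy is then the average of the per-fold accuracies, so every tweet is used for testing exactly once.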
Table 2 shows the experiments' outcomes, which clearly confirm that SVM has better accuracy than the other classifiers. As shown, the classifiers' accuracies all exceed 90%. The high accuracy of the used classifiers compared with those in the related work cited in this research can be attributed to four reasons. First, the dataset was limited to a single domain. Second, 10-fold cross validation was used, with the value ten chosen based on related work showing its suitability for small datasets. Third, the dataset is rich in common sentimental words. Finally, feature selection was used to remove the word (هالل), since it has the highest frequency and is not a sentimental word.
Table 2 also shows the accuracy results for Unigram and Bigram. Figure 3 presents SVM's performance, clarifying the impact of Unigram and Bigram on SVM's accuracy. The test was repeated ten times, each time with a different dataset size. Unigram achieved higher accuracy than Bigram. The reason behind this result is the presence of negation words in negative tweets: in Bigram, a negation word appears differently to the classifier depending on how the tweet was divided. Consider the tweet (أحمد ما يحب الكورة), which means (Ahmed does not like football). The Bigram technique breaks the tweet into units of two words, such as (أحمد ما) and (يحب الكورة). The sentimental words (ما) and (يحب) are thus separated: the unit (أحمد ما) carries no meaning, while the unit (يحب الكورة) carries a positive meaning, which classifies the tweet as positive rather than negative and therefore decreases the accuracy of the results.
Figure 4 shows the effect of Unigram and Bigram on the classifiers' accuracy when the dataset size is 1000 tweets. Table 2 shows the average testing results.
                  Unigram               Bigram
Classifier    SVM    NB    KNN      SVM    NB    KNN
100 tweets    91%    92%   86%      73%    73%   71%
250 tweets    96%    95%   89%      79%    78%   74%
400 tweets    96%    96%   90%      83%    77%   78%
650 tweets    98%    97%   96%      95%    90%   92%
1000 tweets   98%    95%   94%      98%    79%   82%

Table 2: Comparison between classifiers' accuracy according to Unigram and Bigram.

Figure 3: The effect of using Unigram and Bigram on SVM (accuracy percentage versus number of tweets).
The supervised algorithms' accuracy can be improved by increasing the size of the training dataset and by using a word stemmer.
6.2 Unsupervised Approach
The lexicon-based test was applied to 160 tweets. The results show that the lexicon-based approach gives much lower accuracy than the corpus-based approach; the achieved accuracy was 78%. The accuracy of the unsupervised approach relies on how well the dictionaries' words cover the tweets' words. Table 3 shows an example of tweets classified by the unsupervised classifier.
The accuracy of the unsupervised approach is nevertheless high compared with the other related work mentioned in this research. The reason is that the dataset is rich in common sentimental words.
Tweet's polarity: Positive
Original tweet: لعبهم جمييييل مرة اليوم، أعجبني
  ("The team has played weeeell today, I liked it much.")
Tweet after preprocessing and filtering: لعبهم جميل مره عجبني
  ("The team has played well, I liked it much.")

Tweet's polarity: Negative
Original tweet: المدرب شين، والالعبين كان أداؤهم ضعيف !!!
  ("Bad coach, and the players played poorly!!!")
Tweet after preprocessing and filtering: المدرب شين الالعبين اداؤهم ضعيف
  ("Bad coach, and the players played poorly.")

Figure 4: The effect of Unigram and Bigram on classifiers' accuracy.
The lexicon-based tool's accuracy can be improved by expanding the dictionaries with more words and by using a word stemmer.
Figure 5, parts (a) and (b), shows parts of the Ra'ai Analyzer interface. Part (a) shows the overall sentiment bar plot produced when a user enters a keyword. Part (b) shows the word cloud function, which displays the most frequent words; a word with higher frequency appears in a bigger size.
7. Conclusion and Future Work
SA is still in its early stages and is considered one of the main research trends among data scientists, especially for the Arabic language. This research has presented an implementation of SA on the Arabic content of Twitter. Section 7.1 presents this research's challenges, Section 7.2 gives the conclusion, and Section 7.3 outlines future work.
Table 3: Example of classifying tweets.
Figure 5.a: Overall sentiment bar plot. Figure 5.b: Word cloud function.
7.1 Challenges
Some challenges were obstacles to completing certain functions and necessitated plan modifications, while others were overcome by the team members. Among these challenges:
- Limited support for the Arabic language in programming languages.
- Difficulty of collecting Arabic tweets from Twitter.
- The time cost of filtering unrelated tweets.
- The complexity of Arabic language structure.
- Limited Arabic resources providing an Arabic word dictionary, stemmer, and stop word list for SA purposes.
- Difficulty of obtaining Arabic SA information because of these limited resources.
7.2 Conclusion
Nowadays it is not enough to own data; being able to understand and analyze it efficiently and in a timely manner gives its owner knowledge and power. SA is one of the techniques used to analyze data. Ra'ai Analyzer has implemented SA on Arabic (Saudi dialect) tweets. The results of the used approaches proved the superiority of the supervised approach using SVM over the unsupervised approach, scoring 98% and 78% accuracy, respectively, while the NB classifier achieved 95% and KNN's accuracy was 94%. Results could be improved by further work, to be accomplished in the near future.
7.3 Future Work
The system can be enhanced and improved by adding more functions. Some functions were set as future work because of the time required to study the Arabic language. Future directions include:
- Classifying Arabic live streams of tweets.
- Providing an approach to deal with word negation.
- Using a word stemmer to enhance the supervised and unsupervised classifiers' performance.
References
[1] J. Gantz and D. Reinsel, "Digital Universe Study: Extracting Value from Chaos," EMC2, June 2011. [Online]. Available: http://www.emc.com/leadership/programs/digital-universe.htm. [Accessed 6 Nov 2014].
[2] "The 2011 IDC Digital Universe study sponsored by EMC," [Online]. Available: http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf. [Accessed 6 Nov 2014].
[3] S. Sagiroglu and D. Sinanc, "Big data: A review," in Proc. CTS, 2013, pp. 42-47.
[4] "About," Twitter, [Online]. Available: https://about.twitter.com/what-is-twitter. [Accessed 6
Nov 2014].
[5] S. Media, "Twitter in the Arab Region," Dubai School of Government, 1 March 2014. [Online].
Available:
http://www.arabsocialmediareport.com/Twitter/LineChart.aspx?&PriMenuID=18&CatID=25&
mnu=Cat. [Accessed 6 January 2015].
[6] "Twitter Reports First Quarter 2014 Results," Twitter, 29 Apr 2014. [Online]. Available:
https://investor.twitterinc.com/releasedetail.cfm?releaseid=843245. [Accessed 2 Feb 2015].
[7] Twitter, "ANNUAL REPORT 2013," Twitter, San Francisco, 2013.
[8] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Waltham: Morgan Kaufmann, 2012, pp. 24-16.
[9] B. Liu, Sentiment Analysis and Opinion Mining, 1st ed., Apr 22, 2012. [Online]. Available: http://www.cs.uic.edu/~liub/FBS/SentimentAnalysis-and-OpinionMining.pdf. [Accessed Dec 22, 2014].
[10] A. Shoukry and a. A. Rafea, "Sentence Level Arabic Sentiment Analysis," in Proc. CTS, 2012, pp.
546 - 550.
[11] M. Abdul-Mageed, S. Kübler and M. Diab, "SAMAR: A System for Subjectivity and Sentiment Analysis of Arabic Social Media," in Proc. WASSA, 2012, pp. 19-28.
[12] S. Ahmed and G. Qadah, "Key Issues in Conducting Sentiment Analysis on Arabic Social Media Text," in Proc. IIT, 2013, pp. 72-77.
[13] J. Salamah and A. Elkhlifi, "Microblogging Opinion Mining Approach for Kuwaiti Dialect," in
Proc. ICCTIM, Dubai, 2014.
[14] S. Al-Osaimi and K. Badruddin, "Role of Emotion icons in Sentiment classification of Arabic Tweets," in Proc. MEDES '14, 2014, pp. 167-171.
[15] R. Duwairi, R. Marji, N. Sha'ban and S. Rushaidat, "Sentiment Analysis in Arabic Tweets," in Proc. ICICS, 2014, pp. 1-6.
[16] N. Abdulla, N. Ahmed, M. Shehab and M. Al-Ayyoub, "Arabic Sentiment Analysis: Lexicon-Based and Corpus-Based," in Proc. AEECT, 2013, pp. 1-6.
[17] L. Albraheem and H. Al-Khalifa, "Exploring the problems of Sentiment Analysis in Informal," in
Proc. IIWAS '12, 2012, pp. 415-418.
[18] S. El-Beltagy and A. Ali, "Open Issues in the Sentiment Analysis of Arabic," in Proc. IIT, 2013, pp. 215-220.
[19] "Stop words list," [Online]. Available: https://code.google.com/p/stop-words/. [Accessed 25 Dec 2014].
[20] Appendices, "Dropbox," 1 Jan 2015. [Online]. Available:
https://www.dropbox.com/s/ipq6ann2yex8wn0/Appendices.docx?dl=0. [Accessed 1 Jan
2015].
[21] "Weka 3: Data Mining Software in Java," WEKA, The University of Waikato, [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/. [Accessed 25 Dec 2014].
[22] " Arabic MPQA subjective lexicon & Arabic opinion holder corpus," 23 May 2012. [Online].
Available: http://nlp4arabic.blogspot.com/2012/05/arabic-mpqa-subjective-lexicon-
arabic.html. [Accessed 25 Dec 2014].
[23] R. Khasawneh, H. Wahsheh, M. Al Kabi and I. Aismadi, "Sentiment analysis of Arabic social media content: a comparative study," in Proc. ICITST, 2013, pp. 101-106.