systematic literature review on opinion mining of big data for … · 2018-03-18 · systems....
TRANSCRIPT
6 http://www.webology.org/2017/v14n2/a156.pdf
Webology, Volume 14, Number 2, December, 2017
Home Table of Contents Titles & Subject Index Authors Index
Systematic Literature Review on Opinion Mining of Big Data for
Government Intelligence
Akshi Kumar
Department of Computer Science & Engineering, Delhi Technological University, Delhi-110042,
India. E-mail: [email protected]
Abhilasha Sharma
Department of Computer Science & Engineering, Delhi Technological University, Delhi-110042,
India. E-mail: [email protected]
Received August 21, 2017; Accepted December 25, 2017
Abstract
With the advent of new technology paradigm, SMAC (Social media, Mobile, Analytics and
Cloud) the information network generates an infinite ocean of data spreading faster and larger
than earlier. A high quality information extracted from this massive volume of data, named as big
data, urges the development of an efficient and effective decision support system and powerful
strategic tools in the area of government intelligence. This pool of information can be explored
for the benefit of an organization/system and to better understand its stakeholder needs by
collecting, mining opinions about every point or subject of interest. Digitally intelligent and
smart governance has been identified as a dynamic field with new studies being reported at
various research avenues. The need to review, analyze and evaluate research studies across
literature is thus fostered motivating us to identify existing trends, research gaps and potential
directions of future work within this domain. This paper intends to provide a systematic literature
review within the promising area of opinion mining and its application to the area of government
intelligence.
Keywords
Opinion mining; Big data; Digital governance; Systematic literature review
7 http://www.webology.org/2017/v14n2/a156.pdf
Introduction
Optimal processing and analytics of big data enables to better decision making and strategic
movement in relevant area. Different real time industry verticals are expanding their existing
data sets with data available from social web, i.e. big data to get a better understanding of their
customers' behavior and preferences. And this can be achieved by the means of opinion mining
which aims to extract people opinion from web. Different sources of big data provide a massive
amount of valuable information which can be examine and analyze for extracting sentiments.
The major sources of big data are: black box data, transportation data, stock exchange data,
social media data, power grid data, search engine data, healthcare data, etc. (Arputhamary &
Jayapriya, 2016). Opinion mining on big data has emerged as a notable direction of research with
scientific trials and promising applications being explored substantially. It has turned out as an
exciting new trend with a gamut of practical applications that range from Business and Trades;
Recommender Systems; Expert Finding; Politics; Government Sector; Healthcare Industry;
amongst others.
The terminology "opinion mining" was first observed in 2003 in a research paper by Dave et al.
By 2006 it started to be involved in web applications and various other domain like marketing
for product and service reviews, entertainment, politics, recommendations, business etc. which in
turn broaden its application spectra. With the rapid increase of people participation and
communication over social web where they express their ideas, opinion over a public forum;
opinion mining extracts the public mindset and the findings are being used in various domains.
As per literature so far, various application areas of opinion mining can be broadly classified as:
business intelligence (BI), government intelligence (GI), information security and analysis (ISA),
market intelligence (MI), sub component technology (SCT) and smart society services (SSS)
whereas the techniques used are machine learning (ML), lexicon based (LB), hybrid and concept
based. But no study has been conducted on reviewing of opinion mining implementation in
various application areas till now. However, few surveys are there which covers the existing
opinion mining techniques, opinion mining task classification, opinion mining evolution, opinion
mining and big data etc., but none of them emphasizes over its application areas and correlates
them with its techniques.
Specifically, to improve and optimize decisions and performance of government, the thrust area
is to explore and understand opinion mining, its applications and techniques using big data in
government intelligence. The idea is to envision “digital governance” by virtue of social web
adoption (social media data; source of big data), where government can take advantage of social
platforms involving huge user participation. The motivation is clearly based on the “Digital India
Initiative” recently proposed and projected by the Government of India. The initiative motivates
to embed the concept of S-governance which includes Social and Sentiment governance (Kumar
& Sharma, 2016) combines to form Smart governance in order to transform the nation into
8 http://www.webology.org/2017/v14n2/a156.pdf
digitally empowered knowledge economy and includes projects that aim to ensure that
government services are available to citizens electronically/digitally and people get benefit of the
latest information and communication technology (Makeinindia, n.d.; Digital India, 2014).
Hence, the requirement is to go through a state-of-art literature survey and review the significant
research on the subject of opinion mining of big data for digital governance. In this review, we
explicate how the voluminous amount of data i.e. big data is utilized for analyzing public
posts/comments/tweets/blogs/discussions by the application of opinion mining in government
intelligence.
Induction of Internet and communication technologies transforms the conventional governance
in to digital governance and its existence is visible in the year 2010 (Janowski, 2015). Therefore,
the proposed survey is an exhaustive study of the published articles of opinion mining
applications from 2011-2017 with the usage of various datasets (big data) in the field of digital
governance and its applications. The comparative analysis of different techniques of opinion
mining in its various application areas has been done on the basis of original studies. This paper
conducts a systematic literature survey which outlines the application areas of opinion mining,
discusses the introduction of opinion mining in digital governance and summarizes opinion
mining based digital governance progress graph so far.
The process for carrying out a systematic literature review (SLR) is given by Kitchenham et al.
in 2009. However, in this SLR we reshape the core architecture of the phases and represent them
in a more structured and sequential manner. This makes the SLR more systematic, well classified
and organized, for a better understanding and course of action. The rest of the paper is organized
as follows: Section 2 elaborates the three terms, i.e., opinion mining, big data and digital
governance individually and their influence on each other when coupled together. Section 3
describes the purpose of required phases to be performed during the conduct of a Systematic
Literature Review (SLR). Section 4 details the Review Planning phase (structured in a new
template) for the proposed SLR which contains the motivation and aim of the research, to gather
and analyze the relevant primary studies of research, its quality assessment/verification and
documentation. The next phase Review Conduct is elaborated in Section 5 which contains
searching strategy, literature review of selected studies in visual and tabulated form. Section 6
deals with the Review Reporting phase which document the results and discussion of complete
review. Finally, the paper winds up with a conclusion and future works in the last section.
Opinion Mining of Big Data for Digital Governance
With the progression of Web 2.0 and its plug-ins such as social networks, micro blogs, mobile
technologies, audio, video, web searches, discussion forums, posts, comments and several other
social media, citizens can vocalize their opinion and experiences in a more effective manner.
9 http://www.webology.org/2017/v14n2/a156.pdf
Digital empowerment of citizens plays a vital role in providing people centered applications by
transforming the conventional IT paradigm into next generation paradigm of SMAC based
systems. Social media and Mobile generates a voluminous amount of data i.e. big data, which
can be exploited by government for the extraction of public opinion to improvise and enhance
the process of decision making using techniques of opinion mining. Hence, opinion mining of
big data in digital governance plays a significant role for the success of government programs
and initiatives. The three basic terminologies namely, opinion mining, big data and digital
governance are briefly discussed as follows:
Opinion mining (OM), deals with opinion based natural language processing which includes
genre distinctions, emotion and mood recognition, ranking, relevance computations, perspectives
in text, text source identification and opinion oriented summarization (Kumar & Teeja, 2012).
Opinion Mining is "a field of study that tends to use the natural language processing techniques
to extract, capture or identify the attitude (otherwise sentiment) of a person with respect to a
particular subject. It is the automated mining of attitudes, opinions, and emotions from text,
speech, and database sources through NLP" (Kumar & Sharma, 2016).
Sentiment analysis (also called opinion mining, review mining or appraisal extraction, attitude
analysis) is the "task of detecting, extracting and classifying opinions, sentiments and attitudes
concerning different topics, as expressed in textual input" (Ravi & Ravi, 2015).
Big Data, a huge amount of data which is full of conversations, social networking sites, online
discussion forums, web based audio/video presentations, micro blogs, commercial transactions,
web logs, comments, pictures, documents, etc. It is popping up from numerous sources. Due to
its complex structure/nature, traditional database architectures are unable to collect/capture,
contemplate and conceptualize it for further processing. For productive and proper use of big
data assets it is necessary to understand its key attributes. Seven (Vs) characteristics (Mcnulty,
2014) of big data are volume, variety, velocity, veracity, variability, visualization and value.
10 http://www.webology.org/2017/v14n2/a156.pdf
Figure 1. V- Model of People Participation in Governance (The Digital Governance Initiative, n.d.)
Digital Governance
Governance is all about the action, manner or power of governing to ensure the environment of
an organization is stable, transparent, responsive and accountable and working in best interest of
the public. It includes mechanism of policy establishment, formulation of decision making
process and implementation, authority balancing and performance monitoring in order to manage
public affair.
At the same time, "good governance is to establish, consensus amongst the stakeholders of a
country with the goal to improve quality of life enjoyed by all citizens" (Kumar & Sharma,
2016). Good governance follows the rule of law and aims for efficient processes and effective
outcomes. The key attributes of good governance provided by the commission of human rights
(United Nations Human Rights, Office of the High Commissioner, n.d.) in a resolution 2000/64
are Transparency, Responsibility, Accountability, Participation and Responsiveness. Good
governance is about taking decisions based on information set or knowledge set. "Digitization of
this information is required for its easy access available on network which open arms to all
individuals of every community or village for use of this information - paving the way for Digital
Governance" (The Digital Governance Initiative, n.d.).
Representative Mode Individual /
Collective
In-situ Domain Ex-situ
Passive /
Reactive Approach
Pro-active
/Interactive
Indirect / Delayed Direct/Immediate Impact
Conventional
Governance Model
People
Participation
Digital
Governance Model
11 http://www.webology.org/2017/v14n2/a156.pdf
The intent of digital governance is to ensure that common citizens should associated with
decision-making processes as it affects them directly or indirectly, which in turn improves their
conditions and the quality of lives (Nath, 2003). This new facet of governance will assure that
citizens are active contributor in deciding the kind of services they want as well as attentive
consumer of services offered to them. And this is what required for a powerful democratic
country.
The perspective and participation of people changes with the transformation of governance
model from conventional to digital is represented in Figure 1. This comparison states that in
contrast to conventional governance, digital governance allows a more closer contact of public
and decision-makers in the government and results in greater access and check over governance
mechanism in order to constitute a more responsive, transparent, responsible, accountable and
efficient governance" (The Digital Governance Initiative, n.d.).
Opinion Mining of Big Data for Digital Governance results in a more stabilized and smart
governance which analyze data available on social web and extract public sentiments for the
process of decision making. The evolution of web based digital governance is depicted in figure
2 which also exhibits the parallel mapping of opinion mining, big data and digital governance
with the versions of web.
Web 1.0 is the web of cognition and primarily read-only, Web 2.0 is the web of people
corresponds to communication and Web 3.0 has become the web of data which stands for co-
operation (Kumar & Sharma, 2015). With the emergence of Web 3.0, the concept of smart and
socio-sentimental governance is readily apparent, which transforms the data into decision. With
the evolution of web, the social media model for digital governance is also emerging to utilize
the high volume of unstructured data (Kumar & Joshi, 2017). Although, there are no fixed model
for digital governance. Rather, they are customized by investigating and research, conducive to
cater the needs of a nation or society. Some generic models being practiced are: critical flow
model, comparative analysis model, broadcasting model, e-advocacy model and interactive
service model (Nath, 2003).
The idea is to look for models that enrich the big data analytic solution architecture, tools and
technologies for creating awareness and opportunities that make big data and its analytics in
government domain. Amongst all, e-advocacy and lobbying is most frequently used model as it
motivates the personal and people participation in decision making process of organizations and
government officials. Some of the projects based on e-advocacy model are: Greenpeace Cyber-
Activist Community, Independent Media Centre, Panchayats, Drop the Debt Campaign, PRS
Legislative Research, IGC Internet etc. (The Digital Governance Initiative, n.d.), (Nath, 2003).
"Advocacy is an activity by an individual or group which aims to influence decisions within
political, economic, and social systems and institutions. Advocacy can include many activities
12 http://www.webology.org/2017/v14n2/a156.pdf
that a person or organization undertakes including media campaigns, public speaking,
commissioning and publishing research or conducting exit poll or the filing of an amicus brief."
(Wikipedia contributors, 2017).
The purpose of advocacy is to notify information and communicate facts to politicians and to
their deputies or officers about emotional and mental constitution of public mind and its
pertinence in governance. Legislators work for public welfare and they want views, outlook,
facts, and insights to figure out realism. Now here public participation is critical and their
reaction, inferences, viewpoint shared on social media could be mined (by the use of opinion
mining) in order to provide the information/news required by officials/lawmakers. Therefore,
inducement of opinion mining in advocacy will go a long way and continue a long innings in the
scope of advocacy.
Figure 2. Evolution of Web Based Digital Governance
Systematic Review Phases
The Systematic Literature Review(SLR) comprise of three phases represented in figure 3 and its
outline is planned, executed and organized as discussed by Kitchenham and Charters
(Kitchenham et al., 2009). However, the core architecture of the phases is restructured to make
them more classified and understandable.
Review Planning, being the first phase of SLR is crucial to identify and capture the existing and
relevant research work done in the area. The information gathered is further examined and
evaluated for the development of SLRS (Systematic Literature Review Specification) document.
The second phase is Review Conduct, which provides a collective and comprehensive review of
existing literature. Review Reporting is the last phase of SLR in which various findings,
observations and results are inspected and reported.
13 http://www.webology.org/2017/v14n2/a156.pdf
Review Planning
Review planning is the most important and fundamental stage of SLR. It leads
o To explore and understand the motivation behind the survey; and
o To make out the need of review.
Next step is to define research objectives with the help of questions as per the aim of SLR. A
well formed/designed question can help in taking key decisions related to research. Thereafter,
all the possible and relevant reviews/research work are captured/extracted for the development of
review protocol which includes following steps: Review Elicitation, Review Analysis, Review
Validation and Review Documentation. Detailed review planning phase is represented in figure 4
and discussed below.
Figure 3. Systematic Review Stages
Identify the Need
A huge amount of work has been done in the area of opinion mining of big data. Social web
statistics reveals that more than
o 3 billion people across world are online and use Internet for exchanging information and
ideas, sharing, searching facts, publishing amongst others; and
o 2 billion people worldwide maintaining social accounts (Twitter, Facebook, etc.) (Kumar &
Sharma, 2016).
14 http://www.webology.org/2017/v14n2/a156.pdf
Gamut of social networking sites has offered a platform for citizens to publically vocalize their
ideas in distinct areas which increases the significance of any issue/discussion very swiftly.
Hence, the necessity of incorporating public opinion and interest in the process of decision
making has grown exponentially throughout the years. And the fact that it is vitally important in
governance has been observed across literature studies.
Opinion mining is the key underpin of synergetic policymaking. It helps in listening to the
problem rather than asking it and hence ensures a detailed and precise speculation/impression of
verity (Osimo & Mureddu, 2012). It helps in concluding thousands of inferences by considering
various opinion, ideas, discussions and feedback from citizens and covers many spheres like
advisory, decision making, advocacy, early warning system detection etc.
The serviceability of collective and interactive aspect of social web by extracting and analyzing
sentiments can proffer a significant, incomparable platform for enormous association of citizens
in governance measures in order to make it smart and intelligent.
Analytics on the unstructured data can transform the governance model into a social governance
model, as an integration of social media and S-governance. In view of this, the need is to further
research and study this area to gain insights into public opinion by mining user generated data on
social web for empowering digital governance.
15 http://www.webology.org/2017/v14n2/a156.pdf
Figure 4. Review Planning Activities
16 http://www.webology.org/2017/v14n2/a156.pdf
Identify Research Questions
This review paper aims to explore opinion mining techniques in the field of advocacy in order to
develop, deepen and broaden the understanding of S-governance. Research questions (RQs) have
been formulated according to ensure complete coverage of existing research on the topic and the
scope of opinion mining to evolve in the area of advocacy. Table 1 enlists the research questions
identified for this SLR.
Table 1. Identified Research Questions
RQ# Research Questions
RQ1 What are the different application areas and techniques of
opinion mining?
RQ2 How much work has been done in the area of government
intelligence using opinion mining?
RQ3 Which (big) data sets are used for opinion mining in government
intelligence?
RQ4 Which techniques of opinion mining have been used in advocacy
so far?
Develop Review Protocol
This phase takes input as research questions, processes them with various sequential steps and
finally generates output as Systematic Literature Review Specification (SLRS) document. We
divide the activity of development of review protocol into four steps: Review Elicitation, Review
Analysis, Review Validation and Review Documentation as represented in Figure 5.
17 http://www.webology.org/2017/v14n2/a156.pdf
Research Question(s)
SLRS
Figure 5. Steps of Developing Review Protocol
Review Elicitation
Decomposition of Research Questions into individual and small units has been done for the
formation of search terms. Following were the search terms identified: opinion mining, big data,
and opinion mining applications. Identified search terms then get refined by incorporating
alternative terms and synonyms.
These equivalent terms were: sentiment analysis, sentiment extraction, and social web data.
Different permutation and combination of basic and refined terms was then applied with OR,
AND logistics for search of relevant studies.
Search of Available Primary Studies has been done considering the following digital four
libraries: Science Direct, Springer, ACM Digital Library and IEEE Transactions. The search
terms were explored in titles, keywords and abstracts. This step results all the available primary
studies put together in contemplation of answering research questions.
Review Analysis
Inclusion/Exclusion (I/E) Criteria is based to limit the scope of research to find fitting answers to
the RQs. The collected primary studies were input to this step and inclusion/exclusion criteria
was incorporated to select or reject the studies based on their relevancy.
Develop
Review
Protocol
Review Elicitation
Review Analysis
Review Documentation
Review Validation
18 http://www.webology.org/2017/v14n2/a156.pdf
Inclusion Criteria
o Journal Papers of Science Direct, Springer, ACM Digital Library and IEEE
o Papers from year 2011-2017
o Opinion mining and its applications
o Techniques of opinion mining
o Secondary studies of opinion mining
o Survey of opinion mining techniques in its various application areas
Exclusion Criteria
o Conference papers of: Science Direct, Springer, ACM Digital Library and IEEE
o Non textual representation of opinions/sentiments
o Big data process optimization and its survey
o Task classification of opinion mining.
Classification of Papers has been done based on the above mentioned inclusion/exclusion
criteria. Owing to the theoretical limitations and feasibility, we considered only journal papers
from selective digital databases, thus limiting the scope of our study. Table 2 depicts the process
of paper classification in different phases. Total 194 papers were selected as the outcome of this
step.
Table 2. Paper Classification Stats
Elicitation Phase Selected Papers Analysis Phase Selected Papers
Science Direct
Springer Journals
ACM Digital Library
IEEE Transactions
240 primary papers
Inclusion/exclusion
criteria (40)
Redundant papers (4) 194 final papers
Review Validation
Data Extraction and Synthesis accomplishes the appropriate analysis from the 257 final studies
for the extraction of relevant information required to answer the research queries. Selected
papers were referred for the extraction of information, such as year of publication, contributing
author(s), application areas, existing techniques, datasets used for assessment of performance,
etc. All such information is tabulated in order to get simplify the data synthesis process for
review. Data synthesis process summarizes the extracted information in the form of tables
(representation of categorized information) and diagrams like graphs and charts (representation
of measurable information). Table 3 and 6; Figure 6, 7, 8, 9, 10 and 11 were taken as output of
this step.
19 http://www.webology.org/2017/v14n2/a156.pdf
Review Documentation
Review Documentation results in a document call up as Systematic Literature Review
Specification (SLRS) document. It is a detailed description of relevant studies and their
systematic compilation and presentation in order to answer the research objectives in the
projected research area. Section 6 covers this step.
Review Conduct
Review conduct executes the proposal offered in planning phase. The actual process
implementation of studies elicitation and their analysis based on inclusion/exclusion criteria is
performed here by employing Search Strategy. The subsequent step is to explicate the
comprehensive Literature Survey by incorporating the diagrams, tabulations and answers of the
identified research questions.
Search Strategy
In this step, we see how much work is ongoing in this field. An extensive search of relevant
primary studies is done to capture the mounting interest. The chart in Figure 7 represents the year
wise number of papers published in different application areas, reflecting the attention. This area
is gaining from researchers. Some of the popular journals and conference in which the primary
studies have been published is listed in Table 3.
Figure 7. Year wise studies distribution
20 http://www.webology.org/2017/v14n2/a156.pdf
Table 3. Some popular journals of primary studies of this SLR
Publication Name Type
Decision Support Systems Journal
Expert Systems with Applications Journal
Knowledge-Based Systems Journal
Computers & Security Journal
IEEE Transactions on cybernetics Journal
Procedia Computer Science Journal
IEEE Intelligent Systems Journal
Information Processing &Management Journal
Applied Intelligence Journal
Quality & Quantity Journal
IEEE Transactions on Knowledge and Data Engineering Journal
ICT Express Journal
Applied Soft Computing Journal
Procedia Engineering Journal
Journal of Intelligent Information Systems Journal
Procedia Computer Science Journal
Computers in Human Behavior Journal
IEEE Transactions on Multimedia Journal
Literature Survey
In this phase, a state-of-art literature survey is carried to review the substantial research in the
area of opinion mining and its application area. The preliminary study from pertinent literature is
summarized in Table 4.
In the area of business intelligence, Tsai et al. (2011) proposed an algorithm for novel business
blogs detection by applying machine learning techniques. Chung et al. (2012) developed a new
framework based on ML techniques for extracting relation of customer rating and reviews. Jang
et al. (2013) proposed a method for reviewers' attitude extraction and their traits identification.
Zitnik et al. (2012) used ML techniques to enhance business process operations. In 2013, Pai et
al. proposed a ML based method to aid customers in better decision making and to support
organization for better understanding of product/service appraisals. Wu et al. (2014) and Li et al.
(2014) used ML techniques for stock markets prediction of automatic trading decisions and
market movements.
In the area of government intelligence, Cao et al., Katakis et al., Mohammad et al. in years 2011,
2014 and 2015 respectively worked in the field of voting advisory with use of machine learning
techniques. Tools are developed for providing voting advice to application users and political
parties are also facilitated by social media sites by making public opinion available to them.
Automation of manual processes has been done.
21 http://www.webology.org/2017/v14n2/a156.pdf
Panasyuk et al. in 2014, Cheng et al., Adedoyin-Olowe et al. and Gull et al. in 2016, worked in
the area of politics using machine learning techniques. Various requirements like public opinion
analysis system, identification of conflicting communities, trend prediction of elections etc. have
been fulfilled. Haselmeyer et al. in 2016 contributed in politics domain with the use of lexicon
based techniques. Objectives were to find sentiment polarity of user comments to provide
political portal recommendation and usage of sentiment dictionary in negative campaigning of
parties. De Fortuny et al. in 2012, Xianghua et al. in 2013, Makazhanov et al. in 2014, Kagan et
al. in 2015 use hybrid techniques in politics. They emphasizes over the automation of manual
process of opinion analysis over social communication channels and how political preference
prediction has been done on Twitter. In 2014, Rill et al. used concept level ontology based
technique to conclude that twitter detection of any topic is much faster than other communication
channels.
O'Leary, D. E in 2016 and Singh et al. in 2017 worked in the area of advocacy and policy making
using machine learning techniques. They aim to analyze public sentiments against government
policy. ElTayeby et al. in 2014 and Tayal et al. in 2016 applied machine learning techniques in
campaigning and developed tools to estimate their success. Also, clustering of similar opinions
has been performed to study the impact of media.
In the area of information and security analysis, Bhargava et al. in 2016 used lexicon based
techniques for opinion summarization. Framework has been proposed for entity based topic
oriented summarization. Proposed graph based technique summarizes redundant opinions.
He and Zhou, Liu et al. and Fan et al. in 2011, Zhai et al., Balahur et al., Cruz et al., Lane et al.
and Wang et al. in 2012, Bagheri et al., and Haddi et al. in 2013, Wu et al., Kansal and
Toshniwal, Dařena et al., Zheng et al., and Dehkharghani et al. in 2014, Wang and Liu, Kurian
and Asokan in 2015, Li et al. and Qiang et al. in 2016 and Rout et al. in 2017 applied various
machine learning algorithms for analyzing information like in summarization systems and their
automation and in security analysis like spam detection etc.
In the sub component technology domain, Rosa et al. in 2015 used lexicon based techniques. The
studies done by them presents recommendation systems for restaurant reviews and music. In
2016, Lei et al. and Dong et al. and in 2017, X Ma et al. applied hybrid techniques to develop
recommender system for products and rating prediction. Qiao et al. and Bridge and Healy in
2011, Anooj in 2012, Kardan and Ebrahimi, Geva and Zahavi, García-Cumbreras et al., Chen
and Wang, Peleja et al., Scharkow in 2013, Sun et al. in 2014, Capdevila et al., Coussement et
al., Colace et al., Wei et al., Alahmadi and Zeng, Sohail et al. in 2015, Pröllochs et al., Li et al.,
Zhang et al., and Bilici & Saygın in 2016 and Gurini et al., Ma et al. in 2017 used machine
learning techniques in citation analysis, recommender systems etc.
22 http://www.webology.org/2017/v14n2/a156.pdf
In the area of smart society services, Zakzouk and Mathkour, Cambria et al. in 2012, Valsamidis
et al., Huh et al., Lu in 2013, Yan et al., de Oña et al., Bogdanova et al., Chen et al., Cheng et al.,
Cao et al., Wiley et al., Küçük et al. in 2014, İskender & Batı, Aliandu, Bucur, Ji et al., Bobicev
et al. in 2015, Lv & El-Gohary, Hu et al., Bui et al., Rossetti et al., Yang et al., Lee et al., in 2016
and James et al., Kuflik et al., Du et al., Dai & Hao in 2017 worked over machine learning
techniques. Muller et al. and Ali et al. in 2015 make use of ontology in concept level technique
for the automation of hotel reservation system and enhancing the data related to transportation by
mining social media. Hybrid techniques were used by Cameron et al., in 2013, Nguyen et al. and
Ortigosa et al. in 2014, Shen & Kuo in 2015, Asghar et al. and Ali et al., Xue et al.in
2017.Various areas such as drug epidemiology, online depression community analysis, tourism
monitoring, e-learning in education, health related issues have been considered for analysis. A
significant amount of work has been reported in this area using lexicon based techniques i.e.
Moreo et al. in 2012, Mostafa in 2013, Lei et al. in 2014, Teodorescu, Yu & Wang in 2015 and
Neidhardt et al. in 2017.
Marketing intelligence is one of the widely used application area of opinion mining. Various
researchers have applied machine learning techniques in review analysis of products and
services, reputation monitoring, placements of ads etc. Ghose & Ipeirotis, Sanchez-Monzon et
al., Fan et al., Park & Lee, Xu et al., Bai, Chen & Tseng in 2011, Yu et al., Liu et al., Chen et al.,
Schumaker et al., Min & Park, Reyes & Rosso, Esparza et al., Kang et al. in 2012, Wöllmer et
al., Bafna & Toshniwal, Liu et al., Yaakub et al., Basari et al., Yu et al. in 2013, Wu et al., Zhang
et al., Jiang et al., Vinodhini & Chandrasekaran, 2014a, 2014b, Yang & Chao, Wang et al., Zhang
et al., Smailović et al. in 2014, Zhang et al., Parkhe & Biswas, Bastı et al., He et al., Sandoval &
Hernández, Zimmermann et al., D’Avanzo & Pilato, Amarouche et al., Nguyen et al., Fernández-
Martínez et al. in 2015, Yan et al., Li et al., Vinodhini & Chandrasekaran, Cosma & Acampora,
Hur et al., Luo et al., Chua & Banerjee, Jin et al., Zhang et al., Malik & Shakshuki, Djouvas et
al., Lee et al., Khan et al., Manaman et al., Nayak et al., Chen et al., Feuerriegel & Prendinger,
Anand & Naorem in 2016 and Wang, Yuan et al., Tsirakis et al., Manek et al., Gui et al., Tsai &
Wang, Chan & Chong, Ding et al., Hu et al. in 2017 are the study of work done in MI using ML
techniques. Concept level techniques are implemented in year 2014 and 2015 by Ali et al. and
Liu et al. respectively. An approach has been proposed for identification of product features and
their relevant opinions. Cataldi et al. in 2013, Kang & Park and Li et al., in 2014, Van de Kauter
et al. in 2015, Li et al. in 2016 and Law et al., Daniel et al. in 2017 implemented hybrid
techniques in MI. Moreover, lexicon based techniques are used by Mostafa in 2013 to find
sentiment polarity towards famous brands and their services ; Cho et al. in 2014 proposed a
process for automated lexicon creation of stock market and a method is proposed for revising
sentiment dictionaries to classify sentiments of product reviews. Song et al. in 2015 proposed a
framework of customer review analysis for evaluating quality of service.
23 http://www.webology.org/2017/v14n2/a156.pdf
Table 4. Literature work in application area of opinion mining
S.
No
Application
Area
Technique
Used
Year Paper
Count
References
1 Business
Intelligence
Machine
Learning
2011 1 (Tsai & Kwee, 2011)
2012 2 (Chung & Seng, 2012), (Zitnik, 2012)
2013 2 (Jang et al., 2013), (Pai et al., 2013)
2014 2 (Wu et al., 2014), (Li et al., 2014)
2 Government
Intelligence
Concept
level
2014 1 (Rill et al., 2014)
Hybrid
2012 1 (De Fortuny et al., 2012)
2013 1 (Xianghua et al., 2013)
2014 1 ( Makazhanov et al., 2014)
2015 1 (Kagan et al., 2015)
Lexicon 2016 1 (Haselmayer & Jenny, 2016)
Machine
Learning
2011 1 (Cao et al., 2011)
2014 3 (Katakis et al., 2014), (ElTayeby et al., 2014),
(Panasyuk et al., 2014)
2015 1 (Mohammad et al., 2015)
2016 5 (Cheng et al., 2016), (Tayal & Yadav, 2016),
(Adedoyin-Olowe et al., 2016), (Gull et al.,
2016), (O'Leary, 2016)
2017 1 (Singh et al., 2017)
3
Information
and Security
Analysis
Lexicon 2016 1 (Bhargava et al., 2016)
Machine
Learning
2011 3 (He & Zhou, 2011), (Liu et al., 2011), (Fan &
Wu, 2011)
2012 5 (Zhai et al., 2012), (Balahur et al., 2012),
(Cruz et al., 2012), (Lane et al., 2012), (Wang
et al., 2012)
2013 2 ( Bagheri et al., 2013),( Haddi et al., 2013)
2014 5 (Wu et al., 2014), (Kansal & Toshniwal,
2014), (Dařena et al., 2014), (Zheng et al.,
2014), (Dehkharghani et al., 2014)
2015 2 (Wang & Liu, 2015), (Kurian & Asokan,
2015)
2016 2 (Li et al., 2016), (Qiang et al., 2016)
2017 1 (Rout et al., 2017)
4 Market
Intelligence
Concept
level
2014 1 (Ali et al., 2014)
2015 1 (Liu et al., 2015)
Hybrid
2013 1 (Cataldi et al., 2013)
2014 2 (Kang & Park, 2014), (Li et al., 2014)
2015 1 (Van de Kauter et al., 2015)
2016 1 (Li et al., 2016)
2017 2 (Law et al., 2017), (Daniel et al., 2017)
Lexicon 2013 1 (Mostafa, 2013)
24 http://www.webology.org/2017/v14n2/a156.pdf
2014 1 (Cho et al., 2014)
2015 1 (Song et al., 2015)
Machine
Learning
2011 7 (Ghose & Ipeirotis, 2011), (Sanchez-Monzon
et al., 2011), (Fan et al., 2011), (Park & Lee,
2011), (Xu et al., 2011), (Bai, 2011), (Chen &
Tseng, 2011)
2012 8 (Yu et al., 2012), (Liu et al., 2012), (Chen et
al., 2012), (Schumaker et al., 2012), (Min &
Park, 2012), (Reyes & Rosso, 2012), (Esparza
et al., 2012), (Kang et al., 2012)
2013 6 (Wöllmer et al., 2013), (Bafna & Toshniwal,
2013), (Liu et al., 2013), (Yaakub et al., 2013),
(Basari et al. 2013), (Yu et al., 2013)
2014 9 (Wu et al., 2014), (Zhang et al., 2014), (Jiang
et al., 2014), (Vinodhini & Chandrasekaran,
2014a, 2014b), (Yang & Chao, 2014), (Wang
et al., 2014), (Zhang et al., 2014), (Smailović
et al., 2014)
2015 10 (Zhang et al., 2015), ( Parkhe & Biswas,
2015), (Bastı et al., 2015), (He et al., 2015),
(Sandoval & Hernández, 2015),
(Zimmermann et al., 2015), (D’Avanzo &
Pilato, 2015), (Amarouche et al., 2015),
(Nguyen et al., 2015), (Fernández-Martínez et
al., 2015)
2016 19 (Yan et al., 2016), (Manek et al., 2017), (Li et
al., 2016), (Vinodhini & Chandrasekaran,
2016), (Cosma & Acampora, 2016), (Hur et
al., 2016), (Luo et al., 2016), (Chua &
Banerjee, 2016), (Jin et al., 2016), (Zhang et
al., 2016), (Malik & Shakshuki, 2016),
(Djouvas et al., 2016), (Lee et al., 2016),
(Khan et al., 2016), (Manaman et al., 2016),
(Nayak et al., 2016), (Anand & Naorem,
2016), (Chen et al., 2016), (Feuerriegel &
Prendinger, 2016)
2017 9 (Wang, 2017), (Yuan et al., 2017), (Guo et al.,
2017), (Tsirakis et al., 2017), (Gui et al.,
2017), (Tsai & Wang, 2017), (Chan & Chong,
2017), (Ding et al., 2017), (Hu et al., 2017b)
5
Sub
Component
Technology
Hybrid 2016 2 (Lei et al., 2016), (Dong et al., 2016)
2017 1 (Ma et al.,2017)
Lexicon 2015 1 (Rosa et al., 2015)
Machine
Learning
2011 2 (Qiao et al., 2011), (Bridge & Healy, 2011)
2012 1 (Anooj, 2012)
2013 6 (Kardan & Ebrahimi, 2013), (Geva & Zahavi,
2013), (García-Cumbreras et al., 2013), (Chen
& Wang, 2013), (Peleja et al., 2013),
25 http://www.webology.org/2017/v14n2/a156.pdf
(Scharkow, 2013)
2014 1 (Sun et al., 2014)
2015 6 (Capdevila et al., 2015), (Coussement et al.,
2015), (Colace et al., 2015), (Wei et al., 2015),
(Alahmadi & Zeng, 2015), (Sohail et al.,
2015),
2016 4 (Pröllochs et al., 2016), (Li et al., 2016),
(Zhang et al., 2016), (Bilici & Saygın, 2016)
2017 2 (Gurini et al., 2017), (Ma et al., 2017)
6
Smart
Society
Services
Concept
level
2015 2 (Grant-Muller et al., 2015), (Ali et al., 2015)
Hybrid
2013 1 (Cameron et al., 2013)
2014 2 (Nguyen et al., 2014), (Ortigosa et al., 2014)
2015 1 (Shen & Kuo, 2015)
2016 2 (Asghar et al., 2016), (Ali et al., 2016)
2017 1 (Xue et al., 2017)
Lexicon
2012 1 (Moreo et al., 2012)
2013 1 (Mostafa, 2013)
2014 1 (Lei et al., 2014)
2015 2 (Teodorescu, 2015), (Yu & Wang, 2015)
2017 1 (Neidhardt et al., 2017)
Machine
Learning
2012 2 (Zakzouk & Mathkour, 2012), (Cambria et al.,
2012)
2013 3 (Valsamidis et al., 2013), (Huh et al., 2013),
(Lu, 2013)
2014 8 (Yan et al., 2014), (de Oña et al., 2014),
(Bogdanova et al., 2014), (Chen et al., 2014),
(Cheng et al., 2014), (Cao et al., 2014), (Wiley
et al., 2014), (Küçük et al., 2014)
2015 5 (İskender & Batı, 2015), (Aliandu, 2015),
(Bucur, 2015), (Ji et al., 2015), (Bobicev et
al., 2015)
2016 6 (Lv & El-Gohary, 2016), (Hu et al., 2017a),
(Bui et al., 2016), (Rossetti et al., 2016),
(Yang et al., 2016), (Lee et al., 2016)
2017 4 (James et al., 2017), (Kuflik et al., 2017), (Du
et al., 2017), (Dai & Hao, 2017)
Review Reporting/Results
This section answers the research questions identified in section 4.2.
RQ1 What are the different application areas & techniques of opinion mining?
Answer Opinion mining is widely used in various real life application areas which could be
broadly classified as follows:
26 http://www.webology.org/2017/v14n2/a156.pdf
o Government Intelligence (GI) exploits opinion mining in order to provide public participation
for improving the process of planning and decision making. It help governments and
organizations in resource optimization, improves democratic measures/mechanism,
revolutionize the delivery mode of public services in order to develop a citizen centric model
of governance. Government Intelligence spectrum covers Politics, Advisory Voting,
Advocacy etc.
o Sub Component Technology take advantage of opinion mining which has a significant
promising role in enabling the mechanics of other systems. Expert Finding, Recommender
Systems, Citation Analysis are some of the action areas of opinion mining in sub component
technology.
o Information and Security Analysis (ISA) utilizes opinion mining for the analysis of
information as well as security. Information could be analyzed by summarization techniques,
content analysis etc. and security analysis deals with spam detection, fraud detection etc.
Both information and security analysis are generic domain could be applied in any
application sphere of opinion mining for the sake of making results/conclusions more
organized and genuine.
o Market Intelligence (MI) deals with the process of collecting and analyzing information
associated to market analytics and research for good decision making and strategies
conceptualization. With the digitization of market intelligence, numerous data resources are
used to get an overall illustration of company's reputation monitoring and its existing market,
consumer needs and priorities, market challenges, struggle/competition, and expansion
competency for fresh and latest products and services. Trend analyzers and researches
utilizes opinion mining as a tool/technique which aids effectual and proficient assessment of
consumer/customer viewpoint. Various areas fall under the umbrella of market intelligence
such as Product/Service Reviews, Market Analysis and Decision Making, Product Service
Quality Improvement etc.
o Business Intelligence (BI) is an industrial science of gathering, analyzing, presenting,
reporting and broadcasting of business information to aid business analysts, executives and
end users with well updating of business decision and conclusions. The goal is to exploit
upcoming opportunities and executes an effective plan of action for its accomplishment.
Opinion Mining helps in strengthening the various activities of business intelligence such as
business process optimization, trend analysis etc. by considering the opinion of different
stakeholders. A new social commerce paradigm has emerged.
o Smart Society Services (SSS) is an integration of various necessities and utilities, required for
the anatomy of an innovative and sustainable society by the use of internet and
communication technologies and digital networks. These services targets to resolve the
social, environmental and economical problems of the present era in view of improving
wellbeing, quality of life, to focus on resource optimization and ultimately to meliorate the
collective intelligence of society. Opinion mining being a powerful concept/technology to
27 http://www.webology.org/2017/v14n2/a156.pdf
grab the public sentiments, plays an important role in cultivating the foundation of smart
society by contributing the public/audience propensity towards smart services. Some key
areas of smart society services are education sector, banking, healthcare, transportation etc.
o Some examples of their respective field of action are listed in Table 5. The corresponding
field of action for the respective application areas are not limited to the one mentioned over
here. More action fields could be accommodated into the broader category of applications
which is an open scope for this SLR.
Table 5. Opinion Mining Application Areas
Techniques of Opinion Mining
o Opinion mining techniques can be broadly divided into following approaches:
Application Area(s) Field of Action(s)
Government Intelligence
Voting Advisor
Advocacy & Policy Making
Advocacy Evaluation & Monitoring
Politics
Detection of Early Warning system
Campaigning
Natural Resource Protection/Environment Utility
Sub Component
Technology
Expert Finding
Recommender System
Argument Mapping Software
Detection of "flames"
Citation Analysis
Collective Intelligence
Information and
Security Analysis
Automated Content Analysis
Buying a commodity or service
Spam Detection
Automated Summarization
Decision Making Tasks
Market Intelligence
Product/Service Reviews
Market Analysis and Decision Making
Product Service Quality Improvement
Ads Placement
Reputation Monitoring
Stock Exchange
Business Intelligence
Business Process Optimization
Automated Trading System
Trend Prediction in Sales
Smart Society Services
Banking
Education
Health Care
Sports
Tourism Industry
Resource Optimization
28 http://www.webology.org/2017/v14n2/a156.pdf
o Machine Learning Based Approaches which implement machine learning algorithms that
can learn from data and helps in predictive analysis. These can be classified into two
groups: supervised and unsupervised. Supervised approaches apply past learning from
training data to test new data whereas unsupervised approach itself identifies unrevealed
patterns in unlabeled data.
o Lexicon Based Approaches which are based on sentiment thesaurus, a collection of
known and precompiled sentiment terms. The two main approaches for compilation of
opinion lexicon are: dictionary based and corpus based. In dictionary based approach,
manual process is used for creation of a small set of opinion words reflecting their
orientation and gets it enhanced by searching their synonyms and antonyms. Corpus
based approach uses statistical or semantic methods for finding sentiment polarity and
determining words emotional affinity.
o Hybrid Approaches are a combination of both machine learning and lexicon based
approach, collaborated for better performance.
o Concept Based Approaches have been introduced recently & are based on huge
knowledge bases and analyzing the conceptual information associated to natural language
opinion behind the multi-word expressions. Semantic analysis is performed by the use of
web ontologies or semantic networks (Cambria et al., 2013).
Thus, approaches used in opinion mining so far are listed as under (K. Ravi & V. Ravi, 2015),
(Medhat et al., 2014):
Machine Learning Based
Supervised Learning
Decision Tree Classifier
Rule Based Classifier
Probabilistic Classifier
Naive Bayes
Bayesian Network
Maximum Entropy
Linear Classifier
Support Vector machine
Neural Network
Unsupervised Learning
Meta Classifiers
Lexicon Based
Dictionary Based
Corpus Based
Statistical
Semantic
Hybrid
Concept Based
Ontology Based
Context/ Non Ontology Based
29 http://www.webology.org/2017/v14n2/a156.pdf
Table 6 represents opinion mining technique and their respective application area in which it is
used as observed from the selected final studies.
Table 6. Usage of Opinion Mining Techniques in its application areas
Application Area
Techniques
BI GI ISA MI SCT SSS
Machine Learning ✓ ✓ ✓ ✓ ✓ ✓
Lexicon Based ✓ ✓ ✓ ✓ ✓
Hybrid ✓ ✓ ✓ ✓
Concept Level ✓ ✓ ✓
Charts in Figure 8 and 9 depict the percentage of work done in various application areas of
opinion mining and the percentage usage of its techniques/approaches in these application areas.
Figure 8. % Work done in OM Applications
Figure 9. % Work done in OM Techniques
RQ2 How much work has been done in the area of government intelligence using opinion
mining?
30 http://www.webology.org/2017/v14n2/a156.pdf
Answer Following are the various field of action in government intelligence application arena
as listed in Table 7 along with their motivation, detail and scope.
Table 7. Literature work in area of Government Intelligence
S. No. Author Year Publication Field of
Action
Technique(s) Used Dataset Used
Motivation Details and Scope Name
Algorithm
Used Scope Source
1 Cao et al. 2011
Elsevier,
Decision
Support Systems
Voting
Advisor ML
Factor
Analysis, Ordinal
Logistic
Regression
User
Reviews
CNET Download.c
om
Online user reviews
are effective on
product sales and consumer decision-
making. The
“helpfulness” property of online user reviews
aids consumers to
manage with the huge
amount of information
available on Web and
promotes decision-making.
This paper investigates how
the features of online user
reviews effects on the number of helpfulness votes received.
Basic, stylistic and semantic
are the three characteristics. Latest reviews inclined to be
more precise and complete as
they consider previews
reviews, therefore receive
more helpfulness vote.
2 De Fortuny
et al. 2012
Elsevier, Expert
Systems with
Applications
Politics Hybrid
KDD,
Lexicon based
News
articles
All major Flemish
newspapers
in 2011
Manual analysis of
contrasting and evaluating opinions for
large amount of online
available articles is troublesome job.
Requisite is to
overcome the method by automating the
process.
In this paper, newspapers are
compared on specific topics in order to help political parties
in decision making by
proposing a framework based on text-mining. Online
dashboard updates Regular
update of stats in real time by online dashboard promote
above process. Future vision
is to apply the framework over American presidential race.
3 Xianghua et
al. 2013
Elsevier,
Knowledge
Based Systems
Politics Hybrid
NB, KNN,
SVM,LDA,
Hownet Lexicon
Blogs
2000-SINA-Blog
dataset,
300-SINA-Blog
dataset
With the development
of communication
technologies, social
media provides a
platform to share public sentiment and
ideas over any topic.
Manual analysis of these online reviews is
becoming difficult due
to exponential rise in their number. Hence,
objective is to make
the process convenient.
A trained LDA model is used
for opinion mining of Chinese
social reviews for
identification of sliding
windows topic. How Net lexicon is used to analyze the
related sentiment. Proposed
method performs better in topic identification and
sentiment analysis. Future
work targets to use existing method and compare it with
TSM, JST in different
datasets.
4 Katakis et al.
2014
IEEE
Transactions on
Cybernetics
Voting Advisor
ML
Dimensiona
lity Reduction
PCA
Preferenc
e Matcher research
Consortiu
m
www.preferencematche
r.org
To provide voting advice to the users of
social voting advice
applications is required.
To compare users political opinion and providing voting
recommendation, a new
online tool has been proposed for social voting advice
application. Future areas are
data privacy and transparency
in voting advice application
recommendation engine.
5 Makazhanov et al.
2014
Social
Network Analysis and
Mining
Politics
Hybrid
SVM Logic Regression
Model
Sentistrength
Twitter Election
Commissi
on of Pakistan
Twitter API
To understand users political preference
prediction problem on
Twitter network by forming contextual and
behavioral features
based prediction models.
Users political preference was predicted from the knowledge
derived from the content
generated and users behavior during the campaign.
Comparison concludes that
during preelection vocal users are unenthusiastic in their
preference change than silent
users.
31 http://www.webology.org/2017/v14n2/a156.pdf
6 ElTayeby et
al. 2014
Elsevier, Procedia
Computer
Science
Campaig
ning ML EM
User
Reviews Twitter
Social media plays a
vital role in spreading opinions over many
diversified subject
matters. Professionals take advantage of this
data for publicizing
their products and getting user feedback.
Hence, requirement is
to group similar opinions in to a cluster.
This paper inspects effect of
media over grouping of opinions. EM algorithm is
used for segregating opinions.
Grouping of opinion is done specific to certain political
topics and their percentage
media impact is calculated. Future works targets to collect
sentiment from multiple topics
and apply the proposed framework.
7 Panasyuk et al.
2014
Elsevier,
Procedia Computer
Science
Politics ML LDA News articles
Twitter being a
significant social
networking site facilitates the
identification of
communities. Objective is to
recognize conflicting
communities.
This paper investigates the
usage of Twitter for
determining This paper addresses how
Twitter can be used for
identifying conflict among communities users.
Documents were aggregated
based on topic and community followed by applying SA to
find views of every
community on individual topic. Ranking of topic with
opposing views has been done. Future scope is to find
how the difference level is
used for detection of early warning system.
8 Rill et al. 2014
Elsevier,
Knowledge Based
Systems
Politics Concept level
Ontology based
User Reviews
Twitter,
Trends
Social networking sites
are gaining popularity
over the past few years. Various
communication
channels are used for dissemination of
information. Aim is to
recognize that Twitter detects the topics faster
than other information
channels.
PoliTwi, a system is designed
to find upcoming political
topics in Twitter. Identified topics are broadcasted over
various channels. Results
were compared with Google Trends and results reflects
Twitter find topics earlier than
other communication channel.
9 Kagan et al. 2015
IEEE
Intelligent Systems
Politics Hybrid
SVM
classifier + Statistical
sentiment
diffusion model
India
Election
Tweet Database(
IET-DB)
-
Requirement is to
forecast the process of
spreading favor or denial toward a
candidate on Twitter
by implementing sentiment diffusion
forecasting.
Sentiment score is calculated
for each tweet to analyze
expressed sentiment Automatic learning is done for
diffusion estimation model in
order to find how a phenomenon diffuses over
network.
10 Mohammad et al.
2015
Elsevier,
Information & Processing
Management
Voting Advisor
ML SVM
Question
naires, User
Reviews
Amazon, Twitter
Social media sites
facilitate the political parties, candidates and
other stakeholders to make their views
accessible to public
during elections. Manual analysis of
opinions is difficult to
acquire an overall outlook of various
posts relevant to a
particular event. Hence, aim is to
automate the process
of analyzing opinions in electoral tweets.
In this paper, automatic
classifiers have been developed to identify emotion
and purpose labels in electoral tweets for 2012 US
presidential election.
32 http://www.webology.org/2017/v14n2/a156.pdf
11 Cheng et al. 2016 Applied Intelligence
Politics ML Recurrent NN
Micro
blogging
Websites
Twitter Sina Weibo
Requirement is to
implement sentiment parsing based methods
in opinion polling on
Chinese public figures using microblogs.
This paper uses RNN based
sequence labeling method for opinion target labeling and
their corresponding
sentiments. Result concludes that jointly representing
opinion targets and sentiments
make more sense in sentiment parsing of public figures in
microblogs.
12 Haselmeyer 2016 Quality &
Quantity Politics
Lexico
n
Dictionary
based
SentiWS
[22] -
Objective is to build a
negative sentiment dictionary by gathering
fine-grained sentiment
scores through crowdcoding.
In this paper a German
political language dictionary has been created specifically
for party statements and
media reports. New sentiment dictionary can be used in two
applications: negative
campaigning by parties and media tone.
13 Tayal et al. 2016 AI & Society Campaig
ning ML unigrams Twitter -
A sentiment analysis
tool is required for the estimation of success
rate of campaigns.
In this paper a detailed
analysis of Swachh Bharat Abhiyan has been computed
by the proposed tool which
extracts data from twitter. The tool computes weekly and
monthly analysis of campaign
related tweets.
14 Adedoyin-
Olowe et al. 2016
Elsevier, Expert
Systems with
Applications
Politics ML
Apriori,
Association Rule
Limimg,
TRCM algo proposed
Sports
and Politics
Huge amount of public
opinion over social
media like twitter can be utilized for
computation of
detection and tracking of related topics. Aim
is to apply topic
detection and tracking methods for extraction
of valuable content..
In this paper Transaction-
based Rule Change Mining is
applied two domains- sports (The English FA Cup 2012)
and politics (US Presidential
Elections 2012 and Super Tuesday 2012). It
performed better on dataset in
the sports domain.
15 Gull et al. 2016
Elsevier,
Procedia Computer
Science
Politics ML NB, SVM User Reviews
Objective is to use
linguistic analysis and opinion classifiers for
simplifying opinion
mining approach in order to determine
positive, negative and
neutral sentiments for Pakistan political
parties.
In this paper, information
from tweets has been extracted by preprocessing of
raw data. Comparison of SVM
and Naive Bayes has been done for tweets
classification where SVM
outperforms NB.
16 Singh et al. 2017 ICT Express
Advocacy &
Policy
Making
ML
Statistical
document classificatio
n with rule-
based
filtering
User
Reviews Twitter
Public opinion on social media can act as
a magnitude of
acceptance or rejection of certain ordinances.
Requirement is to
gauge sentiment of
public for the
government policy of
demonetization.
Indian government has implemented a
demonetization policy and
opinion mining concludes the favorable feedback of the
public for the same.
17 O'Leary, D. E.
2016
Elsevier,
Decision Support
Systems
Advocac
y & Policy
Making
ML
Decision
Trees, Regression
Analysis
Canada's
Digital
Compass
Sept–Oct
2014 from PWC's web
site
http://pwc-compass.ch
aordix.com/
Objective is to analyze
the data of a crowd
sourcing study to extract Canada's digital
future.
In this paper, relationship of
number of votes and number
of innovated comments has been validated with respect to
another case study. They
found that the number of votes is related positively with the
number of words and
magnitude of idea.
33 http://www.webology.org/2017/v14n2/a156.pdf
The following charts in figure 10, 11 and 12 represent the percentage of work done in various
field actions of government intelligence and the percentage use of opinion mining techniques in
these action areas.
Figure 10. % work done in GI
RQ3 Which (big) data sets are used for opinion mining in government intelligence?
Answer From the above literature review following are the big data sets used for opinion mining
in government intelligence:
User Reviews from CNET, Twitter, Google
Trends, Election India Tweet
News Articles from Newspapers, Twitter
Twitter API Preference Matcher Resort Consortium
Questionnaire from Amazon, Twitter Micro blogging sites Twitter Sina, Weibo
MyGov.in Facebook
Citymis application Sports and politics from Twitter
RQ4 Which techniques of opinion mining have been used in advocacy so far?
Answer Extensive literature review revealed that machine learning and hybrid techniques have
been used in the area of advocacy. Semantic Role Labelling, GATE Tool, Rule based method,
Naive Bayes, Corpus Based, Statistical document classification with rule-based filtering are
some of the approaches applied for opinion mining in advocacy so far.
34 http://www.webology.org/2017/v14n2/a156.pdf
Machine Learning
65%Lexicon based
6%
Hybrid23%
Concept Level6%
OM Techniques used in GI(%)
Figure 11. % work done in GI with OM techniques
2011 2012 2013 2014 2015 2016 2017
Year wise Number of Publication(s) in GI
1 1 1 5 2 6 1
0
1
2
3
4
5
6
7
Year wise Number of Publication(s) in GI
Figure 12. Year wise GI Publications
Conclusion
This paper presents a systematic, comprehensive review on the research work done in different
application areas of opinion mining during 2011-2017.The application areas are classified into
six broad domains i.e. Government Intelligence, Market Intelligence, Information and Security
Analysis, Sub Component Technology, Business Intelligence and Smart Society Services. Some
important conclusions have been drawn by answering identified research questions. From the
literature it is clear that Market Intelligence is the major application area of opinion mining while
GI and BI are the least explored ones. Machine learning techniques are mostly used in GI so far
with the maximum number of publication in GI is in the year 2016 till now. Certain research
gaps which are identified in this study are:
35 http://www.webology.org/2017/v14n2/a156.pdf
o To understand public opinion and sentiments by a human or machine is a difficult job due
to intricate and ambiguous nature of their verbalization. Therefore, automation of
sentiment extraction from text is a big challenge.
o User generated content are usually briefly written and vernacular in nature with
grammatical and typo mistakes. Therefore, opinion mining in such content is
comparatively difficult than regular texts.
o Due to domain specific characteristics of data, many sentimental words are not used for
expressing opinions. An expression could be positive or neutral in nature although
negative words are used.
o Affordability of opinion mining software are only with organizations and government and
not by citizens due to which content analysis has not been democratized.
o Few efforts have been reported in leveraging big social data for intelligent opinion
mining which is a dynamic field with problems stemming either from the velocity and
volume of data or the variety in application domain.
o There is no standard model or framework exists till date which incorporates opinion
mining in any action field(s) of government intelligence.
o Detection of Early Warning Systems & Advocacy Evaluation & Monitoring are the two
action field(s) which have been least explored in the area of government intelligence.
The study has certain limitations, specifically the restriction of covering four electronic databases
with a manual search of high quality journal or conference papers only. Future study will try to
cover more papers by automating searches.
References
Adedoyin-Olowe, M., Gaber, M. M., Dancausa, C. M., Stahl, F., & Gomes, J. B. (2016). A rule
dynamics approach to event detection in Twitter with its application to sports and politics.
Expert Systems with Applications, 55, 351-360.
Alahmadi, D. H., & Zeng, X. J. (2015). ISTS: Implicit social trust and sentiment based approach to
recommender systems. Expert Systems with Applications, 42(22), 8840-8849.
Ali, F., Kim, E. K., & Kim, Y. G. (2015). Type-2 fuzzy ontology-based opinion mining and information
extraction: A proposal to automate the hotel reservation system. Applied Intelligence, 42(3),
481-500.
Ali, F., Kwak, D., Khan, P., Islam, S. R., Kim, K. H., & Kwak, K. S. (2017). Fuzzy ontology-based
sentiment analysis of transportation and city feature reviews for safe traveling. Transportation
Research Part C: Emerging Technologies, 77, 33-48.
Ali, F., Kwak, K. S., & Kim, Y. G. (2016). Opinion mining based on fuzzy domain ontology and
Support Vector Machine: A proposal to automate online review classification. Applied Soft
Computing, 47, 235-250.
Aliandu, P. (2015). Sentiment Analysis to Determine Accommodation, Shopping and Culinary
Location on Foursquare in Kupang City. Procedia Computer Science, 72, 300-305.
36 http://www.webology.org/2017/v14n2/a156.pdf
Amarouche, K., Benbrahim, H., & Kassou, I. (2015). Product Opinion Mining for Competitive
Intelligence. Procedia Computer Science, 73, 358-365.
Anand, D., & Naorem, D. (2016). Semi-supervised Aspect Based Sentiment Analysis for Movies
Using Review Filtering. Procedia Computer Science, 84, 86-93.
Anooj, P. K. (2012). Clinical decision support system: Risk level prediction of heart disease using
weighted fuzzy rules. Journal of King Saud University-Computer and Information Sciences,
24(1), 27-40.
Arputhamary, B., & Jayapriya, V. (2016). Interpreting the public sentiment variations on twitter using
virtual shuffling for data movement in big data. International Journal of Control Theory and
Applications, 9(26), 389-395
Asghar, M. Z., Ahmad, S., Qasim, M., Zahra, S. R., & Kundi, F. M. (2016). SentiHealth: creating
health-related sentiment lexicon using hybrid approach. SpringerPlus, 5(1), 1139.
Bafna, K., & Toshniwal, D. (2013). Feature based summarization of customers’ reviews of online
products. Procedia Computer Science, 22, 142-151.
Bagheri, A., Saraee, M., & De Jong, F. (2013). Care more about customers: unsupervised domain-
independent aspect detection for sentiment analysis of customer reviews. Knowledge-Based
Systems, 52, 201-213.
Bai, X. (2011). Predicting consumer sentiments from online text. Decision Support Systems, 50(4),
732-742.
Balahur, A., Kabadjov, M., Steinberger, J., Steinberger, R., & Montoyo, A. (2012). Challenges and
solutions in the opinion summarization of user-generated content. Journal of Intelligent
Information Systems, 39(2), 375-398.
Basari, A. S. H., Hussin, B., Ananta, I. G. P., & Zeniarja, J. (2013). Opinion mining of movie review
using hybrid method of support vector machine and particle swarm optimization. Procedia
Engineering, 53, 453-462.
Bastı, E., Kuzey, C., & Delen, D. (2015). Analyzing initial public offerings' short-term performance
using decision trees and SVMs. Decision Support Systems, 73, 15-27.
Bhargava, R., Sharma, Y., & Sharma, G. (2016). ATSSI: Abstractive Text Summarization Using
Sentiment Infusion. Procedia Computer Science, 89, 404-411.
Bilici, E., & Saygın, Y. (2017). Why do people (not) like me?: Mining opinion influencing factors
from reviews. Expert Systems with Applications, 68, 185-195.
Bobicev, V., Sokolova, M., & Oakes, M. (2015). What goes around comes around: learning sentiments
in online medical forums. Cognitive Computation, 7(5), 609-621.
Bogdanova, D., Rosso, P., & Solorio, T. (2014). Exploring high-level features for detecting
cyberpedophilia. Computer Speech and Language, 28(1), 108-120.
Bridge, D., & Healy, P. (2012). The GhostWriter-2.0 case-based reasoning system for making content
suggestions to the authors of product reviews. Knowledge-Based Systems, 29, 93-103.
Bucur, C. (2015). Using Opinion Mining Techniques in Tourism. Procedia Economics and Finance,
23, 1666-1673.
Bui, N., Yen, J., & Honavar, V. (2016). Temporal Causality Analysis of Sentiment Change in a Cancer
Survivor Network. IEEE Transactions on Computational Social Systems, 3(2), 75-87.
37 http://www.webology.org/2017/v14n2/a156.pdf
Cambria, E., Benson, T., Eckl, C., & Hussain, A. (2012). Sentic PROMs: Application of sentic
computing to the development of a novel unified framework for measuring health-care quality.
Expert Systems with Applications, 39(12), 10533-10543.
Cambria, E., Schuller, B., Xia, Y., & Havasi, C. (2013). New avenues in opinion mining and sentiment
analysis. IEEE Intelligent Systems, 28(2), 15-21.
Cameron, D., Smith, G. A., Daniulaityte, R., Sheth, A. P., Dave, D., Chen, L., & Falck, R. (2013).
PREDOSE: a semantic web platform for drug abuse epidemiology using social media. Journal
of Biomedical Informatics, 46(6), 985-997.
Cao, J., Zeng, K., Wang, H., Cheng, J., Qiao, F., Wen, D., & Gao, Y. (2014). Web-based traffic
sentiment analysis: Methods and applications. IEEE transactions on Intelligent Transportation
systems, 15(2), 844-853
Cao, Q., Duan, W., & Gan, Q. (2011). Exploring determinants of voting for the “helpfulness” of online
user reviews: A text mining approach. Decision Support Systems, 50(2), 511-521.
Capdevila, J., Arias, M., & Arratia, A. (2016). GeoSRS: A hybrid social recommender system for
geolocated data. Information Systems, 57, 111-128.
Cataldi, M., Ballatore, A., Tiddi, I., & Aufaure, M. A. (2013). Good location, terrible food: detecting
feature sentiment in user-generated reviews. Social Network Analysis and Mining, 3(4), 1149-
1163.
Chan, S. W., & Chong, M. W. (2017). Sentiment analysis in financial texts. Decision Support Systems,
94, 53-64.
Chen, C. C., & Tseng, Y. D. (2011). Quality evaluation of product reviews using an information
quality framework. Decision Support Systems, 50(4), 755-768.
Chen, K., Luo, P., Xu, D., & Wang, H. (2016). The dynamic predictive power of company comparative
networks for stock sector performance. Information & Management, 53(8), 1006-1019.
Chen, L., & Wang, F. (2013). Preference-based clustering reviews for augmenting e-commerce
recommendation. Knowledge-Based Systems, 50, 44-59.
Chen, L., Qi, L., & Wang, F. (2012). Comparison of feature-level learning methods for mining online
consumer reviews. Expert Systems with Applications, 39(10), 9588-9601.
Chen, X., Vorvoreanu, M., & Madhavan, K. (2014). Mining social media data for understanding
students’ learning experiences. IEEE Transactions on Learning Technologies, 7(3), 246-259.
Cheng, J., Zhang, X., Li, P., Zhang, S., Ding, Z., & Wang, H. (2016). Exploring sentiment parsing of
microblogging texts for opinion polling on Chinese public figures. Applied Intelligence, 45(2),
429-442.
Cheng, V. C., Leung, C. H., Liu, J., & Milani, A. (2014). Probabilistic aspect mining model for drug
reviews. IEEE Transactions on Knowledge and Data Engineering, 26(8), 2002-2013.
Cho, H., Kim, S., Lee, J., & Lee, J. S. (2014). Data-driven integration of multiple sentiment
dictionaries for lexicon-based sentiment classification of product reviews. Knowledge-Based
Systems, 71, 61-71.
Chua, A. Y., & Banerjee, S. (2016). Helpfulness of user-generated reviews as a function of review
sentiment, product type and information quality. Computers in Human Behavior, 54, 547-554.
Chung, W., & Tseng, T. L. B. (2012). Discovering business intelligence from online product reviews:
A rule-induction framework. Expert Systems with Applications, 39(15), 11870-11879.
38 http://www.webology.org/2017/v14n2/a156.pdf
Colace, F., De Santo, M., Greco, L., Moscato, V., & Picariello, A. (2015). A collaborative user-
centered framework for recommending items in Online Social Networks. Computers in Human
Behavior, 51, 694-704.
Cosma, G., & Acampora, G. (2016). A computational intelligence approach to efficiently predicting
review ratings in e-commerce. Applied Soft Computing, 44, 153-162.
Coussement, K., Benoit, D. F., & Antioco, M. (2015). A Bayesian approach for incorporating expert
opinions into decision support systems: A case study of online consumer-satisfaction detection.
Decision Support Systems, 79, 24-32.
Cruz, F. L., Troyano, J. A., Enríquez, F., Ortega, F. J., & Vallejo, C. G. (2013). ‘Long autonomy or long
delay?’ The importance of domain in opinion mining. Expert Systems with Applications, 40(8),
3174-3184.
D’Avanzo, E., & Pilato, G. (2015). Mining social network users' opinions to aid buyers’ shopping
decisions. Computers in Human Behavior, 51, 1284-1294.
Dai, H., & Hao, J. (2017). Mining social media data on marijuana use for post traumatic stress
disorder. Computers in Human Behavior, 70, 282-290.
Daniel, M., Neves, R. F., & Horta, N. (2017). Company event popularity for financial markets using
Twitter and sentiment analysis. Expert Systems with Applications, 71, 111-124.
Dařena, F., Žižka, J., & Přichystal, J. (2014). Clients’ freely written assessment as the source of
automatically mined opinions. Procedia Economics and Finance, 12, 103-110.
Dave, K., Lawrence, S., & Pennock, D. M. (2003, May). Mining the peanut gallery: Opinion
extraction and semantic classification of product reviews. In Proceedings of the 12th
international conference on World Wide Web (pp. 519-528). ACM.
De Fortuny, E. J., De Smedt, T., Martens, D., & Daelemans, W. (2012). Media coverage in times of
political crisis: A text mining approach. Expert Systems with Applications, 39(14), 11616-11622.
de Oña, R., López, G., de los Rios, F. J. D., & de Oña, J. (2014). Cluster analysis for diminishing
heterogeneous opinions of service quality public transport passengers. Procedia-Social and
Behavioral Sciences, 162, 459-466.
Dehkharghani, R., Mercan, H., Javeed, A., & Saygin, Y. (2014). Sentimental causal rule discovery
from Twitter. Expert Systems with Applications, 41(10), 4950-4958.
Digital India – A programme to transform India into digital empowered society and knowledge
economy. (2014, August 20). Retrieved August 20, 2017, from
http://pib.nic.in/newsite/PrintRelease.aspx?relid=108926
Digital India: Transforming India into a Knowledge Economy. Retrieved August 20, 2017, from
http://www.makeinindia.com/article/-/v/digital-india-transforming-india-into-a-knowledge-
economy
Ding, Z., Liu, Z., Zhang, Y., & Long, R. (2017). The contagion effect of international crude oil price
fluctuations on Chinese stock market investor sentiment. Applied Energy, 187, 27-36.
Djouvas, C., Mendez, F., & Tsapatsoulis, N. (2016). Mining online political opinion surveys for
suspect entries: An interdisciplinary comparison. Journal of Innovation in Digital Ecosystems,
3(2), 172-182.
39 http://www.webology.org/2017/v14n2/a156.pdf
Dong, R., O’Mahony, M. P., Schaal, M., McCarthy, K., & Smyth, B. (2016). Combining similarity and
sentiment in opinion mining for product recommendation. Journal of Intelligent Information
Systems, 46(2), 285-312.
Du, J., Xu, J., Song, H., Liu, X., & Tao, C. (2017). Optimization on machine learning based
approaches for sentiment analysis on HPV vaccines related tweets. Journal of Biomedical
Semantics, 8(1), 9.
ElTayeby, O., Molnar, P., & George, R. (2014). Measuring the Influence of Mass Media on Opinion
Segregation through Twitter. Procedia Computer Science, 36, 152-159.
Esparza, S. G., O’Mahony, M. P., & Smyth, B. (2012). Mining the real-time web: a novel approach to
product recommendation. Knowledge-Based Systems, 29, 3-11.
Fan, M., & Wu, G.. (2012). Opinion Summarization of Customer Comments. Physics Procedia, 24,
2220-2226.
Fan, T. K., & Chang, C. H. (2011). Blogger-centric contextual advertising. Expert Systems with
Applications, 38(3), 1777-1788.
Fernández-Martínez, F., García, A. H., & de María, F. D. (2015). Succeeding metadata based
annotation scheme and visual tips for the automatic assessment of video aesthetic quality in car
commercials. Expert Systems with Applications, 42(1), 293-305.
Feuerriegel, S., & Prendinger, H. (2016). News-based trading strategies. Decision Support Systems,
90, 65-74.
García-Cumbreras, M. Á., Montejo-Ráez, A., & Díaz-Galiano, M. C. (2013). Pessimists and optimists:
Improving collaborative filtering through sentiment analysis. Expert Systems with Applications,
40(17), 6758-6765.
Geva, T., & Zahavi, J. (2014). Empirical evaluation of an automated intraday stock recommendation
system incorporating both market data and textual news. Decision Support Systems, 57, 212-
223.
Ghose, A., & Ipeirotis, P. G. (2011). Estimating the helpfulness and economic impact of product
reviews: Mining text and reviewer characteristics. IEEE Transactions on Knowledge and Data
Engineering, 23(10), 1498-1512.
Grant-Muller, S. M., Gal-Tzur, A., Minkov, E., Nocera, S., Kuflik, T., & Shoor, I. (2014). Enhancing
transport data collection through social media sources: methods, challenges and opportunities
for textual data. IET Intelligent Transport Systems, 9(4), 407-417.
Gui, L., Zhou, Y., Xu, R., He, Y., & Lu, Q. (2017). Learning representations from heterogeneous
network for sentiment classification of product reviews. Knowledge-Based Systems, 124, 34-45.
Gull, R., Shoaib, U., Rasheed, S., Abid, W., & Zahoor, B. (2016). Pre processing of twitter's data for
opinion mining in political context. Procedia Computer Science, 96, 1560-1570.
Guo, K., Sun, Y., & Qian, X. (2017). Can investor sentiment be used to predict the stock price?
Dynamic analysis based on China stock market. Physica A: Statistical Mechanics and its
Applications, 469, 390-396.
Gurini, D. F., Gasparetti, F., Micarelli, A., & Sansonetti, G. (2018). Temporal people-to-people
recommendation on social networks with sentiment-based matrix factorization. Future
Generation Computer Systems, 78(1), 430-439.
40 http://www.webology.org/2017/v14n2/a156.pdf
Haddi, E., Liu, X., & Shi, Y. (2013). The role of text pre-processing in sentiment analysis. Procedia
Computer Science, 17, 26-32.
Haselmayer, M., & Jenny, M. Sentiment analysis of political communication: combining a dictionary
approach with crowdcoding. Quality & Quantity, 1-24.
He, W., Wu, H., Yan, G., Akula, V., & Shen, J. (2015). A novel social media competitive analytics
framework with sentiment benchmarks. Information & Management, 52(7), 801-812.
He, Y., & Zhou, D. (2011). Self-training from labeled features for sentiment analysis. Information
Processing & Management, 47(4), 606-616.
Hu, Y. H., Chen, K., & Lee, P. J. (2017a). The effect of user-controllable filters on the prediction of
online hotel reviews. Information & Management, 54(6), 728-744.
Hu, Y. H., Chen, Y. L., & Chou, H. L. (2017b). Opinion mining from online hotel reviews–A text
summarization approach. Information Processing & Management, 53(2), 436-449.
Huh, J., Yetisgen-Yildiz, M., & Pratt, W. (2013). Text classification for assisting moderators in online
health communities. Journal of Biomedical Informatics, 46(6), 998-1005.
Hur, M., Kang, P., & Cho, S. (2016). Box-office forecasting based on sentiments of movie reviews and
Independent subspace method. Information Sciences, 372, 608-624.
İskender, E., & Batı, G. B. (2015). Comparing Turkish Universities Entrepreneurship and
Innovativeness Index's Rankings with Sentiment Analysis Results on Social Media. Procedia -
Social and Behavioral Sciences, 195, 1543-1552.
James, T. L., Calderon, E. D. V., & Cook, D. F. (2017). Exploring patient perceptions of healthcare
service quality through analysis of unstructured feedback. Expert Systems with Applications, 71,
479-492.
Jang, H. J., Sim, J., Lee, Y., & Kwon, O. (2013). Deep sentiment analysis: Mining the causality
between personality-value-attitude for analyzing business ads in social media. Expert Systems
with applications, 40(18), 7492-7503.
Janowski, T. (2015). Digital government evolution: From transformation to contextualization.
Government Information Quarterly, 32(3), 221-236.
Ji, X., Chun, S. A., Wei, Z., & Geller, J. (2015). Twitter sentiment classification for measuring public
health concerns. Social Network Analysis and Mining, 5(1), 1-25.
Jiang, C., Liang, K., Chen, H., & Ding, Y. (2014). Analyzing market performance via social media: a
case study of a banking industry crisis. Science China Information Sciences, 57(5), 1-18.
Jin, J., Ji, P., & Gu, R. (2016). Identifying comparative customer requirements from product online
reviews for competitor analysis. Engineering Applications of Artificial Intelligence, 49, 61-73.
Kagan, V., Stevens, A., & Subrahmanian, V. S. (2015). Using twitter sentiment to forecast the 2013
pakistani election and the 2014 indian election. IEEE Intelligent Systems, 30(1), 2-5.
Kang, D., & Park, Y. (2014). Review-based measurement of customer satisfaction in mobile service:
Sentiment analysis and VIKOR approach. Expert Systems with Applications, 41(4), 1041-1050.
Kang, H., Yoo, S. J., & Han, D. (2012). Senti-lexicon and improved Naïve Bayes algorithms for
sentiment analysis of restaurant reviews. Expert Systems with Applications, 39(5), 6000-6010.
Kansal, H., & Toshniwal, D. (2014). Aspect based summarization of context dependent opinion words.
Procedia Computer Science, 35, 166-175.
41 http://www.webology.org/2017/v14n2/a156.pdf
Kardan, A. A., & Ebrahimi, M. (2013). A novel approach to hybrid recommendation systems based on
association rules mining for content recommendation in asynchronous discussion groups.
Information Sciences, 219, 93-110.
Katakis, I., Tsapatsoulis, N., Mendez, F., Triga, V., & Djouvas, C. (2014). Social voting advice
applications—definitions, challenges, datasets and evaluation. IEEE Transactions on
Cybernetics, 44(7), 1039-1052.
Khan, A. U. R., Khan, M., & Khan, M. B. (2016). Naïve Multi-label Classification of YouTube
Comments Using Comparative Opinion Mining. Procedia Computer Science, 82, 57-64.
Kitchenham, B., Brereton, O. P., Budgen, D., Turner, M., Bailey, J., & Linkman, S. (2009). Systematic
literature reviews in software engineering–a systematic literature review. Information and
Software Technology, 51(1), 7-15.
Küçük, A. E., Orhan, Z., & Kabil, M. (2014). Psychodroid: A Mobile Psychological Disorder
Detection Application by Dynamic Question Generation and Content Analysis. Procedia -
Social and Behavioral Sciences, 159, 691-696.
Kuflik, T., Minkov, E., Nocera, S., Grant-Muller, S., Gal-Tzur, A., & Shoor, I. (2017). Automating a
framework to extract and analyse transport related social media content: The potential and the
challenges. Transportation Research Part C: Emerging Technologies, 77, 275-291.
Kumar, A., & Joshi, A. (2017, March). Ontology driven sentiment analysis on social web for
government intelligence. In Proceedings of the Special Collection on eGovernment Innovations
in India (pp. 134-139). ACM.
Kumar, A., & Sharma, A. (2016). Paradigm shifts from e-governance to s-governance. The Human
Element of Big Data: Issues, Analytics, and Performance, 213.
Kumar, A., & Sharma, A. (2015). SWOT analysis of requirements engineering for web applications.
Journal of Advances Research in Science and Engineering, 4, 34-43.
Kumar, A., & Teeja, M. S. (2012). Sentiment analysis: A perspective on its past, present and future.
International Journal of Intelligent Systems and Applications, 4(10), 1.
Kurian, N., & Asokan, S. (2015). Summarizing user opinions: A method for labeled-data scarce
product domains. Procedia Computer Science, 46, 93-100.
Lane, P. C., Clarke, D., & Hender, P. (2012). On developing robust models for favourability analysis:
Model choice, feature sets and imbalanced data. Decision Support Systems, 53(4), 712-718.
Law, D., Gruss, R., & Abrahams, A. S. (2017). Automated defect discovery for dishwasher appliances
from online consumer reviews. Expert Systems with Applications, 67, 84-94.
Lee, A. J., Yang, F. C., Chen, C. H., Wang, C. S., & Sun, C. Y. (2016). Mining perceptual maps from
consumer reviews. Decision Support Systems, 82, 12-25.
Lee, S. W., Kim, I., Yoo, J., Park, S., Jeong, B., & Cha, M. (2016). Insights from an expressive writing
intervention on Facebook to help alleviate depressive symptoms. Computers in Human
Behavior, 62, 613-619.
Lei, J., Rao, Y., Li, Q., Quan, X., & Wenyin, L. (2014). Towards building a social emotion detection
system for online news. Future Generation Computer Systems, 37, 438-448.
Lei, X., Qian, X., & Zhao, G. (2016). Rating prediction based on social sentiment from textual
reviews. IEEE Transactions on Multimedia, 18(9), 1910-1921.
42 http://www.webology.org/2017/v14n2/a156.pdf
Li, H., Cui, J., Shen, B., & Ma, J. (2016). An intelligent movie recommendation system through
group-level sentiment analysis in microblogs. Neurocomputing, 210, 164-173.
Li, J., Xu, Z., Yu, L., & Tang, L. (2016). Forecasting oil price trends with sentiment of online news
articles. Procedia Computer Science, 91, 1081-1087.
Li, Q., Jin, Z., Wang, C., & Zeng, D. D. (2016). Mining opinion summarizations using convolutional
neural networks in Chinese microblogging systems. Knowledge-Based Systems, 107, 289-300.
Li, Q., Wang, T., Gong, Q., Chen, Y., Lin, Z., & Song, S. K. (2014). Media-aware quantitative trading
based on public Web information. Decision Support Systems, 61, 93-105.
Li, S., Ming, Z., Leng, Y., & Guo, J. (2016). Product ranking using hierarchical aspect structures.
Journal of Intelligent Information Systems, 1-22
Li, X., Xie, H., Chen, L., Wang, J., & Deng, X. (2014). News impact on stock price return via
sentiment analysis. Knowledge-Based Systems, 69, 14-23.
Liu, C. L., Hsaio, W. H., Lee, C. H., Lu, G. C., & Jou, E. (2012). Movie rating and review
summarization in mobile environment. IEEE Transactions on Systems, Man, and Cybernetics,
Part C (Applications and Reviews), 42(3), 397-407.
Liu, L. Z., Liu, H., Wang, H. S., Song, W., & Zhao, X. L. (2015). Generating domain-specific affective
ontology from Chinese reviews for sentiment analysis. Journal of Shanghai Jiaotong University
(Science), 20, 32-37.
Liu, R., An, Y., & Song, L. (2011). Chinese multi-document summarization based on opinion
similarity. Procedia Engineering, 15, 2026-2031.
Liu, Y., Jin, J., Ji, P., Harding, J. A., & Fung, R. Y. (2013). Identifying helpful online reviews: a
product designer’s perspective. Computer-Aided Design, 45(2), 180-194.
Lu, Y. (2013). Automatic topic identification of health-related messages in online health community
using text classification. SpringerPlus, 2(1), 309.
Luo, B., Zeng, J., & Duan, J. (2016). Emotion space model for classifying opinions in stock message
board. Expert Systems with Applications, 44, 138-146.
Lv, X., & El-Gohary, N. (2016). Text Analytics for Supporting Stakeholder Opinion Mining for Large-
scale Highway Projects. Procedia Engineering, 145, 518-524.
Ma, X., Lei, X., Zhao, G., & Qian, X. Rating prediction by exploring user’s preference and sentiment.
Multimedia Tools and Applications, 1-20.
Ma, Y., Chen, G., & Wei, Q. (2017). Finding users preferences from large-scale online reviews for
personalized recommendation. Electronic Commerce Research, 17 (1), 3-29.
Makazhanov, A., Rafiei, D., & Waqar, M. (2014). Predicting political preference of Twitter users.
Social Network Analysis and Mining, 4(1), 1-15.
Malik, H., & Shakshuki, E. M. (2016). Mining Collective Opinions for Comparison of Mobile Apps.
Procedia Computer Science, 94, 168-175.
Manaman, H. S., Jamali, S., & AleAhmad, A. (2016). Online reputation measurement of companies
based on user-generated content in online social networks. Computers in Human Behavior, 54,
94-100.
Manek, A. S., Shenoy, P. D., Mohan, M. C., & Venugopal, K. R. (2017). Aspect term extraction for
sentiment analysis in large movie reviews using Gini Index feature selection method and SVM
classifier. World Wide Web, 20(2), 135-154.
43 http://www.webology.org/2017/v14n2/a156.pdf
Mcnulty E. (2014, May 22). Understanding big data: The seven v's. Retrieved August 20, 2017, from
http://dataconomy.com/seven-vs-big-data/
Medhat, W., Hassan, A., & Korashy, H. (2014). Sentiment analysis algorithms and applications: A
survey. Ain Shams Engineering Journal, 5(4), 1093-1113.
Min, H. J., & Park, J. C. (2012). Identifying helpful reviews based on customer’s mentions about
experiences. Expert Systems with Applications, 39(15), 11830-11838.
Mohammad,S. M., Zhu, X., Kiritchenko, S., & Martin, J. (2015). Sentiment, emotion, purpose, and
style in electoral tweets. Information Processing &Management, 51(4), 480-499.
Moreo, A., Romero, M., Castro, J. L., & Zurita, J. M. (2012). Lexicon-based comments-oriented news
sentiment analyzer system. Expert Systems with Applications, 39(10), 9166-9180.
Mostafa, M. M. (2013). An emotional polarity analysis of consumers’ airline service tweets. Social
Network Analysis and Mining, 3(3), 635-649.
Mostafa, M. M. (2013). More than words: Social networks’ text mining for consumer brand
sentiments. Expert Systems with Applications, 40(10), 4241-4251.
Nath, V. (2003). Digital governance models: Moving towards good governance in developing
countries. E-commerce, 2(4). Retrieved August 20, 2017, from
http://www.amarc.org/documents/articles/nath-digital.pdf
Nayak, A., Pai, M. M., & Pai, R. M. (2016). Prediction models for indian stock market. Procedia
Computer Science, 89, 441-449.
Neidhardt, J., Rümmele, N., & Werthner, H. (2017). Predicting happiness: user interactions and
sentiment analysis in an online travel forum. Information Technology & Tourism, Information
Technology & Tourism, 17(1), 101–119.
Nguyen, T. H., Shirai, K., & Velcin, J. (2015). Sentiment analysis on social media for stock movement
prediction. Expert Systems with Applications, 42(24), 9603-9611.
Nguyen, T., Phung, D., Dao, B., Venkatesh, S., & Berk, M. (2014). Affective and content analysis of
online depression communities. IEEE Transactions on Affective Computing, 5(3), 217-226.
O'Leary, D. E. (2016). On the relationship between number of votes and sentiment in crowdsourcing
ideas and comments for innovation: A case study of Canada's digital compass. Decision Support
Systems, 88, 28-37.
Ortigosa, A., Martín, J. M., & Carro, R. M. (2014). Sentiment analysis in Facebook and its application
to e-learning. Computers in Human Behavior, 31, 527-541.
Osimo, D., & Mureddu, F. (2012). Research challenge on opinion mining and sentiment analysis.
Universite de Paris-Sud, Laboratoire LIMSI-CNRS.
Pai, M. Y., Chu, H. C., Wang, S. C., & Chen, Y. M. (2013). Electronic word of mouth analysis for
service experience. Expert Systems with Applications, 40(6), 1993-2006.
Panasyuk, A., Yu, E. S. L., & Mehrotra, K. G. (2014). Controversial topic discovery on members of
congress with twitter. Procedia Computer Science, 36, 160-167.
Park, Y., & Lee, S. (2011). How to design and utilize online customer center to support new product
concept generation. Expert Systems with Applications, 38(8), 10638-10647.
Parkhe, V., & Biswas, B. (2016). Sentiment analysis of movie reviews: finding most important movie
aspects using driving factors. Soft Computing, 20(9), 3373-3379.
44 http://www.webology.org/2017/v14n2/a156.pdf
Peleja, F., Dias, P., Martins, F., & Magalhães, J. (2013). A recommender system for the TV on the web:
integrating unrated reviews and movie ratings. Multimedia Systems, 19(6), 543-558.
Pröllochs, N., Feuerriegel, S., & Neumann, D. (2016). Negation scope detection in sentiment analysis:
Decision support for news-driven trading. Decision Support Systems, 88, 67-75.
Qiang, J. P., Chen, P., Ding, W., Xie, F., & Wu, X. (2016). Multi-document summarization using
closed patterns. Knowledge-Based Systems, 99, 28-38.
Qiao, Y. N., Yong, Q., & Di, H. (2011). Tensor Field Model for higher-order information retrieval.
Journal of Systems and Software, 84(12), 2303-2313.
Ravi, K., & Ravi, V. (2015). A survey on opinion mining and sentiment analysis: Tasks, approaches
and applications. Knowledge-Based Systems, 89, 14-46.
Remus, R., Quasthoff, U., & Heyer, G. (2010, May). SentiWS-A Publicly Available German-language
Resource for Sentiment Analysis. In: Proceedings of the 7th International Language Resources
and Evaluation (LREC'10), pp. 1168-1171.
Reyes, A., & Rosso, P. (2012). Making objective decisions from subjective data: Detecting irony in
customer reviews. Decision Support Systems, 53(4), 754-760.
Rill, S., Reinel, D., Scheidt, J., & Zicari, R. V. (2014). Politwi: Early detection of emerging political
topics on twitter and the impact on concept-level sentiment analysis. Knowledge-Based Systems,
69, 24-33.
Rosa, R. L., Rodriguez, D. Z., & Bressan, G. (2015). Music recommendation system based on user's
sentiments extracted from social networks. IEEE Transactions on Consumer Electronics, 61(3),
359-367.
Rossetti, M., Stella, F., & Zanker, M. (2016). Analyzing user reviews in tourism with topic models.
Information Technology & Tourism, 16(1), 5-21.
Rout, J. K., Dalmia, A., Choo, K. K. R., Bakshi, S., & Jena, S. K. (2017). Revisiting semi-supervised
learning for online deceptive review detection. IEEE Access, 5(1), 1319-1327.
Sanchez-Monzon, J., Putzke, J., & Fischbach, K. (2011). Automatic generation of product association
networks using latent dirichlet allocation. Procedia-Social and Behavioral Sciences, 26, 63-75.
Sandoval, J., & Hernández, G. (2015). Computational visual analysis of the order book dynamics for
creating high-frequency foreign exchange trading strategies. Procedia Computer Science, 51,
1593-1602.
Scharkow, M. (2013). Thematic content analysis using supervised machine learning: An empirical
evaluation using German online news. Quality & Quantity, 47(2), 761-773.
Schumaker, R. P., Zhang, Y., Huang, C. N., & Chen, H. (2012). Evaluating sentiment in financial news
articles. Decision Support Systems, 53(3), 458-464.
Shen, C. W., & Kuo, C. J. (2015). Learning in massive open online courses: Evidence from social
media mining. Computers in Human Behavior, 51, 568-577.
Singh, P., Sawhney, R. S., & Kahlon, K. S. (2017). Sentiment analysis of demonetization of 500 &
1000 rupee banknotes by Indian government. ICT Express.
https://doi.org/10.1016/j.icte.2017.03.001
Smailović, J., Grčar, M., Lavrač, N., & Žnidaršič, M. (2014). Stream-based active learning for
sentiment analysis in the financial domain. Information Sciences, 285, 181-203.
45 http://www.webology.org/2017/v14n2/a156.pdf
Sohail, S. S., Siddiqui, J., & Ali, R. (2015). OWA based book recommendation technique. Procedia
Computer Science, 62, 126-133.
Song, B., Lee, C., Yoon, B., & Park, Y. (2016). Diagnosing service quality using customer reviews: An
index approach based on sentiment and gap analyses. Service Business, 10(4), 775-798.
Sun, J., Wang, G., Cheng, X., & Fu, Y. (2015). Mining affective text to improve social media item
recommendation. Information Processing & Management, 51(4), 444-457.
Tayal, D. K., & Yadav, S. K. (2016). Sentiment analysis on social campaign “Swachh Bharat Abhiyan”
using unigram method. AI & Society, 1-13.
Teodorescu, H. N. (2015). Using analytics and social media for monitoring and mitigation of social
disasters. Procedia Engineering, 107, 325-334.
The Digital Governance Initiative. (n.d.). Digital Governance Concept. Retrieved August 20, 2017,
from http://www.digitalgovernance.org/index.php/concept
Tsai, F. S., & Kwee, A. T. (2011). Database optimization for novelty mining of business blogs. Expert
Systems with Applications, 38(9), 11040-11047.
Tsai, M. F., & Wang, C. J. (2017). On the risk prediction and analysis of soft information in finance
reports. European Journal of Operational Research, 257(1), 243-250.
Tsirakis, N., Poulopoulos, V., Tsantilas, P., & Varlamis, I. (2017). Large scale opinion mining for
social, news and blog data. Journal of Systems and Software, 127, 237-248.
United Nations Human Rights: Office of the High Commissioner. (n.d.). Good Governance and
Human Rights. Retrieved August 20, 2017, from
http://www.ohchr.org/EN/Issues/Development/GoodGovernance/Pages/GoodGovernanceIndex.aspxm
Valsamidis, S., Theodosiou, T., Kazanidis, I., & Nikolaidis, M. (2013). A framework for opinion
mining in blogs for agriculture. Procedia Technology, 8, 264-274.
Van de Kauter, M., Breesch, D., & Hoste, V. (2015). Fine-grained analysis of explicit and implicit
sentiment in financial news articles. Expert Systems with Applications, 42(11), 4999-5010.
Vinodhini, G., & Chandrasekaran, R. M. (2014). Measuring the quality of hybrid opinion mining
model for e-commerce application. Measurement, 55, 101-109.
Vinodhini, G., & Chandrasekaran, R. M. (2014). Opinion mining using principal component analysis
based ensemble model for e-commerce application. CSI transactions on ICT, 2(3), 169-179
Vinodhini, G., & Chandrasekaran, R. M. (2016). A comparative performance evaluation of neural
network based approach for sentiment classification of online reviews. Journal of King Saud
University-Computer and Information Sciences, 28(1), 2-12.
Wang, D., & Liu, Y. (2015). Opinion summarization on spontaneous conversations. Computer Speech
& Language, 34(1), 61-82
Wang, D., Zhu, S., & Li, T. (2013). SumView: A Web-based engine for summarizing product reviews
and customer opinions. Expert Systems with Applications, 40(1), 27-33.
Wang, T., Cai, Y., Leung, H. F., Lau, R. Y., Li, Q., & Min, H. (2014). Product aspect extraction
supervised with online domain knowledge. Knowledge-Based Systems, 71, 86-100.
Wang, Y. (2017). Stock market forecasting with financial micro-blog based on sentiment and time
series analysis. Journal of Shanghai Jiaotong University (Science), 22(2), 173-179.
46 http://www.webology.org/2017/v14n2/a156.pdf
Wei, C. P., Lin, W. B., Chen, H. C., An, W. Y., & Yeh, W. C. (2015). Finding experts in online forums
for enhancing knowledge sharing and accessibility. Computers in Human Behavior, 51, 325-
335.
Wikipedia contributors. (2017). Advocacy. In Wikipedia, The Free Encyclopedia. Retrieved August 20,
2017, from https://en.wikipedia.org/wiki/Advocacy
Wiley, M. T., Jin, C., Hristidis, V., & Esterling, K. M. (2014). Pharmaceutical drugs chatter on online
social networks. Journal of Biomedical Informatics, 49, 245-254.
Wöllmer, M., Weninger, F., Knaup, T., Schuller, B., Sun, C., Sagae, K., & Morency, L. P. (2013).
Youtube movie reviews: Sentiment analysis in an audio-visual context. IEEE Intelligent
Systems, 28(3), 46-53.
Wu, D. D., Zheng, L., & Olson, D. L. (2014). A decision support approach for online stock forum
sentiment analysis. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 44(8),
1077-1087.
Wu, J. L., Yu, L. C., & Chang, P. C. (2014). An intelligent stock trading system using comprehensive
features. Applied Soft Computing, 23, 39-50.
Wu, M., Wang, L., Li, M., & Long, H. (2014). An approach of product usability evaluation based on
Web mining in feature fatigue analysis. Computers & Industrial Engineering, 75, 230-238.
Xianghua, F., Guo, L., Yanyan, G., & Zhiqiang, W. (2013). Multi-aspect sentiment analysis for Chinese
online social reviews based on topic modeling and HowNet lexicon. Knowledge-Based Systems,
37, 186-195.
Xu, K., Liao, S. S., Li, J., & Song, Y. (2011). Mining comparative opinions from customer reviews for
Competitive Intelligence. Decision Support Systems, 50(4), 743-754.
Xue, W., Li, T., & Rishe, N. Aspect identification and ratings inference for hotel reviews. World Wide
Web, 20(1), 23-37.
Yaakub, M. R., Li, Y., & Zhang, J. (2013). Integration of sentiment analysis into customer relational
model: the importance of feature ontology and synonym. Procedia Technology, 11, 495-501.
Yan, D., Zhou, G., Zhao, X., Tian, Y., & Yang, F. (2016). Predicting stock using microblog moods.
China Communications, 13(8), 244-257.
Yan, G., He, W., Shen, J., & Tang, C. (2014). A bilingual approach for conducting Chinese and English
social media sentiment analysis. Computer Networks, 75, 491-503.
Yang, F. C., Lee, A. J., & Kuo, S. C. (2016). Mining health social media with sentiment analysis.
Journal of Medical Systems, 40(11), 236.
Yang, H. L., & Chao, A. F. (2015). Sentiment analysis for Chinese reviews of movies in multi-genre
based on morpheme-based features and collocations. Information Systems Frontiers, 17(6),
1335-1352.
Yu, X., Liu, Y., Huang, X., & An, A. (2012). Mining online reviews for predicting sales performance:
A case study in the movie domain. IEEE Transactions on Knowledge and Data engineering,
24(4), 720-734.
Yu, Y., & Wang, X. (2015). World cup 2014 in the twitter world: A big data analysis of sentiments in
US sports fans’ tweets. Computers in Human Behavior, 48, 392-400.
Yu, Y., Duan, W., & Cao, Q. (2013). The impact of social and conventional media on firm equity
value: A sentiment analysis approach. Decision Support Systems, 55(4), 919-926.
47 http://www.webology.org/2017/v14n2/a156.pdf
Yuan, H., Xu, W., Li, Q., & Lau, R. (2017). Topic sentiment mining for sales performance prediction
in e-commerce. Annals of Operations Research, 1-24.
Zakzouk, T. S., & Mathkour, H. I. (2012). Comparing text classifiers for sports news. Procedia
Technology, 1, 474-480.
Zhai, Z., Liu, B., Wang, J., Xu, H., & Jia, P. (2012). Product feature grouping for opinion mining.
IEEE Intelligent Systems, 27(4), 37-44.
Zhang, H., Sekhari, A., Ouzrout, Y., & Bouras, A. (2016). Jointly identifying opinion mining elements
and fuzzy measurement of opinion intensity to analyze product features. Engineering
Applications of Artificial Intelligence, 47, 122-139.
Zhang, L., Hua, K., Wang, H., Qian, G., & Zhang, L. (2014). Sentiment analysis on reviews of mobile
users. Procedia Computer Science, 34, 458-465.
Zhang, W., Li, C., Ye, Y., Li, W., & Ngai, E. W. (2015). Dynamic business network analysis for
correlated stock price movement prediction. IEEE Intelligent Systems, 30(2), 26-33.
Zhang, X., Cui, L., & Wang, Y. (2014). Commtrust: Computing multi-dimensional trust by mining e-
commerce feedback comments. IEEE Transactions on Knowledge and Data Engineering, 26(7),
1631-1643.
Zhang, Y., Chen, M., Huang, D., Wu, D., & Li, Y. (2017). iDoctor: Personalized and professionalized
medical recommendations based on hybrid matrix factorization. Future Generation Computer
Systems, 66, 30-35.
Zheng, X., Lin, Z., Wang, X., Lin, K. J., & Song, M. (2014). Incorporating appraisal expression
patterns into topic modeling for aspect and sentiment word identification. Knowledge-Based
Systems, 61, 29-47.
Zimmermann, M., Ntoutsi, E., & Spiliopoulou, M. (2015). Discovering and monitoring product
features and the opinions on them with OPINSTREAM. Neurocomputing, 150, 318-330.
Zitnik, M. (2012). Using sentiment analysis to improve business operations. XRDS: Crossroads, The
ACM Magazine for Students, 18(4), 42-43.
Bibliographic information of this paper for citing:
Kumar, Akshi, & Sharma, Abhilasha (2017). "Systematic Literature Review on Opinion Mining
of Big Data for Government Intelligence." Webology, 14(2), Article 156. Available at:
http://www.webology.org/2017/v14n2/a156.pdf
Copyright © 2017, Akshi Kumar and Abhilasha Sharma.