A STATISTICAL APPROACH FOR FAKE NEWS DETECTION USING ML APPROACHES

Abstract - This paper analyzes various machine learning (ML) methods that can be used to distinguish fake information from real information. Much of the fake news circulating today originates from untrusted sources; identifying it prevents people from being misled by wrong information.

The methods discussed below take word order into account along with context, which previous models lacked: two pieces of writing that contain the same words in the same numbers do not necessarily have the same meaning.
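The point about word order can be illustrated with a minimal sketch. The helper names `bag_of_words` and `bigrams` and the two example sentences are assumptions made for illustration and do not come from the original work. The sketch shows that two texts with identical word counts can still differ once adjacent word pairs are compared.

```python
from collections import Counter

def bag_of_words(text):
    """Count each lower-cased word, ignoring the order of the words."""
    return Counter(text.lower().split())

def bigrams(text):
    """Return the adjacent word pairs of the text, in order."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

# Two illustrative sentences with the same words but opposite meanings.
first = "dog bites man"
second = "man bites dog"

same_words = bag_of_words(first) == bag_of_words(second)
same_order = bigrams(first) == bigrams(second)
print(same_words, same_order)
```

A representation built only on word counts treats the two sentences as identical, while one that keeps word pairs does not.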

There is a well-known “Fake News Challenge” on Kaggle, and Facebook also uses ML and AI algorithms to distinguish fake news from real news. Fake news detection is a classic text classification project with a straightforward proposition. This paper proposes one more approach to fake news detection, which uses a Support Vector Machine to classify an article as valid or fake according to the meaning of the words in it.

1 INTRODUCTION

1.1 Problem Motivation

The purpose of fake news detection is to prevent rumors from spreading via social media and other messaging platforms. It helps to reduce the spread of fake news, which can lead to activities such as mob lynching.

No doubt, the technology era evolved for the betterment of mankind, and it has made many things possible through innovation and improvement. But do we ever consider that it could also have serious negative effects? Nowadays almost everyone is on social media, and engagement on online platforms increases day by day. On these platforms, the spread of fake news is just a click away. Fake news spreads over the network so fast that it can affect millions of people at a time, and this can result in activities such as mob lynching. With this motivation, by detecting whether a news item is valid or not, we are trying to stop these activities and remove unwanted acts of violence from our society and from our state as a whole.

1.2 Research Objective

The objective of this work is the detection of fake news. It is a classic text classification problem with a straightforward proposition: build a model that can distinguish between fake and real news. As discussed earlier, fake news detection is used in areas where real news must be distinguished from fake news that leads to the consequences described above. These areas include social media platforms such as Instagram, Facebook, Twitter, WhatsApp, and Hike. On these platforms fake news gets a major boost, goes viral among users, and finally spreads around the country and the globe.

2 RELATED PREVIOUS WORK

Fake news detection has not received serious attention to date, and very few initiatives have been taken on this topic.

Some papers on this topic have been published by a group of students at Vivekananda Education Society's Institute of Technology. They highlighted the negative impacts that social media and the Internet as a whole have had since the start of the 20th century, which gave the circulation of fake news a huge boost [3].

They also mentioned the initiatives taken by Facebook on fake news detection, which have been in the alpha phase for the past year.

Also, in 2017, a student from Ho Chi Minh City University of Technology, Vietnam, carried out work on this topic. His research used deep learning models such as GANs, encoders, and CNNs, along with a mechanism known as a bidirectional GRU.

3 PROPOSED METHOD

3.1 News Authenticator

The News Authenticator detects whether a news item is fake. It follows some predefined steps to validate a particular news item: it compares the news supplied by the user with news from different websites and sources, and if similar news is found elsewhere it reports the news as valid; otherwise it reports it as fake. Distinguishing fake news from valid news in this way helps in several respects. Fake news spreads largely because people use social media constantly, as if it were a part of their body. The News Authenticator therefore classifies each news item as fake or real.
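The comparison step of the News Authenticator can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which similarity measure or threshold is used, so Jaccard similarity over word sets and a threshold of 0.5 are assumed here, and the function names and the `REFERENCES` list are hypothetical stand-ins for text collected from news websites.

```python
import re

def tokens(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def authenticate(news, reference_headlines, threshold=0.5):
    """Return 'valid' if any reference headline is similar enough, else 'fake'."""
    query = tokens(news)
    best = max((jaccard(query, tokens(h)) for h in reference_headlines), default=0.0)
    return "valid" if best >= threshold else "fake"

# Hypothetical headlines standing in for news gathered from other sources.
REFERENCES = [
    "Government announces new education policy",
    "Heavy rain floods parts of the city",
]

print(authenticate("New education policy announced by government", REFERENCES))
print(authenticate("Aliens land in city centre", REFERENCES))
```

The threshold controls how closely a submitted item must match an existing report; a real system would need to tune it and would likely use a richer text representation.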

a. News Suggestion

News Suggestion suggests recent news. It also gives suggestions related to the news that the user has submitted for authentication. The most recent news and the news the user submitted for authentication are suggested under this model. If the submitted news is fake, it suggests related news instead. It suggests or displays news according to the given keywords and the authentication result for that particular news.
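A keyword-based suggestion step might look like the sketch below. The ranking rule (number of shared keywords), the function name `suggest`, and the `ARTICLES` list are assumptions for illustration; the source does not describe how suggestions are ranked.

```python
def suggest(keywords, articles, limit=3):
    """Rank article titles by how many of the query keywords they contain."""
    wanted = {k.lower() for k in keywords}
    scored = []
    for title in articles:
        overlap = len(wanted & set(title.lower().split()))
        if overlap:
            scored.append((overlap, title))
    # Highest overlap first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:limit]]

# Hypothetical article titles, assumed to be listed most recent first.
ARTICLES = [
    "city council approves flood relief fund",
    "flood warning issued for coastal city",
    "cricket team wins series",
]

print(suggest(["flood", "city", "warning"], ARTICLES))
```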

4 LITERATURE SURVEY

As the spread of fake news is increasing rapidly, various approaches have been developed to detect it. There are three main types of fake news contributors:

Social Bots: A social bot is a social media account that is controlled by a computer algorithm. It generates content automatically.

Rohit Rastogi1, Divya Sharma2, Umang Agrawal3

CSE, ABES Engineering College, Ghaziabad, UP, India

Turning to its disadvantages, an SVM is difficult to use with large data sets, since training with new data can take a lot of time, which makes it less effective. Also, SVMs do not provide probability estimates directly [2].

4 SYSTEM DESIGN AND METHODOLOGY

1. System Design

1.1. System Architecture

Fig. 1 System Architecture for Fake news detection

1.1. Flow Chart

Fig. 2 Flow diagram for Fake news detection

1. Support Vector Machines (SVM)

The SVM is a supervised machine learning algorithm that can be used for regression as well as classification, although it is mostly used for classification. It finds the hyper-plane that divides the dataset into two classes. Hyper-planes act as the decision boundaries that help the model classify data points: points that fall on different sides of the hyper-plane are assigned to different classes. The dimension of the hyper-plane is determined by the number of features. The hyper-plane is a line when the number of input features is 2, and a two-dimensional plane when the number of input features is 3. When the number of features exceeds 3, it becomes difficult to visualize. How data points are classified using a hyper-plane can be seen in Figure 4.
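The hyper-plane idea can also be sketched in code for the two-feature case, where the decision boundary is a line. The sketch trains a linear SVM by sub-gradient descent on the regularised hinge loss; this is one common way to fit a linear SVM and is not necessarily the solver used in this work. The toy points, the hyper-parameters (`lam`, `epochs`, `lr`), and the function names are illustrative assumptions.

```python
def train_linear_svm(points, labels, lam=0.01, epochs=200, lr=0.1):
    """Fit w and b by sub-gradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin or misclassified: hinge term is active.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regulariser contributes to the gradient.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyper-plane w.x + b = 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy two-feature data (illustrative values only), labelled -1 and +1.
points = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -1.5), (2.0, 1.0), (1.0, 2.0), (1.5, 1.5)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, p) for p in points]
print(predictions)
```

With more than two features the same code applies unchanged; only the geometric picture of the boundary becomes harder to draw.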

Trolls: Trolls are real humans who aim to disrupt online communities in order to provoke social media users into an emotional response.

Cyborg: Cyborg users mix automated activities with human input. Real humans register the accounts as a cover, but they use programs to perform activities on social media.

There are two categories of approaches for the detection of false information:

· Linguistic Cue approaches

· Network Analysis approaches

Further methods have also been explored:

· Naïve Bayes Classifier

· Support Vector Machines (SVM)

1. Linguistic Cue Methods

By studying various communicative behaviors, researchers have found that deception can be detected with these kinds of approaches. They believe that the way a person speaks differs between telling the truth and lying. In general, liars use more words than truth-tellers. Liars also use fewer self-oriented pronouns than other-oriented pronouns, along with more sensory-based words [2].

Linguistic Cue methods help in the detection of fake news by catching information manipulators through the writing style of the news content. Data Representation, Deep Syntax, Semantic Analysis, and Sentiment Analysis are the main methods that have been implemented under this approach.
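The cues named above (word count, self-oriented and other-oriented pronouns, sensory words) can be turned into simple numeric features. The sketch below is a minimal illustration; the three word lists are short assumed samples, not the lexicons used in the cited studies, and the function name `linguistic_cues` is hypothetical.

```python
import re

# Small illustrative word lists (assumptions, not a validated lexicon).
SELF_PRONOUNS = {"i", "me", "my", "mine", "myself", "we", "us", "our", "ours"}
OTHER_PRONOUNS = {"you", "your", "he", "him", "his", "she", "her", "they", "them", "their"}
SENSORY_WORDS = {"see", "saw", "hear", "heard", "feel", "felt", "touch", "smell", "taste"}

def linguistic_cues(text):
    """Count the simple writing-style cues described in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "word_count": len(words),
        "self_pronouns": sum(w in SELF_PRONOUNS for w in words),
        "other_pronouns": sum(w in OTHER_PRONOUNS for w in words),
        "sensory_words": sum(w in SENSORY_WORDS for w in words),
    }

cues = linguistic_cues("They saw him leave and heard their car; I was not there.")
print(cues)
```

Counts like these could be fed to a classifier as features, alongside or instead of the raw words.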

2. Support Vector Machine (SVM)

A Support Vector Machine (SVM) is a supervised learning algorithm, also known as a Support Vector Network (SVN). An SVM is first trained on data that has already been organized into different categories. Whenever new data arrives, the SVM identifies the category of that data, and in doing so it tries to maximize the margin between the classes (Bram Brick). It finds an optimal hyper-plane that separates the dataset into two groups. It gives accurate results and performs very well on smaller, concise data sets. It is easy to use and to modify, and it can also classify numeric data. SVMs also handle high-dimensional spaces well and are memory-efficient.
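For reference, margin maximization is usually written as the following optimization problem. This is the standard textbook soft-margin formulation, given here as an assumption about what is meant; the source does not state which variant it uses. Here $x_i$ are the training points, $y_i \in \{-1, +1\}$ their labels, $w$ and $b$ define the hyper-plane, $\xi_i$ are slack variables, and $C$ is a penalty parameter.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
```

The width of the margin is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert^{2}$ maximizes the margin, while the term $C \sum_i \xi_i$ penalizes points that fall inside the margin or on the wrong side of the hyper-plane.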

Vision & Quest, Vol. 9, No. 1, Jan.-June 2019, ISSN: 0975-8410

RESULT:

Screenshot 2: False News showing Interface

Screenshot 3: True News showing Interface

CONCLUSION

We conducted a study of the spread of fake news across the Twitter social media platform between October 25, 2016 and November 7, 2016. It found that the most popular fake news items were shared intensively for a short period, with a frequency and volume that diminished within 12 to 48 hours.

Our website detects such fake news and also provides suggested news on the same topic, which is helpful for any user.

In future we will improve the efficiency and accuracy of our project and enhance our user interface.

References

[1] WhatsApp's fight against fake news: top new features it has taken to stop misinformation (https://indianexpress.com/article/technology/social/whatsapp-fight-against-fake-news-top-features-to-curb-spread-of-misinformation-5256782/).

[2] Yunus Genes, Detecting fake news with NLP (https://medium.com/@Genyunus/detecting-fake-news-with-nlp-c893ec31dee8).

[3] Kai Shu and Huan Liu, A quick guide to fake news detection on social media (https://www.kdnuggets.com/2017/10/guide-fake-news-detection-social-media.html).

[4] Manisha Gahirwal, Sanjana Moghe, Tanvi Kulkarni, Devesh Khakhar and Jayesh Bhatia, Fake news detection, International Journal of Advance Research, Ideas and Innovations in Technology, Volume 4, Issue 1, 2018 (https://www.ijariit.com/manuscripts/v4i1/V4I1-1432.pdf).

[5] Working to stop misinformation and false news (https://www.facebook.com/facebookmedia/blog/working-to-stop-misinformation-and-false-news).

Fig. 3 SVM Architecture for Fake news detection

3 IMPLEMENTATION AND RESULTS

For the implementation of this project we use Python and some machine learning algorithms.

We use Python because it has ample libraries for scraping websites and extracting data from live websites.
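As a sketch of the scraping step, the example below extracts headline text from an HTML page using only the Python standard library. The source does not name the scraping library it uses, so `html.parser` is an assumed choice; the class name `HeadlineParser`, the choice of heading tags, and `SAMPLE_HTML` are hypothetical.

```python
from html.parser import HTMLParser

class HeadlineParser(HTMLParser):
    """Collect the text found inside <h1>, <h2> and <h3> tags."""

    HEADLINE_TAGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._inside = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADLINE_TAGS:
            self._inside = True
            self._parts = []

    def handle_data(self, data):
        if self._inside:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADLINE_TAGS and self._inside:
            # Join the collected pieces and normalise whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.headlines.append(text)
            self._inside = False

def extract_headlines(html):
    """Return the list of headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines

# Hypothetical page content; a live page could instead be fetched with
# urllib.request.urlopen(url).read().decode("utf-8") before parsing.
SAMPLE_HTML = """
<html><body>
  <h1>Local news</h1>
  <p>Intro text</p>
  <h2>Flood warning issued for <b>coastal</b> city</h2>
</body></html>
"""

headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

Real news sites differ in markup, so the set of tags to collect would have to be adapted per site.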

We use machine learning algorithms to search the results and to handle large amounts of data.

1. Software and Hardware Requirements

1.1. Software Requirements

· Operating System: Windows 7 and above, Linux, macOS.

· Web browser: Microsoft Edge, Google Chrome (Version 69 and above), Mozilla Firefox.

· Jupyter: Version 5.7.0

· Atom.io: Version 1.31.1

1.2. Hardware Requirements

· Processor: Intel Core i5 and above.

· GPU: AMD Radeon R9/ Nvidia GTX 1050

2. Implementation Details

2.1. Screenshots

Screenshot 1: Contacts.
