Big Data Threat Detection on Cloud Environment with Business Intelligence

[1] Dr. D. Nageswara Rao, [2] P. Hiranmanibala, [3] G S Pradeep Ghantasala

[1] Professor, [2] Assistant Professor, [3] Assistant Professor

[1,3] Galgotias University

Abstract

Cloud computing raises security issues, including information security, computer security, data privacy and network security, that are emerging at a rapid pace. Big data requires processing data distributed across numerous servers, and it generates large quantities of data that become available in the cloud environment. The progressive development of big data has also increased the threats to information security. Business intelligence and analytics has emerged as a significant area for both researchers and practitioners, reflecting the magnitude and impact of the data-related problems that present-day business organizations must solve. This paper identifies threats to business that arise from big data in the cloud environment and analyzes the challenges and merits it brings to enterprises.

Keywords:

Big Data, Cloud Computing, Threat Detection, Business Intelligence and Analytics, Data Privacy, Security

International Journal of Pure and Applied Mathematics Volume 119 No. 18 2018, 1789-1799 ISSN: 1314-3395 (on-line version) url: http://www.acadpubl.eu/hub/ Special Issue http://www.acadpubl.eu/hub/

1. Introduction

Big data researchers currently focus on the management and analysis of big data, which define the approach of zero latency in data analysis, and on the prevention of data loss through big data security and privacy. Security and privacy play two significant roles with distinct requirements. Big data security generally refers to using big data to implement solutions that increase the security, safety and reliability of distributed systems, whereas big data privacy aims to protect big data from unwanted interference and unauthorized usage.

Business Intelligence (BI) is the practice of collecting, integrating, analyzing and presenting business information with the help of applications and technologies. The major objective of BI is to support better and quicker business decision making by improving business operations and the understanding of the information behind those decisions. In recent years, big data analysis and cloud computing have become two of the most significant technologies to enter mainstream business, and the two have been brought together to deliver business benefits and powerful results. Big data is one of the major data analysis methodologies, enabled by recent developments in the IT sector. Cloud computing has already changed how IT services are provided to organizations and how users and businesses interact with resources. Because big data analysis needs large amounts of computing resources, the cost of adopting the big data methodology is not economical for many small and medium enterprises. Integration with cloud computing also carries security risk: nearly 75% of IT experts acknowledge that security is the major risk involved [1]. With cloud computing, data is stored and accessed through the internet. For some businesses, it is essential to keep the data on the premises because their data is confidential. There are several solutions to secure data, including encryption, but it is the organization's responsibility to encrypt the data appropriately on the cloud. Although virtualization is essential in any cloud technology, it might cause highly technical security breaches, as the data may remain stored on virtual hardware even when its index is deleted [2].

This paper focuses on the security challenges that arise when confidential or sensitive data moves from the organization into a big data warehouse such as Apache Hadoop. It presents threat models and a security control framework to address and mitigate the risk posed by recognized security threats [3].

2. Literature Survey

The management and analysis of big data is the focus of much current research. This work defines the methodology of zero latency in data analytics and examines the privacy and security challenges of big data. Privacy and security are two essential issues with distinct requirements. Big data security is concerned with implementing solutions that increase the security, safety and reliability of distributed systems, whereas big data privacy emphasizes the protection of big data from unwanted interference and unauthorized use.

The work on business intelligence and big data in [4] identifies new security and privacy challenges for big data that stem from the notions of zero-latency analysis and full data [5]. The paper in [6] presents several methods for analyzing big data problems handled through the Hadoop Distributed File System (HDFS) within the MapReduce framework, and it proposes an implementation of MapReduce techniques using HDFS for big data. Yadvav [7] gives an overview of the architecture and algorithms used for large data sets, describes the tools established for analyzing them, and analyzes the security issues, trends and applications that accompany the use of large data sets.

Cloud BI is the innovative concept of delivering BI capabilities as a service using a cloud-based architecture, which comes at lower cost yet offers faster deployment and greater flexibility [8]. The delivery model is software as a service BI (SaaS BI), in which the applications are normally deployed at a hosted location outside the company's firewall and accessed by end users over a secure Internet connection [9]. The cloud affects many present-day businesses and has a major impact on the BI industry. It changes the way organizations analyze their data and make decisions, and it could help them turn that data into valuable business information [10].

An advanced persistent threat (APT) is a targeted attack against a high-value asset or a physical system. In contrast to mass-spreading malware such as viruses, Trojans and worms, an APT operates in low-and-slow mode: low mode means maintaining a low profile in the network, and slow mode means allowing for a long execution time. APT attackers often leverage stolen user credentials or zero-day exploits to avoid triggering alarms. This kind of attack extends over a long period of time while the victim organization remains oblivious to the intrusion. The 2010 Verizon Data Breach Investigations Report concludes that in 86% of cases evidence of the data breach was recorded in the organization's logs, but the detection mechanisms failed to raise security alarms [11].

3. Methodology

3.1 Big Data Threat Detection Framework

The Cloud Security Alliance (CSA) is a non-profit organization that promotes best practices for providing security assurance within the cloud environment. Its big data working group has focused on the most important challenges in implementing secure big data services. The privacy and security challenges of the big data ecosystem are categorized under four traits: data privacy, infrastructure security, data management, and integrity and reactive security. The framework offers threat models and a security control model to identify threats and to address and mitigate the risk. This paper gives a detailed description of the Hadoop system that identifies its various security weaknesses, and it concludes with a reference security framework for an enterprise in a cloud environment [12].

Figure 1: CSA classification of the Top 10 Challenges

3.1.1 Infrastructure Security & Integrity

Over the past three years, the Common Vulnerabilities and Exposures (CVE) database shows four reported and fixed Hadoop vulnerabilities. Either most vulnerability remediation happens internally within vendor environments with no public reporting, or the security community is not active in this area. Hadoop's security configuration files are not self-contained, and no validity checks are run before such policies are deployed. This usually results in data integrity and availability issues.

Secure Computations in Distributed Programming Frameworks

Distributed programming frameworks combine computation and storage to process huge amounts of data. The MapReduce framework is a prime example: it splits an input file into multiple chunks. In the first phase, a Mapper for each chunk reads the data, performs some computation and outputs a list of key/value pairs. In the second phase, a Reducer combines the values belonging to each distinct key and outputs the result. There are two main attack prevention measures: securing the mappers, and securing the data in the presence of an untrusted mapper [12].
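
The two phases described above can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop's API: the function names (split_input, mapper, shuffle, reducer, map_reduce) and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict


def split_input(text, num_chunks):
    """Split the input into chunks of lines, one chunk per mapper."""
    lines = text.splitlines()
    size = max(1, -(-len(lines) // num_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def mapper(chunk):
    """Map phase: read one chunk and emit a (key, value) pair per word."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs):
    """Group the values of all mappers by their key."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: combine all values that belong to one distinct key."""
    return key, sum(values)


def map_reduce(text, num_chunks=2):
    """Run the map, shuffle and reduce steps in sequence."""
    mapped = [mapper(chunk) for chunk in split_input(text, num_chunks)]
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())
```

In this sketch an untrusted mapper corresponds to a mapper function that returns altered or leaked pairs, which is why the text lists securing the mappers and securing the data as separate measures.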

Security Best Practices for Non-relational Data Stores

Non-relational data stores, popularized by NoSQL databases, are still evolving with respect to security infrastructure. For example, robust solutions to NoSQL injection are not yet mature, and security was never part of the model at any point of its design stage. Developers using NoSQL databases typically embed security in the middleware, because the databases do not provide explicit support for enforcing it themselves. In addition, the clustering aspect of NoSQL databases poses further challenges to the robustness of such security practices.
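
As an illustration of security embedded in the middleware, the following sketch rejects client input that carries query operators before it reaches the data store. It assumes a MongoDB-style query language in which keys beginning with "$" are operators; the function names reject_operators and build_login_filter are hypothetical, and no database driver is involved.

```python
def reject_operators(value, path="input"):
    """Raise ValueError if client-supplied data contains query operators.

    Assumption: keys starting with '$' are operators, as in MongoDB-style
    query documents. Other stores would need their own rules.
    """
    if isinstance(value, dict):
        for key, inner in value.items():
            if not isinstance(key, str) or key.startswith("$"):
                raise ValueError(f"operator key {key!r} not allowed at {path}")
            reject_operators(inner, f"{path}.{key}")
    elif isinstance(value, list):
        for index, inner in enumerate(value):
            reject_operators(inner, f"{path}[{index}]")


def build_login_filter(username, password_hash):
    """Build a query filter from plain strings only.

    A payload such as {"$ne": None} in place of the password is refused
    because it is not a string.
    """
    for name, field in (("username", username), ("password_hash", password_hash)):
        if not isinstance(field, str):
            raise ValueError(f"{name} must be a string")
    return {"username": username, "password_hash": password_hash}
```

The check lives entirely outside the database, which reflects the limitation noted above: every service in front of the cluster has to apply it consistently.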

End-Point Input Validation/Filtering

Many big data use cases in enterprise settings require data collection from many sources, such as endpoint devices. For instance, a Security Information and Event Management (SIEM) system may gather event logs from large numbers of software applications and hardware devices in an enterprise network. An important challenge in the data collection process is input validation. Input validation and filtering is a daunting task because of untrusted input sources, particularly with the bring your own device (BYOD) model.
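
A minimal sketch of such validation at the collection point is shown below. The event fields (source, severity, message, timestamp), the allowed severity levels and the limits are all assumptions made for the example; a real SIEM defines its own schema.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_SEVERITIES = {"info", "warning", "critical"}  # hypothetical levels
MAX_MESSAGE_LENGTH = 1024                             # hypothetical limit
MAX_CLOCK_SKEW = timedelta(minutes=5)                 # hypothetical tolerance


def validate_event(event, known_sources, now=None):
    """Return a list of problems found in one event; empty means accepted."""
    if not isinstance(event, dict):
        return ["event is not a mapping"]
    now = now or datetime.now(timezone.utc)
    problems = []

    source = event.get("source")
    if not isinstance(source, str) or source not in known_sources:
        problems.append("unknown source")

    severity = event.get("severity")
    if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
        problems.append("invalid severity")

    message = event.get("message")
    if not isinstance(message, str) or len(message) > MAX_MESSAGE_LENGTH:
        problems.append("invalid message")

    try:
        stamp = datetime.fromisoformat(event.get("timestamp"))
    except (TypeError, ValueError):
        problems.append("unparseable timestamp")
    else:
        if stamp.tzinfo is None:
            problems.append("timestamp lacks a UTC offset")
        elif stamp > now + MAX_CLOCK_SKEW:
            problems.append("timestamp is in the future")
    return problems


def filter_events(events, known_sources, now=None):
    """Split events into accepted ones and (event, problems) rejections."""
    accepted, rejected = [], []
    for event in events:
        problems = validate_event(event, known_sources, now)
        if problems:
            rejected.append((event, problems))
        else:
            accepted.append(event)
    return accepted, rejected
```

Checks of this kind only establish that an event is well formed and comes from a registered source. They do not prove that a compromised BYOD endpoint is reporting truthfully.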

Real-time Security and Compliance Monitoring

Real-time security monitoring has always been a challenge, given the number of alarms generated by security devices. These alarms lead to many false positives, which are typically ignored or just "clicked away," as people cannot cope with the sheer amount. This problem may even increase with big data, given the velocity and volume of data streams. However, big data technologies may also offer an opportunity, in the sense that they allow fast processing and analytics of different kinds of data, which in turn can be used to provide, for example, real-time anomaly detection based on scalable security analytics.
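
One simple form of real-time anomaly detection is a rolling z-score over event counts per time interval. The sketch below is an illustrative technique chosen for this example, not the method of any cited report; the class name and the default window, threshold and min_samples values are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flag counts that deviate strongly from a sliding window of history."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def observe(self, count):
        """Return True if `count` is anomalous relative to recent counts."""
        is_anomaly = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma == 0:
                # A perfectly flat history: any change counts as a deviation
                is_anomaly = count != mu
            else:
                is_anomaly = abs(count - mu) / sigma > self.threshold
        # Learn only from normal observations so that one burst
        # does not raise the baseline and mask the next burst
        if not is_anomaly:
            self.history.append(count)
        return is_anomaly
```

Feeding the detector one count per interval (for example, failed logins per minute) keeps the work per observation small, which matters at the data velocities mentioned above. Tuning the threshold trades missed attacks against the false positives that analysts tend to click away.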

3.1.2 Identity & Access Management

Access Control Lists (ACLs) and Role Based Access Control (RBAC) policy files for components such as HBase and MapReduce are typically configured through clear-text files. These files are editable by privileged accounts on the system, such as root and other application accounts.
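
Because such policy files are plain text and writable by privileged accounts, one possible mitigation is to record a cryptographic fingerprint of each file and compare it later. The sketch below uses SHA-256 from the Python standard library; the function names are hypothetical, and storing the baseline somewhere the same accounts cannot modify is assumed rather than shown.

```python
import hashlib
import hmac
from pathlib import Path


def fingerprint(path):
    """Return the SHA-256 hex digest of a policy file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_baseline(paths):
    """Map each policy file path to its current fingerprint."""
    return {str(path): fingerprint(path) for path in paths}


def changed_files(baseline):
    """List files that are missing or whose contents differ from the baseline."""
    changed = []
    for path, expected in baseline.items():
        if not Path(path).is_file():
            changed.append(path)
        elif not hmac.compare_digest(fingerprint(path), expected):
            changed.append(path)
    return changed
```

A check like this detects edits after the fact. It does not prevent them, and it does not validate that the policy content itself is sensible.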

Secure Data Storage and Transaction Logs

Data and transaction logs are kept in multi-tiered storage media. Manually moving data between tiers gives the IT manager direct control over exactly what data is moved and when. However, as the size of data sets has been, and continues to be, growing exponentially, availability and scalability have required auto-tiering for big data storage management. Auto-tiering solutions do not keep track of where the data is stored, which poses new challenges to secure data storage. New mechanisms are imperative to prevent unauthorized access and sustain 24/7 availability.

Granular Audits

With real-time security monitoring, the aim is to be notified at the moment an attack takes place, but in reality this will not always happen. Audit information is essential for getting to the bottom of a missed attack: it helps to understand what happened and where things went wrong, and it is also needed for compliance, regulation and forensic reasons. In that regard auditing is not a new process, but its scope and granularity may differ, since it has to deal with many data objects that are possibly distributed.

Data Provenance

Provenance metadata will grow in complexity because provenance-enabled programming environments in big data applications generate large provenance graphs. Analyzing such large provenance graphs to detect metadata dependencies for security and privacy applications is computationally intensive.
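
To make the idea of a metadata dependency concrete, the sketch below stores a provenance graph as a mapping from each data item to the items it was derived from and walks it breadth-first. The graph representation and the function names are assumptions for illustration. On real provenance graphs the traversal is the same, and the cost comes from the graph's size.

```python
from collections import deque


def upstream_dependencies(derived_from, item):
    """Return every item that `item` transitively depends on.

    `derived_from` maps an item to the list of items it was derived from.
    """
    seen = set()
    queue = deque(derived_from.get(item, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(derived_from.get(current, ()))
    return seen


def sensitive_inputs(derived_from, item, sensitive):
    """Return the sensitive sources that contributed to `item`."""
    return upstream_dependencies(derived_from, item) & set(sensitive)
```

A privacy check could, for instance, refuse to publish a derived report when sensitive_inputs returns a non-empty set for it.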

3.1.3 Data Privacy & Security

All the problems associated with SQL injection attacks carry over to Hadoop components such as Impala and Hive. SQL prepared functions, which would allow the separation of query and data, are presently not available. For the protection of sensitive data, there is a lack of native cryptographic controls. Often this kind of security is provided outside the data or application stack. Data may be sent as plain text when it is transferred from one node to another. Data locality cannot be strictly enforced, and the scheduler may not be able to find resources next to the data, in which case it is forced to read the data over the network.

Scalable Privacy-Preserving Data Mining and Analytics

Big data can be seen as a troubling sign of authoritarian oversight, as it potentially enables invasions of privacy, invasive marketing, increased state and corporate control, and reduced civil freedoms. Organizations have recently taken advantage of data analytics for marketing purposes, and anonymizing data for analytics is not adequate to preserve user privacy.

Cryptographically Enforced Access Control and Secure Communication

To ensure that confidential and sensitive personal information is secure end-to-end and accessible only to authorized entities, the data has to be encrypted based on access control policies. Specific research in this area, such as Attribute Based Encryption (ABE), needs to be made richer, more scalable and more efficient. A cryptographically secure communication framework has to be implemented in order to ensure authentication and agreement among the distributed entities.

Granular Access Control

From the perspective of access control, the security property that matters is secrecy: preventing access to data by people who should not have access. The problem with coarse-grained access mechanisms is that data that could otherwise be shared is frequently swept into a more restrictive category in order to guarantee sound security. Granular access control gives the data manager a knife instead of a sword, so that information can be shared as much as possible without compromising secrecy.
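
The difference between the two approaches can be sketched with per-field labels. The labels, clearances and function names below are assumptions for illustration and do not correspond to any particular product's access model.

```python
def fine_grained(record, field_labels, clearances):
    """Return only the fields whose label the user is cleared for.

    Fields without a label are withheld (deny by default).
    """
    return {
        field: value
        for field, value in record.items()
        if field in field_labels and field_labels[field] in clearances
    }


def coarse_grained(record, field_labels, clearances):
    """Return the whole record or nothing.

    One field the user may not see blocks every other field as well.
    """
    allowed = fine_grained(record, field_labels, clearances)
    return dict(record) if len(allowed) == len(record) else {}


if __name__ == "__main__":
    record = {"name": "Alice", "email": "a@example.com", "salary": 5000}
    labels = {"name": "public", "email": "internal", "salary": "confidential"}
    analyst = {"public", "internal"}
    print(fine_grained(record, labels, analyst))    # name and email are shared
    print(coarse_grained(record, labels, analyst))  # nothing is shared
```

Under the coarse-grained rule the single confidential field hides the two fields the analyst could legitimately see, which is the over-restriction described above.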

4. Conclusion

Data analytics has long been used in business to help direct strategy, improve profit and support decision making. Today, big data methodology and cloud computing technology are widely used by organizations to shape their business. It is clear that big data provides interesting opportunities for both users and businesses, but these opportunities are countered by huge security and privacy challenges. Traditional security mechanisms fall short of providing a complete solution to those challenges. The Cloud Security Alliance framework presented here addresses the top ten big data security and privacy challenges, so that big data processing and BI computing can be made more secure.

References

[1] Marinela Mircea, Bogdan Ghilic-Micu and Marian Stoica, 2011, "Combining Business Intelligence with Cloud Computing to Delivery Agility in Actual Economy", Economic Computation & Economic Cybernetics Studies & Research, Vol. 45 Issue 1, p1.

[2] Christina Tamer, Mary Kiley, Noushin Ashrafi & Jean-Pierre Kuilboer, "Risks and Benefits of Business Intelligence in the Cloud", University of Massachusetts Boston, Management Science and Information Systems Department, 100 Morrissey Blvd, Boston, MA, 02125, United States of America.

[3] Ajit Gaddam, "Securing Your Big Data Environment".

[4] E. Damiani et al., "Business intelligence meets big data: a manifesto". In: Proc. of the 3rd International Symposium on Data-Driven Process Discovery and Analysis (post proceedings). Riva del Garda, Italy (August 2013).

[5] Claudio A. Ardagna and Ernesto Damiani, "Business Intelligence meets Big Data: An Overview on Security and Privacy".

[6] Agarwal, D., Das, S. and Abbadi, A. (2011). Big Data and Cloud Computing: Current State and Future Opportunities. ACM 978-1-4503-0528-0/11/0003.

[7] Talia, D. (2013). Clouds for Scalable Big Data Analytics. Published by IEEE Computer Society.

[8] Yuvraj Singh Gurjar & Vijay Singh Rathore, 2013, "Cloud Business Intelligence – Is What Business Need Today", International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6.

[9] "Software as a Service BI (SaaS BI)", http://searchbusinessanalytics.techtarget.com/definition/Software-as-a-Service-BI-SaaS-BI, Requested in February 2014.

[10] "Business Intelligence in the Cloud", http://cloudcomputingtopics.com/2011/09/business-intelligence-in-the-cloud/, Requested in December 2013.

[11] Big Data Analytics for Security Intelligence.

[12] Top Ten Big Data Security and Privacy Challenges.
