Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2016, Article ID 1467051, 10 pages, http://dx.doi.org/10.1155/2016/1467051

Research Article
A Novel Real-Time DDoS Attack Detection Mechanism Based on MDRA Algorithm in Big Data

Bin Jia,1,2,3 Yan Ma,1 Xiaohong Huang,1 Zhaowen Lin,1,2,3 and Yi Sun2,3,4

1 Information and Network Center, Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory, Shijiazhuang 050081, China
3 National Engineering Laboratory for Mobile Network Security (No. [2013] 2685), Beijing 100876, China
4 Network and Information Center, Institute of Network Technology and Institute of Sensing Technology and Business, Beijing University of Posts and Telecommunications, Beijing 100876, China

Correspondence should be addressed to Bin Jia; jb_qd2010@bupt.edu.cn

Received 25 March 2016; Revised 25 July 2016; Accepted 10 August 2016

Academic Editor: Nazrul Islam

Copyright © 2016 Bin Jia et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

With the rapid development and wide application of information technology and the Internet, our society has entered the era of information explosion. The explosive growth and high complexity of network traffic bring new and severe challenges to the detection of network attack behavior. An effective and efficient mechanism that can detect attack behavior in large-scale network traffic therefore plays an important role. In this paper, we focus on how to distinguish attack traffic from normal data flows in Big Data and propose a novel real-time DDoS attack detection mechanism based on Multivariate Dimensionality Reduction Analysis (MDRA). In this mechanism, we first reduce the dimensionality of the multiple characteristic variables in a network traffic record by Principal Component Analysis (PCA). Then, we analyze the correlation of the lower dimensional variables. Finally, attack traffic is differentiated from normal traffic by MDRA and Mahalanobis distance (MD). Compared with previous methods, our experimental results show that a higher precision rate is achieved and the True Negative Rate (TNR) approaches 100%; CPU computing time is one-eightieth and memory resource consumption is one-third of those of the previous detection method based on Multivariate Correlation Analysis (MCA); and computing complexity is constant.

1. Introduction

The Denial of Service (DoS) attack is one of the most popular attacks on the Internet. It is implemented by forcing a hijacked computer to launch the attack or by consuming the victim's resources, such as CPU cycles, memory, and network bandwidth. When the DoS attack is generated by a large number of distributed computers, it is called a Distributed Denial of Service (DDoS) attack. DDoS has become one of the main challenges to cyber security today.

A DDoS attack is launched by remote-controlled Zombies. It prevents legitimate users from accessing specific network services, or paralyzes the victim's own services, by occupying computer resources or network bandwidth partly or completely. The more abnormal traffic packets and hijacked Zombie hosts there are, the more damage the network suffers. If the number of Zombie hosts is large enough, the attack can even disrupt the whole network environment and all servers rapidly.

In the summer of 1999, the Computer Incident Advisory Capability (CIAC) reported the first DDoS attack incident [1]. Since then, DDoS has become one of the most convenient and effective means of attack frequently used by hackers. In 2000, severe DDoS attacks left Internet sites (e.g., Microsoft, Yahoo, and Amazon) inaccessible for a long time.


DDoS attacks are mainly classified into three categories according to the attacked subject. The first kind is the Netflow-DDoS attack, with typical instances such as the DNS amplification attack, SNMP amplification attack, UDP Flood, and ICMP Flood. The second is the connection-DDoS attack, of which SYN Flood and TCP Flood are the most influential cases. The third is the application-based DDoS attack, such as HTTP Get Flood and SSL Flood. In this paper, we focus on how to detect the Netflow-DDoS and connection-DDoS attacks.

Despite all the effort from industry and academia, DDoS attack detection is still an open problem. In recent years, the techniques and level of DDoS attacks have advanced ceaselessly along with the improvement of attack detection capability. With the emergence of Big Data technology, it is more difficult than ever to protect the network from various DDoS attacks. The continuously growing network traffic makes it impossible for previous detection methods to detect attack behavior in traffic of such large scale.

In this paper, we address the abovementioned challenges and propose a novel method for real-time DDoS attack detection based on the Multivariate Dimensionality Reduction Analysis (MDRA) algorithm, which combines Principal Component Analysis (PCA) and Multivariate Correlation Analysis (MCA). Compared with the previous solutions, our proposed algorithm has the following advantages:

(i) The precision rate is higher, and the True Negative Rate (TNR) approaches 100%.

(ii) CPU computing time is one-eightieth of that of the previous detection method based on MCA.

(iii) Memory resource consumption is one-third of that of the previous detection method based on MCA.

(iv) Computing complexity is constant.

To the best of our knowledge, this paper is the first to propose this theoretical method and to attempt to apply it to DDoS attack detection.

The remainder of this paper is organized as follows. Section 2 introduces the related work in DDoS attack detection and analyzes its shortcomings. Section 3 describes the theoretical approach of our detection mechanism and the attack detection framework based on MDRA. Section 4 discusses the experimental details and gives the experimental results and analyses. Section 5 summarizes this paper.

2. Related Work

Although it has a development history of almost 20 years, DDoS attack detection is still a hot research field in industry and academia, and its methods and techniques have to keep pace with the growing complexity and diversity of DDoS attack means. Previous work mainly includes the following.

In 2004, Kim et al. [2] proposed a combined data mining approach for detecting various types of DDoS attack, which studied an automatic feature selection module and a classifier generation module. Because per-flow analysis is indispensable to DDoS attack detection, they used Netflow-based data as the gathered data. In 2007, Scherrer et al. [3] focused on how to extract DDoS attack features and how to detect and filter DDoS attack packets by a number of known characteristics. In 2008, Lee et al. [4] designed a method for proactive detection of DDoS attacks by exploiting the attack architecture and selecting different variables based on attack features; they then performed cluster analysis for proactive detection of the attack. In 2010, Nguyen and Choi [5] introduced a method for preliminary detection of DDoS attacks by classifying the network conditions. They selected some variables based on the key features and applied the $k$-nearest neighbor ($k$-NN) method to classify the network conditions into each phase of a DDoS attack. In addition, Tsai and Lin [6] presented a new method to detect DDoS attacks called "Triangle Area Based Nearest Approach," which improved the accuracy and the False Positive Rate (FPR). In 2012, Bhange et al. [7] presented the idea of the DDoS attack and its impact on network traffic. Their paper studied DDoS attacks by analyzing the distribution of network traffic in order to distinguish anomalous traffic from normal network behavior. In 2014, Tan et al. [8] brought forth a more sophisticated DoS attack detection approach using MCA, proposing a new MCA-based detection system to protect online services against DoS attacks. In the same year, Luo et al. [9] developed a mathematical model for estimating the combined impact of DDoS attack pattern and network environment on attack effect by originally capturing the adjustment behaviors of the victim TCP's congestion window.

DDoS attacks can be detected by statistical analysis, data mining, and machine learning. However, some existing detection methods and techniques still suffer from low precision and TNR, and some cannot actively detect DDoS attacks. In particular, previous methods can no longer meet the requirements of the Big Data era because of their low detection efficiency, high resource consumption, and high computing complexity. In this paper, we propose a novel detection mechanism based on MDRA to show how to detect DDoS attack traffic effectively and in real time.

3. Detection Mechanism

Figure 1 shows an overview of our real-time DDoS detection framework. We first collect network traffic data samples from the Internet and input them into the data acquisition system, which is composed of data cleaning, data store, and data anonymization modules. Next, the processed traffic data are fed into the traffic feature Big Data system. The traffic features in this system serve two functions: Online Attack Detection and Offline Traffic Analysis based on the Knowledge Base. The results of Offline Traffic Analysis provide feature recognition for Online Attack Detection. Finally, the current network is adjusted according to the routing policy derived from the results of Online Attack Detection.

In this section, our method is separated into three components, that is, traffic feature dimensionality reduction, traffic feature correlation analysis, and the attack detection framework based on MDRA and threshold. These components are introduced in the following subsections.

Figure 1: Overview of real-time DDoS attack detection framework (traffic sampling from Routers 1 to 4; data acquisition system with data cleaning, data store, and data anonymization; traffic feature Big Data system with traffic feature database; online attack detection; offline traffic analysis with knowledge base and feature recognition; routing policy adjustment and network management).

3.1. Traffic Feature Dimensionality Reduction. A network traffic record encompasses a wide variety of high dimensional features. However, some of these features are redundant or noisy and may impair the effectiveness and efficiency of attack detection. To eliminate data redundancy and data noise, we introduce a dimensionality reduction technique into our detection method. The PCA method is used to extract fewer, more representative features. The projections on the remaining dimensions are called the principal components [10]. One advantage of PCA is its data-driven design: it keeps the principal components of the feature data and eliminates the correlated and measured feature data. PCA has been widely applied in intrusion detection [11] (such as [12, 13]) and in other fields (such as [14]).

In the PCA method, the original dependent random variables are transformed by an orthogonal transformation into new random variables whose components are uncorrelated. Algebraically, the covariance matrix of the original random variables is transformed into a diagonal matrix. Geometrically, the original coordinate system is transformed into a new orthogonal coordinate system whose axes point along multiple orthogonal directions.

PCA is able to obtain $P$ principal components. The first principal component is the linear combination with the maximum variance. If the first principal component is not enough to represent the information of the original variables, we select a second linear combination. To reflect the original information effectively, the information already captured by the first principal component must not appear in the second principal component. By analogy, all subsequent principal components can be constructed. We assume that a network traffic record sample set $X$ includes $n$ samples and that the dimension of each sample is $d$. That is to say, $X = \{X_1, X_2, \ldots, X_n\}$ and $X_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in R^{d}$, $i = 1, 2, \ldots, n$. The sample matrix is $S \in R^{n \times d}$. Then, the covariance matrix of the sample matrix is calculated by the following formula:

$$C = \frac{S^{T} S}{n - 1}, \quad C \in R^{d \times d}. \tag{1}$$

Next, the covariance matrix $C$ needs to be diagonalized. Since $C$ is a symmetric matrix, diagonalizing it amounts to finding an orthogonal matrix $P$ such that

$$P^{T} C P = \Lambda, \quad P, \Lambda \in R^{d \times d}. \tag{2}$$

Assuming that we take the dimensions corresponding to the $p$ ($p < d$) largest eigenvalues, a new diagonal matrix $\Lambda_1$ ($\Lambda_1 \in R^{p \times p}$) is set up from these $p$ eigenvalues. The corresponding $p$ eigenvectors constitute a new eigenvector matrix $P_1$ ($P_1 \in R^{d \times p}$). These eigenvectors in $P_1$ constitute a new coordinate system in the low dimensional space, and they are the principal components.

Let the sample matrix after PCA dimensionality reduction be $S_1$. According to the purpose of PCA, the covariance between every two dimensions in $S_1$ is basically zero. In other words, the covariance matrix of $S_1$ is $\Lambda_1$, so the following condition is satisfied:

$$\frac{S_1^{T} S_1}{n - 1} = \Lambda_1. \tag{3}$$

From (2), we get

$$P^{T} C P = \Lambda \Longrightarrow P_1^{T} C P_1 = \Lambda_1. \tag{4}$$


Substituting (4) and (1) into (3), we get

$$\frac{S_1^{T} S_1}{n - 1} = \Lambda_1 = P_1^{T} C P_1 = P_1^{T} \left( \frac{S^{T} S}{n - 1} \right) P_1 = \frac{(S P_1)^{T} (S P_1)}{n - 1} \Longrightarrow S_1 = S P_1, \quad S_1 \in R^{n \times p}. \tag{5}$$

Because the covariance matrix of $S_1$ is a diagonal matrix, the components in every two different dimensions are basically uncorrelated. This completes the PCA process.
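The derivation in (1) to (5) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function name `pca_reduce`, the synthetic data, and the explicit mean-centering step are assumptions (centering is what makes $S^{T}S/(n-1)$ a covariance matrix).

```python
import numpy as np

def pca_reduce(S, p):
    """Project an n x d sample matrix onto its first p principal components."""
    # Assumption: center each feature so that S^T S / (n - 1) is the covariance matrix (1).
    S_centered = S - S.mean(axis=0)
    n = S_centered.shape[0]
    C = S_centered.T @ S_centered / (n - 1)

    # C is symmetric, so eigh returns real eigenvalues (ascending) and orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:p]   # indices of the p largest eigenvalues
    P1 = eigvecs[:, order]                  # d x p eigenvector matrix P_1
    Lambda1 = np.diag(eigvals[order])       # p x p diagonal matrix Lambda_1

    S1 = S_centered @ P1                    # equation (5): S_1 = S P_1
    return S1, P1, Lambda1

# Synthetic, correlated data standing in for traffic features (hypothetical).
rng = np.random.default_rng(0)
S = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
S1, P1, Lambda1 = pca_reduce(S, p=2)

# Covariance of the reduced data, which (3) says should equal Lambda_1.
cov_S1 = S1.T @ S1 / (S1.shape[0] - 1)
```

Checking that `cov_S1` is numerically diagonal and matches `Lambda1` is a quick way to confirm condition (3) on any data set.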

3.2. Traffic Feature Correlation Analysis. From the viewpoint of statistical correlation, DDoS attack traffic features exhibit statistical properties different from those of legitimate network traffic features. Here, we apply MCA [8, 15, 16]. This approach is based on a triangle area technique and Mahalanobis distance (MD). The triangle area technique extracts the geometrical correlation between every two features in a captured network traffic record, and MD measures the similarity between every two traffic records. The analysis is presented as follows.

Assume that there is a captured network traffic record data set $X = \{x_1, x_2, \ldots, x_n\}$. Here, $x_i^{T} = [f_1^{i}, f_2^{i}, \ldots, f_m^{i}]$, $1 \le i \le n$, where $x_i$ represents the $i$th traffic record and $f_j^{i}$ indicates the $j$th feature in the $i$th record. For example, $f_j^{i}$ and $f_k^{i}$ are a couple of features in $x_i$. The area of the triangle $T_{jk}^{i}$ is

$$T_{jk}^{i} = \frac{|f_j^{i}| \times |f_k^{i}|}{2}, \tag{6}$$

where $1 \le i \le n$, $1 \le j, k \le m$, and $j \ne k$. Figure 2 shows the area of a triangle.

On the basis of (6), we get the area of the triangle for every two distinct features in $x_i$. By analogy, the triangle areas between every two distinct features are obtained for each network traffic record, and a Triangle Area Matrix (TAM) is set up. When $j$ is equal to $k$, the value of $T_{jk}^{i}$ is zero, so the elements on the main diagonal of the matrix are zero. Because $T_{jk}^{i}$ and $T_{kj}^{i}$ represent the same triangle area, their values are equal.

As a consequence, we draw the following conclusion: $\mathrm{TAM}^{i}$ is a symmetric matrix whose main diagonal elements are zero. Here, the lower triangle of the TAM is converted into a vector $\mathrm{TAM}^{i}_{\mathrm{lower}}$, shown as follows:

$$\mathrm{TAM}^{i}_{\mathrm{lower}} = \left[ T_{2,1}^{i} \;\; T_{3,1}^{i} \;\; \cdots \;\; T_{m,1}^{i} \;\; T_{3,2}^{i} \;\; T_{4,2}^{i} \;\; \cdots \;\; T_{m,2}^{i} \;\; \cdots \;\; T_{m,m-1}^{i} \right]^{T}. \tag{7}$$

Figure 2: Sketch for the area of a triangle $T_{jk}^{i}$ (axes $x$ and $y$; labeled points $(0, 0)$, $(f_j^{i}, 0)$, $(0, f_k^{i})$, and $(f_j^{i}, f_k^{i})$).

A DDoS attack is detected by applying MCA to the inherent correlations of traffic features in the Big Data network environment. The geometrical correlation between every two traffic features changes when anomalous DDoS behaviors appear on the Internet, which provides an important warning signal.
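As an illustration of (6) and (7), the lower-triangle vector of one record can be built with two nested loops. This is a minimal sketch; the function name `tam_lower` and the three-feature example record are hypothetical.

```python
def tam_lower(record):
    """Return TAM_lower of equation (7) for one traffic record (a list of numbers)."""
    m = len(record)
    areas = []
    for k in range(m):               # column of the TAM (0-based feature index)
        for j in range(k + 1, m):    # rows strictly below the main diagonal
            # Equation (6): triangle area for the feature pair (j, k).
            areas.append(abs(record[j]) * abs(record[k]) / 2.0)
    return areas

# A record with m = 3 features yields m(m - 1)/2 = 3 areas: T_21, T_31, T_32.
example = tam_lower([3.0, -4.0, 2.0])
print(example)  # [6.0, 3.0, 4.0]
```

The column-by-column traversal reproduces the ordering $T_{2,1}, T_{3,1}, \ldots, T_{m,1}, T_{3,2}, \ldots, T_{m,m-1}$ used in (7).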

3.3. Attack Detection Framework. In this section, we first establish benchmark data by covariance matrix and MD. Secondly, attack traffic detection based on MD and the selected threshold is implemented. Finally, we present the MDRA DDoS attack detection algorithm.

3.3.1. Benchmark Data Formation by Covariance Matrix and MD. The benchmark data is established from normal network traffic records and is compared with freshly incoming traffic records. Inferior benchmark data can lead to erroneous judgments, such as an incoming traffic record being wrongly regarded as a legitimate record.

Assume that there are $t$ normal training traffic feature records $X^{\mathrm{nor}} = \{x_1^{\mathrm{nor}}, x_2^{\mathrm{nor}}, \ldots, x_t^{\mathrm{nor}}\}$. We need to do two things.

(i) Computing the Covariance Matrices between the Areas of Every Two Triangles. The MCA method is applied to benchmark data formation. The acquired lower triangles are denoted as $X^{\mathrm{nor}}_{\mathrm{TAM_{lower}}} = \{\mathrm{TAM}^{\mathrm{nor},1}_{\mathrm{lower}}, \mathrm{TAM}^{\mathrm{nor},2}_{\mathrm{lower}}, \ldots, \mathrm{TAM}^{\mathrm{nor},t}_{\mathrm{lower}}\}$. Then, we compute the covariance matrix between the areas of every two triangles, that is,

$$C_T = \begin{bmatrix}
\sigma(T^{\mathrm{nor}}_{2,1}, T^{\mathrm{nor}}_{2,1}) & \sigma(T^{\mathrm{nor}}_{2,1}, T^{\mathrm{nor}}_{3,1}) & \cdots & \sigma(T^{\mathrm{nor}}_{2,1}, T^{\mathrm{nor}}_{m,m-1}) \\
\sigma(T^{\mathrm{nor}}_{3,1}, T^{\mathrm{nor}}_{2,1}) & \sigma(T^{\mathrm{nor}}_{3,1}, T^{\mathrm{nor}}_{3,1}) & \cdots & \sigma(T^{\mathrm{nor}}_{3,1}, T^{\mathrm{nor}}_{m,m-1}) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(T^{\mathrm{nor}}_{m,m-1}, T^{\mathrm{nor}}_{2,1}) & \sigma(T^{\mathrm{nor}}_{m,m-1}, T^{\mathrm{nor}}_{3,1}) & \cdots & \sigma(T^{\mathrm{nor}}_{m,m-1}, T^{\mathrm{nor}}_{m,m-1})
\end{bmatrix}. \tag{8}$$

In this formula, the covariance between every two arbitrary elements in $\mathrm{TAM_{lower}}$ is defined as follows:

$$\sigma(T^{\mathrm{nor}}_{jk}, T^{\mathrm{nor}}_{uv}) = \frac{1}{t - 1} \sum_{i=1}^{t} \left( T^{\mathrm{nor},i}_{jk} - \mu_{T^{\mathrm{nor}}_{jk}} \right) \left( T^{\mathrm{nor},i}_{uv} - \mu_{T^{\mathrm{nor}}_{uv}} \right), \tag{9}$$

where the mean of the $(j, k)$th elements and the mean of the $(u, v)$th elements of the TAMs of the $t$ normal training traffic records are, respectively, defined as

$$\mu_{T^{\mathrm{nor}}_{jk}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{jk}, \tag{10}$$

$$\mu_{T^{\mathrm{nor}}_{uv}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{uv}. \tag{11}$$

(ii) Computing the MD between Every Two TAMs of Traffic Records. MD expresses the covariance distance of data and is an effective approach to computing the similarity of two unknown sample sets. Unlike the Euclidean Distance (ED), MD takes the relations between the various features into account and is independent of the measurement scale.

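The MD between the normal training records and their expectation, and the MD between a freshly captured traffic record and the expectation of the normal training records, are given by the following formulas:

$$\mathrm{MD}^{\mathrm{nor},i} = \sqrt{ \left( \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} \right)^{T} \mathrm{cov}^{-1} \left( \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} \right) }, \tag{12}$$

$$\mathrm{MD}^{\mathrm{fresh}} = \sqrt{ \left( \mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} \right)^{T} \mathrm{cov}^{-1} \left( \mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} \right) }. \tag{13}$$

Moreover, the expectation $\overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}$ of the $t$ normal training records is

$$\overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}}. \tag{14}$$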
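The benchmark formation in (8) to (14) can be sketched with NumPy as follows. This is a hedged illustration rather than the authors' code: the function names, the synthetic "normal" data, and the use of a pseudo-inverse (to tolerate a singular covariance matrix) are all assumptions.

```python
import numpy as np

def tam_lower_matrix(X):
    """Stack TAM_lower (equation (7)) of every row of X into a t x m(m-1)/2 array."""
    A = np.abs(X)
    m = X.shape[1]
    cols = [A[:, j] * A[:, k] / 2.0 for k in range(m) for j in range(k + 1, m)]
    return np.column_stack(cols)

def build_benchmark(X_nor):
    """Return the mean TAM_lower (14) and the inverse covariance matrix of (8)."""
    T = tam_lower_matrix(X_nor)
    mean_tam = T.mean(axis=0)
    cov = np.atleast_2d(np.cov(T, rowvar=False))  # divides by t - 1, as in (9)
    cov_inv = np.linalg.pinv(cov)                 # assumption: pseudo-inverse for robustness
    return mean_tam, cov_inv

def mahalanobis(tam, mean_tam, cov_inv):
    """Mahalanobis distance of one TAM_lower vector, equations (12) and (13)."""
    diff = tam - mean_tam
    return float(np.sqrt(max(diff @ cov_inv @ diff, 0.0)))

# Hypothetical normal training records: t = 500 records with m = 4 features.
rng = np.random.default_rng(1)
X_nor = rng.normal(size=(500, 4))
mean_tam, cov_inv = build_benchmark(X_nor)
md_nor = [mahalanobis(row, mean_tam, cov_inv) for row in tam_lower_matrix(X_nor)]

# A fresh record far from the training data should receive a much larger distance.
fresh = np.full((1, 4), 10.0)
md_fresh = mahalanobis(tam_lower_matrix(fresh)[0], mean_tam, cov_inv)
```

The list `md_nor` is what the threshold of the next subsection is computed from, and `md_fresh` is the quantity compared against it.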

3.3.2. Attack Detection Standard Based on MD and Threshold. For DDoS attack detection, we set a threshold value to distinguish DDoS anomaly traffic from normal traffic. The threshold is given by the following formula [8]:

$$\mathrm{Threshold} = \mu + \sigma * \alpha, \tag{15}$$

where $\mu$ is the mean of the MDs of the normal training records, $\mu = \overline{\mathrm{MD}^{\mathrm{nor}}}$ as given by (17), and $\sigma$ is their standard deviation:

$$\sigma = \sqrt{ \frac{1}{t - 1} \sum_{i=1}^{t} \left( \mathrm{MD}^{\mathrm{nor},i} - \overline{\mathrm{MD}^{\mathrm{nor}}} \right)^{2} }, \tag{16}$$

$$\overline{\mathrm{MD}^{\mathrm{nor}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{MD}^{\mathrm{nor},i}. \tag{17}$$

In order to conform to the normal distribution [8], the value of $\alpha$ ranges from 1 to 3 with an increment of 0.2 in this paper. The standard of DDoS attack detection is then obtained: a traffic record is considered an attack when the MD between the freshly acquired record and the expectation of the normal training records is greater than the threshold.
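The threshold rule of (15) to (17) can be illustrated with Python's standard library. The function names and the five sample distances below are hypothetical; the sketch applies the one-sided rule stated in the text (attack when the MD exceeds the threshold).

```python
from statistics import mean, stdev

def detection_threshold(md_nor, alpha):
    """Threshold = mu + sigma * alpha, equation (15)."""
    mu = mean(md_nor)        # equation (17)
    sigma = stdev(md_nor)    # equation (16): sample standard deviation, divides by t - 1
    return mu + sigma * alpha

def classify(md_fresh, threshold):
    """Label a fresh record from its Mahalanobis distance."""
    return "Attack" if md_fresh > threshold else "Normal"

# Hypothetical Mahalanobis distances of normal training records.
md_nor = [2.0, 2.5, 3.0, 3.5, 4.0]

# Candidate alpha values from 1 to 3 in steps of 0.2, as described in the text.
alphas = [round(1.0 + 0.2 * i, 1) for i in range(11)]

threshold = detection_threshold(md_nor, alpha=3.0)
labels = [classify(md, threshold) for md in (5.0, 6.0)]
```

With these sample values the threshold for $\alpha = 3$ is about 5.37, so a distance of 5.0 is labeled normal and 6.0 is labeled an attack.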

Figure 3: Flowchart of attack detection (training data are input to generate benchmark data by triangle area map and Mahalanobis distance; testing data are input to the attack detection model, which comprises the MDRA attack detection algorithm and attack detection based on the given threshold; the output is the detection result).

3.3.3. An MDRA DDoS Attack Detection Algorithm. Tan et al. [8] proposed an algorithm for normal profile generation based on triangle area and MCA, and an algorithm for attack detection based on MD. Evaluation and comparison with some state-of-the-art approaches show that this attack detection method and its system have advantages in detection performance, Detection Rate (DR), and accuracy rate. However, in Big Data cyberspace security, especially as high-volume network attacks keep growing, detection efficiency, resource consumption, and computing complexity need to be taken adequately into account. For these reasons, we propose the MDRA algorithm to detect anomalous network traffic efficiently. Algorithm 1 depicts in detail the procedure for DDoS attack detection based on MDRA.

4. Experiments

In this section, we discuss how to apply our algorithm to detect DDoS attack traffic efficiently. The flowchart of attack detection is shown in Figure 3.

Firstly, we present the data set used in our experiments and the data pretreatment approach. Then, experimental results are obtained to evaluate the performance of the algorithm. Finally, we compare our approach with the previous unoptimized approach in terms of time cost, resource consumption, and computing complexity.

The computer environment used to run our experiments is shown in Table 1.

Next, we describe our experiments in detail.

4.1. Data Set and Pretreatment. In this paper, we use the well-known Knowledge Discovery and Data Mining (KDD) Cup 1999 data set [17-21] to verify our algorithm. Admittedly, this data set has some shortcomings, but it remains the only public and relatively credible labeled benchmark data set so far. It has been widely applied to researching and evaluating network intrusion detection methods [22, 23].


Input: a set of training data of normal network traffic records $X^{\mathrm{nor}} = \{x_1^{\mathrm{nor}}, x_2^{\mathrm{nor}}, \ldots, x_t^{\mathrm{nor}}\}$, where $x_i^{\mathrm{nor}} = [f_1^{i}, f_2^{i}, \ldots, f_m^{i}]$, $1 \le i \le t$
Extract the principal components of $X^{\mathrm{nor}}$ based on PCA so that the accumulative contribution rate reaches 70%, and obtain the principal component data set $X^{P\mathrm{nor}}$
Calculate $\mathrm{TAM}^{P\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P\mathrm{nor}}_{\mathrm{lower}}}$ of $X^{P\mathrm{nor}}$
Calculate the covariance matrices between the areas of every two triangles $T^{P\mathrm{nor}}$ in $X^{P\mathrm{nor}}$
for $i = 1$ to $t$ do
    Input $\mathrm{TAM}^{P\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P\mathrm{nor}}_{\mathrm{lower}}}$
    Calculate $\mathrm{MD}^{P\mathrm{nor},i}$ between $\mathrm{TAM}^{P\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P\mathrm{nor}}_{\mathrm{lower}}}$
    Output $\mathrm{MD}^{P\mathrm{nor},i}$
end for
Calculate $\mu$ by $\mathrm{MD}^{P\mathrm{nor},i}$
Calculate $\sigma$ by $\mathrm{MD}^{P\mathrm{nor},i}$ and $\mu$
Input a fresh incoming traffic record $x^{\mathrm{fresh}}$
Reduce the dimensions of the features of $x^{\mathrm{fresh}}$ based on PCA, and get the record that includes the principal components, $x^{P\mathrm{fresh}}$
Calculate $\mathrm{TAM}^{P\mathrm{fresh}}_{\mathrm{lower}}$ of $x^{P\mathrm{fresh}}$
Calculate $\mathrm{MD}^{P\mathrm{fresh}}$ between $\mathrm{TAM}^{P\mathrm{fresh}}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P\mathrm{nor}}_{\mathrm{lower}}}$
Input the threshold value $\alpha$
if $(\mu - \sigma * \alpha) \le \mathrm{MD}^{P\mathrm{fresh}} \le (\mu + \sigma * \alpha)$ then
    return Normal
else
    return Attack
end if

Algorithm 1: Algorithm for DDoS attack detection based on MDRA.

Table 1: Computer environment to run our experiments.

| CPU | Memory | Hard disk | OS | MATLAB |
|---|---|---|---|---|
| Intel Xeon CPU E5-2640 v2, 2.00 GHz, 2.00 GHz (2 processors) | 32 GB | 2 TB | Windows Server 2008 R2 Enterprise | R2013a (8.1.0.604), 64-bit (win64) |

Table 2: Data sets used in our experiments.

| Category | Training data set (10%) | Testing data set (corrected) |
|---|---|---|
| Normal | 97278 | 60593 |
| DoS | 391458 | 229853 |

The KDD Cup 1999 data set comprises about five million network records and provides a training subset of 10 percent of the network records and a testing subset. It covers four main categories of attack, that is, DoS, R2L, U2R, and Probing. Here, we use the records labeled "normal" in the abovementioned training subset to construct our benchmark data and employ the "corrected" testing subset to verify the validity and efficiency of our algorithm. In this paper, we choose the DoS network attack for evaluating our algorithm and comparing it with previous approaches. The data sets used in our experiments are shown in Table 2. The data pretreatment procedure is as follows.

Firstly, each network traffic record in this data set consists of 41 features plus 1 class label [24]. In our experiments, we need all 41 features of every record to be numeric. However, 3 of the features are nonnumeric, namely, protocol_type, service, and flag, and they must be transformed into numeric type. The type conversion follows Table 3; we analyze the pretreatment process in detail for the feature "service" as follows.

There are 70 kinds of network service types in the "service" feature; however, some of them appear rarely or never, and these can be ignored completely. Among the 494021 records in the 10 percent training subset, counting and sorting show that the top three network service types are ecr_i, private, and http, with ratios of 56.96%, 22.45%, and 13.01%, respectively. All the other types together account for merely 7.58%. The ratios of the top four types in the "service" feature are shown in Table 4.

Secondly, among the 41 features of the records labeled as “normal” in the 10 percent training subset, PCA reveals three invalid features (i.e., wrong_fragment, num_outbound_cmds, and is_host_login). This is because all the values of the three features are zero. Therefore, we remove the three features in our experiments.

Last but not least, we extract the principal components according to the accumulative contribution rate based on the PCA algorithm. As a general rule, the accumulative contribution rate is set to be equal to or greater than 50% to extract important features from the chosen data set [6]. In order to obtain the more important principal components, the accumulative contribution rate is set to 70% in our experiments. The principal components extracted from the 41 features are listed in Table 5.
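The selection rule by accumulative contribution rate can be illustrated as follows. This is a minimal sketch, assuming the contribution of a component is its eigenvalue divided by the sum of all eigenvalues; the function name and the sample eigenvalues are hypothetical and are not taken from the experiments.

```python
def components_for_contribution(eigenvalues, target=0.70):
    """Smallest number of leading components whose accumulative
    contribution rate reaches the target (0.70 here, as in the text)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return count
    return len(ordered)


# Hypothetical eigenvalues: contributions are 0.5, 0.3, 0.1, 0.1.
example = [5.0, 3.0, 1.0, 1.0]
n_components = components_for_contribution(example, target=0.70)  # two components
```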


Table 3: Type conversion for numbers 2, 3, and 4 of 41 features.

Number  Feature name   Type setting 1  Type setting 2  Type setting 3  Type setting 4
2       protocol_type  TCP = 1         UDP = 2         ICMP = 3
3       service        ecr_i = 1       private = 2     http = 3        others = 0
4       flag           SF = 1          others = 0

Table 4: Top four types in “service” feature.

Type name  The percentage
ecr_i      56.96%
private    22.45%
http       13.01%
others     7.58%

Table 5: The principal components extracted in 41 features.

Number  Feature name of principal component
1       duration
2       protocol_type
3       service
4       flag
5       src_bytes
6       dst_bytes
7       land
8       urgent
9       hot
10      num_failed_logins
11      logged_in
12      num_compromised

4.2. Experimental Results. Our experiments aim at showing exhaustive and comparable results between the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former.

In order to assess the advantage of our method, it is indispensable to establish some evaluation indicators. Here, we present four formulae to evaluate our algorithm: Precision, TNR, FPR, and DR [11]. The formulae are defined as follows:

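$$\text{Precision} = \frac{TP}{TP + FP}, \tag{18}$$

$$\text{TNR} = \frac{TN}{FP + TN}, \tag{19}$$

$$\text{FPR} = \frac{FP}{FP + TN}, \tag{20}$$

$$\text{DR} = \frac{TP}{TP + FN}, \tag{21}$$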

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;

(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;

(iii) TN (True Negative) is the number of normal records correctly classified as normal records;

(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA. [Plot of precision (%) versus the value of $\alpha$; series MDRA and MCA.]

Figure 5: TNR for comparing detection methods based on MDRA and MCA. [Plot of TNR (%) versus the value of $\alpha$; series MDRA and MCA.]

Table 6 shows all results of TP, FP, TN, and FN for $\alpha$ set from 1 to 3 in increments of 0.2 when we use the DDoS attack detection methods based on MDRA and MCA.

The detection results of precision and TNR with the different $\alpha$ values are shown in Figures 4 and 5.


Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

           Indicators based on MDRA        Indicators based on MCA
$\alpha$   TP      FP   TN     FN          TP      FP    TN     FN
1          166299  278  60315  63554       223587  1743  58850  6266
1.2        166299  249  60344  63554       221873  1469  59124  7980
1.4        166292  227  60366  63561       206504  1313  59280  23349
1.6        166289  217  60376  63564       191190  1214  59379  38663
1.8        166289  204  60389  63564       190394  1159  59434  39459
2          166289  194  60399  63564       190342  1115  59478  39511
2.2        166289  191  60402  63564       190311  1065  59528  39542
2.4        166289  188  60405  63564       190277  1027  59566  39576
2.6        166282  180  60413  63571       190254  988   59605  39599
2.8        166282  176  60417  63571       190230  953   59640  39623
3          166282  172  60421  63571       190199  927   59666  39654
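As a quick check, the four indicators in (18) to (21) can be recomputed from any row of Table 6. The sketch below uses the MDRA row for $\alpha = 1$; the function name is hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Precision, TNR, FPR, and DR as defined in (18)-(21)."""
    return {
        "precision": tp / (tp + fp),
        "tnr": tn / (fp + tn),
        "fpr": fp / (fp + tn),
        "dr": tp / (tp + fn),
    }


# MDRA counts for alpha = 1, copied from Table 6.
mdra = detection_metrics(tp=166299, fp=278, tn=60315, fn=63554)
# Rounded to four decimals: precision 0.9983, TNR 0.9954, DR 0.7235.
```

Note that TNR and FPR share the same denominator, so they always sum to 1.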

Figure 6: (a) ROC for analyzing detection method based on MDRA. (b) ROC for analyzing detection method based on MCA. [Both panels plot DR (%) versus FPR (%).]

In Figure 4, when the value of $\alpha$ gradually increases from 1 to 3 in increments of 0.2, the precision of the attack detection method based on MDRA is superior to that of the counterpart based on MCA, and the former is about 0.4 to 0.6 percentage points higher than the latter.

Similarly, in Figure 5, the TNR of our detection method is superior to that of the other method over the whole range of $\alpha$, and the former is about 1.2 to 2.4 percentage points higher than the latter.

In addition, the relationship between DR and FPR is frequently used to evaluate detection performance by means of the Receiver Operating Characteristic (ROC) curve. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between the DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. In Figures 6(a) and 6(b), the ROC curves for our method and for the MCA-based method both show a growing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% for DR, which reflects that the change of DR with different $\alpha$ values is fairly small. In Figure 6(b), by contrast, the change is relatively large, and the ROC curve jumps dramatically from 83.18% to 89.84%. However, in Big Data, we pay more attention to the instantaneity, time cost, resource consumption, and computational complexity of attack detection. Therefore, a slight discrepancy in DR could be ignored. At this point, our method has the vast majority of advantages in comparison to other methods. This topic is discussed in the next section.
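The points of such a ROC curve follow directly from the confusion counts at each threshold. The sketch below recomputes a few (FPR, DR) pairs for the MCA-based method from rows of Table 6; the function name is hypothetical, and only four of the eleven rows are included for brevity.

```python
# (alpha, TP, FP, TN, FN) rows copied from Table 6 for the MCA-based method.
MCA_ROWS = [
    (1.0, 223587, 1743, 58850, 6266),
    (1.4, 206504, 1313, 59280, 23349),
    (1.6, 191190, 1214, 59379, 38663),
    (3.0, 190199, 927, 59666, 39654),
]


def roc_points(rows):
    """Return (alpha, FPR %, DR %) tuples sorted by increasing FPR."""
    points = []
    for alpha, tp, fp, tn, fn in rows:
        fpr = 100.0 * fp / (fp + tn)
        dr = 100.0 * tp / (tp + fn)
        points.append((alpha, round(fpr, 2), round(dr, 2)))
    return sorted(points, key=lambda point: point[1])


points = roc_points(MCA_ROWS)
```

With these rows, the DR values 83.18% and 89.84% quoted above correspond to $\alpha = 1.6$ and $\alpha = 1.4$, respectively.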

4.3. Results Comparisons in terms of Time Cost and Resource Consumption. Here, we analyze in detail the time cost and memory resource consumption of the methods based on MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the one based on triangle area and MCA proposed by Tan et al. In our experimental environment, we employ a server which has two CPUs, each with 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was used, and the 16 cores of this CPU gradually loaded to full capacity. The CPU times of running the experimental data with our detection method and with the other one are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

Figure 7: Comparing results in CPU time based on MDRA and MCA. [Plot of CPU time (sec, logarithmic scale) versus the value of $\alpha$; series MDRA and MCA.]

Figure 8: Comparing results in memory consumption based on MDRA and MCA. [Plot of memory consumption (GB) versus the value of $\alpha$; series MDRA and MCA.]

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. In the same experimental environment, the memory space occupied by the detection method proposed by Tan et al. is thus more than 3 times as large as ours. The memory consumption of running the experimental data with both methods is compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in an environment with a vast amount of network traffic in Big Data.

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. Because the previous method based on MCA has a computing complexity of $O(m^2)$ and $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. What is more, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. At this point, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
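The $O(m^2)$ term comes from the number of triangle areas per record, which is $m(m-1)/2$ for $m$ features. A short worked count is sketched below, assuming $m = 41$ for the raw KDD features and $m = 12$ for the principal components listed in Table 5; the exact $m$ used inside each method is an assumption here, and the function name is hypothetical.

```python
def tam_lower_size(m):
    """Number of elements in the lower triangle of an m x m TAM."""
    return m * (m - 1) // 2


full_size = tam_lower_size(41)     # 820 triangle areas per record
reduced_size = tam_lower_size(12)  # 66 triangle areas per record
```

Both counts are constants once $m$ is fixed, which is why the per-record cost is $O(1)$ in both cases, but the reduced vector is more than ten times shorter.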

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution is more effective and efficient in distinguishing attack traffic from a vast amount of normal network traffic in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, “A combined data mining approach for DDoS attack detection,” in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, “Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, “DDoS attack detection method using cluster analysis,” Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, “Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework,” World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,” Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, “DDoS attacks impact on network traffic and its detection approach,” International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.


[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “A system for denial-of-service attack detection based on multivariate correlation analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, “On a mathematical model for low-rate shrew DDoS,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, “A hierarchical intrusion detection model based on the PCA neural networks,” Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, “Evaluation of anomaly detection based on sketch and PCA,” in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM ’10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, “An MR brain images classifier via principal component analysis and kernel support vector machine,” Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, “Denial-of-service attack detection based on multivariate correlation analysis,” in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX ’00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, “Intrusion detection using an ensemble of intelligent paradigms,” Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, “A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection,” Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, “Network programming and mining classifier for intrusion detection using probability classification,” in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME ’12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, “A novel anomaly-network intrusion detection system using ABC algorithms,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, “Attribute normalization in network intrusion detection,” in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN ’09), pp. 448–453, December 2009.



Netflow-DDoS attack, and there are many typical instances, such as DNS amplification attack, SNMP amplification attack, UDP Flood, and ICMP Flood. The second one is connection-DDoS attack; SYN Flood and TCP Flood are the most influential attack cases. Besides, there is a kind of DDoS attack based on application, such as HTTP Get Flood and SSL Flood. In this paper, we focus on how to detect the Netflow-DDoS and connection-DDoS attacks.

In spite of all the effort from industry to academia, DDoS attack is still an open problem. In recent years, the techniques and level of DDoS attacks have been advancing ceaselessly along with the improvement of attack detection capability. With the emergence of Big Data technology, it is much more difficult than ever before to protect the network from various DDoS attacks. The continuously growing network traffic makes it impossible to detect network attack behavior in such large-scale network traffic with previous detection methods.

In this paper, we address the abovementioned challenges and propose a novel method for real-time DDoS attack detection based on the Multivariate Dimensionality Reduction Analysis (MDRA) algorithm, which combines Principal Component Analysis (PCA) and Multivariate Correlation Analysis (MCA). Compared with the previous solutions, our proposed algorithm has the following advantages:

(i) A higher precision rate is achieved, and the True Negative Rate (TNR) approaches 100%.

(ii) CPU computing time is one-eightieth of that of the previous detection method based on MCA.

(iii) Memory resource consumption is one-third of that of the previous detection method based on MCA.

(iv) Computing complexity is constant.

To the best of our knowledge, this paper is the first to propose this theoretical method and to attempt to apply it in the field of DDoS attack detection.

The remainder of this paper is organized as follows. Section 2 introduces the related work in DDoS attack detection and analyzes its shortcomings. Section 3 describes the theoretical approach behind our detection mechanism and presents the attack detection framework based on MDRA. Section 4 discusses the experimental details and gives the experimental results and analyses. Section 5 summarizes this paper.

2. Related Work

Although it has a development history of almost 20 years, DDoS attack detection is still a hot field of research in industry and academia. Its methods and techniques have to keep up with the growing complexity and diversity of DDoS attack means. Previous work mainly includes the following.

In 2004, Kim et al. [2] proposed a combined data mining approach for detecting DDoS attacks of various types, which studied the automatic feature selection module and the classifier generation module. Because per-flow analysis is indispensable to DDoS attack detection, they used Netflow-based data as the gathered data. In 2007, Scherrer et al. [3] focused on how to extract DDoS attack features and how to detect and filter DDoS attack packets by a number of known characteristics. In 2008, Lee et al. [4] designed a method for proactive detection of DDoS attacks by exploiting its architecture and selecting different variables based on attack features; they then performed cluster analysis for proactive detection of attacks. In 2010, Nguyen and Choi [5] introduced a method for preliminary detection of DDoS attacks by classifying the network conditions. They selected some variables based on the key features and applied the $k$-nearest neighbor ($k$-NN) method to classify the network conditions into each phase of a DDoS attack. In addition, Tsai and Lin [6] presented a new method to detect DDoS attacks called “Triangle Area Based Nearest Approach.” By using this approach, the accuracy and the False Positive Rate (FPR) were improved. In 2012, Bhange et al. [7] presented the idea of the DDoS attack and its impact on network traffic. Their paper studied DDoS attacks by analyzing the distribution of network traffic in order to distinguish anomalous traffic from normal network behavior. In 2014, Tan et al. [8] put forward a more sophisticated DoS attack detection approach using MCA and proposed a new detection system based on MCA to protect online services against DoS attacks. In the same year, Luo et al. [9] developed a mathematical model for estimating the combined impact of DDoS attack pattern and network environment on attack effect by originally capturing the adjustment behaviors of the victim TCPs' congestion window.

DDoS attacks can be detected by statistical analysis, data mining, and machine learning. However, some existing detection methods and techniques still suffer from low precision and TNR, and some of them cannot actively detect DDoS attacks. In particular, the previous detection methods and techniques can no longer meet the requirements of the Big Data era because of their low detection efficiency, high resource consumption, and high computing complexity. In this paper, we propose a novel detection mechanism based on MDRA to show how to detect DDoS attack traffic effectively and in real time.

3. Detection Mechanism

Figure 1 shows the overview of our real-time DDoS detection framework. We first collect network traffic data samples from the Internet and then input them into the data acquisition system, which is composed of data cleaning, data store, and data anonymization modules. Next, the processed traffic data are fed into the traffic feature Big Data system. The traffic features in this system serve two functions: the first is Online Attack Detection, and the other is Offline Traffic Analysis based on the Knowledge Base. Here, the results of Offline Traffic Analysis provide the feature recognition for Online Attack Detection. Last but not least, the current network is adjusted on the basis of the routing policy offered by the results of Online Attack Detection.

In this section, our novel method is separated into three components, that is, traffic feature dimensionality reduction, traffic feature correlation analysis, and the attack detection framework based on MDRA and threshold. These components are introduced in the following subsections.

Figure 1: Overview of real-time DDoS attack detection framework. [Diagram labels: network management; Routers 1 to 4; traffic sampling; data acquisition system (data cleaning, data store, data anonymization); traffic feature Big Data system; traffic feature database; online attack detection; offline traffic analysis; knowledge base; feature recognition; routing policy adjustment; online processing; offline processing.]

3.1. Traffic Feature Dimensionality Reduction. A network traffic record encompasses a wide variety of high dimensional features. However, some of these high dimensional features are redundant or noisy. They may influence the effectiveness and efficiency of attack detection. In order to eliminate data redundancy and data noise, we introduce a dimensionality reduction technique into our detection method. The PCA method is used to extract lower dimensional and more representative features. The projections on the remaining dimensionalities are called the principal components [10]. One advantage of PCA is its data-driven design: it keeps the principal components of the feature data and eliminates the correlated and measured feature data. Currently, PCA has been widely applied in the domain of intrusion detection [11] (such as [12, 13]) and in other fields (such as [14]).

In the PCA method, some original dependent random variables are transformed by orthogonal transformation into new random variables whose components are uncorrelated. In algebraic terms, the covariance matrix of the original random variables is transformed into a diagonal matrix. In geometric terms, the original coordinate system is transformed into a new orthogonal coordinate system that points in multiple orthogonal directions.

PCA is able to obtain $P$ principal components. The first principal component is the linear combination with the maximum variance. If the first principal component is not enough to represent the information of the original variables, we select the second linear combination. In order to reflect the original information effectively, the information already carried by the first principal component must not appear in the second principal component. By analogy, all subsequent principal components can be constructed. We assume that a network traffic record sample set $X$ includes $n$ samples and the dimension of each sample is $d$. That is to say, $X = \{X_1, X_2, \ldots, X_n\}$ and $X_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in R^d$, $i = 1, 2, \ldots, n$. The sample matrix $S$ satisfies $S \in R^{n \times d}$. Then, the covariance matrix of the sample matrix is calculated by the following formula:

$$C = \frac{S^T S}{n - 1}, \quad C \in R^{d \times d}. \tag{1}$$

Next, the covariance matrix $C$ needs to be diagonalized. Here, the matrix $C$ is a symmetric matrix, and the purpose of symmetric matrix diagonalization is to find an orthogonal matrix $P$ such that

$$P^T C P = \Lambda, \quad P, \Lambda \in R^{d \times d}. \tag{2}$$

Assuming that we take the dimensions corresponding to the $p$ ($p < d$) largest eigenvalues, a new diagonal matrix $\Lambda_1$ ($\Lambda_1 \in R^{p \times p}$) is set up from these $p$ eigenvalues. The corresponding $p$ eigenvectors constitute a new eigenvector matrix $P_1$ ($P_1 \in R^{d \times p}$). These eigenvectors in $P_1$ constitute a new coordinate system in the low-dimensional space, and they are the principal components.

Assume that the sample matrix after PCA dimensionality reduction is $S_1$. According to the purpose of PCA, the covariance between every two dimensions in $S_1$ is basically zero. In other words, the covariance matrix of $S_1$ is $\Lambda_1$; that is, it satisfies the following condition:

$$\frac{S_1^T S_1}{n - 1} = \Lambda_1. \tag{3}$$

From (2), we get the following formula:

$$P^T C P = \Lambda \Longrightarrow P_1^T C P_1 = \Lambda_1. \tag{4}$$


Substituting (4) into (3), we get

$$\frac{S_1^T S_1}{n - 1} = \Lambda_1 = P_1^T C P_1 = P_1^T \left( \frac{S^T S}{n - 1} \right) P_1 = \frac{(S P_1)^T (S P_1)}{n - 1} \Longrightarrow S_1 = S P_1, \quad S_1 \in R^{n \times p}. \tag{5}$$

Because the covariance matrix of $S_1$ is a diagonal matrix, the components in every two different dimensions are basically uncorrelated. This completes the PCA process.
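The derivation in (1) to (5) can be sketched numerically as follows. This is an illustrative sketch using NumPy, not the MATLAB code of the experiments; the function name is hypothetical, and centering the columns of $S$ before applying (1) is an assumption needed for $S^T S / (n - 1)$ to be a covariance matrix.

```python
import numpy as np


def pca_reduce(S, p):
    """Project the n x d sample matrix S onto its p leading principal components."""
    S = np.asarray(S, dtype=float)
    S = S - S.mean(axis=0)                      # center columns (assumption)
    n = S.shape[0]
    C = S.T @ S / (n - 1)                       # covariance matrix, as in (1)
    eigenvalues, P = np.linalg.eigh(C)          # C is symmetric, so eigh applies
    order = np.argsort(eigenvalues)[::-1][:p]   # indices of the p largest eigenvalues
    P1 = P[:, order]                            # d x p eigenvector matrix
    S1 = S @ P1                                 # reduced sample matrix, as in (5)
    return S1, P1, eigenvalues[order]
```

For the reduced matrix returned here, `S1.T @ S1 / (n - 1)` is diagonal up to rounding error, with the selected eigenvalues on its diagonal, which is condition (3).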

3.2. Traffic Feature Correlation Analysis. From the view of correlation based on statistical theory, DDoS attack traffic features reflect different statistical properties from legitimate network traffic features. Here, we apply MCA [8, 15, 16]. This approach is based on a triangle area technique and Mahalanobis distance (MD). The triangle area technique is able to extract the geometrical correlative information between every two features in an acquired network traffic record, and MD is capable of measuring the similarity between every two traffic records. The analysis is presented as follows.

Assume that there is a captured network traffic record data set $X = \{x_1, x_2, \ldots, x_n\}$. Here, $x_i^T = [f_1^i, f_2^i, \ldots, f_m^i]$, $1 \le i \le n$, where $x_i$ represents the $i$th traffic record and $f_j^i$ indicates the $j$th feature in the $i$th record. For example, $f_j^i$ and $f_k^i$ are a couple of features in $x_i$. The area of a triangle $T_{jk}^i$ is given by

$$T_{jk}^i = \frac{|f_j^i| \times |f_k^i|}{2}, \tag{6}$$

where $1 \le i \le n$, $1 \le j, k \le m$, and $j \ne k$. Figure 2 shows the area of a triangle.

On the basis of (6), we get the area of the triangle for every two distinct features in $x_i$. By analogy, the areas of the corresponding triangles between every two distinct features are acquired for each and every network traffic record, and a Triangle Area Matrix (TAM) is set up. When $j$ is equal to $k$, the value of $T_{jk}^i$ is zero, so the elements on the main diagonal of the matrix are zero. Because $T_{jk}^i$ and $T_{kj}^i$ represent the same triangle area, the values of the two are equal.

As a consequence, we draw the following conclusion: $\mathrm{TAM}^i$ is a symmetric matrix, and the elements of its main diagonal are zero. Here, the lower triangle of the TAM is converted into a vector $\mathrm{TAM}_{\mathrm{lower}}^i$, which is shown as follows:

$$\mathrm{TAM}_{\mathrm{lower}}^i = [T_{21}^i \; T_{31}^i \; \cdots \; T_{m1}^i \; T_{32}^i \; T_{42}^i \; \cdots \; T_{m2}^i \; \cdots \; T_{m,m-1}^i]^T. \tag{7}$$

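Figure 2: Sketch for the area of a triangle. [Diagram: triangle $T_{jk}^i$ in the $x$-$y$ plane with vertices $(0, 0)$, $(f_j^i, 0)$, and $(0, f_k^i)$; the point $(f_j^i, f_k^i)$ is also marked.]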

DDoS attacks are detected by applying MCA to the inherent correlations of traffic features in the Big Data network environment. The geometrical correlation between every pair of traffic features changes when anomalous DDoS attack behaviors appear on the Internet. This change provides an important warning signal.
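The triangle areas in (6) and the lower-triangle vector in (7) can be sketched for a single record as follows. The function name is hypothetical; the element order follows (7), that is, column by column below the main diagonal.

```python
def tam_lower(features):
    """Lower-triangle TAM vector of one traffic record, ordered as in (7):
    T_21, T_31, ..., T_m1, T_32, ..., T_m,m-1."""
    m = len(features)
    return [
        abs(features[j]) * abs(features[k]) / 2.0   # triangle area from (6)
        for k in range(m)                           # column index (0-based)
        for j in range(k + 1, m)                    # rows below the diagonal
    ]


# A record with three features gives three areas: T_21, T_31, T_32.
areas = tam_lower([2.0, -3.0, 4.0])  # [3.0, 4.0, 6.0]
```

Because the TAM is symmetric with a zero diagonal, this vector of $m(m-1)/2$ values carries all of its information.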

3.3. Attack Detection Framework. In this section, we first establish benchmark data by covariance matrix and MD. Secondly, the attack traffic detection based on MD and the selected threshold is implemented. Last but not least, we present the MDRA DDoS attack detection algorithm.

3.3.1. Benchmark Data Formation by Covariance Matrix and MD. The benchmark data is established based on normal network traffic records. It is used for comparison with fresh incoming traffic records. Inferior benchmark data can lead to erroneous estimates, such as an incoming traffic record being wrongly regarded as a legitimate record.

Assume that there are $t$ normal training traffic feature records $X^{\mathrm{nor}} = \{x_1^{\mathrm{nor}}, x_2^{\mathrm{nor}}, \ldots, x_t^{\mathrm{nor}}\}$. We need to do two things.

(i) Computing the Covariance Matrices between the Areas of Every Two Triangles. The MCA method is applied to benchmark data formation. The acquired lower triangles are denoted as follows: $X_{\mathrm{TAM}_{\mathrm{lower}}}^{\mathrm{nor}} = \{\mathrm{TAM}_{\mathrm{lower}}^{\mathrm{nor},1}, \mathrm{TAM}_{\mathrm{lower}}^{\mathrm{nor},2}, \ldots, \mathrm{TAM}_{\mathrm{lower}}^{\mathrm{nor},t}\}$. Then, we compute the covariance matrices between the areas of every two triangles, that is,

$$C_T = \begin{bmatrix}
\sigma(T_{21}^{\mathrm{nor}}, T_{21}^{\mathrm{nor}}) & \sigma(T_{21}^{\mathrm{nor}}, T_{31}^{\mathrm{nor}}) & \cdots & \sigma(T_{21}^{\mathrm{nor}}, T_{m,m-1}^{\mathrm{nor}}) \\
\sigma(T_{31}^{\mathrm{nor}}, T_{21}^{\mathrm{nor}}) & \sigma(T_{31}^{\mathrm{nor}}, T_{31}^{\mathrm{nor}}) & \cdots & \sigma(T_{31}^{\mathrm{nor}}, T_{m,m-1}^{\mathrm{nor}}) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(T_{m,m-1}^{\mathrm{nor}}, T_{21}^{\mathrm{nor}}) & \sigma(T_{m,m-1}^{\mathrm{nor}}, T_{31}^{\mathrm{nor}}) & \cdots & \sigma(T_{m,m-1}^{\mathrm{nor}}, T_{m,m-1}^{\mathrm{nor}})
\end{bmatrix}. \tag{8}$$

In this formula, the covariance between every two arbitrary elements in $\mathrm{TAM}_{\mathrm{lower}}$ is defined as follows:

$$\sigma(T_{jk}^{\mathrm{nor}}, T_{uv}^{\mathrm{nor}}) = \frac{1}{t - 1} \sum_{i=1}^{t} \left( T_{jk}^{\mathrm{nor},i} - \mu_{T_{jk}^{\mathrm{nor}}} \right) \left( T_{uv}^{\mathrm{nor},i} - \mu_{T_{uv}^{\mathrm{nor}}} \right), \tag{9}$$

Mathematical Problems in Engineering 5

where the mean of the (119895 119896)th elements and the mean of the(119906 V)th elements of TAMs for 119905normal training traffic recordsare respectively defined as

120583119879nor119895119896

=1

119905

119905

sum

119894=1

119879nor119894119895119896

(10)

120583119879nor119906V

=1

119905

119905

sum

119894=1

119879nor119894119906V (11)

(ii) Computing the MD between Every Two TAMs of Traffic Records. MD expresses the covariance distance of data and is an effective approach to computing the similarity of two unknown sample sets. Unlike the Euclidean Distance (ED), MD takes the relations between the various features into account and is independent of the scale of measurement.

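The MD between the normal training records and their expectation and the MD between the fresh captured traffic record and the expectation of the normal training records are given by the following formulas, in which $\mathrm{cov}^{-1}$ denotes the inverse of the covariance matrix in (8):

$$
\mathrm{MD}^{\mathrm{nor},i} = \sqrt{\left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}}^{\mathrm{nor}}_{\mathrm{lower}}\right)^{T} \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}}^{\mathrm{nor}}_{\mathrm{lower}}\right)},
\tag{12}
$$

$$
\mathrm{MD}^{\mathrm{fresh}} = \sqrt{\left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}}^{\mathrm{nor}}_{\mathrm{lower}}\right)^{T} \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}}^{\mathrm{nor}}_{\mathrm{lower}}\right)}.
\tag{13}
$$

Moreover, the expectation of $\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}$ for the $t$ normal training records is

$$
\overline{\mathrm{TAM}}^{\mathrm{nor}}_{\mathrm{lower}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}}.
\tag{14}
$$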
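The computation in (9) to (14) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' MATLAB implementation: the function names are assumptions, each row of `rows` stands for one lower-triangle TAM vector, and the inverse covariance is applied by solving a linear system with Gauss-Jordan elimination rather than by forming the inverse explicitly.

```python
import math


def mean_vector(rows):
    """Element-wise mean of the training vectors, as in (14)."""
    t = len(rows)
    return [sum(column) / t for column in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance with the 1/(t-1) factor, as in (9)."""
    t = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (t - 1)
             for b in range(d)] for a in range(d)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, mu, cov):
    """Mahalanobis distance sqrt((x - mu)^T cov^{-1} (x - mu)), as in (12) and (13)."""
    diff = [a - b for a, b in zip(x, mu)]
    weighted = solve(cov, diff)  # equals cov^{-1} * diff
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


# Toy training vectors (made-up values, not KDD Cup 1999 data).
training = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu = mean_vector(training)
cov = covariance_matrix(training)
md = mahalanobis([3.0, 1.0], mu, cov)
print(md)
```

A singular covariance matrix, which arises when some TAM entries are constant across all training records, raises an error here; how the original experiments handle that case is not stated in the text.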

3.3.2. Attack Detection Standard Based on MD and Threshold. For DDoS attack detection, we set a threshold value to distinguish anomalous DDoS traffic from normal traffic. The threshold value is given by the following formula [8]:

$$
\mathrm{Threshold} = \mu + \sigma \ast \alpha,
\tag{15}
$$

where $\mu$ is the mean of the MDs of the normal training records, given by (17), and $\sigma$ is their standard deviation:

$$
\sigma = \sqrt{\frac{1}{t-1} \sum_{i=1}^{t} \left(\mathrm{MD}^{\mathrm{nor},i} - \overline{\mathrm{MD}}^{\mathrm{nor}}\right)^{2}},
\tag{16}
$$

$$
\overline{\mathrm{MD}}^{\mathrm{nor}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{MD}^{\mathrm{nor},i}.
\tag{17}
$$

In order to conform to the normal distribution [8], the range of the $\alpha$ value is set from 1 to 3 with an increment of 0.2 in this paper. The standard of DDoS attack detection is then obtained: a fresh traffic record is considered an attack when the MD between it and the expectation of the normal training records is greater than the threshold.
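The threshold rule of (15) to (17) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the MD values are made up, and `statistics.stdev` is used because it applies the same $1/(t-1)$ factor as (16).

```python
import statistics


def detection_threshold(md_normal, alpha):
    """Threshold = mu + sigma * alpha, as in (15)."""
    mu = statistics.mean(md_normal)      # mean MD of normal records, as in (17)
    sigma = statistics.stdev(md_normal)  # sample standard deviation, as in (16)
    return mu + sigma * alpha


def is_attack(md_fresh, threshold):
    """A fresh record is flagged when its MD exceeds the threshold."""
    return md_fresh > threshold


# Made-up MD values for three normal training records.
md_normal = [1.0, 2.0, 3.0]
threshold = detection_threshold(md_normal, alpha=2.0)
print(threshold, is_attack(4.5, threshold), is_attack(3.0, threshold))
```

Sweeping `alpha` from 1 to 3 in steps of 0.2 would give one threshold per operating point.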

Figure 3: Flowchart of attack detection. (Training data and testing data are the inputs to the attack detection model, which generates benchmark data by triangle area map and Mahalanobis distance and performs attack detection based on the given threshold using the MDRA attack detection algorithm; the output is the detection result.)

3.3.3. An MDRA DDoS Attack Detection Algorithm. Tan et al. [8] proposed an algorithm that generates normal profiles based on triangle areas and MCA and detects attacks based on MD. Evaluation and comparison with several state-of-the-art approaches show that this earlier attack detection method and its system have advantages in detection performance, namely, Detection Rate (DR) and accuracy rate. However, in Big Data cyberspace security, especially as high-volume network attacks keep growing, detection efficiency, resource consumption, and computing complexity need to be taken adequately into account for attack detection. For these reasons, we propose the MDRA algorithm to detect anomalous network traffic efficiently. Algorithm 1 depicts the procedure of the MDRA-based DDoS attack detection algorithm in detail.

4. Experiments

In this section, we discuss how to apply our algorithm to detect DDoS attack traffic efficiently. The flowchart of attack detection is shown in Figure 3.

Firstly, we present the data set used in our experiments and the data pretreatment approach. Then, the experimental results are obtained to evaluate the performance of the algorithm. Finally, we make comparisons with the previous unoptimized approach in terms of time cost, resource consumption, and computing complexity.

The computer environment used to run our experiments is shown in Table 1.

Next, we describe our experiments in detail.

4.1. Data Set and Pretreatment. In this paper, we use the well-known Knowledge Discovery and Data Mining (KDD) Cup 1999 data set [17–21] to verify our algorithm. We have to admit that this data set has some shortcomings, but it is still the only public and relatively credible labeled benchmark data set so far. It has been widely applied to researching and evaluating network intrusion detection methods [22, 23].

(1) Input a set of training data of normal network traffic records $X^{\mathrm{nor}} = \{x_1^{\mathrm{nor}}, x_2^{\mathrm{nor}}, \ldots, x_t^{\mathrm{nor}}\}$, where $x_i^{\mathrm{nor}} = [f_1^i\ f_2^i\ \cdots\ f_m^i]$, $1 \le i \le t$
(2) Extract the principal components of $X^{\mathrm{nor}}$ by PCA so that the accumulative contribution rate reaches 70%, and obtain the principal component data set $X^{P,\mathrm{nor}}$
(3) Calculate $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}}^{P,\mathrm{nor}}_{\mathrm{lower}}$ of $X^{P,\mathrm{nor}}$
(4) Calculate the covariance matrices between the areas of every two triangles $T^{P,\mathrm{nor}}$ in $X^{P,\mathrm{nor}}$
(5) for $i = 1$ to $t$ do
(6)     Input $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}}^{P,\mathrm{nor}}_{\mathrm{lower}}$
(7)     Calculate $\mathrm{MD}^{P,\mathrm{nor},i}$ between $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}}^{P,\mathrm{nor}}_{\mathrm{lower}}$
(8)     Output $\mathrm{MD}^{P,\mathrm{nor},i}$
(9) end for
(10) Calculate $\mu$ by $\mathrm{MD}^{P,\mathrm{nor},i}$
(11) Calculate $\sigma$ by $\mathrm{MD}^{P,\mathrm{nor},i}$ and $\mu$
(12) Input a fresh incoming traffic record $x^{\mathrm{fresh}}$
(13) Reduce the dimensions of the features of $x^{\mathrm{fresh}}$ based on PCA to get the record $x^{P,\mathrm{fresh}}$, which includes the principal components
(14) Calculate $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ of $x^{P,\mathrm{fresh}}$
(15) Calculate $\mathrm{MD}^{P,\mathrm{fresh}}$ between $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}}^{P,\mathrm{nor}}_{\mathrm{lower}}$
(16) Input the threshold value $\alpha$
(17) if $(\mu - \sigma \ast \alpha) \le \mathrm{MD}^{P,\mathrm{fresh}} \le (\mu + \sigma \ast \alpha)$ then
(18)     return Normal
(19) else
(20)     return Attack
(21) end if

Algorithm 1: Algorithm for DDoS attack detection based on MDRA.

Table 1: Computer environment to run our experiments.

CPU         Intel Xeon CPU E5-2640 v2, 2.00 GHz, 2.00 GHz (2 processors)
Memory      32 GB
Hard disk   2 TB
OS          Windows Server 2008 R2 Enterprise
MATLAB      R2013a (8.1.0.604), 64-bit (win64)

Table 2: Data sets used in our experiments.

Category    Training data set (10%)    Testing data set (corrected)
Normal      97278                      60593
DoS         391458                     229853

The KDD Cup 1999 data set comprises about five million network records and provides a training subset of 10 percent of the network records and a testing subset. It covers four main categories of attack, that is, DoS, R2L, U2R, and Probing. Here we use the records labeled as “normal” in the abovementioned training subset to construct our benchmark data and employ the testing subset “corrected” to verify the validity and efficiency of our algorithm. In this paper, we choose DoS network attacks to evaluate our algorithm and compare it with the previous approaches. The data sets used in our experiments are shown in Table 2. The data pretreatment procedure is as follows.

Firstly, each network traffic record in this data set contains information that has been separated into 41 features plus 1 class label [24]. In our experiments, we need numeric data for all 41 features of every record. However, 3 of the features are nonnumeric: protocol_type, service, and flag. They must be transformed into numeric type. The type conversion is carried out according to Table 3, where we analyze in particular the pretreatment process for the feature “service.” The analysis is as follows.

There are 70 kinds of network service types in the “service” feature; however, some of them rarely or never appear, and we can ignore these types completely. By counting and sorting the 494021 records in the training subset of 10 percent, we find that the top three network service types are ecr_i, private, and http, with ratios of 56.96%, 22.45%, and 13.01%, respectively. All the other types together account for merely 7.58%. The ratios of the top four types in the “service” feature are shown in Table 4.
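The type conversion of Table 3 can be sketched as a set of lookup tables. This is only an illustration: the names `PROTOCOL_CODES`, `SERVICE_CODES`, `FLAG_CODES`, and `encode_record` are assumptions, and the lowercase protocol spellings follow the KDD Cup 1999 files rather than the uppercase labels of Table 3.

```python
# Numeric codes taken from Table 3; anything not listed maps to "others = 0".
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}
SERVICE_CODES = {"ecr_i": 1, "private": 2, "http": 3}
FLAG_CODES = {"SF": 1}


def encode_record(protocol_type, service, flag):
    """Convert the three nonnumeric features of one record to numbers."""
    return (
        PROTOCOL_CODES[protocol_type.lower()],  # Table 3 defines no "others" code here
        SERVICE_CODES.get(service, 0),
        FLAG_CODES.get(flag, 0),
    )


print(encode_record("tcp", "http", "SF"))
print(encode_record("udp", "domain_u", "REJ"))
```

An unknown protocol raises a `KeyError` in this sketch, because Table 3 lists no fallback value for that feature.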

Secondly, among the 41 features of the records labeled as “normal” in the training subset of 10 percent, three features (i.e., wrong_fragment, num_outbound_cmds, and is_host_login) are invalid for PCA, because all of their values are zero. Therefore, we remove these three features in our experiments.

Last but not least, we extract the principal components according to the accumulative contribution rate of the PCA algorithm. As a general rule, the accumulative contribution rate is set to 50% or more to extract important features from the chosen data set [6]. In order to obtain the more important principal components, the accumulative contribution rate is set to 70% in our experiments. The principal components extracted from the 41 features are listed in Table 5.
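Choosing the number of principal components from the accumulative contribution rate can be sketched as below. It assumes that the contribution of a component is its eigenvalue divided by the sum of all eigenvalues, which is the usual PCA convention; the function name and the eigenvalues are made up for illustration.

```python
def components_for_contribution(eigenvalues, target=0.70):
    """Smallest number of components whose cumulative eigenvalue share reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return count
    return len(ordered)


# Made-up eigenvalues: the first two components explain 80% of the variance.
example = [5.0, 3.0, 1.0, 1.0]
print(components_for_contribution(example))        # 70% target
print(components_for_contribution(example, 0.95))  # stricter target
```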


Table 3: Type conversion for numbers 2, 3, and 4 of 41 features.

Number    Feature name     Type setting 1    Type setting 2    Type setting 3    Type setting 4
2         protocol_type    TCP = 1           UDP = 2           ICMP = 3
3         service          ecr_i = 1         private = 2       http = 3          others = 0
4         flag             SF = 1            others = 0

Table 4: Top four types in “service” feature.

Type name    Percentage
ecr_i        56.96%
private      22.45%
http         13.01%
others       7.58%

Table 5: The principal components extracted in 41 features.

Number    Feature name of principal component
1         duration
2         protocol_type
3         service
4         flag
5         src_bytes
6         dst_bytes
7         land
8         urgent
9         hot
10        num_failed_logins
11        logged_in
12        num_compromised

4.2. Experimental Results. Our experiments aim at showing exhaustive and comparable results between the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former.

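To assess the advantage of our method, it is necessary to establish some evaluation indicators. Here we present four formulae to evaluate our algorithm: Precision, TNR, FPR, and DR [11]. They are defined as follows:

$$
\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}},
\tag{18}
$$

$$
\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{FP} + \mathrm{TN}},
\tag{19}
$$

$$
\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}},
\tag{20}
$$

$$
\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},
\tag{21}
$$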

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;
(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;
(iii) TN (True Negative) is the number of normal records correctly classified as normal records;
(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA (precision (%) versus the value of $\alpha$).

Figure 5: TNR for comparing detection methods based on MDRA and MCA (TNR (%) versus the value of $\alpha$).

Table 6 shows all results of TP, FP, TN, and FN for $\alpha$ set from 1 to 3 with an increment of 0.2 when we use the DDoS attack detection methods based on MDRA and MCA.
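As a quick check of formulas (18) to (21), the sketch below evaluates them on the $\alpha = 1$ row of Table 6. The function name and dictionary keys are assumptions chosen for illustration; only the four counts per method come from the table.

```python
def detection_metrics(tp, fp, tn, fn):
    """Precision, TNR, FPR, and DR as defined in (18) to (21)."""
    return {
        "precision": tp / (tp + fp),
        "tnr": tn / (fp + tn),
        "fpr": fp / (fp + tn),
        "dr": tp / (tp + fn),
    }


# Counts from the alpha = 1 row of Table 6.
mdra = detection_metrics(tp=166299, fp=278, tn=60315, fn=63554)
mca = detection_metrics(tp=223587, fp=1743, tn=58850, fn=6266)

for name, scores in (("MDRA", mdra), ("MCA", mca)):
    print(name, {key: round(100 * value, 2) for key, value in scores.items()})
```

Since TNR and FPR share the same denominator, they always sum to 1, so either one determines the other.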

The detection results for precision and TNR with the different $\alpha$ values are shown in Figures 4 and 5.

Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

            Indicators based on MDRA             Indicators based on MCA
α           TP        FP     TN       FN         TP        FP      TN       FN
α = 1       166299    278    60315    63554      223587    1743    58850    6266
α = 1.2     166299    249    60344    63554      221873    1469    59124    7980
α = 1.4     166292    227    60366    63561      206504    1313    59280    23349
α = 1.6     166289    217    60376    63564      191190    1214    59379    38663
α = 1.8     166289    204    60389    63564      190394    1159    59434    39459
α = 2       166289    194    60399    63564      190342    1115    59478    39511
α = 2.2     166289    191    60402    63564      190311    1065    59528    39542
α = 2.4     166289    188    60405    63564      190277    1027    59566    39576
α = 2.6     166282    180    60413    63571      190254    988     59605    39599
α = 2.8     166282    176    60417    63571      190230    953     59640    39623
α = 3       166282    172    60421    63571      190199    927     59666    39654

Figure 6: (a) ROC for analyzing detection method based on MDRA. (b) ROC for analyzing detection method based on MCA. (Both panels plot DR (%) against FPR (%).)

Figure 4 shows that, as the value of $\alpha$ increases from 1 to 3 with an increment of 0.2, the precision of the attack detection method based on MDRA is superior to that of its counterpart based on MCA; the former is about 0.4 to 0.6 percent higher than the latter.

Similarly, Figure 5 shows that the TNR of our detection method is superior to that of the other method for every value of $\alpha$; the former is about 1.2 to 2.4 percent higher than the latter.

In addition, the relationship between DR and FPR, plotted as the Receiver Operating Characteristic (ROC) curve, is frequently used to evaluate detection performance. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. In Figures 6(a) and 6(b), both ROC curves show a growing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% for DR, which reflects that the change of DR with different $\alpha$ values is fairly small. By contrast, in Figure 6(b), this change is relatively large, and the ROC curve jumps dramatically from 83.18% to 89.84%. However, in Big Data, we pay more attention to the instantaneity, time cost, resource consumption, and computational complexity of attack detection. Therefore, a slight discrepancy in DR could be ignored. In these respects, our method holds most of the advantages in comparison to other methods. This topic is discussed in the next section.

4.3. Results Comparisons in terms of Time Cost and Resource Consumption. Here we analyze in particular the time cost and memory resource consumption of MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the one based on triangle areas and MCA proposed by Tan et al. In our experimental environment, we employ a server that has two CPUs, each with 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was active and its 16 cores gradually loaded to full capacity. The CPU times of running the experimental data with our detection method and with the other one are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

Figure 7: Comparing results in CPU time based on MDRA and MCA (CPU time (sec), on a logarithmic scale from $10^{-1}$ to $10^{2}$, versus the value of $\alpha$).

Figure 8: Comparing results in memory consumption based on MDRA and MCA (memory consumption (GB) versus the value of $\alpha$).

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. In the same experimental environment, the memory space occupied by the detection method proposed by Tan et al. is more than 3 times as large as ours. The memory consumption of running the experimental data is compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in an environment with a vast amount of network traffic in Big Data.

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. Because the previous method based on MCA has a computing complexity of $O(m^2)$ and $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. What is more, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that of the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
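As a rough illustration of how the fixed dimensionality affects the constant factor, the lower-triangle vector of an $m \times m$ TAM has $m(m-1)/2$ entries. The sketch below evaluates this for $m = 12$, the number of principal components in Table 5, and for a hypothetical $m = 36$, chosen only because the text states that the reduced dimensionality is one-third of the previous one. The function name is an assumption for illustration.

```python
def tam_lower_length(m):
    """Number of entries in the lower triangle of an m x m TAM, i.e., m(m-1)/2."""
    return m * (m - 1) // 2


reduced = tam_lower_length(12)   # 12 principal components (Table 5)
original = tam_lower_length(36)  # hypothetical: three times 12
print(reduced, original)
```

Both values are constants with respect to the number of traffic records, which is why the overall complexity stays $O(1)$ per record in either case.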

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution distinguishes attack traffic from a vast amount of normal network traffic more effectively and efficiently in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, “A combined data mining approach for DDoS attack detection,” in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, “Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, “DDoS attack detection method using cluster analysis,” Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, “Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework,” World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,” Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, “DDoS attacks impact on network traffic and its detection approach,” International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.

[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “A system for denial-of-service attack detection based on multivariate correlation analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, “On a mathematical model for low-rate shrew DDoS,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, “A hierarchical intrusion detection model based on the PCA neural networks,” Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, “Evaluation of anomaly detection based on sketch and PCA,” in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM ’10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, “An MR brain images classifier via principal component analysis and kernel support vector machine,” Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, “Denial-of-service attack detection based on multivariate correlation analysis,” in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX ’00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, “Intrusion detection using an ensemble of intelligent paradigms,” Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, “A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection,” Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, “Network programming and mining classifier for intrusion detection using probability classification,” in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME ’12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, “A novel anomaly-network intrusion detection system using ABC algorithms,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, “Attribute normalization in network intrusion detection,” in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN ’09), pp. 448–453, December 2009.


Figure 1: Overview of real-time DDoS attack detection framework. (Labeled components: Routers 1 to 4, data acquisition system with traffic sampling, data anonymization, and data cleaning, traffic feature Big Data system with data store and traffic feature database, feature recognition, knowledge base, online attack detection, offline traffic analysis, routing policy adjustment, and network management; online and offline processing paths are distinguished.)

detection framework based on MDRA and threshold. These components are introduced in the following subsections.

3.1. Traffic Feature Dimensionality Reduction. A network traffic record encompasses a wide variety of high-dimensional features. However, some of these features are redundant or noisy, and they may impair the effectiveness and efficiency of attack detection. In order to eliminate data redundancy and data noise, we introduce a dimensionality reduction technique into our detection method. The PCA method is used to extract lower-dimensional and more representative features. The projections on the remaining dimensions are called the principal components [10]. One advantage of PCA is its data-driven design: it keeps the principal components of the feature data and eliminates the correlated and measured feature data. Currently, PCA is widely applied in the domain of intrusion detection [11] (such as [12, 13]) and in other fields (such as [14]).

In the PCA method, the original dependent random variables are transformed by an orthogonal transformation into new random variables whose components are uncorrelated. In algebraic terms, the covariance matrix of the original random variables is transformed into a diagonal matrix. In geometric terms, the original coordinate system is transformed into a new orthogonal coordinate system whose axes point in multiple orthogonal directions.

PCA is able to obtain $P$ principal components. The first principal component is the linear combination with the maximum variance. If the first principal component is not enough to represent the information of the original variables, we select the second linear combination. In order to reflect the original information effectively, the information already captured by the first principal component must not appear in the second principal component. By analogy, all subsequent principal components can be constructed. We assume that a network traffic record sample set $X$ includes $n$ samples and that the dimension of each sample is $d$. That is to say, $X = \{X_1, X_2, \ldots, X_n\}$ and $X_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in R^{d}$, $i = 1, 2, \ldots, n$. The sample matrix $S$ satisfies $S \in R^{n \times d}$. Then the covariance matrix of the sample matrix is calculated by the following formula:

$$
C = \frac{S^{T} S}{n - 1}, \quad C \in R^{d \times d}.
\tag{1}
$$

Next, the covariance matrix $C$ needs to be diagonalized. Since $C$ is a symmetric matrix, the purpose of the diagonalization is to find an orthogonal matrix $P$ such that

$$
P^{T} C P = \Lambda, \quad P, \Lambda \in R^{d \times d}.
\tag{2}
$$

Assuming that we take the dimensions corresponding to the $p$ ($p < d$) largest eigenvalues, a new diagonal matrix $\Lambda_1$ ($\Lambda_1 \in R^{p \times p}$) is set up from these $p$ eigenvalues. The corresponding $p$ eigenvectors constitute a new eigenvector matrix $P_1$ ($P_1 \in R^{d \times p}$). These eigenvectors in $P_1$ constitute a new coordinate system in the low-dimensional space, and they are the principal components.

Assume that the sample matrix after PCA dimensionality reduction is $S_1$. According to the purpose of PCA, the covariance between every two dimensions in $S_1$ is basically zero. In other words, the covariance matrix of $S_1$ is $\Lambda_1$, so the following condition is satisfied:

$$
\frac{S_1^{T} S_1}{n - 1} = \Lambda_1.
\tag{3}
$$

From (2) we get

$$
P^{T} C P = \Lambda \Longrightarrow P_1^{T} C P_1 = \Lambda_1.
\tag{4}
$$

Substituting (4) and (1) into (3), we get

$$
\frac{S_1^{T} S_1}{n - 1} = \Lambda_1 = P_1^{T} C P_1 = P_1^{T} \left(\frac{S^{T} S}{n - 1}\right) P_1 = \frac{(S P_1)^{T} (S P_1)}{n - 1} \Longrightarrow S_1 = S P_1, \quad S_1 \in R^{n \times p}.
\tag{5}
$$

Because the covariance matrix of $S_1$ is a diagonal matrix, the components in every two different dimensions are basically independent. This completes the PCA process.
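The relation $S_1 = S P_1$ in (5) can be checked numerically for the simplest case $p = 1$. The sketch below is an illustration under assumptions, not the procedure used in the experiments: it mean-centers the samples first (the text does not state this step explicitly, but formula (1) is a covariance only for centered data), it finds the leading eigenvector by power iteration instead of a full eigendecomposition, and all function names and sample values are made up.

```python
import math


def center(rows):
    """Subtract the column means so that S^T S / (n - 1) is a covariance matrix."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def covariance(centered):
    """C = S^T S / (n - 1), as in (1)."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    d = len(matrix)
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    value = sum(vec[a] * sum(matrix[a][b] * vec[b] for b in range(d))
                for a in range(d))
    return value, vec


def project(centered, direction):
    """S_1 = S P_1 for a single principal direction, as in (5)."""
    return [sum(v * w for v, w in zip(row, direction)) for row in centered]


samples = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]  # made-up data
S = center(samples)
eigenvalue, direction = leading_eigenpair(covariance(S))
projected = project(S, direction)
variance = sum(p * p for p in projected) / (len(projected) - 1)
print(eigenvalue, variance)
```

The variance of the projected data equals the leading eigenvalue, which is condition (3) restricted to one component.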

3.2. Traffic Feature Correlation Analysis. From the viewpoint of correlation in statistical theory, DDoS attack traffic features show statistical properties different from those of legitimate network traffic features. Here we apply MCA [8, 15, 16]. This approach is based on a triangle area technique and the Mahalanobis distance (MD). The triangle area technique extracts geometrical correlative information between every two features in an acquired network traffic record, and MD measures the similarity between every two traffic records. The analysis is presented as follows.

Assume that there is a captured network traffic record data set $X = \{x_1, x_2, \ldots, x_n\}$. Here $x_i^{T} = [f_1^i\ f_2^i\ \cdots\ f_m^i]$, $1 \le i \le n$, where $x_i$ represents the $i$th traffic record and $f_j^i$ indicates the $j$th feature in the $i$th record. For example, $f_j^i$ and $f_k^i$ are a couple of features in $x_i$. The area of a triangle $T^i_{j,k}$ is given by

$$
T^i_{j,k} = \frac{\left|f_j^i\right| \times \left|f_k^i\right|}{2},
\tag{6}
$$

where $1 \le i \le n$, $1 \le j, k \le m$, and $j \ne k$. Figure 2 shows the area of a triangle.

On the basis of (6), we get the area of the triangle for every two distinct features in $x_i$. In the same way, the areas of the triangles between every two distinct features are acquired for each network traffic record, and a Triangle Area Matrix (TAM) is set up. When $j$ is equal to $k$, the value of $T^i_{j,k}$ is zero, so the elements on the main diagonal of the matrix are zero. Because $T^i_{j,k}$ and $T^i_{k,j}$ represent the same triangle area, the two values are equal.

As a consequence, we draw the following conclusion: $\mathrm{TAM}^i$ is a symmetric matrix, and the elements of its main diagonal are zero. Here the lower triangle of the TAM is converted into a vector $\mathrm{TAM}^i_{\mathrm{lower}}$, which is given by

$$
\mathrm{TAM}^i_{\mathrm{lower}} = \left[T^i_{2,1}\ T^i_{3,1}\ \cdots\ T^i_{m,1}\ T^i_{3,2}\ T^i_{4,2}\ \cdots\ T^i_{m,2}\ \cdots\ T^i_{m,m-1}\right]^{T}.
\tag{7}
$$
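Formulas (6) and (7) can be sketched as a single function that maps one traffic record to its lower-triangle vector. The function name and the sample record are assumptions for illustration; the element order follows (7), that is, column by column through the lower triangle.

```python
def triangle_area_lower(features):
    """Lower-triangle TAM vector of one record, ordered as in (7).

    Each entry is T_{j,k} = |f_j| * |f_k| / 2 from (6), with j > k.
    """
    m = len(features)
    return [abs(features[j]) * abs(features[k]) / 2.0
            for k in range(m)          # column index (smaller feature index)
            for j in range(k + 1, m)]  # row index (larger feature index)


record = [2.0, -3.0, 4.0]  # made-up record with m = 3 features
print(triangle_area_lower(record))
```

A record with $m$ features yields $m(m-1)/2$ triangle areas, so the three features above give three entries.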

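Figure 2: Sketch for the area of a triangle. (In the $x$-$y$ plane, the triangle $T^i_{j,k}$ has vertices $(0, 0)$, $(f_j^i, 0)$, and $(0, f_k^i)$; the point $(f_j^i, f_k^i)$ is also marked.)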

DDoS attack is detected by the application to inherentMCA of traffic features in the network environment of BigData The geometrical correlation between every two pairsof traffic features has changed when anomaly behaviors ofDDoS attack appear on the Internet This approach providesan important warning signal

3.3. Attack Detection Framework. In this section, we first establish benchmark data by covariance matrix and MD. Secondly, attack traffic detection based on MD and the selected threshold is implemented. Last but not least, we present the MDRA DDoS attack detection algorithm.

3.3.1. Benchmark Data Formation by Covariance Matrix and MD. The benchmark data is established based on normal network traffic records. It is used for comparison with fresh incoming traffic records. Inferior benchmark data can lead to an erroneous estimate in which an incoming traffic record is wrongly regarded as a legitimate record.

Assume that there are $t$ normal training traffic feature records $X^{\mathrm{nor}} = \{x^{\mathrm{nor}}_1, x^{\mathrm{nor}}_2, \ldots, x^{\mathrm{nor}}_t\}$. We need to do two things.

(i) Computing the Covariance Matrices between the Areas of Every Two Triangles. The MCA method is applied to benchmark data formation. The acquired lower triangles are denoted as $X^{\mathrm{nor}}_{\mathrm{TAM}_{\mathrm{lower}}} = \{\mathrm{TAM}^{\mathrm{nor},1}_{\mathrm{lower}}, \mathrm{TAM}^{\mathrm{nor},2}_{\mathrm{lower}}, \ldots, \mathrm{TAM}^{\mathrm{nor},t}_{\mathrm{lower}}\}$. Then we compute the covariance matrix between the areas of every two triangles, that is,

$$C_T = \begin{bmatrix}
\sigma(T^{\mathrm{nor}}_{21}, T^{\mathrm{nor}}_{21}) & \sigma(T^{\mathrm{nor}}_{21}, T^{\mathrm{nor}}_{31}) & \cdots & \sigma(T^{\mathrm{nor}}_{21}, T^{\mathrm{nor}}_{m\,m-1}) \\
\sigma(T^{\mathrm{nor}}_{31}, T^{\mathrm{nor}}_{21}) & \sigma(T^{\mathrm{nor}}_{31}, T^{\mathrm{nor}}_{31}) & \cdots & \sigma(T^{\mathrm{nor}}_{31}, T^{\mathrm{nor}}_{m\,m-1}) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(T^{\mathrm{nor}}_{m\,m-1}, T^{\mathrm{nor}}_{21}) & \sigma(T^{\mathrm{nor}}_{m\,m-1}, T^{\mathrm{nor}}_{31}) & \cdots & \sigma(T^{\mathrm{nor}}_{m\,m-1}, T^{\mathrm{nor}}_{m\,m-1})
\end{bmatrix}. \tag{8}$$

In this formula, the covariance between any two elements in $\mathrm{TAM}_{\mathrm{lower}}$ is defined as follows:

$$\sigma(T^{\mathrm{nor}}_{jk}, T^{\mathrm{nor}}_{uv}) = \frac{1}{t-1} \sum_{i=1}^{t} \left(T^{\mathrm{nor},i}_{jk} - \mu_{T^{\mathrm{nor}}_{jk}}\right)\left(T^{\mathrm{nor},i}_{uv} - \mu_{T^{\mathrm{nor}}_{uv}}\right), \tag{9}$$

where the mean of the $(j, k)$th elements and the mean of the $(u, v)$th elements of the TAMs for the $t$ normal training traffic records are, respectively, defined as

$$\mu_{T^{\mathrm{nor}}_{jk}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{jk}, \tag{10}$$

$$\mu_{T^{\mathrm{nor}}_{uv}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{uv}. \tag{11}$$
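To make (8)-(11) concrete, the sketch below computes the sample covariance matrix of a list of lower-triangle TAM vectors using only the Python standard library. It is a minimal illustration under the assumption that each row is one $\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}}$ vector; the names `column_means` and `covariance_matrix` are hypothetical.

```python
from typing import List


def column_means(rows: List[List[float]]) -> List[float]:
    """Mean of every TAM element over the t training records, as in (10)/(11)."""
    t = len(rows)
    return [sum(row[c] for row in rows) / t for c in range(len(rows[0]))]


def covariance_matrix(rows: List[List[float]]) -> List[List[float]]:
    """Sample covariance (divisor t - 1) between every two TAM elements, as in (9)."""
    t = len(rows)
    if t < 2:
        raise ValueError("at least two training records are required")
    mu = column_means(rows)
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for a in range(d):
        for b in range(a, d):
            s = sum((row[a] - mu[a]) * (row[b] - mu[b]) for row in rows)
            cov[a][b] = cov[b][a] = s / (t - 1)  # the matrix is symmetric
    return cov


if __name__ == "__main__":
    # Three made-up TAM vectors of length 2 (not data from the paper).
    print(covariance_matrix([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]))
```

Because the matrix has one row and one column per TAM element, its size grows with the square of $m(m-1)/2$.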

(ii) Computing the MD between Every Two TAMs of Traffic Records. MD signifies the covariance distance of data. It is an effective approach to computing the similarity of two unknown sample sets. Unlike the Euclidean Distance (ED), MD takes the relations between the various characteristics into account and is independent of the measurement scale.

The MD between each normal training record and the expectation of the normal training records, and the MD between a freshly captured traffic record and that expectation, are given by the following formulas:

$$\mathrm{MD}^{\mathrm{nor},i} = \sqrt{\left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)^T \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)}, \tag{12}$$

$$\mathrm{MD}^{\mathrm{fresh}} = \sqrt{\left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)^T \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)}. \tag{13}$$

Moreover, the expectation of $\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}$ for the $t$ normal training records is given by

$$\overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}}. \tag{14}$$
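The distance in (12) and (13) can be sketched in plain Python as follows. This is an illustrative sketch only: it assumes that $\mathrm{cov}$ is the covariance matrix of the normal TAM vectors and that it is invertible, and it solves a linear system instead of forming the inverse explicitly. The helper names `solve_linear` and `mahalanobis` are hypothetical.

```python
import math
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - rest) / aug[r][r]
    return x


def mahalanobis(vec: List[float], mean: List[float],
                cov: List[List[float]]) -> float:
    """sqrt((vec - mean)^T cov^{-1} (vec - mean)), as in (12) and (13)."""
    diff = [v - m for v, m in zip(vec, mean)]
    y = solve_linear(cov, diff)  # y = cov^{-1} * diff
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))


if __name__ == "__main__":
    # Made-up 2-element example with a diagonal covariance matrix.
    print(mahalanobis([2.0, 3.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 9.0]]))
```

With the identity matrix as covariance, the function reduces to the Euclidean distance, which matches the contrast between MD and ED drawn above. If the covariance matrix is singular, some form of regularization or a pseudo-inverse would be needed; the source does not state how that case is handled.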

3.3.2. Attack Detection Standard Based on MD and Threshold. For DDoS attack detection, we set a threshold value to distinguish DDoS anomaly traffic from normal traffic. The threshold is given by the following formula [8]:

$$\mathrm{Threshold} = \mu + \sigma * \alpha, \tag{15}$$

where $\mu$ is the mean of the MDs of the normal training records, that is, $\mu = \overline{\mathrm{MD}^{\mathrm{nor}}}$ as given by (17), and $\sigma$ is their standard deviation:

$$\sigma = \sqrt{\frac{1}{t-1} \sum_{i=1}^{t} \left(\mathrm{MD}^{\mathrm{nor},i} - \overline{\mathrm{MD}^{\mathrm{nor}}}\right)^2}, \tag{16}$$

$$\overline{\mathrm{MD}^{\mathrm{nor}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{MD}^{\mathrm{nor},i}. \tag{17}$$

Under the assumption that the MDs conform to the normal distribution [8], the value of $\alpha$ is varied from 1 to 3 in increments of 0.2 in this paper. The standard of DDoS attack detection is then obtained: a record is considered an attack when the MD between the freshly acquired traffic record and the expectation of the normal training records is greater than the threshold.
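The threshold in (15)-(17) can be sketched as follows. The code assumes a list of already computed normal-record MDs and sweeps $\alpha$ from 1 to 3 in steps of 0.2 as described above; the names `md_mean_and_std` and `threshold_sweep` are hypothetical, and the sample values are made up.

```python
import math
from typing import List, Tuple


def md_mean_and_std(md_normal: List[float]) -> Tuple[float, float]:
    """Mean (17) and sample standard deviation (16) of the normal MDs."""
    t = len(md_normal)
    if t < 2:
        raise ValueError("at least two normal MDs are required")
    mean = sum(md_normal) / t
    variance = sum((d - mean) ** 2 for d in md_normal) / (t - 1)
    return mean, math.sqrt(variance)


def threshold_sweep(md_normal: List[float]) -> List[Tuple[float, float]]:
    """Return (alpha, threshold) pairs for alpha = 1.0, 1.2, ..., 3.0, per (15)."""
    mu, sigma = md_mean_and_std(md_normal)
    alphas = [round(1.0 + 0.2 * step, 1) for step in range(11)]
    return [(alpha, mu + sigma * alpha) for alpha in alphas]


if __name__ == "__main__":
    for alpha, threshold in threshold_sweep([1.0, 2.0, 3.0]):
        print(alpha, threshold)
```

A larger $\alpha$ raises the threshold, so fewer records are flagged as attacks.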

Figure 3: Flowchart of attack detection (training data is the input for generating benchmark data by triangle area map and Mahalanobis distance; testing data is the input for attack detection based on the given threshold; both steps belong to the attack detection model built on the MDRA attack detection algorithm, whose output is the detection result).

3.3.3. An MDRA DDoS Attack Detection Algorithm. Tan et al. [8] proposed an algorithm that generates the normal profile based on triangle area and MCA and performs attack detection based on MD. Evaluation and comparison with some state-of-the-art approaches show that this earlier attack detection method and its system have advantages in detection performance, namely, Detection Rate (DR) and accuracy rate. However, in Big Data cyberspace security, especially as high-volume network attacks keep growing, detection efficiency, resource consumption, and computing complexity need to be taken adequately into account for attack detection. For these reasons, we propose the MDRA algorithm to detect anomalous network traffic efficiently. Algorithm 1 depicts in detail the procedure of the DDoS attack detection algorithm based on MDRA.

4. Experiments

In this section, we discuss how to apply our algorithm to detect DDoS attack traffic efficiently. The flowchart of attack detection is shown in Figure 3.

Firstly, we present the data set used in our experiments and the data pretreatment approach. Then, the experimental results are obtained to evaluate the performance of the algorithm. Finally, we make comparisons with the previous unoptimized approach in terms of time cost, resource consumption, and computing complexity.

The computer environment used to run our experiments is shown in Table 1.

Next, we describe our experiments in detail.

4.1. Data Set and Pretreatment. In this paper, we use the well-known Knowledge Discovery and Data Mining (KDD) Cup 1999 data set [17–21] to verify our novel algorithm. Admittedly, this data set has some shortcomings, but so far it is still the only publicly available and relatively credible labeled benchmark data set. It has been widely applied to researching and evaluating network intrusion detection methods [22, 23].


(1) Input a set of training data of normal network traffic records $X^{\mathrm{nor}} = \{x^{\mathrm{nor}}_1, x^{\mathrm{nor}}_2, \ldots, x^{\mathrm{nor}}_t\}$, where $x^{\mathrm{nor}}_i = [f^i_1, f^i_2, \ldots, f^i_m]$, $1 \le i \le t$
(2) Extract the principal components of $X^{\mathrm{nor}}$ to reach 70% for the accumulative contribution rate based on PCA, and obtain the principal component data set $X^{P,\mathrm{nor}}$
(3) Calculate $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$ of $X^{P,\mathrm{nor}}$
(4) Calculate the covariance matrix between the areas of every two triangles $T^{P,\mathrm{nor}}$ in $X^{P,\mathrm{nor}}$
(5) for $i = 1$ to $t$ do
(6)     Input $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(7)     Calculate $\mathrm{MD}^{P,\mathrm{nor},i}$ between $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(8)     Output $\mathrm{MD}^{P,\mathrm{nor},i}$
(9) end for
(10) Calculate $\mu$ by $\mathrm{MD}^{P,\mathrm{nor},i}$
(11) Calculate $\sigma$ by $\mathrm{MD}^{P,\mathrm{nor},i}$ and $\mu$
(12) Input a fresh incoming traffic record $x^{\mathrm{fresh}}$
(13) Reduce the dimensions of the features for $x^{\mathrm{fresh}}$ based on PCA, then get the record which includes the principal components, $x^{P,\mathrm{fresh}}$
(14) Calculate $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ of $x^{P,\mathrm{fresh}}$
(15) Calculate $\mathrm{MD}^{P,\mathrm{fresh}}$ between $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(16) Input the threshold value $\alpha$
(17) if $(\mu - \sigma * \alpha) \le \mathrm{MD}^{P,\mathrm{fresh}} \le (\mu + \sigma * \alpha)$ then
(18)     return Normal
(19) else
(20)     return Attack
(21) end if

Algorithm 1: Algorithm for DDoS attack detection based on MDRA.
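Steps (16)-(21) of Algorithm 1 can be read as the small decision function sketched below. Note that the algorithm listing tests a two-sided band around $\mu$, whereas the prose around (15) describes only the upper bound; the sketch follows the listing. The function name `classify` is hypothetical, and the numbers in the usage lines are made up.

```python
def classify(md_fresh: float, mu: float, sigma: float, alpha: float) -> str:
    """Return "Normal" if md_fresh lies within mu +/- sigma * alpha, else "Attack"."""
    lower = mu - sigma * alpha
    upper = mu + sigma * alpha
    return "Normal" if lower <= md_fresh <= upper else "Attack"


if __name__ == "__main__":
    # Made-up profile: mu = 2.0, sigma = 1.0, alpha = 1.0 gives the band [1.0, 3.0].
    print(classify(2.5, 2.0, 1.0, 1.0))
    print(classify(3.5, 2.0, 1.0, 1.0))
```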

Table 1: Computer environment to run our experiments.

CPU         Intel Xeon CPU E5-2640 v2, 2.00 GHz 2.00 GHz (2 processors)
Memory      32 GB
Hard disk   2 TB
OS          Windows Server 2008 R2 Enterprise
MATLAB      R2013a (8.1.0.604), 64-bit (win64)

Table 2: Data sets used in our experiments.

Category    Training data set (10%)    Testing data set (corrected)
Normal      97278                      60593
DoS         391458                     229853

The KDD Cup 1999 data set comprises about five million network records and provides a training subset of 10 percent of the network records and a testing subset. It covers four main categories of attack, that is, DoS, R2L, U2R, and Probing. Here, we use the records labeled as "normal" in the abovementioned training subset to construct our benchmark data and employ the testing subset "corrected" to verify the validity and efficiency of our algorithm. In this paper, we choose the DoS network attack for evaluating our algorithm and comparing it with the previous approaches. The data sets used in our experiments are shown in Table 2. The data pretreatment procedure is as follows.

Firstly, each network traffic record in this data set consists of 41 features plus 1 class label [24]. In our experiments, we need all-numeric data for the 41 features of every record. However, 3 of the features are nonnumeric, namely, protocol_type, service, and flag, and they must be transformed into numeric type. The type conversion is performed according to Table 3. We analyze in detail the pretreatment process for the feature "service" as follows.

There are 70 kinds of network service types in the "service" feature; however, some of them rarely or never appear, and these types can be ignored completely. Among the 494021 records in the 10 percent training subset, counting and sorting show that the top three network service types are ecr_i, private, and http, with ratios of 56.96%, 22.45%, and 13.01%, respectively. All the other types together account for merely 7.58%. The ratios of the top four types in the "service" feature are shown in Table 4.

Secondly, among the 41 features of the records labeled as "normal" in the 10 percent training subset, PCA reveals three invalid features (i.e., wrong_fragment, num_outbound_cmds, and is_host_login), because all the values of these three features are zero. Therefore, we remove the three features in our experiments.

Last but not least, we extract the principal components according to the accumulative contribution rate based on the PCA algorithm. As a general rule, the accumulative contribution rate is set to be equal to or greater than 50% to extract important features from the chosen data set [6]. In order to obtain the more important principal components, the accumulative contribution rate is set to 70% in our experiments. The principal components extracted from the 41 features are listed in Table 5.
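The selection by accumulative contribution rate can be illustrated with the sketch below. It assumes that the eigenvalues of the covariance matrix (the variances explained by each principal component) are already available, and it returns the smallest number of leading components whose cumulative share reaches the chosen rate. The function name `components_for_rate` and the eigenvalues in the usage lines are hypothetical.

```python
from typing import List


def components_for_rate(eigenvalues: List[float], rate: float = 0.70) -> int:
    """Smallest number of leading components whose cumulative share >= rate."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= rate:
            return count
    return len(ordered)


if __name__ == "__main__":
    spectrum = [5.0, 3.0, 1.5, 0.5]  # made-up eigenvalues, not from the paper
    print(components_for_rate(spectrum, 0.70))
    print(components_for_rate(spectrum, 0.90))
```

Raising the rate from 50% to 70% can only keep the same number of components or more, which is consistent with the stated aim of retaining the more important principal components.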


Table 3: Type conversion for numbers 2, 3, and 4 of 41 features.

Number   Feature name    Type setting 1   Type setting 2   Type setting 3   Type setting 4
2        protocol_type   TCP = 1          UDP = 2          ICMP = 3
3        service         ecr_i = 1        private = 2      http = 3         others = 0
4        flag            SF = 1           others = 0
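The conversion in Table 3 amounts to a lookup with a default of 0 for "others". A minimal sketch follows; the dictionary and function names are hypothetical, and the lowercase protocol spellings are an assumption about how the raw records are written.

```python
from typing import Tuple

# Codes copied from Table 3; any service or flag not listed maps to 0 ("others").
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}
SERVICE_CODES = {"ecr_i": 1, "private": 2, "http": 3}
FLAG_CODES = {"SF": 1}


def encode_symbolic(protocol_type: str, service: str, flag: str) -> Tuple[int, int, int]:
    """Map the three nonnumeric features of one record to their numeric codes."""
    protocol_code = PROTOCOL_CODES[protocol_type.lower()]  # Table 3 has no "others" here
    service_code = SERVICE_CODES.get(service, 0)
    flag_code = FLAG_CODES.get(flag, 0)
    return protocol_code, service_code, flag_code


if __name__ == "__main__":
    print(encode_symbolic("icmp", "ecr_i", "SF"))
    print(encode_symbolic("tcp", "smtp", "REJ"))
```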

Table 4: Top four types in "service" feature.

Type name   Percentage
ecr_i       56.96%
private     22.45%
http        13.01%
others      7.58%

Table 5: The principal components extracted in 41 features.

Number   Feature name of principal component
1        duration
2        protocol_type
3        service
4        flag
5        src_bytes
6        dst_bytes
7        land
8        urgent
9        hot
10       num_failed_logins
11       logged_in
12       num_compromised

4.2. Experimental Results. Our experiments aim to provide exhaustive and comparable results for the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former.

In order to assess the advantage of our method, it is necessary to establish some evaluation indicators. Here, we present four formulae to evaluate our algorithm: Precision, TNR, FPR, and DR [11]. They are defined as follows:

$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \tag{18}$$

$$\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{FP} + \mathrm{TN}}, \tag{19}$$

$$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \tag{20}$$

$$\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{21}$$

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;
(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;
(iii) TN (True Negative) is the number of normal records correctly classified as normal records;
(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA (x-axis: the value of $\alpha$, from 1 to 3; y-axis: Precision (%), from 99 to 100).

Figure 5: TNR for comparing detection methods based on MDRA and MCA (x-axis: the value of $\alpha$, from 1 to 3; y-axis: TNR (%), from 96 to 100).
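The four indicators in (18)-(21) can be checked directly against a confusion-matrix row. The sketch below uses the MDRA counts reported in Table 6 for $\alpha = 1$ (TP = 166299, FP = 278, TN = 60315, FN = 63554); the function names are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """Share of records flagged as attacks that really are attacks, per (18)."""
    return tp / (tp + fp)


def true_negative_rate(tn: int, fp: int) -> float:
    """Share of normal records correctly classified as normal, per (19)."""
    return tn / (fp + tn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of normal records incorrectly flagged as attacks, per (20)."""
    return fp / (fp + tn)


def detection_rate(tp: int, fn: int) -> float:
    """Share of attacks that are detected, per (21)."""
    return tp / (tp + fn)


if __name__ == "__main__":
    tp, fp, tn, fn = 166299, 278, 60315, 63554  # MDRA, alpha = 1 (Table 6)
    print(f"Precision = {100 * precision(tp, fp):.2f}%")
    print(f"TNR       = {100 * true_negative_rate(tn, fp):.2f}%")
    print(f"FPR       = {100 * false_positive_rate(fp, tn):.2f}%")
    print(f"DR        = {100 * detection_rate(tp, fn):.2f}%")
```

TNR and FPR share the same denominator, so they always sum to 1.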

Table 6 shows all results of TP, FP, TN, and FN for $\alpha$ set from 1 to 3 in increments of 0.2 when we use the DDoS attack detection methods based on MDRA and MCA.

The detection results of precision and TNR with the different $\alpha$ values are shown in Figures 4 and 5.


Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

            Indicators based on MDRA            Indicators based on MCA
α           TP       FP     TN      FN          TP       FP     TN      FN
α = 1       166299   278    60315   63554       223587   1743   58850   6266
α = 1.2     166299   249    60344   63554       221873   1469   59124   7980
α = 1.4     166292   227    60366   63561       206504   1313   59280   23349
α = 1.6     166289   217    60376   63564       191190   1214   59379   38663
α = 1.8     166289   204    60389   63564       190394   1159   59434   39459
α = 2       166289   194    60399   63564       190342   1115   59478   39511
α = 2.2     166289   191    60402   63564       190311   1065   59528   39542
α = 2.4     166289   188    60405   63564       190277   1027   59566   39576
α = 2.6     166282   180    60413   63571       190254   988    59605   39599
α = 2.8     166282   176    60417   63571       190230   953    59640   39623
α = 3       166282   172    60421   63571       190199   927    59666   39654

Figure 6: (a) ROC for analyzing detection method based on MDRA (DR (%) versus FPR (%)). (b) ROC for analyzing detection method based on MCA (DR (%) versus FPR (%)).

In Figure 4, as the value of $\alpha$ increases from 1 to 3 in increments of 0.2, the precision of the attack detection method based on MDRA is superior to that of its counterpart based on MCA, the former being about 0.4 to 0.6 percentage points higher than the latter.

Similarly, Figure 5 shows that the TNR of our detection method is consistently superior to that of the other method as $\alpha$ increases, the former being about 1.2 to 2.4 percentage points higher than the latter.

In addition, the relationship between DR and FPR, depicted by the Receiver Operating Characteristic (ROC) curve, is frequently used to evaluate detection performance. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. In Figures 6(a) and 6(b), both ROC curves, which analyze the attack detection performance of our method and of the other one, show a growing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% for DR, which reflects that the change of DR with different $\alpha$ values is fairly small. In Figure 6(b), by contrast, this change is relatively large, and the ROC curve jumps dramatically from 83.18% to 89.84%. However, in Big Data, we pay more attention to the instantaneity, time cost, resource consumption, and computational complexity of attack detection. Therefore, a slight discrepancy in DR can be ignored. In these respects, our method has advantages over other methods in most aspects. This topic is discussed further in the next section.
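As a rough illustration of how ROC points such as those in Figure 6(a) follow from the confusion counts, the sketch below turns three MDRA rows of Table 6 ($\alpha$ = 1, 2, and 3) into (FPR, DR) pairs in percent. The names `MDRA_ROWS` and `roc_points` are hypothetical, and only a subset of the table is used.

```python
from typing import List, Tuple

# (alpha, TP, FP, TN, FN) for the MDRA method, copied from Table 6.
MDRA_ROWS = [
    (1.0, 166299, 278, 60315, 63554),
    (2.0, 166289, 194, 60399, 63564),
    (3.0, 166282, 172, 60421, 63571),
]


def roc_points(rows: List[Tuple[float, int, int, int, int]]) -> List[Tuple[float, float, float]]:
    """Return (alpha, FPR %, DR %) triples sorted by increasing FPR."""
    points = []
    for alpha, tp, fp, tn, fn in rows:
        fpr_pct = 100.0 * fp / (fp + tn)
        dr_pct = 100.0 * tp / (tp + fn)
        points.append((alpha, fpr_pct, dr_pct))
    return sorted(points, key=lambda point: point[1])


if __name__ == "__main__":
    for alpha, fpr_pct, dr_pct in roc_points(MDRA_ROWS):
        print(f"alpha={alpha:.1f}  FPR={fpr_pct:.3f}%  DR={dr_pct:.3f}%")
```

Sorted by FPR, the DR values rise only in the second decimal place, which agrees with the narrow 72.34% to 72.35% range described above.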

4.3. Results Comparisons in terms of Time Cost and Resource Consumption. Here, we analyze in detail the time cost and memory resource consumption of the methods based on MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the one based on triangle area and MCA proposed by Tan et al. In our experimental environment, we employ a server that has two CPUs, each with 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was active, and its 16 cores would gradually load to full capacity. The CPU times of running the experimental data with our detection method and with the other one are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. In the same experimental environment, the memory space occupied by the detection method proposed by Tan et al. is more than 3 times as large as ours. The memory consumption results of running the experimental data are compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in an environment with vast amounts of network traffic in Big Data.

Figure 7: Comparing results in CPU time based on MDRA and MCA (x-axis: the value of $\alpha$, from 1 to 3; y-axis: CPU time (sec), logarithmic scale from $10^{-1}$ to $10^{2}$).

Figure 8: Comparing results in memory consumption based on MDRA and MCA (x-axis: the value of $\alpha$, from 1 to 3; y-axis: memory consumption (GB), from 0 to 3.5).

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. Because the previous method based on MCA has a computing complexity of $O(m^2)$ and $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. Moreover, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
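Although both methods are $O(1)$ for fixed $m$, the constant differs noticeably. The sketch below counts the elements of the lower-triangle TAM vector, $m(m-1)/2$, and of its covariance matrix for $m = 12$ (the number of components in Table 5) and, purely as an illustrative assumption, $m = 36$ (three times as many, in line with the one-third statement above; the source does not give the exact feature count of the MCA baseline). The function names are hypothetical.

```python
def tam_lower_length(m: int) -> int:
    """Number of elements in the lower-triangle TAM vector for m features."""
    return m * (m - 1) // 2


def covariance_entries(m: int) -> int:
    """Number of entries in the covariance matrix of the TAM vectors."""
    length = tam_lower_length(m)
    return length * length


if __name__ == "__main__":
    for m in (12, 36):  # 36 is an illustrative assumption, not a reported value
        print(m, tam_lower_length(m), covariance_entries(m))
```

Under this assumption the TAM vector shrinks from 630 to 66 elements. This is offered only as intuition for the reported savings, not as an explanation confirmed by the source.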

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution is more effective and efficient at distinguishing attack traffic from vast amounts of normal network traffic in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, "A combined data mining approach for DDoS attack detection," in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, "Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies," IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, "DDoS attack detection method using cluster analysis," Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, "Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework," World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, "A triangle area based nearest neighbors approach to intrusion detection," Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, "DDoS attacks impact on network traffic and its detection approach," International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.

[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, "A system for denial-of-service attack detection based on multivariate correlation analysis," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, "On a mathematical model for low-rate shrew DDoS," IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, "Analysis of a complex of statistical variables into principal components," Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, "An overview of anomaly detection techniques: existing solutions and latest technological trends," Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, "A hierarchical intrusion detection model based on the PCA neural networks," Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, "Evaluation of anomaly detection based on sketch and PCA," in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM '10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, "An MR brain images classifier via principal component analysis and kernel support vector machine," Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, "Denial-of-service attack detection based on multivariate correlation analysis," in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, "Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection," in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom '12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, "Cost-based modeling for fraud and intrusion detection: results from the JAM project," in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX '00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, "The 1999 DARPA off-line intrusion detection evaluation," Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, "Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory," ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, "Intrusion detection using an ensemble of intelligent paradigms," Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, "A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection," Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, "Network programming and mining classifier for intrusion detection using probability classification," in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME '12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, "A novel anomaly-network intrusion detection system using ABC algorithms," International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, "Attribute normalization in network intrusion detection," in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN '09), pp. 448–453, December 2009.


Page 4: Research Article A Novel Real-Time DDoS Attack Detection ...downloads.hindawi.com/journals/mpe/2016/1467051.pdf · e Denial of Service (DoS) attack is one of the most popular attacks

4 Mathematical Problems in Engineering

Equation (4) is put into (2) and we get

119878119879

11198781

119899 minus 1= Λ1= 119875119879

11198621198751= 119875119879

1(

119878119879119878

119899 minus 1)1198751

=(1198781198751)119879

(1198781198751)

119899 minus 1997904rArr

1198781= 1198781198751

1198781isin 119877119899times119901

(5)

Because the covariance matrix of 1198781is a diagonal matrix

it means that the components are basically independentbetween every two different dimensions The process of PCAhas been done

32 Traffic Feature Correlation Analysis From the view ofthe correlation based on statistical theory DDoS attacktraffic features reflect different statistical properties versuslegitimate network traffic features Here we apply MCA [815 16] This approach is based on a triangle area techniqueandMahalanobis distance (MD)The triangle area techniqueis able to extract geometrical correlative information betweenevery two features in an acquired network traffic record AndMD is capable of similarity measurement between every twotraffic records The analysis is presented as follows

Assume that there is a captured network traffic recorddata set119883 = 119909

1 1199092 119909

119899 Here 119909119879

119894= [119891119894

1 119891119894

2 119891

119894

119898] 1 le

119894 le 119899 where 119909119894represents the 119894th traffic record and 119891

119894

119895

indicates the 119895th feature in the 119894th record For example119891119894119895and

119891119894

119896are a couple of features in 119909

119894 The area of a triangle 119879

119894

119895119896is

shown as

119879119894

119895119896=

(10038161003816100381610038161003816119891119894

119895

10038161003816100381610038161003816times10038161003816100381610038161003816119891119894

119896

10038161003816100381610038161003816)

2 (6)

where 1 le 119894 le 119899 1 le 119895 119896 le 119898 and 119895 = 119896 Figure 2 shows thearea of a triangle

On the basis of (6) we get the area of the triangle for everytwo distinct features in 119909

119894 By that analogy the areas of these

corresponding triangles between every two distinct featuresfor each and every network traffic record of all are acquiredAnd a Triangle Area Matrix (TAM) has been set up When119895 is equal to 119896 the value of 119879119894

119895119896is zero So the values of these

elements on themain diagonal of thematrix are zero Because119879119894

119895119896and119879

119894

119896119895represent the same triangle area the values of the

two are equalAs a consequence we draw the following conclusion

TAM119894 is a symmetric matrix and the elements of its maindiagonal are zero Here the low triangle of TAM is chosento convert into another vector TAM119894low and it is shown asfollows

TAM119894low

= [119879119894

21119879119894

31sdot sdot sdot 119879

119894

1198981119879119894

32119879119894

42sdot sdot sdot 119879

119894

1198982sdot sdot sdot 119879

119894

119898119898minus1]119879

(7)

x

y

Tijk

(fij f

ik)

(fij 0)

(0 fik)

0(0 0)

Figure 2 Sketch for the area of a triangle

DDoS attack is detected by the application to inherentMCA of traffic features in the network environment of BigData The geometrical correlation between every two pairsof traffic features has changed when anomaly behaviors ofDDoS attack appear on the Internet This approach providesan important warning signal

33 Attack Detection Framework In this section we firstestablish benchmark data by covariance matrix and MDSecondly the attack traffic detection based on MD and theselected threshold is implemented Last but not least wepresent the MDRA DDoS attack detection algorithm

331 Benchmark Data Formation by Covariance Matrix andMD The benchmark data is established based on normalnetwork traffic records It is used to compare with the freshincoming traffic records The inferior benchmark data canlead to the erroneous estimate that an incoming traffic recordis regarded as a legitimate record

Assume that there are 119905 normal training traffic featurerecords 119883nor

= 119909nor1

119909nor2

119909nor119905

We need to do twothings

(i) Computing the Covariance Matrices between the Areasof Every Two Triangles The MCA method is applied tobenchmark data formation The acquired lower triangles aredenoted as follows 119883

norTAMlower

= TAMnor1lowerTAM

nor2lower

TAMnor119905lower Then we compute the covariance matrices

between the areas of every two triangles that is

119862119879=

[[[[[[[

[

120590119879nor21119879

nor21

120590119879nor21119879

nor31

sdot sdot sdot 120590119879nor21119879

nor119898119898minus1

120590119879nor31119879

nor21

120590119879nor31119879

nor31

sdot sdot sdot 120590119879nor31119879

nor119898119898minus1

d

120590119879nor119898119898minus1119879

nor21

120590119879nor119898119898minus1119879

nor31

sdot sdot sdot 120590119879nor119898119898minus1119879

nor119898119898minus1

]]]]]]]

]

(8)

In this formula the covariance between every two arbitraryelements in TAMlower is defined as follows

120590(119879

nor119895119896119879

nor119906V )

=1

119905 minus 1

119905

sum

119894=1

(119879nor119894119895119896

minus 120583119879nor119895119896

) (119879nor119894119906V minus 120583

119879nor119906V

) (9)

Mathematical Problems in Engineering 5

where the mean of the $(j,k)$th elements and the mean of the $(u,v)$th elements of the TAMs for the $t$ normal training traffic records are, respectively, defined as

$$
\mu_{T^{\mathrm{nor}}_{j,k}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{j,k}
\tag{10}
$$

$$
\mu_{T^{\mathrm{nor}}_{u,v}} = \frac{1}{t} \sum_{i=1}^{t} T^{\mathrm{nor},i}_{u,v}
\tag{11}
$$
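As a rough illustration of (8)–(11), the following Python sketch computes the sample covariance between TAM elements across $t$ records. The function names and the toy triangle areas are assumptions introduced here for illustration only; they are not taken from the paper's experiments.

```python
def mean(values):
    """Mean of one TAM element over the t training records, as in (10) and (11)."""
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two TAM elements over t records, as in (9)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 records")
    mu_a, mu_b = mean(a), mean(b)
    return sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (len(a) - 1)


def covariance_matrix(columns):
    """Matrix C_T of (8); columns[p] holds one TAM element for all t records."""
    return [[covariance(p, q) for q in columns] for p in columns]


# Hypothetical triangle areas T_{2,1} and T_{3,1} for t = 4 normal records.
t21 = [1.0, 2.0, 3.0, 4.0]
t31 = [2.0, 4.0, 6.0, 8.0]
c_t = covariance_matrix([t21, t31])
print(c_t)
```

The divisor is $t - 1$, matching (9), and the resulting matrix is symmetric because the covariance does not depend on the order of its two arguments.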

(ii) Computing the MD between Every Two TAMs of Traffic Records. MD expresses the covariance distance of the data and is an effective way to compute the similarity of two unknown sample sets. Unlike the Euclidean Distance (ED), MD takes the correlations between the different features into account and is independent of the measurement scale.

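The MD between the normal training records and their expectation and the MD between a freshly captured traffic record and the expectation of the normal training records are given by the following formulas:

$$
\mathrm{MD}^{\mathrm{nor},i} = \sqrt{\left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)^{T} \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)}
\tag{12}
$$

$$
\mathrm{MD}^{\mathrm{fresh}} = \sqrt{\left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)^{T} \mathrm{cov}^{-1} \left(\mathrm{TAM}^{\mathrm{fresh}}_{\mathrm{lower}} - \overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}}\right)}
\tag{13}
$$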

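Moreover, the expectation of $\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}$ for the $t$ normal training records is given as follows:

$$
\overline{\mathrm{TAM}^{\mathrm{nor}}_{\mathrm{lower}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{TAM}^{\mathrm{nor},i}_{\mathrm{lower}}
\tag{14}
$$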
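The distance in (12) and (13) can be sketched in plain Python as below. This is a minimal illustration, assuming that cov is a nonsingular covariance matrix such as $C_T$ in (8); the helper names, the two-element profile, and all numeric values are hypothetical. The second half rescales one element by a factor of 10 to illustrate the scale independence of MD compared with ED.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), the form used in (12) and (13)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Solving cov @ y = diff avoids forming the inverse explicitly.
    y = solve(cov, diff)
    return math.sqrt(sum(d * v for d, v in zip(diff, y)))


def euclidean(x, mean):
    """Plain Euclidean distance, for comparison only."""
    return math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, mean)))


# Hypothetical 2-element profile: mean vector and covariance matrix.
mean_vec = [2.5, 5.0]
cov = [[2.0, 0.5], [0.5, 1.0]]
record = [4.5, 6.0]
md = mahalanobis(record, mean_vec, cov)
ed = euclidean(record, mean_vec)

# Same data with the first element expressed in a unit 10 times smaller:
# its variance scales by 100 and its covariance with the other element by 10.
scaled_mean = [25.0, 5.0]
scaled_cov = [[200.0, 5.0], [5.0, 1.0]]
scaled_record = [45.0, 6.0]
md_scaled = mahalanobis(scaled_record, scaled_mean, scaled_cov)
ed_scaled = euclidean(scaled_record, scaled_mean)

print(md, md_scaled)  # equal up to rounding
print(ed, ed_scaled)  # clearly different
```

In this toy case the MD is unchanged by the rescaling, while the ED grows with the larger unit, which is the property the text attributes to MD.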

3.3.2. Attack Detection Standard Based on MD and Threshold. For DDoS attack detection, we set a threshold value to distinguish DDoS anomaly traffic from normal traffic. The threshold is given by the following formula [8]:

$$
\mathrm{Threshold} = \mu + \sigma \cdot \alpha
\tag{15}
$$

where $\mu = \overline{\mathrm{MD}^{\mathrm{nor}}}$ is the mean of the MDs of the normal training records, given by (17), and $\sigma$ is their standard deviation:

$$
\sigma = \sqrt{\frac{1}{t-1} \sum_{i=1}^{t} \left(\mathrm{MD}^{\mathrm{nor},i} - \overline{\mathrm{MD}^{\mathrm{nor}}}\right)^{2}}
\tag{16}
$$

$$
\overline{\mathrm{MD}^{\mathrm{nor}}} = \frac{1}{t} \sum_{i=1}^{t} \mathrm{MD}^{\mathrm{nor},i}
\tag{17}
$$

In order to conform to the normal distribution [8], the value of $\alpha$ ranges from 1 to 3 in increments of 0.2 in this paper. This yields the criterion for DDoS attack detection: a freshly acquired traffic record is considered an attack when the MD between it and the expectation of the normal training records is greater than the threshold.
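A minimal sketch of (15)–(17) and of the $\alpha$ sweep follows. The list of MDs is made up for illustration, and the names `md_normal`, `thresholds`, and `is_attack` are assumptions of this sketch rather than identifiers from the paper.

```python
import statistics

# Hypothetical MDs of t = 6 normal training records (illustrative values only).
md_normal = [1.8, 2.1, 2.4, 1.9, 2.2, 2.0]

mu = statistics.mean(md_normal)      # (17)
sigma = statistics.stdev(md_normal)  # (16): sample standard deviation, divisor t - 1

# alpha = 1.0, 1.2, ..., 3.0; building the grid from integers avoids float drift.
alphas = [round(1 + 0.2 * k, 1) for k in range(11)]
thresholds = {alpha: mu + sigma * alpha for alpha in alphas}  # (15)


def is_attack(md_fresh, alpha):
    """One-sided criterion from the text: attack if the MD exceeds the threshold."""
    return md_fresh > thresholds[alpha]


for alpha in alphas:
    print(f"alpha={alpha:.1f}  threshold={thresholds[alpha]:.4f}")
```

A larger $\alpha$ raises the threshold, so fewer records are flagged; this is the knob that is varied in the experiments.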

Figure 3: Flowchart of attack detection (inputs: training data and testing data; stages: benchmark data generation by triangle area map and Mahalanobis distance, then attack detection based on the given threshold, which together form the MDRA attack detection algorithm and the attack detection model; output: detection result).

3.3.3. An MDRA DDoS Attack Detection Algorithm. Tan et al. [8] proposed an algorithm that generates the normal profile based on triangle area and MCA and detects attacks based on MD. Evaluation and comparison with several state-of-the-art approaches show that this earlier detection method and its system have advantages in detection performance, namely, Detection Rate (DR) and accuracy. However, for Big Data in cyberspace security, and especially as high-volume network attacks keep growing, detection efficiency, resource consumption, and computing complexity need to be taken adequately into account. For these reasons, we propose the MDRA algorithm to detect anomalous network traffic efficiently. Algorithm 1 describes the procedure for DDoS attack detection based on MDRA in detail.

4. Experiments

In this section, we discuss how to apply our algorithm to detect DDoS attack traffic efficiently. The flowchart of attack detection is shown in Figure 3.

First, we present the data set used in our experiments and the data pretreatment approach. Then, we report the experimental results used to evaluate the performance of the algorithm. Finally, we compare our approach with the previous unoptimized approach in terms of time cost, resource consumption, and computing complexity.

The computer environment used to run our experiments is shown in Table 1.

Next, we describe our experiments in detail.

4.1. Data Set and Pretreatment. In this paper, we use the well-known Knowledge Discovery and Data Mining (KDD) Cup 1999 data set [17–21] to verify our algorithm. Admittedly, this data set has some shortcomings, but it remains the only publicly available and relatively credible labeled benchmark data set so far. It has been widely applied to researching and evaluating network intrusion detection methods [22, 23].


(1) Input a set of training data of normal network traffic records $X^{\mathrm{nor}} = \{x_1^{\mathrm{nor}}, x_2^{\mathrm{nor}}, \ldots, x_t^{\mathrm{nor}}\}$, where $x_i^{\mathrm{nor}} = [f_1^i, f_2^i, \ldots, f_m^i]$, $1 \le i \le t$
(2) Extract the principal components of $X^{\mathrm{nor}}$ based on PCA so that the accumulative contribution rate reaches 70%, and obtain the principal component data set $X^{P,\mathrm{nor}}$
(3) Calculate $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$ of $X^{P,\mathrm{nor}}$
(4) Calculate the covariance matrices between the areas of every two triangles $T^{P,\mathrm{nor}}$ in $X^{P,\mathrm{nor}}$
(5) for $i = 1$ to $t$ do
(6)     Input $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(7)     Calculate $\mathrm{MD}^{P,\mathrm{nor},i}$ between $\mathrm{TAM}^{P,\mathrm{nor},i}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(8)     Output $\mathrm{MD}^{P,\mathrm{nor},i}$
(9) end for
(10) Calculate $\mu$ by $\mathrm{MD}^{P,\mathrm{nor},i}$
(11) Calculate $\sigma$ by $\mathrm{MD}^{P,\mathrm{nor},i}$ and $\mu$
(12) Input a fresh incoming traffic record $x^{\mathrm{fresh}}$
(13) Reduce the dimensions of the features of $x^{\mathrm{fresh}}$ based on PCA to obtain the record of principal components $x^{P,\mathrm{fresh}}$
(14) Calculate $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ of $x^{P,\mathrm{fresh}}$
(15) Calculate $\mathrm{MD}^{P,\mathrm{fresh}}$ between $\mathrm{TAM}^{P,\mathrm{fresh}}_{\mathrm{lower}}$ and $\overline{\mathrm{TAM}^{P,\mathrm{nor}}_{\mathrm{lower}}}$
(16) Input the threshold value $\alpha$
(17) if $(\mu - \sigma \cdot \alpha) \le \mathrm{MD}^{P,\mathrm{fresh}} \le (\mu + \sigma \cdot \alpha)$ then
(18)     return Normal
(19) else
(20)     return Attack
(21) end if

Algorithm 1: Algorithm for DDoS attack detection based on MDRA.
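Step (17) of Algorithm 1 accepts a record as normal only when its MD lies within a two-sided interval around $\mu$, so a record whose MD is unusually small is also reported as an attack. The decision in steps (16)–(21) can be sketched as follows; the function name `classify` and the numeric values are hypothetical and serve only to illustrate the rule.

```python
def classify(md_fresh, mu, sigma, alpha):
    """Steps (17)-(21) of Algorithm 1: Normal inside [mu - sigma*alpha, mu + sigma*alpha]."""
    lower = mu - sigma * alpha
    upper = mu + sigma * alpha
    return "Normal" if lower <= md_fresh <= upper else "Attack"


# Hypothetical profile statistics; the accepted interval is [1.5, 2.5].
mu, sigma, alpha = 2.0, 0.25, 2.0
results = {md: classify(md, mu, sigma, alpha) for md in (0.9, 1.5, 2.1, 2.6)}
for md_fresh, label in results.items():
    print(md_fresh, label)
```

With these values, 0.9 and 2.6 fall outside the interval and are labeled Attack, while 1.5 (on the boundary) and 2.1 are labeled Normal.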

Table 1: Computer environment to run our experiments.

| CPU | Memory | Hard disk | OS | MATLAB |
|---|---|---|---|---|
| Intel Xeon CPU E5-2640 v2 2.00 GHz 2.00 GHz (2 processors) | 32 GB | 2 TB | Windows Server 2008 R2 Enterprise | R2013a (8.1.0.604) 64-bit (win64) |

Table 2: Data sets used in our experiments.

| Category | Training data set (10%) | Testing data set (corrected) |
|---|---|---|
| Normal | 97278 | 60593 |
| DoS | 391458 | 229853 |

The KDD Cup 1999 data set comprises about five million network records and provides a training subset containing 10 percent of the network records as well as a testing subset. It covers four main categories of attack, that is, DoS, R2L, U2R, and Probing. Here we use the records labeled “normal” in the 10 percent training subset to construct our benchmark data and employ the “corrected” testing subset to verify the validity and efficiency of our algorithm. In this paper, we choose DoS attacks for evaluating our algorithm and comparing it with the previous approaches. The data sets used in our experiments are shown in Table 2. The data pretreatment procedure is as follows.

First, each network traffic record in this data set consists of 41 features plus 1 class label [24]. Our experiments require numeric values for all 41 features of every record. However, 3 of the features are nonnumeric, namely, protocol_type, service, and flag, and they must be transformed into numeric type. The type conversion is performed according to Table 3. We analyze the pretreatment of the “service” feature in particular, as follows.

There are 70 kinds of network service types in the “service” feature; however, some of them rarely or never appear, and these types can be ignored completely. By counting and sorting the 494021 records in the 10 percent training subset, we find that the top three network service types are ecr_i, private, and http, with ratios of 56.96%, 22.45%, and 13.01%, respectively. All the other types together account for merely 7.58%. The ratios of these four categories of the “service” feature are shown in Table 4.

Second, among the 41 features of the records labeled “normal” in the 10 percent training subset, PCA reveals three invalid features (i.e., wrong_fragment, num_outbound_cmds, and is_host_login), because all the values of these three features are zero. Therefore, we discard these three features in our experiments.
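Detecting and discarding features whose values are all zero can be sketched as below. The helper names and the three toy records are assumptions for illustration; the feature names follow the usual KDD Cup 1999 spelling, and only a handful of columns are shown.

```python
def zero_valued_columns(records, names):
    """Return the names of features whose value is zero in every record."""
    return [
        name
        for col, name in enumerate(names)
        if all(row[col] == 0 for row in records)
    ]


def drop_columns(records, names, to_drop):
    """Remove the given features from the header and from every record."""
    keep = [i for i, name in enumerate(names) if name not in to_drop]
    return [[row[i] for i in keep] for row in records], [names[i] for i in keep]


# Toy stand-in for the "normal" records; the values are made up.
names = ["duration", "src_bytes", "num_outbound_cmds", "is_host_login"]
records = [
    [0, 181, 0, 0],
    [2, 239, 0, 0],
    [0, 235, 0, 0],
]

invalid = zero_valued_columns(records, names)
cleaned, kept = drop_columns(records, names, set(invalid))
print(invalid)  # ['num_outbound_cmds', 'is_host_login']
print(kept)     # ['duration', 'src_bytes']
```

A feature that is constant across all training records has zero variance and therefore carries no information for PCA or for the covariance matrix, which is why it can be dropped safely.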

Last but not least, we extract the principal components according to the accumulative contribution rate of the PCA algorithm. As a general rule, the accumulative contribution rate is set to at least 50% to extract the important features from the chosen data set [6]. In order to obtain the more important principal components, the accumulative contribution rate is set to 70% in our experiments. The principal components extracted from the 41 features are listed in Table 5.
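The selection by accumulative contribution rate can be sketched as follows, assuming the usual PCA definition in which the rate of the first $k$ components is the sum of their eigenvalues divided by the sum of all eigenvalues. The eigenvalues below are hypothetical and are not those of the KDD Cup 1999 data; `components_for_rate` is a name introduced for this sketch.

```python
def components_for_rate(eigenvalues, target_rate):
    """Smallest k whose top-k eigenvalues reach target_rate of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target_rate:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of a feature covariance (or correlation) matrix.
eigenvalues = [4.0, 2.0, 1.5, 1.0, 0.8, 0.4, 0.2, 0.1]

k50, rate50 = components_for_rate(eigenvalues, 0.50)
k70, rate70 = components_for_rate(eigenvalues, 0.70)
print(k50, round(rate50, 3))  # 2 0.6
print(k70, round(rate70, 3))  # 3 0.75
```

Raising the target from 50% to 70% keeps more components, which trades a slightly larger dimensionality for a profile that retains more of the variance.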


Table 3: Type conversion for numbers 2, 3, and 4 of 41 features.

| Number | Feature name | Type setting 1 | Type setting 2 | Type setting 3 | Type setting 4 |
|---|---|---|---|---|---|
| 2 | protocol_type | TCP = 1 | UDP = 2 | ICMP = 3 | |
| 3 | service | ecr_i = 1 | private = 2 | http = 3 | others = 0 |
| 4 | flag | SF = 1 | others = 0 | | |
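The conversion in Table 3 amounts to three lookup tables with a default of 0 for the “others” category. The sketch below is one possible encoding; the dictionary and function names are hypothetical, and the lowercase spelling of the protocol values in the raw records is an assumption handled by normalizing the case.

```python
# Numeric codes taken from Table 3; unlisted service and flag values map to 0.
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}
SERVICE_CODES = {"ecr_i": 1, "private": 2, "http": 3}
FLAG_CODES = {"SF": 1}


def encode_record(protocol_type, service, flag):
    """Convert the three nonnumeric KDD features to numbers following Table 3."""
    protocol = PROTOCOL_CODES[protocol_type.lower()]  # KeyError on an unknown protocol
    return (
        protocol,
        SERVICE_CODES.get(service, 0),  # "others = 0"
        FLAG_CODES.get(flag, 0),        # "others = 0"
    )


print(encode_record("icmp", "ecr_i", "SF"))  # (3, 1, 1)
print(encode_record("tcp", "smtp", "REJ"))   # (1, 0, 0)
```

Table 3 lists no “others” code for protocol_type, so the sketch raises an error for an unknown protocol instead of silently assigning a value.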

Table 4: Top four types in “service” feature.

| Type name | The percentage |
|---|---|
| ecr_i | 56.96% |
| private | 22.45% |
| http | 13.01% |
| others | 7.58% |

Table 5: The principal components extracted in 41 features.

| Number | Feature name of principal component |
|---|---|
| 1 | duration |
| 2 | protocol_type |
| 3 | service |
| 4 | flag |
| 5 | src_bytes |
| 6 | dst_bytes |
| 7 | land |
| 8 | urgent |
| 9 | hot |
| 10 | num_failed_logins |
| 11 | logged_in |
| 12 | num_compromised |

4.2. Experimental Results. Our experiments aim to provide exhaustive and comparable results for the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former.

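To assess the advantage of our method, evaluation indicators are needed. Here we use four indicators to evaluate our algorithm, namely, Precision, TNR, FPR, and DR [11]. They are defined as follows:

$$
\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}
\tag{18}
$$

$$
\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{FP} + \mathrm{TN}}
\tag{19}
$$

$$
\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}
\tag{20}
$$

$$
\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}
\tag{21}
$$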

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;
(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;
(iii) TN (True Negative) is the number of normal records correctly classified as normal records;
(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA (vertical axis: Precision (%); horizontal axis: the value of $\alpha$).

Figure 5: TNR for comparing detection methods based on MDRA and MCA (vertical axis: TNR (%); horizontal axis: the value of $\alpha$).

Table 6 shows all results of TP, FP, TN, and FN for $\alpha$ set from 1 to 3 in increments of 0.2 when we use the DDoS attack detection methods based on MDRA and MCA.

The precision and TNR obtained with the different $\alpha$ values are shown in Figures 4 and 5.


Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

| $\alpha$ | MDRA TP | MDRA FP | MDRA TN | MDRA FN | MCA TP | MCA FP | MCA TN | MCA FN |
|---|---|---|---|---|---|---|---|---|
| 1 | 166299 | 278 | 60315 | 63554 | 223587 | 1743 | 58850 | 6266 |
| 1.2 | 166299 | 249 | 60344 | 63554 | 221873 | 1469 | 59124 | 7980 |
| 1.4 | 166292 | 227 | 60366 | 63561 | 206504 | 1313 | 59280 | 23349 |
| 1.6 | 166289 | 217 | 60376 | 63564 | 191190 | 1214 | 59379 | 38663 |
| 1.8 | 166289 | 204 | 60389 | 63564 | 190394 | 1159 | 59434 | 39459 |
| 2 | 166289 | 194 | 60399 | 63564 | 190342 | 1115 | 59478 | 39511 |
| 2.2 | 166289 | 191 | 60402 | 63564 | 190311 | 1065 | 59528 | 39542 |
| 2.4 | 166289 | 188 | 60405 | 63564 | 190277 | 1027 | 59566 | 39576 |
| 2.6 | 166282 | 180 | 60413 | 63571 | 190254 | 988 | 59605 | 39599 |
| 2.8 | 166282 | 176 | 60417 | 63571 | 190230 | 953 | 59640 | 39623 |
| 3 | 166282 | 172 | 60421 | 63571 | 190199 | 927 | 59666 | 39654 |
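The four indicators in (18)–(21) can be recomputed directly from the counts in Table 6. The sketch below does so for the $\alpha = 1$ rows; the function name `detection_metrics` is an assumption of this sketch, and the results are expressed in percent.

```python
def detection_metrics(tp, fp, tn, fn):
    """Precision, TNR, FPR, and DR as defined in (18)-(21), in percent."""
    return {
        "precision": 100.0 * tp / (tp + fp),
        "tnr": 100.0 * tn / (fp + tn),
        "fpr": 100.0 * fp / (fp + tn),
        "dr": 100.0 * tp / (tp + fn),
    }


# Rows for alpha = 1 copied from Table 6.
mdra = detection_metrics(tp=166299, fp=278, tn=60315, fn=63554)
mca = detection_metrics(tp=223587, fp=1743, tn=58850, fn=6266)

for name, m in (("MDRA", mdra), ("MCA", mca)):
    print(name, {key: round(value, 2) for key, value in m.items()})
```

For this row, the MDRA-based method has the higher precision and TNR, while the MCA-based method has the higher DR. TNR and FPR always sum to 100% because they share the denominator FP + TN.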

Figure 6: (a) ROC for analyzing detection method based on MDRA. (b) ROC for analyzing detection method based on MCA. (Both panels plot DR (%) against FPR (%).)

In Figure 4, as the value of $\alpha$ increases from 1 to 3 in increments of 0.2, the precision of the attack detection method based on MDRA is consistently superior to that of the method based on MCA; the former is about 0.4 to 0.6 percentage points higher than the latter.

Similarly, Figure 5 shows that the TNR of our detection method is superior to that of the other method for every value of $\alpha$; the former is about 1.2 to 2.4 percentage points higher than the latter.

In addition, the relationship between DR and FPR, expressed by the Receiver Operating Characteristic (ROC) curve, is frequently used to evaluate detection performance. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. Both curves, in Figures 6(a) and 6(b), show a growing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% for DR, which reflects that DR changes very little with different $\alpha$ values. In Figure 6(b), by contrast, the change is relatively large, and the ROC curve jumps dramatically from 83.18% to 89.84%. However, in Big Data, we pay more attention to the instantaneity, time cost, resource consumption, and computational complexity of attack detection; therefore, a slight discrepancy in DR could be ignored. In these respects, our method holds most of the advantages in comparison with other methods, as discussed in the next section.

4.3. Comparison of Results in terms of Time Cost and Resource Consumption. Here we analyze in particular the time cost and memory consumption of the methods based on MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the one based on triangle area and MCA proposed by Tan et al. Our experimental environment is a server with two CPUs, each of which has 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was active, and its 16 cores were gradually loaded to full capacity. The CPU times for running the experimental data with our detection method and with the other one are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

Figure 7: Comparing results in CPU time based on MDRA and MCA (vertical axis: CPU time (sec), logarithmic scale; horizontal axis: the value of $\alpha$).

Figure 8: Comparing results in memory consumption based on MDRA and MCA (vertical axis: memory consumption (GB); horizontal axis: the value of $\alpha$).

On the other hand, in terms of memory consumption, our detection mechanism is also superior to the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. In the same experimental environment, the memory occupied by the detection method proposed by Tan et al. is thus more than 3 times as large as ours. The memory consumption for running the experimental data is compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in environments with vast amounts of network traffic in Big Data.

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. The previous method based on MCA has a computing complexity of $O(m^2)$; because $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. Moreover, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
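To make the effect of the dimensionality concrete, the following worked count assumes that the TAM lower triangle of $m$ features contains one area per feature pair, as suggested by the indices $T_{2,1}, \ldots, T_{m,m-1}$ in (8), and takes $m = 12$ from Table 5. The value $m = 36$ for the MCA-based method is only an illustrative assumption derived from the "one-third" statement, not a figure reported in the paper.

```latex
\begin{align*}
N(m)  &= \frac{m(m-1)}{2}, \\
N(12) &= 66,  \qquad N(12)^2 = 4356   \ \text{entries in } C_T, \\
N(36) &= 630, \qquad N(36)^2 = 396900 \ \text{entries in } C_T .
\end{align*}
```

Both counts are constants once $m$ is fixed, which is consistent with the $O(1)$ statement, although in this illustration the constant differs by a factor of about 91.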

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution distinguishes attack traffic from vast amounts of normal network traffic more effectively and efficiently in terms of precision rate, TNR, time cost, memory consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, “A combined data mining approach for DDoS attack detection,” in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, “Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, “DDoS attack detection method using cluster analysis,” Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, “Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework,” World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,” Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, “DDoS attacks impact on network traffic and its detection approach,” International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.


[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “A system for denial-of-service attack detection based on multivariate correlation analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, “On a mathematical model for low-rate shrew DDoS,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, “A hierarchical intrusion detection model based on the PCA neural networks,” Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, “Evaluation of anomaly detection based on sketch and PCA,” in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM ’10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, “An MR brain images classifier via principal component analysis and kernel support vector machine,” Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, “Denial-of-service attack detection based on multivariate correlation analysis,” in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX ’00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, “Intrusion detection using an ensemble of intelligent paradigms,” Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, “A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection,” Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, “Network programming and mining classifier for intrusion detection using probability classification,” in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME ’12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, “A novel anomaly-network intrusion detection system using ABC algorithms,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, “Attribute normalization in network intrusion detection,” in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN ’09), pp. 448–453, December 2009.


Page 5: Research Article A Novel Real-Time DDoS Attack Detection ...downloads.hindawi.com/journals/mpe/2016/1467051.pdf · e Denial of Service (DoS) attack is one of the most popular attacks

Mathematical Problems in Engineering 5

where the mean of the (119895 119896)th elements and the mean of the(119906 V)th elements of TAMs for 119905normal training traffic recordsare respectively defined as

120583119879nor119895119896

=1

119905

119905

sum

119894=1

119879nor119894119895119896

(10)

120583119879nor119906V

=1

119905

119905

sum

119894=1

119879nor119894119906V (11)

(ii) Computing the MD between Every Two TAMs of TrafficRecords The covariance distance of data is signified by MDMD is an effective approach to compute the similarity ofthe two unknown sample sets The difference between MDand Euclidean Distance (ED) is that the relations between allkinds of characters are considered and thatMD is not relevantto the scale of the measurement

The MD between the normal training records and theirexpectation and the MD between the fresh captured trafficrecord and the expectation of normal training records areshown by the following formulas

MDnor119894

= radic(TAMnor119894lower minus TAMnor

lower)119879

covminus1 (TAMnor119894lower minus TAMnor

lower)

(12)

MDfresh

= radic(TAMfreshlower minus TAMnor

lower)119879

covminus1 (TAMfreshlower minus TAMnor

lower)

(13)

Moreover the expectation of TAMnorlower for the 119905 normal

training records is shown as follows

TAMnorlower =

1

119905

119905

sum

119894=1

TAMnor119894lower (14)

332 Attack Detection Standard Based on MD andThresholdFor DDoS attack detection we set a threshold value todistinguish DDoS anomaly traffic from the normal trafficfeature Next we give a formula [8] about the threshold value

Threshold = 120583 + 120590 lowast 120572 (15)

where 120583 was shown by (10) or (11) and 120590 is shown as follows

120590 = radic1

119905 minus 1

119905

sum

119894=1

(MDnor119894minusMDnor

)2

(16)

MDnor=

1

119905

119905

sum

119894=1

MDnor119894 (17)

In order to conform to the normal distribution [8] therange of the 120590 value is set from 1 to 3 with the increment of02 in this paperThen the standard of DDoS attack detectionis obtained An attack behavior is considered when the MDbetween a fresh acquired traffic record and the expectation ofnormal training records is greater than the threshold

Testing data

Generate benchmarkdata by triangle area

map and Mahalanobisdistance

Training data

Attack detectionbased on the given

threshold

Input

MDRA attackdetection algorithm

Attack detection modelInput

Output

Detectionresult

Figure 3 Flowchart of attack detection

333 A MDRA DDoS Attack Detection Algorithm Tan etal [8] proposed the algorithm to use for normal profilegeneration based on triangle-area and MCA and to use forattack detection based onMD By evaluation and comparisonwith some state-of-the-art approaches it is easy to find thatthe previous attack detection method and its system havesome advantages in detection performance Detection Rate(DR) and accuracy rate However in Big Data of cyberspacesecurity especially when the network attack behaviors oflarge traffic are growing increasingly detection efficiencyresource consumption and computing complexity need betaken adequately into account for attack detection For thereasons given above we propose the MDRA algorithm todetect efficiently the network anomaly traffic Algorithm 1depicts the procedures of the algorithm for DDoS attackdetection metric based on MDRA in detail

4 Experiments

In this section we discuss how to apply our algorithm indetecting efficiently the DDoS attack traffic The flowchart ofattack detection is shown in Figure 3

Firstly we present the data set used in our experimentsand the data pretreatment approach to serve our experi-ments Then the experimental results are got to evaluate thealgorithm performance Finally we make comparisons withthe previous unoptimized approach in terms of time costresource consumption and computing complexity

The computer environment to run our experiments isshown in Table 1

Next we describe our experiments in detail

41 Data Set and Pretreatment In this paper we use thefamous Knowledge Discovery and Data Mining (KDD) Cup1999 data set [17ndash21] as our novel algorithm verificationWe have to admit that this data set has some shortagesbut it is still uniquely public and relatively credible labeledbenchmark data set so far This data set has been widelyapplied to researching and evaluating network intrusiondetection methods [22 23]

6 Mathematical Problems in Engineering

(1) Input a set of training data of normal network traffic records 119883nor= 119909

nor1

119909nor2

119909nor119905

where 119909nor119894

= [119891119894

1 119891119894

2 119891

119894

119898] 1 le 119894 le 119899

(2) Extract the principal components of119883nor to reach 70 for the accumulative contribution rate based on PCA and obtain theprincipal component data set119883119875nor

(3) Calculate TAM119875nor119894lower and TAM119875norlower of119883119875nor

(4) Calculate the covariance matrices between the areas of every two triangles 119879119875nor in119883119875nor

(5) for 119894 = 1 to 119905 do(6) Input TAM119875nor119894lower and TAM119875norlower(7) Calculate MD119875nor119894 between TAM119875nor119894lower and TAM119875norlower(8) Output MD119875nor119894(9) end for(10) Calculate 120583 by MD119875nor119894(11) Calculate 120590 by MD119875nor119894 and 120583(12) Input a fresh incoming traffic record 119909

fresh(13) Reduce the dimensions of the features for119909fresh based on PCA then get the records which include the principal components119909119875fresh(14) Calculate TAM119875freshlower of 119909119875fresh(15) Calculate MD119875fresh between TAM119875freshlower and TAM119875norlower(16) Input the threshold value 120572(17) If (120583 minus 120590 lowast 120572) le MD119875fresh le (120583 + 120590 lowast 120572) then(18) return Normal(19) else(20) return Attack(21) end if

Algorithm 1 Algorithm for DDoS attack detection based on MDRA

Table 1 Computer environment to run our experiments

CPU Memory Hard disk OS MATLABIntel Xeon CPU E5-2640 v2 200GHz200GHz (2 processors) 32GB 2TB Windows Server 2008 R2

EnterpriseR2013a (810604)64-bit (win64)

Table 2 Data sets used in our experiments

Category Training data set (10) Testing data set (corrected)Normal 97278 60593DoS 391458 229853

KDD CUP 1999 data set comprises about five millionnetwork records and provides a training subset of 10 percentof the network records and a testing subset It covers fourmain categories of attack that is DoS R2L U2R andProbingHere we use these records labeled as ldquonormalrdquo in theabovementioned training subset to construct our benchmarkdata and employ this testing subset ldquocorrectedrdquo to verifythe validity and efficiency of our algorithm In this paperwe choose DoS network attack as our algorithm evaluationand comparison with the previous approaches The data setsused in our experiments are shown in Table 2 The datapretreatment procedure is shown as follows

Firstly for each network traffic record it includes theinformation that has been separated into 41 features plus 1class label [24] in this data set In our experiments we need toget all numeric data for 41 features of every record Howeverthere are 3 nonnumeric features in all features and these areprotocol type service and flag They must be transformedinto numeric typeThe type conversion is achieved accordingto Table 3 where we emphatically analyze the pretreatment

process with reference to the feature ldquoservicerdquo The analysisprocess is as follows

There are 70 kinds of network service types in the ldquoser-vicerdquo feature however some of them rarely appear or neverappear For these features we can ignore them completelyAmong the 494021 records in the training subset of 10percent we find that the top three network service typesrespectively are ecr i private and http by counting andsorting and their ratios respectively are 5696 2245 and1301 The sum of all the other types accounts merely for758 The ratios of the top four types in ldquoservicerdquo feature areshown in Table 4

Secondly, among the 41 features of the records labeled “normal” in the 10 percent training subset, PCA reveals three invalid features (i.e., wrong_fragment, num_outbound_cmds, and is_hot_login), because all the values of these three features are zero. Therefore, we remove the three features in our experiments.

Last but not least, we extract the principal components according to the rate of accumulative contribution based on the PCA algorithm. As a general rule, the rate of accumulative contribution is set to 50% or greater to extract important features from the chosen data set [6]. In order to obtain the more important principal components, we set the rate of accumulative contribution to 70% in our experiments. The principal components extracted from the 41 features are listed in Table 5.
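The two pretreatment steps just described, discarding features that never vary and keeping only as many components as the accumulative contribution rate requires, might be sketched as follows. The function names are hypothetical, and the sketch assumes the eigenvalues of the covariance matrix have already been computed by a PCA routine; it is not the authors' MATLAB code.

```python
def drop_constant_columns(rows):
    """Remove features whose value never changes across the rows (e.g. all zeros).

    Returns the reduced rows and the indices of the columns that were kept.
    """
    keep = [c for c in range(len(rows[0])) if len({row[c] for row in rows}) > 1]
    return [[row[c] for c in keep] for row in rows], keep


def components_for_rate(eigenvalues, target_rate=0.70):
    """Smallest number of leading components whose accumulative
    contribution rate reaches target_rate."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target_rate:
            return count
    return len(ordered)
```

With eigenvalues `[5, 3, 1, 1]`, the first component contributes 50% and the first two contribute 80%, so a 70% target keeps two components.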


Table 3: Type conversion for features 2, 3, and 4 of the 41 features.

Number  Feature name   Type setting 1  Type setting 2  Type setting 3  Type setting 4
2       protocol_type  TCP = 1         UDP = 2         ICMP = 3
3       service        ecr_i = 1       private = 2     http = 3        others = 0
4       flag           SF = 1          others = 0

Table 4: Top four types in the “service” feature.

Type name  Percentage
ecr_i      56.96%
private    22.45%
http       13.01%
others     7.58%

Table 5: The principal components extracted from the 41 features.

Number  Feature name of principal component
1       duration
2       protocol_type
3       service
4       flag
5       src_bytes
6       dst_bytes
7       land
8       urgent
9       hot
10      num_failed_logins
11      logged_in
12      num_compromised

4.2. Experimental Results. Our experiments aim to provide exhaustive and comparable results for the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former in precision, TNR, time cost, and memory consumption.

To assess the advantage of our method, we need some evaluation indicators. Here, we use four measures to evaluate our algorithm: Precision, TNR, FPR, and DR [11]. They are defined as follows:

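$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}, \tag{18}$$

$$\text{TNR} = \frac{\text{TN}}{\text{FP} + \text{TN}}, \tag{19}$$

$$\text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}}, \tag{20}$$

$$\text{DR} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \tag{21}$$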

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;
(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;
(iii) TN (True Negative) is the number of normal records correctly classified as normal records;
(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA (precision (%) versus the value of $\alpha$, for $\alpha$ from 1 to 3).

Figure 5: TNR for comparing detection methods based on MDRA and MCA (TNR (%) versus the value of $\alpha$, for $\alpha$ from 1 to 3).

Table 6 shows the results of TP, FP, TN, and FN for the DDoS attack detection methods based on MDRA and MCA as $\alpha$ is varied from 1 to 3 in increments of 0.2.

The detection results for precision and TNR at the different $\alpha$ values are shown in Figures 4 and 5.


Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

          Indicators based on MDRA      Indicators based on MCA
$\alpha$  TP      FP   TN     FN        TP      FP    TN     FN
1         166299  278  60315  63554     223587  1743  58850  6266
1.2       166299  249  60344  63554     221873  1469  59124  7980
1.4       166292  227  60366  63561     206504  1313  59280  23349
1.6       166289  217  60376  63564     191190  1214  59379  38663
1.8       166289  204  60389  63564     190394  1159  59434  39459
2         166289  194  60399  63564     190342  1115  59478  39511
2.2       166289  191  60402  63564     190311  1065  59528  39542
2.4       166289  188  60405  63564     190277  1027  59566  39576
2.6       166282  180  60413  63571     190254  988   59605  39599
2.8       166282  176  60417  63571     190230  953   59640  39623
3         166282  172  60421  63571     190199  927   59666  39654
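As a worked check of definitions (18) to (21), the counts in Table 6 can be turned into the four measures with a few lines of Python. The function name `detection_metrics` is hypothetical and serves only this illustration; the counts are the MDRA values reported for $\alpha = 1$.

```python
def detection_metrics(tp, fp, tn, fn):
    """Precision, TNR, FPR, and DR as defined in (18)-(21), as fractions."""
    return {
        "precision": tp / (tp + fp),
        "tnr": tn / (fp + tn),
        "fpr": fp / (fp + tn),
        "dr": tp / (tp + fn),
    }


# Counts reported for MDRA at alpha = 1 in Table 6.
mdra = detection_metrics(tp=166299, fp=278, tn=60315, fn=63554)
for name, value in mdra.items():
    print(f"{name}: {100 * value:.2f}%")
```

For these counts the sketch gives a precision of about 99.83%, a TNR of about 99.54%, an FPR of about 0.46%, and a DR of about 72.35%. TNR and FPR always sum to 1 because they share the denominator FP + TN.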

Figure 6: (a) ROC for analyzing the detection method based on MDRA (DR (%) versus FPR (%)). (b) ROC for analyzing the detection method based on MCA (DR (%) versus FPR (%)).

Figure 4 shows that, as the value of $\alpha$ increases from 1 to 3 in increments of 0.2, the precision of the attack detection method based on MDRA is consistently superior to that of the method based on MCA, by about 0.4 to 0.6 percentage points.

Similarly, Figure 5 shows that the TNR of our detection method is higher than that of the MCA-based method over the whole range of $\alpha$, by about 1.2 to 2.4 percentage points.

In addition, the relationship between DR and FPR, expressed as the Receiver Operating Characteristic (ROC) curve, is frequently used to evaluate detection performance. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. In Figures 6(a) and 6(b), both ROC curves, for our method and for the MCA-based method, show a growing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% for DR, which shows that the change of DR with different $\alpha$ values is fairly small. In Figure 6(b), by contrast, the change is relatively large, and the ROC curve jumps from 83.18% to 89.84%. However, in Big Data, we pay more attention to the timeliness, time cost, resource consumption, and computational complexity of attack detection. Therefore, we regard the lower DR of our method as an acceptable trade-off. In these respects, our method has clear advantages over the other methods, as discussed in the next section.
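The ROC points for the MDRA-based method can be recomputed from Table 6, one (FPR, DR) pair per threshold value. The sketch below is illustrative only; `MDRA_ROWS` and `roc_points` are hypothetical names, and the rows are copied from the MDRA columns of Table 6.

```python
# (alpha, TP, FP, TN, FN) for the MDRA-based method, copied from Table 6.
MDRA_ROWS = [
    (1.0, 166299, 278, 60315, 63554),
    (1.2, 166299, 249, 60344, 63554),
    (1.4, 166292, 227, 60366, 63561),
    (1.6, 166289, 217, 60376, 63564),
    (1.8, 166289, 204, 60389, 63564),
    (2.0, 166289, 194, 60399, 63564),
    (2.2, 166289, 191, 60402, 63564),
    (2.4, 166289, 188, 60405, 63564),
    (2.6, 166282, 180, 60413, 63571),
    (2.8, 166282, 176, 60417, 63571),
    (3.0, 166282, 172, 60421, 63571),
]


def roc_points(rows):
    """Return (FPR, DR) pairs, one per threshold, sorted by increasing FPR."""
    points = [(fp / (fp + tn), tp / (tp + fn)) for _, tp, fp, tn, fn in rows]
    return sorted(points)
```

Sorting by FPR places the largest $\alpha$ first: a wider acceptance band around $\mu$ flags fewer normal records, which lowers FPR, and it also lets a few more attacks through, which lowers DR slightly. The resulting DR values stay between roughly 72.34% and 72.35%, matching the flat curve described for Figure 6(a).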

4.3. Results Comparisons in terms of Time Cost and Resource Consumption. Here, we analyze the time cost and memory resource consumption of the methods based on MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the one based on triangle area and MCA proposed by Tan et al. Our experimental server has two CPUs, each with 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was active, and its 16 cores gradually loaded to full capacity. The CPU times for running the experimental data with our detection method and with the other one are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

Figure 7: Comparing results in CPU time based on MDRA and MCA (CPU time (sec), logarithmic scale, versus the value of $\alpha$).

Figure 8: Comparing results in memory consumption based on MDRA and MCA (memory consumption (GB) versus the value of $\alpha$).

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. That is, in the same experimental environment, the method proposed by Tan et al. occupies more than 3 times as much memory as ours. The memory consumption for running the experimental data is compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in environments with vast amounts of network traffic in Big Data.

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. The previous method based on MCA has a computing complexity of $O(m^2)$, and because $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. Moreover, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
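A short worked count may make the constant-cost argument more concrete. It assumes, following the triangle-area formulation of [8], that the lower triangle of the map holds one area per pair of distinct features; the symbol $N(m)$ is introduced here only for illustration, and $m = 12$ is the number of features listed in Table 5.

```latex
\[
  N(m) = \binom{m}{2} = \frac{m(m-1)}{2} = O(m^2),
  \qquad
  N(12) = \frac{12 \cdot 11}{2} = 66,
\]
\[
  \frac{N(3m)}{N(m)} = \frac{3m(3m-1)}{m(m-1)} = \frac{3(3m-1)}{m-1} \approx 9
  \quad \text{for large } m .
\]
```

Under this assumption, each record requires a fixed number of pairwise areas once $m$ is fixed, which is why the per-record cost is $O(1)$, and reducing the dimensionality to one-third cuts the number of areas by roughly a factor of nine.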

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution distinguishes attack traffic from a vast amount of normal network traffic more effectively and efficiently in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, “A combined data mining approach for DDoS attack detection,” in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, “Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, “DDoS attack detection method using cluster analysis,” Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, “Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework,” World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,” Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, “DDoS attacks impact on network traffic and its detection approach,” International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.

[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “A system for denial-of-service attack detection based on multivariate correlation analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, “On a mathematical model for low-rate shrew DDoS,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, “A hierarchical intrusion detection model based on the PCA neural networks,” Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, “Evaluation of anomaly detection based on sketch and PCA,” in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM ’10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, “An MR brain images classifier via principal component analysis and kernel support vector machine,” Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, “Denial-of-service attack detection based on multivariate correlation analysis,” in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX ’00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, “Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, “Intrusion detection using an ensemble of intelligent paradigms,” Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, “A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection,” Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, “Network programming and mining classifier for intrusion detection using probability classification,” in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME ’12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, “A novel anomaly-network intrusion detection system using ABC algorithms,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, “Attribute normalization in network intrusion detection,” in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN ’09), pp. 448–453, December 2009.


(1) Input a set of training data of normal network traffic records $X^{nor} = \{x^{nor}_1, x^{nor}_2, \ldots, x^{nor}_t\}$, where $x^{nor}_i = [f^i_1, f^i_2, \ldots, f^i_m]$, $1 \le i \le t$
(2) Extract the principal components of $X^{nor}$ to reach 70% for the accumulative contribution rate based on PCA and obtain the principal component data set $X^{P_{nor}}$
(3) Calculate $TAM^{P_{nor},i}_{lower}$ and $TAM^{P_{nor}}_{lower}$ of $X^{P_{nor}}$
(4) Calculate the covariance matrices between the areas of every two triangles $T^{P_{nor}}$ in $X^{P_{nor}}$
(5) for $i = 1$ to $t$ do
(6)     Input $TAM^{P_{nor},i}_{lower}$ and $TAM^{P_{nor}}_{lower}$
(7)     Calculate $MD^{P_{nor},i}$ between $TAM^{P_{nor},i}_{lower}$ and $TAM^{P_{nor}}_{lower}$
(8)     Output $MD^{P_{nor},i}$
(9) end for
(10) Calculate $\mu$ by $MD^{P_{nor},i}$
(11) Calculate $\sigma$ by $MD^{P_{nor},i}$ and $\mu$
(12) Input a fresh incoming traffic record $x^{fresh}$
(13) Reduce the dimensions of the features of $x^{fresh}$ based on PCA to obtain the record of principal components $x^{P_{fresh}}$
(14) Calculate $TAM^{P_{fresh}}_{lower}$ of $x^{P_{fresh}}$
(15) Calculate $MD^{P_{fresh}}$ between $TAM^{P_{fresh}}_{lower}$ and $TAM^{P_{nor}}_{lower}$
(16) Input the threshold value $\alpha$
(17) if $(\mu - \sigma \cdot \alpha) \le MD^{P_{fresh}} \le (\mu + \sigma \cdot \alpha)$ then
(18)     return Normal
(19) else
(20)     return Attack
(21) end if

Algorithm 1: Algorithm for DDoS attack detection based on MDRA.
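The flow of Algorithm 1 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' MATLAB implementation. It assumes that records have already been reduced by PCA (steps 2 and 13), that each triangle area is $|f_j| \cdot |f_k| / 2$ as in the triangle-area formulation of [8, 16], and that $TAM^{P_{nor}}_{lower}$ is the mean map of the normal records. The small `ridge` term added to the covariance diagonal is an assumption to keep the matrix invertible. All function names are hypothetical.

```python
import math


def tam_lower(record):
    """Lower-triangle triangle-area map: |f_j| * |f_k| / 2 for all j > k (assumed form)."""
    return [abs(record[j]) * abs(record[k]) / 2.0
            for j in range(len(record)) for k in range(j)]


def mean_and_cov(vectors, ridge=1e-6):
    """Sample mean and covariance; `ridge` (an assumption) keeps the matrix invertible."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[c] for v in vectors) / n for c in range(d)]
    cov = [[sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / (n - 1)
            + (ridge if a == b else 0.0)
            for b in range(d)] for a in range(d)]
    return mean, cov


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    d = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * d
    for r in range(d - 1, -1, -1):
        x[r] = (aug[r][d] - sum(aug[r][c] * x[c] for c in range(r + 1, d))) / aug[r][r]
    return x


def mahalanobis(vec, mean, cov):
    """Mahalanobis distance sqrt(diff^T * cov^-1 * diff)."""
    diff = [a - b for a, b in zip(vec, mean)]
    return math.sqrt(max(0.0, sum(d * s for d, s in zip(diff, solve(cov, diff)))))


def train(normal_records):
    """Steps (3)-(11): build the normal profile from PCA-reduced normal records."""
    tams = [tam_lower(r) for r in normal_records]
    mean, cov = mean_and_cov(tams)
    mds = [mahalanobis(t, mean, cov) for t in tams]
    mu = sum(mds) / len(mds)
    sigma = math.sqrt(sum((m - mu) ** 2 for m in mds) / (len(mds) - 1))
    return {"mean": mean, "cov": cov, "mu": mu, "sigma": sigma}


def classify(record, profile, alpha):
    """Steps (12)-(21): a record is Normal if its distance lies within mu +/- alpha * sigma."""
    md = mahalanobis(tam_lower(record), profile["mean"], profile["cov"])
    low = profile["mu"] - alpha * profile["sigma"]
    high = profile["mu"] + alpha * profile["sigma"]
    return "Normal" if low <= md <= high else "Attack"
```

Note that the test in step (17) is two-sided, so for small $\alpha$ a record whose distance is unusually close to zero can also fall outside the band and be labeled as an attack.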


[24] C Bae W-C Yeh M A M Shukran Y Y chung and T-JHsieh ldquoA novel anomaly-network intrusion detection systemusing ABC algorithmsrdquo International Journal of InnovativeComputing Information and Control vol 8 no 12 pp 8231ndash8248 2012

[25] WWang X Zhang S Gombault and S J Knapskog ldquoAttributenormalization in network intrusion detectionrdquo in Proceedingsof the 10th International Symposium on Pervasive SystemsAlgorithms and Networks (ISPAN rsquo09) pp 448ndash453 December2009


Table 3: Type conversion for features 2, 3, and 4 of the 41 features.

| Number | Feature name | Type setting 1 | Type setting 2 | Type setting 3 | Type setting 4 |
|---|---|---|---|---|---|
| 2 | protocol_type | TCP = 1 | UDP = 2 | ICMP = 3 | |
| 3 | service | ecr_i = 1 | private = 2 | http = 3 | others = 0 |
| 4 | flag | SF = 1 | others = 0 | | |

Table 4: Top four types in the "service" feature.

| Type name | Percentage |
|---|---|
| ecr_i | 56.96% |
| private | 22.45% |
| http | 13.01% |
| others | 7.58% |

Table 5: The principal components extracted from the 41 features.

| Number | Feature name of principal component |
|---|---|
| 1 | duration |
| 2 | protocol_type |
| 3 | service |
| 4 | flag |
| 5 | src_bytes |
| 6 | dst_bytes |
| 7 | land |
| 8 | urgent |
| 9 | hot |
| 10 | num_failed_logins |
| 11 | logged_in |
| 12 | num_compromised |
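The type conversion in Table 3 can be sketched in Python as follows. This is a minimal illustration rather than the authors' preprocessing code: the function and dictionary names are hypothetical, and only the integer codes are taken from Table 3.

```python
from typing import Tuple

# Integer codes from Table 3; the names below are illustrative assumptions.
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}
SERVICE_CODES = {"ecr_i": 1, "private": 2, "http": 3}  # any other service -> 0
FLAG_CODES = {"SF": 1}                                  # any other flag -> 0


def encode_symbolic(protocol_type: str, service: str, flag: str) -> Tuple[int, int, int]:
    """Map the three symbolic features (numbers 2, 3, 4) to integers per Table 3."""
    # Table 3 defines codes only for TCP, UDP and ICMP, so an unknown protocol
    # raises KeyError here instead of being silently mapped to a made-up code.
    protocol = PROTOCOL_CODES[protocol_type.lower()]
    return (protocol, SERVICE_CODES.get(service, 0), FLAG_CODES.get(flag, 0))


print(encode_symbolic("icmp", "ecr_i", "SF"))
print(encode_symbolic("tcp", "smtp", "REJ"))
```

Services and flags outside the listed types fall back to 0, matching the "others = 0" columns of Table 3.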

4.2. Experimental Results. Our experiments aim to provide exhaustive and comparable results for the DDoS attack detection method based on MCA and the method based on MDRA. These results show that the latter is superior to the former.

To assess the advantage of our method, we need a set of evaluation indicators. Here we present four formulae to evaluate our algorithm: Precision, TNR, FPR, and DR [11]. They are defined as follows:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{18}$$

$$\text{TNR} = \frac{TN}{FP + TN} \tag{19}$$

$$\text{FPR} = \frac{FP}{FP + TN} \tag{20}$$

$$\text{DR} = \frac{TP}{TP + FN} \tag{21}$$

where

(i) TP (True Positive) is the number of attacks correctly classified as attacks;

(ii) FP (False Positive) is the number of normal records incorrectly classified as attacks;

(iii) TN (True Negative) is the number of normal records correctly classified as normal records;

(iv) FN (False Negative) is the number of attacks incorrectly classified as normal records.

Figure 4: Precision for comparing detection methods based on MDRA and MCA (precision in % versus the value of $\alpha$; curves for MDRA and MCA).

Figure 5: TNR for comparing detection methods based on MDRA and MCA (TNR in % versus the value of $\alpha$; curves for MDRA and MCA).

Table 6 shows all results of TP, FP, TN, and FN for $\alpha$ set from 1 to 3 in increments of 0.2 when we use the DDoS attack detection methods based on MDRA and MCA.

The detection results for precision and TNR at the different $\alpha$ values are shown in Figures 4 and 5.


Table 6: Results of TP, FP, TN, and FN based on MDRA and MCA.

| $\alpha$ | MDRA TP | MDRA FP | MDRA TN | MDRA FN | MCA TP | MCA FP | MCA TN | MCA FN |
|---|---|---|---|---|---|---|---|---|
| 1 | 166299 | 278 | 60315 | 63554 | 223587 | 1743 | 58850 | 6266 |
| 1.2 | 166299 | 249 | 60344 | 63554 | 221873 | 1469 | 59124 | 7980 |
| 1.4 | 166292 | 227 | 60366 | 63561 | 206504 | 1313 | 59280 | 23349 |
| 1.6 | 166289 | 217 | 60376 | 63564 | 191190 | 1214 | 59379 | 38663 |
| 1.8 | 166289 | 204 | 60389 | 63564 | 190394 | 1159 | 59434 | 39459 |
| 2 | 166289 | 194 | 60399 | 63564 | 190342 | 1115 | 59478 | 39511 |
| 2.2 | 166289 | 191 | 60402 | 63564 | 190311 | 1065 | 59528 | 39542 |
| 2.4 | 166289 | 188 | 60405 | 63564 | 190277 | 1027 | 59566 | 39576 |
| 2.6 | 166282 | 180 | 60413 | 63571 | 190254 | 988 | 59605 | 39599 |
| 2.8 | 166282 | 176 | 60417 | 63571 | 190230 | 953 | 59640 | 39623 |
| 3 | 166282 | 172 | 60421 | 63571 | 190199 | 927 | 59666 | 39654 |
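As a worked check of formulae (18)-(21), the following Python sketch evaluates the four indicators on the $\alpha = 1$ and $\alpha = 3$ rows of Table 6. The function name `indicators` and the dictionary layout are illustrative assumptions; only the counts come from the table.

```python
def indicators(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, TNR, FPR and DR as defined in formulae (18)-(21)."""
    return {
        "precision": tp / (tp + fp),
        "tnr": tn / (fp + tn),
        "fpr": fp / (fp + tn),
        "dr": tp / (tp + fn),
    }


# Rows of Table 6 for alpha = 1 and alpha = 3, as (TP, FP, TN, FN).
TABLE6 = {
    1.0: {"MDRA": (166299, 278, 60315, 63554), "MCA": (223587, 1743, 58850, 6266)},
    3.0: {"MDRA": (166282, 172, 60421, 63571), "MCA": (190199, 927, 59666, 39654)},
}

for alpha, row in TABLE6.items():
    mdra = indicators(*row["MDRA"])
    mca = indicators(*row["MCA"])
    # Differences are reported in percentage points (MDRA minus MCA).
    precision_gap = 100 * (mdra["precision"] - mca["precision"])
    tnr_gap = 100 * (mdra["tnr"] - mca["tnr"])
    print(f"alpha={alpha}: precision gap={precision_gap:.2f}, TNR gap={tnr_gap:.2f}")
```

Because FPR and TNR share the denominator FP + TN, they always sum to 1, which is a quick sanity check on any row.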

Figure 6: (a) ROC for analyzing detection method based on MDRA. (b) ROC for analyzing detection method based on MCA. (Both panels plot DR in % against FPR in %.)

In Figure 4, as the value of $\alpha$ increases from 1 to 3 in increments of 0.2, the precision of the attack detection method based on MDRA is consistently superior to that of its MCA-based counterpart; the former is about 0.4 to 0.6 percentage points higher than the latter.

Similarly, Figure 5 shows that the TNR of our detection method exceeds that of the MCA-based method at every value of $\alpha$; the former is about 1.2 to 2.4 percentage points higher than the latter.

In addition, the relationship between DR and FPR, plotted as the Receiver Operating Characteristic (ROC) curve, is frequently used to evaluate detection performance. The ROC curve is obtained by setting different thresholds, and there is a tradeoff between DR and FPR [25]. The ROC curves of the two detection methods are shown in Figure 6. In Figures 6(a) and 6(b), the ROC curves for our method and for the MCA-based method both show an increasing tendency. In Figure 6(a), the ROC curve of our method climbs gradually from 72.34% to 72.35% in DR, which shows that DR changes very little across the different $\alpha$ values. In Figure 6(b), by contrast, the change is relatively large, and the ROC curve jumps from 83.18% to 89.84%. However, in Big Data we pay more attention to the timeliness, time cost, resource consumption, and computational complexity of attack detection. Therefore, a slight discrepancy in DR could be ignored. In these respects, our method has most of the advantages in comparison with the other methods. This topic is discussed in the next section.
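The ROC points for the MDRA-based method can be recomputed from the counts in Table 6. The sketch below is illustrative only: the names `MDRA_ROWS` and `roc_points` are assumptions, and it relies on the fact that TP + FN and FP + TN are the same in every row of the table.

```python
# (alpha, TP, FP) for the MDRA-based method, copied from Table 6.
MDRA_ROWS = [
    (1.0, 166299, 278), (1.2, 166299, 249), (1.4, 166292, 227), (1.6, 166289, 217),
    (1.8, 166289, 204), (2.0, 166289, 194), (2.2, 166289, 191), (2.4, 166289, 188),
    (2.6, 166282, 180), (2.8, 166282, 176), (3.0, 166282, 172),
]
ATTACKS = 166299 + 63554  # TP + FN, identical in every row of Table 6
NORMALS = 278 + 60315     # FP + TN, identical in every row of Table 6


def roc_points(rows, attacks, normals):
    """Return (FPR %, DR %) pairs sorted by increasing FPR."""
    points = [(100 * fp / normals, 100 * tp / attacks) for _, tp, fp in rows]
    return sorted(points)


for fpr, dr in roc_points(MDRA_ROWS, ATTACKS, NORMALS):
    print(f"FPR = {fpr:.3f}%  DR = {dr:.3f}%")
```

Each value of $\alpha$ contributes one point, so the curve in Figure 6(a) is traced by sweeping the threshold from $\alpha = 3$ (lowest FPR) to $\alpha = 1$ (highest FPR).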

4.3. Results Comparisons in terms of Time Cost and Resource Consumption. Here we analyze the time cost and memory resource consumption of the detection methods based on MDRA and MCA.

On the one hand, our detection mechanism is superior in time cost to the method based on triangle area and MCA proposed by Tan et al. In our experimental environment, we use a server with two CPUs, each of which has 16 cores. When we ran the abovementioned experimental data, one of the two CPUs was active, and its 16 cores gradually loaded to full capacity. The CPU times for running the experimental data with our detection method and with the other method are compared in Figure 7. In the same experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

Figure 7: Comparing results in CPU time based on MDRA and MCA (CPU time in seconds on a logarithmic scale versus the value of $\alpha$; curves for MDRA and MCA).

Figure 8: Comparing results in memory consumption based on MDRA and MCA (memory consumption in GB versus the value of $\alpha$; curves for MDRA and MCA).

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. The memory occupied by our detection method in the experiments is less than 1 GB, whereas the other method needs more than 3 GB. In the same experimental environment, the memory occupied by the detection method proposed by Tan et al. is therefore more than 3 times as large as ours. The memory consumption results for running the experimental data are compared in Figure 8.

To sum up, our detection method is well suited to real-time DDoS attack detection in Big Data environments with vast amounts of network traffic.
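The paper does not state how CPU time and memory were measured. As one possible approach, the Python sketch below measures both for a single call using only the standard library. The `measure` helper and the `toy_detector` workload are hypothetical stand-ins, not the MDRA or MCA implementations.

```python
import time
import tracemalloc


def measure(func, *args):
    """Run func(*args) once; return (result, CPU seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.process_time()  # CPU time of this process, not wall-clock time
    try:
        result = func(*args)
    finally:
        cpu_seconds = time.process_time() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, cpu_seconds, peak_bytes


def toy_detector(n_records):
    # Stand-in workload only; it does not implement any detection logic.
    return sum(i * i for i in range(n_records))


result, cpu_seconds, peak_bytes = measure(toy_detector, 100_000)
print(f"CPU time: {cpu_seconds:.4f} s, peak Python allocations: {peak_bytes / 2**20:.3f} MiB")
```

Note that `tracemalloc` tracks only memory allocated through Python's allocator and adds overhead to the timed call, so figures obtained this way are not directly comparable with the process-level numbers reported in Figures 7 and 8.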

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method.

The previous method based on MCA has a computing complexity of $O(m^2)$; because $m$ is a fixed number, its overall computing complexity is $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. Moreover, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous MCA-based method. Hence the computing complexity of our method is also $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
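To illustrate why a quadratic cost in a fixed $m$ is constant per record, the sketch below computes a squared Mahalanobis distance with a precomputed inverse covariance matrix. It assumes, for illustration only, that this quadratic form is the dominant per-record step; the function names are hypothetical, and the code is not the authors' implementation.

```python
def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance (x - mean)^T inv_cov (x - mean).

    With inv_cov precomputed, the double loop runs m * m times,
    so the per-record cost is O(m^2) for m features.
    """
    m = len(x)
    diff = [x[i] - mean[i] for i in range(m)]
    total = 0.0
    for i in range(m):
        for j in range(m):
            total += diff[i] * inv_cov[i][j] * diff[j]
    return total


def identity(m):
    """m x m identity matrix as nested lists (toy inverse covariance)."""
    return [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]


# Toy check: with an identity matrix the result is the squared Euclidean distance.
print(mahalanobis_sq([3.0, 4.0], [0.0, 0.0], identity(2)))

# Inner-loop iterations per record for the full (41) and reduced (12) feature sets.
for m in (41, 12):
    print(m, "features ->", m * m, "inner-loop iterations")
```

Under this assumption, the number of iterations per record depends only on $m$ and not on the number of records, and reducing 41 features to the 12 in Table 5 shrinks the inner loop from 1681 to 144 iterations.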

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution distinguishes attack traffic from vast amounts of normal network traffic more effectively and efficiently in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, "A combined data mining approach for DDoS attack detection," in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, "Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies," IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, "DDoS attack detection method using cluster analysis," Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, "Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework," World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, "A triangle area based nearest neighbors approach to intrusion detection," Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, "DDoS attacks impact on network traffic and its detection approach," International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.

[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, "A system for denial-of-service attack detection based on multivariate correlation analysis," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, "On a mathematical model for low-rate shrew DDoS," IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, "Analysis of a complex of statistical variables into principal components," Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, "An overview of anomaly detection techniques: existing solutions and latest technological trends," Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, "A hierarchical intrusion detection model based on the PCA neural networks," Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, "Evaluation of anomaly detection based on sketch and PCA," in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM '10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, "An MR brain images classifier via principal component analysis and kernel support vector machine," Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, "Denial-of-service attack detection based on multivariate correlation analysis," in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, "Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection," in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom '12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, "Cost-based modeling for fraud and intrusion detection: results from the JAM project," in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX '00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, "The 1999 DARPA off-line intrusion detection evaluation," Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, "Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory," ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stolfo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, "Intrusion detection using an ensemble of intelligent paradigms," Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, "A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection," Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, "Network programming and mining classifier for intrusion detection using probability classification," in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME '12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, "A novel anomaly-network intrusion detection system using ABC algorithms," International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, "Attribute normalization in network intrusion detection," in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN '09), pp. 448–453, December 2009.


8 Mathematical Problems in Engineering

Table 6 Results of TP FP TN and FN based on MDRA and MCA

120572Indicators based on MDRA Indicators based on MCA

TP FP TN FN TP FP TN FN120572 = 1 166299 278 60315 63554 223587 1743 58850 6266120572 = 12 166299 249 60344 63554 221873 1469 59124 7980120572 = 14 166292 227 60366 63561 206504 1313 59280 23349120572 = 16 166289 217 60376 63564 191190 1214 59379 38663120572 = 18 166289 204 60389 63564 190394 1159 59434 39459120572 = 2 166289 194 60399 63564 190342 1115 59478 39511120572 = 22 166289 191 60402 63564 190311 1065 59528 39542120572 = 24 166289 188 60405 63564 190277 1027 59566 39576120572 = 26 166282 180 60413 63571 190254 988 59605 39599120572 = 28 166282 176 60417 63571 190230 953 59640 39623120572 = 3 166282 172 60421 63571 190199 927 59666 39654

723

7232

7234

7236

7238

724

DR

()

03 035 04 045 05025FPR ()

(a)

80

85

90

95

100

DR

()

18 22 26 314FPR ()

(b)

Figure 6 (a) ROC for analyzing detection method based on MDRA (b) ROC for analyzing detection method based on MCA

In Figure 4 it is not hard to find that when the value of120572 gradually increases from 1 to 3 with the increment of 02the precision of attack detection method based on MDRA issuperior to the counterpart based onMCA and the former isabout 04 to 06 percent higher than the latter

In Figure 5 similarly we find that the TNR of ourdetection method is completely superior to another one withthe progressive increment of 120572 and the former is about 12 to24 percent higher than the latter

In addition the relationship between DR and FPR isfrequently used to evaluate the detection performance by theReceiver Operating Characteristic (ROC) curve The ROCcurve is obtained by setting different thresholds and there isa tradeoff between the DR and FPR [25] The ROC curves ofthe comparisons about the two detection methods are shownin Figure 6 In Figures 6(a) and 6(b) the two ROC curvesthat are used to analyze attack detection performance basedon our method and another one show the growing tendencyIn Figure 6(a) theROCcurve of ourmethod climbs graduallyfrom 7234 to 7235 for DR and it reflects that the changeof DR with different 120572 values is fairly small Likewise inFigure 6(b) this change is relatively large and the ROC

curve jumps dramatically from 8318 to 8984 Howeverin Big Data we pay more attention to instantaneity timecost resource consumption and computational complexityof attack detection Therefore a shade of discrepancy of DRcould be ignored At this point our method has the vastmajority of advantages in comparison to other methods Thediscussion about this topic will be opened up in the nextsection

43 Results Comparisons in terms of Time Cost and ResourceConsumption Here we emphatically analyze time cost andmemory resource consumption based on MDRA and MCA

On the one hand our detection mechanism is superiorto another one based on triangle-area and MCA proposedby Tan et al in time cost In our experimental environmentwe employ this server which has two CPUs and whereevery CPU has 16 cores When we ran the abovementionedexperimental data one of two CPUs opened and 16 coresof this CPU would gradually load to its full capacity Atthe moment the comparing results in CPU time of runningthe experimental data based on our detection method andthe other one are shown in Figure 7 However in the same

Mathematical Problems in Engineering 9

MDRAMCA

10minus1

100

101

102

CPU

tim

e (se

c)

12 14 16 18 2622 24 31 2 28The value of 120572

Figure 7 Comparing results in CPU time based on MDRA andMCA

MDRAMCA

0

05

1

15

2

25

3

35

Mem

ory

cons

umpt

ion

(GB)

12 14 16 18 2622 24 31 2 28The value of 120572

Figure 8 Comparing results in memory consumption based onMDRA and MCA

experimental environment the CPU time of the detectionmethod proposed by Tan et al is about 80 times as long asours and our CPU time is less than one second

On the other hand in terms of memory consumptionour detection mechanism is also a cut above the rest of themethod proposed by Tan et al This is because the memoryoccupied by our detection method in the experiments takesup less than 1GB however another one needs memory spaceof more than 3GB In the same experimental environmentthe occupied memory space in the detection method pro-posed by Tan et al is more than 3 times as long as ours Thecomparing results in memory consumption of running theexperimental data are shown in Figure 8

To sumup our detectionmethod can be perfectly appliedin real-timeDDoS attack detection under the environment ofvast amount of network traffic in Big Data

44 Computing Complexity Analysis In this section weanalyze the computing complexity of our detection method

Because the previous method based on MCA has the com-puting complexity of 119874(119898

2) and 119898 is a fixed number the

overall computing complexity is equal to 119874(1) [8] How-ever our detection mechanism based on MDRA uses thesimilar computational principle What is more the fixedfeature dimensionality119898 after reducing dimensionality in ourmethod is one-third of the previous method based on MCAHence the computing complexity of ourmethod is also equalto119874(1) At this point our detectionmechanism is equal to oris better than the other methods in [6 8 16]

5 Conclusion

In this paper we present a real-time DDoS attack detectionmechanism based on the MDRA algorithm in Big DataCompared with previous methods the experimental resultsdemonstrate that our solution has the better effectiveness andefficiency to distinguish attack traffic from vast amount ofnormal network traffic on the aspects of precision rate TNRtime cost memory resource consumption and computingcomplexity

Competing Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported by the International Science andTechnology Cooperation Project of China (2013DFE13130)

References

[1] P J CriscuoloDistributed Denial of Service Trin00 Tribe FloodNetwork Tribe Flood Network 2000 and Stacheldraht CIAC-2319 Lawrence Livermore National Laboratory 2000

[2] M Kim H Na and K Chae ldquoA combined data miningapproach for DDoS attack detectionrdquo in Information Net-working Networking Technologies for Broadband and MobileNetworks vol 3090 of Lecture Notes in Computer Science pp943ndash950 Springer Berlin Germany 2004

[3] A Scherrer N Larrieu P Owezarski P Borgnat and P AbryldquoNon-Gaussian and long memory statistical characterizationsfor Internet traffic with anomaliesrdquo IEEE Transactions onDependable and Secure Computing vol 4 no 1 pp 56ndash70 2007

[4] K Lee J Kim K H Kwon Y Han and S Kim ldquoDDoS attackdetection method using cluster analysisrdquo Expert Systems withApplications vol 34 no 3 pp 1659ndash1665 2008

[5] H V Nguyen and Y Choi ldquoProactive detection of DDoS attacksutilizing k-NN classifier in an anti-DDoS frameworkrdquo WorldAcademy of Science Engineering and Technology InternationalScience Index vol 4 no 3 pp 247ndash252 2010

[6] C-F Tsai andC-Y Lin ldquoA triangle area based nearest neighborsapproach to intrusion detectionrdquo Pattern Recognition vol 43no 1 pp 222ndash229 2010

[7] A Bhange A Syad and S SinghThakur ldquoDDoS attacks impacton network traffic and its detection approachrdquo InternationalJournal of Computer Applications vol 40 no 11 pp 36ndash40 2012

10 Mathematical Problems in Engineering

[8] Z Y Tan A Jamdagni X J He P Nanda and R P LiuldquoA system for denial-of-service attack detection based onmultivariate correlation analysisrdquo IEEE Transactions on Paralleland Distributed Systems vol 25 no 2 pp 447ndash456 2014

[9] J Luo X Yang JWang J Xu J Sun and K Long ldquoOn amathe-matical model for low-rate shrew DDoSrdquo IEEE Transactions onInformation Forensics and Security vol 9 no 7 pp 1069ndash10832014

[10] H Hotelling ldquoAnalysis of a complex of statistical variables intoprincipal componentsrdquo Journal of Educational Psychology vol24 no 6 pp 417ndash441 1933

[11] A Patcha and J-M Park ldquoAn overview of anomaly detectiontechniques existing solutions and latest technological trendsrdquoComputer Networks vol 51 no 12 pp 3448ndash3470 2007

[12] G Liu Z Yi and S Yang ldquoA hierarchical intrusion detectionmodel based on the PCA neural networksrdquo Neurocomputingvol 70 no 7ndash9 pp 1561ndash1568 2007

[13] Y Kanda K Fukuda and T Sugawara ldquoEvaluation of anomalydetection based on sketch and PCArdquo in Proceedings of the 53rdIEEE Global Telecommunications Conference (GLOBECOM rsquo10)pp 1ndash5 IEEE Miami Fla USA December 2010

[14] Y Zhang and L Wu ldquoAn MR brain images classifier via prin-cipal component analysis and kernel support vector machinerdquoProgress in Electromagnetics Research vol 130 pp 369ndash3882012

[15] Z Tan A Jamdagni X He P Nanda and R P Liu ldquoDenial-of-service attack detection based on multivariate correlation anal-ysisrdquo in Neural Information Processing pp 756ndash765 SpringerBerlin Germany 2011

[16] Z Tan A Jamdagni X J He P Nanda and R P Liu ldquoTriangle-area-basedmultivariate correlation analysis for effective denial-of-service attack detectionrdquo in Proceedings of the 11th IEEEInternational Conference on Trust Security and Privacy inComputing and Communications (TrustCom rsquo12) pp 33ndash40IEEE Liverpool UK June 2012

[17] S Stolfo W Fan W Lee A Prodromidis and P K ChanldquoCost-basedmodeling for fraud and intrusion detection resultsfrom the JAMprojectrdquo inProceedings of theDARPA informationsurvivability conference and exposition (DISCEX rsquo00) pp 130ndash144 Hilton Head SC USA 2000

[18] R Lippmann JWHaines D J Fried J Korba andKDas ldquoThe1999 DARPA off-line intrusion detection evaluationrdquoComputerNetworks vol 34 no 4 pp 579ndash595 2000

[19] J McHugh ldquoTesting Intrusion detection systems a critique ofthe 1998 and 1999 DARPA intrusion detection system evalua-tions as performed by Lincoln Laboratoryrdquo ACM Transactionson Information and System Security vol 3 no 4 pp 262ndash2942000

[20] S StofoTheThird International Knowledge Discovery and DataMining Tools Competition The University of California 2002httpkddicsuciedudatabaseskddcup99kddcup99html

[21] S Mukkamala A H Sung and A Abraham ldquoIntrusiondetection using an ensemble of intelligent paradigmsrdquo Journalof Network and Computer Applications vol 28 no 2 pp 167ndash182 2005

[22] K-C Khor C-Y Ting and S Phon-Amnuaisuk ldquoA cascadedclassifier approach for improving detection rates on rare attackcategories in network intrusion detectionrdquo Applied Intelligencevol 36 no 2 pp 320ndash329 2012

[23] P Prasenna A V T Raghav Ramana R Krishna Kumar andA Devanbu ldquoNetwork programming and mining classifier for

intrusion detection using probability classificationrdquo in Proceed-ings of the International Conference on Pattern RecognitionInformatics and Medical Engineering (PRIME rsquo12) pp 204ndash209IEEE Salem Tamilnadu March 2012

[24] C Bae W-C Yeh M A M Shukran Y Y chung and T-JHsieh ldquoA novel anomaly-network intrusion detection systemusing ABC algorithmsrdquo International Journal of InnovativeComputing Information and Control vol 8 no 12 pp 8231ndash8248 2012

[25] WWang X Zhang S Gombault and S J Knapskog ldquoAttributenormalization in network intrusion detectionrdquo in Proceedingsof the 10th International Symposium on Pervasive SystemsAlgorithms and Networks (ISPAN rsquo09) pp 448ndash453 December2009



Figure 7: Comparing results in CPU time based on MDRA and MCA. (Plot: CPU time in seconds, logarithmic scale, versus the value of α; series: MDRA and MCA.)

Figure 8: Comparing results in memory consumption based on MDRA and MCA. (Plot: memory consumption in GB versus the value of α; series: MDRA and MCA.)

experimental environment, the CPU time of the detection method proposed by Tan et al. is about 80 times as long as ours, and our CPU time is less than one second.

On the other hand, in terms of memory consumption, our detection mechanism also outperforms the method proposed by Tan et al. In the experiments, our detection method occupies less than 1 GB of memory, whereas the method proposed by Tan et al. needs more than 3 GB. That is, in the same experimental environment, the memory occupied by their method is more than 3 times as large as ours. The comparison of memory consumption when running the experimental data is shown in Figure 8.
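The paper does not state how CPU time and memory were recorded. As a minimal sketch of how such a comparison could be instrumented, the following uses only the Python standard library. The `measure` helper and `toy_detector` are hypothetical names introduced for illustration, and `tracemalloc` only tracks memory allocated by the Python interpreter, so its figures are not directly comparable to the process-level numbers reported above.

```python
import time
import tracemalloc


def measure(detector, records):
    """Return (cpu_seconds, peak_bytes) for one run of detector(records)."""
    tracemalloc.start()
    start = time.process_time()          # CPU time, not wall-clock time
    detector(records)
    cpu_seconds = time.process_time() - start
    _, peak_bytes = tracemalloc.get_traced_memory()  # (current, peak)
    tracemalloc.stop()
    return cpu_seconds, peak_bytes


def toy_detector(records):
    """Hypothetical stand-in: flags records whose feature sum exceeds 1.5."""
    return [sum(r) > 1.5 for r in records]


records = [[0.1 * i, 0.2 * i] for i in range(1000)]
cpu_seconds, peak_bytes = measure(toy_detector, records)
print(f"CPU time: {cpu_seconds:.4f} s, peak memory: {peak_bytes} bytes")
```

Running both detectors through the same `measure` call on the same records keeps the comparison like for like.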

To sum up, our detection method is well suited to real-time DDoS attack detection in environments with vast amounts of network traffic in Big Data.

4.4. Computing Complexity Analysis. In this section, we analyze the computing complexity of our detection method. Because the previous method based on MCA has a computing complexity of $O(m^2)$ and $m$ is a fixed number, its overall computing complexity is equal to $O(1)$ [8]. Our detection mechanism based on MDRA uses a similar computational principle. Moreover, the fixed feature dimensionality $m$ after dimensionality reduction in our method is one-third of that in the previous method based on MCA. Hence, the computing complexity of our method is also equal to $O(1)$. In this respect, our detection mechanism is equal to or better than the other methods in [6, 8, 16].
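To make the argument concrete, the sketch below counts the work needed to score one record with a Mahalanobis-style quadratic form. It assumes the per-record cost is dominated by an $m \times m$ quadratic form with a precomputed inverse covariance matrix; the paper states only the $O(m^2)$ bound. The dimensionalities 30 and 10 are hypothetical values chosen to reflect the one-third ratio, not figures from the paper.

```python
def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance of one record from a fixed profile.

    inv_cov is a precomputed m x m inverse covariance matrix, so each
    record costs m * m inner-loop steps, independent of how many records
    have been processed before it.
    """
    m = len(x)
    d = [x[i] - mean[i] for i in range(m)]
    total = 0.0
    for i in range(m):
        row = inv_cov[i]
        for j in range(m):
            total += d[i] * row[j] * d[j]
    return total


def steps_per_record(m):
    """Number of inner-loop iterations mahalanobis_sq performs for dimension m."""
    return m * m


# Hypothetical dimensionalities: 30 features before reduction, 10 after.
m_before, m_after = 30, 10
ratio = steps_per_record(m_before) // steps_per_record(m_after)
print(ratio)  # 9

# Sanity check with an identity inverse covariance (plain squared distance).
identity = [[1.0, 0.0], [0.0, 1.0]]
print(mahalanobis_sq([3.0, 4.0], [0.0, 0.0], identity))  # 25.0
```

Both costs are constant with respect to the number of traffic records, which is why both methods are $O(1)$ per record. Under these assumptions, reducing $m$ to one-third shrinks the constant factor by about nine.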

5. Conclusion

In this paper, we present a real-time DDoS attack detection mechanism based on the MDRA algorithm in Big Data. Compared with previous methods, the experimental results demonstrate that our solution is more effective and efficient at distinguishing attack traffic from vast amounts of normal network traffic in terms of precision rate, TNR, time cost, memory resource consumption, and computing complexity.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the International Science and Technology Cooperation Project of China (2013DFE13130).

References

[1] P. J. Criscuolo, Distributed Denial of Service: Trin00, Tribe Flood Network, Tribe Flood Network 2000, and Stacheldraht, CIAC-2319, Lawrence Livermore National Laboratory, 2000.

[2] M. Kim, H. Na, and K. Chae, “A combined data mining approach for DDoS attack detection,” in Information Networking. Networking Technologies for Broadband and Mobile Networks, vol. 3090 of Lecture Notes in Computer Science, pp. 943–950, Springer, Berlin, Germany, 2004.

[3] A. Scherrer, N. Larrieu, P. Owezarski, P. Borgnat, and P. Abry, “Non-Gaussian and long memory statistical characterizations for Internet traffic with anomalies,” IEEE Transactions on Dependable and Secure Computing, vol. 4, no. 1, pp. 56–70, 2007.

[4] K. Lee, J. Kim, K. H. Kwon, Y. Han, and S. Kim, “DDoS attack detection method using cluster analysis,” Expert Systems with Applications, vol. 34, no. 3, pp. 1659–1665, 2008.

[5] H. V. Nguyen and Y. Choi, “Proactive detection of DDoS attacks utilizing k-NN classifier in an anti-DDoS framework,” World Academy of Science, Engineering and Technology, International Science Index, vol. 4, no. 3, pp. 247–252, 2010.

[6] C.-F. Tsai and C.-Y. Lin, “A triangle area based nearest neighbors approach to intrusion detection,” Pattern Recognition, vol. 43, no. 1, pp. 222–229, 2010.

[7] A. Bhange, A. Syad, and S. Singh Thakur, “DDoS attacks impact on network traffic and its detection approach,” International Journal of Computer Applications, vol. 40, no. 11, pp. 36–40, 2012.


[8] Z. Y. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “A system for denial-of-service attack detection based on multivariate correlation analysis,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, pp. 447–456, 2014.

[9] J. Luo, X. Yang, J. Wang, J. Xu, J. Sun, and K. Long, “On a mathematical model for low-rate shrew DDoS,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 7, pp. 1069–1083, 2014.

[10] H. Hotelling, “Analysis of a complex of statistical variables into principal components,” Journal of Educational Psychology, vol. 24, no. 6, pp. 417–441, 1933.

[11] A. Patcha and J.-M. Park, “An overview of anomaly detection techniques: existing solutions and latest technological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.

[12] G. Liu, Z. Yi, and S. Yang, “A hierarchical intrusion detection model based on the PCA neural networks,” Neurocomputing, vol. 70, no. 7–9, pp. 1561–1568, 2007.

[13] Y. Kanda, K. Fukuda, and T. Sugawara, “Evaluation of anomaly detection based on sketch and PCA,” in Proceedings of the 53rd IEEE Global Telecommunications Conference (GLOBECOM ’10), pp. 1–5, IEEE, Miami, Fla, USA, December 2010.

[14] Y. Zhang and L. Wu, “An MR brain images classifier via principal component analysis and kernel support vector machine,” Progress in Electromagnetics Research, vol. 130, pp. 369–388, 2012.

[15] Z. Tan, A. Jamdagni, X. He, P. Nanda, and R. P. Liu, “Denial-of-service attack detection based on multivariate correlation analysis,” in Neural Information Processing, pp. 756–765, Springer, Berlin, Germany, 2011.

[16] Z. Tan, A. Jamdagni, X. J. He, P. Nanda, and R. P. Liu, “Triangle-area-based multivariate correlation analysis for effective denial-of-service attack detection,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom ’12), pp. 33–40, IEEE, Liverpool, UK, June 2012.

[17] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX ’00), pp. 130–144, Hilton Head, SC, USA, 2000.

[18] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[19] J. McHugh, “Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory,” ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262–294, 2000.

[20] S. Stofo, The Third International Knowledge Discovery and Data Mining Tools Competition, The University of California, 2002, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

[21] S. Mukkamala, A. H. Sung, and A. Abraham, “Intrusion detection using an ensemble of intelligent paradigms,” Journal of Network and Computer Applications, vol. 28, no. 2, pp. 167–182, 2005.

[22] K.-C. Khor, C.-Y. Ting, and S. Phon-Amnuaisuk, “A cascaded classifier approach for improving detection rates on rare attack categories in network intrusion detection,” Applied Intelligence, vol. 36, no. 2, pp. 320–329, 2012.

[23] P. Prasenna, A. V. T. Raghav Ramana, R. Krishna Kumar, and A. Devanbu, “Network programming and mining classifier for intrusion detection using probability classification,” in Proceedings of the International Conference on Pattern Recognition, Informatics and Medical Engineering (PRIME ’12), pp. 204–209, IEEE, Salem, Tamilnadu, March 2012.

[24] C. Bae, W.-C. Yeh, M. A. M. Shukran, Y. Y. Chung, and T.-J. Hsieh, “A novel anomaly-network intrusion detection system using ABC algorithms,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, pp. 8231–8248, 2012.

[25] W. Wang, X. Zhang, S. Gombault, and S. J. Knapskog, “Attribute normalization in network intrusion detection,” in Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms, and Networks (ISPAN ’09), pp. 448–453, December 2009.
