engineering applications of artificial intelligence...system for network intrusion detection...

11
A population-based incremental learning approach with articial immune system for network intrusion detection Meng-Hui Chen a,b , Pei-Chann Chang a,b,n , Jheng-Long Wu b a School of Software, Nanchang University, Nanchang 330031, China b Innovation Center for Big Data & Digital Convergence and Department of Information Management, Yuan Ze University, Taoyuan 32026, Taiwan article info Keywords: Articial immune system Population-based incremental learning Evolutionary computation Classication problems Electronic commerce Collaborative ltering abstract The focus of this research is to develop a classier using an articial immune system (AIS) combined with population-based incremental learning (PBIL) and collaborative ltering (CF) for network intrusion detection. AIS is a powerful tool in terms of extirpating antigens inspired by the principles and processes of the natural immune system. PBIL uses past experiences to evolve into new species through learning and adopting the idea of CF for classication. The novelty of this research is in its combining of the three above mentioned approaches to develop a new classier which can be applied to detect network intrusion, with incremental learning capability, by adapting the weight of key features. In addition, four mechanisms: creating a new antibody using PBIL, dynamic adjustment of feature weighting using clonal expansion, antibody hierarchy adjustment using mean afnity, as well as usage rates, are proposed to intensify AIS performance. As shown by the comparison carried out with other articial intelligence and evolutionary computation approaches in network anomaly detection problems, our PBIL-AIS CF classier can achieve high accuracy for the benchmark problem. & 2016 Elsevier Ltd. All rights reserved. 1. Introduction With the advent and explosive growth of the global mobile computing and electronic commerce environments, network intrusion and anomaly detection in wide area data networks and electronic commerce infrastructures are gaining great attention from academic researchers and industrial practitioners. Recently, the number and impact of attacks has been increasing, as evi- denced by recent network attacks against several prominent Web portals as well as many well-known companies. In addition, there are many types of network attacks which can damage an organi- zation. Intrusions, such as advanced persistent threats (APTs), are designed to penetrate networks and surreptitiously steal intellec- tual property; distributed denial-of-service (DDoS) and ooding attacks can blind servers and shut down web sites. In addition, in the network economy, the problems caused by sustained network attacks are very difcult to handle. People worry about whether a major network disruption could cause confusion at local, national and even global levels. To detect network attacks effectively and automatically, software should be developed such that statistical signatures of network anomalies can be recognized algor- ithmically. This research focuses on building a classier for net- work intrusion and anomaly detection in electronic commerce environments. Classication models based on statistical pattern recognition approaches have attracted a wide range of interest in response to the growing demand for reliable and intelligent data mining sys- tems that can be used to detect network intrusion attacks. Today, sophisticated classication is required in various domains. The goal of classication is to accurately predict the target class for each case; the determination of correct classes would lead to efcient service provision. Traditional classication systems use statistical approaches based on frequency and attack probability determination. Detection approaches such as fuzzy control, arti- cial neural network, decision tree, SVM, and so forth, also have good performance in numerous fault detection problems based on classication (Razavi-Far et al., 2009; Baccarini et al., 2011; Przystalka and Moczulski, 2015). In recent years, another computationally intelligent approach, Evolutionary Computing (EC) has gained wide interest in the elds of computer science and articial intelligence. EC approaches, such as articial immune system (AIS), genetic algorithm (GA), evolu- tionary strategy (ES), genetic programming (GP), ant swarm Contents lists available at ScienceDirect journal homepage: www.elsevier.com/locate/engappai Engineering Applications of Articial Intelligence http://dx.doi.org/10.1016/j.engappai.2016.01.020 0952-1976/& 2016 Elsevier Ltd. All rights reserved. n Correspondence to: Department of Information Management, Yuan Ze Uni- versity,135 Yuan Tung Road, Chungli 32003, Taiwan. Tel.: þ886 3 4638800x2305; fax: þ886 3 4352077. E-mail address: [email protected] (P.-C. Chang). Please cite this article as: Chen, M.-H., et al., A population-based incremental learning approach with articial immune system for network intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.doi.org/10.1016/j.engappai.2016.01.020i Engineering Applications of Articial Intelligence (∎∎∎∎) ∎∎∎∎∎∎

Upload: others

Post on 31-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎

Contents lists available at ScienceDirect

Engineering Applications of Artificial Intelligence

http://d0952-19

n Corrversity,fax: þ8

E-m

Pleasnetw

journal homepage: www.elsevier.com/locate/engappai

A population-based incremental learning approach with artificial immunesystem for network intrusion detection

Meng-Hui Chen a,b, Pei-Chann Chang a,b,n, Jheng-Long Wu b

a School of Software, Nanchang University, Nanchang 330031, Chinab Innovation Center for Big Data & Digital Convergence and Department of Information Management, Yuan Ze University, Taoyuan 32026, Taiwan

a r t i c l e i n f o

Keywords:Artificial immune systemPopulation-based incremental learningEvolutionary computationClassification problemsElectronic commerceCollaborative filtering

x.doi.org/10.1016/j.engappai.2016.01.02076/& 2016 Elsevier Ltd. All rights reserved.

espondence to: Department of Information135 Yuan Tung Road, Chungli 32003, Taiwan.86 3 4352077.ail address: [email protected] (P.-C

e cite this article as: Chen, M.-H., eork intrusion detection. Eng. Appl. A

a b s t r a c t

The focus of this research is to develop a classifier using an artificial immune system (AIS) combined withpopulation-based incremental learning (PBIL) and collaborative filtering (CF) for network intrusiondetection. AIS is a powerful tool in terms of extirpating antigens inspired by the principles and processesof the natural immune system. PBIL uses past experiences to evolve into new species through learningand adopting the idea of CF for classification. The novelty of this research is in its combining of the threeabove mentioned approaches to develop a new classifier which can be applied to detect networkintrusion, with incremental learning capability, by adapting the weight of key features. In addition, fourmechanisms: creating a new antibody using PBIL, dynamic adjustment of feature weighting using clonalexpansion, antibody hierarchy adjustment using mean affinity, as well as usage rates, are proposed tointensify AIS performance. As shown by the comparison carried out with other artificial intelligence andevolutionary computation approaches in network anomaly detection problems, our PBIL-AISCF classifiercan achieve high accuracy for the benchmark problem.

& 2016 Elsevier Ltd. All rights reserved.

1. Introduction

With the advent and explosive growth of the global mobilecomputing and electronic commerce environments, networkintrusion and anomaly detection in wide area data networks andelectronic commerce infrastructures are gaining great attentionfrom academic researchers and industrial practitioners. Recently,the number and impact of attacks has been increasing, as evi-denced by recent network attacks against several prominent Webportals as well as many well-known companies. In addition, thereare many types of network attacks which can damage an organi-zation. Intrusions, such as advanced persistent threats (APTs), aredesigned to penetrate networks and surreptitiously steal intellec-tual property; distributed denial-of-service (DDoS) and floodingattacks can blind servers and shut down web sites. In addition, inthe network economy, the problems caused by sustained networkattacks are very difficult to handle. People worry about whether amajor network disruption could cause confusion at local, nationaland even global levels. To detect network attacks effectively and

Management, Yuan Ze Uni-Tel.: þ886 3 4638800x2305;

. Chang).

t al., A population-based inrtif. Intel. (2016), http://dx.

automatically, software should be developed such that statisticalsignatures of network anomalies can be recognized algor-ithmically. This research focuses on building a classifier for net-work intrusion and anomaly detection in electronic commerceenvironments.

Classification models based on statistical pattern recognitionapproaches have attracted a wide range of interest in response tothe growing demand for reliable and intelligent data mining sys-tems that can be used to detect network intrusion attacks. Today,sophisticated classification is required in various domains. Thegoal of classification is to accurately predict the target class foreach case; the determination of correct classes would lead toefficient service provision. Traditional classification systems usestatistical approaches based on frequency and attack probabilitydetermination. Detection approaches such as fuzzy control, artifi-cial neural network, decision tree, SVM, and so forth, also havegood performance in numerous fault detection problems based onclassification (Razavi-Far et al., 2009; Baccarini et al., 2011;Przystałka and Moczulski, 2015).

In recent years, another computationally intelligent approach,Evolutionary Computing (EC) has gained wide interest in the fieldsof computer science and artificial intelligence. EC approaches, suchas artificial immune system (AIS), genetic algorithm (GA), evolu-tionary strategy (ES), genetic programming (GP), ant swarm

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 2: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎2

optimization (ASO) and particle swarm optimization (PSO) mimicthe natural evolution mechanism to solve complex problems(Zhao, 2007; Castellani and Rowlands, 2008; Nemati et al., 2009;Agarwal et al., 2012). Hunt et al. (1998) proposed that AIS as aspecial approach, among these EC approaches, has a powerfulevolution mechanism in accordance with the natural immuneresponse. AIS exploits immune systems’ characteristics such asfeature extraction, pattern recognition and learning and memorycapabilities to intelligently modify itself with experience andcontinuously evolve to achieve higher accuracy. Many researchershave used AIS for parameter combination optimization (Gao, 2010;Aydin et al., 2011). Dasgupta et al. (2011) investigated the AISapproach, which effectively identifies antibodies, creates newantibodies and then incorporates Clonal Selection and NegativeSelection Mechanisms. Based on their usage and mean affinity, aqualitative classification of antibodies is also maintained duringthe training. AIS has also been employed to solve classificationproblems, such as the study by Watkins et al. (2004); theydeveloped immune inspired supervised learning algorithms andartificial immune recognition systems (AIRS). De Castro and VonZuben (2002) proposed the computational implementation of theclonal selection principle that explicitly takes into account theaffinity maturation of the immune response. Carter (2000) pro-posed a supervised learning system (Immunos-81) using softwareabstraction of T cells, B cells, antibodies and their interactions inwhich artificial T cells control the creation of B-cell populations(clones). However, the performances of the above AIS-basedapproaches are not competitive with other evolutionary or SVM-based approaches, as shown in Baccarini et al. (2011). As men-tioned by Dasgupta et al. (2011), the AIS-based approach can befurther improved by embedding more sophisticated mechanismsin controlling the affinity between the antibody and the antigen toimprove their classification accuracy.

In this research, we incorporate population-based incrementallearning (PBIL) into AIS classification to enhance the performance.Since classification is one of the most important mining tasks, wefocus on developing various AIS algorithms, such as a clonalselection algorithm (CLONALG) and an antibody hierarchyadjusting mechanism. Since the existing antibodies may not besufficient to efficiently extirpate all the antigens, the creation ofnew antibodies in response to the dynamic requirements becomesnecessary. This can be achieved through incremental learningprocedures; therefore we use PBIL to evolve new antibodies withhigher affinities than the older ones, which by themselves werenot capable of correctly identifying the class. The primary objec-tive of this research is to develop a classifier by combining PBILwith AIS. In addition, the antibodies related to the target instanceare clustered together using collaborative filtering for classifyingthe class of the target intrusions.

The rest of the paper is organized as follows. Section 2 presentsa review of the literature and defines the classification problem ingeneral. Based on the two specific problems we aim to solve in thispaper, we outline a few traditional classification approaches andintroduce evolutionary computation methods. Section 3 describesthe methodology detailing the core algorithms used in ourmethod. Section 4 presents the experimental results and comparesour method with other classification approaches. We then closewith conclusions and directions for future research.

2. Literature review

2.1. Classification problems

Classification problems require assigning items in a collectionto target categories or classes. The goal is to accurately predict the

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

target class of each case in the data set. The assignment is based onquantitative information derived from certain characteristics, suchas antibody or antigen features inherent in the items as antibodiesand antigens. During the classification process, a group of markedclasses is provided and a training set is used to learn the defini-tions of these classes. Classification rules are determined, and thenthese rules are treated as benchmarks to identify the most likelylabel and the class of a new pattern (Woolley and Milanovic, 2011).Classification problems stem from various real-world situationssuch as those found in biological fields (Alves et al., 2010), financialservices (Twala, 2010), remote sensing (Zhong et al., 2007) andinventory classification (Thomas, 2000).

Whether used to detect legitimate uses against attacks intoday’s ubiquitous use of e-commerce, or the need to accuratelydistinguish cancer cells with the rising demands for sophisticatedhealth, or for industrial activities such as expensive petroleumdrilling operations, identification of impeccable classifiers is nee-ded everywhere. An evaluation of classifiers thus becomes veryimportant. Criteria such as classification precision or accuracy, thescalability of learning and classification on large data sets, therobustness to noise and the ability for incremental learning areoften used to evaluate classification techniques. Scalability andincremental learning ability have become increasingly importantwith the collection of large amounts of data resulting frommoderncomputing and information technologies. Since patterns embed-ded in large data sets generated through industrial processes maybe dynamic, a classification technique should have incrementallearning ability to update existing patterns with the collection ofnew data, and also be scalable to process data in large volumes.

An intrusion detection system (IDS) is a monitoring or a pro-tection mechanism against various malicious activities or policyviolations. With the proliferation of computers, networks and theInternet, security has become a primary concern. In general, thereare two main types of IDS: network intrusion detection system(NIDS) and host-based intrusion detection system (HIDS). Theintrusion detection approach usually uses statistical analysis andpattern recognition, and is capable of detecting anomaly intrusionswithout any prior knowledge; therefore, the model is able togeneralize and extract intrusion rules during training. IDSs canreliably identify intrusion attacks in correspondence to knownsignatures of discovered vulnerabilities. Abadeh et al. (2011)referred to the use of genetic fuzzy systems (GFSs) in hybridmodels to solve intrusion attacks problems. They presented threekinds of genetic fuzzy systems and an iterative rule learningapproach to deal with intrusion detection. Altwaijry and Algarny(2012) presented a Bayesian intrusion detection system based onthe Bayesian probability theory. Horng et al. (2011) presented ahierarchical clustering and support vector machines hybrid modelto build an IDS. Afzali and Azmi (2014) presented a multi-agentAIS-based distributed intrusion detection system; the character-istics of MAIS-IDS are cloning, mutation, migration, collaborationand randomness. All of these researches solved the KDD Cup 1999dataset, which is very popular in numerous studies. In this paper,we also use the KDD Cup 1999 dataset to evaluate our classifica-tion model and hence, to compare results.

Other popular classification problems also include credit cardapproval. With the recent growth of the credit industry, a need foran actively managed credit scoring model has emerged. A creditscoring technique is a set of decision models that assist lenders ingranting consumer credit (Thomas, 2000). The models help inmaking decisions on whether to grant credit to new applicantsbased on customer characteristics such as age, income and maritalstatus. In recent years, it has been extensively used for creditadmission. The basic principle of credit scoring is to assess thosewho apply for fresh credit by predicting through the analyses ofrepayment performance on the part of previous consumers.

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 3: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 3

Initially, linear discriminate analysis was used for the evaluationeven though the highly categorical data cannot be accuratelyassessed with linear methods. The restrictions of linearity and highuse of probability distributions shattered the predictability of thecredit scoring models. Since then, researchers have developed avariety of statistical models for credit scoring, such as logisticregression models (Desai et al., 1996). In recent years, many non-parametric statistical methods have been designed to solve theproblem with the use of neural networks (Zhang et al., 1999) orgenetic algorithms (Kim and Street, 2004).

2.2. Evolutionary computation

Evolutionary computation, a creative process inspired by nat-ure's evolution, is capable of addressing complex problems withgreater accuracy and speed. These problems normally involve highrandomness, nonlinear dynamics and noise, which are difficult tofathom with traditional algorithms. It employs iterative progress,such as population development. The population is then guidedthrough some parallel processing, for it to adhere to certain rulesin order to achieve the desired results. The most popular ECapproach is GAs, which is a family of nature inspired optimizationmodels. The basic idea was derived from natural selection’sretention of the fittest members of the population, and then car-rying them forward for future selections. The fitter populationleads to better output at each step and thus iteratively convergesto an optimal solution. The basic concept of GA was outlined byGoldberg (1989). GAs have been applied for classification purposesat many instances in their very basic form, as well as in advancedforms. Mining classification rules from large databases (Dehuriet al., 2008) and hybrid models that combine genetic algorithmsand neural networks for classifying garment defects (Yuen et al.,2009) are both examples of their applications.

In this section, we review the role of EC in classification. Incomputer science, artificial immune systems are based on com-putationally intelligent systems inspired by the principles andprocesses that simulate the vertebrate immune system. AIS is oneof the EC methodologies that have been extensively used fordeveloping classification approaches. It is an adaptive system,influenced by theoretical immunology and observed immunefunctions, principles and models, which are then applied for pro-blem solving (Aydin et al., 2011). In addition, AIS approaches havebetter performance compared to artificial neural networks, GAs,fuzzy systems and so on, and have been successfully applied tomany fields such as clustering, classification, pattern recognition,computer defense and optimization (Alatas and Akin, 2005; Dar-moul et al., 2006; Hart and Timmis, 2008; Tavakkoli-Moghaddamet al., 2008; Aydin et al., 2010; Chang et al., 2011; Liu et al., 2012,Zhang, et al., 2014).

Studies on AIS have led to enhancement in many processes,including the clonal selection (CS) mechanism and negativeselection (NS) mechanism. From time to time, the NS mechanismmay introduce weaker elements in the population to accom-modate certain diversities. The CS mechanism captures basic fea-tures of an immune response to an antigenic stimulus, and ensuresthat only those antibodies that recognize antigens effectivelyremain in the memory cell. The CLONALG algorithm, introduced byWhite and Garrett (2003), primarily focuses on pattern recognitionrather than explicitly on classification.

2.3. Population-based incremental learning

In this paper, we incorporate PBIL into AIS classification toenhance the performance. The concept of incremental learning

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

involves increasing tolerance during the learning process, whichtends to improve with ongoing iteration by creating new membersof the population if certain threshold requirements are not met (Yeand Li, 2002). The PBIL algorithm, first proposed by Baluja andCaruana (1995), is an evolutionary optimization algorithm andestimation of distribution algorithm. It is a type of genetic algo-rithm (GA) whereby statistics contained in any population areexplicitly maintained; the genotype of an entire population isevolved, rather than just those of the individual members (Karrayand Silva, 2004). The PBIL algorithm is as follows: (1) a populationis generated from a probability vector, (2) each member is eval-uated and ranked based on its fitness, (3) the population prob-ability vector is updated based on the fittest individuals,(4) mutation and (5) repeating steps 1 through 4.

PBIL has been very successful when compared against standardGAs on many benchmark and real time problems, such as thedevelopment of power system stabilizers through incrementallearning (Folly, 2007); with training using incremental learning,the variation in patterns can then be used to develop models forthe design of responsive stabilizers.

2.4. Collaborative filtering

A typical collaborative filtering scenario is one in which, given aset of users and a set of items, the users may rate a subset of items,while the system must predict a missing user rating for an item.Based on their information processing approaches, collaborativefiltering techniques can be classified into two types: model-basedand memory-based. Memory-based approaches predict the mis-sed rating from a group of users or items with similar profiles; inthis type of approach, similarity calculation is a critical step forfinding a group of similar users or items.

Well-known similarity metrics include the Pearson correlationcoefficient (Resnick et al., 1994), constrained Pearson (Shardanandand Maes, 1995), weighted Pearson correlation (Herlocker et al.,1999) and cosine similarity (Breese et al., 1998). Although thesesimilarity measures have been used in many collaborative filteringalgorithms, some researchers are dissatisfied with their perfor-mance. As a result, novel similarity models are continuously beingput forward (Ahn, 2008; Liu et al., 2014). For example, Liu et al.(2014) proposed a similarity model taking into account the localcontext information of user ratings and the global preference ofuser behavior. They claim that this new model is more effective,especially under cold user conditions. In addition, memory-basedcollaborative filtering techniques achieve recommendations basedon a group of similar users or items; they are also called neighbor-based methods. Neighbor-based approaches can be further clas-sified unto two types,: user-based (Resnick et al., 1994; Sharda-nand, 1994) and item-based (Sarwar et al., 2001), according to thesimilarity of their calculation methods. User-based approachesfilter information based on a group of similar users, while item-based approaches compute the similarity of items instead of usersimilarity.

3. Proposed approach

This study develops a classifier, named a PBIL-AISCF Classifier, tosolve classification problems. The focus of this study is the gen-eration of multiple type pools of antibodies for classification pro-blems. Our proposed model contains three phases, as shown inFig. 1. The first phase is the data collection and pre-processingstage; the second phase is the core application of PBIL-AISCF forevolution, which involves affinity calculation, the creation of new

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 4: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

PBIL-AIS Classifier

Detection phase

Data collection and preprocessing phase

Evolution phase

User

InstancesAnswer

label

Detectionrules

Newinstance Notify Event

class

Colleting data

Affinitycalculation

LearningClassification Model

by PBIL-AIS

Fig. 1. Framework of PBIL-AISCF Classifier.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎4

antibodies, a clonal selection mechanism and antibody hierarchy;the third phase is the detection stage where detection rules of theproposed classifier are employed.

Details of the proposed model are described below.

3.1. Data collection and pre-processing phase

In this step, we used benchmark data from the UCI database1 toevaluate the PBIL-AISCF model. We collect records from users toform the training data for learning. According to the user’srequirement, each record may represent a credit card applicant incredit card validation data, such as those contained in the Australiacredit card data set, or a user’s connection to the service system toprovide services, such as those in the KDD99 cup data set. Therecords used for developing the training data must be a super setthat contains all classes and covers all variations pertaining tofeature values, which is why we chose an extensive record. Forclassification problems, each record will tag the answer label forsupervised learning. The training dataset plays a critical role in thecreation of the desired antibody pool, and thus has a significant

1 UCI Machine Learning Repository (Online), available: http://archive.ics.uci.edu/ml/.

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

impact on accuracy, sensitivity and specificity values. In essence,each element of the training data is used during the learningprocess and is important in shaping the final pool. The trainingand testing data we are interested in for the datasets containfeature characteristic values. The feature value can be acquiredfrom one of the two data types: nominal or continuous.

The nominal values use Hamming distance (HD) to computethe nominal difference h. HD between two strings of follows:

hj ¼X

f ¼ 1AH

wf Ij:f ; Ij:f ¼0; if xj;f ¼ xAgf

1; if xj;f axAgf;

8<: ð1Þ

where wf represents the weight of the fth feature for the jthantibody, I represents the difference value as nominal data, H is aset of nominal data and xAgf is the fth feature value of the antigen.However, we need to transform the data scale for continuous data.The normalization is done as follows:

x0j;g ¼xj;g� minðxgÞ

maxðxgÞ� minðxgÞ; ð2Þ

where x’j,g is the normalized value for the jth antibody of thegth feature and xj,g is the original feature value for the jth antibodyof the gth feature. Min(xg) represents the minimal value of the gthfeature andMax(xg) is the maximum value of the gth feature. Then,for continuous values, we use Euclidean Distance (ED) to calculatethe continuous difference e. The distance of continuous dataiscalculated as follows:

ej ¼ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiXg ¼ 1AE

wgðx0j;g�xAgg Þ2s

; ð3Þ

where wj.g denotes the weighting of the gth feature for the jthantibody; xAgg denotes gth feature value of the antigen.

3.2. Evolution phase: using the PBIL-AISCF model

The evolution of PBIL-AISCF model is shown in Fig. 2.Step 1: Initial Antibody Pool. In this step, we will generate the

same n numbers of initial antibodies in each category pool. Arandom selection from the training dataset will be made. Theseantibodies will initiate the training process and a final exhaustedantibody pool will be generated through the evolution process. Weaim for sufficient information representing each class in the initialantibody pool for the AIS evolution.

Step 2: Calculating Affinity between Antibodies and Antigen. Inthis paper, affinity calculation is the key to what antibody can beextirpated from the antigens; thus, we consider each trainingdatum as an antigen in our system. The value for affinity is cal-culated for each antibody with the incoming antigen. Each anti-body and antigen contains certain features. The number of featuresand specific characteristics defined by the features are identical forall antigens and antibodies in any given dataset. An affinity canmeasure the relationship of the same feature of antibody andantigen. However, the feature value has two types of data: nominalor numeric. Nominal data use HD (Eq. (1)) to obtain the differentdegree, whereas numeric data use ED (Eq. (3)). The affinity valuefor each antibody–antigen pairs is calculated as follows:

Af f inityj ¼1

hjþejj¼ 1;2;3; :::; v; ð4Þ

where Af f inityj denotes the affinity value for the jth antibody withthe current antigen; hj denotes the net distance as nominal data; ejdenotes the net distance as numeric data. The affinity calculationis a very important step since we determine the use of an antibodybased on affinity: the lower the net distance, the higher the affi-nity, thus the better the antibody is in countering that particularantigen. In classification problems, we need to identify the

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 5: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

Input training data

Initial class pools of antibody

Calculate the affinity betweenantibodies with antigen

Is detected correctclass?

Have all antigensbeen processed?

Testing

Yes

Create the newantibody No

Antibody hierarchy mechanism

Clonal expansion mechanismNo

Yes

Fig. 2. Flowchart for PBIL-AISCF.

Find the useful features in Abright ; i.e. thosefeatures whose values are to be retained in Abnew

If the affinity of Abnew isthe best or repeat N runs

Yes

Yes

No

Find the Abright and Abwrong andif the affinity of Abwrong> Abright

Calculate other feature values of Abnew throughincremental learning

Calculate the affinity of Abnewwith the particular antigen

Let the class of Ag And Abnew be same

Add the Abnew to antibody Pool

Fig. 3. The flowchart for the new antibody evolution.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 5

antibodies that can efficiently extirpate antigens of the right class.In our model, for an antibody to correctly extirpate an antigen, thevalue of affinity for the correct class of antibody has to be the best.We define the best antibody Abbest with the highest current affinityto form multiple antibody pools k. Abbest is determined as follows:

Abbest ¼ arg maxClassk

af f initykj ; ð5Þ

where Class denotes the label on classes, such as anomaly ornormal.

The antibody with the best affinity for a particular antigen maynot be of the extirpated antigen. This antibody hence cannot beused for detection. Nevertheless, none of the other antibodies isgood enough to be used for the antigen. In order to solve thisproblem, we propose PBIL to create new antibody with betteraffinity and for correct detection.

Step 3: Create New Antibody through evolution with PBIL. Asdescribed in the previous section, the call for evolution isimperative. In this step, we seek to create a new antibody; thisbecomes necessary in the insufficient antibody pool during thetraining. The PBIL approach is employed. The antibody of thehighest affinity with the particular antigen is selected from indi-vidual antibody classes. Abright stands for the antibody with thehighest affinity in the correct class (class of Ab and class of Ag arethe same) and Abwrong stands for the antibody with the highestaffinity, but in the incorrect class. The evolution only becomesnecessary when Af f inityAb

right oAf f inityAbwrong

. The new antibodyshould be identically structured as the parent antibody, and it

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

should be generated only through a parent of the correct class.Feature values for the new antibody Abnew should be calculatedseparately for the nominal and the continuous sets. If the featurebelongs to the nominal dataset, it would be presented as shown inEq. (6) The flowchart for the new antibody evolution is shownin Fig. 3:

xAbnew

f ¼ xAgf ð6Þ

xAbnew

f denotes the new value of feature f, which is the nominaldata for each fth nominal feature. If the feature belongs to thecontinuous dataset, then the concept of learning rate is used. WithLR being the learning rate and N being the maximum number oftimes for evolution, according to the gradient descent algorithm,the LR value is designated to be high initially, but tends to decreasewith subsequent training. The new value of the gth feature in thenew antibody xnewg can be derived as follows:

xAbnew

g ¼ xAbright

g þLR� ðxAgg �xAbright

g Þ ; 0oLRo0:5; ð7Þ

Details of the steps for the creation of a new antibody aredescribed as follows:

1. Antibody Abright and Abwrong are identified using affinity rules.2. Antibody features with high information for learning are iden-

tified. This is done by scanning the value of feature differencebetween the antibodies and the antigen. A value equal or closeto 0 indicates close proximity, and hence is important, whilehigher values indicate less important features.

3. Other antibody features follow PBIL to learn.4. Affinity of the new antibody to the antigen is re-calculated.5. If the new antibody is of the highest affinity, then we proceed to

the next step; otherwise, we go back to step 3).6. The class of the new antibody is defined according to the

antigen class.

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 6: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

Fig. 4. Pseudo code for the new antibody evolution.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎6

7. The new antibody is injected into the antibody pool accordingto the antibody pool class.

The pseudo code of new antibody evolution is shown in Fig. 4.Step 4: Clonal Expansion Mechanism. In this section, we further

modify the clonal selection by introducing dynamic weightadjustment. Traditionally, only features that are important forclassification are allotted higher weights, yet features distinct fromthe corresponding antigen features are not altered. Here, wesuggest decreasing the weights of these features simply based ontheir distances to the corresponding features, as we believe thiswill further improve the affinity gap issue between the con-tributing antibodies and the non-contributing ones. Antibodyfeature weights can be adjusted from the clonal expansion rate(Eq. (8)). From Eqs. (1) and (2), wf and wg are derived from Eq. (7).Let weight wf and wg be the weight set with a feature set WS¼{w1f1, w2f2,…, wnfn }, where the feature set includes nominal andnumerical data:

wnewt ¼wold

t þΔτt ; ð8Þ

Δτt ¼ 1� xAgt �xAbtNg

!; ð9Þ

where wnewt denotes the new weight of the tth feature in the fea-

ture set;Δτt denotes the expand rate in tth feature; Ng denotes the

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

number of antigens. Based on formulae (8) and (9), we can expandweights of feature for learning high useful weighting using clonalexpand mechanism. However, for the affinity relation, if the dis-tance is AbrightoAbwrong, then we increase the feature weight;otherwise, if the distance is Abright4Abwrong, then we decrease thefeature weight. That said, the aim is to change or increase therelation to Af f inityAb

right4Af f inityAb

wrongafter the clonal expansion.

The flowchart and pseudo code for clonal expansion are shown inFigs. 5 and 6.

Step 5: Antibody Hierarchy MechanismThe antibody hierarchy adjustment mechanism joins the pro-

cess as the weight sets evolve, and therefore, contributes to theaffinity values and usage of any antibody. The better antibody–antigen affinities and usages increases the probability of an anti-body being in the memory cell, which when called upon, candispose antigens with better accuracies during subsequent train-ing and test runs.

The antibody hierarchy mechanism seeks to categorize anti-bodies in the antibody pool into “useful” and “unproductive”groups and subsequently retains the best antibodies for futuretraining through the elimination of the unproductive ones. Thisstep derives its importance from the fact that a less efficientantibody, if retained for testing, would lead to lower accuracy. Alsoa set of non-performing antibodies in the population wouldseverely impact the accuracy parameters. At this moment, it is

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 7: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

1

2

3

Best Ab

Is the best Ab of the correct class?

Calculate Affinities for all the antibodieswith the current antigen

Create new Ab usingPBIL

Yes

No

Best Ab in Anomaly Class2Best Ab in Class1

Distance (incorrect class Ab =correct class Ab)

Distance (incorrect class Ab <correct class Ab)

Distance (incorrect class Ab >correct class Ab)

Increase weight Decrease weight No change

Adjust weight

Fig. 5. Flowchart for the clonal expansion.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 7

desirable to keep some of the mid-range antibodies rather thandiscarding them completely since they may contribute to antigenextirpation in the future, either through a high usage rate or byhaving high affinity values with a set of antigens. Thus, a cumu-lative assessment of two parameters: Usage Rate and Mean Affinity,is used to perform the categorization.

This mechanism divides the antibody pool into threehierarchies:

1. Memory Cell is at the top. Most cells bear the best of the anti-bodies, and hence are thoroughly used prior to the Mature andImmature Cells during training. Memory cells are also usedexclusively for testing.

2. Mature Cell contains mid-range antibodies. Antibodies in thiscell may be promoted to memory cell or demoted to immaturecell depending on their performances.

3. Immature Cell contains non-performing antibodies and part ofthis cell is eliminated each time the antibody hierarchy mechan-ism is called upon during the training.

This mechanism is called upon repeatedly during training afterprocessing a certain number of antigens. A memory cell of highquality is always maintained by this mechanism, and is necessaryfor an efficient intrusion detection system. The constant update,along with the mid and low range antibody classes, ensures thatminimum information is lost while eliminating the weaker anti-bodies. The antibody and antigen match follows the followingprocess:

) First Memory Cell antibodies are used to match the antigen. Ifthe match is successful, the new record is labeled accordingly.From the affinity formula, we know that the successful affinityrelation between the two classes of antibody pool is deter-mined: Af f inityAb

right4Af f inityAb

wrong.

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

) If the Memory Cell antibodies are not successfully matched forantigen, then we go to the Mature Cell to find a match.

) Otherwise, we go to the Immature Cell. If the required condi-tions are still not met, a need for a new antibody rises. A newantibody Abnew is to be created.

The usage rate for categorization that an antibody is capable ofdetecting antigens repeatedly by following all the detection ruleshas more chance of successful detection in subsequent runs andtherefore, should be kept in memory cells, and be put ahead ofthose less used. This also means that it should be used prior tothose in the mature and immature cells. Since the memory cell iscalled upon first for any detection, both during the training as wellas during the testing phase, exposing the best performing anti-bodies through this cell seems logical. The usage rate is explicitlydefined for each class and each of the three cells indicated by therules:

UsageRateClassCell ¼ Ti

NClassCell

ð10Þ

where Ti stands for the total number of times the ith antibody issuccessfully used during the training period. NClass

Cell stands for thetotal instances that the antibody from the cell of the particularclass was used.

Use of mean affinity for categorization. Many antibodies maynot be used frequently, but can be effective when used. Thus,usage rate by itself would be insufficient for rating antibodies.With this in mind, we introduce the concept of mean affinity,which takes into account the quality of the interaction betweenany antibody and antigen. Mean affinity is calculated as a ratio ofthe sum of the affinities to the number of times which the

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 8: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

Fig. 6. Flowchart for the clonal expansion.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎8

antibody was successfully employed for antigen detection:

MeanAf f ini ty¼ SiTi; ð11Þ

The sum of affinities for ith antibody during that training per-iod is denoted by Si. Ti stands for the total number of times that ithantibody was successfully used during the training period. Fig. 7illustrates the flow of the antibody hierarchy mechanism bydepicting the combined effect of usage rate and mean affinity usedfor storing best antibodies in the Memory cell, while eliminatingthe worst ones from the immature set.

3.3. Classify stage of unknown instance

We have generated the classification rules for antibodies in twoclass pools employing PBIL-AISCF Evolution by using only memorycell antibodies for detection since they are best in distinguishingfor most cases. The use of memory cell antibodies can only bejustified by the fact that the antibody hierarchy adjustingmechanism has been employed to form an effective set of anti-bodies through the training process. The classified strategy isbased on the idea of collaborative filtering. The classificationprocess is used to find a set of the memory cell antibodies withhigher affinity to unknown instance. The affinity between the

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

memory cell antibodies with the unknown instance is set as athreshold value, α. This set of memory cell antibodies with theiraffinities higher than the threshold value can be applied foridentifying unknown intrusions related to this instance. With theclassifying support of this set of memory cell antibodies, theclassifier will be more powerful and accurate in network intru-sions detection.

4. Experimental results

Data from the KDD99 Cup and the Australia Credit Approvalwere used to evaluate the performance of our PBIL-AISCF inte-grated classification approach. We set the parameters of PBIL-AISCF

as shown in Table 1.The α is set up as 0.5 for clustering a memory cell antibody to

check if it is related to the target instance or not. The otherparameters are set up by intensive experiments. Several traditionalapproaches proposed for comparisons with our approach are runin WEKA (3.6.6); their parameters are set up as follows: The costparameter of DT (J48) is set by 0.1. The M parameter of logistic isset by 20. The gamma and cost parameters of SVM are set by 2�12

and 150. The affinity, clonal rate and k-nn parameter of AIRS are

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 9: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

If Memory Cell has been used at least once

If Normal AntiBody and k <= 0.05 If Anomaly AntiBody and k <= 0.05

AntiBody is demoted to Mature Cell

If Mature Cell has been used at least once

If Normal Ab and k > usage rate of Mature If Anomaly Ab and k > usage rate of Mature Else

Record in NorMat20/AnoMat20 Record in Mat80

Sort NorMat20 and AnoMat20 according to Mean Affinity Sort Mat80 according to Mean Affinity

Abs in lower quarter are demoted toImmature

Abs in higher segment are promoted to Memoryaccording to percentage of Rank

If Immature Cell is used at least once

If k > usage rate of Mature

Record in NorMat20/AnoMat20

Sort NorMat20 and AnoMat20 according to Mean Affinityand Abs in higher segment are promoted to Memory

according to percentage of Rank

Else

Record in OldImm80/ New Imm80

NewImm80 is left as it is OldImm80 is sorted according to meanaffinity and lower half is discarded.

Fig. 7. Flowchart for antibody hierarchy adjustment.

Table 1Parameters of PBIL-AISCF.

Parameters Values

α 0.5Learning rate 0.7Initial antibody pool size for each class 20New antibodies evolved 10

Table 2Distribution of KDD99 cup data.

Class Training set Testing set

Normal connection 9728 6060Intrusion connection 39,524 25,064

Table 3Distribution of KDD99 cup data.

Average accuracy Max accuracy Min accuracy

Training set 97.03 98.21 95.10Testing set 93.24 94.01 92.66

Fig. 8. Result on training set and testing set from PBIL-AISCF model for 10 runs.

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 9

set by 0.9, 10 and 3. The parameters of the other approaches areset by default in WEKA. In addition, from the KDD99 Cup data, werandomly selected 49,252 records from the 10% training data,consisting of 494,021 connection records for training. We alsorandomly selected 31,124 records for testing. Each connectionrecord has 41 attributes characterizing the instance, represents asequence of packet transmission starting and ending at a timeperiod, and can be classified as either normal connection orintrusion connection class. Each connection contains thirty-fournumerical and seven categorical attributes (inputs) and one classattribute (output: normal or intrusion). As shown in the distribu-tion of connection types in Table 2, the normal connection classhas 9728 connections in the training set and 6060 connections in

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

the testing set; the intrusion connection class has 39,524 con-nections in the training set and 25,064 connections in the testingset. In addition, the intrusion connection has 4 attacks: denial-of-Service (DOS), remote-to-local (R2L), user-to-root (U2R) andprobing (Probe). DOS is denial of service that is accessed by

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 10: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

Table 4Performance comparisons for different methods on the KDD99 cup data.

Data set PBIL-AISCF PBIL-AIS SVM Naïve Bayes k-NN DT (J48) AIRS CLONALG Immunos-81

Training set 97.03 96.83 98.70 96.16 98.91 99.96 99.31 75.52 95.20Testing set 93.24 92.41 92.46 79.80 91.96 92.53 92.11 67.09 80.90

Table 5Performance comparisons for the different methods on the Australian credit card approval instance in 10-fold cross validation.

Data set PBIL-AISCF PBIL-AIS SVM Naïve Bayes k-NN DT (J48) AIRS CLONALG Immunos-81

10-fold cross validation 86.23 84.76 81.2 79.4 78.8 85.78 80.43 64.64 84.35

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎10

legitimate users, such as SYN flooding; R2Lis unauthorized accessfrom a remote machine or password guessing; U2R is unauthor-ized access to gain local super-user (root) privileges and bufferoverflow attack; and Probe is surveillance and probing for infor-mation gathering.

The Australian credit approval data set, which is availablepublicly, has 690 samples each with 14 attributes. Each case con-tains six numerical and eight categorical attributes (inputs), aswell as one class attribute (output: accept or reject). Fewer ele-ments and features make the dataset suitable for faster trainingand testing. This dataset is interesting as there is a mixture ofattributes: continuous, nominal with small numbers of values andnominal with larger numbers of values. The distribution ofaccepted approval class has 307 approvals, and the rejectedapproval class has 383 approvals for testing. In addition, we used a10-fold cross validation mechanism in the testing stage for testingour proposed PBIL-AISCF classification model. The advantage ofthis method over repeated random sub-sampling is that allobservations are used for both training and validation purpose,and each observation is used for validation exactly one time.

4.1. Evaluation measures

For model evaluation, we calculated the accuracy to evaluateour proposed model. The parameters for evaluation are defined inaccordance with the confusion matrix:

Accuracy¼ TPþTNTPþTNþFNþFP

ð12Þ

The TP denotes true positive, such as the class of classifiedconnection and true connection are both normal in the kDD99 Cupdataset; TN denotes for true negative, such as the class of classifiedconnection and true connection are both anomaly in the kDD99Cup dataset; FN denotes for false negative such as the class ofclassified connection being an anomaly, but true connection isnormal in the kDD99 Cup dataset and FP denotes false positive,such as the class of classified connection being normal, but trueconnection is an anomaly in the kDD99 Cup dataset.

4.2. Results on KDD99 cup

Performance of PBIL-AISCF classification in the sampling datasetof KDD99 Cup is shown in Table 3 and Fig. 8. The PBIL-AISCF modelis average in accuracy. The maximum and minimum accuracy in 10runs of testing set are 92.34, 94.01 and 92.66, respectively. Wecompared our performance against other algorithms from theliterature such as PBIL-AIS without collaborative filtering, SVM,Naïve Bayes, Decision Tree (J48), k-NN, AIRS (Watkins et al., 2004),CLONALG (De Castro and Von Zuben, 2002) and Immunos-81(Carter, 2000). Table 4 summarizes the statistics of the average

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

accuracy values of these algorithms on the KDD99 Cup instances.To test the efficiency of our proposed PBIL-AISCF method againstother evolutionary algorithms (EAs), we compared it with AIRS,CLONALG and Immunos-81. Table 3 shows that PBIL-AISCF yields amuch better average accuracy ratio than the three EAs.

4.3. Results on Australian credit card

For the Australian credit card instance, we also compared theperformance of our proposed algorithm with algorithms of otherliterature, such as PBIL-AIS without collaborative filtering, SVM,Naïve Bayes, Decision Tree (J48), k-NN, AIRS, CLONALG andImmunos-81.

Table 5 summarizes the statistics of the average accuracy valuesof these algorithms on the Australian credit card approvalinstances. To test the efficiency of our proposed PBIL-AISCF methodagainst other EAs, we compared it with AIRS, CLONALG andImmunos-81. It can be seen from Table 5 that PBIL-AISCF yields amuch better average accuracy ratio than the 9 classifiers.

5. Conclusion

We proposed a PBIL integrated AIS model for classification pro-blems to solve for network intrusion detection and credit card vali-dation. PBIL is a learning process that can improve the AIS evolutioneffect for the creation of new antibodies. The induction of PBILstrengthens the concept of retaining the fittest while eliminating weakones in the population. In this paper, we proposed antibody hierarchyadjustment using mean affinity, as well as usage rates, to intensify AISperformance. Our proposed PBIL-AISCF classification model for classi-fying the KDD99 cup and Australia credit card datasets was found toyield significantly better performance. We also carried out compar-isons of our method against other methods, and have shown positiveperformance of our method for both datasets. The major feature of ourresult was the efficiency with which our method works for developingIDS, as well as for designing credit scoring model. Future work can bebased on further classification on various intrusion categories of therecords for the KDD99 cup dataset. Our paper classified records intonormal and anomaly categories only, which can be further sub-classified into various kinds of anomalies. Also, implementation ofother mechanisms into AIS, such as negative selection mechanism, canalso improve the performance of the method.

References

Abadeh, M.S., Mohamadi, H., Habibi, J., 2011. Design and analysis of genetic fuzzysystems for intrusion detection in computer networks. Expert. Syst. Appl. 38,7067–7075.

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir

Page 11: Engineering Applications of Artificial Intelligence...system for network intrusion detection Meng-Hui Chena,b, Pei-Chann Changa,b,n, Jheng-Long Wub a School of Software, Nanchang

M.-H. Chen et al. / Engineering Applications of Artificial Intelligence ∎ (∎∎∎∎) ∎∎∎–∎∎∎ 11

Afzali Seresht, N., Azmi, R., 2014. MAIS-IDS: a distributed intrusion detection sys-tem using multi-agent AIS approach. Eng. Appl. Artif. Intell. 35, 286–298.

Agarwal, G., Barari, S., Tiwari, M.K., 2012. A PSO-based optimum consumer incen-tive policy for WEEE incorporating reliability of components. Int. J. Prod. Res.50, 4372–4380.

Ahn, H.J., 2008. A new similarity measure for collaborative filtering to alleviate thenew user cold-starting problem. Inf. Sci. 178 (1), 37–51.

Alatas, B., Akin, E., 2005. Mining fuzzy classification rules using an artificial immunesystem with boosting. Adv. Databases Inf. Syst. 3631, 283–293.

Altwaijry, H., Algarny, S., 2012. Bayesian based intrusion detection system. J. KingSaud. Univ. – Comput. Inf. Sci. 24, 1–6.

Alves, R.T., Delgado, M.R., Freitas, A.A., 2010. Knowledge discovery with artificialimmune systems for hierarchical multi-label classification of protein functions.In: Proceedings of the 2010 IEEE International Conference on Fuzzy Systems(FUZZ), pp. 1–8.

Aydin, I., Karakose, M., Akin, E., 2010. Artificial immune classifier with swarmlearning. Eng. Appl. Artif. Intell. 23, 1291–1302.

Aydin, I., Karakose, M., Akin, E., 2011. A multi-objective artificial immune algorithmfor parameter optimization in support vector machine. Appl. Soft Comput. 11,120–129.

Baccarini, L.M.R., Silva, V.V.R., Menezes, B.R., Caminhas, W.M., 2011. SVM practicalindustrial application for mechanical faults diagnostic. Expert. Syst. Appl. 38,6980–6984.

Baluja, S., Caruana, R., 1995. Removing the genetics from the standard geneticalgorithm. In: Proceedings of the 12th Annual Conference on Machine Learning,pp. 38–46.

Breese, J.S., Heckerman, D., Kadie, C., 1998. Empirical analysis of predictive algo-rithms for collaborative filtering. In: Proceedings of the 14th Annual Conferenceon Uncertainty in Artificial Intelligence, pp. 43–52.

Carter, J.H., 2000. The immune system as a model for classification and patternrecognition. J. Am. Inform. Assoc. 7, 28–41.

Castellani, M., Rowlands, H., 2008. Evolutionary feature selection applied to artifi-cial neural networks for wood-veneer classification. Int. J. Prod. Res. 46,3085–3105.

Chang, P.C., Huang, W.H., Ting, C.J., 2011. A hybrid genetic-immune algorithm withimproved lifespan and elite antigen for flow-shop scheduling problems. Int. J.Prod. Res. 49, 5207–5230.

Darmoul, S., Pierreval, H., Gabouj, S.H., 2006. Scheduling using artificial immunesystem metaphors: A review. In: Proceedings of the International Conferenceon Service Systems and Service Management, pp. 1150–1155.

Dasgupta, D., Yu, S., Nino, F., 2011. Recent advances in artificial immune systems-models and applications. Appl. Soft Comput. 11, 1574–1587.

De Castro, L.N., Von Zuben, F.J., 2002. Learning and optimization using the clonalselection principle. IEEE Trans. Evolut. Comput. 6, 239–251.

Dehuri, S., Patnaik, S., Ghosh, A., Mall, R., 2008. Application of elitist multi-objectivegenetic algorithm for classification rule generation. Appl. Soft Comput. 8,477–487.

Desai, V.S., Crook, J.N., Overstreet, G.A., 1996. A comparison of neural network andlinear scoring models in credit union environment. Eur. J. Oper. Res. 95, 24–35.

Folly, K.A., 2007. Power system stabilizer design for multimachine power systemusing population-based incremental learning. Power Plants Power Syst. Con-trol. 5, 41–46.

Gao, J., 2010. A novel artificial immune system for solving multiobjective sche-duling problems subject to special process constraint. Comput. Ind. Eng. 58,602–609.

Goldberg, D.E., 1989. Genetic algorithms in search, optimization and machinelearning. Addison-Wesley Longman Publishing Co Inc, Boston, MA, USA.

Hart, E., Timmis, J., 2008. Application areas of AIS: The past, the present and thefuture. Appl. Soft Comput. 8, 191–201.

Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J., 1999. An algorithmic frameworkfor performing collaborative filtering. In: Proceedings of the Annual Interna-tional ACM SIGIR Conference on Research and Development in InformationRetrieval, pp. 230–237.

Horng, S.J., Su, M.Y., Chen, Y.H., Kao, T.W., Chen, R.J., Lai, J.L., Perkasa, D., 2011. Anovel intrusion detection system based on hierarchical clustering and supportvector machines. Expert. Syst. Appl. 38, 306–313.

Please cite this article as: Chen, M.-H., et al., A population-based innetwork intrusion detection. Eng. Appl. Artif. Intel. (2016), http://dx.

Hunt, J., Timmis, J., Cooke, C., Neal, M., King, C., 1998. Jisys: The development of anartificial immune system for real world applications. In: Dasgupta, D. (Ed.),Artificial Immune Systems and their Applications. Springer-Verlag, New York,pp. 157–186.

Karray, F.O., Silva, C.W., 2004. Soft computing and intelligent systems design:theory, tools, and applications. Addison Wesley Publishing, Boston, USA.

Kim, Y.S., Street, W.N., 2004. An intelligent system for customer targeting: a datamining approach. Decis. Support. Syst. 37, 215–228.

Liu, C.H., Huang, W.H., Chang, P.C., 2012. A two-stage AIS approach for grid sche-duling problems. Int. J. Prod. Res. 50, 2665–2680.

Liu, H., Hu, Z., Mian, A., Tian, H., Zhu, Z., 2014. A new user similarity model toimprove the accuracy of collaborative filtering. Knowledge-Based Syst. 56,156–166.

Nemati, S., Basiri, M.E., Ghasem-Aghaee, N., Aghdam, A.M., 2009. A novel ACO–GAhybrid algorithm for feature selection in protein function prediction. Expert.Syst. Appl. 36, 12086–12094.

Przystałka, P., Moczulski, W., 2015. Methodology of neural modelling in faultdetection with the use of chaos engineering. Eng. Appl. Artif. Intell. 41, 25–40.

Razavi-Far, R., Davilu, H., Palade, V., Lucas, C., 2009. Model-based fault detectionand isolation of a steam generator using neuro-fuzzy networks. Neuro-computing 72, 2939–2951.

Resnick, P., Iacovou, N., Suchak, M., Bergstrom P., Riedl, J., 1994. Grouplens: an openarchitecture for collaborative filtering of netnews. In: Proceedings of the ACMConference on Computer Supported Cooperative Work (CSCW’94). ACM, NewYork, NY, pp. 175–186.

Sarwar, B., Karypis, G., Konstan, J., Riedl, J., 2001. Item-based collaborative filteringrecommendation algorithms. In: Proceedings of the 10th International Con-ference on World Wide Web, pp. 285–295.

Shardanand, U., Maes, P., 1995. Social information filtering: algorithms for auto-mating “word of mouth. In: Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems (CHI’95), pp. 210–217.

Shardanand, U., 1994. Social information filtering for music recommendation, M.S.thesis. Massachussets Inst. Technol.

Tavakkoli-Moghaddam, R., Rahimi-Vahed, A.R., Mirzaei, A.H., 2008. Solving a multi-objective no-wait flow shop scheduling problem with an immune algorithm.Int. J. Adv. Manuf. Technol. 36, 969–981.

Thomas, L.C., 2000. A survey of credit and behavioral scoring: forecasting financialrisk of lending to consumers. Int. J. Forecast. 16, 149–172.

Twala, B., 2010. Multiple classifier application to credit risk assessment. Expert.Syst. Appl. 37, 3326–3336.

Watkins, A., Timmis, J., Boggess, L., 2004. Artificial immune recognition system(AIRS): an immune-inspired supervised learning algorithm. Genet. Program.Evol. Mach. 5, 291–317.

White, J.A., Garrett, S.M., 2003. Improved pattern recognition with artificial clonalselection?. In: Timmis, J., Bentley, P.J., Hart, E. (Eds.), ICARIS 2003. Springer,Heidelberg, New York, pp. 181–193.

Woolley, N.C., Milanovic, J.V., 2011. An immune system inspired clustering andclassification method to detect critical areas in electrical power networks. Nat.Comput. 10, 305–333.

Ye, N., Li, X., 2002. A scalable, incremental learning algorithm for classificationproblems. Comput. Ind. Eng. 43, 677–692.

Yuen, C.W.M., Wong, W.K., Qian, S.Q., Chan, L.K., Fung, E.H.K., 2009. A hybrid modelusing genetic algorithm and neural network for classifying garment defects.Expert. Syst. Appl. 36, 2037–2047.

Zhang, G., Hu, M.Y., Patuwo, B.E., Indro, D.C., 1999. Artificial neural networks inbankruptcy prediction: general framework and cross-validation analysis. Eur. J.Oper. Res. 116, 16–32.

Zhang, R., Shan, M.Y., Liu, X.H., Zhang, L.H., 2014. A novel fuzzy hybrid quantumartificial immune clustering algorithm based on cloud model. Eng. Appl. Artif.Intell. 35, 1–13.

Zhao, H., 2007. A multi-objective genetic programming approach to developingPareto optimal decision trees. Decis. Support. Syst. 43, 809–826.

Zhong, Y., Zhang, L., Gong, J., Li, P., 2007. A supervised artificial immune classifier forremote-sensing imagery. IEEE Trans. Geosci. Remote. Sens. 45, 3957–3966.

cremental learning approach with artificial immune system fordoi.org/10.1016/j.engappai.2016.01.020i

www.Matlabi.irwww.Matlabi.ir