Research Article: A Novel Distributed Online Anomaly Detection Method in Resource-Constrained Wireless Sensor Networks
Zhiguo Ding,1,2 Haikuan Wang,1 Minrui Fei,1 and Dajun Du1
1 Shanghai Key Laboratory of Power Station Automation Technology, School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China
2 College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004, China
Correspondence should be addressed to Haikuan Wang, eeewhk@163.com
Received 17 March 2015 Accepted 14 May 2015
Academic Editor Fuwen Yang
Copyright © 2015 Zhiguo Ding et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
In this paper, a novel distributed online anomaly detection method for resource-constrained WSNs is proposed. Firstly, the spatiotemporal correlation existing in the sensed data is exploited, and a series of single anomaly detectors is built in each distributed sensor node based on ensemble learning theory. Secondly, these trained detectors are broadcast to the member sensor nodes in the cluster and, combined with each node's own trained detector, the initial ensemble detector is built. Thirdly, considering the resource constraints of WSNs, ensemble pruning based on biogeography-based optimization (BBO) is employed in the cluster head node to obtain an optimized subset of ensemble members. Further, the pruned ensemble detector, coded as a state matrix, is broadcast to each member sensor node for distributed online global anomaly detection. Finally, experiments on a real WSN dataset demonstrate the effectiveness of the proposed method.
1. Introduction
Wireless sensor networks (WSNs) integrate sensing, data processing, and wireless communication capabilities [1] and have received considerable attention for multiple types of applications. However, WSNs are highly susceptible to various kinds of interference and faults: hardware faults, electromagnetic interference, environmental factors, and network intrusion. Consequently, anomalous observations arise inevitably in WSNs. These unusual observations (i.e., anomalies or outliers) can generally be classified into two different types: one is errors and the other is events [2, 3]. The former refers to observations that deviate significantly from the true measurement, such as dirty data; detecting and cleaning them in a timely manner can save the limited memory and computation as well as expensive communication resources. The latter usually refers to an event that has occurred, such as a temperature change caused by a forest fire; detecting such an event in time can help to take corresponding measures. With the wide application of WSNs, detecting these anomalous observations accurately and in a timely manner is an important task.
Though many anomaly detection methods based on data mining and machine learning are available, most of them do not take resource limitations into account and are not designed specifically for WSNs. Considering the limited resources (i.e., computation, memory, communication, and so on) of WSNs, developing a suitable anomaly detection method is important and urgent work. Up to now, researchers have done some work and proposed several anomaly detection methods for WSNs [1, 4–7] which take the resource limitation into account to some extent.
Hindawi Publishing Corporation, International Journal of Distributed Sensor Networks, Volume 2015, Article ID 146189, 12 pages, http://dx.doi.org/10.1155/2015/146189

As one of the main research directions in the machine learning community, ensemble learning has attracted many researchers' attention and has been used widely in different applications [8]. However, little work has been done on anomaly detection for WSNs. A large body of theoretical and empirical research has shown that combining the detection results of multiple individual detectors can improve the generalization performance observably, but the original ensemble learning method usually needs to build and store multiple individual detectors, which incurs a large amount of computation and storage and may not be appropriate for WSNs. A possible strategy is to select part of the individual detectors to perform the anomaly detection. Consequently, ensemble pruning is a necessary strategy [9], which can obtain better (or at least the same) performance compared to the initial ensemble while the number of individual detectors is decreased greatly.
Analyzing the spatiotemporal correlation of sensed data in WSNs and motivated by online ensemble learning methods, this paper proposes a distributed anomaly detection method for WSNs from the perspectives of both model building and resource saving. Further, to mitigate the high communication requirements caused by broadcasting ensemble detectors, BBO-based ensemble pruning is used to select the optimized individual detectors to build the final ensemble detector, which has at least the same performance as the initial ensemble detector. The main contributions of this paper include the following.
(1) A distributed anomaly detection method for WSNs is proposed based on online ensemble learning.
(2) BBO-based ensemble pruning is used to get the optimal subset, saving the limited storage and communication resources in WSNs.
(3) A state matrix encoding method is designed for the ensemble detector, which decreases the communication and memory overhead significantly.
The rest of this paper is organized as follows. Related work is described in Section 2. Based on ensemble learning theory and BBO, our proposed anomaly detection method is presented in Section 3. Experimental analysis is provided in Section 4. Finally, conclusions and future work are presented in Section 5.
2. Related Work
To clearly explain the motivation of this paper, the state of the art in three key areas related to our work is summarized: anomaly detection in WSNs, online ensemble learning, and ensemble pruning.
2.1. Anomaly Detection Methods and Classification in WSNs. With the rapid development and wide application of WSNs, some anomaly detection techniques for WSNs have been developed and summarized from different perspectives. For example, [10] discussed the prioritization of various characteristics of WSNs, including the spatiotemporal and attribute correlations of sensed data, anomaly types, anomaly identification, anomaly score, and so forth. A brief overview of the classification strategies for anomaly detection methods in WSNs deployed in harsh environments was provided, which grouped anomaly detection methods into four types: statistical-based techniques, nearest-neighbor-based techniques, clustering-based techniques, and classification-based techniques. Based on the nature of the sensor data and the specific requirements and limitations of WSNs, [1] provided a comprehensive overview of existing anomaly detection techniques specifically developed for WSNs. It presented a technique-based taxonomy and gave a comparative table which can be used as a guideline to select a suitable method for a specific application. For example, based on characteristics such as data types, anomaly types, and anomaly degree, statistical-based methods are further classified into parametric and nonparametric methods; based on how the probability distribution model is built, classification-based methods are categorized as support vector machine-based methods, radial basis function neural network-based methods [11], and so on. The interested reader is referred to further anomaly detection methods and taxonomies in [5, 6, 12, 13]. The taxonomies of the aforementioned methods may overlap somewhat, and machine learning and computational intelligence-based techniques are without doubt an increasingly important research direction for complicated applications. Moreover, though these methods have acceptable performance to some extent, resource constraints were usually not, or only seldom, taken into account; with the wide application of WSNs, this has also attracted some researchers' attention [14]. Another noticeable characteristic of the aforementioned methods is that only a single detector or model is trained. It is well known that a single model may not learn the complicated decision boundary of a complicated dataset well. For sensed streaming data with a dynamic data distribution, a single model can hardly, or only at expensive cost, learn the whole profile; for example, training an artificial neural network may lead to overlearning and degrade the generalization performance. Besides, concept drift [15] is a common phenomenon in datasets collected from WSNs, and a single model has difficulty dealing with such dynamic changes of data distribution and providing a comprehensive detector. Moreover, updating the detector based on all available data is also hard work for online learning.
2.2. Ensemble Learning Methods. Ensemble learning is a computational intelligence method, and theory and experiment have proved that combining the predictions of many individual detectors can enhance the generalization performance. Many different ensemble learning methods are used widely and successfully, such as Bagging [16, 17], Boosting [18, 19], Random Forest [20], and their online versions [21, 22]. Generally, an ensemble anomaly detector is constructed in two steps. Firstly, a number of base detectors are trained using the training dataset. Secondly, a result-combination strategy is designed to obtain the aggregated result based on the results of each single detector. For a time-series dataset, such as the sensed dataset in WSNs, learning a single model to profile the whole dataset is usually difficult or impossible. Generally, there are two categories of ensemble patterns to handle streaming data, namely, horizontal ensemble and vertical ensemble. The former follows the strategy that the nearest n consecutive data chunks are first used to train n base detectors, and a combination method is employed to build the ensemble detector used to predict data in the yet-to-arrive chunk. The advantage of the horizontal ensemble is that it can handle noisy data in the streaming dataset, because the prediction for a newly arriving data chunk depends on the combination of different chunks: even if the noisy data deteriorates some chunks, the ensemble can still generate relatively accurate prediction results. The disadvantage of the horizontal ensemble is that the streaming data is continuously changing, and the information contained in previous chunks may be invalid, so that using these old-concept models will not improve the overall prediction result. The latter ensemble pattern is the vertical ensemble, which uses only the newest chunk to build the ensemble model. The advantage of the vertical ensemble is that it uses different algorithms (heterogeneous ensemble) on the same dataset, or the same algorithm (homogeneous ensemble) on different sampled subdatasets from the chunk, to build the model, which can decrease the bias error between models. The disadvantage is that the vertical ensemble assumes the data chunk is errorless; in real situations this precondition is usually hard to meet. Currently, because online ensemble learning methods can address the concept drift and noisy data problems in streaming data, ensemble learning has been used in anomaly detection for WSNs [23–25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset in WSNs, a distributed method is proposed based on the horizontal ensemble and a vertical-like ensemble. Section 3 gives the detailed description.
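The horizontal ensemble described above can be sketched as follows. This is an illustrative toy, not the paper's detectors: the per-chunk 3-sigma detector, the majority vote, and the sample values are all assumptions.

```python
import statistics

def train_chunk_detector(chunk, k=3.0):
    """Train a simple per-chunk detector: flag values far from the chunk mean."""
    mu = statistics.fmean(chunk)
    sigma = statistics.pstdev(chunk) or 1e-9  # avoid zero std on constant chunks
    return lambda x: abs(x - mu) > k * sigma  # True -> anomalous

def horizontal_ensemble(chunks, k=3.0):
    """Combine detectors trained on the n most recent chunks by majority vote."""
    detectors = [train_chunk_detector(c, k) for c in chunks]
    def predict(x):
        votes = sum(d(x) for d in detectors)
        return votes > len(detectors) / 2
    return predict

# Usage: three consecutive chunks of roughly similar normal readings
chunks = [[20.1, 20.3, 19.9, 20.0],
          [20.2, 20.4, 20.1, 19.8],
          [20.0, 20.2, 20.3, 19.9]]
detect = horizontal_ensemble(chunks)
print(detect(20.1))  # within every chunk's profile -> False
print(detect(35.0))  # far outside every chunk's profile -> True
```

Because the vote spans several chunks, a single chunk corrupted by noise cannot flip the prediction by itself, which is exactly the robustness argument made above.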
2.3. Ensemble Pruning Based on Optimization Search Methods. Although ensemble learning has many advantages, a nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and transmit multiple detectors in WSNs, which can drain energy quickly and is intolerable. The principle of "many could be better than all" in the ensemble learning community [9] implies that combining all detectors may not be a good choice. Ensemble pruning is therefore employed as a necessary strategy to solve the resource-limitation question [26]: it selects a subset of the initial ensemble and obtains better, or at least equal, detection performance compared to the original ensemble. Its foremost advantage is that it greatly reduces the communication requirement; in WSNs, broadcasting relatively few detectors can save battery energy considerably. However, it is well known that pruning an ensemble of size N requires searching a space of 2^N − 1 nonempty subensembles, which is an NP-complete problem. Hence, heuristic search approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], a novel population-based global optimization method, has some features in common with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper, BBO is used to obtain an optimal/suboptimal ensemble to reduce the communication cost. To the best of our knowledge, as a new optimization method, BBO has not previously been applied in the field of WSNs, and our study extends its application.
3. Proposed Method
Motivated by the growing online ensemble learning methodology [25] and considering the resource limitations of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

Figure 1: The considered WSN, showing the base station (BS), cluster heads (CH), non-CH nodes, CH-to-BS links, and the boundaries of the cluster heads.
3.1. Problem Statement of WSNs. In this paper, we assume that the WSN is deployed in an untouched area and that, to assure sensed data quality, the sensor nodes are deployed densely. Besides, we assume that the sensor nodes are time-synchronized, which is mainly for clear presentation rather than a limitation of our proposed method. Figure 1 shows a WSN consisting of a large number of sensor nodes and a base station (BS) [30]. Generally, the WSN can be represented as a graph G = (V, E), where V = {v_1, v_2, ..., v_|V|} is a finite set of vertices and E = {e_1, e_2, ..., e_|E|} is a finite set of edges; a vertex v_i (i = 1, ..., |V|) refers to a sensor node, and an edge e_i (i = 1, ..., |E|) refers to a one-hop or multihop communication link between reachable sensors v_i and v_j.

From Figure 1, we can clearly see that clusters are formed based on the nodes' geographical position information and reachable communication capability. Here we only consider one-hop communications among sensor nodes. Similarly, this assumption is mainly for clear presentation of our proposed method rather than a limitation of the communication capability of the sensors; in fact, our proposed method easily extends to multihop relaying communication. Besides, in order to describe our proposed anomaly detection method concisely, a relatively small, densely deployed subnetwork is taken into account, which forms a cluster C_i consisting of one cluster head node, represented as CH_i, and a number of member sensor nodes, represented as N_ij, j = 1, ..., |C_i|. For the whole WSN, V = C_1 ∪ C_2 ∪ ... ∪ C_n and C_i ∩ C_j = ∅. All nodes in a cluster are reachable from each other by one-hop communication, and communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes of that cluster to avoid draining the energy of any single node.

Consider one cluster C_i = {CH_i, N_i1, ..., N_im}, which contains a cluster head CH_i and its m spatially neighboring nodes N_ij, j = 1, ..., m. Each sensor node in the subnetwork
measures a data vector at every time interval Δt, composed of multiple attribute values. For the cluster head CH_i, the observation is X^i = (x^i_1, x^i_2, ..., x^i_d), where d denotes the dimension. For the jth neighbor node N_ij, the observation is X_ij = (x^i_j1, x^i_j2, ..., x^i_jd). Nodes in the cluster collect samples synchronously, and our proposed method identifies the new observations at each sensor node as normal or anomalous online.
3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build our proposed online ensemble detector.
The collected sensor dataset from WSNs is a time-series dataset. A time series is a sequence of values X = {x(t), t = 1, ..., n} which follows a nonrandom order, and the n consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] can help to understand the data trend over time and build an appropriate detector based on temporal correlation, as well as to predict the label of newly arriving observations.

To obtain the detector, the foremost requirement is to achieve a stationary time-series dataset. Some data-processing methods are used to eliminate the data trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32–34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (i.e., first differencing) is used to eliminate the temporal trend and obtain a stationary time series for the dataset collected in WSNs, which can be formulated as

X' = {x'(s, t) = x(s, t) − x(s, t − 1)}, t = 2, 3, ..., n. (1)
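First differencing per (1) is trivial to implement; a minimal sketch (the readings are made up for illustration):

```python
def first_difference(series):
    """Eq. (1): x'(t) = x(t) - x(t-1) for t = 2..n, removing the temporal
    trend so the remaining series is (closer to) stationary."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# A steadily warming sensor: the raw series trends upward,
# but its first differences fluctuate around a constant level.
readings = [10, 12, 13, 15, 16, 18]
print(first_difference(readings))  # [2, 1, 2, 1, 2]
```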
Besides, the sensor nodes are always deployed densely, so spatial redundancy exists. A dataset X = {x(s), s = 1, ..., m} is collected from the m sensor nodes in a cluster at one timestamp. This dataset can help to understand the spatial correlation structure of the data and predict the data value at a nearby location. Spatial data may present local dependency, which represents the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

x(s_i) = λ_1 x(s_1) + ⋯ + λ_(i−1) x(s_(i−1)) + λ_(i+1) x(s_(i+1)) + ⋯ + λ_m x(s_m), (2)

where s_1, ..., s_(i−1), s_(i+1), ..., s_m denote the positions of the sensor nodes and λ_1, ..., λ_(i−1), λ_(i+1), ..., λ_m denote the weights of the observations, with Σ_(k=1, k≠i)^m λ_k = 1.

Consequently, for sensed data collected in a local region, two reasonable assumptions are described as follows.
(1) The sensed data of adjacent nonfaulty sensor nodes are similar at the same timestamp.

(2) The sensed data of adjacent nonfaulty sensor nodes have similar trends over time.
Motivated by these two assumptions and ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.
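The spatial estimate of (2) can be sketched as follows; the neighbor readings and the equal weights are assumptions chosen for illustration:

```python
def spatial_estimate(neighbor_values, weights):
    """Eq. (2): estimate one sensor's reading as a weighted combination of
    its neighbors' readings; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * v for w, v in zip(weights, neighbor_values))

# Four equidistant neighbors of a node, given equal weights
neighbors = [21.0, 20.8, 21.4, 21.2]
estimate = spatial_estimate(neighbors, [0.25] * 4)   # 21.1
# A reading far from this estimate violates assumption (1) above and is
# a candidate spatial anomaly.
```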
3.3. Proposed Ensemble Learning Method for Anomaly Detection in WSNs. Spatiotemporal correlation exists among sensor data in a local region of WSNs, and a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is taken into account to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. Our proposed method is shown in Figure 2.
The online anomaly detection method consists of three key procedures: detector training, online detecting, and online detector updating. From Figure 2, it can be seen that our proposed method enables each distributed sensor node to judge globally, and in time, whether every new observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly over the network and to prolong the lifetime of the whole network.
The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation in a certain time period, each sensor node s_i trains a local ensemble detector using the history dataset collected over a time interval. In fact, using this initial local ensemble detector, whether a newly arriving observation is normal or anomalous can be determined locally.

Step 2. Each sensor node s_i transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is applied in the cluster head to prune the initial global ensemble detector and obtain an acceptable final ensemble detector.

Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating based on the predefined sampling probability p.
Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. Each member node (MN_1, MN_2, ..., MN_m) inputs and preprocesses its training data, learns from it, and outputs a local ensemble detector, which is broadcast to the cluster head node (CH). The CH receives the local ensemble detectors, aggregates them into the global initial ensemble detector, prunes it with BBO, and broadcasts the result back to the member nodes via a bit-coded state matrix.
Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.
This method scales well with an increasing number of nodes in WSNs due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.
Next, we describe the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some techniques are designed to save communication and memory.
3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory of a sensor node constrains how many previous detectors can be stored. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble for one sensor node. For example, for sensor node i, the sensed data is collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process; consequently, each node trains multiple individual detectors over time. In this paper, supposing the n latest detectors are kept for each sensor node and there are m nodes in one cluster, in total n*m detectors are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its n trained detectors in the cluster. Taking the cluster head as an example, after all n*(m − 1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (comprising n*m individual detectors) is built in the cluster head node.

Many techniques can be employed for combining the results of the individual detectors to obtain the final detection result. The commonly used methods in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In this paper, the final ensemble detection result is calculated by (3), where w_i denotes the weight coefficient; w_i = 1 means the simple average, otherwise the weighted average. For simplicity, the simple average strategy is employed to combine the final result:

y_fin(x) = (1/(n*m)) Σ_(i=1)^(n*m) y_i(x) * w_i. (3)
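The combination rule (3) can be sketched as follows; the three threshold detectors are hypothetical stand-ins for trained base detectors:

```python
def ensemble_score(x, detectors, weights=None):
    """Eq. (3): average the individual detector outputs y_i(x), optionally
    weighted; w_i = 1 gives the simple average used in the paper."""
    if weights is None:
        weights = [1.0] * len(detectors)  # simple average
    return sum(w * d(x) for d, w in zip(detectors, weights)) / len(detectors)

# Three toy detectors voting 1 (anomalous) or 0 (normal)
detectors = [lambda x: 1 if x > 30 else 0,
             lambda x: 1 if x > 28 else 0,
             lambda x: 1 if x > 50 else 0]
print(ensemble_score(29, detectors))   # 1/3 -> below 0.5, judged normal
print(ensemble_score(60, detectors))   # 1.0 -> anomalous
```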
3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.
Given an initial ensemble anomaly detector E = {AD_1, AD_2, ..., AD_(n*m)}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E' ⊆ E which minimizes the generalization error and obtains better, or at least the same, detection performance compared to E. Let f_ij (i = 1, 2, ..., m; j = 1, 2, ..., n) be the fitness values of the detecting performance, such as true positive rate, false positive rate, and accuracy. The fitness value F can then be defined as (4), based on the results on the testing data:

F = [ f_11  f_12  ⋯  f_1n
      f_21  f_22  ⋯  f_2n
       ⋯     ⋯   ⋯   ⋯
      f_m1  f_m2  ⋯  f_mn ]. (4)

The final fitness function can be defined as

Maximize Σ^(N') f_ij  (over the N' selected detectors, i = 1, ..., m, j = 1, ..., n),
s.t. N' ≤ m * n. (5)

Here the problem of ensemble pruning is to find the subset E' composed of part of the single detectors. Finding the optimized subset requires much heavier and more delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed to find an acceptable ensemble. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method which shares some characteristics with existing evolutionary algorithms (EAs), such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain and obtain an optimal/suboptimal solution, some operators are employed to share information among solutions, which makes BBO applicable to many problems for which GA and PSO are used. The more distinctive differences between BBO and other EAs can be seen in [27, 28].

The pseudocode of ensemble pruning based on BBO is described in Algorithm 1 [7]. Here H indicates a habitat, HSI (habitat suitability index) is the fitness, and an SIV (suitability index variable) is a solution feature.

Algorithm 1: EnsemblePruningBBO(E, T)
Input: E — initial ensemble anomaly detector; T — the maximum number of iterations
Output: E' — final ensemble anomaly detector
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T > 0)
    Compute the immigration rate λ and emigration rate μ for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T − 1
End while
/* Ensemble pruning */
Get the final ensemble anomaly detector E' from the habitat H* with acceptable HSI
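A minimal BBO-style search over detector bit strings might look like the following sketch. The rank-based immigration/emigration rates, the 1% mutation rate, and the toy fitness (agreement with a known-good pruning mask) are all assumptions for illustration, not the paper's exact operators:

```python
import random

def bbo_prune(fitness, n_bits, pop_size=20, iters=60, seed=1):
    """BBO-style bit-string search (1 = keep detector, 0 = drop it).
    Low-fitness habitats immigrate SIVs (bits) from habitats chosen with
    probability biased toward high fitness; mutation flips rare bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(iters):
        order = sorted(range(pop_size), key=lambda i: fitness(pop[i]), reverse=True)
        rank = {h: r for r, h in enumerate(order)}        # rank 0 = fittest habitat
        for i in range(pop_size):
            immigration = rank[i] / (pop_size - 1)        # worse habitat -> more immigration
            for b in range(n_bits):
                if rng.random() < immigration:
                    src = order[int(rng.random() ** 2 * pop_size)]  # biased emigration source
                    pop[i][b] = pop[src][b]
                if rng.random() < 0.01:                   # mutation
                    pop[i][b] = 1 - pop[i][b]
    return max(pop, key=fitness)

# Hypothetical fitness: agreement with a known-good pruning mask
target = [1, 0, 1, 1, 0, 0, 1, 0]
fit = lambda mask: sum(a == b for a, b in zip(mask, target))
best = bbo_prune(fit, n_bits=8)
```

In the paper's setting the fitness of a candidate bit string would instead be computed from the detection performance matrix of (4) on the test dataset T.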
3.3.3. Some Techniques Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is radio communication among the sensor nodes. It is known that communicating one bit costs as much energy as processing thousands of bits in a sensor [35]. This means that most energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity decreases the power requirement and eventually lengthens the lifetime of the whole WSN.
It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its member sensor nodes. In order to relieve the communication burden, some techniques are used to reduce this overhead.

Algorithm 2: OnlineUpdating(E', p)
Input: E' — current pruned ensemble anomaly detector; p — sampling probability
Output: E* — updated pruned ensemble anomaly detector
For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = EnsemblePruningBBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection
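The per-node sampling step of Algorithm 2 can be sketched as follows; the buffer class, its capacity, and the trigger condition are hypothetical:

```python
import random

class UpdateBuffer:
    """Sketch of the per-node sampling buffer behind Algorithm 2: each new
    observation is retained with probability p; once `capacity` fresh
    samples have accumulated, retraining (and re-pruning) is triggered."""
    def __init__(self, capacity, p, seed=None):
        self.capacity = capacity
        self.p = p
        self.samples = []
        self.rng = random.Random(seed)

    def offer(self, observation):
        """Return True when enough fresh data has been kept to retrain."""
        if self.rng.random() < self.p:
            self.samples.append(observation)
        if len(self.samples) >= self.capacity:
            self.samples = []   # hand the data to retraining, start a new buffer
            return True
        return False

buf = UpdateBuffer(capacity=5, p=0.3, seed=42)
triggers = sum(buf.offer(x) for x in range(100))
# on average 100 * 0.3 / 5 = 6 retraining rounds per 100 observations
```

Sampling with probability p rather than keeping every observation bounds both the memory used for retraining data and the frequency of the (expensive) retrain/re-prune/broadcast cycle.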
In fact, the distributed training/learning method transmits only the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared with centralized anomaly detection schemes that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head. A straightforward approach is to broadcast this pruned ensemble to the member sensor nodes; this commonly used strategy, however, does not make full use of the local ensemble detector information already present on each node and costs more communication resources. Instead, a state matrix P is designed in the cluster head, whose element p_ij, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector: a detector is included in or excluded from the final ensemble depending on the value of the corresponding bit, that is, 1 denotes that the single detector is included in the final ensemble and 0 that it is not.
p_ij = { 1, if AD_ij ∈ E′,
       { 0, otherwise,        i = 1, …, m; j = 1, …, n.    (6)

With this definition, P is the m × n 0/1 matrix whose rows correspond to the member sensor nodes S_1, …, S_m and whose columns correspond to the n single detectors, for example:

            1  2  ⋯  i−1  i  i+1  ⋯  n
    S_1   [ 0  1  ⋯   0   1   1   ⋯  1 ]
    S_2   [ 1  0  ⋯   1   1   0   ⋯  0 ]
    ⋮       ⋮  ⋮       ⋮   ⋮   ⋮      ⋮
    S_m   [ 0  0  ⋯   1   1   1   ⋯  1 ]
After the pruning procedure finishes, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, suppose that after ensemble pruning N′ (N′ ≤ n·m) individual detectors remain to be broadcast in the cluster. Without matrix P, this requires 4·N′·d bytes of communication (assuming each individual detector is represented by d parameters and each parameter needs at least 4 bytes). With matrix P, each entry needs only 1 bit to represent an individual detector, so only m·n/8 bytes are required for the broadcast. Suppose that one-third of the individual detectors are pruned (i.e., N′ = 2·n·m/3); then (4·n·m·d·2/3)/(m·n/8) ≈ 21.33d. By introducing ensemble pruning and the state matrix, the energy saved at the cluster head is significant and the lifetime of the WSN is lengthened.
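The byte counts above can be checked with a small sketch; `pack_state_matrix` and `broadcast_cost_bytes` are illustrative helpers, not part of the paper's implementation:

```python
def pack_state_matrix(P):
    """Pack an m-by-n 0/1 state matrix into bytes, one bit per detector."""
    bits = [b for row in P for b in row]
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)        # zero-pad the final partial byte
        out.append(byte)
    return bytes(out)

def broadcast_cost_bytes(m, n, d, kept):
    """Bytes needed to send `kept` detectors as raw parameters (d parameters
    of 4 bytes each) versus as a packed m-by-n state matrix."""
    raw = 4 * kept * d
    packed = (m * n + 7) // 8          # ceil(m*n/8)
    return raw, packed
```

With m = 4 nodes, n = 20 detectors per node, d = 10 parameters, and roughly two-thirds of the 80 detectors kept, the raw broadcast needs 2120 bytes while the state matrix needs only 10, in line with the ≈ 21.33d ratio above.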
3.3.4. Online Update and Relearning. The distribution of the sensed data may change over time, so detector updating is necessary, and an online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) caters to this situation and saves computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability p. Heuristic rules can guide its value: for example, if the data dynamics are relatively stationary, a small p should be used; otherwise, a larger p should be chosen. When the buffer of a sensor node has been completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode is shown in Algorithm 2.
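A minimal Python sketch of the delayed-update loop of Algorithm 2 (the `train_detector` and `prune` callables are hypothetical stand-ins for the detector training and BBO pruning steps described above):

```python
import random

def online_update(buffer, stream, p, train_detector, prune):
    """Delayed updating: retain each new observation with probability p;
    retrain and re-prune only once the whole buffer has been replaced."""
    replaced, n = 0, len(buffer)
    for x in stream:
        if random.random() < p:        # keep the observation with probability p
            buffer[replaced % n] = x
            replaced += 1
        if replaced >= n:              # buffer fully replaced -> trigger relearning
            detector = train_detector(buffer)
            return prune(detector)     # pruned result is broadcast to members
    return None                        # not enough retained data yet
```

With p = 1 every observation is retained and the update fires as soon as the buffer turns over once; with p = 0 no update ever triggers, matching the "stationary dynamics, small p" heuristic above.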
4. Experiments and Analysis
In this section the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a PC with an Intel Core 2 Duo CPU P7450 (2.13 GHz) and 4 GB of memory running Windows 7 Professional. The data preprocessing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used to validate the proposed method. It was collected from a WSN deployed in the Intel Research Laboratory at the University of California, Berkeley, and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
Figure 3: Sensor node locations in the IBRL deployment.
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurement data, namely light, temperature, humidity, and voltage, were collected, with measurements recorded at 31 s intervals. Because the sensors were deployed inside a lab and the measured variables changed little over time (except for light, which exhibits sudden changes due to its irregular nature and frequent on/off switching), many researchers consider this a static dataset. In our experiments, to evaluate the proposed anomaly detection algorithm, artificial anomalies were created by randomly modifying some observations, a practice widely used in the literature [41].
Since the proposed method adopts a cluster structure, a cluster consisting of four sensor nodes (N7, N8, N9, and N10) and the dataset collected on 29/02/2004 were chosen; the data distribution can be seen in [7]. Only part of the observations (from 00:00:00 am to 07:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.
Figure 4 shows that the data distributions within a cluster are almost the same, which confirms that spatial correlation exists. There are some trivial differences; careful analysis of the dataset shows that the main reason is missing data points, largely due to packet loss, which can also be observed in Figure 4. In our experiments these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.
Suppose that D = {(x_i, y_i), i = 1, 2, …, n} is a dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible; besides, an anomalous event should be a small-probability event with respect to the normal dataset collected by a non-faulty sensor node. The anomalies were therefore generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
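A possible sketch of this injection procedure (the `mean_shift` and `var_scale` values are illustrative assumptions, not the exact parameters behind Table 1):

```python
import random

def inject_anomalies(data, n_anomalies, mean_shift=0.1, var_scale=1.05, seed=0):
    """Append n_anomalies points drawn from a Gaussian whose mean and
    variance deviate slightly from the normal data, so the ranges overlap."""
    rng = random.Random(seed)
    mu = sum(data) / len(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    out, labels = list(data), [0] * len(data)      # 0 = normal, 1 = anomaly
    for _ in range(n_anomalies):
        out.append(rng.gauss(mu + mean_shift, (var * var_scale) ** 0.5))
        labels.append(1)
    return out, labels
```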
4.2. Performance Evaluation Metrics and BBO Parameters. To evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely detection accuracy (ACC), true positive rate (TPR), and false positive (alarm) rate (FPR). They are defined as follows:
ACC = (TP + TN) / (TP + TN + FP + FN),
Figure 4: The data trend during 00:00:00 am–07:59:59 am on February 29, 2004, for nodes N7–N10: (a) temperature, (b) humidity.
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial samples | Mean (T / H)      | Variance (T / H)  | Injected anomalies | Anomaly mean (T / H) | Anomaly variance (T / H)
N7   | 823             | 18.4154 / 40.9176 | 0.5238 / 1.4494   | 30                 | 18.21 / 41.10        | 0.54 / 1.46
N8   | 548             | 17.9844 / 41.7123 | 0.5315 / 1.4612   | 30                 | 17.75 / 41.95        | 0.55 / 1.48
N9   | 652             | 18.1140 / 42.6295 | 0.5288 / 1.4827   | 30                 | 18.35 / 42.45        | 0.55 / 1.50
N10  | 620             | 18.1144 / 42.6215 | 0.5244 / 1.4191   | 30                 | 18.33 / 42.47        | 0.54 / 1.43
TPR = TP / (TP + FN),
FPR = FP / (FP + TN),
(7)
where TP is the number of samples correctly predicted as the anomaly class, FP the number incorrectly predicted as the anomaly class, TN the number correctly predicted as the normal class, and FN the number incorrectly predicted as the normal class.
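The three metrics of formula (7) follow directly from these confusion counts; a small sketch:

```python
def detection_metrics(tp, fp, tn, fn):
    """ACC, TPR, and FPR as defined in formula (7)."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall detection accuracy
    tpr = tp / (tp + fn)                    # fraction of anomalies caught
    fpr = fp / (fp + tn)                    # fraction of normals misflagged
    return acc, tpr, fpr
```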
BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01 (λ and μ denote the immigration and emigration rates, respectively); elitism parameter ρ = 2.
The HSI (habitat suitability index) is a fitness function, as in other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:
F-measure = (1 + β²) · precision · recall / (β² · precision + recall)
          = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP).
(8)
The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision versus recall (typically β = 0.5, 1, or 2). The F-measure tends toward the smaller of precision and recall; that is, a large F-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper β = 1 is used.
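The HSI evaluation of formula (8) reduces to a one-line function of the confusion counts; a sketch:

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure of formula (8), used as the habitat suitability index (HSI);
    beta trades off precision against recall (beta = 1 weighs them equally)."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
```

As the text notes, the value reaches 1 only when both FP and FN are zero, i.e., when precision and recall are both perfect.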
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have
Table 2: Detection performance of the local ensemble detectors.

Ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
5             | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10            | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15            | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20            | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364
Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
20                     | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40                     | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60                     | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80                     | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217
been widely used for classification problems, separating data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44] and is used in this paper to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used to train the local detector, and the remainder served as the test set to evaluate the proposed method.
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. The experiments aim at two goals: first, to demonstrate the effectiveness of the proposed method based on ensemble learning theory; second, to show that the pruned ensemble detector can achieve better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were performed: a local ensemble anomaly detector considering only the temporal correlation of each sensor node; a global ensemble anomaly detector considering the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
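Online Bagging (in the Oza-Russell sense) presents each new example to every base learner k ~ Poisson(1) times, mimicking bootstrap resampling on a stream. A sketch with a stand-in counting learner (a real deployment would plug in the one-class base detectors described above):

```python
import random

class CountingLearner:
    """Stand-in base learner that just counts the examples it receives."""
    def __init__(self):
        self.n_seen = 0
    def update(self, x, y):
        self.n_seen += 1

def poisson1(rng):
    """Draw k ~ Poisson(lambda=1) via Knuth's inverse-transform method."""
    limit, k, p = 2.718281828459045 ** -1, 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def online_bagging_update(learners, x, y, rng):
    """Oza-style online bagging: each base learner sees the new example
    k ~ Poisson(1) times."""
    for learner in learners:
        for _ in range(poisson1(rng)):
            learner.update(x, y)
```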
Table 2 shows the performance of each sensor node under different ensemble sizes, without taking into account the spatial correlation of the sensed data within a cluster. Although the detection performance gradually improves as the ensemble size increases (higher ACC and TPR are better; lower FPR is better), the overall performance is relatively low: the maximum detection accuracy is only 0.8933, most true positive rates are unacceptable, and most false positive rates are relatively high. All these results indicate that the performance of a purely local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node: after the local ensemble detectors were trained, each member node sent its local ensemble to the others to form the global ensemble detector, which each member node then used to test its local observations online. The results in Table 3 [7] clearly show higher detection performance than in Table 2; with the help of the neighbors' detectors, the detection results improve steadily as the ensemble size increases.
To further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detection performance is as good as or better than that of the initial global ensemble detector. As the results in Table 5 show, when the initial ensemble size reaches 80, 60% of the resource cost is saved. In our experiments, purely to validate the method, the local ensemble sizes were set to 5, 10, 15, and 20, which may be small for practical applications. How many local detectors to use is in fact an open question, decided by many factors such as the computation capability, communication cost, and memory usage of the sensor nodes, as well as the required detection accuracy; in practical applications a trade-off is commonly made.
5. Conclusion and Future Work
After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Because of the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. Experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, a possible extension of this work is to include diversity
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO-pruned) | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
14                         | 0.9480 / 0.8000 / 0.0458 | 0.9327 / 0.7667 / 0.0567 | 0.9500 / 0.8125 / 0.0423 | 0.9533 / 0.8571 / 0.0420
23                         | 0.9710 / 0.7750 / 0.0208 | 0.9447 / 0.8000 / 0.0461 | 0.9733 / 0.9250 / 0.0239 | 0.9697 / 0.9143 / 0.0276
27                         | 0.9713 / 0.8500 / 0.0236 | 0.9683 / 0.8333 / 0.0230 | 0.9810 / 0.9563 / 0.0176 | 0.9797 / 0.9357 / 0.0182
32                         | 0.9820 / 0.9750 / 0.0177 | 0.9750 / 0.8333 / 0.0160 | 0.9820 / 0.9500 / 0.0162 | 0.9830 / 0.9786 / 0.0168
Table 5: Rate of resource cost saved by the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Saved resource cost
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%
measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main cause of rapid energy depletion of sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be considered in future work to lengthen the lifetime of WSNs.
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research—four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. A. Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
appropriate for WSNs. A possible strategy is to select a subset of the individual detectors to perform the anomaly detection. Consequently, ensemble pruning is a necessary strategy [9], which can obtain better (or at least the same) performance compared with the initial ensemble while greatly decreasing the number of individual detectors.
Analyzing the spatiotemporal correlation of the sensed data in WSNs, and motivated by online ensemble learning methods, this paper proposes a distributed anomaly detection method for WSNs from the perspectives of both model building and resource saving. Further, to mitigate the high communication requirement caused by broadcasting ensemble detectors, BBO-based ensemble pruning is used to select the optimized individual detectors and build a final ensemble detector with at least the same performance as the initial one. The main contributions of this paper include the following:
(1) A distributed anomaly detection method for WSNs is proposed based on online ensemble learning.
(2) BBO-based ensemble pruning is used to obtain the optimal subset, saving the limited storage and communication resources of WSNs.
(3) A state-matrix encoding method is designed for the ensemble detector, which decreases the communication and memory overhead significantly.
The rest of this paper is organized as follows. Related work is described in Section 2. Based on ensemble learning theory and BBO, the proposed anomaly detection method is presented in Section 3. Experimental analysis is provided in Section 4. Finally, conclusions and future work are presented in Section 5.
2. Related Work
To clarify the motivation of this paper, the state of the art of three related topics is summarized: anomaly detection in WSNs, online ensemble learning, and ensemble pruning.
2.1. Anomaly Detection Methods and Classification in WSNs. With the rapid development and wide application of WSNs, various anomaly detection techniques for WSNs have been developed and surveyed from different perspectives. For example, [10] discussed the prioritization of various characteristics of WSNs, including spatiotemporal and attribute correlations of sensed data, anomaly types, anomaly identification, anomaly scores, and so forth; it provided a brief overview of classification strategies for anomaly detection methods in WSNs deployed in harsh environments, grouping them into four types: statistical-based, nearest-neighbor-based, clustering-based, and classification-based techniques. Based on the nature of sensor data and the specific requirements and limitations of WSNs, [1] provided a comprehensive overview of anomaly detection techniques specifically developed for WSNs, presenting a technique-based taxonomy and a comparative table
that can be used as a guideline for selecting a suitable method for a specific application. For example, based on characteristics such as data type, anomaly type, and anomaly degree, statistical methods are further classified into parametric and nonparametric methods; based on how the probability distribution model is built, classification-based methods are categorized as support vector machine-based methods, radial basis function neural network-based methods [11], and so on. The interested reader is referred to further anomaly detection methods and taxonomies in [5, 6, 12, 13]. These taxonomies overlap to some extent, and machine learning and computational intelligence-based techniques are without doubt an increasingly important research direction for complicated applications. Moreover, although these methods achieve acceptable performance to some extent, resource constraints were usually not, or only seldom, taken into account; with the wide application of WSNs, this issue has also attracted researchers' attention [14]. Another noticeable characteristic of the aforementioned methods is that only a single detector or model is trained. It is well known that a single model may not learn a complicated decision boundary well for a complicated dataset. For streaming sensed data with a dynamic distribution, a single model can hardly, or only at great cost, learn the whole profile; training an artificial neural network in this setting, for example, leads to overlearning and degrades generalization performance. Besides, concept drift [15] is a common phenomenon in data collected from WSNs, and a single model has difficulty coping with such dynamic changes of the data distribution while providing a comprehensive anomaly detector. Moreover, updating a detector using all available data is also hard for online learning.
2.2. Ensemble Learning Methods. Ensemble learning is a computational intelligence method, and both theory and experiment have proved that combining the predictions of many individual detectors can enhance generalization performance. Many ensemble learning methods are widely and successfully used, such as Bagging [16, 17], Boosting [18, 19], Random Forest [20], and their online versions [21, 22]. Generally, an ensemble anomaly detector is constructed in two steps: first, a number of base detectors are trained on the training dataset; second, a combination strategy is designed to aggregate the results of the single detectors. For time-series datasets, such as the sensed data in WSNs, learning a single model that profiles the whole dataset is usually difficult or impossible. Generally, two ensemble patterns are used to handle streaming data: the horizontal ensemble and the vertical ensemble. The former uses the nearest n consecutive data chunks to train n base detectors and combines them into an ensemble detector used to predict the data in the yet-to-arrive chunk. The advantage of the horizontal ensemble is that it can handle noisy data in the stream, because the prediction for a newly arriving chunk depends on the combination of different chunks; even if
International Journal of Distributed Sensor Networks 3
noisy data deteriorate some chunks, the ensemble can still generate relatively accurate predictions. The disadvantage of the horizontal ensemble is that, since the streaming data are continuously changing, the information contained in the previous chunks may be invalid, so that using these old-concept models will not improve the overall prediction. The latter pattern, the vertical ensemble, uses only the newest chunk to build the ensemble model. Its advantage is that it applies different algorithms to the same dataset (heterogeneous ensemble) or the same algorithm to different sampled subsets of the chunk (homogeneous ensemble), which can decrease the bias error between models. Its disadvantage is that it assumes the data chunk is error-free, a precondition that is usually hard to meet in real situations. Because online ensemble learning can address the concept drift and noisy data problems in streaming data, ensemble learning has been used for anomaly detection in WSNs [23-25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset in WSNs, a distributed method is proposed based on the horizontal ensemble and a vertical-ensemble-like pattern; Section 3 gives the detailed description.
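As a minimal sketch of the horizontal ensemble pattern described above (with a hypothetical chunk-mean threshold detector standing in for a real base learner):

```python
def horizontal_ensemble_predict(chunks, train, x, n=3):
    """Horizontal-ensemble sketch: train one base detector on each of the
    n most recent data chunks, then average their votes (+1 normal,
    -1 anomalous) to classify an observation x from the yet-to-arrive chunk."""
    detectors = [train(chunk) for chunk in chunks[-n:]]
    return sum(d(x) for d in detectors) / len(detectors)

# Hypothetical base learner: a chunk-mean threshold detector
def train(chunk):
    mu = sum(chunk) / len(chunk)
    return lambda x: 1 if abs(x - mu) < 1.0 else -1

chunks = [[18.1, 18.2], [18.3, 18.2], [18.4, 18.5], [18.6, 18.4]]
print(horizontal_ensemble_predict(chunks, train, 18.5))   # 1.0 (normal)
print(horizontal_ensemble_predict(chunks, train, 25.0))   # -1.0 (anomalous)
```

Because each prediction averages detectors trained on different chunks, a single noisy chunk cannot dominate the vote, which is exactly the robustness argument made above.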
23 Ensemble Pruning Based on Optimization Search Methods. Although ensemble learning has many advantages, its nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and transmit multiple detectors in WSNs, which drains energy quickly and is intolerable in WSNs. The principle of "many could be better than all" from the ensemble learning community [9] implies that combining all detectors may not be a good choice. Ensemble pruning [26], which selects a subset of the initial ensemble that achieves better, or at least equal, detection performance compared with the original ensemble, is therefore employed as a necessary strategy for this resource-limitation problem. Its greatest advantage is that it reduces the communication requirement: in WSNs, broadcasting relatively few detectors can save considerable battery energy. However, it is well known that pruning an ensemble of size N requires searching a space of 2^N - 1 nonempty subensembles, which is an NP-complete problem. Hence heuristic search approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], a novel population-based global optimization method, has some features in common with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper BBO is used to obtain an optimal/suboptimal ensemble to reduce the communication cost. To the best of our knowledge, as a new optimization method, BBO has not previously been applied in the field of WSNs, and our study extends its application.
3 Proposed Method
Motivated by the emerging online ensemble learning methodology [25] and considering the resource limitation of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

Figure 1: The considered WSN (base station (BS), cluster head (CH) and non-CH nodes, CH-to-BS links, and cluster boundaries).
31 Problem Statement of WSNs. In this paper we assume that the WSN is deployed in an unattended area and, to assure sensed data quality, that the sensor nodes are deployed densely. Besides, we assume the sensor nodes are time-synchronized, mainly for clarity of presentation rather than as a limitation of the proposed method. Figure 1 shows a WSN consisting of a large number of sensor nodes and a base station (BS) [30]. Generally, the WSN can be represented as a graph G = (V, E), where V = {v_1, v_2, ..., v_|V|} is a finite set of vertices and E = {e_1, e_2, ..., e_|E|} is a finite set of edges; vertex v_i (i = 1, ..., |V|) refers to a sensor node, and edge e_i (i = 1, ..., |E|) refers to a one-hop or multihop communication link between reachable sensors v_i and v_j.

From Figure 1 we can see that clusters are formed based on the nodes' geographical position information and reachable communication capability. Here we only consider one-hop communication among sensor nodes; similarly, this assumption is mainly for clear presentation of the proposed method rather than a limitation of the sensors' communication capability, and the method extends easily to multihop relaying communication. Besides, in order to describe the proposed anomaly detection method concisely, a relatively small, densely deployed subnetwork is taken into account, which forms a cluster C_i consisting of one cluster head node and a number of member sensor nodes, represented as CH_i and N_ij (j = 1, ..., |C_i|), respectively. For the whole WSN, V = C_1 ∪ C_2 ∪ ... ∪ C_n and C_i ∩ C_j = ∅. All nodes in a cluster are reachable from each other by one-hop communication, and the communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes of that cluster to avoid draining any single node's energy.

Consider one cluster C_i = {CH_i, N_i1, ..., N_im}, which contains a cluster head CH_i and its m spatially neighboring nodes N_ij (j = 1, ..., m). Each sensor node in the subnetwork
measures a data vector at every time interval Δt, composed of multiple attribute values. For the cluster head CH_i, the observation is X^i = (x^i_1, x^i_2, ..., x^i_d), where d denotes the dimension. For the jth neighbor node N_ij, the observation is X^i_j = (x^i_{j1}, x^i_{j2}, ..., x^i_{jd}). Nodes in the cluster collect samples synchronously, and the aim of the proposed method is to identify each new observation of each sensor node as normal or anomalous online.
32 Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build the proposed online ensemble detector.
The dataset collected from a WSN is a time-series dataset. A time series is a sequence of values X = {x(t), t = 1, ..., n} that follows a nonrandom order, whose n consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] can help to understand the data trend over time, to build an appropriate detector based on temporal correlation, and to predict the labels of newly arriving observations.
To obtain the detector, the foremost requirement is to achieve a stationary time-series dataset. Several data processing methods can eliminate the data trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32-34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique, first differencing, is used to eliminate the temporal trend of the dataset collected in WSNs, which can be formulated as

X' = x'(s, t) = x(s, t) - x(s, t - 1),  t = 2, 3, ..., n. (1)
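A minimal sketch of the first-differencing step in (1), using hypothetical variable names:

```python
def first_difference(series):
    """Detrend a time series x(s, t) by first differencing, as in (1):
    x'(s, t) = x(s, t) - x(s, t - 1), for t = 2, ..., n."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# A slowly rising temperature trace becomes near-stationary:
raw = [18.0, 18.1, 18.3, 18.4, 18.6]
print([round(d, 2) for d in first_difference(raw)])  # [0.1, 0.2, 0.1, 0.2]
```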
Besides, as the sensor nodes are always deployed densely, spatial redundancy exists. A dataset X = {x(s), s = 1, ..., m} is collected from the m sensor nodes of a cluster at one timestamp. This dataset can help to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, that is, the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

x(s_i) = λ_1 x(s_1) + ··· + λ_{i-1} x(s_{i-1}) + λ_{i+1} x(s_{i+1}) + ··· + λ_m x(s_m), (2)

where s_1, ..., s_{i-1}, s_{i+1}, ..., s_m denote the positions of the sensor nodes and λ_1, ..., λ_{i-1}, λ_{i+1}, ..., λ_m denote the weights of the observations, with Σ_{k=1, k≠i}^m λ_k = 1.

Consequently, for sensed data collected in a local region, two reasonable assumptions can be made:
(1) The sensed data of adjacent nonfault sensor nodes are similar at the same timestamp.
(2) The sensed data of adjacent nonfault sensor nodes have similar trends over time.
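The linear weighted estimate of (2), which underlies these assumptions, can be sketched as follows (the neighbor readings and equal weights are illustrative assumptions):

```python
def spatial_estimate(neighbor_values, weights):
    """Estimate x(s_i) as the linear weighted combination of its adjacent
    sensors' observations (formula (2)); the weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * x for w, x in zip(weights, neighbor_values))

# Node N7's reading estimated from N8, N9, N10 with (assumed) equal weights:
print(round(spatial_estimate([18.2, 18.4, 18.3], [1/3, 1/3, 1/3]), 2))  # 18.3
```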
Motivated by these two assumptions and by ensemble learning theory, a novel anomaly detection method is proposed in this paper; the details are given in the following section.
33 Proposed Ensemble Learning Method for Anomaly Detection in WSNs. Spatiotemporal correlation exists among sensor data in a local region of a WSN, so a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is then used at each individual sensor to identify global anomalous observations in a timely fashion. The proposed method is shown in Figure 2.
The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2 it can be seen that the proposed method enables each distributed sensor node to judge, globally and in time, whether every newly arriving observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly across the network and to prolong the lifetime of the whole network.
The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation over a certain time period, each sensor node s_i trains a local ensemble detector using the history dataset collected during a time interval. In fact, using this initial local ensemble detector, whether a newly arriving observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node s_i transmits its local ensemble detector, together with some related parameters such as the maximum, minimum, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and combines them with its own trained detector to build the initial global ensemble detector.

Step 4. The BBO method is applied in the cluster head to prune the initial global ensemble detector and obtain an acceptable final ensemble detector.

Step 5. The pruned (final) ensemble detector is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating, based on a predefined sampling probability p.
Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. Each member node (MN_1, ..., MN_m) inputs and preprocesses its training data, learns a local ensemble detector, and broadcasts it; the cluster head node (CH) receives the local ensemble detectors, aggregates them into the global initial ensemble detector, prunes it with BBO under a bit-coding mechanism, and broadcasts the resulting state matrix back to the member nodes.
Step 7. Once the updating condition is activated, the retraining and detector-updating procedure is triggered.
This method scales well with an increasing number of nodes in the WSN due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next we describe some of the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some tricks are designed to save communication and memory.
331 Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory of a sensor node constrains how many previous detectors can be stored. In practice, according to the available memory, only the latest several detectors are kept to build the initial local ensemble of one sensor node. For example, for sensor node i, the sensed data are collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process. Consequently each node trains multiple individual detectors over time. In this paper, supposing the n latest detectors are kept per sensor node and there are m nodes in one cluster, then in total n·m detectors are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its n trained detectors in the cluster. Taking the cluster head as an example, after all n·(m-1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (containing n·m individual detectors) is built in the cluster head node.

Many techniques can be employed to combine the results of the individual detectors into the final detection result. The most commonly used are majority vote (for classification problems) and weighted average (for regression problems). In this paper the final ensemble detection result is calculated by (3), where w_i denotes the weight coefficient; w_i = 1 yields the simple average, otherwise a weighted average. For simplicity, the simple average strategy is employed to combine the final result:

y_fin(x) = (1/(n·m)) Σ_{i=1}^{n·m} y_i(x) · w_i. (3)
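A sketch of the simple-average combination in (3), with toy threshold detectors standing in as hypothetical placeholders for trained base detectors:

```python
def ensemble_score(detectors, x, weights=None):
    """Combine the votes of the n*m single detectors by (weighted) averaging,
    as in formula (3); each detector returns +1 (normal) or -1 (anomalous),
    mirroring the sign convention of one-class SVM decisions."""
    weights = weights or [1.0] * len(detectors)
    return sum(w * d(x) for d, w in zip(detectors, weights)) / len(detectors)

# Toy threshold detectors (hypothetical stand-ins for trained base detectors):
detectors = [lambda x, t=t: 1 if x < t else -1 for t in (19.0, 19.5, 20.0)]
print(ensemble_score(detectors, 18.7))  # 1.0: unanimous normal
print(ensemble_score(detectors, 19.2))  # positive: majority still normal
```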
332 Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector E = {AD_1, AD_2, ..., AD_{n·m}}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E' ⊆ E which minimizes the generalization error and obtains better, or at least the same, detection performance compared to E. Let f_ij (i = 1, 2, ..., m; j = 1, 2, ..., n) be the fitness values
Input: E (initial ensemble anomaly detector); T (maximum number of iterations)
Output: E' (final ensemble anomaly detector)
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, HSI values
/* Optimization search process */
While (T > 0)
    Compute the immigration rate λ_i and emigration rate μ_i for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T - 1
End while
/* Ensemble pruning */
Get the final ensemble anomaly detector E' based on the habitats H_i with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T)
of the detecting performance, such as the true positive rate, false positive rate, and accuracy. Obviously the fitness value matrix F can be defined, based on the results on the testing data, as

F = [ f_11  f_12  ···  f_1n
      f_21  f_22  ···  f_2n
       ···   ···  ···   ···
      f_m1  f_m2  ···  f_mn ].  (4)

The final fitness function can be defined as

Maximize  Σ^{N'}_{i=1,...,m; j=1,...,n} f_ij
s.t.  N' ≤ m·n,  (5)

where the sum runs over the N' selected detectors.
Here the problem of ensemble pruning is to find the subset E' composed of part of the single detectors. Finding the optimized subset requires heavy and delicate computation. Biogeography-based optimization (BBO), a novel optimization method, is employed to find an acceptable ensemble subset. We only present some key information about BBO; the interested reader is referred to the detailed description in [28].
BBO is a population-based global optimization method with some characteristics in common with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain for an optimal/suboptimal solution, operators are employed to share information among candidate solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The distinctive differences between BBO and other EAs can be seen in [27, 28].
The pseudocode of ensemble pruning based on BBO is given in Algorithm 1 [7]. Here H indicates a habitat, HSI is the fitness, and an SIV (suitability index variable) is a solution feature.
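The following is a much-simplified, illustrative sketch of Algorithm 1 in Python; the rank-based migration probabilities, population size, and toy HSI function are assumptions for illustration, not the paper's exact operators:

```python
import random

def bbo_prune(fitness, length, pop=20, iters=50, eta=0.01, seed=0):
    """Simplified BBO ensemble pruning: habitats are bit strings selecting a
    detector subset; migration copies bits (SIVs) from better-ranked habitats,
    and mutation flips bits with rate eta. `fitness` maps a bit string to an
    HSI value (e.g., F-measure of the selected subensemble on test data)."""
    rng = random.Random(seed)
    habitats = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop)]
    for _ in range(iters):
        ranked = sorted(habitats, key=fitness, reverse=True)
        n = len(ranked)
        new = [list(ranked[0])]              # elitism: keep the current best
        for i in range(1, n):
            h = list(ranked[i])
            lam = (i + 1) / n                # worse habitats immigrate more
            for s in range(length):
                if rng.random() < lam:       # immigrate this SIV
                    # emigration roulette biased toward better-ranked habitats
                    j = min(int(rng.random() * rng.random() * n), n - 1)
                    h[s] = ranked[j][s]
                if rng.random() < eta:       # mutation: random SIV
                    h[s] = rng.randint(0, 1)
            new.append(h)
        habitats = new
    return max(habitats, key=fitness)

# Toy HSI: reward selecting the first three "good" detectors, penalize size
hsi = lambda bits: sum(bits[:3]) - 0.1 * sum(bits)
best = bbo_prune(hsi, length=8)
print(hsi(best))
```

Because the elite habitat is carried over unchanged each generation, the best fitness found never decreases over iterations.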
333 Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is radio communication among the sensor nodes. It is known that communicating one bit costs as much as processing thousands of bits in a sensor [35]. This means that most of the energy of a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication volume decreases the power requirement and eventually lengthens the lifetime of the whole WSN.
It is obvious that the method described so far has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its
Input: E' (current pruned ensemble anomaly detector); p (sampling probability)
Output: E* (updated pruned ensemble anomaly detector)
For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E', p)
member sensor nodes. In order to relieve the communication burden, some techniques are used to reduce the communication overhead.
In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared with centralized anomaly detection, where all training data are sent to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast the pruned ensemble itself to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here a state matrix P is designed in the cluster head; its element p_ij, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector. A detector is included in or excluded from the ensemble detector depending on the value of the corresponding bit: 1 denotes that the single detector is included in the final ensemble, and 0 means it is not included:
p_ij = 1 if AD_ij ∈ E' (i = 1, ..., m; j = 1, ..., n), and p_ij = 0 otherwise;

           1   2   ···  i-1   i   i+1  ···   n
    S_1  [ 0   1   ···   0    1    1   ···   1 ]
P = S_2  [ 1   0   ···   1    1    0   ···   0 ]    (6)
     ⋮
    S_m  [ 0   0   ···   1    1    1   ···   1 ]

(the bit values shown are illustrative; row S_i corresponds to the local ensemble of node i and column j to its jth detector).
After the pruning procedure finishes, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after ensemble pruning, N' (N' ≤ n·m) individual detectors must be announced in the cluster. If matrix P is not used, 4·N'·d bytes of communication are needed (supposing each individual detector is represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of P needs only 1 bit to represent an individual detector, so only m·n/8 bytes are required for the broadcast. Supposing one-third of the individual detectors are pruned (i.e., N' = 2·n·m/3), the ratio is (4·n·m·d·(2/3))/(m·n/8) ≈ 21.33·d. By introducing ensemble pruning and the state matrix, the energy saved in the cluster head is significant, and the lifetime of the WSN can be lengthened.
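The bit-coding trick can be sketched as follows; the packing order (row-major, most significant bit first) is an assumed convention for illustration:

```python
def encode_state_matrix(P):
    """Pack the m x n bit matrix P of formula (6) into bytes for broadcast:
    each detector then costs 1 bit instead of roughly 4*d bytes of parameters.
    Packing is row-major, most significant bit first (an assumed convention)."""
    flat = [bit for row in P for bit in row]
    out = bytearray()
    for k in range(0, len(flat), 8):
        byte = 0
        for bit in flat[k:k + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

# 3 nodes x 4 detectors each: 12 bits fit in 2 bytes
P = [[1, 0, 1, 1],
     [0, 1, 0, 0],
     [1, 1, 1, 0]]
print(len(encode_state_matrix(P)))  # 2
```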
334 Online Update and Relearning. The distribution of the sensed dataset may change, so detector updating is necessary, and an online detector update is accompanied by a relearning procedure. A compromise strategy, that is, the delay-updating strategy [36], caters to this situation and saves computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value: if the dynamics are relatively stationary, a small p should be used; otherwise a larger p should be chosen. When the buffer of a sensor node has been completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode is described in Algorithm 2.
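A sketch of the delay-updating strategy with sampling probability p; the buffer size and data stream here are hypothetical:

```python
import random

BUFFER_SIZE = 100  # hypothetical per-node buffer capacity

def maybe_retain(buffer, observation, p, rng):
    """Delay-updating sketch (cf. Algorithm 2): retain a new observation with
    sampling probability p; return True once the buffer has been fully
    replaced, signalling that retraining and BBO re-pruning should run."""
    if rng.random() < p:
        buffer.append(observation)
    return len(buffer) >= BUFFER_SIZE

buf, rng = [], random.Random(1)
for t in range(2000):                      # a slowly drifting temperature stream
    x = 18.0 + 0.01 * t
    if maybe_retain(buf, x, p=0.1, rng=rng):
        break   # buffer replaced: train a new detector, then re-run the pruning
print(len(buf))
```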
4 Experiments and Analysis
In this section the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo P7450 CPU at 2.13 GHz and 4 GB of memory, running Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithms of Section 3 were implemented on the Microsoft Visual C++ platform.
41 Dataset and Data Preprocessing. The IBRL dataset [37] is used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Berkeley Research Laboratory and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38-41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
Figure 3: Sensor node locations in the IBRL deployment.
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, that is, light, temperature, humidity, and voltage, were collected, recorded at 31 s intervals. Because these sensors were deployed inside a lab, the measured variables changed little over time (except light, which shows sudden changes due to its irregular nature and frequent on/off operation), so this dataset is considered a static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, artificial anomalies were created by randomly modifying some observations, as is widely done in the literature [41].
Since the proposed method adopts the cluster structure, a cluster consisting of four sensor nodes (N7, N8, N9, and N10) and the dataset collected on 29/02/2004 were chosen; the data distribution can be seen in [7]. Here only the observations during 0:00:00 am–7:59:59 am from each sensor node are employed to evaluate the proposed method. The data trend is depicted in Figure 4.
From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which supports the existence of spatial correlation. The trivial differences that remain are mainly due to missing data points, largely caused by packet loss, which can also be observed in Figure 4. In our experiment these missing observations are interpolated using the method described in Section 33. Another obvious fact is the sudden peaks/valleys appearing in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.
Suppose that D = {(x_i, y_i), i = 1, 2, ..., n} is a dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution very different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event relative to the normal dataset collected by a nonfault sensor node. The anomalies were generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
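The anomaly-injection setup can be sketched as follows; the mean shift and variance scale are illustrative stand-ins for the "slightly deviating" statistics of Table 1:

```python
import random

def inject_anomalies(data, count=30, mean_shift=0.4, scale=1.05, seed=42):
    """Generate `count` artificial anomalies from a normal randomizer whose
    mean/variance deviate slightly from the normal data's statistics, so the
    ranges overlap but the distributions differ (cf. Table 1); label them 1."""
    rng = random.Random(seed)
    n = len(data)
    mu = sum(data) / n
    sigma = (sum((x - mu) ** 2 for x in data) / n) ** 0.5
    anomalies = [rng.gauss(mu + mean_shift, sigma * scale) for _ in range(count)]
    labels = [0] * n + [1] * count          # 0 = normal, 1 = anomaly
    return data + anomalies, labels

r = random.Random(7)
normal = [r.gauss(18.4, 0.72) for _ in range(800)]   # synthetic "temperature" data
samples, labels = inject_anomalies(normal)
print(len(samples), sum(labels))  # 830 30
```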
42 Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR), defined as follows:

ACC = (TP + TN) / (TP + TN + FP + FN),
Figure 4: The data (temperature (a) and humidity (b)) trends of nodes N7–N10 during 0:00:00 am–7:59:59 am on February 29, 2004.
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004.

Node | Initial samples | Mean (T / H) | Variance (T / H) | Injected anomalies | Mean (T / H) | Variance (T / H)
N7   | 823 | 18.4154 / 40.9176 | 0.5238 / 1.4494 | 30 | 18.21 / 41.10 | 0.54 / 1.46
N8   | 548 | 17.9844 / 41.7123 | 0.5315 / 1.4612 | 30 | 17.75 / 41.95 | 0.55 / 1.48
N9   | 652 | 18.1140 / 42.6295 | 0.5288 / 1.4827 | 30 | 18.35 / 42.45 | 0.55 / 1.50
N10  | 620 | 18.1144 / 42.6215 | 0.5244 / 1.4191 | 30 | 18.33 / 42.47 | 0.54 / 1.43
T: temperature; H: humidity.
TPR = TP / (TP + FN),
FPR = FP / (FP + TN),  (7)

where TP is the number of samples correctly predicted as the anomaly class, FP the number incorrectly predicted as the anomaly class, TN the number correctly predicted as the normal class, and FN the number incorrectly predicted as the normal class.
BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01; λ and μ are the immigration and emigration rates, respectively; and the elitism parameter ρ = 2.
HSI (habitat suitability index) is the fitness function, similar to other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of a binary classification problem:

F-measure = (1 + β²) · precision · recall / (β² · precision + recall)
          = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP).  (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision versus recall, typically β = 0.5, 1, 2. Usually the F-measure is close to the smaller of precision and recall; a large F-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded. In this paper β = 1 is used.
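The evaluation metrics of (7) and (8) can be computed straightforwardly from the confusion-matrix counts, as in this sketch:

```python
def detection_metrics(y_true, y_pred, beta=1.0):
    """Compute ACC, TPR, and FPR as in (7) and the F-measure as in (8)
    from binary labels (1 = anomaly, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    b2 = beta ** 2
    f = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp) if tp else 0.0
    return acc, tpr, fpr, f

acc, tpr, fpr, f1 = detection_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(acc, tpr, f1)  # 0.6 0.5 0.5
```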
43 Results Presentation and Discussion. In the data mining and machine learning communities, the SVM-based method has
Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC/TPR/FPR) | N8 (ACC/TPR/FPR) | N9 (ACC/TPR/FPR) | N10 (ACC/TPR/FPR)
5  | 0.8700/0.5833/0.1181 | 0.7900/0.3333/0.1809 | 0.8267/0.5000/0.1549 | 0.8267/0.5714/0.1608
10 | 0.8800/0.6667/0.1111 | 0.8033/0.3889/0.1702 | 0.8267/0.4375/0.1514 | 0.8333/0.6429/0.1573
15 | 0.8900/0.7500/0.1042 | 0.8167/0.5000/0.1631 | 0.8433/0.5000/0.1373 | 0.8600/0.7143/0.1329
20 | 0.8933/0.8333/0.1042 | 0.8200/0.5000/0.1596 | 0.8367/0.5000/0.1444 | 0.8567/0.7143/0.1364
Table 3 Detection performance of global ensemble detector [7]
Combined ensemble size N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR
20 09467 08333 00486 09300 07778 00603 09467 07500 00423 09500 07857 0042040 09700 07500 00208 09433 08333 00496 09710 08938 00246 09650 08929 0031560 09700 08333 00243 09733 08889 00213 09800 09375 00176 09783 09357 0019680 09817 09583 00174 09800 09444 00177 09767 09375 00211 09780 09714 00217
been widely used in classification problem which separatesthe data belonging to the different classes by fitting a hyper-plane One class SVM based method as a variation of thismethod is especially favored for anomaly detection [42ndash44]In the paper it was used to train the base detectorThe datasetof each sensor node was divided into two parts about 66was used for training the local detector and the remainder asthe test set was to evaluate proposed method
Online Bagging, the commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: firstly, to demonstrate the effectiveness of the proposed method based on ensemble learning theory; secondly, to show that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were carried out: the local ensemble anomaly detector, which considers only the temporal correlation of each sensor node; the global ensemble anomaly detector, which considers the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
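The training pipeline just described, online bagging over one-class base detectors, can be sketched as follows. This is a hedged illustration: a simple mean ± 3σ threshold detector stands in for the one-class SVM so the sketch stays dependency-free, the Poisson(1) replication follows Oza's online bagging scheme, and the detector count and data are illustrative.

```python
import math
import random
import statistics

def poisson1(rng):
    """Knuth's method for a Poisson(lambda=1) draw, as used in online bagging."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

class ThresholdDetector:
    """Stand-in base detector (mean +/- 3 sigma rule); the paper trains a one-class SVM."""
    def fit(self, xs, k=3.0):
        self.k = k
        self.mu = statistics.mean(xs)
        self.sigma = statistics.pstdev(xs) or 1e-9
        return self
    def predict(self, x):
        """+1 = normal, -1 = anomalous."""
        return 1 if abs(x - self.mu) <= self.k * self.sigma else -1

def online_bagging_train(stream, n_detectors=5, rng=random.Random(0)):
    """Online bagging: each arriving sample is replicated Poisson(1) times
    into each base detector's training buffer, then the detectors are fit."""
    buffers = [[] for _ in range(n_detectors)]
    for x in stream:
        for buf in buffers:
            buf.extend([x] * poisson1(rng))
    return [ThresholdDetector().fit(buf or [0.0]) for buf in buffers]

def ensemble_predict(detectors, x):
    """Simple-average combination: majority vote of the base detectors."""
    return 1 if sum(d.predict(x) for d in detectors) >= 0 else -1

# Train on roughly normal readings, then score a typical and an extreme value.
rng = random.Random(1)
normal_stream = [rng.gauss(25.0, 0.5) for _ in range(300)]
ensemble = online_bagging_train(normal_stream)
print(ensemble_predict(ensemble, 25.2), ensemble_predict(ensemble, 40.0))  # 1 -1
```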
Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Though the ensemble detection performance gradually improves with increasing ensemble size (the higher the ACC and TPR the better, and the lower the FPR the better), the overall performance is relatively low. The maximum detection accuracy is only 0.8933, most true positive rates are unacceptable, and most false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detector was trained, each member node sent its local ensemble to the other nodes to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performances are higher than those presented in Table 2: with the help of the neighbor detectors, the detection results improve steadily as the ensemble size increases.
To further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, intended only to validate the method, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic, decided by many factors such as the computation capability, communication cost, and memory usage of the sensor node, the required detection accuracy, and so on. In practical applications a trade-off is commonly considered.
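The saving figures in Table 5 are simply the fraction of ensemble members removed by pruning (and hence the fraction of broadcast and storage cost avoided); a minimal check against the four pairs reported in the table:

```python
def saving_rate(initial_size: int, pruned_size: int) -> float:
    """Fraction of ensemble members (and hence resource cost) removed by pruning."""
    return (initial_size - pruned_size) / initial_size

# The four (initial, pruned) pairs reported in Table 5.
for initial, pruned in [(20, 14), (40, 23), (60, 27), (80, 32)]:
    print(initial, pruned, f"{saving_rate(initial, pruned):.1%}")
# Prints 30.0%, 42.5%, 55.0%, 60.0%, matching the table.
```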
5. Conclusion and Future Work
After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. Experimental results on a real dataset demonstrated that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include diversity measures in the fitness function to improve detection performance. Besides, since communication cost is the main cause of quick energy depletion in sensor nodes, especially the cluster head, adaptive selection of the cluster head based on energy state will be considered in future work to lengthen the lifetime of WSNs.

Table 4: Detection performance of global ensemble detector based on BBO pruning [7].

                      N7                       N8                       N9                       N10
Ensemble size
(BBO pruned)    ACC     TPR     FPR      ACC     TPR     FPR      ACC     TPR     FPR      ACC     TPR     FPR
14              0.9480  0.8000  0.0458   0.9327  0.7667  0.0567   0.9500  0.8125  0.0423   0.9533  0.8571  0.0420
23              0.9710  0.7750  0.0208   0.9447  0.8000  0.0461   0.9733  0.9250  0.0239   0.9697  0.9143  0.0276
27              0.9713  0.8500  0.0236   0.9683  0.8333  0.0230   0.9810  0.9563  0.0176   0.9797  0.9357  0.0182
32              0.9820  0.9750  0.0177   0.9750  0.8333  0.0160   0.9820  0.9500  0.0162   0.9830  0.9786  0.0168

Table 5: Rate of saving resource cost based on the BBO-pruned global ensemble detector.

Number   Initial ensemble size   Pruned ensemble size   Saving resource cost
1        20                      14                     30%
2        40                      23                     42.5%
3        60                      27                     55%
4        80                      32                     60%
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of Top Key Discipline of Computer Software and Theory in Zhejiang Provincial (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References

[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Segui, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
the noise data may deteriorate some chunks, the ensemble can still generate relatively accurate prediction results. The disadvantage of the horizontal ensemble is that the streaming data is continuously changing and the information contained in the previous chunks may be invalid, so that using these old-concept models will not improve the overall prediction result. The latter ensemble pattern is the vertical ensemble, which uses only the newest chunk to build the ensemble model. The advantage of the vertical ensemble is that it uses different algorithms (heterogeneous ensemble) on the same dataset, or the same algorithm (homogeneous ensemble) on different sampled subdatasets from the chunk, to build the model, which can decrease the bias error between models. The disadvantage is that the vertical ensemble assumes that the data chunk is errorless; in real situations this precondition is usually hard to meet. Currently, because online ensemble learning can address the concept-drift and noisy-data problems in streaming data, ensemble learning has been used in anomaly detection for WSNs [23–25]. In this paper, after exploiting the spatiotemporal correlation existing in the sensed dataset of WSNs, a distributed method is proposed based on the horizontal ensemble and a like-vertical ensemble; Section 3 gives the detailed description.
2.3. Ensemble Pruning Based on Optimization Search Method. Although ensemble learning has many advantages, its nontrivial disadvantage is that it needs more memory and, especially, more communication resources to store and exchange multiple detectors in WSNs, which can drain energy quickly and is intolerable in WSNs. The principle of "many could be better than all" in the ensemble learning community [9] implies that combining all detectors may not be the best choice. Ensemble pruning, a necessary strategy for this resource-limitation question [26], is therefore employed: it selects a subset of the initial ensemble and obtains better, or at least equal, detection performance compared with the original ensemble. Its greatest advantage is that it reduces the communication requirement greatly; in WSNs, broadcasting relatively few detectors can save battery energy considerably. However, it is well known that pruning an ensemble of size N requires searching a space of 2^N − 1 nonempty subensembles, which is an NP-complete problem. Hence, heuristic search approaches are used to find an appropriate subset. Biogeography-based optimization (BBO) [27, 28], a novel population-based global optimization method, shares some features with existing optimization methods such as the genetic algorithm (GA) and harmony search (HS) [29]. In this paper, BBO is used to obtain an optimal/suboptimal ensemble for reducing the communication cost. To the best of our knowledge, no previous paper has applied this new optimization method in the field of WSNs, and our study extends its application.
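A quick illustration of why exhaustive search is ruled out: the number of nonempty subensembles grows as 2^N − 1 (the ensemble sizes below are chosen arbitrarily):

```python
def n_subensembles(n: int) -> int:
    """Number of nonempty subsets of an ensemble of size n: 2^n - 1."""
    return 2 ** n - 1

for n in (10, 20, 40, 80):
    print(n, n_subensembles(n))
# An ensemble of 80 detectors already has about 1.2e24 candidate subsets,
# hence the heuristic (BBO) search.
```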
3. Proposed Method

Motivated by the increasing online ensemble learning methodology [25] and considering the resource limitations of sensor nodes in WSNs, we propose a distributed online anomaly detection method based on ensemble learning. Further, BBO is used for ensemble pruning to decrease the communication and memory requirements.

[Figure 1 schematic: a WSN with a base station (BS), cluster head (CH) nodes linked to the BS, cluster boundaries, and non-CH member nodes.]
Figure 1: The considered WSN.
3.1. Problem Statement of WSNs. In this paper we assume that the WSN is applied in an untouched area and, to assure the quality of the sensed data, the sensor nodes are usually deployed densely. Besides, we assume that the sensor nodes are time-synchronized, which is mainly for clarity of presentation rather than a limitation of the proposed method. Figure 1 shows a WSN consisting of a large number of sensor nodes and a base station (BS) [30]. Generally, the WSN can be represented as a graph G = (V, E), where V = {v_1, v_2, …, v_|V|} is a finite set of vertices, E = {e_1, e_2, …, e_|E|} is a finite set of edges, and vertex v_i (i = 1, …, |V|) and edge e_i (i = 1, …, |E|) refer to a sensor node and the one-hop or multihop communication link between sensors v_i and v_j,
respectively. From Figure 1 we can clearly see that clusters are formed based on the nodes' geographical position information and reachable communication capability. Here we consider only one-hop communication among sensor nodes; again, this assumption is mainly for clear presentation of the proposed method rather than a limitation of the communication capability of the sensors. In fact, the proposed method easily extends to multihop relaying communication. Besides, in order to describe the proposed anomaly detection method concisely, a relatively small subnetwork consisting of densely deployed sensor nodes is considered, which forms a cluster C_i consisting of one cluster head node and a number of member sensor nodes, represented as CH_i and N_ij, j = 1, …, |C_i|, respectively. For the whole WSN, V = C_1 ∪ C_2 ∪ ⋯ ∪ C_n and C_i ∩ C_j = Φ. All nodes in a cluster are reachable from each other by one-hop communication, and the communication between clusters depends on the direct links of the cluster heads. In each cluster, the selection of the cluster head is randomized among all nodes in that cluster to avoid draining its energy.

Consider one cluster C_i = {CH_i, N_i1, …, N_im}, which contains a cluster head CH_i and its m spatially neighboring nodes (N_ij, j = 1, …, m). Each sensor node in the subnetwork measures a data vector at every time interval Δt, composed of multiple attribute values. For the cluster head CH_i, the observation is X^i = (x^i_1, x^i_2, …, x^i_d), where d denotes the dimension. For the jth neighbor node N_ij, the observation is X^i_j = (x^i_j1, x^i_j2, …, x^i_jd). Nodes in the cluster collect samples synchronously, and the goal of the proposed method is to identify online whether each new observation of each sensor node is normal or anomalous.
3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build the proposed online ensemble detector.

The sensor dataset collected from a WSN is a time series dataset. A time series is a sequence of values X = {x(t), t = 1, …, n} that follows a nonrandom order, where the n consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] can help to understand the data trend over time, to build an appropriate detector based on the temporal correlation, and to predict the label of newly arriving observations.

To obtain the detector, the foremost requirement is to achieve a stationary time series dataset. Several data processing methods can be used to eliminate the data trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32–34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (first differencing) is used to eliminate the temporal trend and obtain a stationary time series for the dataset collected in WSNs, which can be formulated as

X′ = {x′(s, t) = x(s, t) − x(s, t − 1)},  t = 2, 3, …, n.  (1)
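A minimal sketch of the first difference in (1) for one node's series (the readings are illustrative):

```python
def first_difference(series):
    """Eq. (1): x'(s, t) = x(s, t) - x(s, t - 1), t = 2, ..., n.
    Removes a linear trend, yielding a (more) stationary series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# A steadily trending temperature-like series: the differenced series is trend-free.
readings = [20.0, 20.5, 21.0, 21.5, 22.0]
print(first_difference(readings))   # [0.5, 0.5, 0.5, 0.5]
```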
Besides, because the sensor nodes are always deployed densely, spatial redundancy exists. A dataset X = {x(s), s = 1, …, m} is collected from the m sensor nodes in a cluster at one timestamp. This dataset can help to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, which represents the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observation of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

x(s_i) = λ_1 x(s_1) + ⋯ + λ_{i−1} x(s_{i−1}) + λ_{i+1} x(s_{i+1}) + ⋯ + λ_m x(s_m),  (2)

where s_1, …, s_{i−1}, s_{i+1}, …, s_m denote the positions of the sensor nodes, λ_1, …, λ_{i−1}, λ_{i+1}, …, λ_m denote the weights of the observations, and Σ_{k=1, k≠i}^{m} λ_k = 1.

Consequently, for sensed data collected in a local region, two reasonable assumptions can be made:

(1) The sensed data of adjacent nonfaulty sensor nodes are similar at the same timestamp.
(2) The sensed data of adjacent nonfaulty sensor nodes have similar trends over time.
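A sketch of the spatial estimate in (2); equal weights λ_k = 1/m are used here purely for illustration, since the paper does not fix a particular weighting at this point:

```python
def estimate_from_neighbors(neighbor_values, weights=None):
    """Eq. (2): x(s_i) ~ sum of lambda_k * x(s_k) over neighbors k != i,
    with the weights summing to 1."""
    if weights is None:  # assumption: equal weights when none are given
        weights = [1.0 / len(neighbor_values)] * len(neighbor_values)
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * v for w, v in zip(weights, neighbor_values))

# Neighbors of a node report similar temperatures (assumption 1 above):
print(estimate_from_neighbors([21.0, 21.4, 20.8, 21.2]))   # ~ 21.1
```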
Motivated by these two assumptions and ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.
3.3. Proposed Ensemble Learning Method of Anomaly Detection in WSNs. Spatiotemporal correlation exists among the sensor data in a local region of a WSN, so a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector, mitigating the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. The proposed method is shown in Figure 2.
The online anomaly detection method consists of three key procedures: detector training, online detecting, and online detector updating. From Figure 2 it can be seen that the proposed method enables each distributed sensor node to judge globally, and in time, whether every newly arriving observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly over the network and to prolong the lifetime of the whole network.
The whole procedure of the proposed method is described as follows.

Step 1. Considering the temporal correlation over a certain time period, each sensor node s_i trains a local ensemble detector using the history dataset collected in a time interval. In fact, using this initial local ensemble detector, whether a new observation is normal or anomalous can already be determined locally.

Step 2. Each sensor node s_i transmits its local ensemble detector, as well as some related parameters such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.

Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.

Step 4. The BBO method is applied in the cluster head to prune the initial global ensemble detector and obtain an acceptable final ensemble detector.

Step 5. The pruned (final) ensemble detector is broadcast to each member sensor node for online global anomaly detection.

Step 6. Each sensor node selectively retains the test data for online updating based on a predefined sampling probability p.
[Figure 2 schematic: each member node (MN1, MN2, …, MNm) inputs and preprocesses its training data, learns from it, and outputs a local ensemble detector, which is broadcast to the cluster head node (CH); the CH receives the local ensemble detectors, aggregates them into the global initial ensemble detector, applies BBO ensemble pruning with a bit-coding mechanism, and broadcasts the resulting state matrix of the global ensemble detector back to the member nodes.]
Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs.
Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well with the number of nodes in a WSN due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe some of the important procedures mentioned above in detail. Further, considering the resource constraints of each sensor node in WSNs, some tricks are designed to save communication and memory.
3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, previously trained detectors may be useless for future detection. Moreover, the limited memory of a sensor node is another constraint on storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble for one sensor node. For example, for sensor node i, the sensed data is collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process. Consequently, each node trains multiple individual detectors over time. In this paper, supposing the n latest detectors are kept per sensor node, if there are m nodes in one cluster, then n × m detectors in total form the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its n trained detectors in the cluster. Taking the cluster head as an example, after all n × (m − 1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (comprising n × m individual detectors) is built in the cluster head node.

Many techniques can be employed for combining the results of the individual detectors into the final detection result. The methods commonly used in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In this paper, the final ensemble detection result is calculated by (3), where w_i denotes the weight coefficient; w_i = 1 means the simple average, otherwise a weighted average. For simplicity, the simple average strategy is employed to combine the final result:

y_fin(x) = (1 / (n · m)) Σ_{i=1}^{n·m} y_i(x) · w_i.  (3)
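For base detectors that output y_i(x) ∈ {+1, −1}, the simple average in (3) with w_i = 1 amounts to a majority vote on the sign; a sketch with illustrative detector outputs:

```python
def combine(outputs, weights=None):
    """Eq. (3): y_fin(x) = (1 / (n*m)) * sum of y_i(x) * w_i.
    With w_i = 1 this is the simple average used in the paper."""
    if weights is None:
        weights = [1.0] * len(outputs)
    return sum(y * w for y, w in zip(outputs, weights)) / len(outputs)

# Six base detectors vote on one observation (+1 normal, -1 anomalous):
votes = [1, 1, -1, 1, 1, -1]
score = combine(votes)
label = 1 if score >= 0 else -1   # score > 0, so the observation is judged normal
print(score, label)
```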
3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector E = {AD_1, AD_2, …, AD_{n·m}}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E′ ⊆ E that minimizes the generalization error and obtains better, or at least the same, detection performance compared with E. Let f_ij (i = 1, 2, …, m; j = 1, 2, …, n) be the fitness values
6 International Journal of Distributed Sensor Networks
Input: E, the initial ensemble anomaly detector; T, the maximum number of iterations
Output: E′, the final ensemble anomaly detector
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate λ and the emigration rate μ for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T − 1
End while
/* Ensemble pruning */
Get the final ensemble of anomaly detectors E′ based on the habitats H_i with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).
of the detection performance, such as the true positive rate, false positive rate, accuracy, and so on. The fitness value matrix $F$ can then be defined as (4) based on the results on the testing data:

$$F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n}\\ f_{21} & f_{22} & \cdots & f_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{bmatrix}. \qquad (4)$$
The final fitness function can be defined as

$$\text{Maximize } \sum_{\substack{i = 1, \dots, m\\ j = 1, \dots, n}}^{N'} f_{ij}, \qquad \text{s.t. } N' \le m n, \qquad (5)$$

where the sum runs over the $N'$ detectors selected for $E'$.
Here, the problem of ensemble pruning is to find the subset E′ composed of part of the single detectors. Finding the optimized subset requires heavy and delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed here to find an acceptable ensemble subset. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method which shares some common characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain for an optimal/suboptimal solution, operators are employed to share information among solutions, which makes BBO applicable to many problems where GA and PSO are used. The more distinctive differences between BBO and other EAs can be seen in [27, 28].

The pseudocode of ensemble pruning based on BBO is described in Algorithm 1 [7]. Here, H indicates a habitat, HSI is the fitness, and an SIV (suitability index variable) is a solution feature.
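As an illustration of Algorithm 1, the following is a minimal BBO pruning sketch under simplifying assumptions (no elitism, a linear rank-based migration model, and mean kept-fitness as the HSI); the function name `bbo_prune` and these modeling choices are illustrative, not the authors' implementation. Each habitat is a bit vector with one SIV per detector (1 = keep, 0 = drop).

```python
import random

def bbo_prune(fitness, num_habitats=30, iterations=50, eta=0.01, seed=0):
    """Minimal BBO search over detector subsets.

    fitness -- per-detector fitness values f_ij, flattened to a list of length n*m.
    Returns the best habitat found: a keep/drop bit vector over the detectors.
    """
    rng = random.Random(seed)
    size = len(fitness)

    def hsi(h):
        # HSI of a habitat: mean fitness of the detectors it keeps (0 if empty).
        kept = [f for f, bit in zip(fitness, h) if bit]
        return sum(kept) / len(kept) if kept else 0.0

    habitats = [[rng.randint(0, 1) for _ in range(size)] for _ in range(num_habitats)]
    for _ in range(iterations):
        ranked = sorted(habitats, key=hsi, reverse=True)  # best habitats first
        n = len(ranked)
        for i, h in enumerate(ranked):
            immigration = (i + 1) / n  # poor habitats immigrate more (linear model)
            for s in range(size):
                if rng.random() < immigration:
                    # Pick an emigrating habitat biased toward high-HSI (small rank).
                    j = min(int(rng.random() * rng.random() * n), n - 1)
                    h[s] = ranked[j][s]
                if rng.random() < eta:  # mutation: replace the SIV at random
                    h[s] = rng.randint(0, 1)
        habitats = ranked
    return max(habitats, key=hsi)
```

Run on a fitness vector with four good and four poor detectors, the search tends to keep the high-fitness ones.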
3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is radio communication among the sensor nodes. It is known that communicating one bit costs as much energy as processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity will decrease the power requirement and eventually lengthen the lifetime of the whole WSN.

It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its
Input: E′, the current pruned ensemble anomaly detector; p, the sampling probability
Output: E*, the updated pruned ensemble anomaly detector
For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E′, T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E′, p).
member sensor nodes. In order to relieve this communication burden, some techniques are used to reduce the communication overhead.

In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared to centralized anomaly detection, where all training data are sent to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element p_ij is defined by formula (6) to represent each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector. A detector is included in or excluded from the ensemble depending on the value of its corresponding bit: 1 denotes that the single detector is included in the final ensemble, and 0 means it is not included.
$$p_{ij} = \begin{cases} 1, & \mathrm{AD}_{ij} \in E',\ i = 1, \dots, m,\ j = 1, \dots, n,\\ 0, & \text{otherwise}, \end{cases}$$

$$P = \begin{pmatrix} 0 & 1 & \cdots & 0 & 1 & 1 & \cdots & 1\\ 1 & 0 & \cdots & 1 & 1 & 0 & \cdots & 0\\ \vdots & & & & & & & \vdots\\ 0 & 0 & \cdots & 1 & 1 & 1 & \cdots & 1 \end{pmatrix}, \qquad (6)$$

where the rows of $P$ correspond to the sensor nodes $S_1, S_2, \dots, S_m$ and the columns to the detector indices $1, 2, \dots, i-1, i, i+1, \dots, n$.
After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state element equals 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after ensemble pruning is finished, N′ (N′ ≤ n*m) individual detectors are to be broadcast in the cluster. If matrix P is not used, this requires 4*N′*d bytes of communication (supposing that an individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of P needs only 1 bit to represent an individual detector, so only m*n/8 bytes are required for the broadcast. Supposing that one-third of the individual detectors are pruned (i.e., N′ = 2*n*m/3), then (4*n*m*d*2/3)/(m*n/8) ≈ 21.33d. By introducing ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
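The byte accounting above can be sketched as follows. The values of m, n, and d and the example pruning result are hypothetical, chosen only to illustrate the comparison; the paper assumes d 4-byte parameters per detector.

```python
# Hypothetical cluster sizes for illustration (not the paper's experimental values).
m, n, d = 4, 20, 10  # nodes per cluster, detectors per node, parameters per detector

def pack_state_matrix(keep):
    """Pack an m*n boolean keep/drop matrix into bytes, 1 bit per detector."""
    bits = [b for row in keep for b in row]
    out = bytearray((len(bits) + 7) // 8)
    for idx, bit in enumerate(bits):
        if bit:
            out[idx // 8] |= 1 << (idx % 8)
    return bytes(out)

# An arbitrary example pruning result: keep about two-thirds of the detectors.
keep = [[(i + j) % 3 != 0 for j in range(n)] for i in range(m)]
n_kept = sum(sum(row) for row in keep)

naive_bytes = 4 * n_kept * d                  # broadcast kept detectors as parameters
matrix_bytes = len(pack_state_matrix(keep))   # broadcast the state matrix instead
```

With these numbers the state matrix needs 10 bytes versus thousands for the parameter broadcast, mirroring the ≈21.33d saving derived above.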
3.3.4. Online Update and Relearning. The distribution of the sensed dataset may change over time, so detector updating is necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can handle this situation and save computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value: for example, if the dynamics are relatively stationary, a small p should be used; otherwise, a big p should be chosen. When the buffer of a sensor node has been completely replaced by new data, an online update is triggered and a new detector is trained. The pseudocode is described in Algorithm 2.
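The node-side part of Algorithm 2 can be sketched as below. The class name `DelayedUpdater` and the retrain callback are illustrative assumptions; in the real system the retrain step would train a detector and send its summary to the cluster head rather than append to a list.

```python
import random

class DelayedUpdater:
    """Sketch of the delayed updating strategy: keep each new observation with
    probability p; once the buffer has been fully replaced, trigger relearning."""

    def __init__(self, buffer_size, p, retrain, seed=0):
        self.buffer = [None] * buffer_size
        self.replaced = [False] * buffer_size  # which slots hold fresh data
        self.p = p
        self.retrain = retrain                 # callback invoked with the fresh buffer
        self.rng = random.Random(seed)
        self.pos = 0

    def observe(self, x):
        if self.rng.random() < self.p:         # retain with probability p
            self.buffer[self.pos] = x
            self.replaced[self.pos] = True
            self.pos = (self.pos + 1) % len(self.buffer)
            if all(self.replaced):             # buffer fully replaced: relearn
                self.retrain(list(self.buffer))
                self.replaced = [False] * len(self.buffer)

events = []
u = DelayedUpdater(buffer_size=4, p=0.5, retrain=events.append)
for x in range(100):
    u.observe(float(x))
```

With p = 0.5, roughly half of the 100 observations are retained, so retraining fires about a dozen times; a smaller p would space the updates out further.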
4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo P7450 CPU at 2.13 GHz and 4 GB of memory, running Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] is used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Research Laboratory at the University of California, Berkeley, and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
[Figure 3: Sensor node locations in the IBRL deployment (floor plan showing node IDs 1–54 and room labels such as Lab, Server, Kitchen, Conference, and Office).]
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurement data, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except light, which showed sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies were created by randomly modifying some observations, a practice widely used by researchers in the literature [41].
Since the proposed method adopts a cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on 29/02/2004 were chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 00:00:00 am–07:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which proves that spatial correlation exists. There are some trivial differences; after analyzing the dataset carefully, the main reason is that the dataset has some missing data points, largely due to packet loss, which can be further observed in Figure 4. In our experiment, these missing observations were interpolated using the method described in Section 3.3. Another obvious fact is the sudden peaks/valleys appearing in Figure 4 for each sensor's observations, which imply that an event of interest may have occurred.
Suppose that $D = \{x_i, y_i\}$, $i = 1, 2, \dots, n$, is a dataset used to train an anomaly detector, where $x_i$ is a vector of feature values and $y_i$ is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomaly data points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer with statistical characteristics deviating slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are used in this paper, namely, detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are defined as follows:
$$\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}},$$
[Figure 4: The data trends ((a) temperature, roughly 17–20 °C; (b) humidity, roughly 38–46%) for nodes N7–N10 during 00:00:00 am–07:59:59 am on February 29, 2004.]
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T = temperature, H = humidity).

Node | Initial samples | Mean (T / H)      | Variance (T / H) | Injected anomalies | Mean (T / H)  | Variance (T / H)
N7   | 823             | 18.4154 / 40.9176 | 0.5238 / 1.4494  | 30                 | 18.21 / 41.10 | 0.54 / 1.46
N8   | 548             | 17.9844 / 41.7123 | 0.5315 / 1.4612  | 30                 | 17.75 / 41.95 | 0.55 / 1.48
N9   | 652             | 18.1140 / 42.6295 | 0.5288 / 1.4827  | 30                 | 18.35 / 42.45 | 0.55 / 1.50
N10  | 620             | 18.1144 / 42.6215 | 0.5244 / 1.4191  | 30                 | 18.33 / 42.47 | 0.54 / 1.43
$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}, \qquad (7)$$
where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
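The metrics in (7) can be computed directly from the confusion counts; the helper name `detection_metrics` is illustrative.

```python
def detection_metrics(y_true, y_pred):
    """Compute ACC, TPR, and FPR from binary labels (1 = anomaly, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return acc, tpr, fpr
```

For example, with true labels [1, 1, 0, 0, 0] and predictions [1, 0, 0, 0, 1], the function returns ACC = 0.6, TPR = 0.5, and FPR = 1/3.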
BBO is employed to prune the initial ensemble. The migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01, where λ and μ are the immigration rate and the emigration rate, respectively; and elitism parameter ρ = 2.
The HSI (habitat suitability index) is a fitness function, similar to those of other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:

$$F\text{-measure} = \frac{(1+\beta^2)\,\mathrm{precision} \cdot \mathrm{recall}}{\beta^2\,\mathrm{precision} + \mathrm{recall}} = \frac{(1+\beta^2)\,\mathrm{TP}}{(1+\beta^2)\,\mathrm{TP} + \beta^2\,\mathrm{FN} + \mathrm{FP}}. \qquad (8)$$
The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision and recall, typically β = 0.5, 1, or 2. Usually, the value of the F-measure is close to the smaller of precision and recall; that is, a large F-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
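Equation (8) admits a one-line implementation from the detection counts; the helper name `f_measure` is illustrative.

```python
def f_measure(tp, fn, fp, beta=1.0):
    """F-measure (F-score) from detection counts, following Eq. (8)."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
```

For instance, with TP = 8, FN = 2, FP = 2 (precision = recall = 0.8), the F1-score is 0.8, matching the weighted-average interpretation above.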
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have
Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
5             | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10            | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15            | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20            | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364
Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
20                     | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40                     | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60                     | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80                     | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217
been widely used for classification problems, separating the data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44]. In this paper, it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder was used as the test set to evaluate the proposed method.
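A base detector of this kind can be sketched with scikit-learn's `OneClassSVM`; the synthetic temperature-like data, the anomaly offset, and the parameter values (`nu`, `gamma`) are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Normal temperature-like readings around 18.1, plus well-separated injected anomalies.
train = rng.normal(loc=18.1, scale=0.7, size=(600, 1))
test_normal = rng.normal(loc=18.1, scale=0.7, size=(100, 1))
test_anomal = rng.normal(loc=22.0, scale=0.7, size=(30, 1))

# One-class SVM trained on normal data only; predict() returns +1 (normal) / -1 (anomaly).
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(train)
pred = detector.predict(np.vstack([test_normal, test_anomal]))
```

Because the injected anomalies sit several standard deviations away from the training data, nearly all of them fall outside the learned boundary and are flagged as -1.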
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to prove the effectiveness of the proposed method based on ensemble learning theory; second, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirements. As a result, three experiments were performed: the local ensemble anomaly detector considering only the temporal correlation of each sensor node, the global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Though the ensemble detection performance gradually improves with increasing ensemble size (higher ACC and TPR and lower FPR indicate better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detectors were trained, each member node sent its local ensemble to the other nodes to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results in Table 3 [7], an obvious fact is that the detection performance is higher than that presented in Table 2. With the help of the neighbors' detectors, the detection results improve steadily with increasing ensemble size.
In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance remains as good as, or better than, that of the initial global ensemble detector. From the results in Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiments, only for validating the method, we set ensemble sizes of 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor node, as well as the required detection accuracy. In practical applications, a trade-off is commonly made.
5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include some diversity
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
14                         | 0.9480 / 0.8000 / 0.0458 | 0.9327 / 0.7667 / 0.0567 | 0.9500 / 0.8125 / 0.0423 | 0.9533 / 0.8571 / 0.0420
23                         | 0.9710 / 0.7750 / 0.0208 | 0.9447 / 0.8000 / 0.0461 | 0.9733 / 0.9250 / 0.0239 | 0.9697 / 0.9143 / 0.0276
27                         | 0.9713 / 0.8500 / 0.0236 | 0.9683 / 0.8333 / 0.0230 | 0.9810 / 0.9563 / 0.0176 | 0.9797 / 0.9357 / 0.0182
32                         | 0.9820 / 0.9750 / 0.0177 | 0.9750 / 0.8333 / 0.0160 | 0.9820 / 0.9500 / 0.0162 | 0.9830 / 0.9786 / 0.0168
Table 5: Rate of resource cost saving based on the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Resource cost saving
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%
measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially the cluster head, the adaptive selection of the cluster head based on its energy state will be taken into account in future work to lengthen the lifetime of WSNs.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y Zhang N Meratnia and P Havinga ldquoOutlier detectiontechniques for wireless sensor networks a surveyrdquo IEEE Com-munications Surveys and Tutorials vol 12 no 2 pp 159ndash1702010
[2] Y Zhang N A S Hamm N Meratnia A Stein M van deVoort and P J M Havinga ldquoStatistics-based outlier detectionfor wireless sensor networksrdquo International Journal of Geo-graphical Information Science vol 26 no 8 pp 1373ndash1392 2012
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S Rajasegarar C Leckie M Palaniswami and J C BezdekldquoDistributed anomaly detection in wireless sensor networksrdquo inProceedings of the 10th IEEE Singapore International Conferenceon Communication systems (ICCS rsquo06) pp 1ndash5 IEEE SingaporeOctober 2006
[5] S Rajasegarar C Leckie and M Palaniswami ldquoAnomalydetection in wireless sensor networksrdquo IEEE Wireless Commu-nications vol 15 no 4 pp 34ndash40 2008
[6] M Xie S Han B Tian and S Parvin ldquoAnomaly detectionin wireless sensor networks a surveyrdquo Journal of Network andComputer Applications vol 34 no 4 pp 1302ndash1325 2011
[7] Z Ding M Fei D Du and S Xu ldquoOnline anomaly detectionmethod based on BBO ensemble pruning in wireless sensornetworksrdquo in Life System Modeling and Simulation vol 461 ofCommunications in Computer and Information Science pp 160ndash169 Springer Berlin Germany 2014
[8] T G Dietterich ldquoMachine-learning researchmdashfour currentdirectionsrdquo AI Magazine vol 18 no 4 pp 97ndash136 1997
[9] Z-H Zhou J Wu andW Tang ldquoEnsembling neural networksmany could be better than allrdquoArtificial Intelligence vol 137 no1-2 pp 239ndash263 2002
[10] N Shahid I H Naqvi and S B Qaisar ldquoCharacteristics andclassification of outlier detection techniques for wireless sensornetworks in harsh environments a surveyrdquoArtificial IntelligenceReview vol 137 pp 1ndash36 2012
[11] D Du K Li and M Fei ldquoA fast multi-output RBF neuralnetwork constructionmethodrdquoNeurocomputing vol 73 no 10ndash12 pp 2196ndash2202 2010
[12] P Gil A Santos and A Cardoso ldquoDealing with outliers inwireless sensor networks an oil refinery applicationrdquo IEEETransactions on Control Systems Technology vol 23 no 4 pp1589ndash1596 2014
[13] M A Rassam M A Maarof and A Zainal ldquoAdaptive andonline data anomaly detection for wireless sensor systemsrdquoKnowledge-Based Systems vol 60 pp 44ndash57 2014
[14] S Rajasegarar A Gluhak M Ali Imran et al ldquoEllipsoidalneighbourhood outlier factor for distributed anomaly detectionin resource constrained networksrdquo Pattern Recognition vol 47no 9 pp 2867ndash2879 2014
[15] N Lu G Zhang and J Lu ldquoConcept drift detection viacompetence modelsrdquo Artificial Intelligence vol 209 pp 11ndash282014
[16] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[17] S Seguı L Igual and J Vitria ldquoBagged one-class classifiersin the presence of outliersrdquo International Journal of PatternRecognition and Artificial Intelligence vol 27 no 5 Article ID1350014 2013
12 International Journal of Distributed Sensor Networks
[18] N Duffy and D Helmbold ldquoBoosting methods for regressionrdquoMachine Learning vol 47 no 2-3 pp 153ndash200 2002
[19] W-C Chang and C-W Cho ldquoOnline boosting for vehicledetectionrdquo IEEETransactions on SystemsMan and CyberneticsPart B Cybernetics vol 40 no 3 pp 892ndash902 2010
[20] C Desir S Bernard C Petitjean and L Heutte ldquoOne classrandom forestsrdquo Pattern Recognition vol 46 no 12 pp 3490ndash3506 2013
[21] A Fern and R Givan ldquoOnline ensemble learning an empiricalstudyrdquoMachine Learning vol 53 no 1-2 pp 71ndash109 2003
[22] A Bifet G Holmes B Pfahringer and R Gavalda ldquoImprov-ing adaptive bagging methods for evolving data streamsrdquo inAdvances in Machine Learning vol 5828 of Lecture Notes inComputer Science pp 23ndash37 Springer Berlin Germany 2009
[23] D I Curiac and C Volosencu ldquoEnsemble based sensinganomaly detection in wireless sensor networksrdquo Expert Systemswith Applications vol 39 no 10 pp 9087ndash9096 2012
[24] X Zhou S Li and Z Ye ldquoA novel system anomaly predictionsystem based on belief markov model and ensemble classifica-tionrdquo Mathematical Problems in Engineering vol 2013 ArticleID 179390 10 pages 2013
[25] H He S Chen K Li and X Xu ldquoIncremental learning fromstream datardquo IEEE Transactions on Neural Networks vol 22 no12 pp 1901ndash1914 2011
[26] D Du K Li X Li and M Fei ldquoA novel forward gene selectionalgorithm for microarray datardquo Neurocomputing vol 133 pp446ndash458 2014
[27] H Ma ldquoAn analysis of the equilibrium of migration models forbiogeography-based optimizationrdquo Information Sciences vol180 no 18 pp 3444ndash3464 2010
[28] D Simon ldquoBiogeography-based optimizationrdquo IEEE Transac-tions on Evolutionary Computation vol 12 no 6 pp 702ndash7132008
[29] S Sheen R Anitha and P Sirisha ldquoMalware detection bypruning of parallel ensembles using harmony searchrdquo PatternRecognition Letters vol 34 no 14 pp 1679ndash1686 2013
[30] Y-Y Zhang H-C Chao M Chen L Shu C-H Park and M-S Park ldquoOutlier detection and countermeasure for hierarchicalwireless sensor networksrdquo IET Information Security vol 4 no4 pp 361ndash373 2010
[31] C Peng and M-R Fei ldquoAn improved result on the stability ofuncertain T-S fuzzy systems with interval time-varying delayrdquoFuzzy Sets and Systems vol 212 pp 97ndash109 2013
[32] Y Zhang Observing the Unobservable Distributed Online Out-lier Detection inWireless Sensor Networks University of TwenteEnschede The Netherlands 2010
[33] C Peng D Yue and M Fei ldquoRelaxed stability and stabilizationconditions of networked fuzzy control systems subject toasynchronous grades of membershiprdquo IEEE Transactions onFuzzy Systems vol 22 no 5 pp 1101ndash1112 2014
[34] C Peng M-R Fei E Tian and Y-P Guan ldquoOn hold or dropout-of-order packets in networked control systemsrdquo Informa-tion Sciences vol 268 pp 436ndash446 2014
[35] M A Rassam A Zainal and M A Maarof ldquoAn adaptive andefficient dimension reduction model for multivariate wirelesssensor networks applicationsrdquo Applied Soft Computing Journalvol 13 no 4 pp 1978ndash1996 2013
[36] M Xie J Hu S Han and H-H Chen ldquoScalable hypergridk-NN-based online anomaly detection in wireless sensor net-worksrdquo IEEE Transactions on Parallel and Distributed Systemsvol 24 no 8 pp 1661ndash1670 2013
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013
[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011
[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009
[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012
[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013
[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013
[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013
4 International Journal of Distributed Sensor Networks
measures a data vector at every time interval Δt, which is composed of multiple attribute values. For the cluster head CH_i, the observation is X^i = (x^i_1, x^i_2, ..., x^i_d), where d denotes the dimension. For the jth neighbor node N_ij, the observation is X^i_j = (x^i_j1, x^i_j2, ..., x^i_jd). Nodes in the cluster collect samples synchronously, and the aim of the proposed method is to identify, online, whether the new observations of each sensor node are normal or anomalous.
3.2. Spatial and Temporal (Spatiotemporal) Correlation of the Sensed Dataset. For the sensed dataset in a cluster, we first describe the spatiotemporal correlation, which will be used later to build the proposed online ensemble detection.
The sensor dataset collected from WSNs is a time series dataset. A time series is a sequence of values X = {x(t), t = 1, ..., n} that follows a nonrandom order, where the n consecutive observation values are collected at the same time interval. Analyzing and learning from these observations [31] helps to understand the data trend over time, to build an appropriate detector based on temporal correlation, and to predict the label of newly arriving observations.
To obtain the detector, the foremost requirement is to achieve a stationary time series dataset. Several data processing methods can be used to eliminate the data trend and obtain a stationary time series, such as polynomial fitting, moving averages, differencing, and double exponential smoothing [32–34]. Considering the requirement of low computational complexity, a simple and efficient nonparametric technique (i.e., first differencing) is used to eliminate the temporal trend of the dataset collected in WSNs, which can be formulated as
X' = {x'(s, t) = x(s, t) - x(s, t - 1)},  t = 2, 3, ..., n.   (1)
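As a minimal sketch, the first-differencing step of (1) can be written as follows (plain Python; the sample values are illustrative):

```python
def first_difference(series):
    """Detrend a sensed time series x(s, t) by first differencing:
    x'(s, t) = x(s, t) - x(s, t - 1), for t = 2, 3, ..., n."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# A slowly drifting temperature trace becomes a near-stationary series.
print(first_difference([18.0, 18.1, 18.3, 18.2, 18.6]))
```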
Besides, the sensor nodes are usually deployed densely, so spatial redundancy exists. A dataset X = {x(s), s = 1, ..., m} is collected from the m sensor nodes in a cluster at one timestamp. This dataset helps to understand the spatial correlation structure of the data and to predict the data value at a nearby location. Spatial data may present local dependency, which represents the similarity of observations collected at adjacent locations in a local region. Usually, for a specified region, the observations of one sensor can be estimated by a linear weighted combination of the observations collected at its adjacent locations [32], which can be expressed as

x(s_i) = λ_1 x(s_1) + ··· + λ_{i-1} x(s_{i-1}) + λ_{i+1} x(s_{i+1}) + ··· + λ_m x(s_m),   (2)

where s_1, ..., s_{i-1}, s_{i+1}, ..., s_m denote the positions of the sensor nodes and λ_1, ..., λ_{i-1}, λ_{i+1}, ..., λ_m denote the weights of the observations, with Σ_{k=1, k≠i}^{m} λ_k = 1.

Consequently, for sensed data collected in a local region,
two reasonable assumptions are described as follows
(1) The sensed data of adjacent nonfault sensor nodes are similar at the same timestamp.
(2) The sensed data of adjacent nonfault sensor nodes have a similar trend over time.
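The spatial estimate of (2) can be sketched as follows; the uniform weights are an illustrative choice that satisfies the constraint Σ λ_k = 1:

```python
def spatial_estimate(neighbor_values, weights=None):
    """Estimate x(s_i) as a linear weighted combination of the
    observations of the other cluster members (eq. (2)).
    The weights must sum to 1; defaults to a uniform combination."""
    if weights is None:
        weights = [1.0 / len(neighbor_values)] * len(neighbor_values)
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * x for w, x in zip(weights, neighbor_values))

# Temperatures observed by the three neighbors of a node in a 4-node cluster.
print(spatial_estimate([18.2, 18.4, 18.3]))
```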
Motivated by these two assumptions and by ensemble learning theory, a novel anomaly detection method is proposed in this paper. We give the details in the following section.
3.3. Proposed Ensemble Learning Method of Anomaly Detection in WSNs. Spatiotemporal correlation exists among sensor data in a local region of WSNs, so a relatively small component, that is, a cluster consisting of a few sensor nodes and a cluster head node, is considered to clearly describe the proposed distributed anomaly detection method based on ensemble learning. Ensemble pruning based on BBO is adopted to optimize the initially trained detector and mitigate the resource requirements. The optimized ensemble detector is used to identify global anomalous observations at each individual sensor in a timely manner. The proposed method is shown in Figure 2.
The online anomaly detection method consists of three key procedures, that is, detector training, online detecting, and online detector updating. From Figure 2, it can be seen that the proposed method enables each distributed sensor node to judge globally, and in time, whether every newly arriving observation is normal or anomalous. Distributed detecting is employed to spread the load (communication, computation, and storage) evenly across the network and to prolong the lifetime of the whole network.
The whole procedure of the proposed method is described as follows.
Step 1. Considering the temporal correlation over a certain time period, each sensor node s_i trains a local ensemble detector using the history dataset collected during a time interval. In fact, using this initial local ensemble detector, whether a new observation is normal or anomalous can already be determined locally.
Step 2. Each sensor node s_i transmits its local ensemble detector, as well as some related parameters, such as the maximum value, minimum value, and mean of the training dataset, to the cluster head node and the other member sensor nodes.
Step 3. The cluster head node receives the local ensemble detectors from its member nodes and, combined with its own trained detector, builds the initial global ensemble detector.
Step 4. The BBO method is introduced in the cluster head to prune the initial global ensemble detector and to obtain an acceptable final ensemble detector.
Step 5. The pruned ensemble detector, that is, the final ensemble detector, is broadcast to each member sensor node for online global anomaly detection.
Step 6. Each sensor node selectively retains the test data for online updating based on the predefined sampling probability p.
[Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. Each member node (MN_1, MN_2, ..., MN_m) learns a local ensemble detector from its training data (input, preprocess, learn, output) and broadcasts it to the cluster head node (CH). The CH aggregates the received local ensemble detectors into the global initial ensemble detector, applies BBO ensemble pruning under a bit coding mechanism, and broadcasts the resulting state matrix back to the member nodes.]
Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.
This method scales well with an increasing number of nodes in WSNs due to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.
Next, we describe some of the important procedures mentioned above in detail. Further, considering the resource-constrained context of each sensor node in WSNs, some tricks are designed to reduce the communication and memory requirements.
3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. Firstly, a number of base detectors are trained sequentially on each sensor node in a cluster (including the cluster head node itself) based on the history dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection. Moreover, the limited memory resource of a sensor node is another constraint on storing too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble for one sensor node. For example, for sensor node i, the sensed data is collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process. Consequently, each node trains multiple individual detectors over time. In this paper, supposing that the n latest detectors are kept for each sensor node and that there are m nodes in one cluster, then n*m detectors in total are obtained for the initial ensemble. Secondly, each sensor node (including the cluster head node) broadcasts its n trained detectors in the cluster. Taking the cluster head as an example, after all n*(m-1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (comprising n*m individual detectors) is built in the cluster head node.

Many techniques can be employed to combine the results of each detector into the final detection result. The commonly used ones in the literature are the majority vote (for classification problems) and the weighted average (for regression problems). In this paper, the final ensemble detection result is calculated by (3), where w_i denotes the weight coefficient; that is, w_i = 1 yields the simple average, otherwise a weighted average. For simplicity, the simple average strategy is employed here to combine the final result:

y_fin(x) = (1 / (n * m)) * Σ_{i=1}^{n*m} y_i(x) * w_i.   (3)
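A minimal sketch of (3) with the simple-average strategy (all w_i = 1): each base detector votes 1 (anomalous) or 0 (normal), and the mean vote is thresholded; the threshold of 0.5 is an illustrative choice equivalent to a majority vote:

```python
def ensemble_decision(detectors, x, threshold=0.5):
    """Combine base detector outputs y_i(x) in {0, 1} by the simple
    average of eq. (3) (all w_i = 1) and threshold the mean vote."""
    votes = [d(x) for d in detectors]
    y_fin = sum(votes) / len(votes)
    return 1 if y_fin >= threshold else 0  # 1 = anomalous

# Three hypothetical base detectors flagging values outside a range.
detectors = [lambda x: int(x > 20.0),
             lambda x: int(x > 19.5),
             lambda x: int(x < 17.0 or x > 20.5)]
print(ensemble_decision(detectors, 21.0))  # majority says anomalous -> 1
print(ensemble_decision(detectors, 18.0))  # majority says normal -> 0
```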
3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.
Given an initial ensemble anomaly detector E = {AD_1, AD_2, ..., AD_{n*m}}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous or not, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E' ⊆ E which minimizes the generalization error and obtains better, or at least the same, detection performance compared to E. Let f_ij (i = 1, 2, ..., m; j = 1, 2, ..., n) be the fitness values
Input: E, the initial ensemble anomaly detector; T, the maximum number of iterations
Output: E', the final ensemble anomaly detector
/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate λ and emigration rate μ for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T - 1
End while
/* Ensemble pruning */
Get the final ensemble of anomaly detectors E* based on the habitats H_i* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).
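The search in Algorithm 1 can be sketched in Python as below. This is a simplified, self-contained interpretation rather than the paper's implementation: a habitat is a bit vector over the base detectors (its SIVs), the HSI is taken as the majority-vote accuracy of the kept subset on a small labeled test set (the paper uses the F-measure of (8)), and the migration rates are linear in the habitat's rank; all detectors, data, and parameter values are illustrative.

```python
import random

def bbo_prune(detectors, test_x, test_y, pop=10, iters=30, eta=0.05, seed=1):
    """Simplified BBO ensemble pruning (cf. Algorithm 1).
    A habitat is a bit vector (1 = detector kept); its HSI is the
    majority-vote accuracy of the kept subset on (test_x, test_y)."""
    rng = random.Random(seed)
    n = len(detectors)

    def hsi(bits):  # fitness of the pruned ensemble encoded by `bits`
        kept = [d for b, d in zip(bits, detectors) if b]
        if not kept:
            return 0.0
        pred = [int(sum(d(x) for d in kept) >= len(kept) / 2.0) for x in test_x]
        return sum(p == y for p, y in zip(pred, test_y)) / len(test_y)

    habitats = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(iters):
        habitats.sort(key=hsi, reverse=True)       # best habitat first
        elites = [h[:] for h in habitats[:2]]      # elitism (rho = 2)
        lam = [(k + 1) / pop for k in range(pop)]  # immigration: worse rank -> higher
        mu = [1.0 - l for l in lam]                # emigration: better rank -> higher
        for i in range(pop):
            for s in range(n):
                if rng.random() < lam[i]:          # migration of one SIV
                    j = rng.choices(range(pop), weights=mu)[0]
                    habitats[i][s] = habitats[j][s]
                if rng.random() < eta:             # mutation of one SIV
                    habitats[i][s] = rng.randint(0, 1)
        habitats[-2:] = elites                     # keep the elites alive
    best = max(habitats, key=hsi)
    return [d for b, d in zip(best, detectors) if b], hsi(best)

# Hypothetical base detectors (only the first is informative) and labels.
dets = [lambda x: int(x > 5), lambda x: int(x > 100), lambda x: int(x < -100)]
xs = [1, 2, 6, 7, 3, 8]
ys = [0, 0, 1, 1, 0, 1]
pruned, score = bbo_prune(dets, xs, ys)
print(len(pruned), score)
```

The elitism step mirrors the paper's elitism parameter (ρ = 2) and guarantees that the best habitat found so far is never lost to migration or mutation.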
of the detection performance, such as the true positive rate, false positive rate, and accuracy. Obviously, the fitness value matrix F can be defined as (4) based on the results on the testing data:
F = [ f_11  f_12  ···  f_1n
      f_21  f_22  ···  f_2n
      ···   ···   ···  ···
      f_m1  f_m2  ···  f_mn ].   (4)
The final fitness function can be defined as

Maximize  Σ^{N'} f_ij,  i = 1, ..., m, j = 1, ..., n,
s.t.  N' ≤ m * n,   (5)

where the sum runs over the N' selected detectors.
Here, the problem of ensemble pruning is to find the subset E', which is composed of some of the single detectors. Finding the optimized subset requires considerable and delicate computation resources. Biogeography-based optimization (BBO) is a novel optimization method and is employed here to find an acceptable ensemble subset. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].
BBO is a population-based global optimization method which shares some common characteristics with existing evolutionary algorithms (EAs), such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain and obtain an optimal/suboptimal solution, some operators are employed to share information among solutions, which makes BBO applicable to many problems to which GA and PSO are applied. The more distinctive differences between BBO and other EAs can be seen in [27, 28].
The pseudocode of ensemble pruning based on BBO is described in Algorithm 1 [7]. Here, H indicates a habitat, the HSI is its fitness, and an SIV (suitability index variable) is a solution feature.
3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main reason for quick energy depletion is the radio communication among the sensor nodes. It is known that the cost of communicating one bit equals the cost of processing thousands of bits in sensors [35]. This means that most energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity will decrease the power requirement and eventually lengthen the lifetime of the whole WSN.
It is obvious that the aforementioned method has a relatively high communication overhead. Each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each
Input: E', the current pruned ensemble anomaly detector; p, the sampling probability
Output: E*, the updated pruned ensemble anomaly detector
For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E', T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E', p).
member sensor node. In order to relieve the communication burden, some techniques are used to reduce the communication overhead.
In fact, the distributed training/learning method only transmits the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared to centralized anomaly detection manners that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element p_ij, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector. A detector is included in or excluded from the ensemble detector depending on the value of the corresponding bit; that is, 1 denotes that this single detector is included in the final ensemble and 0 means it is not included:

p_ij = 1 if AD_ij ∈ E' (i = 1, ..., m; j = 1, ..., n), and p_ij = 0 otherwise,
         1   2  ···  i-1   i   i+1  ···   n
   S_1 [ 0   1  ···   0    1    1   ···   1 ]
P = S_2 [ 1   0  ···   1    1    0   ···   0 ]
    ·  [ ·   ·  ···   ·    ·    ·   ···   · ]
   S_m [ 0   0  ···   1    1    1   ···   1 ].   (6)
After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after the ensemble pruning is finished, N' (N' ≤ n*m) individual detectors are broadcast in the cluster. If matrix P is not used, this needs 4*N'*d bytes of communication (supposing that an individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of matrix P needs only 1 bit to represent an individual detector; consequently, only m*n/8 bytes are required for the broadcast. Suppose that one-third of the individual detectors are pruned (i.e., N' = 2*n*m/3); then (4*n*m*d*2/3)/(m*n/8) ≈ 21.33d. By introducing the ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
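The bit-coding of the state matrix P in (6), and the byte-cost comparison above, can be sketched as follows (the 4-bytes-per-parameter and d-parameters-per-detector figures are the assumptions used in the text; the matrix values are illustrative):

```python
def pack_state_matrix(P):
    """Pack an m-by-n 0/1 state matrix row by row into a byte string,
    one bit per individual detector (eq. (6))."""
    bits = [b for row in P for b in row]
    out = bytearray()
    for k in range(0, len(bits), 8):
        chunk = bits[k:k + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)  # left-pad the last partial byte
        out.append(byte)
    return bytes(out)

def broadcast_cost_bytes(m, n, d, pruned_size, use_state_matrix):
    """Compare broadcasting N' detectors (4 * N' * d bytes) with
    broadcasting the packed state matrix (m * n / 8 bytes)."""
    if use_state_matrix:
        return (m * n + 7) // 8
    return 4 * pruned_size * d

# 4 nodes, 20 detectors each, d = 5 parameters; one-third pruned.
m, n, d = 4, 20, 5
n_kept = 2 * n * m // 3
print(broadcast_cost_bytes(m, n, d, n_kept, False))  # plain broadcast
print(broadcast_cost_bytes(m, n, d, n_kept, True))   # state-matrix broadcast
```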
3.3.4. Online Update and Relearning. The distribution of the sensed dataset may change over time, so detector updating is necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can cater to this situation and save computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value; for example, if the dynamics are relatively stationary, a small p should be used; otherwise, a large p should be chosen. When the buffer of a sensor node is completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode of the algorithm is described in Algorithm 2.
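Algorithm 2's delayed updating can be sketched as follows: each new observation is retained with sampling probability p, and retraining fires only once the buffer has been completely refilled (the buffer size, p, and the data stream are illustrative):

```python
import random

class DelayedUpdater:
    """Delayed updating (cf. Algorithm 2): keep each new observation with
    probability p; once `capacity` samples have been kept, the buffer is
    considered fully replaced and retraining is triggered."""
    def __init__(self, capacity, p, seed=0):
        self.capacity, self.p = capacity, p
        self.buffer = []
        self.rng = random.Random(seed)
        self.retrain_count = 0

    def observe(self, x):
        if self.rng.random() < self.p:
            self.buffer.append(x)
        if len(self.buffer) >= self.capacity:
            self.retrain_count += 1   # stand-in for retraining + BBO re-pruning
            self.buffer.clear()

upd = DelayedUpdater(capacity=50, p=0.2)
for t in range(1000):
    upd.observe(18.0 + 0.01 * t)
print(upd.retrain_count)  # roughly 1000 * 0.2 / 50 = 4 retrainings expected
```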
4. Experiments and Analysis

In this section, the dataset, data preprocessing method, experimental results, and analysis are described, respectively. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 at 2.13 GHz and 4 GB memory. The operating system is Windows 7 Professional. The data processing was partly done in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] is used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Research Laboratory at Berkeley and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. This network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
[Figure 3: Sensor nodes location in the IBRL deployment. Nodes 1–54 are placed on the laboratory floor plan, which includes office, conference, kitchen, copy, storage, server, lab, and quiet areas.]
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measured data, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except light, which had sudden changes due to the irregular nature of this variable and frequent on/off operations), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, a practice widely used in the literature [41].
Since the proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and a dataset (collected on 29/02/2004) are chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 00:00:00 am–07:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.
From Figure 4, an obvious fact is that the data distributions in a cluster are almost the same, which confirms that spatial correlation exists. There are some trivial differences; after analyzing the dataset carefully, the main reason turns out to be that the dataset has some missing data points, largely due to packet loss, which can be further observed in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an interesting event may have occurred.
Suppose that D = {x_i, y_i}, i = 1, 2, ..., n, is a dataset used to train an anomaly detector. Here, x_i is a vector of feature values and y_i is the label which indicates whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalous data points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution very different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfault sensor node. The anomalies were generated using a normal randomizer with statistical characteristics deviating slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
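The anomaly injection described above can be sketched as follows; the mean shift and variance scale are illustrative stand-ins for "slightly deviating" statistics:

```python
import random
import statistics

def inject_anomalies(data, count, mean_shift=0.2, var_scale=1.1, seed=7):
    """Append `count` artificial anomalies drawn from a normal randomizer
    whose mean/variance deviate slightly from those of the normal data."""
    rng = random.Random(seed)
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    anomalies = [rng.gauss(mu + mean_shift, sigma * var_scale)
                 for _ in range(count)]
    labels = [0] * len(data) + [1] * count   # 1 marks an injected anomaly
    return data + anomalies, labels

normal_rng = random.Random(3)
normal = [18.0 + normal_rng.gauss(0, 0.5) for _ in range(100)]
mixed, labels = inject_anomalies(normal, 30)
print(len(mixed), sum(labels))  # 130 samples, 30 of them anomalous
```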
4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted in this paper, namely, the detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR). They are defined as follows:
ACC = (TP + TN) / (TP + TN + FP + FN),
[Figure 4: The data (temperature, humidity) trend during 00:00:00 am–07:59:59 am on February 29, 2004, for nodes N7–N10: (a) temperature (roughly 17–20); (b) humidity (roughly 38–46).]
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004.

Node   Initial sample   Mean (T/H)          Variance (T/H)    Injected anomaly   Mean (T/H)      Variance (T/H)
N7     823              18.4154/40.9176     0.5238/1.4494     30                 18.21/41.10     0.54/1.46
N8     548              17.9844/41.7123     0.5315/1.4612     30                 17.75/41.95     0.55/1.48
N9     652              18.1140/42.6295     0.5288/1.4827     30                 18.35/42.45     0.55/1.50
N10    620              18.1144/42.6215     0.5244/1.4191     30                 18.33/42.47     0.54/1.43

T: temperature; H: humidity.
TPR = TP / (TP + FN),
FPR = FP / (FP + TN),   (7)
where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
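The metrics of (7) follow directly from these four counts; a minimal sketch:

```python
def detection_metrics(pred, truth):
    """Compute ACC, TPR, and FPR from predicted and true labels
    (1 = anomalous, 0 = normal), as defined in (7)."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return acc, tpr, fpr

acc, tpr, fpr = detection_metrics([1, 0, 1, 0, 0], [1, 0, 0, 0, 1])
print(acc, tpr, fpr)  # -> 0.6 0.5 0.333...
```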
BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows.
The habitat (population) size is S = 30; the number of SIVs (suitability index variables) in each island is n = 20, 40, 60, 80; the maximum migration rates are E = 1 and I = 1; the mutation rate is η = 0.01; λ and μ are the immigration rate and the emigration rate, respectively; and the elitism parameter is ρ = 2.
The HSI (habitat suitability index) is a fitness function similar to those of other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:

F-measure = (1 + β²) * precision * recall / (β² * precision + recall)
          = (1 + β²) * TP / ((1 + β²) * TP + β² * FN + FP).   (8)

The F-measure can be interpreted as a weighted average of the precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision and recall, β = 0.5, 1, 2, .... Usually, the value of the F-measure is close to the smaller of the precision and recall; that is, a large F-measure means that the precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
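The HSI evaluation of (8) can be sketched as:

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure of eq. (8), used as a habitat's HSI; beta balances
    precision against recall (the paper specifies beta = 1)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

print(f_measure(tp=8, fp=2, fn=2))  # 2*8 / (2*8 + 2 + 2) = 0.8
```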
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have
Table 2: Detection performance of the local ensemble detector.

Ensemble size   N7 (ACC/TPR/FPR)        N8 (ACC/TPR/FPR)        N9 (ACC/TPR/FPR)        N10 (ACC/TPR/FPR)
5               0.8700/0.5833/0.1181    0.7900/0.3333/0.1809    0.8267/0.5000/0.1549    0.8267/0.5714/0.1608
10              0.8800/0.6667/0.1111    0.8033/0.3889/0.1702    0.8267/0.4375/0.1514    0.8333/0.6429/0.1573
15              0.8900/0.7500/0.1042    0.8167/0.5000/0.1631    0.8433/0.5000/0.1373    0.8600/0.7143/0.1329
20              0.8933/0.8333/0.1042    0.8200/0.5000/0.1596    0.8367/0.5000/0.1444    0.8567/0.7143/0.1364
Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size   N7 (ACC/TPR/FPR)        N8 (ACC/TPR/FPR)        N9 (ACC/TPR/FPR)        N10 (ACC/TPR/FPR)
20                       0.9467/0.8333/0.0486    0.9300/0.7778/0.0603    0.9467/0.7500/0.0423    0.9500/0.7857/0.0420
40                       0.9700/0.7500/0.0208    0.9433/0.8333/0.0496    0.9710/0.8938/0.0246    0.9650/0.8929/0.0315
60                       0.9700/0.8333/0.0243    0.9733/0.8889/0.0213    0.9800/0.9375/0.0176    0.9783/0.9357/0.0196
80                       0.9817/0.9583/0.0174    0.9800/0.9444/0.0177    0.9767/0.9375/0.0211    0.9780/0.9714/0.0217
been widely used for classification problems; they separate the data belonging to different classes by fitting a hyperplane. The one-class SVM, as a variation of this method, is especially favored for anomaly detection [42–44]. In this paper, it is used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.

Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim to achieve two goals: firstly, to prove the effectiveness of the proposed method based on ensemble learning theory; secondly, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared to the initial ensemble detector while mitigating the resource requirement. As a result, three experiments were done, that is, the local ensemble anomaly detector considering only the temporal correlation of each sensor node, the global ensemble anomaly detector considering the spatiotemporal correlation, and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
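The paper's base detector is a one-class SVM; to keep this sketch dependency-free, a simple Gaussian-threshold one-class detector stands in for it below, trained on roughly the first 66% of the data with the remainder held out, mirroring the split described above. The 3-sigma threshold and the data values are illustrative, not the paper's.

```python
import statistics

def train_base_detector(train, k=3.0):
    """One-class base-detector stand-in: flag observations farther
    than k standard deviations from the training mean as anomalous."""
    mu = statistics.mean(train)
    sigma = statistics.stdev(train)
    return lambda x: int(abs(x - mu) > k * sigma)

data = [18.0, 18.2, 18.1, 17.9, 18.3, 18.0, 18.1, 18.2, 17.8, 30.0]
split = int(0.66 * len(data))          # ~66% for training
detector = train_base_detector(data[:split])
print([detector(x) for x in data[split:]])  # the outlying 30.0 is flagged
```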
Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Though the ensemble detection performance gradually improves with increasing ensemble size (the higher the ACC and TPR, the better; the lower the FPR, the better), the overall performance is relatively low. The maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detector was trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performances are higher than those presented in Table 2. With the help of the neighbor detectors, the detection results become better and better with increasing ensemble size.
In order to further optimize the proposed algorithm's performance and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detection performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only to validate the method's effectiveness, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic, decided by many factors, such as the computation capability, the communication cost, and the memory usage of the sensor node, as well as the expected detection accuracy requirement. In practical applications, a trade-off is commonly considered.
5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include some diversity
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned)   N7 (ACC/TPR/FPR)        N8 (ACC/TPR/FPR)        N9 (ACC/TPR/FPR)        N10 (ACC/TPR/FPR)
14                           0.9480/0.8000/0.0458    0.9327/0.7667/0.0567    0.9500/0.8125/0.0423    0.9533/0.8571/0.0420
23                           0.9710/0.7750/0.0208    0.9447/0.8000/0.0461    0.9733/0.9250/0.0239    0.9697/0.9143/0.0276
27                           0.9713/0.8500/0.0236    0.9683/0.8333/0.0230    0.9810/0.9563/0.0176    0.9797/0.9357/0.0182
32                           0.9820/0.9750/0.0177    0.9750/0.8333/0.0160    0.9820/0.9500/0.0162    0.9830/0.9786/0.0168
Table 5: Rate of saved resource cost based on the BBO-pruned global ensemble detector.

Number   Initial ensemble size   Pruned ensemble size   Saved resource cost
1        20                      14                     30%
2        40                      23                     42.5%
3        60                      27                     55%
4        80                      32                     60%
measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main reason for the quick energy depletion of sensor nodes, especially for the cluster head, the adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Segui, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
Figure 2: Distributed ensemble anomaly detection method based on BBO pruning in WSNs. (Diagram: each member node MN1, MN2, ..., MNm inputs and preprocesses its training data, learns from it, and outputs a local ensemble detector, which is broadcast to the cluster head node (CH); the CH receives the local ensemble detectors, aggregates them into the global initial ensemble detector, applies BBO ensemble pruning with a bit-coding mechanism, and broadcasts the resulting state matrix back to the member nodes.)
Step 7. Once the updating condition is activated, the procedure of retraining and detector updating is triggered.

This method scales well as the number of nodes in the WSN increases, owing to its distributed processing nature. It has low communication requirements and does not need to transmit any actual observations between the cluster head node and its member sensor nodes, which saves communication resources significantly.

Next, we describe the important procedures mentioned above in detail. Considering the resource constraints of each sensor node in WSNs, some tricks are also designed to save communication and memory.
3.3.1. Building the Initial Ensemble Detector. An initial ensemble detector is constructed in two steps. First, a number of base detectors are trained sequentially for each sensor node in a cluster (including the cluster head node itself) on the historical dataset. Because the data distribution may change over time, a previously trained detector may be useless for future detection; moreover, the limited memory of a sensor node makes it impossible to store too many previous detectors. In practice, according to the available memory, only the latest detectors are kept to build the initial local ensemble for one sensor node. For example, for sensor node i, the sensed data is collected and divided into data chunks based on a time interval Δt, which is determined by the actual monitoring process; consequently, each node trains multiple individual detectors over time. In this paper, supposing the n latest detectors are kept for each sensor node and there are m nodes in one cluster, then in total n*m detectors are obtained for the initial ensemble. Second, each sensor node (including the cluster head node) broadcasts its n trained detectors within the cluster. Taking the cluster head as an example, after all n*(m-1) individual detectors are received from its member nodes, the cluster head combines them with its own n trained detectors, and the initial ensemble (including n*m
individual detectors) is built in the cluster head node.

Many techniques can be employed to combine the results of the individual detectors into the final detection result. The methods commonly used in the literature are majority voting (for classification problems) and weighted averaging (for regression problems). In this paper the final ensemble detection result is calculated by (3), where $w_i$ denotes the weight coefficient; $w_i = 1$ gives the simple average, and otherwise a weighted average. For simplicity, the simple-average strategy is employed here to combine the final result:

$$y_{\mathrm{fin}}(x) = \frac{1}{n m} \sum_{i=1}^{n m} y_i(x)\, w_i. \quad (3)$$
3.3.2. Ensemble Pruning Based on BBO Search. To mitigate the expensive communication cost and high memory requirement induced by ensemble learning, and inspired by the principle of "many could be better than all" in the ensemble learning community, ensemble pruning is necessary.

Given an initial ensemble anomaly detector E = {AD_1, AD_2, ..., AD_{n*m}}, where AD_i is a trained anomaly detector that can test whether an observation is anomalous or not, a combination method C, and a test dataset T, the goal of ensemble pruning is to find an optimal/suboptimal subset E′ ⊆ E which minimizes the generalization error and obtains better, or at least the same, detection performance compared with E. Let f_ij (i = 1, 2, ..., m; j = 1, 2, ..., n) be the fitness values
Input: E: initial ensemble anomaly detector; T: number of maximization iterations
Output: E′: final ensemble anomaly detector

/* BBO parameter initialization */
Create a random set of habitats (populations) H_1, H_2, ..., H_N
Compute the corresponding fitness, that is, the HSI values
/* Optimization search process */
While (T)
    Compute the immigration rate λ_i and emigration rate μ_i for each habitat based on its HSI
    /* Migration */
    Select H_i with probability based on λ_i
    If H_i is selected
        Select H_j with probability based on μ_j
        If H_j is selected
            Randomly select an SIV from H_j
            Replace a random SIV in H_i with the one from H_j
        End if
    End if
    /* Mutation */
    Select an SIV in H_i with probability based on the mutation rate η
    If H_i(SIV) is selected
        Replace H_i(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T - 1
End while
/* Ensemble pruning */
Get the final ensemble anomaly detector E* based on the habitats H_i* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).
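For concreteness, Algorithm 1 can be prototyped in plain Python. The sketch below is a simplified stand-in, not the authors' implementation: a habitat is a bit vector over the n*m base detectors, `hsi` is a caller-supplied fitness (e.g., the F-measure on a validation set), and the linear migration-rate model of [28] is assumed. The toy fitness at the end is invented purely for illustration.

```python
import random

def bbo_prune(n_detectors, hsi, pop_size=10, iters=50, mut_rate=0.01, seed=1):
    """Return the best habitat (bit vector selecting detectors) found by BBO."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_detectors)]
           for _ in range(pop_size)]
    for _ in range(iters):
        ranked = sorted(pop, key=hsi, reverse=True)          # best habitat first
        # Linear migration model: good habitats emigrate, poor ones immigrate.
        lam = [(i + 1) / pop_size for i in range(pop_size)]  # immigration rates
        mu = [1.0 - l for l in lam]                          # emigration rates
        new_pop = []
        for i, hab in enumerate(ranked):
            child = hab[:]
            for s in range(n_detectors):
                if rng.random() < lam[i]:                    # migrate this SIV
                    j = rng.choices(range(pop_size), weights=mu)[0]
                    child[s] = ranked[j][s]
                if rng.random() < mut_rate:                  # mutate this SIV
                    child[s] = rng.randint(0, 1)
            new_pop.append(child)
        new_pop[-1] = ranked[0][:]                           # keep the elite
        pop = new_pop
    return max(pop, key=hsi)

# Toy fitness: reward keeping detectors 0-3, penalize ensemble size.
good = {0, 1, 2, 3}
hsi = lambda h: sum(h[i] for i in good) - 0.1 * sum(h)
best = bbo_prune(12, hsi)
```

The elitism step mirrors the paper's elitism parameter; in the real system, `hsi` would evaluate each candidate sub-ensemble on held-out test data.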
of the detection performance, such as the true positive rate, false positive rate, and accuracy. The fitness value matrix F can then be defined as (4) based on the results on the testing data:

$$F = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{bmatrix}. \quad (4)$$
The final fitness function can be defined as (5), where the sum runs over the N′ detectors selected for the pruned ensemble:

$$\text{Maximize} \sum_{\substack{i=1,\dots,m \\ j=1,\dots,n}}^{N'} f_{ij}, \qquad \text{s.t. } N' \le m n. \quad (5)$$
Here the problem of ensemble pruning is to find the subset E′ composed of part of the single detectors. Finding the optimized subset requires heavy and delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed here to find an acceptable ensemble subset. We present only the key information about BBO; the interested reader is referred to the detailed description in [28].

BBO is a population-based global optimization method sharing some characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain for an optimal/suboptimal solution, operators are employed to share information among candidate solutions, which makes BBO applicable to many problems where GA and PSO are used. The distinctive differences between BBO and other EAs are discussed in [27, 28].
The pseudocode of ensemble pruning based on BBO is shown in Algorithm 1 [7]. Here H indicates a habitat, HSI is its fitness, and an SIV (suitability index variable) is a solution feature.
3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of quick energy depletion is radio communication among the sensor nodes. It is known that communicating one bit costs as much energy as processing thousands of bits in a sensor [35]; that is, most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication quantity decreases the power requirement and eventually lengthens the lifetime of the whole WSN.

The aforementioned method obviously has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each
Input: E′: current pruned ensemble anomaly detector; p: sampling probability
Output: E*: updated pruned ensemble anomaly detector

For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E′, T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection
    End if
End for

Algorithm 2: Online_Updating(E′, p).
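A minimal sketch of the delay-updating rule in Algorithm 2 follows, assuming a fixed-size buffer per node; the class name, buffer size, and sampling probability are illustrative choices, not values from the paper.

```python
import random

class DelayedUpdater:
    """Retain new observations with probability p; signal a retrain once the
    whole buffer has been replaced (the delay-updating strategy of [36])."""
    def __init__(self, buffer_size, p, seed=0):
        self.buffer = [None] * buffer_size
        self.replaced = [False] * buffer_size
        self.p = p
        self.rng = random.Random(seed)

    def observe(self, x):
        """Feed one observation; return True when retraining should fire."""
        if self.rng.random() < self.p:                 # keep with probability p
            i = self.rng.randrange(len(self.buffer))
            self.buffer[i] = x
            self.replaced[i] = True
        if all(self.replaced):                         # buffer fully refreshed
            self.replaced = [False] * len(self.buffer)
            return True    # caller retrains its detector and re-runs pruning
        return False

updater = DelayedUpdater(buffer_size=8, p=0.3)
retrains = sum(updater.observe(x) for x in range(500))
print("retrain events over 500 readings:", retrains)
```

Tuning p trades responsiveness for energy, matching the heuristic in the text: a stationary process can use a small p, a drifting one a larger p.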
member sensor node. To relieve this burden, some techniques are used to reduce the communication overhead.
In fact, the distributed training/learning method transmits only the summary information of the trained local ensemble detectors to the cluster head, which already decreases the communication cost significantly compared with centralized anomaly detection, where all training data are sent to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head. A straightforward method is to broadcast the pruned ensemble itself to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Instead, a state matrix P is designed in the cluster head, whose element p_ij, defined by (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector: the detector is included in or excluded from the final ensemble depending on the value of the corresponding bit, that is, 1 denotes that the single detector is included and 0 that it is not:

$$p_{ij} = \begin{cases} 1, & \mathrm{AD}_{ij} \in E', \\ 0, & \text{otherwise}, \end{cases} \qquad i = 1, \dots, m, \; j = 1, \dots, n,$$

$$P = \begin{array}{c|cccccccc} & 1 & 2 & \cdots & i-1 & i & i+1 & \cdots & n \\ \hline S_1 & 0 & 1 & \cdots & 0 & 1 & 1 & \cdots & 1 \\ S_2 & 1 & 0 & \cdots & 1 & 1 & 0 & \cdots & 0 \\ \vdots & & & & & & & & \\ S_m & 0 & 0 & \cdots & 1 & 1 & 1 & \cdots & 1 \end{array} \quad (6)$$
After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix can save energy greatly. For example, after ensemble pruning, N′ (N′ ≤ n*m) individual detectors are to be broadcast in the cluster. Without matrix P, this needs 4*N′*d bytes of communication (supposing each individual detector can be represented by d parameters and each parameter needs at least 4 bytes). With matrix P, each item needs only 1 bit to represent an individual detector, so only m*n/8 bytes are required. Supposing one-third of the individual detectors are pruned (i.e., N′ = 2*n*m/3), then (4*n*m*d*2/3)/(m*n/8) ≈ 21.33d. By introducing ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
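The saving can be checked numerically. The sketch below packs a hypothetical state matrix row-major into bytes; the sizes m, n, d and the bit pattern are illustrative, and the packing layout is one reasonable choice rather than the paper's exact wire format.

```python
# Sketch of the bit-coding trick: broadcast one bit per detector instead of
# d float parameters per detector. All sizes below are illustrative.
m, n, d = 4, 20, 5                      # cluster size, detectors/node, params
state = [[1 if (i + j) % 3 else 0 for j in range(n)] for i in range(m)]

# Flatten the matrix row-major and pack 8 bits into each byte.
bits = [b for row in state for b in row]
packed = bytearray()
for k in range(0, len(bits), 8):
    byte = 0
    for b in bits[k:k + 8]:
        byte = (byte << 1) | b
    packed.append(byte)

kept = sum(bits)                        # detectors retained by pruning
naive_cost = 4 * kept * d               # 4 bytes per float parameter
coded_cost = len(packed)                # m*n/8 bytes for the state matrix
print(kept, naive_cost, coded_cost)
```

Even for this small cluster the bit-coded broadcast is two orders of magnitude smaller than resending detector parameters, consistent with the ≈21.33d ratio derived above.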
3.3.4. Online Update and Relearning. The distribution of the sensed data may change over time, so detector updating is necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy, namely, the delay-updating strategy [36], can cater to this situation and save computation, communication, and memory resources to some extent. Simply put, whether a newly arrived observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value: if the dynamics are relatively stationary, a small p should be used; otherwise, a big p should be chosen. When the buffer of a sensor node has been completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode is shown in Algorithm 2.
4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a personal PC with an Intel Core 2 Duo CPU P7450 @ 2.13 GHz and 4 GB of memory, running Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] is used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Berkeley Research Laboratory and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
Figure 3: Sensor node locations in the IBRL deployment. (Floor plan showing nodes 1–54 distributed around the lab, server and storage rooms, kitchen, copy area, conference room, and offices.)
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, namely, light, temperature, humidity, and voltage, were collected and recorded at 31 s intervals. Because the sensors were deployed inside a lab and the measured variables changed little over time (except light, which shows sudden changes due to its irregular nature and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies are created by randomly modifying some observations, a practice widely used in the literature [41].

Since the proposed method adopts the cluster structure, a cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on 29/02/2004 are chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 00:00:00 am–07:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions within the cluster are almost the same, which proves that spatial correlation exists. There are some trivial differences; after analyzing the dataset carefully, the main reason is that the dataset has some missing data points, largely due to packet loss, which can also be seen in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is the sudden peaks/valleys in each sensor's observations in Figure 4, which imply that an event of interest may have occurred.
Suppose that D = {x_i, y_i}, i = 1, 2, ..., n, is the dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points are generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible; besides, an anomalous event should be a small-probability event for a normal dataset collected by a nonfaulty sensor node. The anomalies were generated using a normal randomizer with statistical characteristics slightly deviating from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
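The injection step can be sketched as follows. The normal statistics echo node N7's temperature in Table 1, while the mean shift and variance inflation applied to the anomalies are assumed values; the paper specifies only "slightly deviated" statistics [41].

```python
import random

# Sketch of anomaly injection: draw 30 anomalies from a normal distribution
# whose statistics deviate slightly from the normal data. The +0.8 mean
# shift and 1.5x variance are assumed values, not taken from the paper.
rng = random.Random(42)
mean_n, var_n = 18.4154, 0.5238              # N7 temperature stats (Table 1)
mean_a, var_a = mean_n + 0.8, var_n * 1.5    # slightly deviated anomaly stats

normal = [rng.gauss(mean_n, var_n ** 0.5) for _ in range(823)]
anomalies = [rng.gauss(mean_a, var_a ** 0.5) for _ in range(30)]

data = normal + anomalies                    # anomalies appended consecutively
labels = [0] * len(normal) + [1] * len(anomalies)
print(len(data), sum(labels))
```

Because the two distributions overlap, the injected points are not trivially separable, which keeps the TPR/FPR evaluation meaningful.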
4.2. Performance Evaluation Metrics and BBO Parameters. To evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely, the detection accuracy (ACC), true positive rate (TPR), and false positive/alarm rate (FPR), defined as follows:
$$\text{ACC} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}},$$
Figure 4: The data trends (temperature in (a), humidity in (b)) of nodes N7, N8, N9, and N10 during 00:00:00 am–07:59:59 am on February 29, 2004.
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial sample | Mean (T / H) | Variance (T / H) | Injected anomaly | Mean (T / H) | Variance (T / H)
N7 | 823 | 18.4154 / 40.9176 | 0.5238 / 1.4494 | 30 | 18.21 / 41.10 | 0.54 / 1.46
N8 | 548 | 17.9844 / 41.7123 | 0.5315 / 1.4612 | 30 | 17.75 / 41.95 | 0.55 / 1.48
N9 | 652 | 18.1140 / 42.6295 | 0.5288 / 1.4827 | 30 | 18.35 / 42.45 | 0.55 / 1.50
N10 | 620 | 18.1144 / 42.6215 | 0.5244 / 1.4191 | 30 | 18.33 / 42.47 | 0.54 / 1.43
$$\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \qquad \text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}}. \quad (7)$$
where TP is the number of samples correctly predicted as the anomaly class, FP the number of samples incorrectly predicted as the anomaly class, TN the number of samples correctly predicted as the normal class, and FN the number of samples incorrectly predicted as the normal class.
BBO is employed to prune the initial ensemble. The migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01, with λ and μ denoting the immigration and emigration rates, respectively; elitism parameter ρ = 2.
HSI (habitat suitability index) is the fitness function, as in other population-based optimization algorithms. Here HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of a binary classification problem:
$$F\text{-measure} = \frac{(1 + \beta^2)\, \text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}} = \frac{(1 + \beta^2)\, \text{TP}}{(1 + \beta^2)\, \text{TP} + \beta^2\, \text{FN} + \text{FP}}. \quad (8)$$
The F-measure can be interpreted as a weighted average of precision and recall; it reaches its best value at 1 and its worst at 0. β is a parameter that adjusts the relative importance of precision versus recall, typically β = 0.5, 1, or 2. Usually the value of the F-measure is close to the smaller of precision and recall, so a big F-measure means that precision and recall are both big. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded. In this paper β = 1 is specified.
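The four metrics in (7) and (8) reduce to a few lines of Python; the confusion counts in the usage line are arbitrary toy numbers, not results from the paper.

```python
# Evaluation metrics of Eqs. (7)-(8) computed from a confusion count.
def metrics(tp, fp, tn, fn, beta=1.0):
    acc = (tp + tn) / (tp + tn + fp + fn)            # detection accuracy
    tpr = tp / (tp + fn)                             # true positive rate
    fpr = fp / (fp + tn)                             # false positive rate
    b2 = beta * beta
    f = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)   # F-measure, Eq. (8)
    return acc, tpr, fpr, f

acc, tpr, fpr, f = metrics(tp=25, fp=10, tn=260, fn=5)
print(round(acc, 4), round(tpr, 4), round(fpr, 4), round(f, 4))
```

With β = 1, as specified in the paper, the F-measure reduces to the harmonic mean of precision and recall.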
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, the SVM-based method has
Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7: ACC / TPR / FPR | N8: ACC / TPR / FPR | N9: ACC / TPR / FPR | N10: ACC / TPR / FPR
5 | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10 | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15 | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20 | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364
Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7: ACC / TPR / FPR | N8: ACC / TPR / FPR | N9: ACC / TPR / FPR | N10: ACC / TPR / FPR
20 | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40 | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60 | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80 | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217
been widely used for classification problems; it separates the data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44] and is used here to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to prove the effectiveness of the proposed method based on ensemble learning theory; second, to prove that the pruned ensemble detector can obtain better, or at least equal, performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were done: the local ensemble anomaly detector, considering only the temporal correlation at each sensor node; the global ensemble anomaly detector, considering the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
Table 2 shows the performance of each sensor node underthe different ensemble size which does not take into accountthe spatial correlation of sensed data in a cluster Though theensemble detection performance is becoming ldquogoodrdquo gradualwith the increasing of ensemble size (the higher value ofACCTPR the better performance and the lower value of FPR thebetter performance) the overall performance is relatively lowThemaximumvalue of detection accuracy is only 8933 andmost of true positive rates are unacceptable and most of falsepositive rates (FPR) have a relative high valueAll these resultsindicate that the performance of local ensemble detectoris poor Table 3 shows the global detection performance ofeach sensor node Here after the local ensemble detectorwas trained each member node sent its local ensembleto each other to form the global ensemble detector andeach member node used this global detector to online testthe local observation From the results of Table 3 [7] an
obvious fact is that the detection performances are higherthan presented in Table 2With the help of neighbor detectorthe detection results become better and better correspondingto the increasing of ensemble size
In order to further optimize the proposed algorithmperformance and save the resource ensemble pruning is usedfor global ensemble detector Table 4 [7] shows the result ofdetection performance of pruned global ensemble detectorbased on BBO
Table 4 shows a more practicable result and the sizeof global ensemble decreases sharply while the detectorperformance is as good as or better than the initial globalensemble detector From the results of Table 5 when thesize of initial ensemble reaches 80 the 60 resource costis saved In our experiment only for validating the methodeffectively we set the ensemble sizes 5 10 15 and 20 for eachlocal ensemble detector which may be small for the practicalapplications In fact how many local ensemble detectors arechosen is an open topic and is decided by many factors suchas the computation capability and the communication cost aswell as memory usage of sensor node the expected detectingaccuracy requirement and so on In the practical applicationa trade-off is commonly considered
5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Because of the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and to obtain an optimized detector that performs at least as well as the original one. Experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, a possible extension of this work is to include diversity measures in the fitness function to further improve detection performance. Besides, since the cost of communication is the main cause of rapid energy depletion in sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be considered in future work to lengthen the lifetime of the WSN.

Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC, TPR, FPR) | N8 (ACC, TPR, FPR) | N9 (ACC, TPR, FPR) | N10 (ACC, TPR, FPR)
14 | 0.9480, 0.8000, 0.0458 | 0.9327, 0.7667, 0.0567 | 0.9500, 0.8125, 0.0423 | 0.9533, 0.8571, 0.0420
23 | 0.9710, 0.7750, 0.0208 | 0.9447, 0.8000, 0.0461 | 0.9733, 0.9250, 0.0239 | 0.9697, 0.9143, 0.0276
27 | 0.9713, 0.8500, 0.0236 | 0.9683, 0.8333, 0.0230 | 0.9810, 0.9563, 0.0176 | 0.9797, 0.9357, 0.0182
32 | 0.9820, 0.9750, 0.0177 | 0.9750, 0.8333, 0.0160 | 0.9820, 0.9500, 0.0162 | 0.9830, 0.9786, 0.0168

Table 5: Rate of resource cost saving for the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Saving of resource cost
1 | 20 | 14 | 30%
2 | 40 | 23 | 42.5%
3 | 60 | 27 | 55%
4 | 80 | 32 | 60%
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research—four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Désir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, São Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millán, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
Input: E — initial ensemble anomaly detector; T — maximum number of iterations
Output: E′ — final ensemble anomaly detector

/* BBO parameter initialization */
Create a random set of habitats (populations) H1, H2, ..., HN
Compute the corresponding fitness (HSI) values
/* Optimization search process */
While (T)
    Compute the immigration rate λ and the emigration rate μ for each habitat based on its HSI
    /* Migration */
    Select Hi with probability based on λi
    If Hi is selected
        Select Hj with probability based on μj
        If Hj is selected
            Randomly select an SIV from Hj
            Replace a random SIV in Hi with one from Hj
        End if
    End if
    /* Mutation */
    Select an SIV in Hi with probability based on the mutation rate η
    If Hi(SIV) is selected
        Replace Hi(SIV) with a randomly generated SIV
    End if
    Recompute the HSI values
    T = T − 1
End while
/* Ensemble pruning */
Obtain the final ensemble anomaly detector E* from the habitats Hi* with acceptable HSI

Algorithm 1: Ensemble_Pruning_BBO(E, T).
of the detection performance, such as the true positive rate, the false positive rate, and the accuracy. Obviously, the fitness value F can be defined as (4), based on the results on the testing data:
F = \begin{bmatrix}
f_{11} & f_{12} & \cdots & f_{1n} \\
f_{21} & f_{22} & \cdots & f_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
f_{m1} & f_{m2} & \cdots & f_{mn}
\end{bmatrix}    (4)
The final fitness function can be defined as
Maximize \sum_{i=1,\dots,m;\ j=1,\dots,n}^{N'} f_{ij}

s.t. N' ≤ m × n    (5)
Here the problem of ensemble pruning is to find the subset E′ composed of part of the single detectors. Finding the optimized subset requires considerable and delicate computation. Biogeography-based optimization (BBO) is a novel optimization method and is employed here to find an acceptable ensemble subset. We only briefly present some key information about BBO; the interested reader is referred to the detailed description in [28].
BBO is a population-based global optimization method that shares some common characteristics with existing evolutionary algorithms (EAs) such as the genetic algorithm (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). When it is used to search the solution domain for an optimal or suboptimal solution, some operators are employed to share information among candidate solutions, which makes BBO applicable to many problems where GA and PSO are used. The distinctive differences between BBO and other EAs are discussed in [27, 28].

The pseudocode of ensemble pruning based on BBO is shown in Algorithm 1 [7]. Here H denotes a habitat, HSI (habitat suitability index) is its fitness, and an SIV (suitability index variable) is a solution feature.
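The pruning loop of Algorithm 1 can be sketched in Python; this is an illustrative re-implementation, not the authors' code, and the `fitness` callback stands in for the HSI evaluation (e.g., the F-measure of the pruned ensemble on held-out data). A linear migration model is assumed: the best-ranked habitat immigrates least and emigrates most.

```python
import random

def bbo_prune(n_detectors, fitness, pop_size=30, iters=50,
              mutation_rate=0.01, elites=2, seed=0):
    """Prune an ensemble with biogeography-based optimization (BBO).

    A habitat is a 0/1 inclusion mask over the base detectors (its SIVs);
    `fitness` maps a mask to an HSI value.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_detectors)]
           for _ in range(pop_size)]
    for _ in range(iters):
        scores = [fitness(h) for h in pop]
        order = sorted(range(pop_size), key=lambda k: scores[k], reverse=True)
        pop = [pop[k] for k in order]                        # best habitat first
        lam = [(k + 1) / pop_size for k in range(pop_size)]  # immigration rates
        mu = [1.0 - l for l in lam]                          # emigration rates
        new_pop = [h[:] for h in pop[:elites]]               # elitism
        for k in range(elites, pop_size):
            child = pop[k][:]
            for s in range(n_detectors):
                if rng.random() < lam[k]:                    # migration step:
                    # pick a donor habitat with probability proportional to mu
                    donor = rng.choices(pop, weights=mu)[0]
                    child[s] = donor[s]
                if rng.random() < mutation_rate:             # mutation step
                    child[s] = 1 - child[s]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)
```

In practice the population size, iteration budget, and mutation rate would be tuned as in Section 4.2; the values above are only defaults for the sketch.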
3.3.3. Some Tricks Designed to Mitigate the Communication Requirement. In WSNs, the main cause of rapid energy depletion is the radio communication among sensor nodes. It is known that communicating one bit costs as much energy as processing thousands of bits in a sensor [35]. This means that most of the energy in a sensor node is consumed by radio communication rather than by collecting or processing data. Consequently, reducing the communication volume decreases the power requirement and ultimately lengthens the lifetime of the whole WSN.
It is obvious that the aforementioned method has a relatively high communication overhead: each sensor node transmits its local ensemble detector to the cluster head, and the final pruned global ensemble detector is broadcast back to each of its
Input: E′ — current pruned ensemble anomaly detector; p — sampling probability
Output: E* — updated pruned ensemble anomaly detector

For each sensor node
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E′, T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online_Updating(E′, p).
member sensor nodes. In order to relieve this communication burden, some techniques are used to reduce the overhead.

In fact, the distributed training/learning method transmits only the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared with centralized anomaly detection schemes that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head. A straightforward method is to broadcast this pruned ensemble to the member sensor nodes; this commonly used strategy, however, does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element p_ij is defined by formula (6) to represent each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector: a bit value of 1 denotes that the single detector is included in the final ensemble, and 0 that it is excluded:
p_{ij} = \begin{cases}
1, & \text{AD}_{ij} \in E',\ i = 1,\dots,m,\ j = 1,\dots,n \\
0, & \text{otherwise}
\end{cases}

P = \begin{array}{c|cccccccc}
       & 1 & 2 & \cdots & i-1 & i & i+1 & \cdots & n \\ \hline
S_1    & 0 & 1 & \cdots & 0   & 1 & 1   & \cdots & 1 \\
S_2    & 1 & 0 & \cdots & 1   & 1 & 0   & \cdots & 0 \\
\vdots & \vdots & \vdots &    & \vdots & \vdots & \vdots & & \vdots \\
S_m    & 0 & 0 & \cdots & 1   & 1 & 1   & \cdots & 1
\end{array}    (6)
After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, after ensemble pruning, N′ (N′ ≤ n × m) individual detectors are broadcast in the cluster. If the matrix P is not used, this requires 4 × N′ × d bytes of communication (supposing that an individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If the matrix P is introduced, each item of P needs only 1 bit to represent an individual detector, so only m × n/8 bytes need to be broadcast. Supposing that one-third of the individual detectors are pruned (i.e., N′ = 2 × n × m/3), then (4 × n × m × d × 2/3)/(m × n/8) ≈ 21.33d. By introducing ensemble pruning and the state matrix, the amount of energy saved in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
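The byte-count argument can be checked with a short sketch; `pack_state_matrix` and `unpack_state_matrix` are hypothetical helpers, not part of the paper's implementation:

```python
def pack_state_matrix(P):
    """Pack an m-by-n 0/1 state matrix into bytes (one bit per detector),
    as broadcast by the cluster head."""
    bits = [b for row in P for b in row]
    out = bytearray((len(bits) + 7) // 8)
    for idx, bit in enumerate(bits):
        if bit:
            out[idx // 8] |= 1 << (7 - idx % 8)
    return bytes(out)

def unpack_state_matrix(data, m, n):
    """Recover the inclusion mask at a member node; detector (i, j) is
    kept in the pruned ensemble iff its bit is 1."""
    bits = [(data[idx // 8] >> (7 - idx % 8)) & 1 for idx in range(m * n)]
    return [bits[i * n:(i + 1) * n] for i in range(m)]

# Cost comparison from the text: broadcasting N' = 2*n*m/3 detectors with
# d parameters of 4 bytes each, versus m*n/8 bytes for the state matrix.
m, n, d = 4, 24, 5
raw_bytes = 4 * (2 * n * m // 3) * d
packed_bytes = m * n / 8
ratio = raw_bytes / packed_bytes   # equals (64/3) * d, i.e. about 21.33d
```

The example sizes (m = 4 nodes, n = 24 detectors, d = 5 parameters) are illustrative assumptions chosen so that n × m is divisible by 3.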
3.3.4. Online Update and Relearning. The distribution of the sensed data may change, making detector updating necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delay-updating strategy [36]) can handle this situation and save computation, communication, and memory resources to some extent. Simply put, whether a newly arriving observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value: for example, if the dynamics are relatively stationary, a small p should be used; otherwise, a large p should be chosen. When the buffer of a sensor node has been completely replaced by new data, the online update is triggered and a new detector is trained. The pseudocode of the algorithm is shown in Algorithm 2.
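A minimal sketch of this delay-updating strategy, with a hypothetical `retrain` callback standing in for the train-and-prune step of Algorithm 2:

```python
import random

class DelayedUpdater:
    """Retain new observations with sampling probability p and trigger
    retraining only once the whole buffer has been replaced
    (the delay-updating strategy of Algorithm 2)."""

    def __init__(self, buffer_size, p, retrain, seed=0):
        self.buffer = []
        self.buffer_size = buffer_size
        self.p = p                    # sampling probability
        self.retrain = retrain        # callback: trains + prunes the ensemble
        self.replaced = 0
        self.rng = random.Random(seed)

    def observe(self, x):
        if self.rng.random() >= self.p:
            return False              # observation discarded
        self.buffer.append(x)
        self.buffer = self.buffer[-self.buffer_size:]
        self.replaced += 1
        if self.replaced >= self.buffer_size:
            self.retrain(self.buffer)  # train new detector, run BBO pruning,
            self.replaced = 0          # broadcast the updated state matrix
            return True
        return False
```

With p = 1 every observation is kept and retraining fires each time the buffer turns over; smaller p delays updates proportionally, matching the heuristic rules above.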
4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a PC with an Intel Core 2 Duo P7450 CPU at 2.13 GHz and 4 GB of memory, running Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used to validate the proposed method. It was collected from a WSN deployed in the Intel Research Laboratory at the University of California, Berkeley, and is commonly used to evaluate existing models for WSNs [35, 36, 38–41]. The network consists of 54 Mica2Dot sensor nodes; Figure 3 shows the location of each node in the deployment (node locations are shown as black hexagons with their corresponding node IDs) [35].

Figure 3: Sensor node locations in the IBRL deployment.

The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a laboratory and the measured variables changed little over time (except light, which shows sudden changes due to its irregular nature and frequent on/off operation), this dataset is considered a static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, artificial anomalies were created by randomly modifying some observations, a practice widely used in the literature [41].
Since the proposed method adopts a cluster structure, a cluster (consisting of four sensor nodes, i.e., N7, N8, N9, and N10) and a dataset (collected on 29/02/2004) were chosen; the data distribution can be seen in [7]. Here only part of the observations (during 0:00:00 am–7:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.

From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which confirms that spatial correlation exists. There are some trivial differences; careful analysis of the dataset shows that the main reason is missing data points, largely due to packet loss, which can also be observed in Figure 4. In our experiments these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an interesting event may have occurred.
Suppose that D = {(x_i, y_i), i = 1, 2, ..., n} is a dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a non-faulty sensor node. The anomalies were generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
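The anomaly injection described above can be sketched as follows; the shift and scale parameters are illustrative assumptions, not the values used in the paper:

```python
import random

def inject_anomalies(data, n_anomalies, mean_shift=0.5, var_scale=1.1, seed=0):
    """Return (observations, labels) with n_anomalies synthetic points
    appended. Anomalies are drawn from a normal distribution whose mean
    and variance deviate slightly from those of the normal data, so the
    value ranges overlap while the distributions differ."""
    rng = random.Random(seed)
    mu = sum(data) / len(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    anomalies = [rng.gauss(mu + mean_shift, (var_scale * var) ** 0.5)
                 for _ in range(n_anomalies)]
    obs = list(data) + anomalies
    labels = [0] * len(data) + [1] * len(anomalies)   # 1 = anomaly
    return obs, labels
```

In the paper's setup, 30 such points per sensor node were injected consecutively into each dataset (see Table 1).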
4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely detection accuracy (ACC), true positive rate (TPR), and false positive (alarm) rate (FPR). They are defined as follows:

ACC = (TP + TN) / (TP + TN + FP + FN)
Figure 4: The data trends during 0:00:00 am–7:59:59 am on February 29, 2004: (a) temperature; (b) humidity.
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial sample | Mean (T, H) | Variance (T, H) | Injected anomaly | Mean (T, H) | Variance (T, H)
N7 | 823 | 18.4154, 40.9176 | 0.5238, 1.4494 | 30 | 18.21, 41.10 | 0.54, 1.46
N8 | 548 | 17.9844, 41.7123 | 0.5315, 1.4612 | 30 | 17.75, 41.95 | 0.55, 1.48
N9 | 652 | 18.1140, 42.6295 | 0.5288, 1.4827 | 30 | 18.35, 42.45 | 0.55, 1.50
N10 | 620 | 18.1144, 42.6215 | 0.5244, 1.4191 | 30 | 18.33, 42.47 | 0.54, 1.43
TPR = TP / (TP + FN)
FPR = FP / (FP + TN)    (7)

where TP is the number of samples correctly predicted as the anomaly class, FP the number of samples incorrectly predicted as the anomaly class, TN the number of samples correctly predicted as the normal class, and FN the number of samples incorrectly predicted as the normal class.
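These three metrics follow directly from the confusion-matrix counts; a small helper for reference:

```python
def detection_metrics(tp, tn, fp, fn):
    """ACC, TPR, and FPR from confusion-matrix counts, as in (7)."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall detection accuracy
    tpr = tp / (tp + fn)                    # true positive (detection) rate
    fpr = fp / (fp + tn)                    # false positive (alarm) rate
    return acc, tpr, fpr
```

For example, with TP = 20, TN = 250, FP = 25, and FN = 5 (illustrative counts, not taken from the experiments), ACC = 0.9 and TPR = 0.8.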
BBO is employed to prune the initial ensemble. The migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01, where λ and μ are the immigration rate and the emigration rate, respectively; elitism parameter ρ = 2.
HSI (habitat suitability index) is a fitness function, similar to those of other population-based optimization algorithms. HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of a binary classification problem:

F-measure = ((1 + β²) · precision · recall) / (β² · precision + recall)
          = ((1 + β²) · TP) / ((1 + β²) · TP + β² · FN + FP)    (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter that adjusts the relative importance of precision and recall (typically β = 0.5, 1, 2). Usually the value of the F-measure is close to the smaller of precision and recall; that is, a large F-measure means that precision and recall are both large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper β = 1 is used.
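Equation (8) in code form, for checking HSI values:

```python
def f_measure(tp, fn, fp, beta=1.0):
    """F-measure from confusion counts, per (8); beta trades off recall
    against precision (beta = 1 gives the balanced F-score used in the
    paper)."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
```

With TP = 8, FN = 2, FP = 2, and β = 1, both precision and recall are 0.8, and the F-measure is 0.8 as well, matching the equivalent precision/recall form of (8).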
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, the SVM-based method has
Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC, TPR, FPR) | N8 (ACC, TPR, FPR) | N9 (ACC, TPR, FPR) | N10 (ACC, TPR, FPR)
5 | 0.8700, 0.5833, 0.1181 | 0.7900, 0.3333, 0.1809 | 0.8267, 0.5000, 0.1549 | 0.8267, 0.5714, 0.1608
10 | 0.8800, 0.6667, 0.1111 | 0.8033, 0.3889, 0.1702 | 0.8267, 0.4375, 0.1514 | 0.8333, 0.6429, 0.1573
15 | 0.8900, 0.7500, 0.1042 | 0.8167, 0.5000, 0.1631 | 0.8433, 0.5000, 0.1373 | 0.8600, 0.7143, 0.1329
20 | 0.8933, 0.8333, 0.1042 | 0.8200, 0.5000, 0.1596 | 0.8367, 0.5000, 0.1444 | 0.8567, 0.7143, 0.1364
Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC, TPR, FPR) | N8 (ACC, TPR, FPR) | N9 (ACC, TPR, FPR) | N10 (ACC, TPR, FPR)
20 | 0.9467, 0.8333, 0.0486 | 0.9300, 0.7778, 0.0603 | 0.9467, 0.7500, 0.0423 | 0.9500, 0.7857, 0.0420
40 | 0.9700, 0.7500, 0.0208 | 0.9433, 0.8333, 0.0496 | 0.9710, 0.8938, 0.0246 | 0.9650, 0.8929, 0.0315
60 | 0.9700, 0.8333, 0.0243 | 0.9733, 0.8889, 0.0213 | 0.9800, 0.9375, 0.0176 | 0.9783, 0.9357, 0.0196
80 | 0.9817, 0.9583, 0.0174 | 0.9800, 0.9444, 0.0177 | 0.9767, 0.9375, 0.0211 | 0.9780, 0.9714, 0.0217
been widely used for classification problems, separating the data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44]; in this paper it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder served as the test set to evaluate the proposed method.
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to demonstrate the effectiveness of the proposed method based on ensemble learning theory; second, to show that the pruned ensemble detector can obtain better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were performed: the local ensemble anomaly detector, which considers only the temporal correlation of each sensor node; the global ensemble anomaly detector, which considers the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
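Online Bagging (Oza–Russell) approximates bootstrap resampling in a stream by showing each new observation to every base learner a Poisson(1)-distributed number of times. A sketch, assuming base detectors that expose an incremental `update` method (an assumption for illustration; the paper's one-class SVM detectors are trained on buffered data):

```python
import math
import random

def poisson1(rng):
    """Knuth's method for Poisson(lambda = 1) samples."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def online_bagging_update(detectors, x, rng):
    """One step of online bagging: each base detector sees the new
    observation x a Poisson(1)-distributed number of times."""
    for det in detectors:
        for _ in range(poisson1(rng)):
            det.update(x)
```

On average each detector sees each observation once, but the per-detector variation in counts reproduces the diversity that batch bagging gets from bootstrap sampling.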
Table 2 shows the performance of each sensor node underthe different ensemble size which does not take into accountthe spatial correlation of sensed data in a cluster Though theensemble detection performance is becoming ldquogoodrdquo gradualwith the increasing of ensemble size (the higher value ofACCTPR the better performance and the lower value of FPR thebetter performance) the overall performance is relatively lowThemaximumvalue of detection accuracy is only 8933 andmost of true positive rates are unacceptable and most of falsepositive rates (FPR) have a relative high valueAll these resultsindicate that the performance of local ensemble detectoris poor Table 3 shows the global detection performance ofeach sensor node Here after the local ensemble detectorwas trained each member node sent its local ensembleto each other to form the global ensemble detector andeach member node used this global detector to online testthe local observation From the results of Table 3 [7] an
obvious fact is that the detection performances are higherthan presented in Table 2With the help of neighbor detectorthe detection results become better and better correspondingto the increasing of ensemble size
In order to further optimize the proposed algorithmperformance and save the resource ensemble pruning is usedfor global ensemble detector Table 4 [7] shows the result ofdetection performance of pruned global ensemble detectorbased on BBO
Table 4 shows a more practicable result and the sizeof global ensemble decreases sharply while the detectorperformance is as good as or better than the initial globalensemble detector From the results of Table 5 when thesize of initial ensemble reaches 80 the 60 resource costis saved In our experiment only for validating the methodeffectively we set the ensemble sizes 5 10 15 and 20 for eachlocal ensemble detector which may be small for the practicalapplications In fact how many local ensemble detectors arechosen is an open topic and is decided by many factors suchas the computation capability and the communication cost aswell as memory usage of sensor node the expected detectingaccuracy requirement and so on In the practical applicationa trade-off is commonly considered
5 Conclusion and Future Work
After exploiting the spatiotemporal correlation existing inthe sensed data of WSNs and motivated by the advantagesof online ensemble learning a distributed online ensembleanomaly detector method has been proposed Due to thespecific resource constrained in theWSNs ensemble pruningbased on BBO is employed to mitigate the high resourcerequirement and obtain the optimized detector that performsat least as good as the original ones The experimental resultson real dataset demonstrated that our proposed method iseffective
Because the diversity of base learners is a key factorrelated to the performance of ensemble learning as a possibleextension of our work we plan to include some diversity
International Journal of Distributed Sensor Networks 11
Table 4 Detection performance of global ensemble detector based on BBO pruning [7]
Ensemble size(BBO pruned)
N7 N8 N9 N10ACC TPR FPR ACC TPR FPR ACC TPR FPR ACC TPR FPR
14 09480 08000 00458 09327 07667 00567 09500 08125 00423 09533 08571 0042023 09710 07750 00208 09447 08000 00461 09733 09250 00239 09697 09143 0027627 09713 08500 00236 09683 08333 00230 09810 09563 00176 09797 09357 0018232 09820 09750 00177 09750 08333 00160 09820 09500 00162 09830 09786 00168
Table 5 Rate of saving resource cost based on global ensembledetector of BBO pruned
Number Initial ensemblesize
Prunedensemble size
Saving resourcecost
1 20 14 302 40 23 4253 60 27 554 80 32 60
measures in fitness function to improve the detecting per-formance in future Besides the cost of communication isthe main reason of quick energy depletion of sensor nodesespecially for the cluster head the adaptive selection of clusterhead based on energy state will be taken into account tolengthen the lifetime of WSNs in next work
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is supported by the National Key ScientificInstrument and Equipment Development Project(2012YQ15008703) the Zhejiang Provincial Natural ScienceFoundation of China (LY13F020015) the Open Project of TopKeyDiscipline of Computer Software andTheory in ZhejiangProvincial (ZC323014100) National Science Foundation ofChina (61473182) Science and Technology Commission ofShanghai Municipality (11JC1404000 14JC1402200) andShanghai Rising-Star Program (13QA1401600)
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research—four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Désir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, São Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millán, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
International Journal of Distributed Sensor Networks 7
Input: E′ — current pruned ensemble anomaly detector; p — sampling probability
Output: E* — updated pruned ensemble anomaly detector

For each sensor node:
    Retain the new observation with probability p
    If the buffer is completely replaced by new observations:
        Train a new detector and transmit its summary to the cluster head
        E* = Ensemble_Pruning_BBO(E′, T)
        Broadcast E* to the member sensor nodes for subsequent anomaly detection

Algorithm 2: Online updating(E′, p).
member sensor nodes. To relieve the communication burden, several techniques are used to reduce the communication overhead.
In fact, the distributed training/learning method transmits only the summary information of the trained local ensemble detector to the cluster head, which significantly decreases the communication cost compared with centralized anomaly detection schemes that send all training data to the cluster head to build the detector. Besides, after the pruned ensemble is obtained in the cluster head node, each member sensor node in the cluster must obtain the pruned ensemble detector from the cluster head node. A straightforward method is to broadcast the pruned ensemble itself to the member sensor nodes. This is a commonly used strategy, but it does not make full use of the local ensemble detector information and costs more communication resources. Here, a state matrix P is designed in the cluster head; its element p_ij, defined by formula (6), represents each single detector in the initial ensemble. Each local ensemble detector is then represented as a bit string, using one bit per single detector: 1 denotes that the single detector is included in the final ensemble, and 0 means it is not included.
\[
p_{ij} =
\begin{cases}
1, & \mathrm{AD}_{ij} \in E', \quad i = 1,\dots,m,\ j = 1,\dots,n,\\
0, & \text{otherwise},
\end{cases}
\]
\[
P =
\begin{array}{cc}
 & \begin{matrix} 1 & 2 & \cdots & i-1 & i & i+1 & \cdots & n \end{matrix} \\
\begin{matrix} S_1 \\ S_2 \\ \vdots \\ S_m \end{matrix} &
\left[\begin{matrix}
0 & 1 & \cdots & 0 & 1 & 1 & \cdots & 1 \\
1 & 0 & \cdots & 1 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & 1 & 1 & \cdots & 1
\end{matrix}\right]
\end{array}
\tag{6}
\]
After the pruning procedure is finished, the cluster head broadcasts the state matrix P to its member sensor nodes; each sensor node keeps the single detectors whose corresponding state elements equal 1 and deletes the rest to build the pruned global ensemble detector. Employing the state matrix saves energy greatly. For example, suppose that after ensemble pruning N′ (N′ ≤ n·m) individual detectors are to be broadcast in the cluster. If matrix P is not used, 4·N′·d bytes of communication are needed (assuming each individual detector can be represented by d parameters and each parameter needs at least 4 bytes). If matrix P is introduced, each item of P needs only 1 bit to represent an individual detector, so only m·n/8 bytes need to be broadcast. Suppose that one-third of the individual detectors are pruned (i.e., N′ = 2·n·m/3); then (4·n·m·d·2/3)/(m·n/8) ≈ 21.33d. By introducing ensemble pruning and the state matrix, the energy saving in the cluster head sensor is significant, and the lifetime of the WSN can be lengthened.
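As a rough illustration of this encoding (a minimal Python sketch, not the paper's C++ implementation; `pack_state_matrix`, `unpack_state_matrix`, and `cost_ratio` are hypothetical helper names), the state matrix can be packed one bit per detector and the cost ratio above reproduced:

```python
import math

def pack_state_matrix(P):
    """Pack an m-by-n 0/1 state matrix into bytes, one bit per detector."""
    m, n = len(P), len(P[0])
    out = bytearray(math.ceil(m * n / 8))
    for k, bit in enumerate(b for row in P for b in row):
        if bit:
            out[k // 8] |= 1 << (k % 8)
    return bytes(out)

def unpack_state_matrix(buf, m, n):
    """Recover the m-by-n state matrix from its packed form."""
    bits = [(buf[k // 8] >> (k % 8)) & 1 for k in range(m * n)]
    return [bits[i * n:(i + 1) * n] for i in range(m)]

def cost_ratio(m, n, d, n_prime):
    """Bytes for broadcasting N' detectors as d 4-byte parameters each,
    divided by the m*n/8 bytes of the packed state matrix."""
    return (4 * n_prime * d) / (m * n / 8)
```

With one-third of the detectors pruned (N′ = 2·n·m/3), `cost_ratio` gives roughly 21.33d, matching the estimate in the text.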
3.3.4. Online Update and Relearning. The distribution of the sensed data may change over time, making detector updating necessary. An online detector update is accompanied by a relearning procedure. A compromise strategy (i.e., the delayed updating strategy [36]) can cater to this situation and save computation, communication, and memory resources to some extent. Simply put, whether a newly arrived observation is saved and used to update the current detector is decided by a sampling probability p. Some heuristic rules can guide its value; for example, if the dynamics are relatively stationary, a small p should be used; otherwise, a large p should be chosen. When the buffer of a sensor node has been completely replaced by new data, an online update is triggered and a new detector is trained. The pseudocode of the algorithm is shown in Algorithm 2.
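The delayed-updating rule can be sketched as follows (a hypothetical Python illustration rather than the authors' implementation; the retrain-and-reprune step is represented only by a returned flag):

```python
import random

class DelayedUpdater:
    """Delayed updating: retain each new observation with probability p;
    once the whole buffer has been replaced, trigger retraining (and,
    at the cluster head, BBO re-pruning)."""

    def __init__(self, buffer_size, p, rng=None):
        self.buffer = [None] * buffer_size
        self.p = p
        self.replaced = 0
        self.rng = rng or random.Random()

    def observe(self, x):
        """Return True when an online update should be triggered."""
        if self.rng.random() < self.p:
            self.buffer[self.replaced % len(self.buffer)] = x
            self.replaced += 1
        if self.replaced >= len(self.buffer):
            self.replaced = 0
            return True  # buffer fully refreshed: train a new detector
        return False
```

With a relatively stationary signal one would choose a small p, so retraining (and its communication cost) is triggered only rarely.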
4. Experiments and Analysis

In this section, the dataset, the data preprocessing method, and the experimental results and analysis are described. Experiments were conducted on a PC with an Intel Core 2 Duo P7450 CPU (2.13 GHz) and 4 GB of memory running Windows 7 Professional. The data processing was done partly in MATLAB 2010, and the algorithm described in Section 3 was implemented on the Microsoft Visual C++ platform.
4.1. Dataset and Data Preprocessing. The IBRL dataset [37] was used in this paper to validate the proposed method. It was collected from a WSN deployed in the Intel Research Laboratory at Berkeley and is commonly used to evaluate the performance of existing models for WSNs [35, 36, 38–41]. The network consists of 54 Mica2Dot sensor nodes. Figure 3 shows the location of each node of the
Figure 3: Sensor node locations in the IBRL deployment.

deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from February 29, 2004 to April 5, 2004. Four types of measurements, that is, light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because these sensors were deployed inside a lab and the measured variables changed little over time (except the light, which shows sudden changes due to the irregular nature of this variable and frequent on/off operation), this dataset is considered a type of static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, some artificial anomalies were created by randomly modifying some observations, a practice widely used by many researchers in the literature [41].
Since the proposed method adopts the cluster structure, a cluster (consisting of four sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on February 29, 2004 were chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 00:00:00 a.m.–07:59:59 a.m.) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.
From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which confirms that spatial correlation exists. There are some trivial differences; after analyzing the dataset carefully, the main reason is that the dataset has some missing data points, largely due to packet loss, which can be further observed in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is the sudden peaks/valleys appearing in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.
Suppose that D = {(x_i, y_i) | i = 1, 2, ..., n} is a dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all of its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomaly points per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a non-faulty sensor node. The anomalies were generated using a normal randomizer with statistical characteristics deviating slightly from those of the normal data [41]. The detailed dataset information (including statistical parameters) of the selected sensor nodes is presented in Table 1.
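The injection scheme can be sketched as follows (a Python illustration under the stated assumptions: anomalies are drawn from a normal randomizer whose mean and spread deviate slightly from those of the normal data; `inject_anomalies` and its parameters are hypothetical):

```python
import random

def inject_anomalies(data, n_anomalies, mean_shift=1.0, scale=1.1, seed=None):
    """Append n_anomalies points drawn from a normal distribution whose
    statistics slightly deviate from the data's own mean and variance.
    Returns (value, label) pairs with 1 = anomaly, 0 = normal."""
    rng = random.Random(seed)
    mu = sum(data) / len(data)
    sigma = (sum((x - mu) ** 2 for x in data) / len(data)) ** 0.5
    labeled = [(x, 0) for x in data]
    for _ in range(n_anomalies):
        labeled.append((rng.gauss(mu + mean_shift, scale * sigma + 1e-9), 1))
    return labeled
```

Shifting the mean while keeping a similar spread keeps the anomalous range overlapping the normal range, as required above.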
4.2. Performance Evaluation Metrics and BBO Parameters. In order to evaluate the proposed method, some commonly used performance evaluation metrics for anomaly detection are adopted, namely detection accuracy (ACC), true positive rate (TPR), and false positive (alarm) rate (FPR). They are defined as follows:

ACC = (TP + TN) / (TP + TN + FP + FN),
Figure 4: The data trends (temperature (a), humidity (b)) for nodes N7–N10 during 00:00:00 a.m.–07:59:59 a.m. on February 29, 2004.
Table 1: Detailed dataset information of the selected sensor nodes on February 29, 2004 (T: temperature, H: humidity).

Node | Initial samples | Mean (T / H)      | Variance (T / H) | Injected anomalies | Anomaly mean (T / H) | Anomaly variance (T / H)
N7   | 823             | 18.4154 / 40.9176 | 0.5238 / 1.4494  | 30                 | 18.21 / 41.10        | 0.54 / 1.46
N8   | 548             | 17.9844 / 41.7123 | 0.5315 / 1.4612  | 30                 | 17.75 / 41.95        | 0.55 / 1.48
N9   | 652             | 18.1140 / 42.6295 | 0.5288 / 1.4827  | 30                 | 18.35 / 42.45        | 0.55 / 1.50
N10  | 620             | 18.1144 / 42.6215 | 0.5244 / 1.4191  | 30                 | 18.33 / 42.47        | 0.54 / 1.43
TPR = TP / (TP + FN),
FPR = FP / (FP + TN),    (7)
where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
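The metrics of Eq. (7) follow directly from these counts; a small Python sketch:

```python
def detection_metrics(y_true, y_pred):
    """ACC, TPR, and FPR from the confusion counts of Eq. (7).
    Labels: 1 = anomaly, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return acc, tpr, fpr
```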
BBO is employed to prune the initial ensemble; the migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01, where λ and μ denote the immigration rate and the emigration rate, respectively; and elitism parameter ρ = 2.
The HSI (habitat suitability index) is a fitness function, similar to those of other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:

F-measure = (1 + β²) · precision · recall / (β² · precision + recall)
          = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP).    (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter used to adjust the relative importance of precision and recall, typically β = 0.5, 1, or 2. Usually the value of the F-measure is close to the smaller of precision and recall; that is, a large F-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
Table 2: Detection performance of the local ensemble detector.

Ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
5             | 0.8700 / 0.5833 / 0.1181 | 0.7900 / 0.3333 / 0.1809 | 0.8267 / 0.5000 / 0.1549 | 0.8267 / 0.5714 / 0.1608
10            | 0.8800 / 0.6667 / 0.1111 | 0.8033 / 0.3889 / 0.1702 | 0.8267 / 0.4375 / 0.1514 | 0.8333 / 0.6429 / 0.1573
15            | 0.8900 / 0.7500 / 0.1042 | 0.8167 / 0.5000 / 0.1631 | 0.8433 / 0.5000 / 0.1373 | 0.8600 / 0.7143 / 0.1329
20            | 0.8933 / 0.8333 / 0.1042 | 0.8200 / 0.5000 / 0.1596 | 0.8367 / 0.5000 / 0.1444 | 0.8567 / 0.7143 / 0.1364

Table 3: Detection performance of the global ensemble detector [7].

Combined ensemble size | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
20                     | 0.9467 / 0.8333 / 0.0486 | 0.9300 / 0.7778 / 0.0603 | 0.9467 / 0.7500 / 0.0423 | 0.9500 / 0.7857 / 0.0420
40                     | 0.9700 / 0.7500 / 0.0208 | 0.9433 / 0.8333 / 0.0496 | 0.9710 / 0.8938 / 0.0246 | 0.9650 / 0.8929 / 0.0315
60                     | 0.9700 / 0.8333 / 0.0243 | 0.9733 / 0.8889 / 0.0213 | 0.9800 / 0.9375 / 0.0176 | 0.9783 / 0.9357 / 0.0196
80                     | 0.9817 / 0.9583 / 0.0174 | 0.9800 / 0.9444 / 0.0177 | 0.9767 / 0.9375 / 0.0211 | 0.9780 / 0.9714 / 0.0217

4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have been widely used for classification problems; they separate data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44] and was used here to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder was kept as the test set to evaluate the proposed method.
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. The experiments aim at two goals: first, to show the effectiveness of the proposed method based on ensemble learning theory; second, to show that the pruned ensemble detector can obtain better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirements. Accordingly, three experiments were carried out: the local ensemble anomaly detector, considering only the temporal correlation of each sensor node; the global ensemble anomaly detector, considering the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
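The Online Bagging update (in the style of Oza and Russell) presents each new observation to every base detector k ~ Poisson(1) times; a minimal sketch, with a hypothetical `partial_fit` interface standing in for the one-class SVM update:

```python
import random

def poisson_knuth(lam, rng):
    """Sample from a Poisson(lam) distribution (Knuth's method)."""
    threshold, k, p = 2.718281828459045 ** (-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def online_bagging_update(ensemble, x, lam=1.0, rng=None):
    """One Online Bagging step: each base detector sees the new
    observation k ~ Poisson(lam) times."""
    rng = rng or random.Random()
    for detector in ensemble:
        for _ in range(poisson_knuth(lam, rng)):
            detector.partial_fit(x)
```

The Poisson(1) weighting is the streaming analogue of the bootstrap resampling used by batch Bagging [16].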
Table 2 shows the performance of each sensor node under different ensemble sizes, without taking into account the spatial correlation of the sensed data in a cluster. Though the ensemble detection performance gradually improves with increasing ensemble size (higher ACC and TPR values mean better performance; lower FPR values mean better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most true positive rates are unacceptable, and most false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detector was trained, each member node sent its local ensemble to the other nodes to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], an obvious fact is that the detection performance is higher than that presented in Table 2. With the help of the neighbor detectors, the detection results improve further as the ensemble size increases.
In order to further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only to validate the method's effectiveness, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic, decided by many factors, such as the computation capability, communication cost, and memory usage of the sensor node, the required detection accuracy, and so on. In practical applications, a trade-off is commonly made.
5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints of WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirements and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include diversity measures in the fitness function to further improve the detection performance. Besides, since the cost of communication is the main cause of the rapid energy depletion of sensor nodes, especially the cluster head, adaptive selection of the cluster head based on its energy state will be considered in future work to lengthen the lifetime of WSNs.

Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC / TPR / FPR)     | N8 (ACC / TPR / FPR)     | N9 (ACC / TPR / FPR)     | N10 (ACC / TPR / FPR)
14                         | 0.9480 / 0.8000 / 0.0458 | 0.9327 / 0.7667 / 0.0567 | 0.9500 / 0.8125 / 0.0423 | 0.9533 / 0.8571 / 0.0420
23                         | 0.9710 / 0.7750 / 0.0208 | 0.9447 / 0.8000 / 0.0461 | 0.9733 / 0.9250 / 0.0239 | 0.9697 / 0.9143 / 0.0276
27                         | 0.9713 / 0.8500 / 0.0236 | 0.9683 / 0.8333 / 0.0230 | 0.9810 / 0.9563 / 0.0176 | 0.9797 / 0.9357 / 0.0182
32                         | 0.9820 / 0.9750 / 0.0177 | 0.9750 / 0.8333 / 0.0160 | 0.9820 / 0.9500 / 0.0162 | 0.9830 / 0.9786 / 0.0168

Table 5: Rate of resource-cost saving of the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Resource-cost saving
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is supported by the National Key ScientificInstrument and Equipment Development Project(2012YQ15008703) the Zhejiang Provincial Natural ScienceFoundation of China (LY13F020015) the Open Project of TopKeyDiscipline of Computer Software andTheory in ZhejiangProvincial (ZC323014100) National Science Foundation ofChina (61473182) Science and Technology Commission ofShanghai Municipality (11JC1404000 14JC1402200) andShanghai Rising-Star Program (13QA1401600)
References
[1] Y Zhang N Meratnia and P Havinga ldquoOutlier detectiontechniques for wireless sensor networks a surveyrdquo IEEE Com-munications Surveys and Tutorials vol 12 no 2 pp 159ndash1702010
[2] Y Zhang N A S Hamm N Meratnia A Stein M van deVoort and P J M Havinga ldquoStatistics-based outlier detectionfor wireless sensor networksrdquo International Journal of Geo-graphical Information Science vol 26 no 8 pp 1373ndash1392 2012
[3] C Peng and Q-L Han ldquoA novel event-triggered transmissionscheme and L
2control co-design for sampled-data control
systemsrdquo IEEE Transactions on Automatic Control vol 58 no10 pp 2620ndash2626 2013
[4] S Rajasegarar C Leckie M Palaniswami and J C BezdekldquoDistributed anomaly detection in wireless sensor networksrdquo inProceedings of the 10th IEEE Singapore International Conferenceon Communication systems (ICCS rsquo06) pp 1ndash5 IEEE SingaporeOctober 2006
[5] S Rajasegarar C Leckie and M Palaniswami ldquoAnomalydetection in wireless sensor networksrdquo IEEE Wireless Commu-nications vol 15 no 4 pp 34ndash40 2008
[6] M Xie S Han B Tian and S Parvin ldquoAnomaly detectionin wireless sensor networks a surveyrdquo Journal of Network andComputer Applications vol 34 no 4 pp 1302ndash1325 2011
[7] Z Ding M Fei D Du and S Xu ldquoOnline anomaly detectionmethod based on BBO ensemble pruning in wireless sensornetworksrdquo in Life System Modeling and Simulation vol 461 ofCommunications in Computer and Information Science pp 160ndash169 Springer Berlin Germany 2014
[8] T G Dietterich ldquoMachine-learning researchmdashfour currentdirectionsrdquo AI Magazine vol 18 no 4 pp 97ndash136 1997
[9] Z-H Zhou J Wu andW Tang ldquoEnsembling neural networksmany could be better than allrdquoArtificial Intelligence vol 137 no1-2 pp 239ndash263 2002
[10] N Shahid I H Naqvi and S B Qaisar ldquoCharacteristics andclassification of outlier detection techniques for wireless sensornetworks in harsh environments a surveyrdquoArtificial IntelligenceReview vol 137 pp 1ndash36 2012
[11] D Du K Li and M Fei ldquoA fast multi-output RBF neuralnetwork constructionmethodrdquoNeurocomputing vol 73 no 10ndash12 pp 2196ndash2202 2010
[12] P Gil A Santos and A Cardoso ldquoDealing with outliers inwireless sensor networks an oil refinery applicationrdquo IEEETransactions on Control Systems Technology vol 23 no 4 pp1589ndash1596 2014
[13] M A Rassam M A Maarof and A Zainal ldquoAdaptive andonline data anomaly detection for wireless sensor systemsrdquoKnowledge-Based Systems vol 60 pp 44ndash57 2014
[14] S Rajasegarar A Gluhak M Ali Imran et al ldquoEllipsoidalneighbourhood outlier factor for distributed anomaly detectionin resource constrained networksrdquo Pattern Recognition vol 47no 9 pp 2867ndash2879 2014
[15] N Lu G Zhang and J Lu ldquoConcept drift detection viacompetence modelsrdquo Artificial Intelligence vol 209 pp 11ndash282014
[16] L Breiman ldquoBagging predictorsrdquoMachine Learning vol 24 no2 pp 123ndash140 1996
[17] S Seguı L Igual and J Vitria ldquoBagged one-class classifiersin the presence of outliersrdquo International Journal of PatternRecognition and Artificial Intelligence vol 27 no 5 Article ID1350014 2013
12 International Journal of Distributed Sensor Networks
[18] N Duffy and D Helmbold ldquoBoosting methods for regressionrdquoMachine Learning vol 47 no 2-3 pp 153ndash200 2002
[19] W-C Chang and C-W Cho ldquoOnline boosting for vehicledetectionrdquo IEEETransactions on SystemsMan and CyberneticsPart B Cybernetics vol 40 no 3 pp 892ndash902 2010
[20] C Desir S Bernard C Petitjean and L Heutte ldquoOne classrandom forestsrdquo Pattern Recognition vol 46 no 12 pp 3490ndash3506 2013
[21] A Fern and R Givan ldquoOnline ensemble learning an empiricalstudyrdquoMachine Learning vol 53 no 1-2 pp 71ndash109 2003
[22] A Bifet G Holmes B Pfahringer and R Gavalda ldquoImprov-ing adaptive bagging methods for evolving data streamsrdquo inAdvances in Machine Learning vol 5828 of Lecture Notes inComputer Science pp 23ndash37 Springer Berlin Germany 2009
[23] D I Curiac and C Volosencu ldquoEnsemble based sensinganomaly detection in wireless sensor networksrdquo Expert Systemswith Applications vol 39 no 10 pp 9087ndash9096 2012
[24] X Zhou S Li and Z Ye ldquoA novel system anomaly predictionsystem based on belief markov model and ensemble classifica-tionrdquo Mathematical Problems in Engineering vol 2013 ArticleID 179390 10 pages 2013
[25] H He S Chen K Li and X Xu ldquoIncremental learning fromstream datardquo IEEE Transactions on Neural Networks vol 22 no12 pp 1901ndash1914 2011
[26] D Du K Li X Li and M Fei ldquoA novel forward gene selectionalgorithm for microarray datardquo Neurocomputing vol 133 pp446ndash458 2014
[27] H Ma ldquoAn analysis of the equilibrium of migration models forbiogeography-based optimizationrdquo Information Sciences vol180 no 18 pp 3444ndash3464 2010
[28] D Simon ldquoBiogeography-based optimizationrdquo IEEE Transac-tions on Evolutionary Computation vol 12 no 6 pp 702ndash7132008
[29] S Sheen R Anitha and P Sirisha ldquoMalware detection bypruning of parallel ensembles using harmony searchrdquo PatternRecognition Letters vol 34 no 14 pp 1679ndash1686 2013
[30] Y-Y Zhang H-C Chao M Chen L Shu C-H Park and M-S Park ldquoOutlier detection and countermeasure for hierarchicalwireless sensor networksrdquo IET Information Security vol 4 no4 pp 361ndash373 2010
[31] C Peng and M-R Fei ldquoAn improved result on the stability ofuncertain T-S fuzzy systems with interval time-varying delayrdquoFuzzy Sets and Systems vol 212 pp 97ndash109 2013
[32] Y Zhang Observing the Unobservable Distributed Online Out-lier Detection inWireless Sensor Networks University of TwenteEnschede The Netherlands 2010
[33] C Peng D Yue and M Fei ldquoRelaxed stability and stabilizationconditions of networked fuzzy control systems subject toasynchronous grades of membershiprdquo IEEE Transactions onFuzzy Systems vol 22 no 5 pp 1101ndash1112 2014
[34] C Peng M-R Fei E Tian and Y-P Guan ldquoOn hold or dropout-of-order packets in networked control systemsrdquo Informa-tion Sciences vol 268 pp 436ndash446 2014
[35] M A Rassam A Zainal and M A Maarof ldquoAn adaptive andefficient dimension reduction model for multivariate wirelesssensor networks applicationsrdquo Applied Soft Computing Journalvol 13 no 4 pp 1978ndash1996 2013
[36] M Xie J Hu S Han and H-H Chen ldquoScalable hypergridk-NN-based online anomaly detection in wireless sensor net-worksrdquo IEEE Transactions on Parallel and Distributed Systemsvol 24 no 8 pp 1661ndash1670 2013
[37] Intel Berkely Reseach Lab (IBRL) dataset 2004 httpdbcsailmitedulabdatalabdatahtml
[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013
[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011
[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009
[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012
[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013
[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013
[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013
8 International Journal of Distributed Sensor Networks
[Figure: floor plan of the lab with 54 numbered sensor nodes placed across the lab, server room, quiet/phone rooms, kitchen, elec, copy, storage, conference, and office areas.]

Figure 3: Sensor node locations in the IBRL deployment.
deployment (node locations are shown as black hexagons with their corresponding node IDs) [35]. The whole dataset was collected from 29/02/2004 to 05/04/2004. Four types of measurements, namely light, temperature, humidity, and voltage, were collected, and the measurements were recorded at 31 s intervals. Because the sensors were deployed inside a lab and the measured variables changed little over time (except light, which shows sudden changes due to its irregular nature and frequent on/off operation), this dataset is considered a static dataset by many researchers. In our experiments, to evaluate the proposed anomaly detection algorithm, artificial anomalies were created by randomly modifying some observations, a practice widely used in the literature [41].
Since the proposed method adopts a cluster structure, one cluster (consisting of 4 sensor nodes, i.e., N7, N8, N9, and N10) and the dataset collected on 29/02/2004 were chosen. The data distribution can be seen in [7]. Here, only part of the observations (during 0:00:00 am–7:59:59 am) from each sensor node is employed to evaluate the proposed method. The data trend is depicted in Figure 4.
From Figure 4, an obvious fact is that the data distributions within a cluster are almost the same, which demonstrates the existence of spatial correlation. There are some minor differences; careful analysis of the dataset shows that the main reason is missing data points, largely due to packet loss, which can also be observed in Figure 4. In our experiment, these missing observations are interpolated using the method described in Section 3.3. Another obvious fact is that sudden peaks/valleys appear in Figure 4 for each sensor's observations, which implies that an event of interest may have occurred.
Suppose that D = {(x_i, y_i), i = 1, 2, ..., n} is a dataset used to train an anomaly detector, where x_i is a vector of feature values and y_i is the label indicating whether the given observation is normal or anomalous. Because the IBRL dataset regards all its observations as normal, some anomalous data points were generated and inserted to evaluate the performance of the proposed method. In this paper, 30 artificial anomalies per sensor were injected consecutively into each dataset to calculate the true positive rate (TPR), false positive rate (FPR), and detection accuracy (ACC). Without loss of generality, the anomalous dataset should follow a distribution quite different from that of the training dataset, but their ranges should overlap as much as possible. Besides, an anomalous event should be a small-probability event for a normal dataset collected by a non-faulty sensor node. The anomalies were generated using a normal randomizer whose statistical characteristics deviate slightly from those of the normal data [41]. The detailed information (including statistical parameters) of the selected sensor nodes' datasets is presented in Table 1.
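The injection procedure above can be sketched as follows. This is an illustrative sketch only: the exact randomizer parameters of [41] are not reproduced here, so the mean_shift and var_scale values are assumptions chosen to produce the kind of slight deviations visible in Table 1.

```python
import numpy as np

def inject_anomalies(normal, n_anomalies=30, mean_shift=0.01, var_scale=1.05, seed=0):
    """Append anomalies drawn from a Gaussian whose mean and variance deviate
    slightly from the normal data's statistics (labels: 0 = normal, 1 = anomaly)."""
    rng = np.random.default_rng(seed)
    mu = normal.mean(axis=0) * (1.0 - mean_shift)    # slightly shifted mean
    sigma = normal.std(axis=0) * np.sqrt(var_scale)  # slightly inflated variance
    anomalies = rng.normal(mu, sigma, size=(n_anomalies, normal.shape[1]))
    X = np.vstack([normal, anomalies])
    y = np.concatenate([np.zeros(len(normal), dtype=int),
                        np.ones(n_anomalies, dtype=int)])
    return X, y
```

With N7's statistics from Table 1 (823 samples, temperature mean 18.4154), such a call would yield 853 labeled observations, 30 of them anomalous.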
Figure 4: The data trends (temperature in (a), roughly 17–20; humidity in (b), roughly 38–46) of nodes N7, N8, N9, and N10 during 0:00:00 am–7:59:59 am on February 29, 2004.

Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004 (T: temperature, H: humidity).

Node | Initial sample | Mean (T / H)      | Variance (T / H) | Injected anomaly | Mean (T / H)  | Variance (T / H)
N7   | 823            | 18.4154 / 40.9176 | 0.5238 / 1.4494  | 30               | 18.21 / 41.10 | 0.54 / 1.46
N8   | 548            | 17.9844 / 41.7123 | 0.5315 / 1.4612  | 30               | 17.75 / 41.95 | 0.55 / 1.48
N9   | 652            | 18.1140 / 42.6295 | 0.5288 / 1.4827  | 30               | 18.35 / 42.45 | 0.55 / 1.50
N10  | 620            | 18.1144 / 42.6215 | 0.5244 / 1.4191  | 30               | 18.33 / 42.47 | 0.54 / 1.43

4.2. Performance Evaluation Metrics and BBO Parameters. To evaluate the proposed method, commonly used performance evaluation metrics for anomaly detection are adopted, namely detection accuracy (ACC), true positive rate (TPR), and false positive (alarm) rate (FPR), defined as

ACC = (TP + TN) / (TP + TN + FP + FN),
TPR = TP / (TP + FN),
FPR = FP / (FP + TN), (7)
where TP is the number of samples correctly predicted as anomalous, FP is the number of samples incorrectly predicted as anomalous, TN is the number of samples correctly predicted as normal, and FN is the number of samples incorrectly predicted as normal.
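The metrics of equation (7) can be computed directly from the four confusion counts; a minimal sketch (function name is ours; labels: 1 = anomaly, 0 = normal):

```python
def detection_metrics(y_true, y_pred):
    """ACC, TPR, FPR per equation (7); labels: 1 = anomaly, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # detection rate on anomalies
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # false-alarm rate on normals
    return acc, tpr, fpr
```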
BBO is employed to prune the initial ensemble. The migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each habitat n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01; and elitism parameter ρ = 2. Here λ and μ denote the immigration rate and the emigration rate, respectively.
The HSI (habitat suitability index) is the fitness function, as in other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of the binary classification problem:
F-measure = ((1 + β^2) * precision * recall) / (β^2 * precision + recall)
          = ((1 + β^2) * TP) / ((1 + β^2) * TP + β^2 * FN + FP). (8)
The F-measure can be interpreted as a weighted harmonic mean of precision and recall; its value is best at 1 and worst at 0. β is a parameter that adjusts the relative importance of precision versus recall, typically β = 0.5, 1, or 2. The F-measure is usually close to the smaller of the precision and recall values, so a large F-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, while a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is used.
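With the fitness defined by (8), the BBO pruning step can be sketched as follows. This is a deliberately simplified, illustrative loop (the paper's exact migration model follows [27, 28]); the binary habitat encoding mirrors the state-vector coding of pruned ensembles described in the paper, while the function names and parameter defaults are our own.

```python
import numpy as np

def f_measure(tp, fn, fp, beta=1.0):
    """Equation (8): F-measure from the confusion counts."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

def bbo_prune(votes, y, pop=30, gens=50, E=1.0, I=1.0, p_mut=0.01, elites=2, seed=0):
    """Simplified BBO ensemble pruning: each habitat is a binary SIV vector
    selecting detectors; HSI is the F-measure of their majority vote.

    votes: (n_detectors, n_samples) array of 0/1 predictions on a validation set
    y:     (n_samples,) array of true labels (1 = anomaly)
    """
    rng = np.random.default_rng(seed)
    n_det = votes.shape[0]
    H = rng.integers(0, 2, size=(pop, n_det))

    def hsi(mask):
        if mask.sum() == 0:
            return 0.0                       # an empty ensemble is worthless
        pred = (votes[mask.astype(bool)].mean(axis=0) >= 0.5).astype(int)
        tp = int(((pred == 1) & (y == 1)).sum())
        fn = int(((pred == 0) & (y == 1)).sum())
        fp = int(((pred == 1) & (y == 0)).sum())
        return f_measure(tp, fn, fp)

    rank = np.arange(pop)
    lam = I * (rank + 1) / pop               # immigration: low for fit habitats
    mu = E * (1.0 - (rank + 1) / pop)        # emigration: high for fit habitats
    for _ in range(gens):
        fit = np.array([hsi(h) for h in H])
        H = H[np.argsort(-fit)]              # sort best first (rank = index)
        new_H = H.copy()
        for i in range(elites, pop):         # elitism keeps the top habitats
            for s in range(n_det):
                if rng.random() < lam[i]:    # immigrate SIV s from habitat j
                    j = rng.choice(pop, p=mu / mu.sum())
                    new_H[i, s] = H[j, s]
                if rng.random() < p_mut:
                    new_H[i, s] ^= 1         # bit-flip mutation
        H = new_H
    fit = np.array([hsi(h) for h in H])
    return H[int(np.argmax(fit))]            # best pruned-ensemble state vector
```

The returned binary vector corresponds to the state coding that the cluster head broadcasts to the member nodes.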
4.3. Results Presentation and Discussion. In the data mining and machine learning communities, SVM-based methods have been widely used for classification; they separate data belonging to different classes by fitting a hyperplane. The one-class SVM, a variation of this method, is especially favored for anomaly detection [42–44] and was used here to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used to train the local detector, and the remainder, as the test set, was used to evaluate the proposed method.

Table 2: Detection performance of the local ensemble detector (per node: ACC / TPR / FPR).

Ensemble size | N7                   | N8                   | N9                   | N10
5             | 0.8700/0.5833/0.1181 | 0.7900/0.3333/0.1809 | 0.8267/0.5000/0.1549 | 0.8267/0.5714/0.1608
10            | 0.8800/0.6667/0.1111 | 0.8033/0.3889/0.1702 | 0.8267/0.4375/0.1514 | 0.8333/0.6429/0.1573
15            | 0.8900/0.7500/0.1042 | 0.8167/0.5000/0.1631 | 0.8433/0.5000/0.1373 | 0.8600/0.7143/0.1329
20            | 0.8933/0.8333/0.1042 | 0.8200/0.5000/0.1596 | 0.8367/0.5000/0.1444 | 0.8567/0.7143/0.1364

Table 3: Detection performance of the global ensemble detector [7] (per node: ACC / TPR / FPR).

Combined ensemble size | N7                   | N8                   | N9                   | N10
20                     | 0.9467/0.8333/0.0486 | 0.9300/0.7778/0.0603 | 0.9467/0.7500/0.0423 | 0.9500/0.7857/0.0420
40                     | 0.9700/0.7500/0.0208 | 0.9433/0.8333/0.0496 | 0.9710/0.8938/0.0246 | 0.9650/0.8929/0.0315
60                     | 0.9700/0.8333/0.0243 | 0.9733/0.8889/0.0213 | 0.9800/0.9375/0.0176 | 0.9783/0.9357/0.0196
80                     | 0.9817/0.9583/0.0174 | 0.9800/0.9444/0.0177 | 0.9767/0.9375/0.0211 | 0.9780/0.9714/0.0217
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim at two goals: first, to show the effectiveness of the proposed method based on ensemble learning theory; second, to show that the pruned ensemble detector obtains better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were carried out: the local ensemble anomaly detector, which considers only the temporal correlation at each sensor node; the global ensemble anomaly detector, which considers the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results are shown in Tables 2, 3, and 4, respectively.
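The Online Bagging construction of the initial ensemble can be sketched as follows. This is a hedged sketch, not the authors' implementation: it uses scikit-learn's OneClassSVM as the one-class base detector and, since OneClassSVM is a batch learner, approximates online updating by refitting each member periodically on its accumulated Poisson(1)-weighted sample (the Oza-Russell online bagging scheme); the class name and parameter defaults are illustrative.

```python
import numpy as np
from sklearn.svm import OneClassSVM

class OnlineBaggingOCSVM:
    """Online-Bagging ensemble of one-class SVM base detectors: each arriving
    observation is given to each base learner k ~ Poisson(1) times; each member
    is refit periodically on its accumulated Poisson-weighted sample."""

    def __init__(self, n_detectors=5, refit_every=50, nu=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.buffers = [[] for _ in range(n_detectors)]
        self.models = [None] * n_detectors
        self.refit_every = refit_every
        self.nu = nu
        self.seen = 0

    def partial_fit(self, x):
        """Feed one observation (1-D array) into the ensemble."""
        self.seen += 1
        for buf in self.buffers:
            for _ in range(self.rng.poisson(1.0)):   # Poisson(1) replication
                buf.append(x)
        if self.seen % self.refit_every == 0:
            for i, buf in enumerate(self.buffers):
                if len(buf) >= 10:                    # need enough data to fit
                    self.models[i] = OneClassSVM(
                        nu=self.nu, gamma="scale").fit(np.asarray(buf))

    def predict(self, x):
        """Majority vote of the fitted members: 1 = anomaly, 0 = normal."""
        votes = [1 if m.predict([x])[0] == -1 else 0
                 for m in self.models if m is not None]
        return int(sum(votes) > len(votes) / 2) if votes else 0
```

In the distributed setting described in the paper, each member node would train such a local ensemble and broadcast it within the cluster to form the global ensemble.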
Table 2 shows the performance of each sensor node under different ensemble sizes, without taking into account the spatial correlation of the sensed data in a cluster. Although the detection performance gradually improves as the ensemble size increases (higher ACC and TPR and lower FPR mean better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most true positive rates are unacceptable, and most false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detector was trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results in Table 3 [7], the detection performances are obviously higher than those in Table 2: with the help of the neighbors' detectors, the detection results improve steadily as the ensemble size increases.
To further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply while the detector performance remains as good as or better than that of the initial global ensemble detector. From the results in Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only to validate the method's effectiveness, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic decided by many factors, such as the computation capability, communication cost, and memory usage of a sensor node, as well as the expected detection accuracy requirement; in practice, a trade-off is commonly considered.
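The saving rates reported in Table 5 follow directly from the pruned and initial ensemble sizes; a minimal check (the function name is ours):

```python
def saving_rate(initial_size, pruned_size):
    """Resource-cost saving of the pruned global ensemble (Table 5)."""
    return 1.0 - pruned_size / initial_size

# The four rows of Table 5:
for initial, pruned in [(20, 14), (40, 23), (60, 27), (80, 32)]:
    print(f"{initial} -> {pruned}: {saving_rate(initial, pruned):.1%} saved")
    # prints 30.0%, 42.5%, 55.0%, 60.0% saved, matching Table 5
```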
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7] (per node: ACC / TPR / FPR).

Ensemble size (BBO pruned) | N7                   | N8                   | N9                   | N10
14                         | 0.9480/0.8000/0.0458 | 0.9327/0.7667/0.0567 | 0.9500/0.8125/0.0423 | 0.9533/0.8571/0.0420
23                         | 0.9710/0.7750/0.0208 | 0.9447/0.8000/0.0461 | 0.9733/0.9250/0.0239 | 0.9697/0.9143/0.0276
27                         | 0.9713/0.8500/0.0236 | 0.9683/0.8333/0.0230 | 0.9810/0.9563/0.0176 | 0.9797/0.9357/0.0182
32                         | 0.9820/0.9750/0.0177 | 0.9750/0.8333/0.0160 | 0.9820/0.9500/0.0162 | 0.9830/0.9786/0.0168

Table 5: Rate of resource-cost saving with the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Saving of resource cost
1      | 20                    | 14                   | 30%
2      | 40                    | 23                   | 42.5%
3      | 60                    | 27                   | 55%
4      | 80                    | 32                   | 60%

5. Conclusion and Future Work

After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Due to the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.

Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include diversity measures in the fitness function to improve detection performance. Besides, since the cost of communication is the main cause of quick energy depletion of sensor nodes, especially for the cluster head, adaptive selection of the cluster head based on energy state will be taken into account to lengthen the lifetime of WSNs in future work.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of the Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159-170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373-1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620-2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1-5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34-40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302-1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160-169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97-136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239-263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1-36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10-12, pp. 2196-2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589-1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44-57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867-2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11-28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123-140, 1996.
[17] S. Segui, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153-200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892-902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490-3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71-109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23-37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087-9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901-1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446-458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444-3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702-713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679-1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361-373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97-109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101-1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436-446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978-1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661-1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23-54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55-69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1-28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271-276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916-1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351-366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062-1074, 2013.
International Journal of Distributed Sensor Networks 9
Figure 4: The data trends of nodes N7–N10 during 0:00:00 a.m.–7:59:59 a.m. on February 29, 2004: (a) temperature (approximately 17–20°C); (b) humidity (approximately 38–46%), over 1000 samples.
Table 1: Detailed dataset information of the selected sensor nodes on 29/02/2004. Paired values are given as T/H (T = temperature, H = humidity); the second mean/variance pair is after anomaly injection.

Node  Initial samples  Mean (T/H)       Variance (T/H)  Injected anomalies  Mean (T/H)   Variance (T/H)
N7    823              18.4154/40.9176  0.5238/1.4494   30                  18.21/41.10  0.54/1.46
N8    548              17.9844/41.7123  0.5315/1.4612   30                  17.75/41.95  0.55/1.48
N9    652              18.1140/42.6295  0.5288/1.4827   30                  18.35/42.45  0.55/1.50
N10   620              18.1144/42.6215  0.5244/1.4191   30                  18.33/42.47  0.54/1.43
TPR = TP / (TP + FN),
FPR = FP / (FP + TN),                                    (7)

where TP is the number of samples correctly predicted as the anomaly class, FP is the number of samples incorrectly predicted as the anomaly class, TN is the number of samples correctly predicted as the normal class, and FN is the number of samples incorrectly predicted as the normal class.
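The two rates in (7) follow directly from confusion-matrix counts. A minimal sketch (the counts used here are hypothetical, chosen only for illustration):

```python
def detection_rates(tp, fp, tn, fn):
    """Compute true-positive and false-positive rates from confusion-matrix counts, as in Eq. (7)."""
    tpr = tp / (tp + fn)  # fraction of true anomalies that were detected
    fpr = fp / (fp + tn)  # fraction of normal samples wrongly flagged
    return tpr, fpr

# Hypothetical counts: 24 true anomalies (20 caught), 576 normal samples (12 flagged).
tpr, fpr = detection_rates(tp=20, fp=12, tn=564, fn=4)
print(round(tpr, 4), round(fpr, 4))  # → 0.8333 0.0208
```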
BBO is employed to prune the initial ensemble. The migration model is the same as that presented in [27, 28], and the related parameters are set as follows: habitat (population) size S = 30; number of SIVs (suitability index variables) in each island n = 20, 40, 60, 80; maximum migration rates E = 1 and I = 1; mutation rate η = 0.01; λ and μ are the immigration rate and the emigration rate, respectively; and the elitism parameter ρ = 2.
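Under the parameter settings above, the pruning step can be sketched as a standard BBO loop over binary inclusion vectors (one bit per ensemble member, mirroring the state-matrix coding). The sketch below is a simplified, self-contained illustration, not the authors' implementation: the fitness (HSI) is a toy stand-in (mean score of the selected detectors with a small size penalty) rather than the F-measure used in the paper, and the `scores` list is invented for the example.

```python
import random

def bbo_prune(n, fitness, S=30, gens=50, E=1.0, I=1.0, p_mut=0.01, elites=2, seed=0):
    """Toy biogeography-based optimization over binary inclusion vectors.
    Each habitat is a 0/1 list selecting ensemble members; HSI = fitness(habitat)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(S)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)            # rank habitats by HSI, best first
        mu = [E * (S - k) / S for k in range(S)]       # emigration: high for good habitats
        lam = [I * (k + 1) / S for k in range(S)]      # immigration: high for poor habitats
        new_pop = [pop[k][:] for k in range(elites)]   # elitism: keep the best habitats untouched
        for k in range(elites, S):
            h = pop[k][:]
            for j in range(n):
                if rng.random() < lam[k]:              # immigrate this SIV?
                    src = rng.choices(range(S), weights=mu)[0]  # roulette wheel on emigration rates
                    h[j] = pop[src][j]
                if rng.random() < p_mut:               # mutation flips the bit
                    h[j] = 1 - h[j]
            new_pop.append(h)
        pop = new_pop
    return max(pop, key=fitness)

# Toy HSI: mean score of the selected detectors, lightly penalizing ensemble size.
scores = [0.9, 0.85, 0.4, 0.88, 0.3, 0.92, 0.5, 0.87]  # hypothetical per-detector quality
def hsi(mask):
    chosen = [s for s, b in zip(scores, mask) if b]
    if not chosen:
        return 0.0
    return sum(chosen) / len(chosen) - 0.01 * len(chosen)

best = bbo_prune(len(scores), hsi)
print(best)  # a small subset dominated by the high-scoring detectors
```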
The HSI (habitat suitability index) is a fitness function, similar to those of other population-based optimization algorithms. The HSI is evaluated by the F-measure (F-score), which considers both the precision and the recall of a binary classification problem:

F-measure = (1 + β²) · precision · recall / (β² · precision + recall)
          = (1 + β²) · TP / ((1 + β²) · TP + β² · FN + FP).        (8)

The F-measure can be interpreted as a weighted average of precision and recall; its value is best at 1 and worst at 0. β is a parameter that adjusts the relative importance of precision versus recall (typical values are β = 0.5, 1, 2). The F-measure is usually close to the smaller of precision and recall, so a large F-measure means that both precision and recall are large. Consequently, a good detector is analogous to a habitat with a high HSI and is included in the final ensemble detector, whereas a poor detector is analogous to a habitat with a low HSI and is discarded from the final ensemble detector. In this paper, β = 1 is specified.
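With β fixed, the second form of (8) lets the F-measure be computed directly from confusion-matrix counts, without forming precision and recall first. A small sketch (the counts are hypothetical):

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score computed directly from counts, as in Eq. (8)."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

# With beta = 1 (as in the paper) this is the harmonic mean of precision and recall.
print(round(f_measure(tp=20, fp=12, fn=4), 4))  # → 0.7143
```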
4.3. Results Presentation and Discussions. In the data mining and machine learning communities, the SVM-based method has
Table 2: Detection performance of the local ensemble detector. Values are ACC/TPR/FPR.

Ensemble size  N7                    N8                    N9                    N10
5              0.8700/0.5833/0.1181  0.7900/0.3333/0.1809  0.8267/0.5000/0.1549  0.8267/0.5714/0.1608
10             0.8800/0.6667/0.1111  0.8033/0.3889/0.1702  0.8267/0.4375/0.1514  0.8333/0.6429/0.1573
15             0.8900/0.7500/0.1042  0.8167/0.5000/0.1631  0.8433/0.5000/0.1373  0.8600/0.7143/0.1329
20             0.8933/0.8333/0.1042  0.8200/0.5000/0.1596  0.8367/0.5000/0.1444  0.8567/0.7143/0.1364
Table 3: Detection performance of the global ensemble detector [7]. Values are ACC/TPR/FPR.

Combined ensemble size  N7                    N8                    N9                    N10
20                      0.9467/0.8333/0.0486  0.9300/0.7778/0.0603  0.9467/0.7500/0.0423  0.9500/0.7857/0.0420
40                      0.9700/0.7500/0.0208  0.9433/0.8333/0.0496  0.9710/0.8938/0.0246  0.9650/0.8929/0.0315
60                      0.9700/0.8333/0.0243  0.9733/0.8889/0.0213  0.9800/0.9375/0.0176  0.9783/0.9357/0.0196
80                      0.9817/0.9583/0.0174  0.9800/0.9444/0.0177  0.9767/0.9375/0.0211  0.9780/0.9714/0.0217
been widely used for classification problems, separating the data belonging to different classes by fitting a hyperplane. The one-class SVM, as a variation of this method, is especially favored for anomaly detection [42–44]; in this paper it was used to train the base detectors. The dataset of each sensor node was divided into two parts: about 66% was used for training the local detector, and the remainder, as the test set, was used to evaluate the proposed method.
Online Bagging, a commonly used ensemble strategy, was used to build the initial ensemble detector. Our experiments aim to achieve two goals: first, to prove the effectiveness of the proposed method based on ensemble learning theory; second, to prove that the pruned ensemble detector can obtain better (or at least equal) performance compared with the initial ensemble detector while mitigating the resource requirement. Accordingly, three experiments were performed: the local ensemble anomaly detector, which considers only the temporal correlation of each sensor node; the global ensemble anomaly detector, which considers the spatiotemporal correlation; and the global pruned ensemble anomaly detector based on BBO. The experimental results can be seen in Tables 2, 3, and 4, respectively.
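Online Bagging (in Oza's formulation) approximates bootstrap resampling on a data stream by updating each base learner k ~ Poisson(1) times per arriving sample. A self-contained sketch, using a simple running-statistics threshold detector as a stand-in for the paper's one-class SVM base detectors (the data stream and all names here are illustrative, not the authors' code):

```python
import math, random

def poisson1(rng):
    """Sample k ~ Poisson(1) by Knuth's product method (stdlib only)."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

class OnlineBagging:
    """Oza-style online bagging: each arriving sample updates each base
    learner k ~ Poisson(1) times, approximating bootstrap resampling."""
    def __init__(self, learners, seed=0):
        self.learners = learners
        self.rng = random.Random(seed)

    def update(self, x):
        for learner in self.learners:
            for _ in range(poisson1(self.rng)):
                learner.update(x)

    def predict(self, x):
        votes = sum(learner.predict(x) for learner in self.learners)
        return 1 if votes > len(self.learners) / 2 else 0  # majority vote: 1 = anomaly

class ThresholdDetector:
    """Stand-in base detector: flags samples more than 3 running standard
    deviations from the running mean (Welford's online statistics)."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
    def update(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)
    def predict(self, x):
        if self.n < 2:
            return 0
        std = math.sqrt(self.m2 / (self.n - 1))
        return 1 if abs(x - self.mean) > 3 * std else 0

ensemble = OnlineBagging([ThresholdDetector() for _ in range(10)])
for t in range(500):  # synthetic normal readings around 18 degrees
    ensemble.update(18.0 + 0.2 * math.sin(t) + random.Random(t).gauss(0, 0.1))
print(ensemble.predict(30.0))  # prints 1: the injected value is flagged as anomalous
```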
Table 2 shows the performance of each sensor node under different ensemble sizes when the spatial correlation of the sensed data in a cluster is not taken into account. Although the ensemble detection performance gradually improves with increasing ensemble size (higher ACC and TPR and lower FPR mean better performance), the overall performance is relatively low: the maximum detection accuracy is only 89.33%, most of the true positive rates are unacceptable, and most of the false positive rates (FPR) are relatively high. All these results indicate that the performance of the local ensemble detector is poor. Table 3 shows the global detection performance of each sensor node. Here, after the local ensemble detector was trained, each member node sent its local ensemble to the others to form the global ensemble detector, and each member node used this global detector to test its local observations online. From the results of Table 3 [7], the detection performances are obviously higher than those in Table 2; with the help of the neighbors' detectors, the detection results improve as the ensemble size increases.
To further optimize the performance of the proposed algorithm and save resources, ensemble pruning is applied to the global ensemble detector. Table 4 [7] shows the detection performance of the pruned global ensemble detector based on BBO.
Table 4 shows a more practicable result: the size of the global ensemble decreases sharply, while the detector performance is as good as or better than that of the initial global ensemble detector. From the results of Table 5, when the size of the initial ensemble reaches 80, 60% of the resource cost is saved. In our experiment, only to validate the method's effectiveness, we set the ensemble sizes to 5, 10, 15, and 20 for each local ensemble detector, which may be small for practical applications. In fact, how many local ensemble detectors to choose is an open topic, decided by many factors such as the computation capability, communication cost, and memory usage of the sensor node, the expected detection accuracy, and so on. In practical applications, a trade-off is commonly made.
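The saving rates reported in Table 5 are simply the fraction of ensemble members removed by pruning, which translates directly into saved memory and communication cost:

```python
def saving_rate(initial, pruned):
    """Fraction of ensemble members (and hence resource cost) removed by pruning."""
    return (initial - pruned) / initial

# Reproduces the percentages reported in Table 5.
for initial, pruned in [(20, 14), (40, 23), (60, 27), (80, 32)]:
    print(f"{initial} -> {pruned}: {saving_rate(initial, pruned):.1%}")
```

Running this prints 30.0%, 42.5%, 55.0%, and 60.0%, matching the table.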
5. Conclusion and Future Work
After exploiting the spatiotemporal correlation existing in the sensed data of WSNs, and motivated by the advantages of online ensemble learning, a distributed online ensemble anomaly detection method has been proposed. Because of the specific resource constraints in WSNs, ensemble pruning based on BBO is employed to mitigate the high resource requirement and obtain an optimized detector that performs at least as well as the original one. The experimental results on a real dataset demonstrate that the proposed method is effective.
Because the diversity of the base learners is a key factor in the performance of ensemble learning, as a possible extension of this work we plan to include some diversity
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7]. Values are ACC/TPR/FPR.

Ensemble size (BBO pruned)  N7                    N8                    N9                    N10
14                          0.9480/0.8000/0.0458  0.9327/0.7667/0.0567  0.9500/0.8125/0.0423  0.9533/0.8571/0.0420
23                          0.9710/0.7750/0.0208  0.9447/0.8000/0.0461  0.9733/0.9250/0.0239  0.9697/0.9143/0.0276
27                          0.9713/0.8500/0.0236  0.9683/0.8333/0.0230  0.9810/0.9563/0.0176  0.9797/0.9357/0.0182
32                          0.9820/0.9750/0.0177  0.9750/0.8333/0.0160  0.9820/0.9500/0.0162  0.9830/0.9786/0.0168
Table 5: Rate of resource-cost saving of the BBO-pruned global ensemble detector.

Number  Initial ensemble size  Pruned ensemble size  Saving of resource cost
1       20                     14                    30%
2       40                     23                    42.5%
3       60                     27                    55%
4       80                     32                    60%
measures in the fitness function to improve the detection performance. Besides, since the cost of communication is the main cause of the quick energy depletion of sensor nodes, especially the cluster head, the adaptive selection of the cluster head based on its energy state will be taken into account to lengthen the lifetime of WSNs in future work.
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of Top Key Discipline of Computer Software and Theory in Zhejiang Provincial (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research—four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Seguí, L. Igual, and J. Vitrià, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Désir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavaldà, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, São Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millán, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.
[38] J W Branch C Giannella B Szymanski R Wolff and HKargupta ldquoIn-network outlier detection in wireless sensornetworksrdquo Knowledge and Information Systems vol 34 no 1pp 23ndash54 2013
[39] M Moshtaghi T C Havens J C Bezdek et al ldquoClusteringellipses for anomaly detectionrdquo Pattern Recognition vol 44 no1 pp 55ndash69 2011
[40] S Rajasegarar J C Bezdek C Leckie and M PalaniswamildquoElliptical anomalies in wireless sensor networksrdquo ACM Trans-actions on Sensor Networks vol 6 no 1 pp 1ndash28 2009
[41] M A Rassam A Zainal and M A Maarof ldquoOne-classprincipal component classifier for anomaly detection inwirelesssensor networkrdquo in Proceedings of the 4th International Confer-ence on Computational Aspects of Social Networks (CASoN rsquo12)pp 271ndash276 IEEE Sao Carlos Brazil November 2012
[42] H Sagha H Bayati J D R Millan and R Chavarriaga ldquoOn-line anomaly detection and resilience in classifier ensemblesrdquoPattern Recognition Letters vol 34 no 15 pp 1916ndash1927 2013
[43] M Hejazi and Y P Singh ldquoOne-class support vector machinesapproach to anomaly detectionrdquo Applied Artificial Intelligencevol 27 no 5 pp 351ndash366 2013
[44] Y Zhang NMeratnia and P JMHavinga ldquoDistributed onlineoutlier detection in wireless sensor networks using ellipsoidalsupport vector machinerdquo Ad Hoc Networks vol 11 no 3 pp1062ndash1074 2013
International Journal of
AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
RoboticsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Active and Passive Electronic Components
Control Scienceand Engineering
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
RotatingMachinery
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporation httpwwwhindawicom
Journal ofEngineeringVolume 2014
Submit your manuscripts athttpwwwhindawicom
VLSI Design
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Shock and Vibration
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawi Publishing Corporation httpwwwhindawicom
Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
SensorsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Navigation and Observation
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
DistributedSensor Networks
International Journal of
International Journal of Distributed Sensor Networks 11
Table 4: Detection performance of the global ensemble detector based on BBO pruning [7].

Ensemble size (BBO pruned) | N7 (ACC, TPR, FPR) | N8 (ACC, TPR, FPR) | N9 (ACC, TPR, FPR) | N10 (ACC, TPR, FPR)
14 | 0.9480, 0.8000, 0.0458 | 0.9327, 0.7667, 0.0567 | 0.9500, 0.8125, 0.0423 | 0.9533, 0.8571, 0.0420
23 | 0.9710, 0.7750, 0.0208 | 0.9447, 0.8000, 0.0461 | 0.9733, 0.9250, 0.0239 | 0.9697, 0.9143, 0.0276
27 | 0.9713, 0.8500, 0.0236 | 0.9683, 0.8333, 0.0230 | 0.9810, 0.9563, 0.0176 | 0.9797, 0.9357, 0.0182
32 | 0.9820, 0.9750, 0.0177 | 0.9750, 0.8333, 0.0160 | 0.9820, 0.9500, 0.0162 | 0.9830, 0.9786, 0.0168
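The ACC, TPR, and FPR columns in Table 4 are the standard confusion-matrix metrics for binary anomaly detection. A minimal sketch of how they are computed; the counts below are hypothetical, not taken from the paper's experiments:

```python
# Standard detection metrics from confusion-matrix counts:
#   tp = anomalies correctly flagged, fn = anomalies missed,
#   fp = normal readings falsely flagged, tn = normal readings passed.
def detection_metrics(tp, tn, fp, fn):
    """Return (ACC, TPR, FPR) as used in Table 4."""
    acc = (tp + tn) / (tp + tn + fp + fn)  # overall accuracy
    tpr = tp / (tp + fn)                   # true positive (detection) rate
    fpr = fp / (fp + tn)                   # false positive (false alarm) rate
    return acc, tpr, fpr

# Hypothetical counts for illustration: 10 true anomalies, 100 normal samples.
acc, tpr, fpr = detection_metrics(tp=8, tn=95, fp=5, fn=2)
print(f"ACC={acc:.4f}, TPR={tpr:.4f}, FPR={fpr:.4f}")
```

A good pruned ensemble drives ACC and TPR up while keeping FPR low, which is the trade-off the rows of Table 4 trace as the ensemble size grows.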
Table 5: Rate of resource-cost saving based on the BBO-pruned global ensemble detector.

Number | Initial ensemble size | Pruned ensemble size | Saving of resource cost
1 | 20 | 14 | 30%
2 | 40 | 23 | 42.5%
3 | 60 | 27 | 55%
4 | 80 | 32 | 60%
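The saving figures in Table 5 follow directly from the fractional reduction in ensemble size achieved by pruning, which is also the fraction of per-detector broadcast and evaluation cost removed. A minimal check:

```python
# The "saving of resource cost" column in Table 5 is the fractional
# reduction in ensemble size achieved by BBO pruning.
def saving_rate(initial_size, pruned_size):
    """Percentage of ensemble members (and hence their cost) removed."""
    return 100.0 * (1.0 - pruned_size / initial_size)

# The four (initial, pruned) pairs reported in Table 5.
for initial, pruned in [(20, 14), (40, 23), (60, 27), (80, 32)]:
    print(f"{initial} -> {pruned}: {saving_rate(initial, pruned):.1f}% saved")
# prints 30.0%, 42.5%, 55.0%, 60.0%, matching Table 5
```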
measures in the fitness function to further improve detection performance in future work. Besides, since communication cost is the main cause of rapid energy depletion in sensor nodes, especially in the cluster head, adaptive cluster-head selection based on each node's energy state will be considered in future work to extend the lifetime of WSNs.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Scientific Instrument and Equipment Development Project (2012YQ15008703), the Zhejiang Provincial Natural Science Foundation of China (LY13F020015), the Open Project of Top Key Discipline of Computer Software and Theory in Zhejiang Province (ZC323014100), the National Science Foundation of China (61473182), the Science and Technology Commission of Shanghai Municipality (11JC1404000, 14JC1402200), and the Shanghai Rising-Star Program (13QA1401600).
References
[1] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier detection techniques for wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, vol. 12, no. 2, pp. 159–170, 2010.
[2] Y. Zhang, N. A. S. Hamm, N. Meratnia, A. Stein, M. van de Voort, and P. J. M. Havinga, "Statistics-based outlier detection for wireless sensor networks," International Journal of Geographical Information Science, vol. 26, no. 8, pp. 1373–1392, 2012.
[3] C. Peng and Q.-L. Han, "A novel event-triggered transmission scheme and L2 control co-design for sampled-data control systems," IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2620–2626, 2013.
[4] S. Rajasegarar, C. Leckie, M. Palaniswami, and J. C. Bezdek, "Distributed anomaly detection in wireless sensor networks," in Proceedings of the 10th IEEE Singapore International Conference on Communication Systems (ICCS '06), pp. 1–5, IEEE, Singapore, October 2006.
[5] S. Rajasegarar, C. Leckie, and M. Palaniswami, "Anomaly detection in wireless sensor networks," IEEE Wireless Communications, vol. 15, no. 4, pp. 34–40, 2008.
[6] M. Xie, S. Han, B. Tian, and S. Parvin, "Anomaly detection in wireless sensor networks: a survey," Journal of Network and Computer Applications, vol. 34, no. 4, pp. 1302–1325, 2011.
[7] Z. Ding, M. Fei, D. Du, and S. Xu, "Online anomaly detection method based on BBO ensemble pruning in wireless sensor networks," in Life System Modeling and Simulation, vol. 461 of Communications in Computer and Information Science, pp. 160–169, Springer, Berlin, Germany, 2014.
[8] T. G. Dietterich, "Machine-learning research: four current directions," AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.
[9] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.
[10] N. Shahid, I. H. Naqvi, and S. B. Qaisar, "Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey," Artificial Intelligence Review, vol. 137, pp. 1–36, 2012.
[11] D. Du, K. Li, and M. Fei, "A fast multi-output RBF neural network construction method," Neurocomputing, vol. 73, no. 10–12, pp. 2196–2202, 2010.
[12] P. Gil, A. Santos, and A. Cardoso, "Dealing with outliers in wireless sensor networks: an oil refinery application," IEEE Transactions on Control Systems Technology, vol. 23, no. 4, pp. 1589–1596, 2014.
[13] M. A. Rassam, M. A. Maarof, and A. Zainal, "Adaptive and online data anomaly detection for wireless sensor systems," Knowledge-Based Systems, vol. 60, pp. 44–57, 2014.
[14] S. Rajasegarar, A. Gluhak, M. Ali Imran, et al., "Ellipsoidal neighbourhood outlier factor for distributed anomaly detection in resource constrained networks," Pattern Recognition, vol. 47, no. 9, pp. 2867–2879, 2014.
[15] N. Lu, G. Zhang, and J. Lu, "Concept drift detection via competence models," Artificial Intelligence, vol. 209, pp. 11–28, 2014.
[16] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[17] S. Segui, L. Igual, and J. Vitria, "Bagged one-class classifiers in the presence of outliers," International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350014, 2013.
[18] N. Duffy and D. Helmbold, "Boosting methods for regression," Machine Learning, vol. 47, no. 2-3, pp. 153–200, 2002.
[19] W.-C. Chang and C.-W. Cho, "Online boosting for vehicle detection," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 892–902, 2010.
[20] C. Desir, S. Bernard, C. Petitjean, and L. Heutte, "One class random forests," Pattern Recognition, vol. 46, no. 12, pp. 3490–3506, 2013.
[21] A. Fern and R. Givan, "Online ensemble learning: an empirical study," Machine Learning, vol. 53, no. 1-2, pp. 71–109, 2003.
[22] A. Bifet, G. Holmes, B. Pfahringer, and R. Gavalda, "Improving adaptive bagging methods for evolving data streams," in Advances in Machine Learning, vol. 5828 of Lecture Notes in Computer Science, pp. 23–37, Springer, Berlin, Germany, 2009.
[23] D. I. Curiac and C. Volosencu, "Ensemble based sensing anomaly detection in wireless sensor networks," Expert Systems with Applications, vol. 39, no. 10, pp. 9087–9096, 2012.
[24] X. Zhou, S. Li, and Z. Ye, "A novel system anomaly prediction system based on belief Markov model and ensemble classification," Mathematical Problems in Engineering, vol. 2013, Article ID 179390, 10 pages, 2013.
[25] H. He, S. Chen, K. Li, and X. Xu, "Incremental learning from stream data," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 1901–1914, 2011.
[26] D. Du, K. Li, X. Li, and M. Fei, "A novel forward gene selection algorithm for microarray data," Neurocomputing, vol. 133, pp. 446–458, 2014.
[27] H. Ma, "An analysis of the equilibrium of migration models for biogeography-based optimization," Information Sciences, vol. 180, no. 18, pp. 3444–3464, 2010.
[28] D. Simon, "Biogeography-based optimization," IEEE Transactions on Evolutionary Computation, vol. 12, no. 6, pp. 702–713, 2008.
[29] S. Sheen, R. Anitha, and P. Sirisha, "Malware detection by pruning of parallel ensembles using harmony search," Pattern Recognition Letters, vol. 34, no. 14, pp. 1679–1686, 2013.
[30] Y.-Y. Zhang, H.-C. Chao, M. Chen, L. Shu, C.-H. Park, and M.-S. Park, "Outlier detection and countermeasure for hierarchical wireless sensor networks," IET Information Security, vol. 4, no. 4, pp. 361–373, 2010.
[31] C. Peng and M.-R. Fei, "An improved result on the stability of uncertain T-S fuzzy systems with interval time-varying delay," Fuzzy Sets and Systems, vol. 212, pp. 97–109, 2013.
[32] Y. Zhang, Observing the Unobservable: Distributed Online Outlier Detection in Wireless Sensor Networks, University of Twente, Enschede, The Netherlands, 2010.
[33] C. Peng, D. Yue, and M. Fei, "Relaxed stability and stabilization conditions of networked fuzzy control systems subject to asynchronous grades of membership," IEEE Transactions on Fuzzy Systems, vol. 22, no. 5, pp. 1101–1112, 2014.
[34] C. Peng, M.-R. Fei, E. Tian, and Y.-P. Guan, "On hold or drop out-of-order packets in networked control systems," Information Sciences, vol. 268, pp. 436–446, 2014.
[35] M. A. Rassam, A. Zainal, and M. A. Maarof, "An adaptive and efficient dimension reduction model for multivariate wireless sensor networks applications," Applied Soft Computing, vol. 13, no. 4, pp. 1978–1996, 2013.
[36] M. Xie, J. Hu, S. Han, and H.-H. Chen, "Scalable hypergrid k-NN-based online anomaly detection in wireless sensor networks," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1661–1670, 2013.
[37] Intel Berkeley Research Lab (IBRL) dataset, 2004, http://db.csail.mit.edu/labdata/labdata.html.
[38] J. W. Branch, C. Giannella, B. Szymanski, R. Wolff, and H. Kargupta, "In-network outlier detection in wireless sensor networks," Knowledge and Information Systems, vol. 34, no. 1, pp. 23–54, 2013.
[39] M. Moshtaghi, T. C. Havens, J. C. Bezdek, et al., "Clustering ellipses for anomaly detection," Pattern Recognition, vol. 44, no. 1, pp. 55–69, 2011.
[40] S. Rajasegarar, J. C. Bezdek, C. Leckie, and M. Palaniswami, "Elliptical anomalies in wireless sensor networks," ACM Transactions on Sensor Networks, vol. 6, no. 1, pp. 1–28, 2009.
[41] M. A. Rassam, A. Zainal, and M. A. Maarof, "One-class principal component classifier for anomaly detection in wireless sensor network," in Proceedings of the 4th International Conference on Computational Aspects of Social Networks (CASoN '12), pp. 271–276, IEEE, Sao Carlos, Brazil, November 2012.
[42] H. Sagha, H. Bayati, J. D. R. Millan, and R. Chavarriaga, "On-line anomaly detection and resilience in classifier ensembles," Pattern Recognition Letters, vol. 34, no. 15, pp. 1916–1927, 2013.
[43] M. Hejazi and Y. P. Singh, "One-class support vector machines approach to anomaly detection," Applied Artificial Intelligence, vol. 27, no. 5, pp. 351–366, 2013.
[44] Y. Zhang, N. Meratnia, and P. J. M. Havinga, "Distributed online outlier detection in wireless sensor networks using ellipsoidal support vector machine," Ad Hoc Networks, vol. 11, no. 3, pp. 1062–1074, 2013.