Anomaly detection in wireless sensor networks: A survey


  • 8/17/2019 Anomaly Detection in Wireless Sensor Networks- A Survey

    1/25

    This article appeared in a journal published by Elsevier. The attached

    copy is furnished to the author for internal non-commercial research

    and education use, including for instruction at the author’s institution

    and sharing with colleagues.

    Other uses, including reproduction and distribution, or selling or

    licensing copies, or posting to personal, institutional or third party websites are prohibited.

    In most cases authors are permitted to post their version of the

    article (e.g. in Word or Tex form) to their personal website or

    institutional repository. Authors requiring further information

    regarding Elsevier’s archiving and manuscript policies are

    encouraged to visit:

    http://www.elsevier.com/copyright



    Author's personal copy

    Anomaly detection in wireless sensor networks: A survey

    Miao Xie¹, Song Han, Biming Tian, Sazia Parvin

    Digital Ecosystems and Business Intelligence Institute, Curtin University, DEBII, GPO Box U1987, Perth, WA 6845, Australia

    Article info

    Article history:

    Received 19 August 2010

    Received in revised form 10 February 2011

    Accepted 7 March 2011

    Available online 21 March 2011

    Keywords:

    Wireless sensor networks

    Information security

    Anomaly detection

    Abstract

    Security threats to WSNs are increasingly diverse and deliberate, so prevention-based techniques alone can no longer provide WSNs with adequate security. Detection-based techniques, however, can be effective when combined with prevention-based techniques. Anomaly detection is a significant branch of detection-based techniques, and research on it in wired networks and wireless ad hoc networks is already quite mature. Such solutions can rarely be applied to WSNs unchanged, because WSNs are characterized by constrained resources, such as limited energy, weak computation capability, small memory, short communication range, etc. The development of anomaly detection techniques suitable for WSNs is therefore regarded as an essential research area, which will enable WSNs to be much more secure and reliable. This survey first discusses a few of the key design principles for developing anomaly detection techniques in WSNs. It then systematically introduces the state-of-the-art techniques according to WSN architecture (hierarchical/flat) and detection technique category (statistical techniques, rule based, data mining, computational intelligence, game theory, graph based, hybrid, etc.). Approaches that belong to a similar technique category are analyzed and compared, followed by a brief discussion of potential research areas in the near future and a conclusion.

    © 2011 Elsevier Ltd. All rights reserved.

    1. Introduction

    A wireless sensor network (WSN) is made up of a large number of spatially distributed autonomous sensors that jointly monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, motion and pollutants (Yick et al., 2008). To date, WSNs have been successfully applied to many industrial and civil domains, including industrial process monitoring and control, machine health monitoring, environment and habitat monitoring, healthcare applications, home automation, and traffic control. A typical WSN has little or no infrastructure. A WSN deployed in an ad hoc manner is categorized as unstructured; in contrast, a network deployed in a pre-planned manner is categorized as structured. Each sensor node can optionally be equipped with a variety of network services, such as localization, coverage, synchronization, data compression and aggregation, and security, to enhance the network’s overall performance. Sensor nodes communicate with each other through the typical five-layer communication protocol stack, which consists of the physical layer, data link layer, network layer, transport layer, and application layer.

    Because of these properties, a sensor node is extremely restricted in resources, including energy, memory, computing, bandwidth, and communication. Hence, a WSN is vulnerable to both external and internal security threats. In addition, sensor nodes are physically accessible, because the network is usually deployed near the physical source of the event, yet they lack tamper-resistance owing to cost constraints. What is worse, the information exchanged can be captured by any internal or external device, because publicly accessible communication channels are used. In consequence, a WSN is often exposed to multiple security threats, which can be categorized as follows (Lopez and Zhou, 2008):

    • communication attack;
    • denial of service attack;
    • node compromise;
    • impersonation attack;
    • protocol-specific attack.

    Han et al. (2005) also propose a good taxonomy that surveys the security threats according to more detailed criteria.

    Contents lists available at ScienceDirect

    Journal of Network and Computer Applications 34 (2011) 1302–1325

    journal homepage: www.elsevier.com/locate/jnca

    1084-8045/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.

    doi:10.1016/j.jnca.2011.03.004

    Corresponding authors. E-mail addresses: [email protected] (M. Xie), [email protected] (S. Han).

    ¹ Tel.: +61 040 1400624.

    Securing WSN is imperative and challenging accordingly. Prevention-based techniques that fundamentally build upon cryptography are the first line of defense for protecting WSN.

    Based on a secret key management primitive, encryption and authentication are the primary measures in a prevention-based technique, as introduced in the security framework SPINS (Perrig et al., 2001). However, once the first line of defense is broken through, compromised nodes can extract security-sensitive information (e.g. secret keys), leading to breaches of security. Thus, developing detection-based techniques as the second line of defense is of great importance. Intrusion detection is a typical example of detection-based techniques. This concept was originally proposed by Anderson (1980) about three decades ago in the report ‘‘Computer Security Threat Monitoring and Surveillance’’. Intrusion detection is defined as the process of monitoring the events occurring in a computer system or network and analyzing them for any signs of possible incidents, which are violations or imminent threats of violation of computer policies, acceptable use policies, or standard practices (Scarfone and Mell, 2007). Among intrusion detection techniques, anomaly detection (also referred to as outlier detection, deviation detection, etc.; Hu, 2010) is best suited to WSN because its methodology is in general flexible and resource-friendly. Anomaly detection is defined as the process of comparing definitions of activity that is considered normal against observed events in order to identify significant deviations. Moreover, an anomaly in a dataset is defined as an observation that appears to be inconsistent with the remainder of the dataset (Hodge and Justin, 2004).

    An anomaly may be caused not only by security threats, but also by faulty sensor nodes in the network or unusual phenomena in the monitoring zone (Rajasegarar et al., 2008). In the real world, isolated node failures can bring down the entire network, which harms the reliability of a WSN. This survey therefore focuses on anomaly detection techniques in WSN irrespective of what causes the anomaly. An overview of the content of this survey paper is given in Fig. 1.

    1.1. Motivation

    Research on anomaly detection in WSN has attracted much interest in recent years. Rajasegarar et al. (2008), from the ISSNIP (Intelligent Sensors, Sensor Networks and Information Processing, The University of Melbourne, Australia) group, surveyed the related work before 2007 using a simpler criterion: statistical parameter estimation techniques versus non-parametric techniques. Nevertheless, a technology-oriented survey that presents the latest progress in anomaly detection for WSN is still absent.

    Moreover, this paper is intended to act as a guideline for selecting appropriate anomaly detection techniques. By analyzing and comparing the particular approaches that belong to a similar technique category, the advantages and shortcomings of each technique category can be identified. From these, the paper further extracts the key design principles for overcoming possible flaws.

    The detection pattern, which basically concerns who is mainly responsible for the data processing of detection, significantly affects the performance of a detection scheme. The choice of detection pattern depends on the application scenario. A fair understanding of the available anomaly detection patterns can facilitate the development of detection schemes. In consequence, these anomaly detection patterns are surveyed separately in this paper.

    In this survey paper, all detection schemes are divided into two types of detection method: prior-knowledge based and prior-knowledge free. Prior-knowledge-based detection schemes are better suited to applications that favor detection speed; prior-knowledge-free schemes, on the contrary, provide applications with stronger detection generality. Being aware of this distinction helps in selecting anomaly detection techniques optimally. Attribute selection is traditionally a critical issue in a detection system, as using fewer attributes conserves resources. This paper emphasizes the importance of this issue for developing anomaly detectors in WSNs, although a detailed discussion is not given owing to space constraints.

    Finally, the directions of development in this area are examined, and a number of potential research areas for the near future are proposed.

    1.2. State-of-the-art techniques

    Besides anomaly detection, the category of intrusion detection also includes misuse/signature detection and stateful protocol analysis (Scarfone and Mell, 2007). Misuse/signature detection is defined as the process of comparing signatures against observed events to identify possible incidents, where each signature is a pattern corresponding to a known threat. Stateful protocol analysis is defined as the process of comparing predetermined profiles of generally accepted definitions of benign protocol activities for each protocol state against observed events to identify outliers. Misuse/signature detection and stateful protocol analysis need complicated expression computing and/or sizeable memory, which WSNs usually cannot afford. Moreover, they are unable to defend against unknown security threats. Consequently, anomaly detection is currently the dominant technology for enhancing the security and reliability of WSN.

    Fig. 1. The content of this survey paper.

    Though WSN is derived from wireless ad hoc networks, most detection schemes that function well in ad hoc networks are not suitable for WSN, probably because (Akyildiz et al., 2002):

    • the number of sensor nodes in a WSN can be several orders of magnitude higher than that of an ad hoc network;
    • sensor nodes are densely deployed;
    • a sensor node is less stable;
    • the topology of WSNs varies frequently;
    • sensor nodes mainly use a broadcast communication paradigm, whereas ad hoc networks are mainly based on point-to-point communication;
    • each sensor node is highly constrained in energy, computation capability, memory, etc.;
    • sensor nodes may have no global identifications as a result of the large amount of overhead.

    Accordingly, the advanced anomaly detection schemes in ad hoc networks (Qian et al., 2007; Tarique et al., 2009; Wu et al., 2007), as well as those developed for wired networks, cannot be applied directly to WSN.

    In this survey paper, recently proposed detection schemes for WSN are introduced. Because the architecture of a WSN strongly influences many aspects of designing a suitable scheme, these detection schemes are classified as hierarchical or flat (homogeneous) according to their architectures. In a hierarchical WSN, all sensor nodes are grouped or clustered, and a single node in each group or cluster is elected as the cluster head (possibly equipped with stronger capacity) to conduct the organizational functions within its group or cluster. In a flat WSN, on the contrary, all sensor nodes contribute equally to any team functions and participate in internal protocols (e.g. routing protocols). For each architecture, a number of typical examples are given according to the technique category that they belong to.

    As far as the technique categories are concerned, statistical techniques, data mining, and computational intelligence are employed most widely. Statistical techniques consist of statistical distribution (Palpanas et al., 2003; Subramaniam et al., 2006; Liu et al., 2007; Dallas et al., 2007; Li et al., 2008a; Tiwari et al., 2009), statistical measure (e.g. mean, variance, self-defined, etc.) (Zhang et al., 2008; Pires et al., 2004; Onat and Miri, 2005a,b; Li et al., 2008b), and statistical model (e.g. auto regression) (Curiac et al., 2007). Computational intelligence is closely linked to machine learning and remotely linked to data mining. Conceptually, machine learning is more concerned with the design and development of algorithms that enable computers to learn from large-scale datasets. Data mining, however, principally focuses on discovering patterns, associations, changes, anomalies, and statistically significant structures and events in datasets. Under the technique categories of data mining and computational intelligence, several examples are introduced, including clustering algorithms (Rajasegarar et al., 2006; Masud et al., 2009; Wang et al., 2009), support vector machine (SVM) (Rajasegarar et al., 2007), artificial neural network (ANN) (Wang et al., 2009), self-organizing map (SOM) (Wang et al., 2009), genetic algorithm (GA) (Rahul et al., 2009), and association rule learning (Yu and Tsai, 2008). Game theory is dedicated to building smart strategies for identifying vulnerable areas in WSN (Agah et al., 2004a,b). Only one case concentrates on linking detection with prevention to protect a hierarchical WSN from both internal and external attacks (Su et al., 2005). Graph-based techniques specialize in modeling the network flow as a graph (Ngai et al., 2006, 2007), which allows graph algorithms (such as tree construction, depth-first search, etc.) to be applied to detect anomalies. Finally, rule-based techniques, which often build upon prior knowledge such as assumption and experience, are preferred in flat WSNs (Silva et al., 2005; Yu and Xiao, 2006; Ioannis et al., 2007; Ho et al., 2009). Table 1 shows this taxonomy in brief.

    1.3. Key challenge

    The key challenge in developing anomaly detection for WSN is to identify anomalies with high accuracy but minimal energy cost, so as to prolong the lifetime of the entire network. This target can be approached along several paths. First, much more attention should be paid to lightweight detection techniques, which are characterized by compactness and efficiency. Second, restructuring detection schemes in a distributed manner can spread the energy overhead across the entire network and markedly reduce the communication overhead, which extends the lifetime of the network. A suitable detection pattern can also conserve energy without sacrificing security and reliability. In addition, smart strategies such as shrinking the attribute set, compressing the input dataset, and simplifying the procedure of analysis and decision can save considerable energy.

    1.4. Organization

    The rest of this paper is organized as follows. The second section discusses the key design principles of anomaly detection in WSNs in detail. The following two sections introduce many representative detection schemes for hierarchical and flat topologies, respectively. The fifth section presents the analysis and comparison of schemes that belong to a similar technique category. Finally, the survey is summarized, together with a presentation of potential research areas in the near future.

    2. Key design principles

    The key design principles of anomaly detection in WSN cover several aspects:

    • target;
    • typical security threats;
    • detection pattern;
    • detection method;
    • attribute selection.

    Table 1

    Summary of the taxonomy.

    Category                     Techniques
    Statistical                  Distribution; Measure; Model
    Data mining                  Clustering; SVM; Rule learner
    Computational intelligence   SOM; ANN; GA
    Rule                         Assumption; Experience
    Game theory                  Non-cooperative and non-zero-sum
    Graph                        Tree construction; Depth-first search
    Hybrid                       Prevention and detection


    2.1. Target

    The target states what a detection scheme is expected to be able to do. To ensure its performance, a detection scheme is suggested to achieve a target comprising the following (Ioannis et al., 2007):

    Effectiveness: The effectiveness of a detection scheme is reflected by the detection accuracy and the false alarm rate. The detection accuracy rate is the number of successfully detected anomalies divided by the total number of anomalies. False alarms consist of false positives and false negatives: a false positive means that a legitimate activity is falsely identified as an anomaly, and a failure to capture a real anomaly results in a false negative. The false alarm rate is the number of false alarms divided by the number of reported anomalies. A good scheme should reach a high detection accuracy rate while keeping the false alarm rate low. On the other hand, the ability to detect unknown anomalies (new types of anomaly) is also significant, as security threats to WSN are more and more diverse and deliberate. This ability is referred to as detection generality in this paper.
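
    The two effectiveness measures can be illustrated with a minimal sketch. It follows the definitions given in the text literally (false alarms counted as false positives plus false negatives, divided by the number of reported anomalies); the function name effectiveness and the boolean-list input format are assumptions made only for this illustration and do not come from any of the surveyed schemes.

```python
from typing import Sequence, Tuple


def effectiveness(truth: Sequence[bool], reported: Sequence[bool]) -> Tuple[float, float]:
    """Return (detection accuracy rate, false alarm rate) as defined in the text.

    truth[i]    -- True if instance i really is an anomaly
    reported[i] -- True if the detector reported instance i as an anomaly
    """
    if len(truth) != len(reported):
        raise ValueError("truth and reported must have the same length")

    detected = sum(t and r for t, r in zip(truth, reported))          # real anomalies caught
    false_pos = sum((not t) and r for t, r in zip(truth, reported))   # legitimate but flagged
    false_neg = sum(t and (not r) for t, r in zip(truth, reported))   # real anomalies missed

    total_anomalies = detected + false_neg
    reported_anomalies = detected + false_pos
    if total_anomalies == 0 or reported_anomalies == 0:
        raise ValueError("both rates need at least one real and one reported anomaly")

    accuracy = detected / total_anomalies
    # Literal reading of the text: false alarms = false positives + false negatives.
    false_alarm_rate = (false_pos + false_neg) / reported_anomalies
    return accuracy, false_alarm_rate


truth = [True, True, True, False, False, False, False, True]
reported = [True, True, False, True, False, False, False, True]
print(effectiveness(truth, reported))
```

    In this example three of the four real anomalies are caught, one legitimate instance is flagged and one anomaly is missed, so the sketch yields a detection accuracy rate of 0.75 and a false alarm rate of 0.5.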

    Minimized resource: A WSN is characterized by tremendously constrained resources, especially the availability of energy. As a result, minimizing the energy cost is a priority. Using fewer resources partly leads to faster detection speed, but probably at the cost of effectiveness. In consequence, it is difficult to trade off effectiveness against resource usage. Given that most of the energy in a sensor node is drained by radio communication rather than by computation (Roman et al., 2006), activating in-network computing as much as possible, namely computing in a distributed manner, might be a promising way to address this issue. In addition, resources can be conserved through efforts to design lightweight detection schemes as well as smart strategies.

    Trust no node: Unlike nodes in wired networks or ad hoc networks, a sensor node can be compromised easily due to its weakness. Accordingly, a detection scheme has to meet the criterion ‘‘trust-no-node’’ at all times. Building on a security foundation (Zhang et al., 2008; Curiac et al., 2007; Su et al., 2005; Ngai et al., 2006, 2007; Yu and Xiao, 2006; Ho et al., 2009), adding a process of data filtering (Liu et al., 2007), and employing a vote (or similar) mechanism (Liu et al., 2007; Li et al., 2008a,b; Tiwari et al., 2009; Pires et al., 2004; Ioannis et al., 2007) might be effective for directly ensuring the legitimate identity of a sensor node or diluting the bad effects caused by undetected malicious nodes.
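
    A vote mechanism of the kind mentioned above can be sketched as a simple majority rule. This is only an illustration under stated assumptions: the function name majority_vote, the node identifiers, and the quorum parameter are hypothetical, and the cited schemes may weight or filter votes differently.

```python
from typing import Mapping


def majority_vote(votes: Mapping[str, bool], quorum: float = 0.5) -> bool:
    """Decide whether a suspect node is anomalous from its neighbours' votes.

    votes  -- maps a (hypothetical) neighbour id to True if that neighbour
              considers the suspect anomalous
    quorum -- fraction of voters that must be exceeded for a positive decision
    """
    if not votes:
        return False  # no evidence, no accusation
    flagged = sum(votes.values())
    return flagged / len(votes) > quorum


# Three of five neighbours flag the suspect, so the vote passes.
print(majority_vote({"n1": True, "n2": True, "n3": False, "n4": True, "n5": False}))
# A single (possibly compromised) accuser cannot carry the decision alone.
print(majority_vote({"n1": False, "n2": False, "n3": False, "n4": True}))
```

    The point of the design is that no single node is trusted: a decision only stands when more than the quorum of independent observers agree, which dilutes the influence of an individual malicious voter.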

    Be secure: Detection schemes themselves must be secure, because the line of defense would be destroyed completely if sophisticated adversaries disabled or bypassed the detection service before initiating full attacks. In theory, adversaries could use analytical measures to infer what kind of detection rules or algorithms their targeted schemes employ. Furthermore, adversaries might wreck the detection scheme with brute force. The survivability against malicious activities is thus a significant criterion for assessing the security of detection schemes themselves. Moreover, an optimal detection scheme must be able to recover its detection service immediately once it has been wrecked, which is referred to as tolerability.

     2.2. Typical security threats

    The typical security threats to WSN that can be identified by a detection scheme should be fully reviewed. Many surveys of these security threats have been published (Lopez and Zhou, 2008; Han et al., 2005) according to different criteria, but detection is not effective against all of the mentioned threats; an eavesdropping attack, for example, can only be resisted by the built-in security foundation. On the other hand, the threats are sometimes hard to separate from one another. A selective forwarding attack, for example, is a subsequent offence based on a sinkhole attack, whereas a successful sinkhole attack results in not only the following selective forwarding attack, but also a series of severe security damages such as message alteration. The typical security threats and their countermeasures that have been mentioned in the cited papers are therefore only roughly summarized in Table 2. In fact, more comparisons should be made, such as the damage scope of each security threat, the damage degree of each security threat, the symptom of each security threat (relating to attribute selection, see Section 2.5), etc. Owing to space limitations, this full work is expected to be completed separately. Random failure is regarded as a special case of security threats here, as anomaly detection is also able to deal with it.

     2.3. Detection pattern

    Axelsson (1998) proposed a generic framework of intrusion detection systems (IDSs), consisting of audit collection/storage, processing, configuration/reference data, active/processing data, and alarm. Since anomaly detection is a branch of intrusion detection, a generic framework of anomaly detection systems (ADSs) is simply derived from the original IDS framework; it comprises input, data processing, analysis and decision, and output (Chandola et al., 2009). In general, the input for anomaly detection is a dataset that includes a collection of data instances. A data instance consists of a set of attributes and is either univariate or multivariate. An attribute can be binary, categorical, or continuous. In the data processing procedure, a normal profile representing the benign status of the system is produced by a training procedure or from prior knowledge. Certain detection schemes may need a special preprocessing procedure. Depending on the labels available in the input dataset, supervised, semi-supervised, and unsupervised training are the popular methodologies. Relying on the established normal profile, specified algorithms decide during the analysis and decision procedure whether a test instance is an anomaly. Usually, single or multiple thresholds are established for this task. The type of anomaly can be point, contextual, or collective. The final result, namely the output, is produced by the anomaly detector in one of two possible forms: score or label. Fig. 2 illustrates the generic framework of anomaly detection.
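
    The four stages of the generic framework (input, data processing, analysis and decision, output) can be made concrete with a minimal sketch for univariate, continuous readings. The choice of a mean and standard deviation profile, the threshold of three standard deviations, and all function names are assumptions for illustration only; they stand in for whatever profile and algorithm a particular scheme uses.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple

Profile = Tuple[float, float]  # (mean, standard deviation) of normal readings


def build_normal_profile(training: Sequence[float]) -> Profile:
    """Data processing: learn a normal profile from readings assumed benign."""
    return mean(training), pstdev(training)


def anomaly_score(reading: float, profile: Profile) -> float:
    """Analysis: distance of a test instance from the profile, in standard deviations."""
    mu, sigma = profile
    if sigma == 0:
        return 0.0 if reading == mu else float("inf")
    return abs(reading - mu) / sigma


def anomaly_label(reading: float, profile: Profile, threshold: float = 3.0) -> bool:
    """Decision and output: turn the score into a label using a single threshold."""
    return anomaly_score(reading, profile) > threshold


# Input: a small univariate dataset of (hypothetical) temperature readings.
training = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]
profile = build_normal_profile(training)

for reading in (20.1, 35.0):
    print(reading, round(anomaly_score(reading, profile), 2), anomaly_label(reading, profile))
```

    The sketch produces both output forms named in the text: anomaly_score returns a score and anomaly_label returns a label. Because the profile is trained on unlabeled readings that are assumed to be normal, it corresponds to the unsupervised case, and the anomalies it can flag are point anomalies.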

    The detection pattern is basically determined by who takes charge of carrying out the data processing procedure of anomaly detection, since this decides many design details of a scheme as well as its performance. Depending on the architecture of a WSN, a range of detection patterns have been in use, which are briefly described below. Moreover, Table 3 lists these popular detection patterns and their corresponding references, where CH and CSN stand for cluster head and common sensor node, respectively.

    Table 2

    The typical security threats and preferred countermeasures.

    Security threats       Preferred countermeasures
    Black-hole             Statistical measure
    Malicious node         Statistical distribution, data mining
    Sinkhole               Graph, rule
    Selective forwarding   Statistical measure, data mining
    Wormhole               Statistical measure, rule
    Replica node           Rule
    Random failure         Statistical distribution, data mining


    In a hierarchical WSN, there are basically three available detection patterns. First, the cluster head alone is responsible for the data processing procedure (Wang et al., 2009; Su et al., 2005). Second, the cluster head and common sensor nodes cooperate to accomplish it (Palpanas et al., 2003; Subramaniam et al., 2006; Zhang et al., 2008; Rajasegarar et al., 2006, 2007). Third, the procedure is carried out at the base station (Masud et al., 2009; Rahul et al., 2009). In the first pattern, the common sensor nodes only collect the input datasets and, at most, partly contribute to the procedure of analysis and decision; they do not participate in the data processing procedure, for which the cluster head alone is in charge. However, this clearly leads to overuse of energy in the cluster head. As a result, the second and third detection patterns seem more reasonable. None of them considers having the cluster head itself monitored, which may fail to meet the criterion ‘‘trust-no-node’’. One possible remedy is to let the common sensor nodes monitor the cluster head in turns, for example by picking out a subset of nodes according to their remaining energy (Wang et al., 2009; Su et al., 2005). These detection patterns are illustrated in Fig. 3.
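
    The remedy of letting common sensor nodes monitor the cluster head in turns can be sketched as follows. The text only states that nodes are picked according to their remaining energy; choosing the k nodes with the most remaining energy, the function name pick_monitors, and the example values are assumptions for illustration.

```python
from typing import List, Mapping


def pick_monitors(remaining_energy: Mapping[str, float], k: int) -> List[str]:
    """Pick the k common sensor nodes with the most remaining energy.

    remaining_energy -- maps a (hypothetical) node id to its remaining energy,
                        e.g. as a fraction of a full battery
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(remaining_energy, key=lambda node: remaining_energy[node], reverse=True)
    return ranked[:k]


cluster = {"s1": 0.42, "s2": 0.91, "s3": 0.77, "s4": 0.15}
print(pick_monitors(cluster, 2))
```

    Repeating the selection each round, after the energy figures have been updated, rotates the monitoring duty: nodes that have just spent energy on monitoring drop in the ranking and others take over.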

    There are also three broad categories of detection pattern in flat WSNs. First, a subset of nodes is on duty to cover its neighborhood according to a certain specification. In detail, this neighborhood can be ‘‘one-hop’’ (Onat and Miri, 2005a,b), ‘‘radio range’’ (Liu et al., 2007; Pires et al., 2004; Silva et al., 2005), or ‘‘other’’ (Dallas et al., 2007; Yu and Tsai, 2008; Yu and Xiao, 2006; Ioannis et al., 2007; Ho et al., 2009). Each active node takes care of its specified neighborhood by monitoring it and carrying out the data processing procedure. The procedure of analysis and decision may be performed by the active nodes alone or cooperatively. Second, the base station conducts anomaly detection across the network (Curiac et al., 2007; Ngai et al., 2006, 2007). Third, the network is partitioned into groups, and a subset of sensor nodes in each group is activated to take charge of the monitoring and data processing procedure (Li et al., 2008a,b). The common shortcoming of the first pattern is redundant protection coverage, because no mechanism is capable of accurately measuring the maximal protection coverage that the active nodes can afford. The third pattern gives flat WSNs a chance to employ techniques as advanced as those in hierarchical WSNs. However, the grouping procedure certainly brings a massive energy burden. The available detection patterns in flat WSNs are shown in Fig. 4.

     2.4. Detection method

    The detection method is a key point of a detection scheme, as it determines the scope in which the scheme is usable. The applicable range of a scheme is restricted by its preconditions, according to which two detection methods are introduced: prior-knowledge based and prior-knowledge free.

    Fig. 2.  Generic framework of anomaly detection.

    Table 3
    Popular detection patterns.

    Hierarchical WSNs
      Pattern         References
      CH              Wang et al. (2009) and Su et al. (2005)
      CH and CSNs     Palpanas et al. (2003), Subramaniam et al. (2006), Zhang et al. (2008), and Rajasegarar et al. (2006, 2007)
      Base station    Masud et al. (2009) and Rahul et al. (2009)

    Flat WSNs
      Pattern         References
      One-hop         Onat and Miri (2005a) and Onat and Miri (2005b)
      Radio-range     Liu et al. (2007), Pires et al. (2004), and Silva et al. (2005)
      Other           Dallas et al. (2007), Yu and Tsai (2008), Yu and Xiao (2006), Ioannis et al. (2007), and Ho et al. (2009)
      Base station    Curiac et al. (2007) and Ngai et al. (2006, 2007)
      Grouping        Li et al. (2008a,b)

    M. Xie et al. / Journal of Network and Computer Applications 34 (2011) 1302–1325


    Fig. 3.  Available detection patterns (hierarchical). Panels: Pattern 1, CH; Pattern 2, CH & CSNs; Pattern 3, BS. Legend: base station, cluster head, common sensor node, a cluster.

    Fig. 4.  Available detection patterns (flat). Panels: Pattern 1, one-hop; Pattern 2, radio range; Pattern 3, other; Pattern 4, BS; Pattern 5, grouping. Legend: base station, working node, sensor node, a group.


    The knowledge regarding anomaly detection often consists of assumption (Palpanas et al., 2003; Subramaniam et al., 2006; Liu et al., 2007; Dallas et al., 2007; Li et al., 2008a; Pires et al., 2004; Curiac et al., 2007; Ho et al., 2009) and experience (Tiwari et al., 2009; Silva et al., 2005; Ioannis et al., 2007). If a normal profile is produced on the basis of knowledge known in advance instead of by an explicit training procedure, the scheme is categorized as prior-knowledge based. For instance, one detector is built on the assumption that the Mahalanobis squared distance constructed from the networking attributes follows a chi-square distribution (Liu et al., 2007). Based on the assumption that the signal propagates according to a known model (e.g. the two-ray ground model), another detection scheme compares the signal strength estimated from the given model with the real signal strength measured at the transceiver (Pires et al., 2004). Security experts suggest that a node is highly likely to be compromised if it discards more than $w$ percent of the packets during $t$ time units; from this experience, a detection rule is established (Tiwari et al., 2009; Ioannis et al., 2007).
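
    The experience-based rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the packet counters, and the choice to skip windows with no traffic are hypothetical and do not come from the cited schemes.

```python
def is_suspicious(received, forwarded, w_percent):
    """Flag a node that dropped more than w_percent of its packets.

    `received` and `forwarded` are packet counts observed for one node
    during one window of t time units (hypothetical inputs).
    """
    if received == 0:
        # Assumption: with no traffic in the window, nothing can be judged.
        return False
    dropped_percent = 100.0 * (received - forwarded) / received
    return dropped_percent > w_percent
```

    No training step appears anywhere in the sketch: the threshold $w$ is fixed in advance, which is what makes such a scheme prior-knowledge based.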

    A prior-knowledge free scheme performs detection without any related knowledge in advance. The normal profile is produced by a training procedure. All data mining and computational intelligence-based and graph-based detection schemes are prior-knowledge free (Rajasegarar et al., 2006, 2007; Masud et al., 2009; Wang et al., 2009; Rahul et al., 2009; Yu and Tsai, 2008; Ngai et al., 2006, 2007), as are most statistical detection schemes (Zhang et al., 2008; Onat and Miri, 2005a,b; Li et al., 2008b). Classification is a typical detection technique from the data mining family, in which the classifier is built by the training procedure. As to computational intelligence, the genetic algorithm (GA) is a good example: it is applied to measure the fitness of nodes without any prior knowledge (Rahul et al., 2009), so that a detection scheme can be deployed optimally. With the network flow information, sensor nodes are divided into many sub-trees, where the root of the biggest sub-tree is regarded as a compromised node (Ngai et al., 2006, 2007). In addition, the standard deviation of packet arrival intervals during a specified time period is trained as the normal profile for identifying anomalies (Onat and Miri, 2005a), in accordance with
    \[ |\mathrm{mean}(recBuf) - \mathrm{mean}(intBuf)| > K \cdot \mathrm{std}(recBuf). \]

    In conclusion, the dependency on prior knowledge certainly limits the applicability of prior-knowledge-based schemes, but they are generally good at detecting anomalies that closely correlate with their known knowledge. Besides, these schemes usually offer fast detection and are simple to realize. By contrast, prior-knowledge free detection schemes may be slower at detection, but they have a stronger capability of addressing unknown security threats or random failures. Consequently, a rough process of identifying appropriate detection techniques is shown in Fig. 5.
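
    The packet-arrival test attributed above to Onat and Miri (2005a) compares two buffer means against a multiple of a standard deviation. A minimal sketch follows; it assumes that `rec_buf` holds the trained arrival intervals and `int_buf` the recently observed ones, and it uses the sample standard deviation. The survey does not define the two buffers, so both readings are assumptions.

```python
from statistics import mean, stdev


def arrival_anomaly(rec_buf, int_buf, k):
    """Return True if |mean(rec_buf) - mean(int_buf)| > k * std(rec_buf).

    rec_buf: arrival intervals used as the normal profile (assumed meaning).
    int_buf: recently observed arrival intervals (assumed meaning).
    k: sensitivity factor K from the inequality.
    """
    # stdev() needs at least two values in rec_buf.
    return abs(mean(rec_buf) - mean(int_buf)) > k * stdev(rec_buf)
```

    A larger `k` tolerates more drift before raising an alarm, which trades missed detections against false alarms.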

     2.5. Attribute selection

    Most malicious activities or random failures against a WSN are reflected by one or more attributes over the network. In fact, this is the essential reason why anomaly detection can enhance the security and reliability of WSNs. For example, an irregular change of hop count indicates a high likelihood of a sinkhole attack (Dallas et al., 2007; Ngai et al., 2006, 2007); the signal power becomes implausible under Hello flood and wormhole attacks (Pires et al., 2004); insider attacks markedly affect the underlying distribution of the sensed data (Liu et al., 2007); and measurements related to network traffic behavior, such as the packet dropping rate (Ioannis et al., 2007) and the packet arrival process (Onat and Miri, 2005a), are capable of identifying black-hole and selective forwarding attacks. This property makes attribute selection a critical research problem.

    Fig. 5.   Process of identifying detection techniques. DM: Data Mining; CI: Computational Intelligence; IDA: Intrusion Detection Agent; SF: Security Foundation;

    VD: Verifying Dataset; Stat: Statistical Techniques and DAD: Distributed Anomaly Detection.


    Furthermore, a reduced set of attributes can remarkably improve both the detection speed and the detection accuracy (Chebrolu et al., 2005; Kloft et al., 2008). However, this problem remains open in the anomaly detection of WSNs, although a little progress has been made sporadically (Silva et al., 2005; Ho et al., 2009). Owing to space limitations, this issue is left to be addressed separately.

    3. Anomaly detection based on hierarchical WSNs

    In hierarchical WSNs, statistical techniques, data mining and computational intelligence, game theory, and hybrid detection have been employed to realize detection schemes. The input is collected at each common sensor node, possibly followed by a preprocessing procedure or by a part of the computation tasks of the data processing procedure. The original or preprocessed inputs, or local normal profiles, are then sent to the cluster head or base station, where the global normal profile is produced with a training algorithm, some prior knowledge, or a combining algorithm during the data processing procedure. The procedure of analysis and decision is carried out at each common sensor node, at the cluster head, or at both. Finally, the output of anomaly detection is produced in a specified form wherever the analysis and decision procedure has been carried out. Basically, these techniques tend to find a normal profile using a training procedure in order to achieve higher detection generality. Thus, most of their detection methods are prior-knowledge free.

    One common feature of these detection schemes is that they use the hierarchical architecture to implement detection in a distributed manner, which spreads the energy overhead over the entire network and relieves the communication burden. Because distributed detection requires a central entity to organize and coordinate the sub-computation tasks throughout a group, the cluster head suits this role naturally. In a distributed detection scheme, the common sensor nodes participate in the procedure of data processing, thereby taking over a part of the computing cost of the cluster head, and they can exchange less information with the cluster head, which conserves communication cost. For example, the kernel density estimator (Palpanas et al., 2003; Subramaniam et al., 2006), clustering algorithms (Rajasegarar et al., 2006; Masud et al., 2009), and the support vector machine (SVM) (Rajasegarar et al., 2007) are typical techniques upon which these distributed schemes depend. In the following, a number of particular detection schemes are introduced according to their technique categories; for each of them, its principle, detection pattern, detection method, and any unique feature or additional strategy are described in detail.

     3.1. Statistical techniques

     3.1.1. Distributed detection using kernel density estimator 

    A kernel density estimator is built to identify anomalies by estimating the underlying distribution of the sensed data (Palpanas et al., 2003). First, each common sensor node accomplishes the local detection. The cluster head then collects all local normal profiles to carry out the global detection within its group. To ensure the smooth delivery of streaming data, each discrete event occurs under the control of two timing parameters: deadline and importance.

    The principle is described as follows. Let $S$ be a random sample of a static relation $T$, with $n = |S|$, and let $k(x)$ be the kernel function. Over all tuples $t_i$ in $S$, the underlying distribution $f(x)$ is estimated with
    \[ f(x) = \frac{1}{n} \sum_{t_i \in S} k(x - t_i). \]
    The Epanechnikov kernel is employed in this case:
    \[ k(x) = \begin{cases} \dfrac{3}{4}\,\dfrac{1}{B}\left(1 - \left(\dfrac{x}{B}\right)^{2}\right), & \left|\dfrac{x}{B}\right| < 1, \\ 0, & \text{otherwise,} \end{cases} \]
    where $B = \sqrt{5}\,\sigma\,|S|^{-1/5}$ is the bandwidth of the kernel function and $\sigma$ is the standard deviation of $T$. Once $f(x)$ is estimated, anomalies can be identified by calculating the number of sensed values that lie within the neighborhood of $t_0$. $N(t_0, r)$ is the number of sensed values in $T$ that fall into a sphere of radius $r$ around $t_0$:
    \[ N(t_0, r) = \int_{[t_0 - r,\; t_0 + r]} f(x)\,dx. \]
    If $N(t_0, r)$ is less than a threshold $p$, $t_0$ is identified as an anomaly. Afterwards, the sample set $S$ and the kernel bandwidth $B$ at each common sensor node are sent to the cluster head. Using a combining algorithm, the cluster head works out the global normal profile, with which the global detection is then launched.
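
    To make the local detection step concrete, the following one-dimensional sketch evaluates the kernel estimate and the neighborhood test described above. It is an illustration under assumptions, not the implementation of Palpanas et al.: the standard deviation is estimated from the sample itself rather than from the full relation, the integral is approximated with a midpoint rule, and all function names are hypothetical.

```python
import math


def epanechnikov(x, b):
    """Epanechnikov kernel with bandwidth b."""
    u = x / b
    if abs(u) < 1.0:
        return 0.75 / b * (1.0 - u * u)
    return 0.0


def rule_bandwidth(sample):
    """B = sqrt(5) * sigma * |S|^(-1/5); sigma is estimated from the sample (assumption)."""
    n = len(sample)
    mean = sum(sample) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in sample) / n)
    return math.sqrt(5.0) * sigma * n ** (-1.0 / 5.0)


def density(x, sample, b):
    """Kernel density estimate f(x) over the sample."""
    return sum(epanechnikov(x - t, b) for t in sample) / len(sample)


def neighborhood_mass(t0, r, sample, b, steps=200):
    """Approximate the integral of f over [t0 - r, t0 + r] with a midpoint rule."""
    width = 2.0 * r / steps
    total = 0.0
    for i in range(steps):
        x = t0 - r + (i + 0.5) * width
        total += density(x, sample, b)
    return total * width


def is_anomaly(t0, r, sample, b, p):
    """Flag t0 when the estimated mass around it falls below the threshold p."""
    return neighborhood_mass(t0, r, sample, b) < p
```

    Only the sample and the bandwidth are needed to evaluate the estimate, which is why these two items suffice as the local normal profile sent to the cluster head.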

    The kernel density estimator is good at approximating the underlying distribution of a multi-dimensional dataset at a reasonable resource cost. Moreover, it is easy to operate in a distributed manner by combining the bandwidths of the local kernel functions. The choice of kernel function is critical to the performance; however, the estimation of parameters is a hard problem in this kind of non-parametric statistical technique.

     3.1.2. Online detection using kernel density estimator 

    In the advanced kernel density estimator-based detection scheme (Subramaniam et al., 2006), many enhancements are worked out in contrast with the original effort (Palpanas et al., 2003). The online approximation of sensed data in a sliding window is proposed, using the “chain-sample” algorithm. To support the online approximation, several points are improved. First, the size of the resulting set from two sensor nodes is reduced by the technique of warehousing of samples. Second, a suitable technique for computing the standard deviation in a sliding window of streaming data is used to facilitate the combination of bandwidths:
    \[ V_{1,2} = V_1 + V_2 + \frac{N_1 N_2}{N_{1,2}} (\mu_1 - \mu_2)^{2}, \]
    where $\mu$ is the mean, $V$ is the variance, $N_{1,2} = N_1 + N_2$, and $\mu_{1,2} = (\mu_1 N_1 + \mu_2 N_2) / N_{1,2}$. Third, each common sensor node only reports the update of its kernel density estimator with a probability $f = |R_p| / (l\,|R|)$, where a parent node has $l$ children nodes, each with a kernel density estimator of size $|R|$, and the kernel density estimator of the parent node has size $|R_p|$. In addition to the distance-based distributed deviation detection algorithm (Palpanas et al., 2003), a new local metrics-based algorithm, multi-granular deviation detection (MGDD), is introduced. Given that $\mathrm{MDEF}(p, r, \alpha)$ is the deviation factor of an observation $p$, and $\sigma_{\mathrm{MDEF}}(p, r, \alpha)$ is the normalized standard deviation in the sampling neighborhood of $p$, $p$ is flagged as an anomaly if
    \[ \mathrm{MDEF}(p, r, \alpha) > k_\sigma\,\sigma_{\mathrm{MDEF}}(p, r, \alpha), \]
    where $k_\sigma$ is the factor that determines a significant deviation.
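
    The pairwise combination of window statistics given above can be checked numerically. In the sketch below, $V$ is treated as the sum of squared deviations from the mean, for which the identity holds exactly; the survey calls $V$ the variance, so this reading is an assumption, and the helper names are hypothetical.

```python
def summarize(values):
    """Return (N, mean, V) with V the sum of squared deviations (assumed meaning of V)."""
    n = len(values)
    mean = sum(values) / n
    v = sum((x - mean) ** 2 for x in values)
    return n, mean, v


def combine(n1, mean1, v1, n2, mean2, v2):
    """Merge the statistics of two windows without revisiting the raw data."""
    n = n1 + n2
    mean = (mean1 * n1 + mean2 * n2) / n
    v = v1 + v2 + (n1 * n2 / n) * (mean1 - mean2) ** 2
    return n, mean, v
```

    Dividing the merged `v` by the merged count then yields a variance, from which a standard deviation and hence a combined bandwidth can be derived without exchanging the raw readings.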

    Online detection is carried out in this advanced scheme. With a probability-based strategy, the normal profile can be updated regularly to follow the dynamics of the system without incurring too much energy cost. In addition, a new local metrics-based algorithm is introduced for detection, which suits datasets that are indistinguishable by distance.

     3.1.3. Detection using statistical measures

    Relying respectively on spatiotemporal correlation and consistency at some spatial granularity and on a frequency mechanism, a detection scheme is designed to deal with insider attacks (Zhang et al., 2008), such as exceptional messages and abnormal behavior. Two detection mechanisms are introduced: in one, the cluster head covers its group; in the other, each common sensor node watches its one-hop neighbors. A random secret key pre-distribution mechanism cooperates with this detection scheme.

    The principle of the exceptional message detection mechanism (EMDM) is to use the similarity between a pair of messages coming from the common sensor nodes to identify anomalies. Given a dynamic set maintained by the cluster head
    \[ D = \{(M_i, W_i) \mid (M_1, W_1), (M_2, W_2), \ldots, (M_n, W_n)\}, \]
    where $M_i$ stands for a recorded message and $W_i$ is the weight (frequency) of $M_i$. When a new message $M_{new}$ arrives at the cluster head, $M_{new}$ is compared with every entry of $D$. If $M_{new}$ matches any $M_i$ in accordance with

    \[ \mathrm{sim}(M_{new}, M_i) = \frac{V(M_{new}) \cdot V(M_i)}{\lVert V(M_{new}) \rVert \, \lVert V(M_i) \rVert}, \]

    namely the similarity between $M_{new}$ and $M_i$ is less than a threshold, $M_{new}$ is identified as normal and the corresponding $W_i$ increases. Otherwise, $M_{new}$ is put into a new observing period to determine eventually whether it is a new type of message or a fake message. If similar messages come from other nodes during this period, $M_{new}$ is a new type of normal message; otherwise, $M_{new}$ is regarded as a fake message. In that case, the sender of $M_{new}$ is immediately marked as malicious, and the other common sensor nodes and the base station are informed.

    As for the abnormal behavior detection mechanism (ABDM), two measures are employed to identify anomalies. One is to examine whether a common sensor node sends too many or too few messages in a turn. The other is built upon a security foundation. Each common sensor node records its one-hop neighbors' $ID$ and $N(ID_i)$, where $N(ID_i)$ is the value of the abnormal behavior of node $ID_i$. Given
    \[ \varphi(ID_x) = \{(ID_j, N(ID_j)) \mid (ID_1, N(ID_1)), \ldots, (ID_m, N(ID_m))\}, \]
    where $m$ is the number of $ID_x$'s neighbors, and
    \[ \mu_{ID_x} = \frac{1}{m} \sum_{j=1}^{m} N(ID_j), \]
    \[ \sigma_{ID_x} = \sqrt{\frac{1}{m-1} \sum_{j=1}^{m} \left( N(ID_j) - \mu_{ID_x} \right)^{2}}, \]
    \[ \varphi_{ID_x} = \frac{N(ID_j) - \mu_{ID_x}}{\sigma_{ID_x}}, \]
    where $\mu_{ID_x}$ and $\sigma_{ID_x}$ denote the mean and standard deviation of $\varphi(ID_x)$ respectively, if $\varphi_{ID_x}$ deviates from a normal value, node $ID_j$ is reported to the cluster head as a suspicious node.
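
    The second ABDM measure amounts to standardizing each neighbor's abnormal-behavior value against the neighborhood mean and standard deviation. The sketch below illustrates this; the text does not say how far a score must deviate before a neighbor is reported, so the `limit` parameter and the function name are assumptions.

```python
import math


def suspicious_neighbors(abnormal_values, limit):
    """Return the IDs whose standardized abnormal-behavior value exceeds `limit`.

    abnormal_values: dict mapping neighbor ID -> N(ID), at least two entries.
    limit: hypothetical cut-off on the absolute standardized score.
    """
    ids = list(abnormal_values)
    m = len(ids)
    mu = sum(abnormal_values[i] for i in ids) / m
    # Sample standard deviation with the 1 / (m - 1) factor used in the text.
    sigma = math.sqrt(sum((abnormal_values[i] - mu) ** 2 for i in ids) / (m - 1))
    if sigma == 0.0:
        return []  # all neighbors behave identically, nothing stands out
    return [i for i in ids if abs(abnormal_values[i] - mu) / sigma > limit]
```

    Because only a mean and a standard deviation over the one-hop neighborhood are required, each common sensor node can evaluate the test locally.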

    This detection scheme uses a comparatively simple technique, so a faster detection speed is achieved. Because EMDM and ABDM work together, the cluster head and the common sensor nodes perform detection at the same time, which may provide the network with stronger security. However, an apparent flaw exists in EMDM: if more than one malicious node sends the same fake message, EMDM cannot withstand the attack, because similar messages arriving from several nodes are accepted as a new type of normal message.

     3.1.4. Detection using rules based on probability

    Tiwari et al. introduce a probability model (Tiwari et al., 2009) into the rule-based scheme (Ioannis et al., 2007), aiming at black-hole and selective forwarding attacks. By using the probability model to measure the traffic behaviors more accurately, the false alarm rate of the rule-based detection scheme can be sharply reduced. A subset of the common sensor nodes is selected as watchdogs, each monitoring the neighbors within its radio range; the cluster head is responsible for the analysis and decision procedure.

    This scheme employs two detection rules: (A) during a time window of $w$, if the probability $p'$ of packet dropping in a sensor node is greater than a threshold $t$, this node is reported as suspicious; (B) if the probability $p$ of a sensor node being reported as suspicious is greater than 50%, the cluster head marks it as definitely compromised. At each watchdog, the network traffic pattern is modeled with a Poisson distribution. If the expected number of occurrences during a given interval is $\lambda$, the probability of $k$ occurrences ($k$ a non-negative integer, $k = 0, 1, 2, \ldots$) is equal to
    \[ f(k, \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \]
    where $\lambda$ can be estimated through network learning. If a watchdog perceives a sudden change of the network traffic in a sensor node, this node is reported as suspicious to the cluster head. The rest of the watchdogs covering the radio range where the suspicion appears are called on to participate in the procedure of analysis and decision. During this procedure, if the probability $p'$ reported by a watchdog against the suspicious node is greater than $t$, the cluster head records it as “1”, otherwise “0”. After a specified time interval, the cluster head generates a probability sequence against the suspicious node from the reports of the watchdogs. This sequence is split into two-bit pairs; afterwards, all “00” and “11” pairs are eliminated to prevent bias. Let the probability of outcome “0” be $q$ and that of “1” be $1 - q$. $p$ is then computed from the resulting sequence; if (B) is satisfied, the suspicious node is marked as definitively compromised.
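
    Two building blocks of this scheme lend themselves to a short sketch: the Poisson probability used by each watchdog and the elimination of equal two-bit pairs at the cluster head. The function names are hypothetical, and the final computation of $p$ is left out because the text does not specify it.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k occurrences when lam are expected."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_tail(k, lam):
    """Probability of k or more occurrences, P(X >= k)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


def drop_equal_pairs(bits):
    """Split a 0/1 report sequence into two-bit pairs and discard '00' and '11'.

    A trailing unpaired bit is ignored (assumption).
    """
    pairs = [bits[i:i + 2] for i in range(0, len(bits) - 1, 2)]
    return [pair for pair in pairs if pair[0] != pair[1]]
```

    A watchdog could, for instance, treat an observed count as a sudden change when `poisson_tail` for that count is very small under the learned $\lambda$; the cut-off for "very small" would be a design choice outside what the survey states.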

    This scheme improves a rule-based detection scheme by

    taking advantage of probability-based measure, reducing the false

    alarm rate significantly.

     3.1.5. Research problems

    Statistical techniques-based detection schemes are flexible.

    Single or multiple attributes over the network such as the

    network traffic (Tiwari et al., 2009) and the sensed data (multi-

    dimensional) (Palpanas et al., 2003;   Subramaniam et al., 2006)

    can be utilized to construct a variety of statistical distributions; or

    the statistical measurements are dedicated to reflect a normal

    status, such as similarity, mean, variance, standard deviation

    (Zhang et al., 2008), etc. Taking the appropriate statistical dis-

    tributions and measurements into account is necessary for the

    sake of meeting a wider range of application scenarios.

    The benefits of the distributed manner have already been mentioned, and using it as much as possible is strongly encouraged. Statistical techniques have great potential to be reconstructed in a distributed manner, because their core computing tasks can be divided into smaller ones and then combined easily, as with the kernel density estimator (Palpanas et al., 2003; Subramaniam et al., 2006). Moving along this path, detection schemes based on statistical techniques can be implemented with stronger detection generality while remaining resource-efficient.

    Online detection, which is of great significance for many real-time application scenarios, has been achieved with the kernel density estimator technique (Subramaniam et al., 2006). However, it needs smart strategies to greatly reduce the information exchange.


    Incorporating other techniques into statistical techniques

    could boost the detection performance, such as rule-based detec-

    tion technique (Tiwari et al., 2009), where a couple of detection

    rules are set up to avoid the difficulty of training the normal

    profile, but using a probability model to accurately measure the

    traffic behaviors.

     3.2. Data mining and computational intelligence-based techniques

     3.2.1. Distributed detection using K-means clustering 

    With a K-means clustering algorithm, Rajasegarar et al. (2006) design a distributed detection scheme. Each common sensor node locally collects the input dataset to work out a local normal profile. Then the cluster head collects all local normal profiles to accomplish the procedure of data processing, in which a global normal profile is produced. After receiving the global normal profile, each common sensor node initiates the analysis and decision procedure to perform detection. To suit distance-based clustering, the input dataset is normalized at each common sensor node in a preprocessing procedure.

    Given a dataset $v_{kj}$, $k = 1, \ldots, m$, it is transformed to
    \[ u_{kj} = (v_{kj} - \mu_{v_j}) / \delta_{v_j}, \]
    where $\mu_{v_j}$ and $\delta_{v_j}$ stand for the mean and standard deviation of the $j$th attribute in $v_{kj}$, $\forall k$, respectively. Subsequently, $u_{kj}$ is normalized in the interval $[0, 1]$ according to
    \[ \bar{u}_{kj} = (u_{kj} - \min u_j) / (\max u_j - \min u_j). \]
    Given a common sensor node $s_i$ collecting a dataset $X_i$, $s_i$ sends the local normal profile
    \[ \left( \sum_{k=1}^{m} x_k^{i},\; \sum_{k=1}^{m} (x_k^{i})^{2},\; m,\; x_{\max}^{i},\; x_{\min}^{i} \right) \]
    to the cluster head, where $m$ stands for $|X_i|$. After the global normal profile
    \[ (\mu_G,\; \delta_G^{2},\; x_{\max}^{G},\; x_{\min}^{G}) \]
    is computed, the cluster head sends it back to the common sensor nodes. After receiving the global normal profile, each common sensor node initiates detection locally, using a fixed-width clustering algorithm. If the Euclidean distance between a data point and its closest cluster centroid is larger than a user-specified radius $\omega$, a new cluster is formed with this data point as its centroid. To reduce the number of resulting clusters, a cluster merging process is then conducted by measuring the inter-cluster distances. The clusters $c_1$ and $c_2$ merge if their inter-cluster distance $d(c_1, c_2)$ is less than $\omega$. Finally, the average inter-cluster distance of the $K$ nearest neighbor (KNN) clusters is applied to identify anomalous clusters. Let $ICD_i$ be the average inter-cluster distance (KNN) of cluster $i$, and let $AVG(ICD)$ and $SD(ICD)$ be the mean and standard deviation of all inter-cluster distances respectively. If
    \[ ICD_i > SD(ICD) + AVG(ICD), \]
    cluster $i$ is viewed as anomalous.
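
    The exchange of normal profiles and the final cluster test described above can be illustrated for a single attribute. This is a sketch under assumptions: the global variance is taken as the population variance recovered from the summed values and summed squares, and the function names are hypothetical rather than taken from Rajasegarar et al. (2006).

```python
import math


def local_profile(values):
    """Local normal profile: (sum, sum of squares, count, max, min)."""
    return (sum(values), sum(v * v for v in values), len(values),
            max(values), min(values))


def global_profile(profiles):
    """Combine local profiles into (mean, variance, max, min) at the cluster head."""
    total = sum(p[0] for p in profiles)
    total_sq = sum(p[1] for p in profiles)
    count = sum(p[2] for p in profiles)
    mean = total / count
    variance = total_sq / count - mean * mean  # population variance (assumption)
    return mean, variance, max(p[3] for p in profiles), min(p[4] for p in profiles)


def anomalous_clusters(icd):
    """Indices of clusters whose average inter-cluster distance exceeds AVG + SD."""
    n = len(icd)
    avg = sum(icd) / n
    sd = math.sqrt(sum((d - avg) ** 2 for d in icd) / n)
    return [i for i, d in enumerate(icd) if d > avg + sd]
```

    Each node transmits five numbers regardless of how many readings it collected, which is the source of the communication saving noted for this scheme.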

    This detection scheme operates in a distributed manner, where the common sensor nodes are responsible for a part of the global normalizing procedure that serves the core K-means clustering algorithm. A normal profile is made up of a four-parameter tuple, which conserves energy cost in communications.

     3.2.2. Distributed detection using SVM 

    One-class quarter-sphere SVM, a representative SVM algorithm, is also suited to distributed anomaly detection (Rajasegarar et al., 2007). First, the local quarter-sphere is computed at each common sensor node. Second, the cluster head collects these locally computed radii to work out a global radius. Detection is then launched at each common sensor node with the global normal profile.

    The scheme is formulated as the optimization problem
    \[ \min_{R \in \mathbb{R},\; \xi \in \mathbb{R}^{n}} \; R^{2} + \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i, \quad \text{s.t.} \quad \lVert \varphi(x_i) \rVert^{2} \le R^{2} + \xi_i, \quad \xi_i \ge 0, \]
    where $x_i$ is a data vector, the mapped vector $\varphi(x_i)$ is called the image vector, $R$ is the radius of the quarter-sphere, and $\{\xi_i : i = 1, \ldots, n\}$ are the slack variables that allow a part of the image vectors to lie outside the quarter-sphere. This problem can be solved with the Lagrange method. The image vectors may consequently fall inside, on the boundary of, or outside the quarter-sphere (outliers). Subsequently, the cluster head collects the radii locally computed at each common sensor node to obtain a global radius $R_m$. Several measures are available to compute $R_m$: mean, median, maximum, and minimum. When the common sensor nodes receive $R_m$, detection is initiated. If a test instance $x_i$ satisfies
    \[ \operatorname{norm}\, \tilde{k}(x_i, x_i) > R_m^{2}, \]
    $x_i$ is identified as an anomaly.

    This scheme may suffer from a heavier data processing procedure as a result of the high complexity of SVM. However, only one parameter is exchanged as the normal profile between the cluster head and the common sensor nodes, indicating much less communication cost.

     3.2.3. Distributed detection using clustering ellipsoids

    A WSN probably contains multiple types of underlying data distribution across the entire network; accordingly, Moshtaghi et al. propose a distributed detection scheme based on clustering ellipsoids (Masud et al., 2009). The base station takes charge of computing the global hyper-ellipsoid, to accommodate the non-homogeneous underlying data distributions. The common sensor nodes, in turn, are in charge of performing detection with the global hyper-ellipsoid.

    The general form of the elliptical boundary is represented as

    ellða, A; t Þ ¼ f xAR pjð xaÞT  Að xaÞ ¼ t 2g,where a  is the center of the ellipsoid and  t   is its effective radius.

    The Mahalanobis distance of  x  is

    J xmJV 1 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið xmÞT V 1ð xmÞ

    q   ,

    where   m   is the mean and   V   is the covariance matrix. Conse-

    quently, x  is actually resided within a hyper-ellipsoidal boundary

    if its Mahalanobis distance is  t , i.e.:

    Bðm,V 1; t Þ ¼ f xAR pjJ xmJ2V 1 ¼ t 2g: x is considered as a local anomaly if falling outside this boundary.

    The common sensor nodes send their hyper-ellipsoids to the base station as local normal profiles, and the base station produces a global ellipsoid from them. In order to cover as many types of underlying data distribution as possible, $t$ is chosen deliberately. In addition, the ellipsoids reported by the common sensor nodes are clustered, which reduces the redundancy between them. Each common sensor node $N_j$ sends the parameter tuple $(m_j, V_j, n_j)$ of its local ellipse $E_j$ to the base station $B$, and the similarity between two ellipsoids is measured as

    $$S(E_1, E_2) = e^{-\|m_1 - m_2\|}.$$

    A positive root eigenvalue (PRE) plot is employed to estimate the number of clusters $c$. Once the similarities and $c$ are available, the ellipses are merged in a pairwise manner: let $(m_u, V_u, n_u)$ and $(m_v, V_v, n_v)$ be the parameter tuples of the ellipsoids $E_u$ and $E_v$ respectively; the parameter tuple of the global ellipse $E_0$ is then $(m, V, n)$:

    $$n = n_u + n_v,$$

    $$m = \frac{n_u}{n}\, m_u + \frac{n_v}{n}\, m_v,$$

    $$V = \frac{n_u - 1}{n - 1}\, V_u + \frac{n_v - 1}{n - 1}\, V_v + \frac{n_u n_v}{n(n - 1)}\, (m_u - m_v)(m_u - m_v)^T.$$
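    The pairwise merge can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: it assumes that $n$ is the number of measurements summarized by an ellipsoid and that $V$ is the unbiased sample covariance, under which the merged tuple coincides with the statistics of the pooled data. All function names and sample points are hypothetical.

```python
import math


def sample_stats(points):
    """Sample mean and unbiased covariance of a list of p-dimensional points."""
    n, p = len(points), len(points[0])
    m = [sum(x[k] for x in points) / n for k in range(p)]
    V = [[sum((x[r] - m[r]) * (x[c] - m[c]) for x in points) / (n - 1)
          for c in range(p)] for r in range(p)]
    return m, V, n


def similarity(m1, m2):
    """S(E1, E2) = exp(-||m1 - m2||), based on the distance between centers."""
    return math.exp(-math.dist(m1, m2))


def merge(eu, ev):
    """Merge two (m, V, n) tuples with the pairwise rule given in the text."""
    (mu, Vu, nu), (mv, Vv, nv) = eu, ev
    n = nu + nv
    p = len(mu)
    m = [nu / n * mu[k] + nv / n * mv[k] for k in range(p)]
    d = [mu[k] - mv[k] for k in range(p)]
    V = [[(nu - 1) / (n - 1) * Vu[r][c]
          + (nv - 1) / (n - 1) * Vv[r][c]
          + nu * nv / (n * (n - 1)) * d[r] * d[c]
          for c in range(p)] for r in range(p)]
    return m, V, n


# Hypothetical measurements held by two common sensor nodes.
a = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0)]
b = [(10.0, 1.0), (12.0, 3.0), (11.0, 0.0), (13.0, 2.0)]

merged = merge(sample_stats(a), sample_stats(b))
direct = sample_stats(a + b)  # statistics computed on the pooled data
```

    Comparing `merged` with `direct` is a convenient sanity check: the base station never sees the raw measurements, yet under the stated assumptions it recovers the same mean and covariance.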

    This parameter tuple of the global ellipse is in fact the global normal profile. When the common sensor nodes receive it from $B$, detection is launched locally.

    Because the base station undertakes the main computing tasks, this detection scheme is energy-efficient. However, there is scope for better similarity measures for hyper-ellipsoids that take the shape and orientation of the ellipses into consideration, as well as their separation. Moreover, more robust methods are needed to merge ellipses that come from slightly different underlying distributions. In this context, a more appropriate boundary than a standard deviation is also desirable, in order to avoid excessive false positive alarms.

     3.2.4. Detection using multi-agent and refined clustering 

    Wang et al. (2009) introduce a multi-agent-based detection scheme, which takes advantage of the self-organizing map (SOM) neural network algorithm and the K-means clustering algorithm. Detection agents, including sentry, analysis, response, and management agents, are attached to each node in the network and are specifically responsible for detection. In this scheme, the cluster head monitors its common sensor nodes, whereas a subset of the common sensor nodes, chosen according to their remaining energy, is activated to monitor the cluster head.

    In fact, the cluster head and the common sensor nodes monitor each other using the same principle. The input dataset is first clustered by the SOM neural network. Afterwards, the clusters are refined with the K-means clustering algorithm. Let $D_{x_i}$ be the Euclidean distance between $x_i$ and the center of its cluster $X_{j1}$. If $D_{x_i}$ is larger than the distance between $x_i$ and the center of another cluster $X_{j2}$, $x_i$ is re-clustered into cluster $X_{j2}$. The U-Matrix map of the weights generated by the neural network makes it possible to identify anomalies. Once an anomaly is perceived, the trust degree between the two nodes is decreased. A definitive alarm is not produced until the degree of trust falls below a predefined threshold.
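    The refinement step and the trust-based alarm can be sketched as follows. This is a simplified illustration, not the scheme of Wang et al. (2009): the SOM stage and the U-Matrix analysis are omitted, and the function names, the trust decrement `step`, and the `threshold` are hypothetical.

```python
import math


def refine(points, labels, centers):
    """Reassign each point to a strictly closer cluster center, if one exists."""
    new_labels = []
    for x, j1 in zip(points, labels):
        best = j1
        for j2, center in enumerate(centers):
            if math.dist(x, center) < math.dist(x, centers[best]):
                best = j2
        new_labels.append(best)
    return new_labels


def update_trust(trust, anomaly_seen, step=0.2, threshold=0.5):
    """Lower trust on each anomaly; raise the alarm only below the threshold."""
    if anomaly_seen:
        trust = max(0.0, trust - step)
    return trust, trust < threshold


# Hypothetical initial clustering with two misassigned points.
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0)]
labels = [0, 1, 0]
centers = [(0.5, 0.0), (9.0, 9.0)]
refined = refine(points, labels, centers)

# Three consecutive anomalies attributed to the same neighbor.
trust, alarms = 1.0, []
for _ in range(3):
    trust, alarm = update_trust(trust, anomaly_seen=True)
    alarms.append(alarm)
```

    Deferring the alarm until trust has dropped far enough means that a single spurious anomaly does not immediately condemn a node.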

    The participation of agents gives this scheme higher flexibility, but also incurs extra costs. Having the cluster head monitored increases security, as it satisfies the "trust-no-node" principle. However, employing the SOM neural network algorithm and the K-means clustering algorithm at the same time imposes a heavy computational burden.

     3.2.5. Optimized detection using genetic algorithm

    This GA-based scheme does not focus on detection explicitly, but it is able not only to improve the detection accuracy but also to reduce the false alarm rate (Rahul et al., 2009). The scheme allocates the monitoring function to sensor nodes by using GA to evaluate their fitness on the basis of workload patterns, packet statistics, utilization data, battery status, and quality-of-service compliance.

    Specifically, sensor nodes are classified as cluster head (CH), inactive node (powered off), inter-cluster router (ICR), and common sensor node (NS). The base station uses a GA-based fitness function to optimally select a CH or an ICR as the local monitoring node (LMN), where each solution is represented as a binary string (chromosome) with an associated fitness measure. From the mating pool, a solution is picked out with probability $P_i$, as

    $$P_i = \frac{F_i}{\sum_{j=0}^{N} F_j},$$

    where $F_i$ is the functional fitness of a possible solution, and $N$ is the total number of possible solutions. The LMN agent is in charge of monitoring the following properties of its neighbor nodes: (a) received signal strength, (b) transmission periodicity, (c) spurious transmissions from illegitimate nodes, (d) response delay, and (e) packet dropping or modification. In addition, the base station utilizes the LMN as a loop-back agent that transmits special patterns through its trusted route and receives the patterns over a pre-established route, so that malicious nodes can be identified through the transmission of hashed data. Moreover, the base station covers the entire network with optional techniques (statistical metrics and models, Markov models, time series models, etc.) on the basis of analytical traffic data and LMN alerts. The fitness function consists of monitoring node integrity fitness (MIF), monitoring node battery fitness (MBF), monitoring node coverage fitness (MCF), and cumulative trust fitness (CTF). MIF discourages the allocation of an LMN that is suspected to be compromised; the base station estimates MIF with an integrity rank value, whereby a low value indicates high susceptibility to intrusion.

    $$\mathrm{MIF} = \frac{\sum_{ch=1}^{N} IR_{ch}\, K_{ch}}{\sum_{ch=1}^{N} K_{ch}} + \frac{\sum_{icr=1}^{M} IR_{icr}\, K_{icr}}{\sum_{icr=1}^{M} K_{icr}},$$

    $$K_x = 1 \ \text{if}\ x = \mathrm{LMN}; \quad x \in (ch, icr),$$

    $$IR_{icr} = \frac{\sum_{r=1}^{R} IR^{r}_{icr}}{R},$$

    where $IR_{ch}$ and $IR_{icr}$ are the integrity ranks of CH and ICR respectively, $R$ is the number of routes, and $IR^{r}_{icr}$ is the integrity rank of the route $r$ that includes $icr$ as a router in its path. $IR$ is estimated by the base station according to

    $$R_{x,y} = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)\, \mathrm{var}(y)}; \quad -1 < R_{x,y} < 1,$$

    $$\hat{\lambda}(t) = \alpha\, \hat{\lambda}(t - 1) + (1 - \alpha)\, \lambda(t - 1),$$

    $$\mathrm{IDC} = \frac{\mathrm{var}\left(\sum_{k=0}^{n} \lambda_k\right)}{E\left(\sum_{k=0}^{n} \lambda_k\right)},$$

    where $\lambda(t)$ stands for the actual number of packet arrivals during interval $t$, $\hat{\lambda}(t)$ stands for the estimated number of packet arrivals during interval $t$, and $\lambda_k$ is the number of packet arrivals between time intervals $t_k$ and $t_{k+1}$. MBF reflects a penalty on the battery usage of the communication between sensor nodes, as

    $$\mathrm{MBF} = \frac{\sum_{i}^{N} BC_i\, K_i}{\sum_{i}^{N} K_i}, \qquad BC_i = f(Q, U),$$

    where $Q$ is the residual battery capacity and $BC_i$ is the projected battery capacity of node $i$ (CH or ICR). The battery usage rate ($U$) depends on the individual load and can be estimated from traffic patterns and node-sync data. MCF rewards LMNs that can monitor the maximal number of nodes with a low estimated integrity rank:

    $$\mathrm{MCF} = \frac{1}{2}\left(\beta_1 \frac{\sum_{i}^{N} c_i}{F_1\, N} + \beta_2 \frac{\sum_{j}^{M} c_j}{F_2\, M}\right), \qquad \beta_1 + \beta_2 = 1,$$

    where $c_i$ is the number of LMN agents that monitor malicious node $i$, which is below the integrity rank threshold, $c_j$ is the number of LMN agents that monitor non-malicious node $j$, which is above the integrity rank threshold, and $F_1$ and $F_2$ are the expected coverage redundancies for each malicious and non-malicious node respectively. The total fitness is given by CTF, as

    $$\mathrm{CTF} = \alpha_1\, \mathrm{MIF} + \alpha_2\, \mathrm{MBF} + \alpha_3\, \mathrm{MCF}.$$

    This scheme is well suited to cooperate with any detection scheme, since it not only conserves resources but also improves detection performance. Its limitation is that GA suffers from an exponential increase in running time as the network grows.
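    To make the selection step concrete, the following Python sketch combines the three fitness components into CTF and draws a chromosome from the mating pool with probability proportional to its fitness (roulette-wheel selection). It is a minimal illustration under assumptions: the weights, the chromosomes, and their fitness components are invented values, and the estimation of MIF, MBF, and MCF is not shown.

```python
import random


def ctf(mif, mbf, mcf, weights=(0.4, 0.3, 0.3)):
    """Cumulative fitness as a weighted sum; the weights here are hypothetical."""
    a1, a2, a3 = weights
    return a1 * mif + a2 * mbf + a3 * mcf


def selection_probabilities(fitness):
    """P_i = F_i / sum_j F_j for every candidate solution."""
    total = sum(fitness)
    return [f / total for f in fitness]


def roulette_pick(population, fitness, rng):
    """Pick one solution with probability proportional to its fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]


# Hypothetical chromosomes: bit k is 1 if candidate node k acts as an LMN.
population = ["1010", "0110", "1100"]
# Hypothetical (MIF, MBF, MCF) values for each chromosome.
components = [(0.9, 0.5, 0.7), (0.4, 0.8, 0.6), (0.2, 0.3, 0.1)]

fitness = [ctf(*c) for c in components]
probabilities = selection_probabilities(fitness)
parent = roulette_pick(population, fitness, random.Random(42))
```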

     3.2.6. Research problems

    Detection schemes based on data mining and computational intelligence algorithms are characterized by strong detection generality, meaning that they are effective in defending against a wide range of security threats, even unknown ones. This attractive generality, of course, comes with high complexity, so these schemes make every effort to operate in a distributed manner (Rajasegarar et al., 2006, 2007; Masud et al., 2009).

    Besides profiting from the hierarchical architecture of the network, which offers efficient control and management, little routing redundancy, and adaptability to distributed operation, assigning the primary computing tasks to the base station also saves the detection schemes considerably more energy (Masud et al., 2009; Rahul et al., 2009). Equipping each sensor node with detection agents could enhance performance and ease implementation without draining too much energy from the sensor nodes (Wang et al., 2009), but it certainly leads to extra expense on advanced devices.

    In fact, the GA-based scheme (Rahul et al., 2009) is an attractive paradigm for developing intelligent detection schemes for WSNs. Several significant factors relating to the benign status are modeled with a fitness function in each potential solution, according to which the best solution is eventually found by an optimization process. The final detection solution could achieve maximal detection performance with minimal resources. This scheme is able to cooperate with a range of detection techniques and makes them more intelligent.

     3.3. Game theory-based techniques

     3.3.1. Non-cooperative game theory

    A game theory-based scheme is introduced for finding the vulnerable areas in a WSN (Agah et al., 2004a), based on risk factors such as the reliability of a sensor node, the different types of attack, and the past behavior of the attacker. Only the identified areas are protected by detection, in order to save energy.

    Intrusion detection is modeled as a game played between the detection system and the adversary. Each player is allowed to select one strategy from a set of strategies. Given a fixed cluster in the network, say $K$, the following strategies are available to the adversary: attack cluster $K$, do not attack cluster $K$, and attack a different cluster. The detection system responds by either defending cluster $K$ or defending a different cluster. The strategies are numbered 1 to 3 for the adversary and 1 to 2 for the detection system, so that two $2 \times 3$ payoff matrices $A$ and $B$ can be established. The problem is to find the optimal strategies that maximize the profit of both players, namely to achieve a Nash equilibrium.

    Measuring the payoff depends on several factors, including the attack type, the density of sensor nodes, and the number of previous attacks. The Nash equilibrium is achieved when both players select their first strategy. In other words, protecting the cluster that has the highest value of $U(t) - C_k$ yields a reliable rate of successful detection, where $U(t)$ indicates the utility of the network's ongoing sessions, and $C_k$ indicates the average cost of protecting cluster $K$.
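    The equilibrium condition can be checked mechanically for a small bimatrix game. The sketch below searches for pure-strategy Nash equilibria in a $2 \times 3$ game. It assumes that rows belong to the detection system (payoff matrix $A$) and columns to the adversary (payoff matrix $B$); the payoff values are invented for illustration and are not taken from Agah et al. (2004a).

```python
def pure_nash(A, B):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    rows, cols = len(A), len(A[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            # The row player cannot do better by switching rows in column c.
            row_best = all(A[r][c] >= A[r2][c] for r2 in range(rows))
            # The column player cannot do better by switching columns in row r.
            col_best = all(B[r][c] >= B[r][c2] for c2 in range(cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


# Rows: 0 = defend cluster K, 1 = defend a different cluster.
# Columns: 0 = attack K, 1 = do not attack K, 2 = attack a different cluster.
A = [[5, 1, 0],   # hypothetical payoffs of the detection system
     [0, 1, 4]]
B = [[2, 0, 1],   # hypothetical payoffs of the adversary
     [6, 0, -1]]

equilibria = pure_nash(A, B)
```

    With these invented payoffs the only equilibrium is the cell in which both players choose their first strategy, which mirrors the outcome reported in the text.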

     3.3.2. Comparisons with game theory-based scheme

    The non-cooperative game theory-based scheme (Agah et al., 2004a) is then compared with a Markov decision process (MDP) and an intuitive traffic measure (Agah et al., 2004b). Based on a stochastic process known as a Markov chain, an MDP can forecast by modeling the system's past state transitions. An MDP is a tuple $(S, A, R, tr)$, where $S$ is a set of states, $A$ is a set of actions, $R$ is the reward function, and $tr$ is the state-transition function. The past system states and the transitions between states can be described by an MDP model. The goal is to maximize the expected value of the rewards received over time. The traffic measure, on the other hand, relies on an intuitive metric: the cluster that carries the heaviest traffic volume is marked as the most vulnerable area. Because it takes many factors into account, the non-cooperative game theory-based scheme achieves the highest forecasting accuracy of the three.
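    For readers unfamiliar with MDPs, the following generic sketch shows how the expected reward of a tuple $(S, A, R, tr)$ can be maximized with value iteration. It is not the model used in the cited comparison: the two states, two actions, transition probabilities, rewards, and the discount factor `gamma` are all hypothetical.

```python
def q_value(s, a, V, tr, R, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[s][a] + gamma * sum(p * V[s2] for p, s2 in tr[s][a])


def value_iteration(states, actions, tr, R, gamma=0.9, tol=1e-9):
    """tr[s][a] is a list of (probability, next_state); R[s][a] is the reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, tr, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, tr, R, gamma))
              for s in states}
    return V, policy


# Hypothetical model of one cluster observed by the detection system.
states = ["safe", "attacked"]
actions = ["defend", "idle"]
tr = {
    "safe": {"defend": [(0.9, "safe"), (0.1, "attacked")],
             "idle": [(0.6, "safe"), (0.4, "attacked")]},
    "attacked": {"defend": [(0.7, "safe"), (0.3, "attacked")],
                 "idle": [(0.1, "safe"), (0.9, "attacked")]},
}
R = {"safe": {"defend": 4.0, "idle": 5.0},
     "attacked": {"defend": -1.0, "idle": -5.0}}

V, policy = value_iteration(states, actions, tr, R)
```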

     3.3.3. Research problems

    Similar to the GA-based scheme (Rahul et al., 2009) mentioned earlier, non-cooperative game theory-based schemes are not concerned with detection directly; however, they could assist detection schemes in improving their performance as well as their efficiency. The design of the payoff function is crucial to the forecasting accuracy and deserves further study. Moreover, if the GA-based scheme, which is capable of optimizing the placement of the monitoring nodes, could cooperate with the game theory-based scheme, which identifies the vulnerable areas, the detection schemes would be expected to achieve better performance.

     3.4. Hybrid detection

     3.4.1. Detection with prevention technique

    There is only one hybrid detection framework (Su et al., 2005) that truly calls for collaboration between an energy-saving detection technique and an authentication-based prevention technique. In the detection scheme, the cluster head is responsible for monitoring its common sensor nodes; in turn, a subset of the common sensor nodes is selected according to residual energy to monitor the cluster head.

    A suite of secret keys is established during initialization: the base station shares an individual secret key with each common sensor node, each common sensor node shares a set of pairwise secret keys with its neighbors, the common sensor nodes within a cluster share a cluster secret key, and a group secret key is shared among all sensor nodes in the network. The packets transmitted through the network are categorized as control messages and sensed data. When the base station, a cluster head, or any intermediate node forwards a control message, a message authentication code (MAC) generated with the proper secret key is appended. The intermediate nodes forwarding this control message verify the appended MAC and replace it with a new MAC. The verification and replacement of the MAC continue until the control message arrives at its destination. If the sender ($u$) sends a control message ($M$) to the receiver ($v_i$) with the current time stamp $T_c$, a MAC is generated with the proper secret key according to

    $$u \rightarrow v_i : M,\ T_c,\ \mathrm{MAC}(K_{uv_i}, M|T_c),$$

    where $M|T_c$ is the concatenation of $M$ and $T_c$, and $\mathrm{MAC}(K_{uv_i}, M|T_c)$ is the MAC generated from $M|T_c$ with the secret key $K_{uv_i}$, which is shared between $u$ and $v_i$. When a common sensor node ($v_i$) forwards sensed data ($D$) to the cluster head ($u$), $u$ needs to verify $D$ to guard against fake or redundant messages sent by attackers. Because $D$ is usually large and is sent periodically from $v_i$ to $u$, generating MACs along the forwarding path is time-consuming and impractical for a WSN. Consequently, an enhanced authentication scheme based on LEAP is put forward. The original LEAP cannot identify compromised nodes, as all the common sensor nodes within a cluster share only one cluster secret key. First, the enhanced scheme uses pairwise secret keys instead of the cluster secret key used by the original LEAP. Second, the enhanced scheme employs a one-time key chain as session keys, which is fairly efficient for authentication.
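    The hop-by-hop handling of control messages can be sketched with Python's standard `hmac` module. The sketch rests on assumptions that the source does not state: HMAC-SHA256 stands in for the unspecified MAC algorithm, the concatenation $M|T_c$ is encoded with a `|` separator, and the keys and function names are hypothetical. Time-stamp freshness checks and the one-time key chain are omitted.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes, timestamp: int) -> bytes:
    """MAC(K, M|T_c), computed here as HMAC-SHA256 over message and time stamp."""
    data = message + b"|" + str(timestamp).encode()
    return hmac.new(key, data, hashlib.sha256).digest()


def send(key: bytes, message: bytes, timestamp: int):
    """Build the packet (M, T_c, MAC) for the next hop."""
    return message, timestamp, make_mac(key, message, timestamp)


def verify(key: bytes, packet) -> bool:
    """Recompute the MAC and compare it in constant time."""
    message, timestamp, tag = packet
    return hmac.compare_digest(tag, make_mac(key, message, timestamp))


def forward(in_key: bytes, out_key: bytes, packet):
    """An intermediate node verifies the incoming MAC and replaces it."""
    if not verify(in_key, packet):
        raise ValueError("MAC verification failed")
    message, timestamp, _ = packet
    return send(out_key, message, timestamp)


# Hypothetical pairwise keys for the hops u -> v and v -> w.
k_uv = b"pairwise-key-u-v"
k_vw = b"pairwise-key-v-w"

packet = send(k_uv, b"route-update", 1000)
relayed = forward(k_uv, k_vw, packet)
tampered = (b"route-delete", 1000, packet[2])
```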

    Detection targets three types of misbehavior: packet dropping, packet duplicating, and packet jamming. The detection scheme can be divided into two parts: in one, the cluster head monitors its common sensor nodes; in the other, the common sensor nodes monitor their cluster head in turn. In particular, monitoring the cluster head consists of arranging monitoring nodes, reacting to abnormal cluster heads, determining the alarm threshold, and determining the group size. Monitoring the common sensor nodes simply amounts to localizing the suspicious node by its pairwise secret key if an anomaly is found.

    This scheme can be both energy-efficient and strongly secured because it attends to many details, for example linking detection against internal attackers with prevention against external attackers, using a one-time key chain, having the cluster head monitored at minimized energy cost, and quickly localizing compromised nodes with a secret key. However, sensor nodes cannot move and new sensor nodes cannot be added once the pairwise keys have been established. A dynamic key management and distribution mechanism could probably overcome this flaw.

     3.4.2. Research problems

    Few schemes (Zhang et al., 2008) mention cooperating with a prevention-based technique in hierarchical WSNs. Moreover, the security foundation established with a prevention technique serves only to enhance the security of the network, and the functions made available by the secret keys are not exploited. WSNs should already be protected by a security foundation (Perrig et al., 2001). Evidently, a detection scheme will be more efficient if it can utilize the functions provided by this security foundation, rather than using prevention and detection separately.

    4. Anomaly detection based on flat WSNs

    In flat WSNs, rule-based techniques and statistical techniques are more likely to be used. Without a hierarchical architecture, all nodes are equally capable of functioning and participating in internal protocols. Consequently, detection schemes that are lightweight and require less communication are preferable. In this section, we survey some of the representative literature for each technique category mentioned above.

    A rule-based model is commonly developed in accordance with assumptions, information, or experience known in advance. As a result, it often focuses on specific security issues by examining particular attributes of networking behavior. In flat WSNs, statistical techniques are relatively simpler than those for hierarchical WSNs, because of the nature of the architecture. Because data mining and computational intelligence techniques often depend on a central entity to cope with heavy organizational tasks, a flat architecture is naturally ill-suited to them, although these techniques might be implemented with assistance such as the installation of agents (Ho et al., 2009).

    Besides, detection methods in flat WSNs are also diverse. Minimizing energy consumption while retaining good performance is always important, and this is discussed along with the various detection methods used in the detection schemes surveyed below.

    4.1. Rule-based detection

    4.1.1. Decentralized detection using rules

    A decentralized rule-based scheme is proposed (Silva et al., 2005), in which a union of rules picked from a set of candidate rules is applied to satisfy the specific demands of application scenarios. Given a WSN composed of common nodes, monitor nodes, intruder nodes, and a base station, each monitor node is in charge of monitoring the neighbors within its radio range by turning on the promiscuous listening mode.

    In particular, this scheme consists of data acquisition, rule application, and intrusion detection. In the first phase, each monitor node collects messages in promiscuous listening mode and extracts the important information for subsequent analysis. The applicable rules are selected according to requirements during the second phase. In the intrusion detection phase, each failure to match a rule increments the failure counter by one. An alarm is not produced until this counter exceeds a predefined threshold within a round of detection.
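    The counting logic of the intrusion detection phase can be sketched as follows. The two rules, the message fields, and the threshold are hypothetical examples chosen for illustration; the rules actually defined by Silva et al. (2005) are not reproduced here.

```python
def detect_round(messages, rules, threshold):
    """Count rule failures in one round; alarm only if the count exceeds the threshold."""
    failures = 0
    for msg in messages:
        for rule in rules:
            if not rule(msg):
                failures += 1
    return failures, failures > threshold


# Hypothetical rules applied by a monitor node to overheard messages.
KNOWN_NEIGHBORS = {"n1", "n2"}


def delay_rule(msg):
    """A retransmission should happen within 200 ms."""
    return msg["delay_ms"] <= 200


def origin_rule(msg):
    """The sender should be one of the monitor node's known neighbors."""
    return msg["sender"] in KNOWN_NEIGHBORS


# Hypothetical messages collected in promiscuous listening mode.
messages = [
    {"sender": "n1", "delay_ms": 50},
    {"sender": "n9", "delay_ms": 500},
    {"sender": "n2", "delay_ms": 300},
]

failures, alarm = detect_round(messages, [delay_rule, origin_rule], threshold=2)
```

    Tolerating a few failures per round keeps occasional collisions or losses from being reported as intrusions.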

    This scheme provides a good framework for rule-based detection. However, it lacks a clear description of how monitor nodes are determined, in particular how many and which sensor nodes should be on duty to ensure that the entire network is protected.

    4.1.2. Detection using multi-hop ACK

    Building upon a mechanism of multi-hop acknowledgement, a detection scheme is put forward to defend against the selective forwarding attack (Yu and Xiao, 2006). Detection is active along the path that forwards packets from the source node to the base station, in which the base station, intermediate nodes, and source node take part.

    A security foundation has to be established first, including (A) node initialization and deployment, and (B) OHC (one-way hash chain) based one-to-many authentication. The secret key

    server loads every sensor node with a unique secret key and a

    symmetric bivariate polynomial   f (u,   v) in the initialization. The

    unique secret key is shared between this node and the base

    station, and can be used for encrypting messages and genera-

    ting MACs (message authentication codes). In the deployme