dataset generation and network intrusion detection...

DATASET GENERATION AND NETWORK INTRUSION DETECTION BASED

ON FLOW-LEVEL INFORMATION

AHMED ABDALLA MOHAMEDALI ABDALLA

A thesis submitted in fulfilment of the

requirements for the award of the degree of

Doctor of Philosophy (Electrical Engineering)

Faculty of Electrical Engineering

Universiti Teknologi Malaysia

17th

SEPTEMBER 2015

iii

To My Mother for all what she gave to me and for what she is still giving.

I dedicated this humble work to you.

iv

ACKNOWLEDGEMENT

First of all, I would like to express my deepest thanks to ALLAH (SWT),

who guided, helped, and supported me.

I wish to send my serious gratefulness and thankfulness to my research

supervisors, Associate Professor Dr. Sulaiman Mohd Nor and Dr. Shaikh Nasir bin

Shaikh Husin for encouragement, motivation, full support in academic, social, and

technical issues. Truly, I have been overwhelmed by their patience, personal

kindness, valuable comments, and advices during my study. Without their

continuous guidance and help, this thesis would not have been completed.

I have greatly appreciated the opportunity from UTM to purse my PhD, I

thank very much all UTM community including each people at (faculties, library,

staff, employees, students and labourers). Special thankfulness to every member of

people in FKE, CICT, SPS.

Finally, I would take pleasure in gratitude my mother, father and all the

members of my family, friends and relatives for their constant dear one, help and

support which motivate me to face the life difficulties.

Ahmed Abdalla

v

ABSTRACT

The growth of the Internet and networking has made securing networks

against attacks a very challenging task. For high-speed networks, flow meta-data

inspection can replace conventional Deep Packet Inspection but with the cost of low

precision of identifying attacks since the former deals with an aggregated version of

the traffic. The first part of this research addresses the problem of the lack in

benchmarking datasets for developing new Network Intrusion Detection Systems

(NIDSs) or comparing existing NIDSs. The aim in the second part is to design a near

real-time NIDS without degrading the detection accuracy when compared to

conventional misuse packet-based approaches. To achieve the first objective, a NIDS

dataset creation framework had been developed. Based on that framework, a flow-

level NIDS dataset had been created. The traces were collected from campus main

routers in NetFlow format while malicious traffic of different attack scenarios was

generated by Nmap and BoNesi tools. In the second part a flow-based software-

based system were developed to detect and identify network volume-level attacks in

near real-time. Attack detection is based on statistical time-aggregated features of the

exported NetFlow version of the traffic to detect several scan and Denial-of-Service

(DoS) attacks. A validation for the designed system is done using Defense Advanced

Research Projects Agency (DARPA) datasets. The timeline performance

outperformed all relevant software-based systems and suited for up to one gigabit per

second links with an average detection delay of less than one minute. The proposed

method achieved 95% True Positive Rate (TPR) and almost zero False Positive Rate

(FPR). Compared to relevant methods when operated in the same conditions, the

proposed method improved the TPR by 4% and improved FPR by 1%. In addition,

the capability of flow-based approach in detecting packet-level attacks was

experimentally demonstrated. The results against Snort were benchmarked and 75%

TPR and almost zero FPR were achieved.

vi

ABSTRAK

Pertumbuhan Internet dan rangkaian telah menjadikan keselamatan rangkaian

terhadap serangan satu tugas yang sangat mencabar. Untuk rangkaian kelajuan

tinggi, penelitian meta-data boleh mengganti penelitian tahap paket konvensional

tetapi dengan kos ketepatan rendah semasa mengenalpasti serangan memandangkan

tahap-aliran berkait dengan agregasi trafik. Bahagian pertama kajian ini bertujuan

menyumbang dalam menyelesaikan masalah keperluan set data penandaan aras yang

piawai untuk membangun Sistem Pengesan Penceroboh Rangkaian (NIDS) baru atau

membanding dengan yang sedia ada. Tujuan di bahagian kedua adalah untuk

merekabentuk NIDS masa nyata berhampiran tanpa merendahkan ketepatan

pengesanan apabila dibandingkan dengan pendekatan konvensional penyalahgunaan

berasaskan paket. Bagi mencapai objektif pertama satu rangka kerja bagi

mewujudkan dataset NIDS telah dibangunkan. Berdasarkan rangka kerja tersebut

dataset tahap-aliran NIDS telah dibangunkan. Surih dikutip dari penghala utama

kampus dalam format NetFlow manakala trafik hasad dari senario serangan berbeza

telah dihasilkan oleh peralatan Nmap dan BoNesi. Dalam bahagian kedua, satu sistem

berasaskan perisian tahap-aliran telah dibangunkan bagi mengesan dan

mengenalpasti serangan rangkaian tahap-jumlah dalam masa nyata berhampiran.

Pengesanan serangan adalah berdasarkan kepada ciri-ciri statistik masa terkumpul

trafik NetFlow versi yang dieksport bagi mengesan beberapa imbasan dan serangan

Penafian Perkhidmatan (DoS). Pengesahan sistem yang direkabentuk dibuat

menggunakan set data Agensi Projek Penyelidikan Termaju Pertahanan (DARPA).

Prestasi had masa sistem telah mengatasi sistem lain berasaskan perisian dan hanya

sesuai sehingga kelajuan satu gigabit sesaat dengan purata lengah kurang dari satu

minit. Kaedah ini berjaya mencapai 95% Kadar Positif Benar (TPR) dan hampir sifar

Kadar Positif Palsu (FPR). Berbanding dengan kaedah berkaitan apabila beroperasi

dalam keadaan yang sama, kaedah ini telah memperbaiki TPR sebanyak 4% dan

memperbaiki FPR sebanyak 1%. Sebagai tambahan, kemampuan kebolehan

pendekatan berasaskan aliran di dalam mengesan serangan tahap paket telah dapat

ditunjukkan secara eksperimen. Keputusan berbanding dengan Snort telah ditanda

aras dan telah mendapat 75% TPR dan hampir sifar FPR.

vii

TABLE OF CONTENTS

CHAPTER TITLE PAGE

DECLARATION ii

DEDICATION iii

ACKNOWLEDGEMENT iv

ABSTRACT v

ABSTRAK vi

TABLE OF CONTENTS vii

LIST OF TABLES xiii

LIST OF FIGURES xv

1 INTRODUCTION 1

1.1 Overview and Motivation 1

1.2 Background 3

1.2.1 A Simple NIDS Architecture 3

1.2.2 IDS Taxonomy 4

1.2.3 NIDS Required Characteristics 5

1.3 Problem Statement 6

1.4 Research Objectives 8

1.5 Research Scope 9

1.6 Significance of the Research 10

1.7 Thesis Organization 11

2 LITERATURE REVIEW 12

2.1 Introduction 12

2.2 Network Attacks Classification 12

2.2.1 Classification According to Attack Type 13

viii

2.2.1.1 Denial of Service (DoS) 14

2.2.1.2 Probing 15

2.2.1.3 Compromises 16

2.2.1.4 Replicators (Viruses/Worms/Trojans) 16

2.2.2 Classification According to Attack Approach 17

2.2.2.1 Brute-force Attacks 18

2.2.2.2 Packet Payload Attacks 18

2.3 Network Intrusion Detection Systems 19

2.3.1 Packet-level Versus Flow-level NIDS 19

2.3.1.1 Packet-level NIDS 20

2.3.1.2 Flow-level NIDS 21

2.4 IP Flows and Evolution of Flow-level Intrusion

Detection

23

2.4.1 Network Flow 23

2.4.2 Flow Export Protocols 25

2.5 Overview of Flow-level NIDS 26

2.6 Overview of Flow-level NIDS Datasets 30

2.6.1 Knowledge Acquisition in NIDSs 33

2.6.2 Classification and Categories of NIDS Datasets 34

2.6.3 Required Characteristics and Challenges of

Ideal NIDS Datasets

36

2.6.3.1 Labeled Dataset 36

2.6.3.2 Realistic Dataset 37

2.6.3.3 Sizable Dataset 39

2.6.3.4 Scalable Dataset 40

2.6.3.5 Complete Capture Traces Dataset 40

2.6.3.6 Publicy Available Dataset 41

2.6.4 Significant NIDS Datasets and Traffic

Captures

42

2.6.4.1 DARPA Family 42

2.6.4.2 KDDcup99 42

2.6.4.3 NSL-KDD 43

2.6.4.4 DEFCON 43

ix

2.6.4.5 CAIDA 43

2.6.4.6 LBNL 44

2.6.4.7 TUIDS 44

2.6.4.8 Summary of NIDS Datasets 45

2.7 Network Traffic Generating, Capturing and

Classification Tools

46

2.7.1 Traffic Capturing and Packet Analyzing Tools 46

2.7.1.1 Tcpdump 46

2.7.1.2 Wireshark 46

2.7.1.3 Gulp 47

2.7.2 Netflow Processing Tools 47

2.7.2.1 Nfsen 47

2.7.2.2 Softflowd 47

2.7.2.3 Nfdump 48

2.7.3 Attack Tools 48

2.7.3.1 Nmap 48

2.7.3.2 Bonesi 48

2.8 Evaluation Measures 49

2.8.1 Accuracy 49

2.8.1.1 Sensetivity and Specificity 50

2.8.1.2 Confusion Matrix 51

2.8.2 Timeliness 54

2.8.3 Attack Identification 54

2.9 Machine Learning in Intrusion Detection 55

2.10 Chapter Summary 57

3 RESEARCH METHODOLOGY 59

3.1 Introduction 59

3.2 Overall Research Methodology Framework 59

3.3 Dataset Creation Framework 63

3.4 Creation of Brute-force Attacks Dataset 62

3.4.1 Specifications of the Dataset 63

3.4.1.1 Malicious Traffic and Attacks in the Dataset 63

x

3.4.2 Testbed and Capturing Plan Setup 65

3.4.3 Features Selection 67

3.5 Flow-level Detection of Brute-force Attacks 68

3.5.1 Design Framework of the Flow-level Brute-

force Attack Detector

68

3.5.2 Time-based Aggregation of Traffic 70

3.5.3 Source-based and Destination-based

Aggregation of Traffic

72

3.5.4 Host Destination-based Aggregation of Traffic 74

3.5.5 Labeling TAT Records 75

3.5.6 Detection Model Design and Profiling 75

3.5.7 Testing and Validating Detection Model 77

3.5.8 Flow-level Detection of Packet Content-based

Attacks

78

3.5.9 Detection of Packet Content-based Attacks

From Flow-level

79

3.5.10 General Framework and Settings 81

3.5.11 Dataset Creation and Labeling 82

3.5.12 Evaluation Efficiency Using Machine Learning

Algorithms

82


4 CREATION OF FLOW-LEVEL DATASET FOR

BRUTE-FORCE ATTACKS

84

4.1 Introduction 84

4.2 Creation of Flow-level Dataset 84

4.2.1 Dataset Prerequisites Preparation 85

4.2.1.1 Dataset Characteristics 86

4.2.1.2 Testbed Setup 86

4.2.1.3 Setting Attack Tools and

Attack Host

89

4.2.2 Getting Network Traces 93

4.2.2.1 Flow Exporting 93

xi

4.2.2.2 Flow Data Capturing and Decoding 94

4.2.3 Dataset Labeling 95

4.3 Dataset Statistics 96

4.3.1 Attack-free Dataset Statistics 96

4.3.2 Brute-force Attack-based Dataset Statistics 101

4.4 Dataset Assessment 104

4.4 Summary 107

5 FLOW-LEVEL DETECTION OF NETWORK BRUTE-

FORCE ATTACKS

108

5.1 Introduction 108

5.2 Offline Phase of the Flow-level Brute-force Attack

Detector

108

5.2.1 Time-based Aggregation of Traffic 109

5.2.2 Labeling TAT Records 111

5.2.3 Detection Model Design and Profiling 113

5.2.4 Selection of Significant TAT Attributes 114

5.2.5 TAT Statistics and Profiling 117

5.2.6 Threshold Values and Detection Rules 121

5.4 Validation of the Created Detection Model 127

5.5 Summary 130

6 FLOW-LEVEL DETECTION OF PACKET CONTENT-

BASED ATTACKS

131


6.2 General Framework and Settings 131

6.3 Alert Analysis 133

6.4 Dataset Creation and Labeling 136

6.5 Feature Selection 138

6.6 Results and Discusion 139

6.6.1 Full Set Attributes Classification Results 140

6.6.2 Minimum Effective Attributes 141

6.6.3 Most Significant Attributes 142


xii

7 CONCLUSION 144


6.2 Conclusion 144

6.3 Research Contributions 146

6.4 Suggestions for Future Work 147

REFERENCES 149

xiii

LIST OF TABLES

TABLE NO. TITLE PAGE

2.1 Summarization of works in flow-level network intrusion

detection.

31

2.2 A summary of NIDS dataset. 45

2.3 The four possible outcomes of a classifier. 50

3.1 Specifications of the brute-force attack dataset. 64

3.2 Attack types and attack generating tools. 67

3.3 Version 5 header format. 69

3.4 Version 5 flow record format. 69

3.5 Typical extracted attributes of time window NetFlow

records.

72

3.6 Single detection possibilities. 82

4.1 Characteristics of the brute-force attack dataset. 87

4.2 Capturing periods. 93

4.3 Final ten attributes selected for datasets. 95

4.4 Three digit labeling method of malicious flows. 96

4.5 Attack-free dataset overall and layer 4 statistics. 97

4.6 Brute-force attack-based dataset overall and layer 4

statistics.

102

4.7 Attacks types in the brute-force attack-based dataset. 102

4.8 Characteristics summary of available NIDS datasets and

our dataset.

105

4.9 Superiority of our dataset over other datasets. 106

5.1 Extracted features of time window NetFlow records. 111

5.2 Malicious TAT records statistics. 113

xiv

5.3 1-to-2p flows statistics in our dataset. 116

5.4 TAT features used to detect and identify brute-force

attacks.

117

5.5 Normal TAT records statistics. 118

5.6 Normal TAT records statistics. 121

5.7 Offline testing confusion matrix. 124

5.8 Accuracy results of brute force attack types. 125

5.9 Online testing confusion matrix. 126

5.10 Accuracy results of online testing. 126

5.11 Timeline performance of the online testing. 126

5.12 Information summary of DARPA99. 127

5.13 DARPA99 brute-force detection confusion matrix. 129

5.14 Detection results of DARPA99 brute-force. 129

6.1 First variant Snort alerts analysis. 133

6.2 Second variant Snort alerts analysis. 136

6.3 Final ten attributes selected for datasets. 139

6.4 Dataset derived attributes. 139

6.5 Minimum effective attributes. 142

xv

LIST OF FIGURES

FIGURE NO. TITLE PAGE

1. 1 Simple architecture of NIDS. 3

1. 2 NIDS classification aspects. 5

2. 1 Classification of network attacks. 13

2. 2 Evolution of flow data in network applications. 24

2. 3 Knowledge acquisition process in NIDS. 33

2. 4 Classifications of NIDS datasets. 34

2. 5 Confusion matrix for intrusion detection. 52

2. 6 3-array confusion matrix. 52

2. 7 Transformation from n-array into binary confusion matrix. 53

3. 1 Overall research methodology framework. 60

3. 2 General dataset creation framework. 62

3. 3 Network architecture and testbed for attacks. 66

3. 4 Design framework of the flow-level brute-force detector. 70

3. 5 Aggregation of traffic of different time windows. 73

3. 6 Time-based aggregation of traffic. 76

3. 7 Framework of validating detection model. 77

3. 8 Common attacks detectable by both packet-level and flow-

level inspection.

80

3. 9 General experiment framework. 81

4. 1 Dataset creation implementation flow. 85

4. 2 Network architecture. 88

4. 3 Classes of brute-force attacks in the dataset. 90

4. 4 (a) Part of a script file contains attack commands,

(b) Part of the log file created by the script file.

91

xvi

4. 5 (a) Flow distribution of the attack-free dataset over layer-4

protocol, (b) Byte distribution of the attack-free dataset over

Layer-4 protocols.

98

4. 6 (a) Attack-free dataset flow distribution over layer-5

protocols, (b) Attack-free dataset byte distribution over

significant layer-5 protocols.

99

4. 7 5-min average number of flows of day2 attack-free traffic. 100

4. 8 5-min average number of flows of the 7-day attack-free

traffic.

100

4. 9 (a) Brute-force dataset flow distribution over layer-4

protocols, (b) Brute-force dataset byte distribution over

layer-4 protocols.

101

4. 10 (a) Brute-force dataset flow distribution over layer-5

protocols, (b) Brute-force dataset byte distribution over

layer-5 protocols.

103

4. 11 Percentages of each attack category in brute-force attack

dataset.

104

5. 1 Designing and testing of the flow-level brute-force detector. 109

5. 2 Time-based aggregation of traffic. 110

5. 3 Flow chart of traffic aggregation of traffic. 112

5. 4 Flow chart traffic aggregation of traffic and TAT record

labeling.

114

5. 5 Distribution of malicious TAT records over attack classes. 115

5. 6 Distribution of malicious 60-sec TAT records over attack

classes.

115

5. 7 TAT attributes for normal and network scan TAT (average

values).

119

5. 8 TAT attributes for normal and host scan TAT (average

values).

120

5. 9 TAT attributes for normal and DoS flood TAT (average

values).

120

5. 10 Average values TAT attributes for normal and DDoS flood

TAT.

121

xvii

5. 11 Brute-force attack detecting rules. 122

5. 12 Pseudo code of the brute-force attack detection rules. 123

5. 13 Offline testing of brute-force attack detection model. 124

5. 14 Offline testing confusion matrix. 124

5. 15 Online testing process. 125

5. 16 Online testing confusion matrix. 126

5. 17 Validation of the detection method. 128

5. 18 Normal and malicious TAT records of DARPA99

validating dataset.

129

5. 19 DARPA99 brute-force detection confusion matrix. 129

6. 1 General experiment framework. 132

6. 2 (a) Alert distribution over Snort class type attacks for

dataset1,

(b) Alert distribution over layer 4 protocols for dataset1.

134

6. 3 (a) Alert distribution over Snort class type attacks for

dataset2,

(b) Alert distribution over layer-4 and layer-5 protocols for

dataset2.

135

6. 4 Flow distribution of captured traces over transport layer

protocols.

137

6. 5 Malicious flow distribution over layer-4 and layer-5 in

dataset2.

138

6. 6 Full set attribute ML classification results on dataset1. 140

6. 7 Full set attribute ML classification results on dataset2. 141

6. 8 Most three significant classification results on dataset2. 143

6. 9 Individual impact of each attribute on classification. 143

CHAPTER 1

INTRODUCTION

1.1 Overview and Motivation

The global growth of the Internet and networking has made securing

networks and information one of the most challenging tasks in the field of network

communication. Today, intrusion attacks are generating significant worldwide

epidemic to network security environment and bad impact involving financial loss.

Intruders have the capability to infect thousands of hosts and networks within few

minutes before human action takes place.

Studies on the field by Dell™ SonicWALL™ Threat Research Team

demonstrated that network attacks are increasing exponentially every year. For

example, over 20.1 million unique malware samples were collected in 2013,

compared to 16 million in 2012 (DellSonicWALL, 2013). In the year 2013 more than

1.06 trillion intrusion related incidents had been detected and prevented and more

than 1.78 billion malwares downloads had been blocked. According to the same

report, there were approximately 4,429 new vulnerabilities reported from Common

Vulnerabilities and Exposures (CVE) and 3,644 were related with network attacks in

2013 (DellSonicWALL, 2013). Even a small security breach on commercial dealings

such as online e-payment and money transactions may cause huge unrecoverable

losses to companies and individuals. Successful attacks to hijack network-based

sensitive services in hospitals or airports may lead to disastrous results.

https://cve.mitre.org/index.html



2

In response to those increasing cyber-attacks, the cost of securing networks

information is also increasing. According to the industry survey on cyber security

conducted by New York State Department of Financial Services (Andrew M.

Cuomo, 2014), 77% of institutions involved in the survey experienced an increase in

their total information security budget in the past three years. Almost no institutions

reported a reduction in spending in the past three years.

In order to cope with the enormous increasing security threads that face

network communication, various third party techniques or middle-boxes techniques

have evolved such as firewalls, Network Intrusion Detection Systems (NIDSs).

Network Intrusion Detection Systems (NIDS) proved to be an efficient technique that

can process large volume of networks traffic and detect intrusions in their early

stages in order to limit their catastrophic damages.

Intrusion can be defined as any activity that violates confidentiality,

authority, integrity or availability of information system (Lazarevic et al., 2005).

Although in the field of information security, intrusion and attack are used

interchangeably, there is a little difference between the two terms. An attack is an

intrusion attempt, and an intrusion results from an attack that has been (at least

partially) successful (Barrus and Rowe, 1998). Intrusion Detection is defined by The

National Institute of Standards and Technology (Bace and Mell, 2001) as “the

process of monitoring the events occurring in a computer or network and analyzing

them for signs of intrusions”. An Intrusion Detection System (IDS) can be defined as

a combination of software or/and hardware components that monitors computer and

network systems and raises an alarm when intrusion happens (Lazarevic et al., 2005).

There are two main categories of IDSs; host-based IDSs (HIDS) which are

designed for single host, and network-based IDSs (NIDS) which monitor the whole

traffic of a network. Generally, the concepts of IDS is more related to NIDS than

HIDS since the latter can be considered as an ordinary antivirus system limited to

one host whilst the prior is an entity that can monitor a whole enterprise network or

even an ISP network.

3

1.2 Background

1.2.1. A Simple NIDS Architecture

In general, the components which make up any NIDS can be viewed as

shown in Figure 1.1 (Lazarevic et al., 2005).

Data Acquisition Module – Monitored System

Feature Generation

Module

Incident Detection Module Response Module

Reference Data Module

Event

Raw Data

Actions

Alarms

Figure 1.1 Simple architecture of NIDS (Lazarevic et al., 2005)

The purpose of each module is as follows:

i. Data Acquisition Module: Responsible for collecting network traffic data.

ii. Feature Generation Module: Responsible for extracting a set of selected

traffic features (packet header features, payload features, flow features)

iii. Incident Detection Module: Responsible for identifying intrusion and

generating alarms by comparing the data generated from the feature

generation module with that of the reference module. Generally there are

two methods of this detection; misuse-based detection and anomaly-based

detection.

iv. Response Module: Once an alert is received, this module is responsible for

initiating actions in response to a possible intrusion.

4

v. Reference Data Module: This module contains the reference data that is to be

used by the incident detector to compare with. If the detection method is

misuse-based then this data would be based on the description of all known

intrusion (intrusion database), and if the detection method is anomaly-based

then it would be based on description of normal attack free network

operation (network profile).

1.2.2. IDS Taxonomy

Several taxonomies had been proposed for categorizing NIDS, but the most

common and acceptable one is that one used by Axelsson (2000) and Debar et al.

(1999). This classification categorizes IDSs according to the following aspects:

i. Information source: NIDS can be packet-level based or flow-level. Packet-

level NIDS can be further divided to packet header-based or payload-based.

ii. Analysis Strategy (Detection Method): This categorization distinguishes

NIDS according to the nature of reference data used for identifying

intrusions which is either misuse-based or anomaly-based.

iii. Architecture (Locus of Detection): NIDS can be centralized when it is placed

in single position or distributed when there are several points for

monitoring.

iv. Response to intrusion: NIDS can be either passive when it is just to raise an

alarm on detecting intrusion or active when a further action is to be done in

response to that intrusion.

v. Time Aspects: NIDS can work in real-time and detect intrusions while they

are taking place or by batch (non real-time).

5

Information Source

Analysis Strategy

Time Aspects

Architecture

Response

NIDS

Packet Level

Flow Level

Packet Header

Packet Payload

Anomaly Detection

Misuse Detection

Real-time Prediction

Off-line Prediction

Centralized

Distributed

Passive Response

Active Response

Figure 1.2 NIDS classification aspects

1.2.3. NIDS Required Characteristics

The required NIDS characteristics had been specified by many researchers

(Lazarevic et al., 2005, Catania and Garino, 2012). These characteristics are required

from NIDS to achieve security goals. We can summarize these characteristics as

follows:

i. High Detection Performance: Detection of all attacks without false alarms

ii. Low Processing time: Attack is to be detected as soon as possible

iii. Adaptability: NIDS should be able to readapt itself to deal with novel attacks

and changing environments

iv. Fault tolerance: robustness, resistance to attacks, quick recovery from

successful attacks

v. Minimum Resource consumption: storage resources and processing

capabilities needed.

6

1.3 Problem Statement

As stated in the previous section, NIDS have a wide range of approaches to

be categorized accordingly (Figure 1.2). However, there are two main different

approaches now for NIDS. The conventional and most common implemented

approach nowadays is the misuse-based NIDS with deep packet inspection (DPI),

which tries to detect attacks based on previously prepared database of all known

intrusion signatures. This approach proved to be efficient and accurate for detecting

known intrusions but cannot detect novel and new attacks or even new variants of

known attacks. Another drawback of misuse-based NIDS is that it uses DPI. This

approach of auditing data by inspection inspects every incoming packet in the traffic.

Hence, it cannot cope with high speed networks. Besides, it seems that researches on

misuse-based NIDS have reached a saturation point making it harder to make further

enhancements.

The other approach is the anomaly-based NIDS, which is based on detection

of deviations of a normal attack-free model of the protected network and flagging

these deviations as intrusions. This approach can detect known and novel attacks as

well. Although this latter approach seems promising and in spite of lot of researches

on it, it is still immature and suffers from serious problems that makes it impractical

for real life situations. The major problem of anomaly-based NIDS is that it suffers

from high false alarm rates. This problem arises from the difficulty of specifying

normal and abnormal thresholds for any network and which itself can be considered

as a second problem. A third problem of that approach is the limited capacity and

precision of identifying an attack when an alarm is raised. This problem is a

consequence of anomaly-based detection nature which reacts with any network event

as either normal or abnormal (anomaly) and nothing else. If we consider the fact that

there are lots of network anomalies that are neither attacks nor intrusions, then

definitely there would be a high rate of false alarms.

With respect to the information inspected (audit data), some NIDS inspect the

complete traffic (Deep Packet Inspection) for known attack signatures (byte patterns)

whilst others consider data from packets headers for attacks that violates network

7

protocols rules. Meanwhile, other NIDSs depend on extensive summaries and

aggregations of traffic flow (flow-level) for intrusion that attack network resources.

Each of these levels of audit data inspection concentrates on particular category of

intrusive attacks and, therefore, is strong on some parts but may be poor on other

parts. Since flow-level NIDS deals with aggregations and summarization of traffic, it

can inspect high speed traffic in a near real-time mode. However, it suffers from low

precision of specifying attacks in the traffic. This problem is a result of the absence

of the actual data in the detection and the use of metadata instead. The whole flow is

to be flagged when an attack is detected. The exact malicious packet(s) cannot be

specified nor the exact attack time nor any other further information that can help to

describe the attack.

Any NIDS needs to go through a knowledge acquisition process to build a

reference model in the system before the actual detection process can take place. To

build this reference model, the system must be supplied with previously known

categorized dataset so that the system would acquire the necessary information to

detect attacks and differentiate between malicious and non-malicious incidents in

network traffic. Unfortunately, creation of optimum NIDS dataset that would acquire

all required characteristics is very hard and costly. In addition, some of these

required characteristics seem to be contradicting with each other. Hence, combining

these characteristics in a dataset is almost impossible. There are a number of

challenges and difficulties in creating ideal datasets. For all that, researchers on

NIDS generally suffer from a lack of standard datasets for developing and training

new systems. The problem is even severe when looking for benchmarking common

public datasets for validating, evaluating and comparing between existing systems

and methods.

Most NIDS in the research community today have achieved a very high

degree of accuracy in detecting intrusions. However, the majority of these systems

face difficulties when they are to be deployed and implemented in real world. With

the growing connection speed by the establishment of end users broadband Internet,

the data volumes in the backbone networks has increased steadily. Monitoring of

links that may reach few gigabits per second capacities by NIDS may require more

8

computational resources than that available on commodity computer hardware. This

problem of resource demand is commonly solved by engineering dedicated hardware

platforms such as on Field-Programmable Gate Array (FPGA) that can process

detection tasks more efficiently. However, specially crafted hardware comes at a

significant higher cost and may not be worthwhile in every situation.

Our research problem is divided into two main parts. The first part aims to deal

with the problem of the lack in standard benchmarking datasets for developing and

training new systems or comparing existing ones. The second part tries to find software-

based solutions for flow-level NIDS to achieve accurate near real-time detection. As

a part from the second problem, the research tries to answer some questions

concerning the capability and efficiency of flow-level attributes to detect network

traffic attacks in spite of the absence of actual monitored data, and to find its

efficiency in terms of accuracy and timeline performance.

1.4 Research Objectives

The objectives of this research can be summarized as follows:

1. To create a labeled flow-level dataset for anomaly-based NIDS using a

general framework that would be available publicly.

A labeled flow-level NIDS dataset is a set of flow records with each flow

record labeled as either benign or malicious. A flow record is the set of

attributes that abstracts and summarizes all the information which concerns a

single session between two communicating entities in a network. The dataset

is to satisfy a set of conditions and guidelines in order to acquire validity for

anomaly detection.

2. To design an effective near real-time anomaly-based NIDS to detect

brute-force attacks based only on software solutions.

9

Effective here means the proposed solution will not degrade the detection

accuracy and will not increase the false alarm rate compared to existing

conventional DPI or relevant flow-based methods. The system is supposed to

detect network intrusion attacks when they are taking place not more than few

minutes after the attack has commence. Detection scope of the system consists

of all attacks that consume significant network bandwidth in any phase of the

attack. The detection system is a set of developed software that does not need

special hardware platform to run on and meant to be placed in the

ingress/egress point of Universiti Teknologi Malaysia (UTM) campus

network. Hence, it has to cope with high speed traffic of possibility to few

hundreds megabit per second.

3. To investigate the efficiency of flow-level traffic inspection to detect

packet content-based attacks.

The capability, accuracy and timeline performance of flow-level features to

detect attack contained in payload packets will be investigated and evaluated.

1.5 Research Scope

This research focuses on developing a system that can monitor the traffic of

server farm in UTM campus network in real-time and setting alarm when observing

intrusive activities. Hence, the research deals with traffic bandwidth not exceeding

one gigabit per second, as this is a little higher than the core bandwidth of UTM

campus backbone.

All proposed solutions are software-based and are to be run within general

purpose computer machines that are commercially available in the market. No

special hardware platform such as FPGA would be considered.

Although the study includes misuse detection approach, it will limit

inspection level to flow features and will not consider any of payload inspection or

10

pattern signature as information source for intrusion detection. Inspection of flow-

level data is limited to network and transport layers. Hence, attacks and techniques in

lower layers (physical layers) and upper layers (application layers) will not be

considered.

All proposed solutions and systems are passive with respect to the detected

intrusions. Their role is solely to flag an alarm when any intrusive activity is detected

with no further action to be performed.

The study is about Network IDS. Hence, all techniques reviewed, studied or

analyzed will depend only on network traffic data (packets and flow data). Host IDS

is out of scope of this study. Intrusions would be considered an attack when it uses

network resources to scan, propagate, or penetrate network information resources.

Intrusions affect information sources through other means (e.g. physical access) will

not be considered as network attack.

1.6 Significance of the Research

The most significant contribution of this study is the development of a

general framework to create NIDS datasets. That framework would be available

publicly for research community. This would help other researchers to develop their

own systems and methods and ease the comparison and validations for these new

systems and methods.

Another outcome of the study is a cost effective solution to the problem of

real-time traffic monitoring in any enterprise networks with up to one gigabit per

second bandwidth. It provides a user-friendly NIDS that make the security task easier

and provide an early alarm security system for the monitored network through

developing an efficient NIDS as a step to replace other costly commercial software

products that are currently being used.

11

1.7 Thesis Organization

The rest of this thesis is organized as follows. In Chapter 2, an overview on

the evolution of NIDS and its state-of-the-art that highlights gaps and defines future

trends is given. This is followed by an overview on network traffic generating,

capturing and processing tools and the significant theories and methods used in

NIDS in general. A deep flow-level NIDS literature review is done followed by

another review on NIDS datasets.

Chapter 3 shows the overall research methodology, explains and discusses

general dataset creation framework, the framework of flow-level detection of brute-

force attacks, explains the framework of flow-level detection of packet-level attacks.

It also discusses with details all experimental setups, dataset creation steps,

validation methods and rational constructions behind the selected methodology.

Chapter 4 presents the implementation details of creating brute-force attacked

dataset from captured traffic traces, statistics and various distributions of the created

dataset and discusses the corresponding results.

Chapter 5 presents the implementation details of designing flow-level NIDS

to detect brute-force attacks and all partial and final results concerning evaluation of

accuracy and performance and validation of the detection method and discusses

results.

Chapter 6 presents the implementation details of investigating the efficiency

of flow-level data to detect packet-level attacks and a discussion to justify the results.

Chapter 7 concludes all the study and states the degree of objective

fulfillment, highlights research contributions and makes suggestion for future work.

REFERENCES

Aldous, D. (1991). The Continuum Random Tree I. TAe o/ProMMzYy, 19(1),

1-28.

Allen, J., Christie, A., Fithen, W., McHugh, J., Pickel, J. and Stoner, E. (2000). <SY%?e

o / A Pr^c^/ce o / T^^rMs/o Deêc^/o^ TecAô/og/es. Technical Report

CMU/SEI-99-TR-028 ESC-99-028. Software Engineering Institute,

CarnegieMellon University, Pittsburgh, Pennsylvania.

Aslam, T. (1995). ^ Tbxoôwy o / êcMr/(y F^M/^s m Âe (T /% Oper^^/^g

^ysêw.Doctoral Dissertation.Department of Computer Sciences,Purdue

University, Lafayette, Indiana.

Attanasio, C. R., Markstein, P. W. and Phillips, R. J. (1976). Penetrating an

Operating System: AStudy of VM/370 Integrity. TFM ^ysêws JoMr^^/, 15,

102-116.

Axelsson, S. (2000a). The Base-rate Fallacy and the Difficulty of Intrusion

Detection. Tm%y%c%o%y oM T^/orw^^/oM ^^J ^ysêw êcMr/(y, 3, 186

205.

Axelsson, S. (2000b). T ^ rM so Deêc^/o^ ^ysêws/ ^ <SMrvey ^^ J

Thxoôwy.Technical Report 99-15, Department of Computer Engineering,

Chalmers University of Technology, Goteborg, Sweden.

Bace, R. and Mell, P. (2001). AZST <Spec/%/ PMN/c%?/oM oM T^rMs/o^ Deêc^/o^

^ysêws.Technical ReportSP 800-31, National Institute of Standards and

Technology, Gaithersburg, Maryland.

Barrus, J. and Rowe, N. C. (1998) A Distributed Autonomous-agent Network

Intrusion Detection and Response System. ProceeJmgs o / ?Ae Coww%^J %^J

Co^^ro/ êse%rcA ^^J TecAô/ogy ^yôs/Mw,Naval Postgraduate School

Monterey,USA.

https://en.wikipedia.org/wiki/Pennsylvania

https://en.wikipedia.org/wiki/Maryland

150

Bhuyan, M. H., Bhattacharyya, D. and Kalita, J. K. (2014). Network Anomaly

Detection: Methods, Systems and Tools. TEEE CowwM^/c^^/o^s -Surveys ^^J

TMôr/ /s, 16(1), 303-336.

Bodt, B. A. (2002). Computer Intrusion Detection and Network Monitoring.

EecAôwe^r/cs, 44(3), 294-295.

Breiman, L. (2001). Random Forests. M%cAme Ee^r^/^g, 45(1), 5-32.

CAIDA. (2015). Ce^êr /o r ^_pp//eJ T^êrê^ ^^^/ys/s, [Online]. Available:

http://www.caida.org/home/.

Catania, C. A. and Garino, C. G. (2012). Automatic Network Intrusion Detection:

Current Techniques and Open Issues. CowpMêrs %^J E/ec?r/c%/ E^gmeermg,

38(5), 1062-1072.

Cheng, CM ., Kung, H. R. and Tan, K.S. (2002). Use of Spectral Analysis in Defense

Against DoS Attacks. ProceeJmgs o / TEEE G/oA%/ Ee/ecowwM^/c^^/o^s

Co^/ere^ce, 3,2143-2148.

Cisco. (2013). C/sco NeE/ow [Online]. Available:

http://www.cisco.com/en/US/products/ps6601/products ios protocol group

home.html.

Cisco. (2015). <SWOPr [Online]. Available: https://www.snort.org/.

Cisco. (2015). NeE/ow Export D^^^gr^w Eorw^^ [Online]. Available:

http://www.cisco.com/c/en/us/td/docs/net mgmt/netflow collection engine/3

-6/user/guide/format.html.

Claise, B. (2008). <Spec//?c%?o o / Ae TP //ow T^/orw%?o Export ^TPETX) Proôco/

/o r A ExcA^^ge o / TP Trq///c E/ow T^/orw^^/o^. RFC 7011, IETF.Available:

https://tools.ietf.org/html/rfc7011.

Computer Knowledge. (2015). CowpMêr Xôw/eJge- F/rMs TMôr/^/ [Online].

Available: http://www.cknow.com/ cms/vtutor/cknow-virus-tutorial.html.

Copeland III, J. A. (2007). Flow-based Detection of Network Intrusions. U.S. Patent

No. 7,185,368. Washington, DC: U.S. Patent and Trademark Office.

Dagon, D., Gu, G., Lee, C. P. and Lee, W. (2007). A Taxonomy of Botnet Structures.

ProceeJmgs o / A ^MMM%/ CowpMêr <S*ecMr/(y ^_pp//c%Ao%y Co^/ere^ce, 2J,

325-339.

Debar, H., Dacier, M. and Wespi, A. (1999). Towards a Taxonomy of Intrusion-

Detection Systems. CowpMêr NeWorAs, 31, 805-822.

http://www.caida.org/home/

http://www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html

http://www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html

https://www.snort.org/

http://www.snort.org/

http://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html

http://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html

https://tools.ietf.org/html/rfc7011

http://www.cknow.com/cms/vtutor/cknow-virus-tutorial.html

151

DellSonicWALL (2013). Dell Network Security Threat Report 2013. US-TD593-

20140123. U.S.A:

Dell.Available:http://marketing.sonicwall.com/documents/dell-network-

security-threat-report-2013 -whitepaper-30197.pdf

Denning, D. E. (1987). An IntrusionDetection Model. TEEE Tm%y%c?/o%y oM

<Sb/?w%re FMg/Meer/Mg, 13(2), 222-232.

Diadem Firewall European Project. (2015). DrnJew F/rew%//[Online]. Available:

http://www.diadem-firewall org/index.php.

Dokas, P., Ertoz, L., Kumar, V., Lazarevic, A., Srivastava, J. and Tan, P.N. (2002).

Data Mining for Network Intrusion Detection. ProceeJ/Mgs o / NSF

^ForAsAop oM Nex? GeMer^?/oM D^?^ M/M/Mg, 21-30.

Dreger, H., Feldmann, A., Paxson, V. and Sommer, R. (2004). Operational

Experiences with High Volume Network Intrusion Detection. ProceeJ/Mgs o /

?Ae CoM/ereMce oM Co^M ?er ^MJ CowwMM/c%?/oMs ^ecMr/(y. ACM, 11, 2-11.

Erbacher, R. F. (2001). Visual Behavior Characterization for Intrusion Detection in

Large Scale Systems. ProceeJ/Mgs o / ?Ae 7M?erM%?/oM%/ CoM/ereMce oM

F/sM%//z%?/oM, Vw^g/Mg, ^MJ Tw^ge Process/Mg, 54-59.

Fide, S. and Jenks, S. (2006). ^ <SMrvey o / <SYr/Mg M3?cA/Mg ppro^cAes /M N^rJw^re.

Technical ReportTR SPDS 06-01, Department of Electrical Engineering and

Computer Science, University of California, Irvine, USA.

Fioreze, T., Oude Wolbers, M., van de Meent, R. and Pras, A. (2007). Finding

Elephant Flows for Optical Networks. ProceeJ/Mgs o / ?Ae 70?A TFVP/TEFF

7M?erM%?/oM%/ ^y^os/M w oM VM?egr^?eJ Ne?wor^ M%M%geweM?, 627-640.

Fisher, D. H., Pazzani, M. J. and Langley, P. (E ds).(2014). CoMcep? Forw%?/oM.*

XMow/eJge %MJ Exper/eMce /M ^MsMperv/seJ Le^rM/Mg.San Mateo: Morgan

Kaufmann.

Frank, J. (1994). Artificial Intelligence and Intrusion Detection: Current and Future

Directions. ProceeJ/Mgs o / ?Ae 77?A N3?/oM^/ Co^M ?er <S*ecMr/?y CoM/ereMce,

10, 1-12.

Galtsev, A. A. and Sukhov, A. M. (2011). Network Attack Detection at Flow Level.

ProceeJ/Mgs o / ?Ae 77?A VM?erM ?/oM / CoM/ereMce ^MJ 4?A TM?erM ?/oM /

CoM/ereMce oM ^w^r? <Sp%ces ^MJ Nex? GeMer^?/oM ^ /reJ /^ /re /ess

Ne?worA/Mg, 326-334.

http://marketing.sonicwall.com/documents/dell-network-

http://www.diadem-firewall.org/index.php

152

Gao, M., Zhang, K. and Lu, J. (2006). Efficient Packet Matching for Gigabit

Network Intrusion Detection Using TCAMs. ProceeJz^g^ o / ?Ae 20YA

V^Yer^^Yzo^ f Co^/ere^ce o / ^Jv^^ceJ V^/orw^Yzo NeYworAz^g ^^J

_ppfzc%?zo%s', 249-254.

Gao, Y., Li, Z. and Chen, Y. (2006). A DoSResilient Flow-level Intrusion Detection

Approach for High-speed Networks. ProceeJz^g^ o/2dYAVEEE V^Yer^^Yzo^ f

Co^/ere^ceo^ Dz Yrz MYeJ CowpMYz^g -Sêw.s', 39-39.

Garner, S. R. (1995). Weka: The Waikato Environment for Knowledge Analysis.

ProceeJz^g^ o / YAe New Ze^f^^J Co^MYer Scze^ce êê^rcA SYMJe Y

Co^/ere^ce, 57-64.

Gates, C., McNutt, J. J., Kadane, J. B. and Kellner, M. I. (2006). Scan Detection on

very Large Networks Using Logistic Regression Modeling. ProceeJz^g^ o /

YAe77YA TEEE Syô^zMw o^ Co^MYer^ ^^ J CowwM^z'c^^z'o^ , 402-408.

Ghorbani, A. A., Lu, W. and Tavallaee, M. (2009). NeYworA V YrM zo DeYecYzo ^ ^ J

Preve^Yzo^/ Co^cepY^ ^^J TecA^z^Me , New York:Springer.

Gil, T. M. and Poletto, M. (2001). MULTOPS: A DataStructure for Bandwidth

Attack Detection. ProceeJz^g^ o / 70YA L'SENVZ SecMrzYy Syô^z'Mw,23-38

GitHub. (2015). FoNeSz - YAe DDoS FoY eY Sz^Mf^Yor [Online]. Available:

https://github.com/markus-go/bonesi.

Gogoi, P., Bhuyan, M. H., Bhattacharyya, D. and Kalita, J. K. (2012). Packet and

Flow Based Network Intrusion Dataset. ProceeJm g o / YAe JYA V^Yer^^Yzo^ f

Co^/ere^ce o^ Co^Yeôr^ry Co^MYz^g, Springer, 322-334.

Andrew M. Cuomo, Lawsky,B. M. (2014). êporY o^ Cyêr SecMrzYy z YAe F^Âz^g

Sector, Department of Financial Services,New York

State,USA.Available:http://www.dfs.ny.gov/about/press2014/pr140505_cyber_sec

urity.pdf

Hansman, S. and Hunt, R. (2005). A Taxonomy of Network and Computer Attacks.

CowpMYers* %^J SecMrzYy, 24(1), 31-43.

Hofmeyr, S. A. and Forrest, S. (1999). VwwMôfogzc^f ^ o J e f o / Dz Yrz MYeJ

DeYecYzo ^^J z'Y ^ppfzc^Y o Yo Co^MYer SecMrzYy.

DoctoralDissertation,Department of Computer Science,University of New

Mexico, New Mexico.

https://github.com/markus-go/bonesi

http://www.dfs.ny.gov/about/press2014/pr140505_cyber_sec

https://en.wikipedia.org/wiki/Albuquerque,_New_Mexico

153

Hoque, N., Bhuyan, M. H., Baishya, R. C., Bhattacharyya, D. and Kalita, J. K.

(2014). Network Attacks: Taxonomy, Tools and Systems. JoMr^^f o/NeYworA

%^J CowpMYer ppfz'c^Yz'o^y, 40, 307-324.

Hu, W., Liao, Y. and Vemuri, V. R. (2003). Robust Anomaly Detection Using

Support Vector Machines. ProceeJmgy o / YAe V^Yer^^Yz'o^ f Co^/ere^ce o^

M^cAz'ê Le^r^z'^g, 282-289.

Igure, V. and Williams, R. (2008). Taxonomies of Attacks and Vulnerabilities in

Computer Systems. VEEECowwM^z'c^Yz'o^y SMrveyy %^J TMYorz'%fy, 10(1), 6

19.

Information Security Center of eXcellence. (2015). VSCZ D^Y^yeY [Online].

Available: http://www.iscx.ca/datasets.

irchelp.org. (2015). Tro/%^ Norye ^YY cAy [Online]. Available:

http://www.irchelp.ora/irchelp/security/trojan.html.

Jones, A. K. and Sielken, R. S. (2000). CowpMYer SyyYew V YrMyz'o DeYecYz'o / ^

SMrvey. Technical Report, Computer Science Department, University of

Virginia, Charlottesville.

Joshi, M. V., Agarwal, R. C. and Kumar, V. (2002). Predicting Rare Classes: Can

Boosting Make any Weak Learner Strong? ProceeJmgy o / YAe ezgAYA ^C M

V^Yer^^Yz'o^ f Co^/ere^ce o^ XôwfeJge Dzycovery ^^ J D%Y% Mz'^z'^g, 297

306.

Jung, J., Paxson, V., Berger, A. W. and Balakrishnan, H. (2004). Fast Portscan

Detection Using Sequential Hypothesis Testing. VEEE SywpoyzMw o^

SecMrzYy %^J Prz'v^cy, 211-225.

Kashyap, H. J. and Bhattacharyya, D. (2012). A DDoS Attack Detection Mechanism

Based on Protocol Specific Traffic Features. ProceeJmgy o / YAe Seco^J

V^Yer^^Yz'o^ f Co^/ere^ce o^ Co^MY^Yz'o^^f Scz'e^ce, E^gmeermg ^^J

V /orw^Yz'o TecAôfogy. ACM, 194-200.

Ke-xin, Y. and Jian-qi, Z. (2011). A Novel DoS Detection Mechanism. ProceeJmgy

o / YAe V^Yer^^Yz'o^ f Co^/ere^ce o^ MecA^Yro^z'c Scz'e^ce, EfecYrzc

E^gz'êerz'^g ^ ^ J Co^MYer, 296-298.

Kendall, K. (1999). ^ D^Y^^^ye O / Co^MYer ^YY cAy For TAe Ev%fM%Yo O /

V YrMyz'o DeYecYz'o SyyYewy.Master's Thesis, Department of Electrical

Engineering and Computer Science,Massachusetts Institute of Technology,

Cambridge.

http://www.iscx.ca/datasets

http://www.irchelp.org/irchelp/security/trojan.html

https://en.wikipedia.org/wiki/Charlottesville,_Virginia

154

Kim, M.-S., Kong, H.-J., Hong, S.-C., Chung, S.-H. and Hong, J. W. (2004). A

Flow-based Method for Abnormal Network Traffic Detection. ProceeJmgs

/ TEEE/EPNeworA Oper%?/o%y ^^J M^^^gewe^? ^yôs/M w, 599-612.

Kotsiantis, S., Zaharakis, I. and Pintelas, P. (2007). Supervised Machine Learning: A

Review of Classification Techniques. E ro^/ers m ^r/?c/% / Tê///ge^ce %^J

^pp//c^?/o^s, 160, 3-24.

Kotsokalis, C., Kalogeras, D. and Maglaris, B. (2001). Router-based Detection of

DoS and DDoS Attacks. E/gA?A ^rAsAop o / ?Ae H P Ope^F/ew Mwers/zy

^ssoc/^^/o^, Fer//^, Gerw^^y.

Kruegel, C. and Toth, T. (2000). ^ <SMrvey O^ T^^rMs/o Deêc^/o^ ^ysêws.

Technical Report TUV-1841-00-11, University of Technology,

Vienna,Austria.

Kumar, S. (1995). C/%ss'//?c%?/oM ^^ J Deêc^/o^ o / CowpM?er T^^rMs/o^s.Doctoral

Dissertation, Department of Computer Sciences,Purdue University, Lafayette,

Indiana.

Lai, H., Cai, S., Huang, H., Xie, J. and Li, H. (2004). A Parallel Intrusion Detection

System For High-speed Networks. ProceeJmgs o / ?Ae êco^J T^?er^^?/o^^/

Co^/ere^ce o^ ^_pp//eJ Crypôgr^pAy ^^ J Ne^worA ecMrzYy, 439-451.

Lakhina, A., Crovella, M. and Diot, C. (2004a). Characterization of Network-wide

Anomalies in Traffic Flows. ProceeJmgs o / ?Ae 4?A ^C M ^TGCOMM

co^/ere^ce o^ Têrê? Me^sMrewe^?, 201-206.

Lakhina, A., Crovella, M. and Diot, C. (2004b). Diagnosing Network-wide Traffic

Anomalies. ProceeJmgs o / ?Ae ^C M ^TGCOMM co^/ere^ce o^ Co^M êr

CowwM^/c^?/o^ Pev/ew, 219-230.

Lakhina, A., Crovella, M. and Diot, C. (2005). Mining anomalies Using Traffic

Feature Distributions. ProceeJmgs o / ?Ae ^C M ^TGCOMM co^/ere^ce o^

Co^M êr CowwM^/c^?/o^ Pev/ew, 217-228.

Lakhina, A., Papagiannaki, K., Crovella, M., Diot, C., Kolaczyk, E. D. and Taft, N.

(2004c). Structural Analysis of Network Traffic Flows.^CM

<S7gwe?r/cs,32(1), 61-72.

Larochelle, D. and Evans, D. (2001). Statically Detecting Likely Buffer Overflow

Vulnerabilities. ProceeJmgs o / ?Ae TO/AMENS" ecMrzYy ^yôs/M w, 32,

177-190.

155

Lau, F., Rubin, S. H., Smith, M. H. and Trajkovic, L. (2000). Distributed Denial of

Service Attacks. ProceeJ/Mgs o / ?Ae TEEE VM?erM ?/oM / CoM/ereMce oM

^ys?ews, M%M, ^MJ CyêrMe?/cs,3, 2275-2280.

Lawrence Berkeley National Laboratory and ICSI. (2005). ERNE/TCST EM?erpr/se

Tr^c/Mg Pro/ec? [Online]. Available: http://www.icir.org/enterprise-tracina/.

Lazarevic, A., Kumar, V. and Srivastava, J. (2005). Intrusion Detection: A Survey.

M%M%g/Mg Cyêr TAre%?s, 19-78.

Lee, W., Wang, C. and Dagon, D. (2007). Fo?Me? De?ec?/oM/ CoMM?er/Mg ?Ae Larges?

êcMr/?y TAre^?.New York: Springer.

Lee, W. and Xiang, D. (2001). Information-theoretic Measures for Anomaly

Detection. ProceeJ/Mgs o / ?Ae TEEE ^yôs/M w oM êcMr/?y ^MJ Prrv%cy,

130-143.

Li, B., Springer, J., Bebis, G. and Gunes, M. H. (2013). A Survey of Network Flow

Applications. JoMrM%/ o/*Ne?wor^ %MJ CowpM?er pp//c^?/oMs, 36, 567-581.

Li, Z., Gao, Y. and Chen, Y. (2005). Towards a High-speed Router-based

Anomaly/Intrusion Detection System.ProceeJ/Mgs o / ?Ae ^pec/^/ TM?eres?

GroMp oM D^?^ CowwMM/c^?/oM , 15-16.

Lincoln Laboratory, M. I. o. T. (2000). D ^ P ^ TM?rMs/oM De?ec?/oM

Ev^/M^?/oM[Online]. Available:

http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data

/2000data.html.

Lindqvist, U. and Jonsson, E. (1997). How to Systematically Classify Computer

Security Intrusions. ProceeJ/Mgs o / ?Ae TEEE ^yôs/M w oM êcMr/?y %MJ

Pr/'v^cy, 154-163.

Lippmann, R. P., Fried, D. J., Graf, I., Haines, J. W., Kendall, K. R., McClung, D.,

Weber, D., Webster, S. E., Wyschogrod, D. and Cunningham, R. K. (1998).

Evaluating Intrusion Detection Systems: The 1998 DARPA Offline Intrusion

Detection Evaluation. TAe TM?erM%?/oM%/ JoMrM%/ o / CowpM?er ^MJ

ye/ecowwMM/'c^?/'oMs Ne?worA/Mg,34, 579-595.

Long, N. and Thomas, R. (2001). Trends In Denial Of Service Attack Technology.

CE^T CoorJ/M%?/oM CeM?er. [Online]. Available:

http://resources.sei.cmu.edu/asset_files/WhitePaper/2001_019_001_52491.pdf

http://www.icir.org/enterprise-tracing/

http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data/2000data.html

http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data/2000data.html

http://resources.sei.cmu.edu/asset_files/WhitePaper/2001_019_001_52491.pdf

156

Lough, D. L. (2001). ^ Tkxo^owy o / C o^M /er //^cAs w//A ^pp//c^//o^s /o ^/re/ess

Ne/worAs.Doctoral Dissertation, Faculty of the Virginia Polytechnic Institute

and State University, Blacksburg, Virginia.

157

Makhtar, M., Neagu, D. C. and Ridley, M. J. (2011). Comparing Multi-class

Classifiers: on the Similarity of Confusion Matrices for Predictive

ToxicologyApplications.ProceeJmgy o / YAe VEEE 72YAV^Yer^^Yz'o^ f

Co^/ere^ce o^ V Yeffz'ge Y D%Y% E^gz'êerz'^g ^^J ^MYow^YeJ Le^r^z'^g, 252

261.

McHugh, J. (2000). Testing Intrusion Detection Systems: A Critique of the 1998 and

1999 DARPA Intrusion Detection System Evaluations as Performed by

Lincoln Laboratory. ^C M Yr% y%cYz'o y o^ V /orw^Yz'o %^J yyyYew SecMrz'Yy,

3, 262-294.

Mell, P., Hu, V., Lippmann, R., Haines, J. and Zissman, M. (2003). An Overview of

Issues in Testing Intrusion Detection Systems. Technical Report NIST

Interagency Reports 7007, Department of Commerce, National Institute of

Standards and Technology, USA.

Mirkovic, J., Prier, G. and Reiher, P. (2002). Attacking DDoS at the Source.

ProceeJwgy o / YAe 70YA VEEE V^Yer^^Yz'o^ f Co^/ere^ce o^ NeYworA

ProYocofy, 312-321.

Mirkovic, J. and Reiher, P. (2004). A taxonomy of DDoS Attack and DDoS Defense

Mechanisms. ^C M SVGCOMM Co^MYer CowwM^z'c^Yz'o êvz'ew, 34, 39-53.

Moore, D., Shannon, C., Brown, D. J., Voelker, G. M. and Savage, S. (2006).

Inferring Internet Denial ofService Activity. ^C M Tm^y^cYz'o^y o^ CowpMYer

SyyYewy, 24, 115-139.

Morin, B. and Me, L. (2007). Intrusion Detection and Virology: An Analysis of

Differences, Similarities and Complementariness. JoMr^^f m cowpMYer

vz'rofogy 3, 39-49.

Munz, G. and Carle, G. (2007). Real-time Analysis of Flow Data for Network Attack

Detection. ProceeJmgy o / YAe 70YA VETP/TEEE V^Yer^^Yz'o^ f SyôyzMw o^

V^Yegr^YeJ NeYworA M^^^gewe^Y, 100-108.

Munz, G., Weber, N. and Carle, G. (2007). Signature Detection in Sampled Packets.

^ForAsAop o^ Mo^z'Yormg ^YY cA DeYecYz'o ^^ J Mz'Yzg%Yz'o , Citeseer,

Toulouse, France.

Muraleedharan, N., Parmar, A. and Kumar, M. (2010). A Flow based Anomaly

Detection System Using Chi-square Technique. ProceeJmgy o / YAe2"J

V^Yer^^Yz'o^ f VEEECo^/ere^ce o^ ^Jv%^ce Co^MYz'^g, 285-289.

158

Nguyen, H. A., Tam Van Nguyen, T., Kim, D. I. and Choi, D. (2008). Network

Traffic Anomalies Detection and Identification with Flow Monitoring.

ProceeJmgs / /Ae J /ATEE/TETP TM/em%//oM%/ Co^/ere^ce o^ ^/re/ess ^^J

Op//c%/ CowwM^/c^//o^s Ne/worAs, 1-5.

NLS-KDD. (2015). TAe NSE-XDD D^/^ Se/ [Online]. Available:

http://iscx.cs.unb.ca/NSL-KDD/.

nmap.org. (2015). Nw%p SecMr//y Sc^^êr [Online]. Available: http://nmap.org/.

Park, J., Tyan, H.-R. and Kuo, C.-C. (2006). Internet Traffic Classification for

Scalable QoS Provision. ProceeJmgs / /Ae T^/er^^//o^^/ TEEE Co^/ere^ce

o^ MM///weJ/^ ^^ J Expo, 1221-1224.

Park, K. and Lee, H. (2001). On the Effectiveness of Route-based Packet Filtering

for Distributed DoS Attack Prevention in Power-law Internets. ^C M

STGCOMM CowpM/er CowwMM/c%/o Pev/ew,31(4), 15-26.

Paxson, V. (1999). Bro: A System for Detecting Network Intruders in Real-time.

CowpM/er Ne/worAs, 31, 2435-2463.

Portnoy, L., Eskin, E. and Stolfo, S. (2001). Intrusion Detection with Unlabeled Data

Using Clustering. ProceeJwgs / C M CSS ôrAsAop o^ D^/^ M/^/^g

^_pp//eJ /o SecMr//y, Citeseer.

Quinlan, J. R. (1993). C4. J/ progr^ws /o r w%cAme /e^r^/^g, San Francisco: Morgan

Kaufmann.

Quittek, J., Zseby, T., Claise, B. and Zander, S. (2004). Requirements for IP Flow

Information Export (IPFIX). RFC 3917, IETF. Available: https://www.rfc-

editor.org/rfc/rfc3917.txt

Reich, Y. and Fenves, S. J. (1991). The Formation and Use of Abstract Concepts in

Design. Co^cep/ /orw%//oM.' Aôw/eJge %^J exper/e^ce m M^sMperv/seJ

/e^r^/^g. San Mateo: Morgan Kaufmann.

Rhodes, B. C., Mahaffey, J. A. and Cannady, J. D. (2000). Multiple Self-organizing

Maps for Intrusion Detection. ProceeJmgs / /Ae 2JrJN2//o^%/ T^/orw%/o

Sys/ews SecMr//y Co^/ere^ce, 16-19.

Robertson, S., Siegel, E. V., Miller, M. and Stolfo, S. J. (2003). Surveillance

Detection In High Bandwidth Environments. ProceeJmgs o / /Ae D ^P P ^

T^/orw^//o^ SMrv/v^^////y Co^/ere^ce %^J Expos///o^, 130-138.

http://iscx.cs.unb.ca/NSL-KDD/

http://nmap.org/

https://www.rfc-

159

Roesch, M. (1999). Snort Lightweight Intrusion Detection for Networks.

ProceeJ/Mgs o / ?Ae TJ?A ^SENVZ coM/ereMce oM Sys?ew ^Jw/M/s?r^?/oM, 229

238.

Satten, C. (2008). Loss/ess G/g%M êwo?e P^cê? C^p?Mre ^/?A L/MMx [Online].

Available: http://staf^'.washinaton.edu/corey/gulp/.

Schaffrath, G. and Stiller, B. (2008). Conceptual Integration of Flow-based and

Packet-based Network Intrusion Detection. ês///eM? NeWorAs* %MJ Serv/ces.

Springer.ProceeJ/Mgs o / ?Ae2MJ TM?erM ?/oM / CoM/ereMce oM ^M?oMowoMs

VM/r s?rMc?Mre, M^M^geweM? ^MJ SecMr/?y, 190-194.

Shannon, C. E. (2001). A Mathematical Theory Of Communication. ^C M

STGMOFTLE M oMe Co^M?/Mg ^MJ CowwMM/c^?/oMs êv/ew, 5, 3-55.

Shiravi, A., Shiravi, H., Tavallaee, M. and Ghorbani, A. A. (2012). Toward

Developing a Systematic Approach to Generate Benchmark Datasets for

Intrusion Detection. CowpM?ers %MJ SecMr/?y, 31, 357-374.

Sinclair, C., Pierce, L. and Matzner, S. (1999). An Application of Machine Learning

to Network Intrusion Detection. ProceeJ/Mgs o / ?AeTJ?ATEEE ^MMM%/

CoM/ereMce oM CowpM?er SecMr/?y pp//c^?/oMs, 371-377.

Snort Incorporation. (2013). SNO^T ^sers M%MM%/ 2.P.J [Online]. Available:

http://manual.snort.org/.

Softflowd. (2013). So/?/7owJ [Online]. Available:

http://www.mindrot.org/projects/softflowd/.

Sommer, R. and Paxson, V. (2010). Outside the Closed World: On Using Machine

Learning for Network Intrusion Detection. ProceeJ/Mgs o / ?AeTEEE

Sywpos/Mw oM SecMr/?y ^MJ Pr/v^cy, 305-316.

sourceforge.net. (2014). NFD^M P [Online]. Available:

http://nfdump.sourceforge.net/.

sourceforge.net. (2015). N/SeM - Ne?/7ow SeMsor [Online]. Available:

http://nfsen.sourceforge.net/.

Sperotto, A., Sadre, R., van Vliet, F. and Pras, A. (2009). A Labeled Dataset for

Flow-based Intrusion Detection. ProceeJ/Mgs o / ?Ae 9?A TEEE TM?erM ?/oM /

^ForAsAop oM TP Oper%?/oMs ^MJ M^M^geweM?,39-50.

Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A. and Stiller, B. (2010).

An Overview of IP Flow-based Intrusion Detection. TEEE CowwMM/c^?/oMs

-Surveys %MJ TM?or/^/s, 12(3), 343-356.

http://staff.washington.edu/corey/gulp/

http://manual.snort.org/

http://www.mindrot.org/projects/softflowd/

http://nfdump.sourceforge.net/

http://nfsen.sourceforge.net/

160

Staniford, S., Hoagland, J. A. and McAlerney, J. M. (2002). Practical Automated

Detection of Stealthy Portscans. JoMm%/ / CowpM/er SecMr//y, 10, 105-136.

Stoecklin, M. P., Le Boudec, J.-Y. and Kind, A. (2008). A Two-layered Anomaly

Detection Technique Based on Multi-modal Flow Behavior Models.

ProceeJrngs / /Ae P/A T^/er^^//o^^/Co^/ere^ce o^ P^ss/ve %^J ^c//ve

Ne/worA Me^sMrewe^/, 212-221.

Stolfo, S. J., Fan, W., Lee, W., Prodromidis, A. and Chan, P. K. (2000). Cost-based

Modeling for Fraud and Intrusion Detection: Results from the JAM Project.

ProceeJrngs / /AeD^PP^ T^/orw^//o^ SMrv/v^^////y Co^/ere^ce ^^J

Expos///o^, 130-144.

Stone, R. CenterTrack: (2000). An IP Overlay Network for Tracking DoS Floods.

ProceeJrngs o/^SENTZ SecMr//y SyôszMw, 21, 114.

Strayer, W. T., Lapsely, D., Walsh, R. and Livadas, C. (2008). Botnet Detection

Based on Network Behaviorin Series: Advances in Information Security,

Springer, 1-24.

Tavallaee, M., Bagheri, E., Lu, W. and Ghorbani, A.-A. (2009). A Detailed Analysis

of the KDD CUP 99 Dataset. ProceeJmgs / /Ae Seco^J TEEE Sywpos/Mw

o^ CowpM/%//o^%/ T^/e/Z/ge^ce /o r SecMr//y ^^ J De/e^ce pp//c^//o^s, 53-58.

Tcpdump/Libpcap. (2015). ECPD^MP & ETFPC^P [Online]. Available:

http://www.tcpdump.org/.

The Shmoo Group, the DefCon. (2015). C^p/Mre /Ae C^p/Mre /Ae E/%g D^/^

[Online]. Available: http://cctfshmoo.com/.

University of California, (1999). XDD CMp 7PPP D%/% [Online]. Available:

http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

Wagner, A. and Plattner, B. (2005). Entropy Based Worm and Anomaly Detection in

fast IP Networks. ProceeJmgs o / /Ae74/ATEEE TM/em%//oM%/ ôrAsAops o^

E^^^//^g TecAô/og/es/ T/%s/rMc/Mre/or Co//^ôr^//ve E^/erpr/se, 172-177.

Wang, H., Zhang, D. and Shin, K. G. (2002a). Detecting SYN Flooding Attacks.

ProceeJrngs o / /Ae27s/TEEE ^MMM%/ Jo/^/ Co^/ere^ce o^ C o^M /er %^J

CowwM^/c^//o^s Soc/e//es, 1530-1539.

Wang, H., Zhang, D. and Shin, K. G. (2002b). SYN-dog: Sniffing SYNFlooding

Sources. ProceeJ/^gso/ /Ae 22"JTEEE TM/em%//oM%/ Co^/ere^ce o^

D/s/r/^M/eJ Co^M //^g Sys/ew, 421-428.

http://www.tcpdump.org/

http://cctf.shmoo.com/

http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

161

Wang, Y. and Global, I. (2009). SY%Yz'yYz'c%f TecA^Mey /o r NeYworA SecMrzYy. MoJer^

SY Yz'yYz'c ffy- ^yeJ V YrMyz'o DeYecYo ^^ J ProYecYz'o , Hershey: IGI Global.

Wireshark.org. (2015). ^z'reyA%rA [Online]. Available: https://www.wireshark.org/.

Witten, I. H., Frank, E., Trigg, L. E., Hall, M. A., Holmes, G. and Cunningham, S. J.

(1999). Weka: Practical Machine Learning Tools and Techniques with Java

Implementations.ProceeJmgyo/ YAeV Yer ^Yz'o f ôrAyAop o^ Ewergz'^g

XôwfeJge E ^gm eerm g^J Co^êcYz'o^^z'yY-^^yeJ V /orw^Yz'o SyyYews, 99,

192-196.

Yau, D. K., Lui, J., Liang, F. and Yam, Y. (2005). Defending Against Distributed

Denial of Service Attacks with Max-Min Fair Server-centric Router

Throttles. VEEE/^CM Tr^^y^cYz'o^y o^ NeYworAz^g, 13, 29-42.

Zhang, Z., Li, J., Manikopoulos, C., Jorgenson, J. and Ucles, J. (2001). HIDE: a

Hierarchical Network Intrusion Detection System using Statistical

Preprocessing and Neural Network Classification. ProceeJmgyo/ YAe VEEE

ôrAyAop o^ V /orw^Yz'o ^yyMm^ce %^J SecMrzYy, 85-90.

Zhao, Q., Xu, J. and Kumar, A. (2006). Detection of Super Sources and Destinations

in High-speed Networks: Algorithms, Analysis and Evaluation. VEEE JoMr^^f

o^ SefecYeJ re%y m CowwM z'c%Yz'o y, 24, 1840-1852.

https://www.wireshark.org/

http://www.wireshark.org/

dataset generation and network intrusion detection...

Documents