Implementation of Machine Learning and Chaos Combination for Improving Attack Detection Accuracy on Intrusion Detection System (IDS)

Bisyron Wahyudi, Kalamullah Ramli
Department of Electrical Engineering, Universitas Indonesia

Post on 03-Jan-2016

Network Security

Background: IDS

The most important element in network security: IDS

Intrusion detection principles:
- Misuse detection (signature-based)
- Anomaly detection (statistics)
- Classification with machine learning (research)

Background: Problem

- Intrusion detection produces too many false alarms.
- New types of attack arise more and more often.
- An effective and adaptable detection method is required.
- Classification with machine learning gives the best results, but these depend on the kernel function and its parameters, and on the network data attributes/features.
- There is no systematic theory on how to choose the appropriate kernel and parameters.

Data Mining Approach for IDS

1. Capturing packets transferred on the network.
2. Extracting an extensive set of attributes/features of the network packet data that can describe a network connection or a host session.
3. Learning a model that can accurately describe the behavior of abnormal and normal activities by applying data mining techniques.
4. Detecting the intrusions by using the learned models.

Data Mining Approach

Classification (Supervised)     Clustering (Unsupervised)
K Nearest Neighbor (K-NN)       K-Means
Naïve Bayes                     Hierarchical Clustering
Artificial Neural Network       DBSCAN
Support Vector Machine          Fuzzy C-Means
Fuzzy K-NN                      Self Organizing Map

Machine Learning

[Diagram: Input Training Data (x, y); Model Development; Learning Algorithm; Model Implementation; Input Test Data (x, ?); Output Test Data (x, y)]

SVM Classification

Kernel Function

Kernel name                     Definition
Linear                          $K(x, y) = x \cdot y$
Polynomial                      $K(x, y) = (x \cdot y + c)^d$
Gaussian RBF                    $K(x, y) = \exp\left(-\|x - y\|^2 / (2\sigma^2)\right)$
Sigmoid (hyperbolic tangent)    $K(x, y) = \tanh(\sigma (x \cdot y) + c)$
Inverse multiquadric            $K(x, y) = 1 / \sqrt{\|x - y\|^2 + c}$

Here $x$ and $y$ are a pair of data points from the training dataset, and $\sigma, c, d > 0$ are constant parameters.
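The kernel definitions above can be written out directly. The following is a minimal pure-Python sketch for illustration only: the function names and the default parameter values are assumptions, not part of the slides, and the inverse multiquadric is read with `c` inside the square root.

```python
import math

def dot(x, y):
    """Inner product x . y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def squared_distance(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, c=1.0, d=2):
    return (dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    return math.exp(-squared_distance(x, y) / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, sigma=1.0, c=1.0):
    return math.tanh(sigma * dot(x, y) + c)

def inverse_multiquadric_kernel(x, y, c=1.0):
    # Assumption: c sits under the square root, as reconstructed from the slide.
    return 1.0 / math.sqrt(squared_distance(x, y) + c)

# Two illustrative data points (not taken from any dataset).
x = [1.0, 2.0]
y = [2.0, 0.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel,
               sigmoid_kernel, inverse_multiquadric_kernel):
    print(kernel.__name__, kernel(x, y))
```

Note that the RBF and inverse multiquadric kernels depend only on the distance between the two points, while the linear, polynomial, and sigmoid kernels depend on their inner product.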

SVM Performance

- How to choose the optimal/significant input dataset features.
- How to set the best kernel function and parameters: σ, ε and C.

Chaos

- Three important dynamic properties: intrinsic stochasticity, ergodicity, and regularity.
- Advantage of chaos: it can escape from local minima.
- Optimization parameters can be obtained more efficiently by means of its powerful global search ability.
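The global-search idea can be illustrated with a small sketch. The slides do not say which chaotic map is used, so the logistic map with r = 4 below is an assumption, as are the function names, the starting value, and the toy objective. In the setting of this deck the objective would presumably be the classification error of the SVM as a function of parameters such as σ and C.

```python
import math

def logistic_map(x):
    """One step of the logistic map with r = 4, which is chaotic on (0, 1)."""
    return 4.0 * x * (1.0 - x)

def chaos_search(objective, lower, upper, x0=0.345, iterations=2000):
    """Minimize `objective` on [lower, upper] by sampling along a chaotic orbit."""
    x = x0
    best_point, best_value = None, float("inf")
    for _ in range(iterations):
        x = logistic_map(x)                        # next chaotic variable in [0, 1]
        candidate = lower + (upper - lower) * x    # map it onto the search interval
        value = objective(candidate)
        if value < best_value:                     # keep the best point seen so far
            best_point, best_value = candidate, value
    return best_point, best_value

def rastrigin_1d(p):
    """Toy objective with many local minima; its global minimum is 0 at p = 0."""
    return p * p + 10.0 * (1.0 - math.cos(2.0 * math.pi * p))

best_point, best_value = chaos_search(rastrigin_1d, -5.0, 5.0)
print(f"best point {best_point:.4f}, objective {best_value:.4f}")
```

Because the orbit keeps visiting the whole interval rather than following a gradient, the search is not trapped by the local minima near the integer points of this toy objective. A practical scheme would usually follow this coarse search with a finer search around the best point.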

System Design

Methodology

[Diagram: KDDCUP ’99 DARPA Dataset; Data Collection; Data Preprocessing; Training Dataset; Test Dataset; Model Development; Data Classification; Predicted Intrusion Data]

Data Preprocessing

[Diagram: KDDCUP ’99 DARPA Dataset; Dataset Transformation; Dataset Normalization; Range Discretization; Format Conversion; Dataset Division: Training & Test; Training Dataset; Test Dataset]
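The preprocessing steps named on this slide can be sketched as follows. The slides do not give the exact methods, so min-max scaling, equal-width bins, and a simple ordered split are all assumptions made for illustration, and the sample values are made up rather than taken from the dataset.

```python
def min_max_normalize(column):
    """Scale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def discretize(column, bins=4):
    """Map values already scaled to [0, 1] onto equal-width bin indices 0..bins-1."""
    return [min(int(v * bins), bins - 1) for v in column]

def split_dataset(rows, train_fraction=0.8):
    """Divide rows into a training part and a test part, keeping their order."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Illustrative src_bytes values (hypothetical, not real dataset rows).
src_bytes = [0, 181, 239, 1032, 5450]
scaled = min_max_normalize(src_bytes)
binned = discretize(scaled)
train, test = split_dataset(list(zip(scaled, binned)))
print(binned, len(train), len(test))
```

A real split would normally shuffle or stratify the records first so that every attack class appears in both parts.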

Model Development

[Diagram: Input Training Data (x, y); Kernel Function Selection; Parameter Selection with Chaos Optimization; Learning Algorithm (SVM); Model Implementation; Input Test Data (x, ?); Output Test Data (x, y)]

Features in KDDCUP ’99

- Features 1-9: intrinsic features extracted from the packet header
- Features 10-22: content attributes derived from expert knowledge of the packet
- Features 23-31: time traffic attributes computed from the connections of the previous 2 seconds
- Features 32-41: machine traffic attributes computed from the previous 100 connections
- Payload feature: payload based on time (week)

Intrinsic Attributes

These attributes are extracted from the header area of the network packets.

Content Attributes

These attributes are extracted from the content area of the network packets, based on expert knowledge.

Time Traffic Attributes

To calculate these attributes we considered the connections that occurred in the past 2 seconds

Machine Traffic Attributes

To calculate these attributes we took into account the previous 100 connections

Network Traffic Classification

Selected Features

Previous work by Mukkamala used eight features: src_bytes, dst_bytes, count, srv_count, dst_host_count, dst_host_srv_count, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate.
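Selecting such a feature subset from a 41-field record can be sketched as below. The column order is the one given in the standard KDD Cup '99 feature name list; the helper names and the sample record are hypothetical and used only for illustration.

```python
# The 41 KDD Cup '99 feature names in their standard column order.
KDD_FEATURES = [
    "duration", "protocol_type", "service", "flag", "src_bytes", "dst_bytes",
    "land", "wrong_fragment", "urgent",
    "hot", "num_failed_logins", "logged_in", "num_compromised", "root_shell",
    "su_attempted", "num_root", "num_file_creations", "num_shells",
    "num_access_files", "num_outbound_cmds", "is_host_login", "is_guest_login",
    "count", "srv_count", "serror_rate", "srv_serror_rate", "rerror_rate",
    "srv_rerror_rate", "same_srv_rate", "diff_srv_rate", "srv_diff_host_rate",
    "dst_host_count", "dst_host_srv_count", "dst_host_same_srv_rate",
    "dst_host_diff_srv_rate", "dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate", "dst_host_serror_rate",
    "dst_host_srv_serror_rate", "dst_host_rerror_rate",
    "dst_host_srv_rerror_rate",
]

# The eight-feature subset listed on the slide.
EIGHT_FEATURES = [
    "src_bytes", "dst_bytes", "count", "srv_count", "dst_host_count",
    "dst_host_srv_count", "dst_host_same_src_port_rate",
    "dst_host_srv_diff_host_rate",
]

def select_features(record, names):
    """Return the values of the named features from one 41-field record."""
    index = {name: i for i, name in enumerate(KDD_FEATURES)}
    return [record[index[name]] for name in names]

def feature_group(name):
    """Attribute group of a feature, following the 1-9 / 10-22 / 23-31 / 32-41 split."""
    position = KDD_FEATURES.index(name) + 1
    if position <= 9:
        return "intrinsic"
    if position <= 22:
        return "content"
    if position <= 31:
        return "time traffic"
    return "machine traffic"

# Synthetic record whose values are just the column indices 0..40.
record = list(range(41))
print(select_features(record, EIGHT_FEATURES))
print({name: feature_group(name) for name in EIGHT_FEATURES})
```

Grouping the subset this way shows that it draws only on intrinsic, time traffic, and machine traffic attributes and uses no content attributes.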

Selected Features

Previous work by Natesan used 24 features: duration, protocol_type, service, flag, src_bytes, dst_bytes, hot, num_failed_logins, logged_in, num_compromised, root_shell, num_root, num_file_creations, num_shells, num_access_files, is_host_login, is_guest_login, count, serror_rate, rerror_rate, diff_srv_rate, dst_host_count, dst_host_diff_srv_rate, dst_host_srv_serror_rate.

Proposed Features

Proposed Features

Data Preprocessing

Simulation Experiment

Simulation Process Design

Experiment Result

- Using payload can improve the accuracy of the IDS in detecting R2L attacks.
- Using SVM with an RBF kernel, the detection accuracy reaches up to 98.2%.
- Based on the experiments, the average detection rate over all feature sets is best with 28 features plus payload:

Feature Set    Without Payload    With Payload
8 features     95.78%             95.99%
24 features    95.73%             95.91%
28 features    95.91%             96.08%
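The effect of adding payload can be read off the table with a little arithmetic. The sketch below only recomputes differences from the reported percentages; the variable and function names are illustrative.

```python
# Reported average detection rates in percent: (without payload, with payload).
results = {
    "8 features": (95.78, 95.99),
    "24 features": (95.73, 95.91),
    "28 features": (95.91, 96.08),
}

def payload_gain(table):
    """Gain in percentage points from adding payload, per feature set."""
    return {name: round(with_payload - without_payload, 2)
            for name, (without_payload, with_payload) in table.items()}

gains = payload_gain(results)
best_set = max(results, key=lambda name: results[name][1])
print(gains)
print("best with payload:", best_set)
```

The gain from payload is between 0.17 and 0.21 percentage points for every feature set, and the 28-feature set gives the highest rate both with and without payload.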