Hybrid Intelligent Systems for Network Security
Lane Thames
Georgia Institute of Technology
Savannah, GA
[email protected] (gtsav.gatech.edu)
Presentation Overview
  Discuss Network Security Issues
  Discuss the goals of this paper's project
  Overview of Self-Organizing Maps
  Overview of Bayesian Learning Networks
  Describe the details of the Hybrid System
  Review the Experimental Results
  Discuss Future Work and Conclusions
  Q&A
Network Security Motivation
  Internet Growth is Steadily Increasing
  Over 1 Billion Internet Users
  Many different types of applications are now using the Internet as a communication channel
  Data Source: www.idc.com
Network Security Motivation
  No more “Script Kiddies”
  Hacking is now more than just a hobby
  Hackers have created their own revenue generating channels
  Common hacking “commodities”
    Hacking software that is for sale
    Corporate Extortion
    Corporate Espionage
    Identity Theft
Network Security Motivation
  Classical Attack Types
    Buffer Overflow
    Denial of Service (DoS)
    Distributed Denial of Service (DDoS)
    Reconnaissance
    Virus
    Worms
    Trojan Horse
Network Security Motivation
  Hackers are using more sophisticated mechanisms
  Phishing: Less Sophisticated
    Easy to fool a novice user
  Pharming: More Sophisticated
    Easy to fool novice and expert users
  DoS and DDoS: Used for extortion
  Remote Root Access: Used for espionage and identity theft
Network Security Motivation
  The numbers do not lie
  Hackers are constantly looking for ways to
    Cause mischief
    Steal your data
    Handicap your machines
    Take your money, etc.
  Data Source: http://www.cert.org/stats/cert_stats.html
Network Security Motivation
  The Bottom Line: Network Security Research and Commerce are here to stay!
Project Goals
  Develop an Intelligent System that works reliably with data that can be collected purely within a Network
  Why? If security mechanisms are difficult to use, people will not use them.
  Using data from the network takes the burden off the end user
Hybrid Intelligent Systems
  A system was developed that made use of two types of Intelligence Algorithms:
    Self-Organizing Maps
    Bayesian Learning Networks
Training and Testing Data Set
  KDD-CUP 99 Data Set
  The data set used for the Third International Knowledge Discovery and Data Mining Tools Competition
Training and Testing Data Set
  41 Total Features Categorized as:
    Basic TCP/IP Features
    Content Features
    Time Based Traffic Features
    Host Based Traffic Features
Training and Testing Data Set
  Attack Type Categories
    Remote to Local Exploits
    User to Root Exploits
    Denial of Service
    Probing (Reconnaissance)
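The four categories above can be attached to raw records with a small lookup table. The sketch below assumes the standard label names from the KDD-CUP 99 training file; the slides do not list individual labels, and the names `ATTACK_CATEGORY` and `categorize` are hypothetical.

```python
# Hypothetical lookup from raw KDD-CUP 99 training labels to the four
# attack categories named on the slide (plus "normal" traffic).
ATTACK_CATEGORY = {
    "normal": "normal",
    # Denial of Service
    "back": "dos", "land": "dos", "neptune": "dos",
    "pod": "dos", "smurf": "dos", "teardrop": "dos",
    # Probing (Reconnaissance)
    "ipsweep": "probe", "nmap": "probe", "portsweep": "probe", "satan": "probe",
    # Remote to Local Exploits
    "ftp_write": "r2l", "guess_passwd": "r2l", "imap": "r2l", "multihop": "r2l",
    "phf": "r2l", "spy": "r2l", "warezclient": "r2l", "warezmaster": "r2l",
    # User to Root Exploits
    "buffer_overflow": "u2r", "loadmodule": "u2r", "perl": "u2r", "rootkit": "u2r",
}


def categorize(label: str) -> str:
    """Map a raw label such as 'smurf.' to its category, or 'unknown'."""
    key = label.strip().rstrip(".").lower()
    return ATTACK_CATEGORY.get(key, "unknown")
```

Labels in the raw file end with a period, which is why `categorize` strips it before the lookup.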
Self-Organizing Maps (SOM)
  Pioneered by Dr. Teuvo Kohonen
  An algorithm that maps high-dimensional input data onto the elements of a low-dimensional array of nodes
  A fixed-size grid of nodes, sometimes called neurons to reflect the similarity to neural nets
Self-Organizing Maps
  Input Data Vectors
\[ X = [x_1, \ldots, x_r] \]
Self-Organizing Maps
  Let a real parametric vector be associated with each element, i, of the SOM grid
\[ M_i = [m_{i1}, \ldots, m_{ik}] \]
Self-Organizing Maps
  Furthermore,
\[ X,\ M_i \in \mathbb{R}^n \]
Self-Organizing Maps
  A decoder function is defined on the basis of distance between the input vector and the parametric vector.
  The decoder function is used to map the image of the input vector onto the SOM grid. The decoder function is usually chosen to be either the Manhattan or Euclidean distance metric.
\[ d(x, M_i) \]
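The two distance metrics named above can be sketched as follows. This is a minimal illustration, assuming the input vector and the parametric vector are plain sequences of floats of equal length; the function names are hypothetical.

```python
import math
from typing import Sequence


def manhattan(x: Sequence[float], m: Sequence[float]) -> float:
    """Manhattan (L1) distance: sum of absolute component differences."""
    if len(x) != len(m):
        raise ValueError("vectors must have the same dimension")
    return sum(abs(a - b) for a, b in zip(x, m))


def euclidean(x: Sequence[float], m: Sequence[float]) -> float:
    """Euclidean (L2) distance: square root of summed squared differences."""
    if len(x) != len(m):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))
```

For the vectors `[0, 0]` and `[3, 4]`, the Manhattan distance is 7 and the Euclidean distance is 5.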
Self-Organizing Maps
  A Best Matching Unit, denoted as the index c, is chosen as the node on the SOM grid that is closest to the input vector
\[ c = \arg\min_i \{ d(x, M_i) \} \]
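The Best Matching Unit selection can be sketched as an argmin over the grid nodes. The sketch assumes the grid is stored as a flat list of parametric vectors and uses the Euclidean metric; `best_matching_unit` is a hypothetical name.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], m: Sequence[float]) -> float:
    """Euclidean distance between an input vector and a parametric vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))


def best_matching_unit(x: Sequence[float],
                       nodes: Sequence[Sequence[float]]) -> int:
    """Return the index c of the node whose vector is closest to x."""
    return min(range(len(nodes)), key=lambda i: euclidean(x, nodes[i]))
```

With ties, `min` returns the lowest index, which is one of several reasonable tie-breaking choices.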
Self-Organizing Maps
  The dynamics of the SOM algorithm demand that the M_i be shifted towards the order of X such that a set of values {M_i} is obtained as the limit of convergence of the following:
\[ m_i(t+1) = m_i(t) + \alpha(t)\,[x(t) - m_i(t)]\,H_{ic} \]
  Here x(t) and m_i(t) are the input vector and the parametric vector of node i at step t, \alpha(t) is the learning rate, and H_{ic} is the neighborhood function between node i and the Best Matching Unit c.
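One step of the update rule can be sketched as below. The slides do not specify the neighborhood function, so a Gaussian over the index distance on a one-dimensional grid is assumed here; `neighborhood`, `som_step`, and the `sigma` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def neighborhood(i: int, c: int, sigma: float) -> float:
    """Assumed Gaussian neighborhood over grid indices (1-D grid)."""
    return math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2))


def som_step(nodes: List[List[float]], x: Sequence[float],
             alpha: float, sigma: float) -> int:
    """Shift every node towards x in place and return the BMU index c."""
    # Best Matching Unit: node with the smallest squared Euclidean distance.
    c = min(range(len(nodes)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, nodes[i])))
    for i, m in enumerate(nodes):
        h = neighborhood(i, c, sigma)
        # m_i(t+1) = m_i(t) + alpha * (x - m_i(t)) * H_ic
        nodes[i] = [mj + alpha * (xj - mj) * h for mj, xj in zip(m, x)]
    return c
```

The Best Matching Unit moves the full fraction `alpha` of the way towards the input, while nodes farther away on the grid move less.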
SOM Demo
  The next few plots will demonstrate how the parametric vector will converge to the input data vector
  Demonstrate the effects of parameters on one another
  Display the error function for this demo
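The convergence behavior can be reproduced numerically in a simplified setting. The sketch assumes a single node, a single fixed input vector, a constant learning rate, and a neighborhood value of 1; these are illustrative assumptions and not the settings used for the demo plots.

```python
import math
from typing import List, Sequence, Tuple


def converge(m: Sequence[float], x: Sequence[float],
             alpha: float, steps: int) -> Tuple[List[float], List[float]]:
    """Repeat m <- m + alpha * (x - m), recording the distance to x each step."""
    m = list(m)
    errors = []
    for _ in range(steps):
        m = [mj + alpha * (xj - mj) for mj, xj in zip(m, x)]
        errors.append(math.sqrt(sum((xj - mj) ** 2 for mj, xj in zip(m, x))))
    return m, errors


# Hypothetical starting point and input vector.
m_final, errors = converge([0.0, 0.0], [3.0, 4.0], alpha=0.5, steps=10)
```

Under these assumptions the error shrinks by the factor `1 - alpha` on every step, so a larger learning rate converges faster for a fixed input.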
Bayesian Learning Networks (BLN)
  A BLN is a probabilistic model built on the concept of the Directed Acyclic Graph (DAG)
  The DAG is a graph of nodes where each node is a random variable of interest
  The directed edges of the graph represent relationships among the variables
  If an arc is emitted from a node h to a node D, we say that h is the parent of D
Bayesian Learning Networks
  The Fundamental Equation: Bayes' Theorem
\[ P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)} \]
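A short worked example of Bayes' theorem for a binary hypothesis follows. The probabilities are made-up illustrative values, not figures from the presentation, and `posterior` is a hypothetical helper; P(D) is expanded with the law of total probability.

```python
def posterior(prior_h: float, p_d_given_h: float,
              p_d_given_not_h: float) -> float:
    """Return P(h | D) = P(D | h) P(h) / P(D) for a binary hypothesis h."""
    # P(D) = P(D | h) P(h) + P(D | not h) P(not h)
    evidence = p_d_given_h * prior_h + p_d_given_not_h * (1.0 - prior_h)
    return p_d_given_h * prior_h / evidence


# Illustrative values: h = "the connection is an attack", D = "an alert fired".
p = posterior(prior_h=0.01, p_d_given_h=0.9, p_d_given_not_h=0.05)
```

With these values the posterior is 2/13, roughly 0.154: even a fairly reliable alert leaves the attack hypothesis unlikely when the prior is small.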
Bayesian Learning Networks
  In Bayesian learning, we calculate the probability of a hypothesis and make predictions on that basis
  Predictions or classifications are reduced to probabilistic inference
Bayesian Learning Networks
  With BLN, we have conditional probabilities for each node given its parents
  The graph shows causal connections, not the flow of information through the graph
  Prediction versus abduction
  [Figure: example network with nodes x1, x2, x3, x4, x5]
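Because each node carries a conditional probability given its parents, the joint distribution of a network factorizes over the nodes. The general form is written out below; the notation Pa(x_i) for the parent set of node x_i is an assumption, and the specific edges of the example figure are not recoverable from the text.

```latex
\[
P(x_1, \ldots, x_n) \;=\; \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{Pa}(x_i)\bigr)
\]
```

A node with no parents contributes its unconditional probability P(x_i) to the product.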
Naïve Bayesian Learning Network
  The Naïve BLN is a special case of the general BLN
  It contains one root (parent) node, which is called the class variable, C
  The leaf nodes are the attribute variables (X_1 … X_i)
  It is Naïve because it assumes the attributes are conditionally independent given the class.
  [Figure: class node C with leaf nodes x1, x2, …, xi]
The Naïve BLN Classifier
  Once the network is trained, it can be used to classify new examples where the attributes are given and the class variable is unobserved (abduction)
  The Goal: Find the most probable class value given a set of attribute instantiations (X_1 … X_i)
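Naïve BLN Classifier
\[
\begin{aligned}
C_{NB} &= \arg\max_{c_j \in C} P(c_j \mid X_1, \ldots, X_i) \\
       &= \arg\max_{c_j \in C} \frac{P(X_1, \ldots, X_i \mid c_j)\, P(c_j)}{P(X_1, \ldots, X_i)} \\
       &= \arg\max_{c_j \in C} P(X_1, \ldots, X_i \mid c_j)\, P(c_j) \\
P(X_1, \ldots, X_i \mid c_j) &= \prod_{k=1}^{i} P(X_k \mid c_j) \\
C_{NB} &= \arg\max_{c_j \in C} P(c_j) \prod_{k=1}^{i} P(X_k \mid c_j)
\end{aligned}
\]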
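The classification rule can be sketched for categorical attributes as below. This is a minimal illustration, not the implementation used in the project: it assumes add-one (Laplace) smoothing and log-probabilities, neither of which is stated in the slides, and the toy records are invented.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    seen_values = defaultdict(set)       # attribute index -> distinct values
    for row, c in zip(rows, labels):
        for k, v in enumerate(row):
            value_counts[(k, c)][v] += 1
            seen_values[k].add(v)
    return class_counts, value_counts, seen_values


def classify(row, model):
    """Return argmax over classes of log P(c) + sum_k log P(X_k | c)."""
    class_counts, value_counts, seen_values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for k, v in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            count = value_counts[(k, c)][v] + 1
            score += math.log(count / (n_c + len(seen_values[k])))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Invented toy records: (protocol, service) with a class label.
rows = [("tcp", "http"), ("tcp", "http"), ("icmp", "ecr_i"),
        ("icmp", "ecr_i"), ("tcp", "smtp")]
labels = ["normal", "normal", "dos", "dos", "normal"]
model = train(rows, labels)
```

The denominator P(X_1, …, X_i) is the same for every class, so it is dropped from the score, matching the third line of the derivation.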
Hybrid System Architecture
Experimental Results
  Four types of analysis were performed on the data set
    BLN analysis with network and host based data
    BLN analysis with network based data
    Hybrid analysis with network and host based data
    Hybrid analysis with network based data
Experimental Results
                                    BLN-Host/Network Based   BLN-Network Based   Hybrid-Host/Network Based   Hybrid-Network Based
  Total Cases                       65,505                   62,047              65,505                      62,047
  Correctly Classified              65,019                   59,734              65,238                      61,631
  % Correctly Classified            99.26%                   96.27%              99.59%                      99.33%
  Number of Incorrectly Classified  486                      2,315               267                         416
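The percentage row can be recomputed from the total and correctly classified counts in the table. The sketch below uses only those reported counts; the dictionary keys and the helper name `accuracy_percent` are hypothetical.

```python
# (total cases, correctly classified) as reported in the results table.
results = {
    "BLN-Host/Network": (65505, 65019),
    "BLN-Network": (62047, 59734),
    "Hybrid-Host/Network": (65505, 65238),
    "Hybrid-Network": (62047, 61631),
}


def accuracy_percent(total: int, correct: int) -> float:
    """Percentage of correctly classified cases, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


accuracies = {name: accuracy_percent(total, correct)
              for name, (total, correct) in results.items()}
```

The computed values agree with the reported percentages; on network based data alone, the hybrid system is about three percentage points more accurate than the BLN.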
Future and Current Work
  HoneyNet Project
  Resource Management System with Intelligent System Processing at the Core
Conclusion
  Intelligent Systems algorithms are very useful tools for applications in Network Security
  Experimental results show that a hybrid system built with SOM and BLN can classify network based data flows very accurately, which is promising for those wishing to design classification systems that do not rely on host based data