prediction methods for mitigating computer security threats
Post on 20-Jun-2015
Prediction Methods for Mitigating
Computer Security Threats
Errin W. Fulp
Department of Computer Science
Errin W. Fulp Prediction Methods for Mitigating Computer Security Threats
Outline
Overview of data mining methods
Machine learning tools, techniques, and tasks
Preprocessing, data mining, and interpretation
Prediction or knowledge discovery
When applied to computer security
Large data sets and rare events (at least we hope...)
Methods for addressing each concern
Example application, function discovery in computer networks
Who is doing what in a computer network?
Identify the application based on the pattern of interactions
What is Data Mining?
Extracting hidden patterns from data
Can be used to uncover existing hidden patterns
...but it cannot uncover patterns not already in the data
Typically two major objectives
Knowledge discovery - determine facts about the data
Forecasting or predictions - predict future events
Both are relevant to computer security
Steps in the Process
Standard data-oriented view of Knowledge Discovery in Databases
Data → (selection) → Target Data → (preprocessing) → Preprocessed Data → (transformation) → Transformed Data → (data mining) → Patterns → (interpretation) → Knowledge
Let’s divide into a process-oriented view
Preprocessing → (transformed data) → Data Mining → (patterns) → Interpretation
Preprocessing Data
Once the objective is determined, assemble the data
Again, can only uncover existing patterns
Clean the data: remove noise and account for missing data
Remove unwanted data that hinders analysis... but what counts as
noise with regard to security?
Do we really want to remove outliers?
Reduce and transform data into important feature vectors
Host Facility Level Tag Time Message
198.129.8.6 local7 notice 189 1171061732 sysstat
198.129.8.6 kern info 6 1171061732 kernel md: using maximum available idle IO bandwidth
198.129.8.6 cron info 78 1171061733 crond 2500 (root) CMD (/usr/lib/sa/sa1 1 1)
198.129.8.6 auth info 38 1171062445 rsh(pam_unix) 2215 session opened for user by (uid=0)
198.129.8.6 auth info 38 1171062445 in.rshd 2216 root@hpcs2.cs.edu as root: cmd=/root/temps
198.129.8.6 daemon info 30 1171062590 smartd 88 Device: /dev/twe0 SMART Prefailure Attribute
198.129.8.18 syslog info 46 1171062590 syslogd restart.
198.129.7.282 daemon info 30 1171062590 ntpd 2555 synchronized to 198.129.149.218, str
198.129.7.222 daemon info 30 1171062590 ntpd 2555 synchronized to 198.129.149.218, str
198.129.7.238 daemon info 30 1171062590 ntpd 2555 synchronized to 198.129.149.218, str
198.129.8.6 auth notice 37 1171062590 sshd(pam_unix) 12430 auth failure; logname=el-fork-o
198.129.8.6 kern info 6 1171062590 kernel md: using 512k, over a total of 12287936 blocks.
198.129.8.6 cron info 78 1171062601 crond 2500 (root) CMD (/usr/lib/sa/fork-it 1 1)
198.129.8.6 kern alert 1 1171062692 kernel raid5: Disk failure on sde1, disabling device
preprocessing
[Figure: tag number (0 to 200) versus time in seconds (1.1778 × 10^9 to 1.1785 × 10^9), series h198.129.146.158]
transformation
tag   Encoding (e)   Sequence   f (base 10)
148   2              2
148   2              22
158   2              222
40    1              2221
158   2              22212      239
188   2              22122      233
188   2              21222      215
88    1              12221      160
158   2              22212      239
188   2              22122      215
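The transformation table above appears to read the five most recent encoding digits as one base-3 number (for example, 22212 in base 3 is 239). A minimal Python sketch under that assumption follows; the window length of 5, the base of 3, and the function name `window_values` are inferred or hypothetical and are not stated on the slide.

```python
from collections import deque

def window_values(encodings, width=5, base=3):
    """Yield the integer value of each full sliding window of digits."""
    window = deque(maxlen=width)
    for digit in encodings:
        window.append(digit)
        if len(window) == width:
            value = 0
            for d in window:  # the most recent digit is the least significant
                value = value * base + d
            yield value

# First nine values of the encoding column, in order of arrival.
encodings = [2, 2, 2, 1, 2, 2, 2, 1, 2]
features = list(window_values(encodings))
```

Each value in `features` summarizes the recent message history as a single number, which is one way to turn a stream of log tags into a feature for a classifier.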
Types of Data Mining
Preprocessing → (transformed data) → Data Mining (Classification, Clustering, Regression, Rule Learning) → (patterns) → Interpretation
Classification
Arrange data into predefined groups, developed from training
Learn a model (classifier) from labeled training data
Examples include k-nearest neighbor and support vector machines
Typically training is slow, but classification is fast
When applied to security (specifically IDS) [CBK]
1. Cluster the training data using a clustering algorithm
2. For new data, the distance to the closest cluster is the anomaly score
Assumption: normal data instances belong to specific cluster(s) in the data, while
anomalous instances do not. Normal data lies closest to a cluster centroid.
Can also perform semi-supervised training
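The two-step scoring above can be sketched in a few lines of Python. This is an illustration only: it assumes step 1 has already grouped the training data, uses Euclidean distance to the centroid as the anomaly score, and all data values and function names are hypothetical.

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def anomaly_score(x, centroids):
    """Distance from x to the closest cluster centroid."""
    return min(math.dist(x, c) for c in centroids)

# Step 1 (assumed already done): training data grouped into clusters.
clusters = [
    [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2)],
    [(5.0, 5.0), (5.2, 4.8), (4.8, 5.2)],
]
centroids = [centroid(c) for c in clusters]

# Step 2: score new instances; a larger score means more anomalous.
normal_score = anomaly_score((1.1, 0.9), centroids)
outlier_score = anomaly_score((3.0, 9.0), centroids)
```

In practice a threshold on the score, chosen from validation data, would decide which instances are reported.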
Clustering
Arrange data into groups, but the groups are not predefined
No training data required, therefore no training time...
Attack Graph / Cluster Representation
[Figure: an attack graph (nodes such as execCode(commServer,root), joined by rules such as "RULE 2 (remote exploit of a server program)") shown beside two plots whose axes are both ticked from 5 to 40]
Examples of clustering methods include
k-means clustering and fuzzy clustering
These methods have difficulty with higher-dimensional data [CBK]
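As an illustration of k-means, the following Python sketch implements Lloyd's algorithm in its simplest form. It is not taken from the slides: seeding the centroids with the first k points is a simplification chosen to keep the example deterministic, and the data points are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids."""
    centroids = list(points[:k])
    groups = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: recompute centroids; keep the old one if a group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
centroids, groups = kmeans(points, k=2)
```

No labels are supplied anywhere, which is the sense in which the groups are not predefined. Real implementations usually pick the initial centroids at random and stop once the assignments no longer change.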
Regression
Model the data with the least error
Useful for forecasting and prediction
As applied to security, regression typically has two steps
1 Fit regression model to the data
2 For each test instance, residual determines anomaly score
Presence of anomalies can influence the robustness of the model
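A minimal Python sketch of the two regression steps, assuming a straight-line model fitted by ordinary least squares and the absolute residual as the anomaly score. The training values and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_score(x, y, slope, intercept):
    """Absolute residual of a test instance against the fitted line."""
    return abs(y - (slope * x + intercept))

# Step 1: fit the model to training data that follows y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(train_x, train_y)

# Step 2: the residual of each test instance is its anomaly score.
typical = residual_score(5.0, 11.0, slope, intercept)
unusual = residual_score(5.0, 20.0, slope, intercept)
```

If the training data itself contained anomalies, the fitted slope and intercept would shift toward them, which is the robustness concern noted above.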
Association Rule Learning
Searches for relationships between variables
Learn rules that capture normal behavior; any test instance that is not
covered by a rule is an anomaly (one-class) [EEGPP06, LSM98]
For multi-class
Learn rules from training data
using a rule-learning algorithm; each rule has a
confidence value
For each test instance, find the
best rule; the inverse of its
confidence is the anomaly score
if UDP is AVERAGE ∧ TCP is AVERAGE then ICMP is AVERAGE
if SYN is AVERAGE ∧ FIN is AVERAGE then ICMP is AVERAGE
if ICMP is AVERAGE ∧ UDP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
if UDP is AVERAGE ∧ FIN is AVERAGE then SYN is AVERAGE
if UDP is AVERAGE ∧ SYN is AVERAGE then ICMP is AVERAGE
if SYN is AVERAGE then ICMP is AVERAGE
if ICMP is AVERAGE ∧ FIN is AVERAGE then SYN is AVERAGE
if UDP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE ∧ FIN is AVERAGE then ICMP is AVERAGE
if UDP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
if ICMP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
if ICMP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
...
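To make the scoring concrete, here is a small Python sketch. It assumes the usual definition of confidence (the fraction of records matching a rule's antecedent that also match its consequent) and takes "best rule" to mean the highest-confidence rule that fully covers the test instance. The records, the `feature=LEVEL` item encoding, and the function names are hypothetical.

```python
def confidence(transactions, antecedent, consequent):
    """Fraction of records matching the antecedent that also match the consequent."""
    matching = [t for t in transactions if antecedent <= t]
    if not matching:
        return 0.0
    return sum(1 for t in matching if consequent <= t) / len(matching)

# Hypothetical training records: each is a set of "feature=LEVEL" items.
transactions = [
    {"UDP=AVERAGE", "TCP=AVERAGE", "ICMP=AVERAGE"},
    {"UDP=AVERAGE", "TCP=AVERAGE", "ICMP=AVERAGE"},
    {"UDP=AVERAGE", "TCP=AVERAGE", "ICMP=HIGH"},
    {"SYN=AVERAGE", "ICMP=AVERAGE"},
]

rules = [
    (frozenset({"UDP=AVERAGE", "TCP=AVERAGE"}), frozenset({"ICMP=AVERAGE"})),
    (frozenset({"SYN=AVERAGE"}), frozenset({"ICMP=AVERAGE"})),
]
scored = [(a, c, confidence(transactions, a, c)) for a, c in rules]

def anomaly_score(instance, scored_rules):
    """Inverse confidence of the best rule that fully covers the instance."""
    best = max((conf for a, c, conf in scored_rules if a | c <= instance),
               default=0.0)
    return float("inf") if best == 0.0 else 1.0 / best

covered = anomaly_score({"UDP=AVERAGE", "TCP=AVERAGE", "ICMP=AVERAGE"}, scored)
uncovered = anomaly_score({"FIN=HIGH"}, scored)
```

An instance that no rule covers receives an infinite score here, which mirrors the one-class case where uncovered instances are treated as anomalies.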
Interpreting the Results
Final step of the process, evaluate the patterns discovered
Not all patterns are valid, and some are valid only for a limited time period
Standard measures: accuracy, precision, recall, and F-score
Unbalanced test sets are a concern
Overfitting: the model does an excellent job of fitting the training data, but a poor job of predicting
It finds patterns in the training set that are not present in the test set
[Figure: data points with an overfit model and a correct model; x axis from 0 to 1, y axis from -3 to 3]
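The standard measures, and the reason unbalanced test sets are a concern, can be illustrated with a short Python sketch. The test set below is hypothetical: 98 normal events and 2 attacks, scored against a detector that never raises an alert.

```python
def evaluate(actual, predicted, positive=1):
    """Return accuracy, precision, recall, and F-score for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return accuracy, precision, recall, f_score

# Hypothetical unbalanced test set: 98 normal events (0) and 2 attacks (1).
actual = [0] * 98 + [1] * 2
always_normal = [0] * 100  # a detector that never raises an alert
accuracy, precision, recall, f_score = evaluate(actual, always_normal)
```

The detector is 98% accurate yet finds no attacks, so its recall and F-score are zero. On rare-event data, accuracy alone says very little.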
When Applied to Computer Security
Two major issues...
Large data sets
Rare events
Security and Large Data Sets
Security typically involves large data sets
Sendmail “11,500 system calls per message” [WGZ08]
1998 MIT network data, 7 weeks is about 5 million connections
Must be processed quickly and accurately
Data-oriented solutions
Discretization, feature selection [FFH08], feature construction
(principal component analysis) [WGZ04], and sampling [PP07]
Method-oriented solutions
Parallel data mining (high-performance data mining)
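As one concrete example of a data-oriented solution, the Python sketch below draws a fixed-size uniform sample from a stream of unknown length using reservoir sampling (Algorithm R). This is a generic technique substituted for illustration; it is not the adaptive sampling algorithm of [PP07], and the stream contents are hypothetical.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            sample.append(item)
        else:
            # Item number index + 1 replaces an entry with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                sample[j] = item
    return sample

# Hypothetical stream of one million connection identifiers.
sample = reservoir_sample(range(1_000_000), k=100, seed=0)
```

Only `k` items are held in memory at any time, so the stream never has to be stored in full.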
Security and Rare Events
Rare event processing is often required
We hope security events are infrequent...
Are there enough examples for supervised learning?
Black swan theory (hard to predict, high consequence, and easy to
see afterwards)
Bulk anomalies (worms) are the opposite... [CBK]
Standard approaches do not work well with rare events [JAK01]
Normal events may be similar, but rare events are often different
Many techniques attempt to model normal behavior and look for deviations
Over-sample the rare class, down-sample the large class, or generate artificial cases
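A rough Python sketch of the first two rebalancing options: random down-sampling of the large class and random over-sampling (with replacement) of the rare class. The record format, the class sizes, and the `target` parameter are hypothetical, and generating artificial cases is not shown.

```python
import random

def rebalance(majority, minority, target, seed=None):
    """Down-sample the majority and over-sample the minority to `target` records each."""
    rng = random.Random(seed)
    kept = rng.sample(majority, target)                      # without replacement
    boosted = [rng.choice(minority) for _ in range(target)]  # with replacement
    return kept, boosted

# Hypothetical data: 1000 normal records and 5 attack records.
normal = [("normal", i) for i in range(1000)]
attacks = [("attack", i) for i in range(5)]
kept, boosted = rebalance(normal, attacks, target=50, seed=1)
training_set = kept + boosted
```

Over-sampling only repeats the few rare examples that exist, so a classifier trained on the result may memorize them. Evaluation should still use data with the original class balance.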
Rare Events in Other Areas
Insurance risk modeling [PRA00]
E-commerce and web mining, “Online merchants convert an
average of 2%-3% of their site visitors into buyers”
Churn analysis, “number of customers that end relationship with a
company in a given period” [NGK+06]
Hardware faults, for example new disk failures [AWG+93]
Airline No-Show predictions [LHC03]
Example Security Application: Who is Doing What?
Given a computer network, discover what computers are doing
Specifically what applications or types of applications
Identifying an application is important for two reasons
Management of network resources
Compliance with security policies
However, current methods do not always work
Port numbers are unreliable
Payloads can be encrypted
Current in-the-dark methods can be defeated
A New Approach
Given a set of computer network trace data, is it possible to
identify the application protocols (e.g. HTTP, AIM, DNS) that
hosts are using, based on interaction patterns?
Three different views of the same network
Physical Logical Application
Motifs
A motif is a pattern of interconnections occurring in complex
networks at numbers that are significantly higher than those in
randomized networks
Motifs have been applied to several complex networks
Gene regulation, neural networks, ecosystem food webs, electronic
circuits (forward logic chips, digital fractional multipliers), and
the World Wide Web
Certain motifs can be linked to specific functions
Applying this Idea to Application Identification
Parse data → Construct application graphs → Create motif profiles → Nearest neighbor classification → Interpret results → Evolutionary attribute weighting
Step annotations: ugggh...; time consuming; easy; time consuming; talk to grad student...
Preprocessing
Collect data, parse into connection information
Find all order 3 and 4 motifs and build motif profiles
k-nearest-neighbor classification (for training and testing)
Interpret results, possibly weight features to improve performance
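To give a feel for the motif-profile step, the Python sketch below counts the two connected order-3 subgraph types (open path and triangle) in a small undirected graph. It is a simplification: the slides do not say whether the graphs are directed, a true motif analysis also compares counts against randomized networks, and the brute-force search over all node triples only suits small graphs. The example graph and names are hypothetical.

```python
from itertools import combinations

def order3_profile(edges):
    """Count connected induced 3-node subgraphs: open paths and triangles."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    counts = {"path": 0, "triangle": 0}
    for trio in combinations(sorted(neighbors), 3):
        links = sum(1 for a, b in combinations(trio, 2) if b in neighbors[a])
        if links == 2:      # three nodes joined by two edges form an open path
            counts["path"] += 1
        elif links == 3:    # all three pairs connected
            counts["triangle"] += 1
    return counts

# Hypothetical application graph: a server "s" contacted by three clients,
# two of which also talk to each other.
edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "b")]
profile = order3_profile(edges)
```

Normalizing such counts, for example by dividing by the largest count, gives a vector that can be compared across applications.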
Initial Experiments
Sources of data
Dartmouth College campus wireless network, Fall 2003
OSDI Conference 2006
Lawrence Berkeley National Lab 2004/2005
Create a profile per application
Application x profile = 1.000 0.662 0.650 0.632 0.585
Application y profile = 0.900 0.672 0.50 0.772 0.85
Given new application, find best matching profile
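Finding the best matching profile can be sketched as a 1-nearest-neighbor lookup. The two stored profiles below are the x and y vectors from this slide; the use of Euclidean distance and the `unknown` query vector are assumptions made for illustration.

```python
import math

profiles = {
    "x": (1.000, 0.662, 0.650, 0.632, 0.585),
    "y": (0.900, 0.672, 0.50, 0.772, 0.85),
}

def best_match(query, profiles):
    """Return the label whose profile is closest to the query (1-nearest neighbor)."""
    return min(profiles, key=lambda label: math.dist(query, profiles[label]))

# Hypothetical profile measured for an unknown application.
unknown = (0.95, 0.66, 0.63, 0.65, 0.60)
label = best_match(unknown, profiles)
```

Weighting individual profile entries before computing the distance is one place where the attribute weighting mentioned earlier could be applied.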
Motif Profile Results
[Figure: motif profile results by application: AIM, DNS, HTTP, Kazaa, MSDS, Netbios, SSH]
Results are very good compared to traditional graph statistics
Although there is a problem with AIM and SSH...
So what is the problem...?
So What is the Problem?
For Further Reading I
[AWG+93] Chidanand Apte, Sholom M. Weiss, and Gordon Grout.
Predicting defects in disk drive manufacturing: A case study.
In Proceedings of the IEEE CAIA93, pages 212–218, 1993.
[CBK] Varun Chandola, Arindam Banerjee, and Vipin Kumar.
Anomaly detection: A survey.
To appear in ACM Computing Surveys, September 2009.
[EEGPP06] Aly ElSemary, Janica Edmonds, Jesus Gonzalez-Pino, and Mauricio Papa.
Applying data mining of fuzzy association rules to network intrusion detection.
In Proceedings of the IEEE Workshop on Information Assurance, 2006.
[FFH08] Errin W. Fulp, Glenn A. Fink, and Jereme N. Haack.
Predicting computer system failures using support vector machines.
In Proceedings of the Workshop on Analysis of System Logfiles, 2008.
[JAK01] Mahesh V. Joshi, Ramesh C. Agarwal, and Vipin Kumar.
Mining needle in a haystack: classifying rare classes via two-phase rule induction.
In SIGMOD ’01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data,
pages 91–102, 2001.
[LHC03] Richard D. Lawrence, Se June Hong, and Jacques Cherrier.
Passenger-based predictive modeling of airline no-show rates.
In Proceedings of the ninth ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, pages 397–406, 2003.
For Further Reading II
[LSM98] Wenke Lee, Salvatore J. Stolfo, and Kui W. Mok.
Mining audit data to build intrusion detection models.
In Proceedings of the International Conference on Knowledge Discovery and Data Mining, 1998.
[NGK+06] Scott A. Neslin, Sunil Gupta, Wagner Kamakura, Junxiang Lu, and Charlotte H. Mason.
Defection detection: Measuring and understanding the predictive accuracy of customer churn models.
Journal of Marketing Research, 43:204–211, 2006.
[PP07] Animesh Patcha and Jung-Min Park.
An adaptive sampling algorithm with applications to denial-of-service attack detection.
In Proceedings of the IEEE International Conference on Computer Communications and Networks, pages
11–16, 2007.
[PRA00] Edwin P. D. Pednault, Barry K. Rosen, and Chidanand Apte.
Handling imbalanced data sets in insurance risk modeling.
Technical Report RC-21731, IBM, 2000.
[WGZ04] Wei Wang, Xiaohong Guan, and Xiangliang Zhang.
A novel intrusion detection method based on principle component analysis in computer security.
In Proceedings of the International Symposium on Neural Networks, pages 657–662, 2004.
[WGZ08] Wei Wang, Xiaohong Guan, and Xiangliang Zhang.
Processing of massive audit data streams for real-time anomaly intrusion detection.
Computer Communications, 31(1):58–72, 2008.