Casting out Demons: Sanitizing Training Data for Anomaly Sensors
G. F. Cretu, A. Stavrou, M. E. Locasto, S. J. Stolfo, and A. D. Keromytis
IEEE Symp. on S&P 2008
Po-Ching Lin, Dept. of CSIE
National Chung Cheng University
Problem definition
Two main approaches to detecting malicious inputs, behavior, network traffic, etc.
Signature matching
Anomaly detection
Challenge of effective anomaly detection of malicious traffic
A highly accurate modeling of normal traffic
Real network traffic is usually polluted or unclean
Using it as the training data can be a problem
How can we sanitize training data for AD sensors?
2
Solution outline
[Figure: the training set is partitioned into micro-datasets, each producing a micro-model MM1–MM5; noise = attack or non-regularity; Mi = micro-model i]
Assumption: an attack or abnormality appears only in small subsets of a large training set.
The solution:
1. Test each packet with the micro-models using the voting scheme and build a "normal" model.
2. Data deemed abnormal is used for building an abnormal model.
3. The abnormal model can be distributed between sites.
4. A shadow sensor architecture handles false positives.
3
Assumption & micro models
Observation: over a long period, attacks and abnormalities are a minority class of data.
Deriving the micro-models
T = {md1, md2, ..., mdN}, where mdi is the micro-dataset starting at time (i − 1) ∗ g, with g from 3 to 5 hours
Mi = AD(mdi): the micro-model built from mdi
4
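To make the micro-model derivation above concrete, the following is a minimal Python sketch. The Packet type, the train_ad callback, and the 3-hour default for g are illustrative stand-ins for whatever sensor (e.g. Anagram or Payl) and traffic representation are actually used; none of these names come from the paper.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Packet:
    timestamp: float      # seconds since the start of the training trace
    payload: bytes

# AD(md_i): any anomaly-detection trainer (e.g. Anagram or Payl) -- a placeholder here
ADTrainer = Callable[[List[Packet]], object]

def derive_micro_models(training_set: List[Packet],
                        train_ad: ADTrainer,
                        g_hours: float = 3.0) -> List[object]:
    """Split T = {md1, ..., mdN} into micro-datasets of g hours each
    (md_i starts at time (i - 1) * g) and train one micro-model per md_i."""
    g = g_hours * 3600.0
    micro_datasets: Dict[int, List[Packet]] = {}
    for pkt in training_set:
        i = int(pkt.timestamp // g)              # index of the micro-dataset this packet falls into
        micro_datasets.setdefault(i, []).append(pkt)
    # M_i = AD(md_i), in time order
    return [train_ad(md) for _, md in sorted(micro_datasets.items())]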
Deriving the sanitized training set
Sanitize the training dataset
Test each packet Pj with all the micro-models Mi: Lj,i = TEST(Pj, Mi), where Lj,i = 1 if Pj is deemed abnormal and 0 otherwise.
Combine the outputs of the models: SCORE(Pj) = (1/W) Σi wi · Lj,i, where W = Σi wi.
Split the dataset: Tsan = {Pj | SCORE(Pj) ≤ V}, Msan = AD(Tsan); Tabn = {Pj | SCORE(Pj) > V}, Mabn = AD(Tabn).
5
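To illustrate the voting and splitting step, here is a minimal Python sketch under the same assumptions as the earlier one: test stands in for TEST(Pj, Mi) of whichever sensor is used, the default weights give simple voting (wi = 1), and the default threshold V = 0.3 is only an illustrative value, not one prescribed by the paper.

from typing import Callable, List, Optional, Sequence, Tuple

# TEST(P_j, M_i) -> 1 if micro-model M_i deems packet P_j abnormal, else 0
TestFn = Callable[[object, object], int]

def sanitize(training_set: Sequence[object],
             micro_models: Sequence[object],
             test: TestFn,
             V: float = 0.3,
             weights: Optional[Sequence[float]] = None) -> Tuple[List[object], List[object]]:
    """Split the training set into T_san and T_abn using
    SCORE(P_j) = (1/W) * sum_i w_i * L_{j,i}, with W = sum_i w_i.
    Simple voting corresponds to w_i = 1 for every micro-model."""
    if weights is None:
        weights = [1.0] * len(micro_models)      # simple voting
    W = sum(weights)
    T_san: List[object] = []
    T_abn: List[object] = []
    for pkt in training_set:
        score = sum(w * test(pkt, m) for w, m in zip(weights, micro_models)) / W
        (T_abn if score > V else T_san).append(pkt)
    return T_san, T_abn

# The sanitized and abnormal models are then M_san = AD(T_san) and M_abn = AD(T_abn).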
Evaluation of sanitization
Use two anomaly sensors for evaluation: Anagram and Payl
Experimental corpus
500 hours of real network traffic: 300 hours to build the micro-models, the next 100 hours to generate the sanitized model, and the remaining 100 hours for testing
From three different hosts: www, www1, and lists
With cross-validation
6
Without sanitizing the training data
A: Anagram; A-S: Anagram + Snort; A-SAN: Anagram + sanitization; P: Payl; P-SAN: Payl + sanitization
V ∈ [0.15, 0.45]
7
Analysis of sanitization parameters
Three parameters for fine-tuning:
The granularity of the micro-models
The voting algorithm (simple voting vs. weighted voting)
The voting threshold
8
Simple voting vs. Weighted voting
9
Results from the other two hosts
10
Granularity impact
11
Other impacts
12
Latency for different ADs
13
Long-lasting training attacks
14
Collaborative Sanitization
Comparing models of abnormality with those generated by other sites
Direct model differencing
Mcross = Msan − {Mabni ∩ Msan}
Indirect model differencing
Differencing the sets of packets used to compute the models
If a packet Pj is considered abnormal by at least one Mabni, its features are extracted for computing the new local abnormal model
Otherwise, the packet is used for computing the cross-sanitized model
15
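A minimal Python sketch of the indirect model differencing described above, reusing the test interface assumed in the earlier sketches; the feature-extraction step is simplified to collecting the flagged packets.

from typing import Callable, Iterable, List, Sequence, Tuple

TestFn = Callable[[object, object], int]   # 1 if the model deems the packet abnormal

def cross_sanitize(T_san: Iterable[object],
                   remote_abnormal_models: Sequence[object],
                   test: TestFn) -> Tuple[List[object], List[object]]:
    """Indirect model differencing: packets in T_san flagged by at least one
    remote M_abn_i go into the new local abnormal set; the rest form T_cross,
    from which the cross-sanitized model M_cross = AD(T_cross) is computed."""
    T_cross: List[object] = []
    newly_abnormal: List[object] = []
    for pkt in T_san:
        if any(test(pkt, m) for m in remote_abnormal_models):
            newly_abnormal.append(pkt)
        else:
            T_cross.append(pkt)
    return T_cross, newly_abnormal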
Training attacks
16
Conclusion & limitations in this paper
The capability of anomaly detection in the micro-models: the effectiveness of Payl and Anagram?
The traffic in the evaluation: "normality" is diverse in a real environment
Deriving packets to form the training set is stateless: attacks can span multiple packets or even connections
17