learning factor graphs for preempting multi-stage attacks in...

[Figure: EM learning for g(x, z1, z2). x-axis: Iteration (1 to 10); y-axis: g(x, z1, z2) (0.0 to 1.0); series: z1 = 0, z1 = 1, z2 = 0, z2 = 1.]

Approach

Learning Factor Graphs for Preempting Multi-Stage Attacks in Cloud Infrastructure
Phuong Cao, Alexander Withers, Zbigniew Kalbarczyk, Ravishankar Iyer
University of Illinois at Urbana-Champaign

Result on learning parameters
The Expectation-Maximization algorithm converges quickly: the marginal probability of $z_i$ for a 3-variable clique converges within 3 iterations.

Goals
• Detect multi-stage attacks that make use of stolen credentials in large enterprise networks, e.g., cloud infrastructure.
• Employ factor graphs, a probabilistic graphical model, to capture attacker behavior and detect malicious activities.
– Learning the graph structure that represents dependencies among observed events and attack stages
– Learning the graph parameters, i.e., factor functions among observed events and hidden attack stages, that represent the strength of their dependencies in a factor graph corresponding to an ongoing attack

Future Work
1. Automatically build graphs for evaluating the security of pre-deployed cloud applications
2. Evaluate learned graphs in terms of detection accuracy or model complexity
3. Build models to automatically respond to ongoing attacks

References
[1] Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[2] Cao, Phuong, et al. "Preemptive intrusion detection." Proceedings of the 2014 Symposium and Bootcamp on the Science of Security. ACM, 2014.

Acknowledgement
This material is based upon work supported by the Maryland Procurement Office under Contract No. H98230-14-C-0141. We thank the National Center for Supercomputing Applications staff for insightful discussions on incident reports of multi-stage attacks.

[Figure: workflow diagram. Labels: Cloud Config / Markup Document; Construct Attack Graph; Attack Graph; Attack Paths; Past incidents; Events; Build factor graph; Factor functions; Learn parameters; Determine Attack Stage, with (s1, s2, s3, s4, s5) = (0.1, 0.3, 0.1, 0.4, 0.1).]

Motivating Example: A Credential Stuffing Attack

[Figure: a credential stuffing attack. Labels: Attacker; Remote Location; Black Markets; Leaked Credentials; Password *********; Botnet; Automated guessing; Target Site; Firewall; Compromised Machine; File system /dev/shm; ssh-rsa AAAAB3…; Compile RHEL exploit; Escalate to #root; Venom Rootkit; Loadable Kernel Module; Userland backdoor; Port 9090; Network scanning tools; Invisible to network scanning tools; packet fields src, seq, dst, ack, … (3 x). Attack payloads: steal credentials / secret keys; launch denial of service; send spam; mine Bitcoins; account takeover ($$$).]

Attack steps shown in the figure:
1. Buy, steal, or phish credentials
2. Masquerade
3. Drop exploit kit
4. Escalate privilege
5. Deploy rootkit
6. Open port upon receiving port knocking sequence
7. Send port knocking sequence (TCP dst + seq = 1221)
8. Perform attack payloads

Credential Stuffing: Attacks that exploit leaked credentials for automated penetration of systems.

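[Figure: factor graph with hidden attack-stage variables $z_1, \dots, z_5$ and observed event variables $x_1, \dots, x_6$, with $(z_1, z_2, z_3, z_4, z_5) = (0.1, 0.3, 0.1, 0.4, 0.1)$.]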

$z_i \in Z$, $z_i \in \{0, 1\}$ represent hidden attack stages
$x_i \in X$, $x_i \in [1, |X|]$ represent observed events

f(s1)   z1
0.10    0
0.90    1

g(x1, z1, z2)   x1   z1   z2
?               1    0    0
?               1    0    1
?               1    1    0
?               1    1    1

$z_1$: initial compromise
$z_2$: host hopping
$z_3$: escalate privileges
$z_4$: maintain presence
$z_5$: deliver payload

The most probable attack stage: "maintain presence"

$x_1$: incorrect logins
$x_2$: successful login
$x_3$: sensitive file system location
$x_4$: download source file
$x_5$: compile
$x_6$: new kernel module

$$P(x, z \mid \theta) = \frac{1}{Z} \prod_{c \in \mathcal{C}} \theta_c^{T} f_c(x_c, z_c)$$

h(z2)   z2
0.70    0
0.30    1

The graph structure consists of edges that specify links among variables.
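As a worked illustration of how tabular factors combine, assume (hypothetically) that the graph contains only the three factors $f(z_1)$, $h(z_2)$, and $g(x_1, z_1, z_2)$ shown above, and that each table entry is used directly as the factor's value. Under that assumption, the unnormalized weight of the assignment $x_1 = 1$, $z_1 = 1$, $z_2 = 0$ would be:

```latex
\begin{align*}
\tilde{P}(x_1 = 1, z_1 = 1, z_2 = 0)
  &= f(z_1 = 1)\, h(z_2 = 0)\, g(1, 1, 0) \\
  &= 0.90 \times 0.70 \times g(1, 1, 0) \\
  &= 0.63\, g(1, 1, 0)
\end{align*}
```

Dividing by the sum of such weights over all assignments (the constant $Z$) gives a probability. The value $g(1, 1, 0)$ stays symbolic here because it is one of the parameters to be learned.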

[Figure: approach diagram. Labels: Past Incidents (Dataset D); Independence Test; Dependent Events, e.g., ($z_2$, $x_1$), ($z_3$, $x_3$), ($z_5$, $x_5$), …; Expectation Maximization of $\theta$; Factor Functions, e.g., $g(x_1, z_1, z_2)$ with learned parameters $\theta$; Events at run-time; Ongoing attacks; Bro logs, System logs, Network flows.]

Learning graph structure (offline and runtime)
The goal of learning the graph structure, i.e., the factor graph, is to automatically establish dependencies among observed events and hidden attack stages by using a $\chi^2$ independence test on training data $D$. The dependencies are used to construct a set of model candidates $m_i \in M$, e.g., a simple model using only the strongest dependencies or a complex model using all dependencies.
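The independence test can be sketched as follows for a single (event, stage) pair. This is a minimal illustration, not the authors' implementation: it assumes each event is reduced to a binary indicator, uses the Pearson statistic on a 2x2 contingency table, and compares it with the standard critical value 3.841 (1 degree of freedom, significance level 0.05). The incident data and function names are hypothetical.

```python
from collections import Counter

# Critical value of the chi-square distribution with 1 degree of freedom at
# significance level 0.05 (standard table value).
CHI2_CRIT_DF1_ALPHA05 = 3.841


def chi_square_2x2(pairs):
    """Pearson chi-square statistic for (x, z) pairs with x, z in {0, 1}."""
    n = len(pairs)
    observed = Counter(pairs)
    x_totals = Counter(x for x, _ in pairs)
    z_totals = Counter(z for _, z in pairs)
    stat = 0.0
    for x in (0, 1):
        for z in (0, 1):
            # Expected count if x and z were independent.
            expected = x_totals[x] * z_totals[z] / n
            if expected > 0:
                stat += (observed[(x, z)] - expected) ** 2 / expected
    return stat


def is_dependent(pairs, critical=CHI2_CRIT_DF1_ALPHA05):
    """Keep an edge between event and stage when independence is rejected."""
    return chi_square_2x2(pairs) > critical


# Hypothetical incidents: (event observed?, attack stage active?)
correlated = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
unrelated = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

print(is_dependent(correlated))  # edge kept
print(is_dependent(unrelated))   # edge dropped
```

Pairs that pass the test become candidate edges; a simple model would keep only the pairs with the largest statistics, while a complex model would keep all of them.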

$$\mathrm{score}(m_i \mid D) = \max_{\theta} \log P(x, z, m_i, D, \theta) + \log \theta P(\theta \mid m_i) - \dim(m_i) \ln |D|$$

A model candidate $m_i$ is scored based on three terms, in respective order:
- Goodness of fit with the training data ($\max_{\theta} \log P(x, z, m_i, D, \theta)$)
- Entropy of $\theta$, to avoid overfitting and favor model stability ($\log \theta P(\theta \mid m_i)$)
- Complexity of the model and availability of training data ($\dim(m_i) \ln |D|$)
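To make the trade-off between fit and complexity concrete, here is a small sketch that scores two hypothetical candidates for one binary event x and one binary stage z: a simple model in which x ignores z (one parameter) and a complex model in which x depends on z (two parameters). It assumes a flat prior, so the second term contributes 0, and it uses maximum-likelihood Bernoulli fits; the data and names are illustrative only.

```python
import math


def bernoulli_loglik(ones, total):
    """Maximized log-likelihood of `total` binary draws containing `ones` ones."""
    if total == 0 or ones == 0 or ones == total:
        return 0.0  # the fitted probability is 0 or 1, so the log-likelihood is 0
    p = ones / total
    return ones * math.log(p) + (total - ones) * math.log(1 - p)


def score(loglik, dim, n, log_prior=0.0):
    """Fit term + prior term - complexity penalty dim * ln|D|."""
    return loglik + log_prior - dim * math.log(n)


def score_candidates(pairs):
    """Score the simple (x independent of z) and complex (x depends on z) models."""
    n = len(pairs)
    simple_ll = bernoulli_loglik(sum(x for x, _ in pairs), n)
    complex_ll = 0.0
    for z_val in (0, 1):
        xs = [x for x, z in pairs if z == z_val]
        complex_ll += bernoulli_loglik(sum(xs), len(xs))
    return {
        "simple": score(simple_ll, dim=1, n=n),
        "complex": score(complex_ll, dim=2, n=n),
    }


# Hypothetical incidents: (event observed?, attack stage active?)
correlated = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
unrelated = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

for name, data in (("correlated", correlated), ("unrelated", unrelated)):
    scores = score_candidates(data)
    print(name, max(scores, key=scores.get), scores)
```

When the event really depends on the stage, the better fit of the complex model outweighs its larger penalty; when it does not, the penalty makes the simple model win.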

Learning graph parameters (offline)
The goal of learning graph parameters is to automatically define the parameters $\theta$ of the factor functions in tabular form.

The Expectation Maximization (EM) algorithm is used for learning the parameters of each factor function because it can handle missing or incomplete training data, which is the case for most multi-stage attacks.

Input. Training dataset $D$ of past attacks
Init. Start with a random initialization $\theta^{0}$
Repeat until convergence:
E-step. Calculate the expected log-likelihood
$$Q(\theta \mid \theta^{t}) = E_{z \mid x, \theta^{t}}[\log P(x, z, \theta)]$$
M-step. Choose the parameters that maximize it
$$\theta^{t+1} = \arg\max_{\theta} Q(\theta \mid \theta^{t})$$
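The E-step and M-step can be sketched on a deliberately tiny model. This is an illustration under stated assumptions, not the poster's factor graph: one binary hidden stage z, one binary observed event x, parameters P(z=1) = pi and P(x=1 | z) = theta[z], and a hypothetical dataset in which most incidents lack a stage label (z is None).

```python
import math
import random


def em_partially_labeled(data, iterations=10, seed=0):
    """EM for a binary hidden stage z and a binary observed event x.

    data: list of (x, z) with z in {0, 1}, or None when the stage is unknown.
    Returns (pi, theta, history), where history holds the observed-data
    log-likelihood at the start of each iteration.
    """
    rng = random.Random(seed)
    pi = rng.uniform(0.3, 0.7)                               # random init of P(z=1)
    theta = [rng.uniform(0.2, 0.4), rng.uniform(0.6, 0.8)]   # P(x=1 | z=0), P(x=1 | z=1)
    history = []
    for _ in range(iterations):
        # E-step: responsibility r = P(z=1 | x) under the current parameters.
        resp, loglik = [], 0.0
        for x, z in data:
            p1 = pi * (theta[1] if x else 1 - theta[1])
            p0 = (1 - pi) * (theta[0] if x else 1 - theta[0])
            if z is None:
                resp.append(p1 / (p0 + p1))
                loglik += math.log(p0 + p1)
            else:
                resp.append(float(z))
                loglik += math.log(p1 if z else p0)
        history.append(loglik)
        # M-step: closed-form maximizers of the expected log-likelihood.
        total1 = sum(resp)
        total0 = len(data) - total1
        pi = total1 / len(data)
        theta[1] = sum(r * x for r, (x, _) in zip(resp, data)) / total1
        theta[0] = sum((1 - r) * x for r, (x, _) in zip(resp, data)) / total0
    return pi, theta, history


# Hypothetical training set: a few labeled incidents, many unlabeled ones.
labeled = [(1, 1)] * 8 + [(0, 1)] * 2 + [(0, 0)] * 8 + [(1, 0)] * 2
unlabeled = [(1, None)] * 30 + [(0, None)] * 30
pi, theta, history = em_partially_labeled(labeled + unlabeled)
print(pi, theta, history[-1])
```

The log-likelihood recorded in `history` should never decrease from one iteration to the next, which is the property that makes EM usable when stage labels are missing.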

Inference of ongoing attack stages (runtime)
Given a factor graph of an ongoing attack at runtime, inference determines the most likely unknown attack stage and outputs a confidence level for each stage.

$$z^{*} = \arg\max_{z} P(x, z, \theta)$$

At this stage, off-the-shelf inference techniques such as Belief Propagation, Markov Chain Monte Carlo, or Variational Inference can be employed.
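For a graph this small, the arg max can be found by brute-force enumeration, which the following sketch uses in place of the off-the-shelf techniques named above. The tables for f and h use the values shown on the poster; the entries of g are hypothetical placeholders, since the poster leaves them as "?". The sketch also assumes these three factors are the only ones in the graph.

```python
from itertools import product

f = {0: 0.10, 1: 0.90}   # f(z1), values from the poster
h = {0: 0.70, 1: 0.30}   # h(z2), values from the poster
# g(x1, z1, z2): hypothetical values, the poster leaves them as "?".
g = {(1, 0, 0): 0.2, (1, 0, 1): 0.1, (1, 1, 0): 0.3, (1, 1, 1): 0.4}


def infer(x1):
    """Return the most likely (z1, z2), the posterior table, and stage marginals."""
    weights = {
        (z1, z2): f[z1] * h[z2] * g[(x1, z1, z2)]
        for z1, z2 in product((0, 1), repeat=2)
    }
    norm = sum(weights.values())  # normalizing constant for the observed x1
    posterior = {z: w / norm for z, w in weights.items()}
    z_star = max(posterior, key=posterior.get)
    # Confidence level for each stage: P(z_i = 1 | x1).
    marginals = {
        "z1": sum(p for (z1, _), p in posterior.items() if z1 == 1),
        "z2": sum(p for (_, z2), p in posterior.items() if z2 == 1),
    }
    return z_star, posterior, marginals


z_star, posterior, marginals = infer(x1=1)
print(z_star, marginals)
```

Enumeration grows exponentially with the number of hidden stages, which is why approximate or message-passing inference is the practical choice for larger graphs.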

Determine the response to the identified attack stage; in this example, $z_4$ (maintain presence) is the most probable attack stage.

[Figure: bar chart of $P(z)$ for attack stages 1 to 5: 0.1, 0.3, 0.1, 0.4, 0.1.]
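Selecting the stage to respond to from the distribution above can be sketched in a few lines. The stage names and probabilities come from the example; the `min_confidence` threshold is a hypothetical addition to show how a low-confidence result might be withheld.

```python
STAGES = [
    "initial compromise",
    "host hopping",
    "escalate privileges",
    "maintain presence",
    "deliver payload",
]
P_Z = [0.1, 0.3, 0.1, 0.4, 0.1]  # P(z) from the example


def select_stage(probs, names=STAGES, min_confidence=0.3):
    """Return (stage name, confidence), or None if no stage is confident enough.

    min_confidence is a hypothetical threshold, not part of the original work.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < min_confidence:
        return None
    return names[best], probs[best]


print(select_stage(P_Z))
```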

Illustration of the proposed approach in the context of a credential stuffing attack.
