mechanisms for database intrusion detection and response
1
MECHANISMS FOR DATABASE INTRUSION
DETECTION AND RESPONSE
ASHISH KAMRA
Electrical and Computer Engineering
Ph.D. Defense Talk, May 12, 2010
Purdue University
2
Thesis Statement
To Build a Real Time Anomaly Detection and Response
Mechanism for Databases Integrated with the
Core Database Operations
3
Keywords
Real Time Intrusion Detection
Query Parse Tree
Not Anomalous
Anomalous
Query Execution
Intrusion Response
DBMS Integration
DB Server Process
Intrusion Detection and
Response
4
MOTIVATION
5
Importance of Data
Most sensitive and proprietary data resides in a DBMS
A database breach can be extremely costly
- Loss of Reputation
- Lawsuits
- …
6
Premise
DBMS users and applications possess more privileges than required to carry out the task
Access control solutions alone may not be enough!
7
Insider Threat Scenario
SSN Credit Card
Security Code
Database Administrators accessing sensitive application data
Disease
Patient Name
Diagnosis
Privacy breaches by curious employees
8
Related Systems
Oracle Database Vault
3rd Party Database Activity Monitoring Products
9
Why Anomaly Detection?
It is not possible to “specify” all malicious database actions
Key Issue: false positive rate
We augment the detection mechanism with a response mechanism to help with false positives
10
Why DBMS Integration?
Independence from the underlying network infrastructure (think of moving to a cloud provider)
More control on response actions in real time
Nobody else has tried it yet!
11
Organization
System Components
Anomalous Access Pattern Detection
Anomaly Response Mechanism
Privilege State Based Access Control
Joint Threshold Policy Administration
12
System Components
Query
User
Features Assessment
Profile Creator
Log
Audit Log
Training Queries
TRAINING PHASE
Detection Engine
Response Engine
Response Policy Base
Feature Selector
Profiles
Alert / Drop / No Action, Update Profiles
Privilege State Based Access Control
Joint Administration Model
13
Thesis Contributions
- Anomalous Access Pattern Detection in Databases (ACSAC 2005, VLDB Journal 2008)
- Response Mechanism (SDM 2008, TKDE 2010)
  - Response Policy Language
  - Policy Matching Algorithms
- Privilege State Based Access Control for Fine Grained Intrusion Response (in submission, RAID 2010)
- Joint Threshold Administration Model for Response Policy Administration (TKDE 2010, POLICY 2010)
- Prototype Implementation in the PostgreSQL open source DBMS
14
ANOMALOUS ACCESS PATTERN DETECTION
Query
User
Features Assessment
Profile Creator
Audit Log
Training Queries
TRAINING PHASE
Detection Engine
Feature Selector
Profiles
15
Example
Normal SQL Commands
EMPLOYEE
TRAINING
PERFORMANCE
HR SCHEMA TABLES
Anomalous SQL Commands
FINANCE SCHEMA TABLES
COMPENSATION
ACCOUNTS
PAYROLL
Access Control Mechanism
Access Granted
Access Granted !!!
16
Key Ideas
Extract features from the SQL Queries
Experiment with feature extraction at different granularity levels
- 3 granularity levels considered
Two Scenarios Considered
- Role based Anomaly Detection
- Unsupervised Anomaly Detection
17
SQL Query Representation - I
Coarse
- Command
- Num tables in the query
- Num columns in the query
18
SQL Query Representation - II
Medium
- Command
- Tables in the query
- Num columns per table
19
SQL Query Representation - III
Fine
- Command
- Tables in the query
- Columns in the query
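The three granularity levels can be illustrated with a small sketch. It assumes the query has already been parsed into a command plus a mapping from each accessed table to its accessed columns; the `SCHEMA` dictionary, the function names, and the tuple layouts are hypothetical choices for this illustration, not the encoding used in the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical two-table schema, used only for this illustration.
SCHEMA: Dict[str, List[str]] = {
    "T1": ["c1", "c2", "c3", "c4"],
    "T2": ["c1", "c2", "c3", "c4"],
}

Accessed = Dict[str, List[str]]  # table name -> columns referenced by the query


def coarse(cmd: str, accessed: Accessed) -> Tuple:
    """Command, number of tables, number of columns."""
    num_tables = len(accessed)
    num_columns = sum(len(cols) for cols in accessed.values())
    return (cmd, num_tables, num_columns)


def medium(cmd: str, accessed: Accessed) -> Tuple:
    """Command, which tables are accessed, number of columns per table."""
    tables = tuple(1 if t in accessed else 0 for t in SCHEMA)
    counts = tuple(len(accessed.get(t, [])) for t in SCHEMA)
    return (cmd, tables, counts)


def fine(cmd: str, accessed: Accessed) -> Tuple:
    """Command, which tables are accessed, which columns are accessed."""
    tables = tuple(1 if t in accessed else 0 for t in SCHEMA)
    columns = tuple(
        tuple(1 if c in accessed.get(t, []) else 0 for c in SCHEMA[t])
        for t in SCHEMA
    )
    return (cmd, tables, columns)


# "Select c1, c2 from T1", assumed to be already parsed.
example_cmd = "SELECT"
example_accessed = {"T1": ["c1", "c2"]}
```

Each level keeps strictly more information than the previous one, which is why the finer representations cost more to collect and to score.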
20
Role Based Anomaly Detection
Profiles per Role
- Each profile contains frequency counts for the various features extracted from the SQL query
Role Profiles input to the Naïve Bayes Classifier
One Role One Class
Single Role Activation Assumption
Classifier Predicted Role != Activated Role ? Anomaly : No Anomaly
21
Naïve Bayes Model – medium
DB = T1(c1,c2,c3,c4) ; T2(c1,c2,c3,c4)
Query under consideration: Select c1,c2 from T1
$P(R = r_1 \mid T1C = 2, T2C = 0, Cmd = Select) \propto P(T1C = 2, T2C = 0, Cmd = Select \mid R = r_1) \cdot P(R = r_1)$
$= P(T1C = 2 \mid R = r_1) \cdot P(T2C = 0 \mid R = r_1) \cdot P(Cmd = Select \mid R = r_1) \cdot P(R = r_1)$
Model structure: Role is the class node; Cmd, T1C and T2C are the feature nodes
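The role-based detection rule can be sketched as follows: per-role frequency counts are used to score each role, and a query is flagged when the highest-scoring role differs from the activated role. This is a minimal illustration, not the PostgreSQL implementation. The class name, the feature tuple `(Cmd, T1C, T2C)`, and the uniform initial probability used in the M-estimate smoothing are assumptions.

```python
import math
from collections import Counter, defaultdict


class RoleNaiveBayes:
    """Toy Naive Bayes over per-role frequency counts (illustrative only)."""

    def __init__(self, m: float = 1.0):
        self.m = m                                   # M-estimate weight
        self.role_counts = Counter()                 # role -> number of training queries
        self.feature_counts = defaultdict(Counter)   # (role, feature index) -> value counts
        self.values = defaultdict(set)               # feature index -> values seen in training

    def train(self, role, features):
        self.role_counts[role] += 1
        for i, v in enumerate(features):
            self.feature_counts[(role, i)][v] += 1
            self.values[i].add(v)

    def log_posterior(self, role, features):
        total = sum(self.role_counts.values())
        score = math.log(self.role_counts[role] / total)        # log P(R = role)
        for i, v in enumerate(features):
            p_initial = 1.0 / len(self.values[i] | {v})          # assumed uniform prior
            num = self.feature_counts[(role, i)][v] + self.m * p_initial
            den = self.role_counts[role] + self.m
            score += math.log(num / den)                         # log P(feature_i = v | role)
        return score

    def predict(self, features):
        return max(self.role_counts, key=lambda r: self.log_posterior(r, features))

    def is_anomalous(self, activated_role, features):
        # Classifier predicted role != activated role -> anomaly
        return self.predict(features) != activated_role


# Hypothetical training data: (role, (Cmd, T1C, T2C))
nb = RoleNaiveBayes()
for role, feats in [
    ("r1", ("Select", 2, 0)), ("r1", ("Select", 2, 0)),
    ("r1", ("Select", 2, 0)), ("r1", ("Select", 1, 0)),
    ("r2", ("Insert", 0, 4)), ("r2", ("Insert", 0, 4)),
    ("r2", ("Select", 0, 2)), ("r2", ("Select", 0, 2)),
]:
    nb.train(role, feats)
```

Working in log space avoids underflow when the feature vector is long, as it is for the fine representation.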
22
Why Naïve Bayes ?
Low computational complexity
Works well in practice even if the attribute independence condition is not met
Ease of Implementation
23
Role Based Anomaly Detection on Real Data
Quiplet Type    False Negative (%)    False Positive (%)
c               2.6                   19.2
m               2.4                   17.1
f               2.4                   17.9
24
Implementation in PostgreSQL
Postmaster: Main process
Postgres: Server process
Server Initialization Code
Query Parse Tree
Query Rewritten Parse Tree
Query Plan
Access Control Enforcement
Query Executor
Connect
Spawn a new server process
Submit SQL Query
Query results
Collect role login stats
Collect table initialization stats
Collect table/column access stats per role
Waiting for Query
Send detection stats to the statistics collector process
Anomaly Detection
Anomaly Response
Yes
No
25
Feature Extraction Architecture
Postgres server process / Query executor
- Collect stats in memory resident data structures
- Send messages to the statistics collector process
- Periodically read stats file
Statistics collector process
- Receive messages and store stats in memory resident data structures
- Periodically write the stats to the pg_stat file
Stats profile creator
pg_stat file (write / read)
26
Performance Impact - Experimental Setup
Base Database Size ( x )
- 5 tables with indexes on primary keys
- 2 user columns per table (all integer type)
- 7 system columns per table
- 100 rows per table
Each Transaction
- 5 select queries
- 5 update queries
- 5 insert queries
27
Statistics Collection Overhead
[Figure: Database Size vs Transaction Processing Overhead. X-axis: database size x(5,2), 2x(10,4), 4x(20,8), 8x(40,16), 16x(80,32), 32x(160,64). Y-axis: Overhead (%), 0 to 14. Series: coarse, medium, fine.]
Takeaway: Statistics Collection overhead < 15 %
28
Detection Algorithm Time - medium
[Figure: Database Size vs Detection Time. X-axis: database size x (6), 2x (11), 4x (21), 8x (41), 16x (81). Y-axis: Detection Time (microseconds), 0 to 35.]
29
Detection Algorithm Time - fine
[Figure: Database Size vs Detection Time. X-axis: database size x (51), 2x (121), 4x (321), 8x (961), 16x (3201). Y-axis: Detection Time (microseconds), 0 to 400.]
30
Detection Process Overhead
[Figure: Database Size vs Transaction Processing Overhead. X-axis: database size x(5,2), 2x(10,4), 4x(20,8), 8x(40,16), 16x(80,32). Y-axis: Overhead (%), 0 to 60.]
Takeaway: Large overhead for fine triplets for large DB size
31
Unsupervised Anomaly Detection
No Role Information
Partition training data in clusters of similar queries
- K centers
- K means
Map every user to its representative cluster
32
Unsupervised Anomaly Detection
[Figure: training queries plotted as points and partitioned into clusters C1, C2, C3, C4; each user is mapped to a representative cluster]
U1 C3
U2 C3
U3 C1
U4 C2
U5 C4
U6 C1
U7 C4
U8 C3
U9 C2
33
Detection Methodology
Apply Naïve Bayes with the cluster as the class
Outlier detection in the representative cluster
34
Future Research Directions
Role Based Anomaly Detection
- Multiple Role Activation
- Considering Role Hierarchies
- …
Unsupervised Detection
- Implementation in PostgreSQL
- One Class Support Vector Machine
- …
Query Result Intrusion Detection
Machine Learning Issues
- Concept Drift
- Overfitting
- …
35
ANOMALY RESPONSE MECHANISM
Assessment
Log
Response Engine
Response Policy Base
Alert / Drop / No Action, Update Profiles
36
Contributions
Response policy language
Response policy matching
Support for fine grained response actions
- Privilege State Based Access Control
Response policy administration
- Joint Threshold Administration Model
37
Response Policy Language
Interactive Event-Condition-Action Policy Language
ON EVENT
IF CONDITION
THEN INITIAL ACTION
CONFIRM CONFIRMATION ACTION
ON SUCCESS RESOLUTION ACTION
ON FAILURE FAILURE ACTION
38
POLICY ATTRIBUTES
ATTRIBUTE      DESCRIPTION
CONTEXTUAL
User           The user associated with the request.
Role           The role associated with the request.
Client App     The client application associated with the request.
Source IP      The IP address associated with the request.
Date/Time      Date/Time of the anomalous request.
STRUCTURAL
Database       The database referred to in the request.
Schema         The schema referred to in the request.
Obj Type       The object types referred to in the request, such as table, view, etc.
Obj(s)         The object name(s) referred to in the request.
SQLCmd         The SQL Command associated with the request.
Obj Attr(s)    The attributes of the object(s) referred to in the request.
39
POLICY PREDICATES
Pr1    Role != DBA
Pr2    Source IP IN 192.168.0.0/16
Pr3    Objs IN {dbo.*}
Pr4    Time BETWEEN 0800 – 1700
Pr5    SQLCmd IN {Select}
Pr7    SQLCmd IN {Insert,Update,Delete}
POLICY CONDITIONS
C1    Pr1 ^ Pr7
C2    Pr2 ^ Pr4 ^ Pr6
40
RESPONSE ACTIONS
ACTION         DESCRIPTION
CONSERVATIVE : LOW SEVERITY
NOP            No OPeration. This option can be used to filter unwanted alarms.
LOG            The anomaly details are logged.
ALERT          A notification is sent.
FINE-GRAINED : MEDIUM SEVERITY
TAINT          The request is audited.
SUSPEND        The request is put on hold till SUCCESSFUL execution of a confirmation action.
AGGRESSIVE : HIGH SEVERITY
ABORT          The anomalous request is aborted.
DISCONNECT     The user session is disconnected.
REVOKE         A subset of user privileges is revoked.
DENY           A subset of user privileges is denied.
41
Response Policy Example
Re-authenticate unprivileged users who are logged in from inside the organization’s internal network for write anomalies to tables in the dbo schema. If re-authentication fails, drop the request and disconnect the user; else do nothing.
ON ANOMALY DETECTION
IF Role != DBA and Source IP IN 192.168.0.0/16 and Obj Type = table and Objs IN dbo.* and SQLCmd IN {Insert,Update,Delete}
THEN SUSPEND
CONFIRM RE-AUTHENTICATE
ON SUCCESS NOP
ON FAILURE ABORT REQUEST, DISCONNECT USER
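To make the control flow of this example policy concrete, here is a rough Python sketch of how the condition and the interactive actions might be evaluated. The dictionary keys, the `reauthenticate` callback, and the choice to require every referenced object to match `dbo.*` are assumptions made for illustration; the policy language itself does not prescribe them.

```python
import ipaddress
from fnmatch import fnmatchcase

INTERNAL_NET = ipaddress.ip_network("192.168.0.0/16")
WRITE_CMDS = {"Insert", "Update", "Delete"}


def condition_holds(anomaly: dict) -> bool:
    """IF part of the example policy, evaluated over one anomaly assessment."""
    return (
        anomaly["role"] != "DBA"
        and ipaddress.ip_address(anomaly["source_ip"]) in INTERNAL_NET
        and anomaly["obj_type"] == "table"
        # Assumption: all referenced objects must lie in the dbo schema.
        and all(fnmatchcase(obj, "dbo.*") for obj in anomaly["objs"])
        and anomaly["sql_cmd"] in WRITE_CMDS
    )


def respond(anomaly: dict, reauthenticate) -> list:
    """Return the ordered list of actions taken for this anomaly."""
    if not condition_holds(anomaly):
        return []
    actions = ["SUSPEND"]                      # THEN: initial action
    if reauthenticate(anomaly["user"]):        # CONFIRM: confirmation action
        actions.append("NOP")                  # ON SUCCESS
    else:
        actions += ["ABORT REQUEST", "DISCONNECT USER"]  # ON FAILURE
    return actions


example = {
    "user": "alice", "role": "clerk", "source_ip": "192.168.4.7",
    "obj_type": "table", "objs": ["dbo.orders"], "sql_cmd": "Update",
}
```

The request stays suspended only for as long as the confirmation action takes; the outcome of that action then selects between the success and failure branches.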
42
Policy Matching Problem
Let AA = {A1, A2, . . . , An} be the set of anomaly attributes.
Let POL = {Pol1, Pol2, . . . , Polk} be the set of response policies.
Let PR = {Pr1, Pr2, . . . , Prm} be the set of all distinct policy predicates.
Let Poli(C) be the policy condition for a policy Poli.
Let AAS : A1 = a1, A2 = a2, . . . , An = an be the assessment of an anomaly submitted by the detection mechanism to the response system.
A policy Poli is said to match AAS if Poli(C) evaluates to ‘true’ over AAS.
The policy matching problem is to find the set of all policies in POL that match a given anomaly assessment AAS.
43
Policy Matching Algorithm
S
A1
A2
A3
A4
Pr1
Pr2
Pr3
Pr4
Pr5
Pr6
Pol1
Pol2
Pol3
Pol4
Attribute Nodes Predicate Nodes Policy Nodes
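One way to read the attribute, predicate and policy layers is as two indexes: attribute to predicates, and predicate to policies. The sketch below evaluates each distinct predicate once and counts satisfied predicates per policy. It assumes every policy condition is a conjunction of predicates, and it is only an illustration of the layered idea, not the matching algorithms evaluated in the thesis. The predicate and policy definitions are hypothetical.

```python
from collections import defaultdict

# Hypothetical predicates: name -> (attribute, test on the attribute value)
PREDICATES = {
    "Pr1": ("role", lambda v: v != "DBA"),
    "Pr5": ("sql_cmd", lambda v: v in {"Select"}),
    "Pr7": ("sql_cmd", lambda v: v in {"Insert", "Update", "Delete"}),
}

# Hypothetical policies, each a conjunction of predicates.
POLICIES = {
    "Pol1": {"Pr1", "Pr7"},
    "Pol2": {"Pr1", "Pr5"},
    "Pol3": {"Pr7"},
}


def build_index(predicates, policies):
    """Build attribute -> predicates and predicate -> policies edges."""
    attr_to_preds = defaultdict(list)
    for name, (attr, _test) in predicates.items():
        attr_to_preds[attr].append(name)
    pred_to_pols = defaultdict(list)
    for pol, preds in policies.items():
        for pred in preds:
            pred_to_pols[pred].append(pol)
    return attr_to_preds, pred_to_pols


def match(assessment, predicates=PREDICATES, policies=POLICIES):
    """Return the sorted names of all policies whose condition is true."""
    attr_to_preds, pred_to_pols = build_index(predicates, policies)
    satisfied = defaultdict(int)               # policy -> number of true predicates
    for attr, value in assessment.items():
        for pred in attr_to_preds.get(attr, []):
            if predicates[pred][1](value):     # each predicate is evaluated once
                for pol in pred_to_pols.get(pred, []):
                    satisfied[pol] += 1
    return sorted(p for p, preds in policies.items() if satisfied[p] == len(preds))
```

Because a predicate shared by many policies is evaluated only once, the cost grows with the number of distinct predicates rather than with the total size of all policy conditions.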
44
RESPONSE ACTION SELECTION
MOST SEVERE POLICY
LEAST SEVERE POLICY
45
Implementation in PostgreSQL
New system catalogs for storing policy attributes, predicates and definition
Policy matching invoked for every query detected as anomalous
46
Policy Matching Overhead
[Figure: Number of Predicates vs Policy Matching Overhead. X-axis: number of predicates, 12 to 120. Y-axis: Policy Matching Overhead (microseconds), 0 to 450.]
47
Cumulative Overhead – Real World
10 tables, 4 columns per table - under detection
24 distinct policy predicates
Quiplet Type    Time (in microseconds)
Fine            130 + 20 = 150
Medium          130 + 6 = 136
48
Fine Grained Intrusion Response
Goal: To change the access control system to support response actions such as
Request suspension
Request tainting
49
Privilege State Based Access Control (PSAC)
Attach a “state” parameter to every privilege assigned to a user or role
We have come up with the following 5 states
- DENY
- SUSPEND
- TAINT
- GRANT
- UNASSIGN
50
PSAC Details
“DENY” state supports negative authorizations
“SUSPEND” state supports request suspension
“TAINT” state supports request tainting
“GRANT” state supports standard SQL GRANT
“UNASSIGN” state supports standard SQL REVOKE
51
State Dominance Relationship
[Figure: state dominance diagram over the states DENY, SUSPEND, TAINT, UNASSIGN and GRANT. Legend: a link from X to Y means ‘X’ overrides ‘Y’.]
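A dominance relationship of this kind can be applied by ranking the states and picking the strongest one among all states that apply to a privilege. The sketch below assumes the total order DENY > SUSPEND > TAINT > GRANT > UNASSIGN; that order is an assumption for illustration, since the diagram itself is not recoverable from this transcript, and the function name is hypothetical.

```python
# Assumed dominance order, strongest first (an assumption, see lead-in).
DOMINANCE = ["DENY", "SUSPEND", "TAINT", "GRANT", "UNASSIGN"]
RANK = {state: i for i, state in enumerate(DOMINANCE)}


def effective_state(states):
    """Return the state that overrides all others in `states`.

    With no applicable state the privilege is treated as unassigned.
    """
    states = list(states)
    if not states:
        return "UNASSIGN"
    return min(states, key=RANK.__getitem__)
```

For example, a privilege that is granted through one path and tainted through another would be treated as tainted under this ordering.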
52
State Transitions
[Figure: state transition diagram over the states TAINT, SUSPEND, DENY, GRANT and REVOKE, with edges labeled by the commands grant, deny, suspend, unassign and taint]
53
Considering Role Hierarchies
Consider a role hierarchy based on privilege inheritance
A parent role inherits the privileges of its child roles
What about privileges in “deny”, “suspend” and “taint” states?
R_parent {insert}
R_child {select}
{select}
54
Privilege Orientation Modes
R8
R5 R6 R7
R2 R3 R4
R1
down / neutral
deny | taint | suspend
up
unassigned | grant
55
Implementation in PostgreSQL
New SQL commands to support ad hoc privilege state transitions
Additional Access Control Lists on the DB objects to support privilege states and orientation modes
Re-authentication procedure for a privilege in “suspend” state
56
Results on Overhead of PSAC Maintenance – No Role Hierarchy
[Figure: ACL Size vs Access Control Enforcement Overhead. X-axis: ACL size 16, 32, 64, 128, 256, 512. Y-axis: Overhead (microseconds), 0 to 60. Series: BASE, PSAC.]
57
Results on Overhead of PSAC Maintenance – With Role Hierarchy
[Figure: ACL Size vs Access Control Enforcement Overhead. X-axis: ACL size 16, 32, 64, 128, 256, 512. Y-axis: Overhead (microseconds), 0 to 120. Series: BASE, PSAC.]
58
Cumulative Overhead – Real World
10 tables, 4 columns per table - under detection
24 distinct policy predicates
Size of ACL = 16 (with role hierarchy)
Quiplet Type    Time (in microseconds)
Fine            130 + 20 + 22 = 172
Medium          130 + 6 + 22 = 158
59
Policy Administration
Assumptions
- The response policies are maintained as database objects
- DBAs are responsible for policy administration
- The policies need to be created to monitor DBAs as well
Key Issue: How can a DBA be trusted to maintain policies to monitor himself?
60
Organization Scenario
Medium size organizations
A few Database Administrators managing a large number of databases
Each DBA has all privileges in the system
61
Joint Threshold Administration Model
Key Ideas
- A single DBA is not trusted, but the threat is mitigated if the trust is distributed among multiple DBAs
- A policy operation is “complete” when authorized by at least “k” DBAs
- Use of threshold digital signatures to maintain policy integrity
62
Threshold signatures
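[Figure: (k, l) threshold signature scheme. Each party applies its secret key share to the message to produce a signature share; the signature shares are combined into the FINAL SIGNATURE, which is verified (VERIFY) with the PUBLIC KEY.]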
63
PRACTICAL RSA THRESHOLD SIGNATURES – VICTOR SHOUP
Asynchronous signature share generation
Asynchronous signature share combining
Public signature shares
Efficient signature verification mechanism
64
JTAM Set Up
Trusted Dealer – DBMS
“l” = number of DBAs
For each k, 2 <= k <= l - 1
- Generate RSA public-private key pair
- Generate “l” secret key shares using the private key
- Private key destroyed after set-up
65
Response Policy Lifecycle
[Figure: response policy state diagram]
States: CREATED, ACTIVATED, SUSPEND IN-PROGRESS, SUSPENDED, DROP IN-PROGRESS, DROPPED
Transition commands:
- Create Response Policy … Joint Adm by k Users;
- Authorize Response Policy [Policy ID] Create; ((k-1)th time)
- Suspend Response Policy [Policy ID];
- Authorize Response Policy [Policy ID] Suspend; ((k-1)th time)
- Drop Response Policy [Policy ID];
- Authorize Response Policy [Policy ID] Drop; ((k-1)th time)
- Alter Response Policy [Policy ID] …. ;
66
Policy Creation Example
Create Response Policy [Policy Data] Jointly Administered By k Users;
H(Pol) = SHA1(Policy ID, Conditions, Initial Action(s), Optional Action(s), k, State)
W(DBA1) : Signature share of DBA 1
PolID    PolData    k    r    hash      sig shares    state      final sig
…        ….         3    2    H(Pol)    W(DBA1)       CREATED
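The policy digest H(Pol) could be computed along the following lines. The slide only states which fields are hashed with SHA1; the field serialization used here (UTF-8 with a 4-byte length prefix per field) and the function name `policy_hash` are assumptions made for this sketch.

```python
import hashlib


def policy_hash(policy_id, conditions, initial_actions, optional_actions, k, state):
    """SHA1 over the policy fields listed on the slide.

    Assumed serialization: each field is UTF-8 encoded and length-prefixed,
    so that moving text from one field to another changes the digest.
    """
    fields = [
        str(policy_id),
        conditions,
        ",".join(initial_actions),
        ",".join(optional_actions),
        str(k),
        state,
    ]
    data = b"".join(
        len(f.encode("utf-8")).to_bytes(4, "big") + f.encode("utf-8")
        for f in fields
    )
    return hashlib.sha1(data).hexdigest()


h_created = policy_hash(7, "Role != DBA", ["SUSPEND"], ["RE-AUTHENTICATE"], 3, "CREATED")
h_activated = policy_hash(7, "Role != DBA", ["SUSPEND"], ["RE-AUTHENTICATE"], 3, "ACTIVATED")
```

Because the state is part of the hashed data, a signature collected for one lifecycle state does not verify against the same policy in a different state.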
67
Policy Activation
Authorize Response Policy [Policy ID] Create;
Policy authorization by DBA3
PolID    PolData    k    r    hash      sig shares           state      final sig
…        ….         3    1    H(Pol)    W(DBA1); W(DBA3)     CREATED
Policy authorization by DBA4
PolID    PolData    k    r    hash      sig shares           state        final sig
…        ….         3    0    H(Pol)                         ACTIVATED    Wfinal
68
Attacks and Prevention
Signature Share Verification
Final Signature Verification
Malicious Policy Update
Malicious Policy Deletion
Signature Replay Attack
Policy Replay Attack
69
Implementation in PostgreSQL
Enhanced system catalogs for storing the signature shares and final signature
Currently, signature verification is performed for every policy during the policy matching procedure
A dedicated signature verification process (to be done)
70
Signature Verification Overhead
[Figure: Num RSA bits vs Signature Verification Overhead. X-axis: number of RSA bits 128, 256, 512, 1024. Y-axis: Verification Overhead (microseconds), 0 to 180.]
71
Work in Progress
Implementation of Joint Administration of sensitive database commands such as “grant/revoke” and “user addition/modification” (POLICY 2010 demo paper)
Public (open source) release of all the techniques implemented
72
Thesis Related Publications - I
1. Elisa Bertino, Ashish Kamra, Evimaria Terzi, Athena Vakali: Intrusion Detection in RBAC-administered Databases. Annual Computer Security Applications Conference (ACSAC) 2005
2. Ashish Kamra, Evimaria Terzi, Elisa Bertino: Detecting anomalous access patterns in relational databases. Very Large DataBases (VLDB) Journal 2008
3. Ashish Kamra, Elisa Bertino, Rimma V. Nehme: Responding to Anomalous Database Requests. Secure Data Management (SDM) Workshop 2008
4. Ashish Kamra, Elisa Bertino: Design and Implementation of an Intrusion Response System for Relational Databases. Transactions on Knowledge and Data Engineering (TKDE) 2010
73
Thesis Related Publications - II
5. Ashish Kamra, Elisa Bertino: JTAM: A Joint Threshold Administration Model. POLICY 2010 Demo Paper
6. Ashish Kamra, Elisa Bertino: Database Intrusion Detection and Response. Recent Advances in Intrusion Detection (RAID) 2008 Poster
7. Ashish Kamra, Elisa Bertino: Privilege State Based Access Control for Fine-Grained Intrusion Response. RAID 2010 - In Submission
74
Other Publications
1. Mercan Topkara, Ashish Kamra, Mikhail J. Atallah, Cristina Nita-Rotaru: ViWiD: Visible Watermarking Based Defense Against Phishing. International Workshop on Digital Watermarking (IWDW) 2005
2. Ji-Won Byun, Ashish Kamra, Elisa Bertino, Ninghui Li: Efficient k-Anonymization Using Clustering Techniques. DASFAA 2007
3. Elisa Bertino, Ashish Kamra, James P. Early: Profiling Database Application to Detect SQL Injection Attacks. IPCCC 2007 – Invited Paper
4. Ashish Kamra: Mechanisms for Database Intrusion Detection and Response. SIGMOD PhD Workshop on Innovative Database Research (IDAR) 2008
5. Book Chapters
   1. Elisa Bertino, Ji-Won Byun, Ashish Kamra. Database Security. Security, Privacy, and Trust in Modern Data Management 2007. Petkovic, Milan; Jonker, Willem (Eds.)
   2. Ashish Kamra, Elisa Bertino. Survey of Machine Learning Methods for Database Security. Machine Learning in Cyber Trust 2009.
75
Work Experience
1. SQL Server Database Administrator at ITAP from Feb 2006 to Dec 2008.
2. Internship, EMC, Rainfinity – May to Jul 2007. Implemented an RBAC mechanism in their file virtualization product line.
3. Internship, EMC, NAS Engineering – May to Dec 2009
   1. Security analysis of Network Filesystem version 4.1 protocol
   2. Distributed File System Manageability
4. Working Full-time in EMC’s Integrated Systems and Components group in the Unified Storage Division since Jan 11 2010.
76
77
BACKUP slides
78
M-estimate of probability
$P(A = a) = \dfrac{\mathrm{freq}(A = a) + M \cdot P_{initial}(A = a)}{\mathrm{total\_count}(A) + M}$
79
JTAM Set Up
RSA Public Private Keys
The DBMS chooses p, q as two large prime numbers such that
p = 2p′ + 1 and q = 2q′ + 1
where p′ and q′ are themselves large primes.
Let n = p * q be the RSA modulus. Let m = p′ * q′. The DBMS also chooses e as the RSA public exponent such that e > l.
Thus, the RSA public key is PK = (n, e).
The server also computes the private key d $\in$ Z such that de = 1 mod m.
80
JTAM Set Up
Secret Shares
The DBMS sets a0 = d and randomly assigns ai from {0, . . . , m − 1} for 1 <= i <= k − 1.
The numbers {a0 . . . ak−1} define the unique polynomial p(x) of degree k − 1: $p(x) = \sum_{i=0}^{k-1} a_i x^i$
For 1 <= i <= l, the server computes the secret share, si, of each DBA, DBAi, as follows:
si = p(i) mod m.
81
Signature Share Generation
Let x = H(Pol).
$W(DBA_1) = x^{2 \Delta s_1} \bmod n$, where $\Delta = l!$
W(DBA1) does not leak any information about the secret share s1 because of the intractability of the generalized discrete logarithm problem
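As a small numeric illustration, a signature share is a single modular exponentiation of the policy hash. The sketch assumes x is obtained by reducing the SHA1 digest modulo n, and it uses toy values for n, l and the share; the function names are hypothetical.

```python
import hashlib
import math


def hash_to_int(message: bytes, n: int) -> int:
    """x = H(Pol), reduced modulo n (the reduction is an assumption)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % n


def signature_share(message: bytes, share: int, n: int, l: int) -> int:
    """Compute x^(2 * delta * s_i) mod n with delta = l!."""
    x = hash_to_int(message, n)
    delta = math.factorial(l)
    return pow(x, 2 * delta * share, n)


# Toy values only: n = 11 * 23, l = 3 DBAs, secret share s1 = 5.
toy_n, toy_l, toy_s1 = 253, 3, 5
w_dba1 = signature_share(b"policy data", toy_s1, toy_n, toy_l)
```

Each DBA can compute a share independently of the others, which is what allows the authorizations in the policy lifecycle to arrive asynchronously.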