Database Security And Audit
Database Basics
Data is stored in the form of files.
Record : one related group of data (a row)
Schema : the logical structure of a database
Subschema : a subset of the entire logical structure
Relation : an n-valued tuple
Attribute : the name of a variable in the n-valued tuple
Query : a command which generates a subschema (select, project, join, etc.)
Advantages of Databases
Shared access : one uniform logical view of the data, accessible to all users
Minimal redundancy : prevents users from collecting/storing redundant data
Data consistency : a change in one data value is reflected throughout
Data integrity : accidental or malicious modifications are detected
Controlled access : only authorized users are given access to the data
However, these benefits create conflicts when security is imposed.
Security Requirements of Databases
Physical database integrity
  Recover from power failures, disk crashes, etc.
Logical database integrity
  Use backups, restore points
  Special means to update records / recover failed transactions
Element integrity
  Field checks (type, range, bound checks), change logs
Auditability
  Need to check who has made changes
  Incremental access to protected data, through which data modifications can be tracked
Security Requirements of Databases…
Access control
  Not all data needs to be given to all users
  Access control may be needed up to the granularity of the element level, beyond the schema, subschema, and attribute levels
  Users may infer other field values based on the access they get
  Database access control needs to take size into consideration
User authentication
Availability
Reliability and Integrity Measures in Databases
Problem : failure of the system during data modification
Solution : two-phase update; intuitively, do temporary computations first and apply the update at a later stage
  Intent phase : prepare the resources needed to make the update (many repetitions are OK)
  Commit phase : write a commit flag marking the end of the intent phase, then start the update process; repeat if a failure occurs
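The two phases above can be sketched in Python. This is a minimal illustration, assuming a one-file JSON record store; the staged intent file plays the role of the temporary computation, and an atomic rename serves as the commit flag:

```python
import json
import os
import tempfile

def two_phase_update(path, updates):
    """Two-phase update sketch. The intent phase stages the full new
    state in a side file (repeating it after a failure is harmless);
    the commit phase is a single atomic rename, so a crash leaves
    either the old or the new state, never a partial update."""
    intent_path = path + ".intent"

    # Intent phase: compute the updated state and stage it.
    with open(path) as f:
        data = json.load(f)
    data.update(updates)
    with open(intent_path, "w") as f:
        json.dump(data, f)

    # Commit phase: os.replace acts as the commit flag; it is atomic,
    # so the update either fully happens or not at all.
    os.replace(intent_path, path)

# Demo: a one-record store.
fd, store = tempfile.mkstemp()
os.close(fd)
with open(store, "w") as f:
    json.dump({"balance": 100}, f)
two_phase_update(store, {"balance": 70})
with open(store) as f:
    final = json.load(f)
```

A real DBMS logs intent records per transaction rather than rewriting whole files, but the crash-safety argument is the same: the intent phase is repeatable and the commit is a single atomic step.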
Redundancy / Internal Consistency Measures
Error detection/correction codes :
  computed over field values, records, or the entire database; used when deleting, retrieving, or updating
  E.g., checksums, CRC codes
Duplicate copies of records, used for recovery if the original copies are detected to be corrupted
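As an illustration of the checksum idea, the sketch below (record layout and canonicalization are assumptions, not from the source) stores a CRC-32 alongside a record and recomputes it on retrieval:

```python
import zlib

def record_crc(record: dict) -> int:
    """CRC-32 over a record's canonical byte form (illustrative)."""
    canonical = repr(sorted(record.items())).encode()
    return zlib.crc32(canonical)

record = {"name": "Adams", "salary": 3000}
stored = record_crc(record)      # stored alongside the record

# On retrieval: a CRC mismatch signals corruption, and a duplicate
# copy of the record could then be used for recovery.
record["salary"] = 3500          # simulated corruption
corrupted = record_crc(record) != stored
```

Note that CRCs detect accidental corruption but not malicious modification; an attacker can simply recompute the CRC, which is why keyed hashes appear later under integrity locks.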
Concurrency Measures
Two users may want to update a record at the same time, leading to an inconsistent view of the record
The read-modify-write cycle should be treated as an atomic operation
Reading a record while it is being updated can be handled by locking reads until updates are finished
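A minimal sketch of treating the read-modify-write cycle as atomic, using a lock around a shared counter (the counter and thread counts are illustrative):

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(amount, times):
    global balance
    for _ in range(times):
        # The lock makes the read-modify-write cycle atomic; without
        # it, two threads can read the same old value and one of the
        # two updates is silently lost.
        with lock:
            balance += amount

threads = [threading.Thread(target=deposit, args=(1, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Real DBMSs use record- or page-level locks (or multiversion schemes) rather than one global lock, but the invariant is the same: no one observes a record mid-update.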
Structural Integrity Measures
Range comparisons : ensure that the values entered are consistent with acceptable ranges
  E.g., the day of a month cannot be more than 31
State constraints : system invariants that must be satisfied throughout the database
  E.g., uniqueness conditions
Transition constraints : conditions necessary to effect a transition of the database
  E.g., adding a record may need to consider the values of other records; reducing the in-stock quantity might require that the in-stock value is higher than the quantity ordered
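The two checks from the examples above can be sketched directly (function names and the order scenario are illustrative):

```python
def check_day(day: int) -> None:
    # Range comparison: a day of the month cannot exceed 31.
    if not 1 <= day <= 31:
        raise ValueError("day out of acceptable range")

def place_order(in_stock: int, ordered: int) -> int:
    # Transition constraint: reducing the in-stock quantity requires
    # that the in-stock value is at least the quantity ordered.
    if in_stock < ordered:
        raise ValueError("transition violates in-stock constraint")
    return in_stock - ordered

check_day(28)                   # passes the range comparison
remaining = place_order(10, 3)  # transition allowed
```

In SQL these would typically be expressed as CHECK constraints and triggers rather than application code.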
Sensitive Data & Disclosure Problems
Types of sensitive data
  Income, identity, descriptions of missions
Types of disclosure
  Exact data
  Range bound : knowing that a field value lies between known bounds
  Negative predicates : knowing that a record exists that does not satisfy some condition
  Existence : knowing that a record exists in the first place
  Probability : knowing a record's value with a certain probability
Inference Problem
Definition : using non-sensitive data to infer sensitive data
Inference techniques : direct and indirect
  Direct : get information using queries on sensitive fields
  Indirect : use statistics of the data to infer individual values
    Sum, count, mean, median, trackers
Inference…sum
Salary totals by sex and dorm:

          Holmes   Grey   Adams   West
  Male      5000   3000    3000   1500
  Female    1500      0    1000   2000
  Total     6500   3000    4000   3500
Inference…count
Number of individuals by sex and dorm:

          Holmes   Grey   Adams   West
  Male        1      2       2      1
  Female      2      0       1      1
  Total       3      2       3      2

Wherever a count is 1 (e.g., males in Holmes), the corresponding entry in the sum table is that one individual's exact salary.
Inference…tracker
Tracking : using additional queries that produce small results
E.g., try to find the number of white females in a particular dorm
The following query may be rejected because its result is 1, which the DBMS suppresses:
  q = count((SEX=F) and (RACE=C) and (DORM=Holmes))
But not the following:
  count(SEX=F) : value is 6
  count((SEX=F) and not ((RACE=C) and (DORM=Holmes))) : value is 5
Subtracting 6 - 5 = 1 yields the suppressed value
More generally, queries can be constructed as a set of linear equations; solving the equations reveals unknown individual values
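The tracker attack above can be replayed on a toy table. The data below is hypothetical, chosen so the counts match the slide's values (6, 5, 1):

```python
# Toy student table (hypothetical data): (sex, race, dorm)
rows = [
    ("F", "C", "Holmes"), ("F", "B", "Holmes"), ("F", "B", "Grey"),
    ("F", "C", "Grey"),   ("F", "B", "Adams"),  ("F", "C", "West"),
    ("M", "C", "Holmes"), ("M", "B", "Adams"),
]

def count(pred):
    return sum(1 for r in rows if pred(r))

# The direct query returns 1, so a suppressing DBMS would reject it:
sensitive = count(lambda r: r == ("F", "C", "Holmes"))

# Tracker: two broad, innocuous-looking queries...
all_females = count(lambda r: r[0] == "F")
rest = count(lambda r: r[0] == "F" and r != ("F", "C", "Holmes"))

# ...whose difference recovers the suppressed count.
leaked = all_females - rest
```

Neither tracker query is small, so both pass a naive size check, yet their difference is exactly the suppressed answer.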
Controls for Inference
Suppression
  Suppress low-frequency data items
  Query analysis
Concealing
  Combining results, e.g., reporting ranges instead of exact values
  Random data perturbation for statistical queries
Much research has gone into inference control for statistical databases, and more is forthcoming. Moreover, inference control is vulnerable to collusion among users, which is a more serious problem.
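Two of these controls can be sketched in a single query interface; the suppression threshold and noise scale are assumptions for illustration, not values from the source:

```python
import random

SUPPRESS_BELOW = 3   # assumed threshold for low-frequency suppression

def released_count(true_count: int, noise: int = 2):
    """Sketch of two inference controls: suppression refuses to answer
    small, identifying counts; random perturbation adds bounded noise
    so exact values cannot be read off larger ones."""
    if true_count < SUPPRESS_BELOW:
        return None                                    # suppression
    return true_count + random.randint(-noise, noise)  # perturbation

suppressed = released_count(1)   # small count: refused
answer = released_count(50)      # large count: noisy value released
```

Note that suppression alone is exactly what the tracker attack defeats, which is why perturbation and query analysis are applied on top of it.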
Multi-level Security
The sensitivity of data goes beyond "sensitive vs. non-sensitive"; there are several levels of sensitivity, applied at different granularities :
  Element level
  Record level
  Aggregate level
  Granularity combinations
Multi-level Security Measures
Separation
  Partitioning : create multiple databases, each with its own sensitivity level
  Encryption : encrypt records with a key unique to each sensitivity level
    Problems such as chosen-plaintext attacks, corruption of records, and malicious updates exist
  Integrity locks and sensitivity locks : assign sensitivity levels to data items, encrypt the sensitivity levels, and use cryptographic hashes to protect integrity
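A minimal sketch of the lock idea, using a keyed hash (HMAC) in place of the slide's encrypted label; the key and element format are assumptions:

```python
import hashlib
import hmac

KEY = b"controller-secret"  # assumption: key held by the trusted controller

def lock(element: str, sensitivity: str) -> str:
    """Integrity/sensitivity lock sketch: a keyed hash binds a data
    element to its sensitivity label, so altering either one (e.g.
    downgrading 'secret' to 'public') is detectable on access."""
    msg = f"{element}|{sensitivity}".encode()
    return hmac.new(KEY, msg, hashlib.sha256).hexdigest()

tag = lock("salary=5000", "secret")
# A malicious downgrade of the label no longer matches the stored tag:
downgrade_detected = not hmac.compare_digest(lock("salary=5000", "public"), tag)
```

Because the hash is keyed, an attacker without the controller's key cannot recompute a valid tag, unlike the plain CRC used earlier for accidental corruption.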
Multi-level Secure DB Design
Integrity locks : use a trusted controller between the DBMS and the data to control access
  Data is either encrypted or perturbed
  Secure but inefficient; subject to Trojan horse attacks
Trusted front end : use an existing DBMS with a trusted front end that filters out all the data a user is not allowed to see
  Wasteful for queries that return large amounts of data
Multi-level Secure DB Design
Commutative filters : reformat the query so that the DBMS does not retrieve large numbers of records that the trusted front end would only reject
  Advantage : some work is delegated to the DBMS (by reformatting a query into multiple other queries), keeping the filter small
  Filtering can be done at the record level, attribute level, or element level
Distributed databases : control access to two or more DBMSs with varying levels of sensitivity; users' queries are processed based on their access levels
Role-based Access Control
Organizations give users access based on the roles they perform
  Least privilege : a role is assigned only the permissions it requires
  Separation : mutually exclusive roles can be invoked to achieve a task
  Data abstraction : a role can be defined in terms of higher-level operations like edit, audit, etc.
Difference between groups and roles :
  Groups are collections of users
  Roles are collections of users and permissions
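The group/role distinction can be made concrete in a few lines; role names, users, and permissions below are illustrative:

```python
# Minimal RBAC sketch: a role bundles permissions, and users are
# assigned roles; a group, by contrast, would only bundle users.
roles = {
    "auditor": {"read_logs", "read_records"},
    "editor":  {"read_records", "edit_records"},
}
user_roles = {
    "alice": {"auditor"},
    "bob":   {"editor"},
}

def permitted(user: str, permission: str) -> bool:
    # Least privilege: a user holds only the permissions of the
    # roles assigned to them, nothing more.
    return any(permission in roles[r] for r in user_roles.get(user, ()))
```

Checking `permitted("alice", "edit_records")` fails while `permitted("alice", "read_logs")` succeeds, so revoking a role removes all its permissions at once.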
RBAC and DBMS
RBAC is natural for a DBMS to adopt, and several commercial products support it
  MS Active Directory, Oracle, Sybase, etc.
Broad implementation features :
  User-role assignment
  Support for role relationships and constraints
  Assignable privileges (database level, table level, etc.)
  Role hierarchies (using a lattice model)