UCI Database Group
Privacy in Database-as-a-Service (DAS) Model
Maithili Narasimha
Outline
– Introduction
– Motivation & Challenges for DAS model
– System Components
• NetDB2
– Protecting data from intruders
– Protecting data from service providers
– Future work
Software As a Service
• Advantages
  – Reduced cost to client
    • pay for what you use and not for hardware, software infrastructure or personnel to deploy, maintain, upgrade…
  – Reduced overall cost
    • cost amortization across users
  – Better service
    • leveraging experts across organizations
• Driving Forces
  – Faster, cheaper, more accessible networks
  – Virtualization in server and storage technologies
  – Established e-business infrastructures
• Market Players
  – ERP and CRM (many examples)
  – More horizontal storage services, disaster recovery services, e-mail services, rent-a-spreadsheet services, etc.
  – Sun ONE, Oracle Online Services, Microsoft .NET My Services, etc.
Better Service for Cheaper
Motivation
If data storage can be offered as a service, why not the next higher value-added layer in data management?
i.e.,
Can we outsource our databases?
Database As a Service
[Figure: bar chart "Most Significant DB Execution Problems", % of respondents (Source: InfoWeek Research). Categories shown: Platform Independence, Qualified Programmers, Compatibility, Qualified Administrators, Ease of Administration. Bar values shown: 40%, 51%, 51%, 57%, 58%.]
• Why?
  – Most organizations need DBMSs
  – DBMSs are extremely complex to deploy, set up, and maintain
  – They require skilled DBAs, at high cost
• Offerings
  – Service provider offers mechanisms to create, store, and access databases
  – DB management is transferred to the service provider: backup, administration, restoration, space management, upgrades
  – Clients use the service provider's HW, SW, and personnel instead of hiring their own
BUT…
DAS System Components
[Figure: DAS system architecture. Users (web browsers) and the client connect over the Internet to the service provider, which runs an HTTP server, a servlet engine, the database (client data), backup/recovery, and a warm standby system. Arrow labels: "Create and load data, develop and install applications" and "Access data, catalogs, information".]
NetDB2 Service
• Developed by the UCI Database Group in collaboration with IBM
• Deployed on the Internet over a year ago
– Used by 15 universities and more than 2500 students in database classes
Challenges
• Economic/business model?
  – How to charge for service, what kinds of service guarantees can be offered, costing of guarantees, liability of the service provider
• Powerful interfaces to support a complete application development environment
  – User interface for SQL, support for embedded SQL programming, support for user-defined interfaces, etc.
• Scalability in the web environment
  – Overheads due to network latency (data proxies?)
• Privacy and Security
  – Protecting data at service providers from intruders and attacks
  – Protecting clients from misuse of data by service providers
Protecting Data From Intruders (ICDE 2002)
• Approach
– data stored at service provider in an encrypted form
• Issues and Challenges
– Key generation and management
• who generates and stores keys
– Granularity of encryption
• Attribute, row, page, table
– Implementation
• Encryption mechanisms
– Query Processing and Optimization
• optimal implementation of relational operators
Protecting Data from Service Provider
• Motivation
  – Total data privacy
• Naïve approach
  – Store the encrypted database with the service provider
  – Transmit the requisite encrypted tables from the server to the client
  – Decrypt the tables and execute the query at the client
Almost all the advantages of the DAS model are lost
The real challenge…
• How can the service provider execute a query without decrypting the data?
Protecting Data from Service Provider (SIGMOD 2002)
Approach:
• Server hosted by the service provider stores the encrypted database
• The encrypted database is augmented with additional information (aka index), which allows a certain amount of query processing to occur at the server
• Client maintains metadata
Strategy:
• Split the original query into
  – A corresponding query over encrypted relations to run on the server
  – A client query for post-processing the results of the server query
Protecting Data from Service Provider (SIGMOD 2002)
• Approach:
  – Query split into server side (Qs) and client side (Qc)
  – Qs executes at the service provider on encrypted data
  – Qc executes on the client after decrypting
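[Figure: query processing flow between client and service provider. Client side: user (web browser), query translator, metadata, query executor, temporary results, results. Service provider side: query executor, encrypted data. Flow labels: original query, query over encrypted data, encrypted results.]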
Protecting Data from Service Provider (SIGMOD 2002)
• For a relation $R(A_1, A_2, \ldots, A_n)$, the relation $R^S(etuple, A_1^S, A_2^S, \ldots, A_n^S)$ is stored at the server
  – etuple is the encrypted string that corresponds to a tuple in relation R
  – each $A_i^S$ corresponds to the index for the attribute $A_i$
Partition Function & Identification Function
• Map the domain of values of an attribute into partitions such that
  – these partitions taken together cover the whole domain, and
  – no two partitions overlap
Split the domain into a set of buckets:
  $partition(R.A_i) = \{p_1, p_2, \ldots, p_k\}$
• Assign an identifier to each bucket:
  $ident_{R.A_i}(p_j)$ for each partition $p_j$ of attribute $A_i$
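The partition and identification functions can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the integer domain, the equi-width split, the function names `partition` and `assign_identifiers`, and the seeded shuffle are all assumptions made for the example.

```python
import random


def partition(lo, hi, k):
    """Split the integer domain [lo, hi] into at most k equi-width buckets.

    Buckets are inclusive (start, end) pairs that together cover the
    whole domain and never overlap.
    """
    width = -(-(hi - lo + 1) // k)  # ceiling division
    return [(start, min(start + width - 1, hi))
            for start in range(lo, hi + 1, width)]


def assign_identifiers(buckets, seed=0):
    """Give every bucket a distinct identifier, in shuffled order.

    The fixed seed is illustrative only; a real client would keep the
    assignment secret.
    """
    ids = list(range(len(buckets)))
    random.Random(seed).shuffle(ids)
    return dict(zip(buckets, ids))


buckets = partition(0, 999, 5)
ident = assign_identifiers(buckets)
for bucket in buckets:
    print(bucket, "->", ident[bucket])
```

The dictionary returned by `assign_identifiers` plays the role of the client-side metadata: it is the only place where bucket boundaries and identifiers are linked.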
Mapping Function
• Map a value v in the domain of the attribute $A_i$ to the identifier of the partition to which v belongs:
  $map_{R.A_i}(v) = ident_{R.A_i}(p_j)$, where $v \in p_j$
• Types:
  – Order preserving
    • For any two values $v_i$ and $v_j$, if $v_i < v_j$ then $map_{R.A_i}(v_i) \le map_{R.A_i}(v_j)$ (values in the same partition share an identifier)
  – Random
    • Mapping is not order preserving
• Mapping function type affects query translation!
  – Order-preserving mapping lends itself to easier query translation
  – However, random mapping is more secure
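As a rough sketch of the two mapping types, the Python below builds a mapping function from a list of buckets and a list of identifiers. The bucket boundaries, both identifier lists, and the helper name `make_map` are hypothetical values chosen for illustration.

```python
import bisect

# Hypothetical buckets for an attribute with integer domain [0, 999].
BUCKETS = [(0, 199), (200, 399), (400, 599), (600, 799), (800, 999)]
ORDERED_IDS = [10, 20, 30, 40, 50]  # identifiers grow with the bucket's position
RANDOM_IDS = [7, 2, 5, 1, 4]        # identifiers unrelated to the bucket's position


def make_map(buckets, ids):
    """Return map(v): the identifier of the bucket that contains v."""
    starts = [lo for lo, _ in buckets]

    def map_value(v):
        if not buckets[0][0] <= v <= buckets[-1][1]:
            raise ValueError(f"{v} is outside the attribute domain")
        # Index of the last bucket whose start is <= v.
        return ids[bisect.bisect_right(starts, v) - 1]

    return map_value


map_ordered = make_map(BUCKETS, ORDERED_IDS)
map_random = make_map(BUCKETS, RANDOM_IDS)

print(map_ordered(150), map_ordered(450))  # 10 30
print(map_random(150), map_random(450))    # 7 5
```

With `ORDERED_IDS`, comparing identifiers tells the server something about the order of the underlying values; with `RANDOM_IDS` it does not, which is the security argument for the random mapping.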
Mapping Conditions for Query translation
• Attribute = Value
  – $A_i = v \;\Rightarrow\; A_i^S = map_{A_i}(v)$
• Attribute < Value
  – Order preserving
    • $A_i < v \;\Rightarrow\; A_i^S \le map_{A_i}(v)$
  – Random
    • Translation is more complex: check whether the index value $A_i^S$ identifies any partition that may contain a value $v'$ with $v' < v$
• Attribute 1 = Attribute 2 (join queries)
  – $A_i = A_j \;\Rightarrow\; \bigvee \left( A_i^S = ident_{A_i}(p_k) \wedge A_j^S = ident_{A_j}(p_l) \right)$, with the disjunction taken over all $p_k \in partition(A_i)$, $p_l \in partition(A_j)$ such that $p_k \cap p_l \neq \emptyset$
• And so on …
Post-processing (filtering the results) at the client is necessary!
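The translation of a selection and the client-side filtering step can be sketched as follows. This is an illustration under stated assumptions: the `salary` attribute, its buckets and random identifiers, and all function names are hypothetical, and encryption is left out, so the plaintext value stands in for the decrypted etuple.

```python
# Hypothetical buckets and randomly assigned identifiers for attribute "salary".
IDENT = {(0, 199): 7, (200, 399): 2, (400, 599): 5, (600, 799): 1, (800, 999): 4}


def map_value(v):
    """Identifier of the bucket containing v."""
    for (lo, hi), ident in IDENT.items():
        if lo <= v <= hi:
            return ident
    raise ValueError(f"{v} is outside the attribute domain")


def ids_for_equals(v):
    """Server condition for A = v: the index must equal map(v)."""
    return {map_value(v)}


def ids_for_less_than(v):
    """Server condition for A < v under a random mapping:
    every bucket that may hold some value below v."""
    return {ident for (lo, _), ident in IDENT.items() if lo < v}


# Server-side table of (etuple, salary index) pairs. The plaintext salary
# stands in for the decrypted etuple; encryption is omitted from this sketch.
SERVER_TABLE = [(s, map_value(s)) for s in (120, 250, 390, 410, 805)]


def server_query(allowed_ids):
    """Qs: the server filters on index values only."""
    return [etuple for etuple, index in SERVER_TABLE if index in allowed_ids]


def client_query(candidates, predicate):
    """Qc: the client re-applies the original predicate after decryption."""
    return [value for value in candidates if predicate(value)]


candidates = server_query(ids_for_less_than(300))
result = client_query(candidates, lambda salary: salary < 300)
print(candidates)  # [120, 250, 390]
print(result)      # [120, 250]
```

The server returns 390 because it shares a bucket with values below 300; only the client, which sees plaintext, can discard it. That false positive is why post-processing cannot be skipped.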
Some issues
• Buckets (equi-width vs. equi-depth):
  – # of buckets and
  – # of elements in buckets
• Various overheads:
  – Metadata at client (fewer buckets ⇒ less metadata)
  – Amount of filtering (fewer buckets ⇒ more filtering)
  – Bandwidth consumed and
  – Storage wasted
How are Security and Performance affected by these choices?
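To make the equi-width versus equi-depth choice concrete, the sketch below counts how many values land in each bucket under both schemes. The skewed sample data and the function names are assumptions for illustration, and the equi-width helper assumes the values are not all equal.

```python
def equi_width_counts(values, k):
    """Elements per bucket when the value range is cut into k equal-width slices.

    Assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    counts = [0] * k
    for v in values:
        # Integer slice index; the maximum value is clamped into the last slice.
        index = min((v - lo) * k // (hi - lo), k - 1)
        counts[index] += 1
    return counts


def equi_depth_counts(values, k):
    """Elements per bucket when each bucket holds roughly the same number of values."""
    size = -(-len(values) // k)  # ceiling division
    ordered = sorted(values)
    return [len(ordered[i:i + size]) for i in range(0, len(ordered), size)]


skewed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(equi_width_counts(skewed, 2))  # [9, 1]
print(equi_depth_counts(skewed, 2))  # [5, 5]
```

On skewed data the equi-width split puts nine of ten values in one bucket, so a query touching that bucket returns many candidates for the client to filter; the equi-depth split spreads the values evenly at the cost of storing explicit bucket boundaries in the client metadata.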
What next?
• The SIGMOD 2002 approach can execute SQL queries involving SELECT, JOIN, UNION, GROUP BY, …
  – with equality and logical comparison predicate clauses
What about "aggregation" queries?
Aggregation queries
• A large fraction of queries require data aggregation
  – Arithmetic operations (sum, count, average, etc.) on encrypted data!
• Traditional symmetric encryption schemes are not useful for this
One possible solution – Privacy Homomorphisms
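• PH Overview:
  – $A$ is the domain of unencrypted values, $\varepsilon_k$ is an encryption function using key $k$, and $D_k$ is the corresponding decryption function.
  – Let $\mathcal{A} = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ and $\mathcal{B} = \{\beta_1, \beta_2, \ldots, \beta_n\}$ be two function families, with each $\alpha_i$ operating on unencrypted values and each $\beta_i$ on encrypted values.
  – $(\varepsilon_k, D_k, \mathcal{A}, \mathcal{B})$ is defined as a PH if
    $D_k(\beta_i(\varepsilon_k(a_1), \varepsilon_k(a_2), \ldots, \varepsilon_k(a_m))) = \alpha_i(a_1, a_2, \ldots, a_m)$ for all $i$, $1 \le i \le n$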
PH by Rivest et al.
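– Setup
  • $n = pq$, for primes $p$ and $q$
– Encryption
  • $\varepsilon_k(a) = (a \bmod p,\ a \bmod q)$, where $a \in Z_n$
– Decryption
  • $D_k(d_1, d_2) = d_1 q q^{-1} + d_2 p p^{-1} \pmod{n}$, where $d_1 = a \bmod p$, $d_2 = a \bmod q$, $q^{-1}$ is the inverse of $q$ modulo $p$, and $p^{-1}$ is the inverse of $p$ modulo $q$
– Proof of correctness is based on the Chinese Remainder Theorem (CRT)
– The PH works for modular addition, subtraction, and multiplication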
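The CRT argument behind the decryption formula can be written out briefly. The derivation below is a sketch, taking $q^{-1}$ to be the inverse of $q$ modulo $p$ and $p^{-1}$ the inverse of $p$ modulo $q$, which is the reading consistent with the numbers in the small example that follows in the deck.

```latex
\begin{align*}
x &= d_1\, q\, q^{-1} + d_2\, p\, p^{-1},
  \qquad d_1 = a \bmod p,\quad d_2 = a \bmod q \\
x &\equiv d_1 \cdot 1 + d_2 \cdot 0 \equiv a \pmod{p}
  && \text{since } q\,q^{-1} \equiv 1 \pmod{p} \text{ and } p \mid p\,p^{-1} \\
x &\equiv d_1 \cdot 0 + d_2 \cdot 1 \equiv a \pmod{q}
  && \text{since } p\,p^{-1} \equiv 1 \pmod{q} \text{ and } q \mid q\,q^{-1} \\
x &\equiv a \pmod{n}
  && \text{by the CRT, because } \gcd(p, q) = 1 \text{ and } n = pq
\end{align*}
```

The same argument applies to a componentwise sum or product of two ciphertexts, because reduction modulo $p$ and modulo $q$ commutes with addition and multiplication.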
A small example…
• $p = 5$, $q = 7$ ($n = 35$)
• $a_1 = 5$ and $a_2 = 6$
• $\varepsilon(a_1) = (0, 5)$ and $\varepsilon(a_2) = (1, 6)$ are stored on the server
• Compute $a_1 + a_2$
• Server computes $\varepsilon(a_1) + \varepsilon(a_2)$ componentwise
  – $(0 + 1,\ 5 + 6) = (1, 11)$
• Client decrypts $(1, 11)$ as $d_1 q q^{-1} + d_2 p p^{-1} \pmod{n}$
  – $(1 \cdot 7 \cdot 3 + 11 \cdot 5 \cdot 3) \bmod 35 = 186 \bmod 35 = 11$
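The example can be reproduced with a short Python sketch. The function names are assumptions made for illustration, the tiny primes are the slide's own toy values and offer no security, and the three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
# Toy parameters from the example; real keys would use large secret primes.
P, Q = 5, 7
N = P * Q


def encrypt(a):
    """Encrypt a value in Z_n as its pair of residues modulo p and q."""
    return (a % P, a % Q)


def add(c1, c2):
    """Server-side addition: componentwise, without knowing p or q."""
    return (c1[0] + c2[0], c1[1] + c2[1])


def multiply(c1, c2):
    """Server-side multiplication: componentwise, without knowing p or q."""
    return (c1[0] * c2[0], c1[1] * c2[1])


def decrypt(c):
    """Recombine the two residues with the CRT formula."""
    d1, d2 = c
    q_inv = pow(Q, -1, P)  # inverse of q modulo p
    p_inv = pow(P, -1, Q)  # inverse of p modulo q
    return (d1 * Q * q_inv + d2 * P * p_inv) % N


c1, c2 = encrypt(5), encrypt(6)
print(c1, c2)                     # (0, 5) (1, 6)
print(add(c1, c2))                # (1, 11)
print(decrypt(add(c1, c2)))       # 11
print(decrypt(multiply(c1, c2)))  # 30
```

Results are correct modulo n only, so sums or products that exceed 34 wrap around with these parameters.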
Future work
• Other homomorphic encryption schemes?
  – Paillier's cryptosystem, based on composite degree residuosity
  – Benaloh's cryptosystem, based on prime residuosity
  – A scheme based on the discrete logarithm problem (DLP)
  – Performance vs. security offered
• These schemes need to be efficiently extended to other data types (e.g., floats) as well as to arbitrary sequences of arithmetic operations (e.g., SUM(A1+A2*100))
• What if an aggregation query has an associated selection clause? How to execute queries with complex selection conditions?
  – E.g., SUM salary where department_id > 35 and department_id < 45
References
• Hakan Hacigumus, Bala Iyer, Chen Li, and Sharad Mehrotra, "Executing SQL over Encrypted Data in the Database-Service-Provider Model", 2002 ACM SIGMOD Conference on Management of Data, June 2002.
• Hakan Hacigumus, Bala Iyer, and Sharad Mehrotra, "Providing Database as a Service", 2002 IEEE International Conference on Data Engineering (ICDE), February 2002.
• Hakan Hacigumus, Bala Iyer, and Sharad Mehrotra, "Efficient Execution of Aggregation Queries over Encrypted Relational Databases".
Thank You!