Outsourcing Database Services
Đỗ Phước Hoàng Tường Lân (50701257), Nguyễn Minh Thông (50702368), Lê Tuấn Đạt (50700487)
Outline
• Service providers models for ODBS
• Two extreme security protocols
• Balancing Security and Efficiency
Introductions
Advances in networking technology and the growth of the Internet have created a new trend towards outsourcing data management (music, videos, pictures…).
Outsourcing Database Services are no exception and have been under continuous development.
Introductions (cont.)
Commercial companies, and especially the research community, use Outsourcing Database Services (ODBS) models to store, maintain and retrieve their data. Why?
[Figure: Data owners store and retrieve data at the DBMSs; clients retrieve data and pay for the services.]
Introductions (cont.)
The reasons:
• Reduced cost of purchasing software and hardware infrastructure.
• No need to pay for deploying, maintaining and upgrading the system.
• …
* But every coin has two sides *
Introductions (cont.)
The problem: security!
• The client stores its private data at an external service provider who is typically not fully trusted.
• How is the clients' private data protected against sophisticated attackers?
• How can clients operate on their outsourced data without worrying about leaking their sensitive information?
Introductions (cont.)
Security requirements:
• Data confidentiality
• User privacy
• Data privacy
• Authentication and data integrity
Data confidentiality: Outsiders, and even the server's operators (database administrators), must not be able to see the contents of the client's outsourced data under any circumstances.
User privacy: Clients do not want the server to learn about their queries or the returned results.
Data privacy: Clients must not be able to obtain more information from the server than what they are querying for.
Authentication and data integrity: Clients must be assured that data returned from the untrusted server originated from the data owner and has not been tampered with.
Example: An organization M has a DNA database containing patterns for various diseases. M stores these DNA patterns on a database server DB and allows a client A to query the database for information relating to A's DNA sequence.
[Figure: Organization M outsources its DNA database to the untrusted server (data confidentiality problem). Client A sends queries (user privacy problem) and receives responses (data privacy problem).]
The need for data confidentiality, data privacy or user privacy depends on the particular scenario in the ODBS model and must be considered carefully.
Example: Suppose DB is hired for M's use only, i.e. client A is M itself and M outsources its database services only to benefit from the advantages of the ODBS model. Data privacy is then unnecessary, but neglecting user privacy as described above may still expose the outsourced data to danger, even if it has been encrypted.
Service providers models for ODBS
Four main service provider (SP) models:
• UP-DP model (User Privacy – Data Privacy)
• UP-nDP model (User Privacy – not Data Privacy)
• DC-UP model (Data Confidentiality – User Privacy)
• DC-UP-DP model (Data Confidentiality – User Privacy – Data Privacy)
UP-DP model: Data owners are also the SPs. They sell information and charge clients for using their services. The information sold is valuable, so the SP is concerned about data privacy. In this model, the client is concerned about user privacy.
UP-nDP model: As in the UP-DP model, data owners are also the SPs, but they charge clients only for using their services, and the stored data is public. In this model, the client is still concerned about user privacy, but the SP is not concerned about data privacy.
DC-UP model: Data owners are also the only clients, and their data is outsourced to an external database server. In this model, the data owner (who is also the client) is concerned only about data confidentiality and user privacy.
DC-UP-DP model: Data owners outsource their data and charge clients for using it. The data owner is concerned about both data confidentiality and data privacy. The client, in turn, is concerned about user privacy. Moreover, the data owner also takes the client role when accessing its outsourced data on the server and, in this case, is concerned about user privacy as well.
* Each SP model has different security objectives, so different security techniques and protocols have been devised to meet them
For user privacy:
Private information retrieval (PIR) protocol
• First introduced by Chor and colleagues.
• Allows a client to access a database without revealing to the server either the query or the returned result
[Figure: A client sends a query for the i-th record to the data server, which holds an N-record database, and receives record[i] in response; i is hidden from the server.]
* Does not support write operations
* High IO cost
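The idea of hiding i from the server can be illustrated with a toy sketch. It assumes a setting the slides do not spell out: two servers that each hold a copy of the database and do not collude. All function names here are made up for illustration, and this is not a hardened implementation.

```python
import secrets

def make_queries(n, i):
    """Client: build two bit vectors that differ only at position i."""
    q1 = [secrets.randbelow(2) for _ in range(n)]  # uniformly random bits
    q2 = q1.copy()
    q2[i] ^= 1                                     # flip only the wanted position
    return q1, q2

def server_answer(db, query):
    """Server: XOR together every record whose query bit is 1."""
    acc = bytes(len(db[0]))                        # all-zero block
    for record, bit in zip(db, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, record))
    return acc

def recover(answer1, answer2):
    """Client: the XOR of both answers is exactly record i."""
    return bytes(x ^ y for x, y in zip(answer1, answer2))

# Toy database of equal-length records (an assumption of this sketch).
db = [b"rec0", b"rec1", b"rec2", b"rec3"]
q1, q2 = make_queries(len(db), 2)
record = recover(server_answer(db, q1), server_answer(db, q2))
```

Each server sees only a random-looking bit vector, yet each has to scan the whole database to answer, which matches the high IO cost noted above.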
Private information storage (PIS) protocol
• Supports write operations privately
Repudiative information retrieval (RIR) protocol
• Preserves user privacy, but with a better IO cost
• Reduces the IO cost from O(N log N) to O(sqrt(N))
User anonymity
Extreme protocol
For data privacy:
Symmetrically private information retrieval (SPIR) protocol
• Built on top of any PIR protocol, with the aim of satisfying both the user privacy and data privacy requirements
Extreme protocol
For data confidentiality:
Index of range, hash-based methods
Extreme protocol
• Proposed by Hacigümüs and team.
• Relies on partitioning the domains of the client tables' attributes into sets of intervals; suitable for both exact-match and range queries
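The interval-partitioning idea can be sketched roughly as follows. The domain, the interval boundaries, the bucket labels and all names are hypothetical, and the rows are kept in plaintext as a stand-in for their ciphertexts.

```python
import bisect

# Hypothetical partition of an integer attribute domain [0, 100).
BOUNDARIES = [0, 25, 50, 75, 100]   # interval k covers [BOUNDARIES[k], BOUNDARIES[k+1])
BUCKET_IDS = [7, 2, 9, 4]           # arbitrary labels, so their order reveals nothing

def bucket_of(value):
    """Map an attribute value to the label of the interval containing it."""
    k = bisect.bisect_right(BOUNDARIES, value) - 1
    if not 0 <= k < len(BUCKET_IDS):
        raise ValueError("value outside the partitioned domain")
    return BUCKET_IDS[k]

def buckets_for_range(lo, hi):
    """Labels of all intervals that overlap the closed range [lo, hi]."""
    return {BUCKET_IDS[k] for k in range(len(BUCKET_IDS))
            if BOUNDARIES[k] <= hi and lo < BOUNDARIES[k + 1]}

# The server stores (encrypted row, bucket label); each row stands in for its ciphertext.
rows = [("alice", 30), ("bob", 48), ("carol", 55), ("dave", 90)]
server_table = [(row, bucket_of(row[1])) for row in rows]

def range_query(lo, hi):
    wanted = buckets_for_range(lo, hi)
    # Server side: filter on bucket labels only.
    candidates = [row for row, label in server_table if label in wanted]
    # Client side: after decryption, drop false positives from partly covered intervals.
    return [row for row in candidates if lo <= row[1] <= hi]

result = range_query(40, 60)
```

An exact-match query is the special case `range_query(v, v)`. Coarser intervals leak less to the server but return more false positives for the client to discard.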
For authentication and data integrity:
Merkle hash trees
• Use binary search trees to construct authenticated dictionary structures.
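A minimal sketch of a Merkle hash tree over a list of records is given below. It assumes SHA-256 and assumes the data owner publishes (for example, signs) the root hash; neither detail comes from the slides, and the function names are illustrative only.

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [h(b"\x00" + leaf) for leaf in leaves]    # prefix separates leaves from inner nodes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]               # odd level: pair the last node with itself
        level = [h(b"\x01" + level[k] + level[k + 1]) for k in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index                           # node was paired with itself
        proof.append((level[sibling], index % 2 == 1))  # (hash, sibling is on the left)
        index //= 2
    return proof

def verify(leaf, proof, root):
    """Client: recompute the root from one record and its proof."""
    node = h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = h(b"\x01" + sibling + node) if sibling_is_left else h(b"\x01" + node + sibling)
    return node == root

records = [b"Alice", b"Anne", b"Bob", b"Carol", b"Ha"]
levels = build_levels(records)
root = levels[-1][0]
proof_for_bob = prove(levels, 2)
```

The untrusted server returns a record together with its proof; the client accepts it only if `verify` reproduces the root obtained from the data owner.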
Data index problems:
• Nowadays, data is organized into structures that make inserting, retrieving and removing records convenient
• Tree-based structures such as the B-tree and B+-tree are used
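An example of a B+-tree on attribute CustomerName.
[Figure: B+-tree with nodes numbered 0 to 8. Root (node 0): John. Internal nodes 1 and 2: (Bob, Ha) and (Rose, Trang). Leaf nodes: (Alice, Anne), (Bob, Carol), (Ha), (John, Linh), (Rose, Sam), (Trang).]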
Based on the sequence of nodes accessed, the user can obtain more information than the query asks for => data privacy problem.
Using this sequence of accessed nodes, the server can infer sensitive information with data mining techniques => data confidentiality problem.
* To solve these security problems, Lin and Candan introduced new techniques for accessing outsourced tree nodes, called access redundancy and node swapping.
Without special security hardware, we cannot prevent attacks that exploit outsourced tree-based index structures. How can we deal with this?
Two Extreme Security Protocols
To protect the outsourced data from possible intruders, we encrypt the data prior to outsourcing.
B+ Table
NID  Node
0    (1,John,2,-,-1)
1    (3,Bob,4,Ha,5)
2    (6,Rose,7,Trang,8)
3    (Alice,Anne,4)
4    (Bob,Carol,5)
5    (Ha,-,6)
6    (John,Linh,7)
7    (Rose,Sam,8)
8    (Trang,-,-1)

B+ Encrypted Table
NID  Encrypted Node
0    D0a1n2g3Kh75nhs&
1    T9&8ra§ÖÄajh³q91
2    H&$uye”µnÜis57ß@
3    L?{inh*ß²³&§gnaD
4    Wh09a/[%?Ö*#Aj2k
5    j8Hß}[aHo$§angµG
6    #Xyi29?ß~R@€>Kh
7    ~B³!jKDÖbd0K3}%§
8    T-§µran&gU19=75m
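To see why encryption alone does not hide everything, the plaintext B+ table above can be searched as in the sketch below. The tuple layout is an interpretation, not something the slides state: internal nodes are read as (child, key, child, key, child) and leaves as (key, key, next leaf), with "-" meaning an empty slot and -1 meaning no pointer. Encrypting the nodes does not change which NIDs are fetched.

```python
# NID -> node, transcribed from the B+ table; None replaces "-" (assumed layout).
INTERNAL = {
    0: (1, "John", 2, None, -1),
    1: (3, "Bob", 4, "Ha", 5),
    2: (6, "Rose", 7, "Trang", 8),
}
LEAVES = {
    3: ("Alice", "Anne", 4), 4: ("Bob", "Carol", 5), 5: ("Ha", None, 6),
    6: ("John", "Linh", 7), 7: ("Rose", "Sam", 8), 8: ("Trang", None, -1),
}

def search(key):
    """Return (found, accessed NIDs); the NID sequence is what the server observes."""
    accessed = []
    nid = 0                                   # start at the root
    while nid in INTERNAL:
        accessed.append(nid)
        left, k1, mid, k2, right = INTERNAL[nid]
        if key < k1:
            nid = left
        elif k2 is None or key < k2:
            nid = mid
        else:
            nid = right
    accessed.append(nid)                      # the leaf that was fetched
    k1, k2, _next_leaf = LEAVES[nid]
    return key in (k1, k2), accessed

found, path = search("Carol")                 # path is [0, 1, 4]
```

Two queries with the same path fall into the same key range, so the sequence of accessed nodes leaks information even when the node contents are unreadable.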
Two Extreme Security Protocols (cont.)
DC-UP model's formula
DC + UP = Encryption + PIR protocol
For private reading and writing operations:
DC + UP = Encryption + PIS protocol
Two Extreme Security Protocols (cont.)
DC-UP-DP model
A trusted third party, acting like a secure coprocessor, is involved
[Figure: A security protocol for the DC-UP-DP model, showing DB, M, A and K, and a tree made up of the root, internal nodes (i-node) and leaves.]
Balancing Security and Efficiency
What is efficiency?
• CPU cost
• IO cost
• Memory cost
The efficiency of the DC-UP model (and of the DC-UP-DP model as well) depends on the efficiency of the PIR protocol
* The question is: how can the client's queries be performed effectively, efficiently and obliviously over encrypted data without revealing any information about the data or the queries to unauthorized parties?
Balancing Security and Efficiency (cont.)
Modifications for the DC-UP Model
* RIR is used instead of PIR to reduce the cost
DC + UP = Encryption + RIR protocol
* Similarly, RIS is used in place of PIS
DC + UP = Encryption + RIS protocol
Balancing Security and Efficiency (cont.)
Modifications for the DC-UP Model (cont.)
* To support oblivious search on a single outsourced search tree, two new techniques called access redundancy and node swapping were developed. PIR/RIR-like protocols combined with these techniques can be used for the DC-UP model
Access redundancy
• Whenever a client accesses a node, called the target node, she asks the server for a set of m-1 randomly selected nodes in addition to the target node.
• Better performance, but a lower security level. So what is the weakness?
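The access redundancy request described above can be sketched as follows. The function name, the NID range 0 to 8 and the choice m = 3 are illustrative assumptions.

```python
import random

def redundant_request(target, all_nids, m, rng=None):
    """Return the target NID hidden among m-1 randomly chosen other NIDs."""
    rng = rng or random.SystemRandom()        # OS randomness rather than the default PRNG
    others = [nid for nid in all_nids if nid != target]
    request = rng.sample(others, m - 1)       # m-1 distinct decoy nodes
    request.append(target)
    rng.shuffle(request)                      # position must not give the target away
    return request

all_nids = list(range(9))                     # NIDs 0..8, as in the B+ table
first = redundant_request(4, all_nids, 3)
second = redundant_request(4, all_nids, 3)
common = set(first) & set(second)             # always contains the target NID 4
```

In this sketch, every request for the same target contains that target, so a server that intersects repeated requests can narrow down the candidates. This is a property of the sketch itself, offered as one way to think about the question above.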
Node swapping
• Overcomes the weakness of the former technique.
• There are still limitations and weaknesses.
Balancing Security and Efficiency (cont.)
Modifications related to the DC-UP-DP model
• All possible modifications for the DC-UP model can also be suitably applied to the DC-UP-DP model
• DB now stores only the leaf nodes of the tree.
• It is not necessary to encrypt the meta-data stored at K
Conclusions
ODBS model
• Outsourcing data
• Depending on external (untrusted) service providers
• Dealing with security issues
• A trade-off between security and efficiency.
Sources
• Paper by Mr. Dang Tran Khanh
• Wikipedia