keyword search on encrypted data
Keyword search problem
• Linux utility: grep
• Information retrieval
• Basic operation
• Advanced operations: relevance analysis and ranking
• Search engines: a highly complicated problem
New settings
• Search data in the cloud
• Filter encrypted emails
• Privacy-preserving log retrieval
Basic techniques
• Symmetric encryption
• Public key encryption
• Simple keyword matching
• A little relevance evaluation
Secure keyword search with symmetric encryption
Paper: Song 2000
• Seed is random, different for each Wi
• Key idea: Li and Ri are self-verifiable
• Advantage of XOR
How to set k?
Setting of ki
• ki = Fk'(Wi), where k' is secret
• User publishes W and k = Fk'(W)
• Server computes Ci ⊕ W and checks whether it has the form <Li, Fk(Li)>
• This reveals nothing if Ci is not the ciphertext for W
• Li is random and different for each Wi, so the server cannot learn anything from Li
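A minimal Python sketch of the check described above, not the paper's exact construction. Assumptions made only for illustration: HMAC-SHA256 stands in for the PRF F, each word is hashed to a fixed-length block (so this sketch supports matching but not decryption), and the names `prf`, `pad_word`, `encrypt_word`, `trapdoor`, and `server_match` are hypothetical.

```python
import hashlib
import hmac
import os

HALF = 16  # bytes in each half of the pad <Li, Fk(Li)>


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in for the PRF F (assumption): HMAC-SHA256 truncated to HALF bytes."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:HALF]


def pad_word(word: str) -> bytes:
    """Map a word to a fixed 2*HALF-byte block (hashing is only used to fix the length)."""
    return hashlib.sha256(word.encode()).digest()[:2 * HALF]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_word(k_prime: bytes, word: str) -> bytes:
    w = pad_word(word)
    k_i = prf(k_prime, w)                # ki = Fk'(Wi)
    l_i = os.urandom(HALF)               # fresh random Li for every word occurrence
    return xor(w, l_i + prf(k_i, l_i))   # Ci = Wi XOR <Li, Fki(Li)>


def trapdoor(k_prime: bytes, word: str):
    """What the user publishes for a search: W and k = Fk'(W)."""
    w = pad_word(word)
    return w, prf(k_prime, w)


def server_match(c_i: bytes, w: bytes, k: bytes) -> bool:
    """Server side: does Ci XOR W have the form <L, Fk(L)>?"""
    t = xor(c_i, w)
    left, right = t[:HALF], t[HALF:]
    return hmac.compare_digest(prf(k, left), right)
```

Usage: with `k_prime = os.urandom(32)`, encrypt each word of a document with `encrypt_word`, hand the server `trapdoor(k_prime, "alpha")`, and let it call `server_match` on every ciphertext block. Two encryptions of the same word differ because Li is fresh each time.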
Hidden search
• In the previous scheme, W is revealed to the server
• Weakness: each search has to release k for W, so it is easy to collect information
• Solution: encrypt Wi with a private key, then XOR with <Li, Fk(Li)>
Remaining weaknesses
• Encryption of Wi must be deterministic
• Access pattern is leaked
• Linear scan over the whole document collection
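A hedged sketch of the hidden-search variant: the word is first turned into a deterministic ciphertext X, and only X (never W) reaches the server. Assumptions for illustration only: a keyed HMAC stands in for the deterministic encryption of Wi (so X cannot be decrypted here), HMAC-SHA256 stands in for F, and all function names are hypothetical.

```python
import hashlib
import hmac
import os

HALF = 16  # bytes in each half of the pad <Li, Fk(Li)>


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in for the PRF F (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:HALF]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def det_encrypt(k_word: bytes, word: str) -> bytes:
    """Deterministic stand-in for E(Wi): the same word always gives the same X."""
    return hmac.new(k_word, word.encode(), hashlib.sha256).digest()[:2 * HALF]


def encrypt_word(k_word: bytes, k_prime: bytes, word: str) -> bytes:
    x = det_encrypt(k_word, word)
    k_i = prf(k_prime, x)                # key derived from the encrypted word
    l_i = os.urandom(HALF)
    return xor(x, l_i + prf(k_i, l_i))   # Ci = E(Wi) XOR <Li, Fki(Li)>


def trapdoor(k_word: bytes, k_prime: bytes, word: str):
    """The user releases X = E(W) and k, not the plaintext word."""
    x = det_encrypt(k_word, word)
    return x, prf(k_prime, x)


def server_match(c_i: bytes, x: bytes, k: bytes) -> bool:
    t = xor(c_i, x)
    return hmac.compare_digest(prf(k, t[:HALF]), t[HALF:])
```

Because `det_encrypt` is deterministic, repeating a query yields the same trapdoor, and the server still learns which blocks matched and must scan every block.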
Typical method for speedy keyword-based search: the “inverted index”
• word -> doc1:pos, doc2:pos, …
• Or simply word -> doc1, doc2, …
• However, an inverted index reveals word frequency
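A small Python sketch of the simple form (word -> doc1, doc2, …) and of the leak just mentioned. The sample documents and the helper names `build_inverted_index` and `list_lengths` are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """docs: {doc_id: text}. Returns {word: [doc_id, ...]} (positions omitted)."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for word in set(docs[doc_id].lower().split()):
            index[word].append(doc_id)
    return dict(index)


def list_lengths(index):
    """Posting-list lengths: visible even if every word is replaced by an opaque label."""
    return sorted((len(ids) for ids in index.values()), reverse=True)


docs = {1: "cloud email search", 2: "cloud log", 3: "cloud search"}
index = build_inverted_index(docs)
print(index["cloud"])        # [1, 2, 3]
print(list_lengths(index))   # [3, 2, 1, 1]
```

Lookup is a single dictionary access instead of a linear scan, but the list lengths alone tell an observer how common each (hidden) word is.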
Recent developments
Reza 2006: “Searchable symmetric encryption: improved definitions and efficient constructions”
• Addresses the word-frequency problem, with security defined as indistinguishability under chosen-keyword attack
• Allows an inverted index
• Hides word frequency
Setup
• D: the set of documents {D1, …, Dn}
• max: the maximum number of distinct words in a document
• Li: the list of document IDs that contain the keyword wi, plus some dummy entries to reach max
• A: array that contains all elements of the lists Li (max * |D|)
• T: table that contains the entries <wi, address of Li's first node>
• Symmetric encryption function: encrypts words and document IDs; id(Dj) in the entry for wi is encoded as enc(wi || j) to make entries indistinguishable
• Pseudo-random function f
• Two pseudo-random permutation functions: one for mapping a word to a table entry, and one for mapping the index of the next node of Li to an index of array A
Building the index table T
The key used to encrypt the node Ni,1
1.
2.
to random values of the same size as the existing entries
Generating Li
• With Ki,0, we can decrypt all nodes in the list
• For the remaining max - |D(wi)| dummy nodes, store a doc ID that already appears in the first |D(wi)| entries. This can be done with the help of a look-up table I
Search
• Generate the trapdoor
Search
Property
• Each keyword search returns the same number of encrypted document IDs, so the attacker cannot distinguish word frequency
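A much-simplified Python sketch of this property, not the construction from the paper: there is no array A, no linked list, and no per-node keys. It only shows padded posting lists stored under pseudo-random labels. Assumptions for illustration: HMAC-SHA256 stands in for both the PRF and the encryption pad, every list is padded to the longest list rather than to max, and all names are hypothetical.

```python
import hashlib
import hmac
import os


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in PRF (assumption): HMAC-SHA256."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def build_index(docs, k_tok, k_enc):
    """docs: {doc_id (int): set of words}. Returns {label: [encrypted id, ...]}."""
    postings = {}
    for doc_id in sorted(docs):
        for w in docs[doc_id]:
            postings.setdefault(w, []).append(doc_id)
    pad_to = max(len(ids) for ids in postings.values())
    index = {}
    for w, ids in postings.items():
        # Dummy entries repeat ids that already appear in the real part of the list.
        padded = ids + [ids[i % len(ids)] for i in range(pad_to - len(ids))]
        list_key = prf(k_enc, w.encode())
        # Position-dependent pad: repeated ids still produce different ciphertexts.
        nodes = [xor(d.to_bytes(4, "big"), prf(list_key, j.to_bytes(4, "big")))
                 for j, d in enumerate(padded)]
        index[prf(k_tok, w.encode())] = nodes
    return index


def trapdoor(k_tok, word):
    return prf(k_tok, word.encode())


def server_search(index, label):
    """Server returns the encrypted nodes; it never sees the ids in the clear."""
    return index.get(label, [])


def client_decrypt(nodes, k_enc, word):
    list_key = prf(k_enc, word.encode())
    ids = {int.from_bytes(xor(n, prf(list_key, j.to_bytes(4, "big"))), "big")
           for j, n in enumerate(nodes)}
    return sorted(ids)  # duplicates introduced by padding collapse here
```

With `docs = {1: {"cloud", "email"}, 2: {"cloud", "log"}, 3: {"cloud"}}`, searching for "email" and for "cloud" both return three encrypted entries; only the client, after decrypting and removing duplicates, sees one ID versus three.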
Search public-key encrypted data
• Users who encrypt the data (with the public key) can be different from the owner of the private key
Cyclic group
• For example, if G = {g^0, g^1, g^2, g^3, g^4, g^5} mod p is a group, then g^6 = g^0, and G is cyclic
• The order of G is the number of its elements (6 here, while p is the modulus)
• g is the generator
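A concrete instance can be checked in a few lines of Python. The values p = 7 and g = 3 are chosen here only as an example and do not come from the slides.

```python
def powers(g, p, count):
    """Return [g^0, g^1, ..., g^(count-1)] reduced mod p."""
    return [pow(g, i, p) for i in range(count)]


p, g = 7, 3
elements = powers(g, p, 6)
print(elements)        # [1, 3, 2, 6, 4, 5]
print(pow(g, 6, p))    # 1, i.e. g^6 = g^0
```

The six powers of 3 are all distinct and cover every nonzero residue mod 7, so 3 generates a cyclic group of order 6, and the sequence repeats from g^6 onward.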
Bilinear-map construction
• Two groups G1, G2 of prime order p
• A bilinear map: G1 × G1 -> G2
• Properties: