How to Keyword-Search Securely in Cloud Storage Service
Kaoru Kurosawa Ibaraki University, Japan
ICISC 2014, Dec. 3-5, Chung-Ang University, Korea
Cloud Storage Service is now available
Service                  Provider
Amazon S3 / Cloud Drive  Amazon
Google Drive             Google
OneDrive                 Microsoft
iCloud                   Apple
Dropbox                  Dropbox
and many more
We know that
• we should store encrypted documents.
• Then, we cannot even do keyword search.
A Searchable Symmetric Encryption (SSE) scheme
• solves this problem.
• It consists of a store phase and a search phase.
In the store phase,
• A client stores the encrypted files (or documents) and the encrypted Index on the server
Client Server
E(D1), ⋯, E(DN), E(Index)
In the search phase,
• The client sends an encrypted keyword to the server
Client Server
E(keyword)
6
The server somehow returns
• The encrypted files E(D3), E(D6), E(D10)
which contain the keyword
Client Server
E(keyword)
E(D3), E(D6), E(D10)
7
So the client can
• retrieve some of the encrypted files
• which contain a specific keyword,
• keeping the keyword secret
Client Server
E(keyword)
E(D3), E(D6), E(D10)
8
SSE has been studied by
• D. Song, D. Wagner, A. Perrig (2000)
• Eu-Jin Goh (2003)
• Golle, Staddon, Waters (2004)
• Y. Chang and M. Mitzenmacher (2005)
• Curtmola, Garay, Kamara and Ostrovsky (2006)
• Peishun Wang, Huaxiong Wang, Josef Pieprzyk (2008)
• Kamara, Papamanthou and Roeder (2012)
• Cash, Jarecki, Jutla, Krawczyk, Rosu, Steiner (2013)
• Cash and Tessaro (2014)
In this talk,
• UC-Secure Searchable Symmetric Encryption
• How to Update Documents Verifiably in Searchable Symmetric Encryption
• Garbled Searchable Symmetric Encryption
10
First
• UC-Secure Searchable Symmetric Encryption, Kaoru Kurosawa and Yasuhiro Ohtaki (FC 2012)
11
By Passive Attack
• A server tries to break the privacy:
• she tries to find the keyword and the documents
Client Server
E(keyword)
E(D3), E(D6), E(D10)
Malicious
By Active Attack
• A server tries to break the reliability:
• she tries to forge and delete some files,
• or replace E(D3) with another E(D100).
Client Server
E(keyword)
E(D3), E(D6), E(D10)E(D100)
Malicious
13
Curtmola, Garay, Kamara and Ostrovsky (2006)
• showed a rigorous definition of security against passive attacks (privacy).
• They also presented a scheme which satisfies their definition.
At FC 2012
We studied
Privacy      Curtmola et al.
Reliability  Our paper
UC security  Our paper
and proved that Privacy + Reliability = UC security
Curtmola et al.
showed an SSE scheme as follows.
Consider the following “Index”
keyword     Documents
Austin      D3, D6, D10
Boston      D8, D10
Washington  D1, D4, D8
The client first constructs E(Index) as follows.
• He chooses a pseudorandom permutation π.
E(Index) = a table with addresses π(1), π(2), π(3), …
He next computes
• π(Austin, 1), π(Austin, 2) and π(Austin, 3),
• and writes the indexes (3, 6, 10) in these addresses
E(Index):
Address        Content
π(Austin, 1)   3
π(Austin, 2)   6
π(Austin, 3)   10
Do the same for each keyword
E(Index):
Address        Content
π(Austin, 1)   3
π(Austin, 2)   6
π(Austin, 3)   10
π(Boston, 1)   8
π(Boston, 2)   10
In the store phase,
• The client stores this E(Index) and the ciphertext of each file on the server
Client Server
E(Index), E(D1), ⋯, E(DN)
In the search phase,
• The client sends trapdoor information
Client Server
t(Austin) = ( π(Austin, 1), π(Austin, 2), π(Austin, 3) )
The server finds the corresponding indexes
Client Server
π(Austin, 1), π(Austin, 2), π(Austin, 3)
E(Index): π(Austin, 1) → 3, π(Austin, 2) → 6, π(Austin, 3) → 10
and returns
Client Server
π(Austin, 1), π(Austin, 2), π(Austin, 3)
E(D3), E(D6), E(D10)
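The store and search phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the pseudorandom permutation π (HMAC is a pseudorandom function, not a permutation), the file ciphertexts are left out, and the names `addr`, `build_index`, `trapdoor` and `server_search` are hypothetical.

```python
import hashlib
import hmac

def addr(key: bytes, keyword: str, i: int) -> str:
    """Stand-in for pi(keyword, i): HMAC-SHA256 of the pair (assumption)."""
    msg = f"{keyword}|{i}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def build_index(key: bytes, index: dict) -> dict:
    """Store phase: write the i-th document number of each keyword at pi(keyword, i)."""
    table = {}
    for keyword, docs in index.items():
        for i, doc in enumerate(docs, start=1):
            table[addr(key, keyword, i)] = doc
    return table

def trapdoor(key: bytes, keyword: str, count: int) -> list:
    """Search phase, client side: t(keyword) = (pi(keyword, 1), ..., pi(keyword, count))."""
    return [addr(key, keyword, i) for i in range(1, count + 1)]

def server_search(table: dict, t: list) -> list:
    """Search phase, server side: look up each address and collect the indexes."""
    return [table[a] for a in t]

KEY = b"client-secret-key"  # hypothetical key, kept by the client
INDEX = {"Austin": [3, 6, 10], "Boston": [8, 10], "Washington": [1, 4, 8]}
E_INDEX = build_index(KEY, INDEX)
RESULT = server_search(E_INDEX, trapdoor(KEY, "Austin", 3))
print(RESULT)
```

The server only ever sees the addresses and the document numbers stored at them, which matches the leakage discussed later in the talk.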
This scheme
• is secure against passive attacks.
• But it is not secure against active attacks.
• We will show how to make this scheme verifiable.
A naive approach is to add a MAC to each E(Di)
Client Server
π(Austin, 1), π(Austin, 2), π(Austin, 3)
E(D3), MAC(E(D3)), E(D6), MAC(E(D6)), E(D10), MAC(E(D10))
The server returns these files together with their MACs
But a malicious server will
Client
π(Austin, 1), π(Austin, 2), π(Austin, 3)
E(D3), MAC(E(D3)),E(D6), MAC(E(D6)),E(D10), MAC(E(D10))
Malicious
replace some pair with another pair of (file, MAC)
E(D100), MAC(E(D100))
The client cannot detect this cheating
Client
π(Austin, 1), π(Austin, 2), π(Austin, 3)
E(D3), MAC(E(D3)),E(D6), MAC(E(D6)),E(D10), MAC(E(D10))
Malicious
because this is a valid (file, MAC) pair
E(D100), MAC(E(D100))
In our verifiable scheme
We include π(Austin, 1) in the input of the MAC
π(Austin, 1)
So the server returns E(D3), Tag3 = MAC(π(Austin, 1), E(D3))
This method works
because the MAC authenticates the whole communication
π(Austin, 1)
E(D3), Tag3 = MAC(π(Austin, 1), E(D3))
At the store phase,
• The client writes such MAC values in E(Index)
E(Index):
Address        Content
π(Austin, 1)   3, tag3 = MAC( π(Austin, 1), E(D3) )
π(Austin, 2)   6, tag6 = MAC( π(Austin, 2), E(D6) )
π(Austin, 3)   10, tag10 = MAC( π(Austin, 3), E(D10) )
For a query π(Austin, 1)
The server looks up the address π(Austin, 1) in E(Index) and finds
3, tag3 = MAC( π(Austin, 1), E(D3) )
The server returns E(D3) and Tag3
The client checks the validity of
π(Austin, 1)
tag3=MAC( π(Austin, 1), E(D3) )
E(D3)
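The difference between the naive MAC and the address-bound MAC can be checked with a short Python sketch. It assumes HMAC-SHA256 as the MAC and uses placeholder byte strings for π(Austin, 1), E(D3) and E(D100); the second address `pi(Tokyo,1)` and all function names are hypothetical.

```python
import hashlib
import hmac

def mac(key: bytes, *parts: bytes) -> bytes:
    """HMAC-SHA256 over length-prefixed parts, so the encoding is unambiguous."""
    h = hmac.new(key, digestmod=hashlib.sha256)
    for p in parts:
        h.update(len(p).to_bytes(4, "big") + p)
    return h.digest()

MAC_KEY = b"client-mac-key"           # hypothetical key, kept by the client
ADDR_AUSTIN_1 = b"pi(Austin,1)"       # placeholder for the address pi(Austin, 1)
C3, C100 = b"E(D3)", b"E(D100)"       # placeholders for the ciphertexts

# Naive approach: the tag covers only the ciphertext.
naive_tag = {c: mac(MAC_KEY, c) for c in (C3, C100)}

# Verifiable scheme: the tag also covers the address it is stored under.
tag3 = mac(MAC_KEY, ADDR_AUSTIN_1, C3)
tag100 = mac(MAC_KEY, b"pi(Tokyo,1)", C100)   # E(D100) belongs to another address

def verify_naive(c: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac(MAC_KEY, c), tag)

def verify_bound(address: bytes, c: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac(MAC_KEY, address, c), tag)

# A malicious server answers the query pi(Austin, 1) with E(D100) and its stored tag.
SWAP_PASSES_NAIVE = verify_naive(C100, naive_tag[C100])
SWAP_PASSES_BOUND = verify_bound(ADDR_AUSTIN_1, C100, tag100)
HONEST_PASSES_BOUND = verify_bound(ADDR_AUSTIN_1, C3, tag3)
print(SWAP_PASSES_NAIVE, SWAP_PASSES_BOUND, HONEST_PASSES_BOUND)
```

The client recomputes the MAC with the address it queried, so a tag that was issued for a different address does not verify.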
33
We next consider
• the definition of security.
• The security against active attacks consists of privacy and reliability
• We define privacy similarly to Curtmola et al. as follows.
Minimum Leakage
In the store phase,
E(D1), ⋯, E(DN), E(Index)
the server learns |D1|, …, |DN| and |{keywords}|
In the search phase,
For t(keyword), the server returns
C(keyword) = ( E(D3), E(D6), E(D10) ), Tag
This means that the server knows the corresponding indexes {3, 6, 10}
We call this information
• |D1|, …, |DN| and |{keywords}|
• the corresponding indexes {3, 6, 10}
the minimum leakage
The Privacy definition
• requires that the server should not be able to learn any more information
• To formulate this, we consider a real game and a simulation game
In the Real Game
Distinguisher → Challenger: D = {D1, …, DN}, W = {set of keywords}, Index
Challenger → Distinguisher: C = { E(D1), ⋯, E(DN) }, I = E(Index)
In the search phase
Distinguisher → Challenger: keyword
Challenger → Distinguisher: t(keyword)
Repeat
Finally
the Distinguisher outputs b = 0 or 1
In the Simulation Game
Distinguisher → Challenger: D = {D1, …, DN}, W = {set of keywords}, Index
Challenger → Simulator: the minimum leakage |D1|, …, |DN| and |{keywords}|
The Simulator somehow computes the ciphertexts C = { E(D1), ⋯, E(DN) }, I = E(Index)
In the search phase,
Distinguisher → Challenger: keyword
Challenger → Simulator: the minimum leakage {3, 6, 10}
The Simulator somehow computes t(keyword)
Repeat
Finally
the Distinguisher outputs b = 0 or 1
We say that
• Privacy is satisfied if
• there exists a simulator such that
the real game ≈ the simulation game
This Def. of privacy
• was given by Curtmola et al.
• But it looks artificial.
• Who is the distinguisher?
Server? No. Client? No.
This question will be resolved
• When we consider UC security.
• From the viewpoint of UC security, this is a very natural Def. of privacy.
• We will come back to this point later.
Next, Reliability
The client sends
t(keyword)
The honest server returns C(keyword) = {E(D3), E(D6), E(D10)}, Tag
We say that
Reliability is satisfied if no server can forge (C(keyword)*, Tag*) such that C(keyword)* ≠ C(keyword)
By the way,
Even if a protocol Σ is secure in the stand-alone setting, it may not be secure
• if Σ is executed concurrently,
• or if Σ is a part of a larger protocol
(Figure: Client 1 and Client 2 each run Σ with the Server.)
Universal Composability (UC)
is a framework which guarantees that
• a protocol Σ is secure
• even if it is executed concurrently, and
• even if it is a part of a larger protocol
The notion of UC
• was introduced by Canetti.
• He proved that UC-security is maintained under a general protocol composition.
We formulated the UC security
• of verifiable SSE schemes.
• To do so, we defined the ideal functionality FvSSE as follows.
In the ideal world,
(Parties: Environment Z, dummy Client, Ideal Functionality FvSSE, UC adversary S.)
• Z gives D = {D1, …, DN}, W = {set of keywords} and Index to the dummy client.
• The dummy client relays them to FvSSE
• FvSSE keeps them
• and sends the minimum leakage |D1|, …, |DN|, |{keywords}| to the UC adversary S.
In the search phase
• Z gives keyword to the dummy client.
• The dummy client relays it to FvSSE
• FvSSE sends the minimum leakage {3, 6, 10} to S.
• The UC adversary S returns Accept or Reject
• If S returns Reject, FvSSE sends Reject to the dummy client, and the dummy client relays it to Z.
• If S returns Accept, FvSSE sends {D3, D6, D10} to the dummy client, and the dummy client relays them to Z.
This is an ideal world
because
(Correctness.) The dummy client receives {D3, D6, D10} correctly, or outputs Reject.
(Security.) The UC adversary S learns only the minimum leakage.
Further S can corrupt the dummy client and the dummy server
Also Z can interact with S freely
Z finally outputs 0 or 1
In the real world
(Parties: Environment Z, Client, Server, Adversary A.)
• Z gives D = {set of documents}, W = {set of keywords} and Index to the client.
• Then the client and the server run the store phase.
In the search phase
• Z gives keyword to the client.
• The client and the server run the search phase
• Then the client sends D3, D6, D10 to Z
An adversary A can corrupt the client and the server
Further Z can interact with A freely
Z finally outputs 0 or 1
We say that
• A verifiable SSE scheme is UC-secure if for any adversary A, there exists a UC-adversary S such that the real world ≈ the ideal world.
84
Equivalence
(Our Theorem) A verifiable SSE scheme is UC-secure if and only if it satisfies privacy and reliability
Here we consider non-adaptive adversaries.
Proof
In the real world, Z gives D, W and the following Index to the client.
keyword     Documents
Austin      D3, D6, D10
Boston      D8, D10
Washington  D1, D4, D8
The client sends these ciphertexts E(D1), …, E(D10), E(Index) to the server.
Suppose that the adversary A corrupts the server
and sends these ciphertexts to Z.
In the Real Game of Privacy
Distinguisher → Challenger: D, W, Index
Challenger → Distinguisher: C = { E(D1), ⋯, E(DN) }, I = E(Index)
In the UC framework, let the environment Z be the distinguisher and let the other parties play the challenger.
This is equivalent to the real game of privacy.
In the ideal world
Z gives D, W, Index to the dummy client, which relays them to FvSSE.
The UC adversary S receives only |D1|, …, |DN|, |{keywords}|.
S must be able to send E(D1), …, E(D10), E(Index) to Z.
In the Simulation Game of Privacy
Distinguisher → Challenger: D = {D1, …, DN}, W = {set of keywords}, Index
Challenger → Simulator: the minimum leakage |D1|, …, |DN| and |{keywords}|
The Simulator somehow computes C = { E(D1), ⋯, E(DN) }, I = E(Index)
In the UC framework, let the environment Z be the distinguisher, the UC adversary S be the simulator, and let the remaining parties play the challenger.
This is equivalent to the Sim. game of privacy.
The proof of the equivalence
• proceeds in this way.
• At first glance, the Def. of privacy looked artificial.
• But as we have seen now, it is very natural from the viewpoint of UC
Lesson
• SSE is a good example to understand the notion of UC security.
100
Theorem
• Our scheme satisfies privacy and reliability
• if E is CPA-secure and MAC is unforgeable
Corollary
• Our scheme is UC-secure.
102
Next
• How to Update Documents Verifiably in Searchable Symmetric Encryption,
Kaoru Kurosawa and Yasuhiro Ohtaki (CANS 2013)
103
Kamara, Papamanthou and Roeder (2012)
• showed a dynamic SSE scheme such that
the client can add, delete and modify the documents.
• However, their scheme is not verifiable.
Our contribution
                          Verifiable  Dynamic
Curtmola et al.           X           X
Our FC 2012 scheme        O           X
Kamara et al.             X           O
Our scheme of CANS 2013   O           O
First we show
• a more efficient SSE scheme than Curtmola et al. and
• a more efficient verifiable SSE scheme than our FC 2012 scheme
Consider this example
            D1  D2  D3  D4  D5
Austin      1   0   1   0   1
Boston      0   1   0   1   0
Washington  1   1   1   0   0
In our SSE scheme
The client computes
                 E(D1) E(D2) E(D3) E(D4) E(D5)
PRF(Austin)      ( 1 0 1 0 1 )
PRF(Boston)      ( 0 1 0 1 0 )
PRF(Washington)  ( 1 1 1 0 0 )
where PRF means pseudorandom function.
and adds
                 E(D1) E(D2) E(D3) E(D4) E(D5)
PRF(Austin)      ( 1 0 1 0 1 ) + PRF’(Austin)
PRF(Boston)      ( 0 1 0 1 0 ) + PRF’(Boston)
PRF(Washington)  ( 1 1 1 0 0 ) + PRF’(Washington)
The client stores this table on the server
In the search phase,
The client sends PRF(Austin) and PRF’(Austin)
The server decrypts (1 0 1 0 1)
and returns E(D1), E(D3) and E(D5)
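The bit-vector scheme above can be sketched as follows. The sketch assumes that "+" on the slides means bitwise XOR and that both PRF and PRF’ are instantiated with HMAC-SHA256 under two different keys; the file ciphertexts are omitted, and the names `row_label`, `mask_bits` and `server_search` are hypothetical.

```python
import hashlib
import hmac

K1, K2 = b"key-for-PRF", b"key-for-PRF-prime"   # hypothetical client keys

def row_label(keyword: str) -> str:
    """PRF(keyword): the label under which the masked row is stored."""
    return hmac.new(K1, keyword.encode(), hashlib.sha256).hexdigest()

def mask_bits(keyword: str, n: int) -> list:
    """PRF'(keyword) truncated to n bits (n <= 256 in this sketch)."""
    digest = hmac.new(K2, keyword.encode(), hashlib.sha256).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n)]

def xor(a: list, b: list) -> list:
    return [x ^ y for x, y in zip(a, b)]

# Store phase: each index row is masked with PRF'(keyword).
ROWS = {
    "Austin":     [1, 0, 1, 0, 1],
    "Boston":     [0, 1, 0, 1, 0],
    "Washington": [1, 1, 1, 0, 0],
}
TABLE = {row_label(w): xor(row, mask_bits(w, len(row))) for w, row in ROWS.items()}

def server_search(table: dict, label: str, mask: list) -> list:
    """Server side: unmask the row and return the matching document numbers."""
    row = xor(table[label], mask)
    return [j + 1 for j, bit in enumerate(row) if bit == 1]

# Search phase: the client sends PRF(Austin) and PRF'(Austin).
HITS = server_search(TABLE, row_label("Austin"), mask_bits("Austin", 5))
print(HITS)
```

Until the client reveals PRF’(keyword), the stored row is masked, so the server cannot read which documents contain the keyword.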
In our verifiable SSE scheme,
the client stores this table
together with
TagA = MAC( PRF(Austin), E(D1), E(D3), E(D5) )
TagB = MAC( PRF(Boston), E(D2), E(D4) )
TagW = MAC( PRF(Washington), E(D1), E(D2), E(D3) )
In the search phase,
Client → Server: PRF(Austin) and PRF’(Austin)
Server → Client: E(D1), E(D3), E(D5), TagA
The client accepts if
TagA = MAC( PRF(Austin), E(D1), E(D3), E(D5) )
Theorem
• The above verifiable SSE scheme satisfies privacy and reliability if E is CPA-secure, PRF and PRF’ are pseudorandom functions and MAC is unforgeable.
Now suppose that
The client wants to modify D1 to D′1
D1 contains Austin and Washington
Therefore in the update phase
the client must update E(D1), TagA and TagW
We want to do this more efficiently
• In the proposed scheme,
• we break this part (PRF(Austin), E(D1), E(D3), E(D5))
down to (PRF(Austin), 1,3,5), (1, E(D1)), …, (5, E(D5))
The client authenticates
• each piece separately:
(PRF(Austin), 1,3,5), (1, E(D1)), …, (5, E(D5))
The last problem is
• how to timestamp these (1, E(D1)), …, (5, E(D5))
Remember that the client wants to update files.
We can solve this problem by
• using any authentication scheme which has the timestamp functionality
such as
– Merkle hash tree
– Authenticated skip list
– RSA accumulator (in this talk)
Let
x1 = H(1, E(D1)), x2 = H(2, E(D2)), x3 = H(3, E(D3)), x4 = H(4, E(D4)), x5 = H(5, E(D5))
For simplicity, suppose that x1 ~ x5 are primes. Then the client computes
A = g^(x1·x2·x3·x4·x5) mod N (N = pq)
and keeps A.
In the search phase
Client → Server: PRF(Austin) and PRF’(Austin)
Server → Client: (1, E(D1)), (3, E(D3)), (5, E(D5)), Tag1 = MAC(PRF(Austin), 1,3,5), y = g^(x2·x4) mod N
The client verifies that
Tag1 = MAC(PRF(Austin), 1,3,5)
A = y^(x1·x3·x5) mod N ( = g^(x1 ⋯ x5) mod N )
In the update phase,
• To modify D1 to D1’
• the client sends only (1, E(D1’)) to the server.
He then updates A to
A’ = g^(x1’·x2·x3·x4·x5) mod N (N = pq)
• where x1’ = H(1, E(D1’))
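The accumulator check and the update can be tried out with toy numbers. The sketch below is not secure: N = 1009 · 1013 is far too small, and the primes x1, …, x5 are fixed stand-ins for the hash values H(i, E(Di)); the helper names are hypothetical. It only illustrates the arithmetic A = y^(x1·x3·x5) mod N.

```python
from math import prod

P, Q = 1009, 1013          # toy primes (assumption; real moduli are much larger)
N = P * Q
G = 2                      # base g, coprime to N

# Stand-ins for x_i = H(i, E(D_i)), assumed prime as on the slide.
X = {1: 3, 2: 5, 3: 7, 4: 11, 5: 13}

def accumulate(xs) -> int:
    """A = g^(product of all x_i) mod N, kept by the client."""
    return pow(G, prod(xs), N)

def witness(x_by_index: dict, returned: set) -> int:
    """Server side: y = g^(product of the x_j NOT returned) mod N."""
    return pow(G, prod(x for i, x in x_by_index.items() if i not in returned), N)

def verify(acc: int, y: int, xs_returned) -> bool:
    """Client side: accept if y^(product of returned x_i) mod N equals A."""
    return pow(y, prod(xs_returned), N) == acc

A = accumulate(X.values())
y = witness(X, {1, 3, 5})                       # search result for Austin
OK = verify(A, y, [X[1], X[3], X[5]])

# Update phase: D1 is modified, so x1 changes and the client recomputes A.
X_NEW = dict(X)
X_NEW[1] = 17                                   # stand-in for x1' = H(1, E(D1'))
A_NEW = accumulate(X_NEW.values())

# A server that returns the old E(D1) (old x1) no longer passes.
STALE_REJECTED = not verify(A_NEW, y, [X[1], X[3], X[5]])
FRESH_ACCEPTED = verify(A_NEW, witness(X_NEW, {1, 3, 5}), [X_NEW[1], X_NEW[3], X_NEW[5]])
print(OK, STALE_REJECTED, FRESH_ACCEPTED)
```

Because the client keeps only the current value of A, an answer built from an outdated file fails the check, which is the timestamp functionality asked for above.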
To delete D1
• Modify D1 to D1’=delete.
How to add files
• Please see the paper.
We defined the UC security
• of verifiable dynamic SSE schemes
We then proved that
• The proposed scheme is UC-secure against non-adaptive adversaries
• under the strong RSA assumption if
– E is CPA-secure
– PRF and PRF’ are pseudorandom functions
– H is a collision-resistant hash function
Finally
• Garbled Searchable Symmetric Encryption Kaoru Kurosawa (FC 2014)
So far,
• I have talked about single keyword search SSE schemes.
Next
• I will talk about multiple keyword search SSE schemes.
Golle, Staddon and Waters (2004)
• showed a multiple keyword SSE scheme which has keyword fields.
• A client can specify at most one keyword in each keyword field.
    From       To         Subject
D1  Keyword 1  Keyword 2  Keyword 4
D2  Keyword 2  Keyword 1  Keyword 5
D3  Keyword 3  Keyword 2  Keyword 6
In such a scheme, however,
• It’s hard to retrieve files which contain both Alice and Bob somewhere in the keyword fields
    From       To         Subject
D1  Alice      Bob        Keyword 4
D2  Bob        Keyword 5  Alice
D3  Keyword 3  Keyword 2  Keyword 6
Wang et al. (2008)
• showed a keyword field free SSE scheme
• But it works only for AND search.
Cash et al. (CRYPTO 2013)
• showed a keyword field free SSE scheme
• which can support any search formula (in the random oracle model).
However,
• the search formula is revealed to the server and
• the search phase requires 2 rounds.
             Search formula  Search phase  Search formula secrecy
Wang et al.  only AND        1 round       No
Cash et al.  Any             2 rounds      No
At FC 2014,
• I showed an SSE scheme such that even the search formula is kept secret.
Also,
• it can support any search formula and
• the search phase requires only 1 round.
             Search formula  Search phase  Search formula secrecy
Wang et al.  only AND        1 round       No
Cash et al.  Any             2 rounds      No
Proposed     Any             1 round       Yes
The proposed SSE scheme
• is based on Yao’s garbled circuit.
Yao (1982) constructed
• a secure two-party protocol by using
• a garbled circuit and an oblivious transfer.
(Figure: Alice with input x and Bob with input y run GC + OT and obtain f(x,y).)
Since then,
garbled circuits have found many applications:
• multi-party secure protocols,
• one-time programs,
• KDM-security,
• verifiable computation,
• homomorphic computations
• and others.
The proposed scheme
• is the first application of garbled circuits to SSE
A garbled circuit of f
• is an encoding garble(f) such that
• one can compute f(X) from garble(f) and label(X) without learning anything about f and X.
garble(f), label(X) → f(X)
However, if
• garble(f) or label(X) is reused, then some information on (f, X) is leaked.
garble(f), label(X) → f(X)
Recently
• Goldwasser et al. constructed a scheme such that garble(f) can be reused
• I constructed a scheme such that label(X) can be reused and applied it to multiple keyword SSE
High level overview of the proposed scheme
Consider this example.
        keywords
files   w1  w2  w3
D1      1   1   1
D2      1   0   0
Let
D1  (1 1 1) = X1
D2  (1 0 0) = X2
The client computes
D1  label(X1)
D2  label(X2)
The client also computes
        PRF(w1) PRF(w2) PRF(w3)
E(D1)   label(X1)
E(D2)   label(X2)
and sends this table to the server.
In the 1st search phase,
• Suppose that the client wants to search on f(w1,w2,w3) = w1 ⋀ w2 ⋀ w3
• He computes the garbled circuits of f: Γ1 for D1 and Γ2 for D2.
The client sends
PRF(w1), …, PRF(w3), Γ1, Γ2, counter=1
The server has this table
        PRF(w1) PRF(w2) PRF(w3)
E(D1)   label(X1)
E(D2)   label(X2)
The server computes f(X1) = 1 from counter=1, label(X1) and the garbled circuit Γ1
Similarly she computes f(X2) = 0 from counter=1, label(X2) and the garbled circuit Γ2
Since f(X1)=1 and f(X2)=0,
the server returns E(D1)
In the 2nd search phase,
• Suppose that the client wants to search on g(w1,w2,w3) = w1 ⋁ w2 ⋁ w3
• He computes the garbled circuits of g: Δ1 for D1 and Δ2 for D2.
The client sends
PRF(w1), …, PRF(w3), Δ1, Δ2, counter=2
The server computes g(X1)=g(X2)=1,
and returns E(D1), E(D2)
Note that
• label(X1) is reused for Γ1 and Δ1: f(X1)=1, g(X1)=1
and
• label(X2) is reused for Γ2 and Δ2: f(X2)=0, g(X2)=1
More details
Bellare et al. (2012) defined garbling schemes with input-circuit privacy.
Kurosawa (2014) extended them to extended garbling schemes with label reusable privacy.
The difference is that
• counter is included
• in the extended GC generation algorithm (eGC.gen) and
• in the extended GC evaluation algorithm (eGC.eval)
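This is a Boolean circuit f
(Figure: a circuit with wires 1, 2, 3, 4 and gates XOR, AND, OR.)
This is the topological circuit f-
(Figure: the same circuit with wires 1, 2, 3, 4 and the gate types removed.)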
Label.gen algorithm chooses
• 2 random strings (vi^0, vi^1) for each wire i
• such that the lsbs are different:
• lsb(vi^0) ≠ lsb(vi^1)
(Figure: wires 1 to 4 of the circuit carry the pairs (v1^0, v1^1), (v2^0, v2^1), (v3^0, v3^1), (v4^0, v4^1).)
label(0000) is the vector (v1^0, v2^0, v3^0, v4^0).
label(1111) is the vector (v1^1, v2^1, v3^1, v4^1).
eGC.gen algorithm takes
• counter,
• a boolean circuit f and
• all the strings (v1^0, v1^1), …, (v4^0, v4^1)
and outputs a garbled circuit Γ
eGC.eval algorithm takes
• counter,
• the topological circuit f-,
• label(0011) = (v1^0, v2^0, v3^1, v4^1), for example, and
• the GC Γ
and outputs f(0,0,1,1)
Label reusable privacy (informal)
• Even if label(x1, …, xn) = (v1^x1, …, vn^xn) is reused for multiple garbled circuits Γ1, Γ2, …,
• no information on (x1, …, xn) and (f1, f2, …) is leaked, where Γi is a garbled circuit of fi
Our construction
• of the extended garbling scheme which satisfies label reusable privacy is the same as the usual construction of the garbling scheme, except that counter is included in the hash function H.
For simplicity, consider f(x1,x2)
Each input wire has two labels: (v1^0, v1^1) for x1 and (v2^0, v2^1) for x2
eGC.gen algorithm computes
• y00 = H(counter, v1^0, v2^0) ⊕ f(0,0)
• y01 = H(counter, v1^0, v2^1) ⊕ f(0,1)
• y10 = H(counter, v1^1, v2^0) ⊕ f(1,0)
• y11 = H(counter, v1^1, v2^1) ⊕ f(1,1)
Note that
the part H(counter, ·, ·) works as a one-time pad
Roughly speaking,
• the garbled circuit Γ is a random permutation of (y00, …, y11).
More precisely
Construct this table
lsb(v1^0) lsb(v2^0) y00
lsb(v1^0) lsb(v2^1) y01
lsb(v1^1) lsb(v2^0) y10
lsb(v1^1) lsb(v2^1) y11
If lsb(v1^0)=0,
then the 1st column is
0 lsb(v2^0) y00
0 lsb(v2^1) y01
1 lsb(v2^0) y10
1 lsb(v2^1) y11
If lsb(v2^0)=1
0 1 y00
0 0 y01
1 1 y10
1 0 y11
then the 2nd column is
Then permute the rows in such a way that (00) ~ (11) appear here
0 0 y01
0 1 y00
1 0 y11
1 1 y10
The garbled circuit Γ is these 4 bits
eGC.eval algorithm takes
• counter,
• label(11) = (v1^1, v2^1) and
• the garbled circuit Γ:
00  y01 = H(counter, v1^0, v2^1) ⊕ f(0,1)
01  y00 = H(counter, v1^0, v2^0) ⊕ f(0,0)
10  y11 = H(counter, v1^1, v2^1) ⊕ f(1,1)
11  y10 = H(counter, v1^1, v2^0) ⊕ f(1,0)
Since lsb(v1^1) = 1 and lsb(v2^1) = 0,
look at the 3rd row of Γ
Then we can compute f(1,1) = y11 ⊕ H(counter, v1^1, v2^1) from the given inputs
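The single-gate construction above is small enough to run. In the sketch below, the random oracle H is modelled by one bit of SHA-256 and labels are 16-byte strings; both are assumptions for illustration, and the function names `label_gen`, `egc_gen` and `egc_eval` simply mirror the names on the slides. The same labels are reused for an AND gate with counter 1 and an OR gate with counter 2.

```python
import hashlib
import secrets

def H(counter: int, v1: bytes, v2: bytes) -> int:
    """One-bit hash of (counter, v1, v2); SHA-256 stands in for the random oracle."""
    data = counter.to_bytes(8, "big") + v1 + v2
    return hashlib.sha256(data).digest()[0] & 1

def lsb(v: bytes) -> int:
    return v[-1] & 1

def label_gen() -> tuple:
    """Two random strings (v^0, v^1) for one wire, with different lsbs."""
    v0 = secrets.token_bytes(16)
    v1 = bytearray(secrets.token_bytes(16))
    v1[-1] = (v1[-1] & 0xFE) | (lsb(v0) ^ 1)
    return v0, bytes(v1)

def egc_gen(counter: int, f, w1: tuple, w2: tuple) -> list:
    """Garble one gate: 4 bits, with the row position given by the lsbs of the labels."""
    gamma = [0] * 4
    for a in (0, 1):
        for b in (0, 1):
            y = H(counter, w1[a], w2[b]) ^ f(a, b)
            gamma[2 * lsb(w1[a]) + lsb(w2[b])] = y
    return gamma

def egc_eval(counter: int, gamma: list, v1: bytes, v2: bytes) -> int:
    """Pick the row indicated by the lsbs and remove the one-time pad."""
    return gamma[2 * lsb(v1) + lsb(v2)] ^ H(counter, v1, v2)

W1, W2 = label_gen(), label_gen()
GAMMA = egc_gen(1, lambda a, b: a & b, W1, W2)   # f = AND, counter = 1
DELTA = egc_gen(2, lambda a, b: a | b, W1, W2)   # g = OR,  counter = 2

# The same labels are used with both garbled gates.
RESULTS = [
    (egc_eval(1, GAMMA, W1[a], W2[b]), egc_eval(2, DELTA, W1[a], W2[b]))
    for a in (0, 1) for b in (0, 1)
]
print(RESULTS)
```

Because counter enters H, the pads used for Γ and for Δ are derived from different hash inputs even though the labels are identical.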
Theorem
• The above construction satisfies label reusable privacy in the random oracle model
How to Apply Extended Garbling Scheme to Multiple Keyword SSE
Consider this example
    w1     w2     w3
D1  e11=1  e12=1  e13=1
D2  e21=1  e22=0  e23=0
The client computes
v11^0 = AESk(1,1,0)
v11^1 = AESk(1,1,1)
Since e11=1, let
v11 = v11^1 = AESk(1,1,1)
In this way,
the client computes each entry of this table.
    w1         w2         w3
D1  v11=v11^1  v12=v12^1  v13=v13^1
D2  v21=v21^1  v22=v22^0  v23=v23^0
Let
D1  (v11 v12 v13) = label(X1)
D2  (v21 v22 v23) = label(X2)
Namely for D1,
The client generates these strings by using AES
(v11^0, v11^1), (v12^0, v12^1), (v13^0, v13^1)
and chooses each element of label(X1) from them according to X1 = (111):
label(X1) = (v11, v12, v13)
Similarly for D2,
The client generates these strings by using AES
(v21^0, v21^1), (v22^0, v22^1), (v23^0, v23^1)
and chooses each element of label(X2) from them according to X2 = (100):
label(X2) = (v21, v22, v23)
The client further computes
        PRF(w1) PRF(w2) PRF(w3)
E(D1)   label(X1)
E(D2)   label(X2)
and sends this table to the server.
After the store phase,
• The client keeps only the secret keys of AES, E, PRF and PRF’.
• He remembers nothing other than these.
In the search phase,
• Suppose that the client searches on f(w1,w2,w3) = w1 ⋀ w2 ⋀ w3
For D1,
• the client re-generates these strings (v11^0, v11^1), …, (v13^0, v13^1)
by using AES in the same way as in the store phase.
Then the client runs eGC.gen on input
• f(w1,w2,w3) = w1 ⋀ w2 ⋀ w3,
• counter and
• (v11^0, v11^1), …, (v13^0, v13^1)
and computes the garbled circuit Γ1
For D2,
• The client computes the garbled circuit Γ2 similarly
The client sends
PRF(w1), …, PRF(w3), Γ1, Γ2, the topological circuit f- and counter
The server has this table
        PRF(w1) PRF(w2) PRF(w3)
E(D1)   label(X1)
E(D2)   label(X2)
The server runs eGC.eval on input
• label(X1),
• the garbled circuit Γ1,
• the topological circuit f- and
• counter
and computes z1 = f(X1)
The server returns E(D1) if z1=1
The same for D2
Theorem
In the proposed scheme, if the underlying extended garbling scheme satisfies label reusable privacy,
then only the following information is leaked to the server (other than the minimum leakage)
• The topological circuit f-
• (π(j1), …, π(jc)), where π is a random permutation and {wj1, …, wjc} are the queried keywords
In the scheme of Cash et al. (2013)
If "Japan AND Crypto" is searched, the following information is leaked to the server
• the search formula = AND
• the search result of Japan or that of Crypto
• and some more information (see Sec. 5.3 of their paper)
Communication overhead of the proposed scheme
• Let m = # of files, c = # of search keywords, s = # of gates of f
• In the search phase, the com. overhead is |counter| + (c + 4m(s-1))×128 + 4m bits
If # of search keywords is 2
• The communication overhead is |counter|+256+ 4× ( # of files ) bits
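The overhead formula can be evaluated directly. The sketch below only restates the formula from the slide; the 32-bit counter and the 1000 files are hypothetical example values. Setting c = 2 and s = 1 reproduces the special case |counter| + 256 + 4 × (# of files), which suggests (as an assumption) that the two-keyword case on the slide refers to a formula with a single gate.

```python
def search_overhead_bits(counter_bits: int, m: int, c: int, s: int) -> int:
    """|counter| + (c + 4m(s-1)) * 128 + 4m bits, as stated on the slide.

    m = number of files, c = number of search keywords, s = number of gates of f.
    """
    return counter_bits + (c + 4 * m * (s - 1)) * 128 + 4 * m

# Hypothetical example: 32-bit counter, 1000 files, 2 keywords, 1 gate.
TWO_KEYWORDS = search_overhead_bits(32, 1000, 2, 1)
# Same setting with 3 keywords and 2 gates (e.g. w1 AND w2 AND w3 as two AND gates).
THREE_KEYWORDS = search_overhead_bits(32, 1000, 3, 2)
print(TWO_KEYWORDS, THREE_KEYWORDS)
```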
Computer simulation
• We used the following computer: 2.4GHz CPU and 32G byte RAM, OS = CentOS 6.5, C++ and NTL library
• The total # of keywords is 20.• We generated Index randomly
(Figure: the running time of the client in the search phase.)
(Figure: the running time of the server in the search phase.)
In the proposed SSE scheme,
                           Search formula  Search phase  Search formula secrecy
Wang et al. (2008)         Only AND        1 round       ---
Cash et al. (CRYPTO 2013)  Any             2 rounds      leaked
Kurosawa (FC 2014)         Any             1 round       secret
Summary
• UC-Secure Searchable Symmetric Encryption
• How to Update Documents Verifiably in Searchable Symmetric Encryption
• Garbled Searchable Symmetric Encryption
Thank you !