Lecturer: Moni Naor
Joint work with Cynthia Dwork

Foundations of Privacy: Informal Lecture

Impossibility of Disclosure Prevention, or The Case for Differential Privacy
Let’s Talk About Sex
Better Privacy Means Better Data
Private Data Analysis

• Simple Counts and Correlations
  – Was there a significant rise in asthma emergency-room cases this month?
  – What is the correlation between new HIV infections and crystal meth usage?
• Holistic Statistics
  – Are the data inherently low-dimensional?
    • Collaborative filtering for movie recommendations
• Beyond Statistics: Private Data Analysis
  – How far is the (proprietary) network from bipartite?

…while preserving privacy of individuals
Different from SFE

Secure Function Evaluation:
• Participants collaboratively compute a function f of their private inputs
  – E.g., f = sum(a, b, c, …)
  – Each player learns only what can be deduced from the output of f and her own input
• Miracle of Modern Science!
• SFE does not imply privacy!
  – Privacy is only ensured “modulo f”
    • if f(a, b) and a yield b, so be it.
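As an illustration of the sum example, here is a toy additive-sharing computation of f = sum(a, b, c). This is a sketch of the functionality only (the modulus and share layout are illustrative choices, not the lecture's protocol, and no malicious behavior is handled):

```python
# Toy additive secret sharing for f = sum(a, b, c): each party splits its
# input into random shares that sum to the input; after exchanging shares,
# only the total is revealed -- each player learns nothing beyond f's output
# and her own input.
import random

P = 2**61 - 1  # work modulo a prime (illustrative choice)

def share(x, n):
    """Split x into n random shares summing to x mod P."""
    parts = [random.randrange(P) for _ in range(n - 1)]
    return parts + [(x - sum(parts)) % P]

inputs = [3, 14, 25]
n = len(inputs)
shares = [share(x, n) for x in inputs]                 # party i deals row i
column_sums = [sum(col) % P for col in zip(*shares)]   # each party sums what it holds
print(sum(column_sums) % P)  # → 42, the sum, and nothing else about a, b, c
```

Note the caveat from the slide: even a perfect protocol for sum lets two colluding parties deduce the third input from the output.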
Cryptographic Rigor Applied to Privacy
• Define a Break of the System
  – What is a “win” for the adversary?
  – May settle for partial information
• Specify the Power of the Adversary
  – Computational power? “Auxiliary” information?
• Conservative/Paranoid by Nature
  – All breaks are forever
  – Protect against all feasible attacks
Dalenius, 1977

• Anything that can be learned about a respondent from the statistical database can be learned without access to the database
  – Captures the possibility that “I” may be an extrovert
  – The database doesn’t leak personal information
  – Adversary is a user
• Analogous to semantic security for crypto [Goldwasser-Micali 1982]
  – Anything that can be learned from the ciphertext can be learned without the ciphertext
  – Adversary is an eavesdropper
Outline
• The Framework
• A General Impossibility Result
  – Dalenius’ goal cannot be achieved
• The Proof
  – Simplified
  – General case
Two Models
(Figure: Database → San → Sanitized Database)

Non-Interactive: data are sanitized and released
Two Models
(Figure: a user poses queries “?” to San, which answers from the Database)

Interactive: multiple queries, adaptively chosen
Auxiliary Information
Common theme in many privacy horror stories:
• Not taking side information into account
  – Netflix challenge: not taking IMDb into account [Narayanan-Shmatikov]
Not learning from DB

With access to the database vs. without access to the database:

(Figure: left, adversary A with auxiliary information interacts with San(DB); right, simulator A′ has only the auxiliary information)

There is some utility of DB that a legitimate user should be able to learn
• Possible breach of privacy
• Goal: users learn the utility without the breach
Not learning from DB

(Figure: as on the previous slide, A interacts with San(DB) while A′ sees only the auxiliary information)

Want: anything that can be learned about an individual from the statistical database can be learned without access to the database

• ∀ D, ∀ A, ∃ A′ such that, with high probability over DB ∈R D and for all auxiliary information z:
  |Pr[A(z) ↔ DB wins] − Pr[A′(z) wins]| is small
Illustrative Example for Impossibility

Want: anything that can be learned about a respondent from the statistical database can be learned without access to the database

• More formally: ∀ D, ∀ A, ∃ A′ such that, with high probability over DB ∈R D and for all auxiliary information z:
  |Pr[A(z) ↔ DB wins] − Pr[A′(z) wins]| is small

Example:
• Aux z = “Kobi Oz is 10 cm shorter than average in DB”
  – A learns the average height in DB, hence also Kobi’s height
  – A′ does not
• Impossibility requires utility
  – The mechanism must convey information about DB
    • not predictable by someone without access
  – “Hint generator” and A share a secret, unknown to A′
Defining “Win”: The Compromise Function

Notion of a privacy compromise:

(Figure: the adversary Adv, for DB ∈ D, outputs a candidate breach y; the compromise decider C outputs 0/1)

A privacy compromise should be non-trivial:
• It should not be possible to find a privacy breach from the auxiliary information alone

A privacy breach should exist:
• Given DB, there should be a y that is a privacy breach
• It should be possible to find such a y
Additional Basic Concepts

• Distribution on (finite) databases D
  – Something about the database must be unknown
  – Captures knowledge about the domain
    • E.g., rows of the database correspond to owners of 2 pets
• Privacy mechanism San(D, DB)
  – Can be interactive or non-interactive
  – May have access to the distribution D
• Auxiliary information generator AuxGen(D, DB)
  – Has access to the distribution and to DB
  – Formalizes partial knowledge about DB
• Utility vector w
  – Answers to k questions about the DB
  – (Most of) the utility vector can be learned by the user
  – Utility: must inherit sufficient min-entropy from the source D
Impossibility Theorem

Fix any useful* privacy mechanism San and any reasonable privacy-compromise decider C. Then there is an auxiliary information generator AuxGen and an adversary A such that for “all” distributions D and all adversary simulators A′:

Pr[A(D, San(D, DB), AuxGen(D, DB)) wins] − Pr[A′(D, AuxGen(D, DB)) wins] ≥ Δ

for a suitable, large, Δ. The probability spaces are over the choice of DB ∈R D and the coin flips of San, AuxGen, A, and A′.

* “Useful”: tells us information we did not know. To completely specify the theorem we need an assumption on the entropy of the utility vector.
Strategy

• The auxiliary information generator will provide a hint that, together with the utility vector w, yields the privacy breach
• Want AuxGen to work without knowing D, just DB
  – Find a privacy breach y and encode it in z
  – Make sure z alone does not give y; only together with w
• Complication: is the utility vector w
  – completely learned by the user?
  – or just an approximation?
Entropy of Random Sources

• Source:
  – A probability distribution X on {0,1}^n
  – Contains some “randomness”
• Measures of “randomness”:
  – Shannon entropy: H(X) = −∑_x P_X(x) log P_X(x)
    Represents how much we can compress X on average.
    But even a high-entropy source may have a point with probability 0.9.
  – Min-entropy: Hmin(X) = −log max_x P_X(x)
    Determined by the most likely value of X.
Min-entropy

• Definition: X is a k-source if Hmin(X) ≥ k,
  i.e., Pr[X = x] ≤ 2^−k for all x
• Examples:
  – Bit-fixing: some k coordinates of X uniform, the rest fixed
    • or even depending arbitrarily on the others
  – Unpredictable source: ∀ i ∈ [n] and b_1, …, b_{i−1} ∈ {0,1}:
    k/n ≤ Pr[X_i = 1 | X_1, …, X_{i−1} = b_1, …, b_{i−1}] ≤ 1 − k/n
  – Flat k-source: uniform over S ⊆ {0,1}^n with |S| = 2^k
• Fact: every k-source is a convex combination of flat ones
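The gap between the two measures on the previous slide's “point with probability 0.9” remark can be checked directly; a minimal sketch (the distribution `heavy` is an illustrative choice):

```python
import math

def shannon_entropy(p):
    """H(X) = -sum_x p(x) log2 p(x)."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def min_entropy(p):
    """Hmin(X) = -log2 max_x p(x)."""
    return -math.log2(max(p))

# Respectable Shannon entropy, yet one point carries probability 0.9,
# so the min-entropy is tiny: the source is far from a good k-source.
heavy = [0.9] + [0.1 / 1024] * 1024
print(shannon_entropy(heavy))  # over 1.4 bits
print(min_entropy(heavy))      # -log2(0.9), about 0.15 bits
```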
Extractors

Universal procedure for “purifying” an imperfect source.

Definition: a function Ext: {0,1}^n × {0,1}^d → {0,1}^m is a (k, ε)-extractor if:
∀ k-sources X, Ext(X, U_d) is ε-close to U_m

(Figure: a k-source x of length n and a d-bit random “seed” s enter EXT, which outputs m almost-uniform bits)
Strong extractors

Output looks random even after seeing the seed.

Definition: Ext is a (k, ε) strong extractor if Ext′(x, s) = s ∘ Ext(x, s) is a (k, ε)-extractor

• i.e., ∀ k-sources X, for a 1 − ε′ fraction of s ∈ {0,1}^d,
  Ext(X, s) is ε-close to U_m
Extractors from Hash Functions

• Leftover Hash Lemma [ILL89]: universal (pairwise-independent) hash functions yield strong extractors
  – output length: m = k − O(1)
  – seed length: d = O(n)
  – Example: Ext(x, (a, b)) = first m bits of a·x + b in GF(2^n)
• Almost pairwise independence [SZ94, GW94]:
  – seed length: d = O(log n + k)
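A minimal sketch of the hash family Ext(x, (a, b)) = first m bits of a·x + b, instantiated in GF(2^8) so the field arithmetic stays short (a real extractor works in GF(2^n) for large n; the reduction polynomial 0x11B is the AES choice, an assumption made here for concreteness):

```python
def gf256_mul(a, b):
    """Carry-less multiplication in GF(2^8), reduced mod x^8+x^4+x^3+x+1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result

def ext(x, seed, m):
    """Ext(x, (a, b)) = first m bits of a*x + b; addition in GF(2^8) is XOR."""
    a, b = seed
    y = gf256_mul(a, x) ^ b
    return y >> (8 - m)  # keep the m most significant bits

print(ext(0x53, (0x02, 0x0F), 4))  # → 10
```

Pairwise independence holds because for any x ≠ x′ the pair (a·x + b, a·x′ + b) is uniform over the random seed (a, b), which is exactly what the Leftover Hash Lemma needs.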
Suppose w Learned Completely

AuxGen and A share a secret: w

AuxGen(DB):
• Find a privacy breach y of DB
• Find w from DB
  – simulate A
• Choose s ∈R {0,1}^d and compute Ext(w, s)
• Set z = (s, Ext(w, s) ⊕ y)

(Figure: AuxGen reads DB and outputs z; A interacts with San(DB) to learn w; the compromise decider C outputs 0/1)
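The hint construction can be sketched as follows, with seeded SHA-256 standing in for the strong extractor Ext (an assumption for illustration: the lecture's Ext is information-theoretic, a hash is only a heuristic stand-in):

```python
# AuxGen's hint when w is learned exactly: z = (s, Ext(w, s) XOR y).
# z alone is a one-time-pad-style blob; together with w it yields the breach y.
import hashlib
import os

def ext(w, s, nbytes):
    """Stand-in for a strong extractor: hash of seed || source."""
    return hashlib.sha256(s + w).digest()[:nbytes]

def auxgen(w, y):
    """Encode breach y so that it is useless without the utility vector w."""
    s = os.urandom(16)
    pad = ext(w, s, len(y))
    return s, bytes(a ^ b for a, b in zip(pad, y))

def adversary(z, w):
    """A learned w from San, so it can strip the pad and recover y."""
    s, c = z
    pad = ext(w, s, len(c))
    return bytes(a ^ b for a, b in zip(pad, c))

w = b"utility vector learned from San"
y = b"secret breach"
assert adversary(auxgen(w, y), w) == y
```

The simulator A′, lacking w, sees (s, pad ⊕ y) where the pad is (close to) uniform, so z reveals essentially nothing about y.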
Suppose w Learned Completely

AuxGen and A share a secret: w

z = (s, Ext(w, s) ⊕ y)

(Figure: left, A interacts with San(DB), learns w, and receives z; right, A′ receives only z; C decides 0/1 in each case)

Technical conditions: Hmin(W | y) ≥ |y|, and |y| “safe”
Why is it a compromise?

AuxGen and A share a secret: w

Why A′ does not learn y:
• For each possible value of y, (s, Ext(w, s)) is ε-close to uniform
• Hence (s, Ext(w, s) ⊕ y) is ε-close to uniform

(Figure: as before, A interacts with San(DB) and receives z = (s, Ext(w, s) ⊕ y); C decides 0/1)

Technical conditions: Hmin(W | y) ≥ |y|, and |y| “safe”
w Need Not be Learned Completely

Relaxed utility: something close to w is learned.
• AuxGen(D, DB) does not know exactly what A will learn
• Need that something close to w produces the same extracted randomness as w
• Ordinary extractors offer no such guarantee

Fuzzy extractors (m, ℓ, t, ε): a pair (Gen, Rec)   [Dodis, Reyzin, and Smith]
• Gen(w) outputs an extracted string r ∈ {0,1}^ℓ and a public string p.
  For any distribution W with min-entropy at least m:
  (R, P) ← Gen(W) ⇒ (R, P) and (U_ℓ, P) are within statistical distance ε
• Rec(w*, p): reconstructs r given p and any w* sufficiently close to w:
  (r, p) ← Gen(w) and ‖w − w*‖_0 ≤ t ⇒ Rec(w*, p) = r
Construction Based on ECC

• Error-correcting code ECC:
  – any two codewords differ in at least 2t bits
• Gen(w): p = w ⊕ ECC(r′)
  – where r′ is random
  – r is extracted from r′
• Given p and w′ close to w:
  – compute w′ ⊕ p, a string within distance t of ECC(r′)
  – decode to get ECC(r′), hence r′
  – r is extracted from r′

(Figure: codewords at mutual distance ≥ 2t; p = w ⊕ ECC(r′) aligns w with the codeword, and w′ lands within decoding distance)
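A minimal sketch of this Gen/Rec pair, using a 5× repetition code as the ECC (an illustrative assumption; also, here r is r′ itself, whereas the lecture further extracts r from r′):

```python
# Fuzzy-extractor sketch over bit vectors: the 5x repetition code corrects
# up to t = 2 flips per block; real constructions use stronger codes.
import secrets

REP = 5

def ecc_encode(bits):
    return [b for b in bits for _ in range(REP)]

def ecc_decode(bits):
    """Majority-decode each block of REP bits."""
    return [int(sum(bits[i:i + REP]) > REP // 2) for i in range(0, len(bits), REP)]

def gen(w):
    """Gen(w): pick random r', publish p = w XOR ECC(r')."""
    r = [secrets.randbelow(2) for _ in range(len(w) // REP)]
    p = [a ^ b for a, b in zip(w, ecc_encode(r))]
    return r, p

def rec(w_star, p):
    """Rec(w*, p): w* XOR p is a noisy codeword; decode to recover r."""
    return ecc_decode([a ^ b for a, b in zip(w_star, p)])

w = [secrets.randbelow(2) for _ in range(25)]
r, p = gen(w)
w_star = list(w)
w_star[0] ^= 1; w_star[2] ^= 1; w_star[7] ^= 1   # at most t flips per block
assert rec(w_star, p) == r
```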
w Need Not be Learned Completely

Fuzzy extractors (m, ℓ, t, ε): a pair (Gen, Rec)
• Gen(w) outputs an extracted string r ∈ {0,1}^ℓ and a public string p.
  For any distribution W of sufficient min-entropy:
  (R, P) ← Gen(W) ⇒ (R, P) and (U_ℓ, P) are within statistical distance ε
• Rec: reconstructs r given p and any w* sufficiently close to w:
  (r, p) ← Gen(w) and ‖w − w*‖_0 ≤ t ⇒ Rec(w*, p) = r

Idea: (r, p) ← Gen(w); set z = (p, r ⊕ y)
• A reconstructs r from w* close to w
• r looks almost uniform to A′, even given p

Problem: p leaks information about w, and might itself disclose a privacy breach y′

Solution: AuxGen interacts with San(DB) to learn a safe w′; (r, p) ← Gen(w′); set z = (p, r ⊕ y)
• w′′ (learned by A) and w′ are both sufficiently close to w
  ⇒ w′ and w′′ are close to each other ⇒ A(w′′, p) can reconstruct r
• By assumption, w′ should not yield a breach!
• Let γ be the bound on the probability of a breach
w Need Not be Learned Completely

AuxGen and A share a secret: r

(Figure: AuxGen interacts with San(DB) to learn w′, computes (r, p) = Gen(w′), and sends z = (p, r ⊕ y); left, A learns w′′ from San(DB) and recovers r = Rec(w′′, p); right, A′ receives only z; C decides 0/1)

• r is almost uniform, even given p
• p should not be disclosive
w Need Not be Learned Completely

Pr[A′(z) wins] ≤ Pr[A ↔ San(D, DB) wins] + ε ≤ γ + ε

(Figure: as before; AuxGen learns w′ from San(DB), sets (r, p) = Gen(w′) and z = (p, r ⊕ y); A learns w′′ and recovers r = Rec(w′′, p), while A′ has only z)

• r is almost uniform, even given p
• p should not be disclosive
w Need Not be Learned Completely

Need extra min-entropy: Hmin(W | y) ≥ ℓ + |p|

Pr[A′(z) wins] ≤ Pr[A ↔ San(D, DB) wins] + ε ≤ γ + ε

(Figure: as before; (r, p) = Gen(w′), z = (p, r ⊕ y), and A recovers r = Rec(w′′, p))
Two Remarkable Aspects

• Works even if Kobi Oz is not in the database!
  – Motivates a definition based on the increased risk incurred by joining the database
    • risk to Kobi if in the database vs. risk to Kobi if not in the DB
    • cf.: what can be learned about Kobi with vs. without DB access
• Dalenius’ goal is impossible, but semantic security is possible
  – Yet the definitions are similar
  – Resolved by utility: the adversary is a user
  – Without auxiliary information:
    • the user must learn something from the mechanism; the simulator learns nothing
    • the eavesdropper should learn nothing; the simulator learns nothing
  – What about SFE? There the comparison is to an ideal party

Differential Privacy
Possible answer: Differential Privacy

Is there a noticeable relative shift between K(DB − Me) and K(DB + Me)?
If not, then no perceptible risk is incurred by joining DB.
Anything the adversary can do, it could do without Me.

(Figure: two nearly identical curves of Pr[response] over the possible responses, one for DB − Me and one for DB + Me; a few bad responses are marked ✗)
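The "no noticeable relative shift" picture can be made concrete with the Laplace mechanism for counting queries (an assumption here, since the slide does not name a mechanism): adding Laplace(sensitivity/ε) noise makes the response densities on DB − Me and DB + Me differ pointwise by a factor of at most e^ε.

```python
# Laplace-mechanism sketch: neighboring databases give counts differing by 1,
# and the noisy-response densities are within an e^epsilon factor everywhere.
import math
import random

def laplace_density(x, mu, b):
    """Density of the Laplace(mu, b) distribution at x."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

def noisy_count(true_count, epsilon, sensitivity=1.0):
    """K(DB): release the count plus Laplace(sensitivity/epsilon) noise."""
    b = sensitivity / epsilon
    u = random.random() - 0.5
    return true_count - b * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

epsilon = 0.5
b = 1.0 / epsilon
# Counts on DB - Me and DB + Me: 100 vs. 101.
for x in [95.0, 100.0, 100.5, 107.3]:
    ratio = laplace_density(x, 100.0, b) / laplace_density(x, 101.0, b)
    assert math.exp(-epsilon) <= ratio <= math.exp(epsilon)
```

Because every response is almost equally likely with or without Me, observing the output gives the adversary essentially no leverage from Me's participation.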