An Introduction to Differential Privacy and its Applications¹

Ali Bagherzandi, Ph.D. Candidate
University of California at Irvine

1- Most slides in this presentation are from Cynthia Dwork's talk.
Goal of Differential Privacy: Privacy-Preserving Statistical Analysis of Confidential Data

Finding Statistical Correlations
- Analyzing medical data to learn genotype/phenotype associations
- Correlating a cough outbreak with a chemical plant malfunction; can't be done with HIPAA safe-harbor sanitized data

Noticing Events
- Detecting a spike in ER admissions for asthma

Official Statistics
- Census

Contingency Table Release (statistics for small sets of attributes)
- Model fitting, regression analyses, etc.

Standard Data-Mining Tasks
- Clustering; learning association rules, decision trees, separators; principal component analysis
Two Models:

Database → K → Sanitized Database

Non-Interactive: Data are sanitized and released.
Two Models:

Database → K ⇄ queries / answers

Interactive: Multiple queries, adaptively chosen.
Achievements for Statistical Databases

Defined Differential Privacy
- Proved the classical goal (Dalenius, 1977) unachievable
- "Ad omnia" definition; independent of linkage information

General Approach; Rigorous Proof
- Relates the degree of distortion to the (mathematical) sensitivity of the computation needed for the analysis: "how much" can the data of one person affect the outcome?
- Cottage industry: redesigning algorithms to be insensitive

Assorted Extensions
- When noise makes no sense; when actual sensitivity is much less than worst-case; when the database is distributed; …

Lower Bounds on Distortion
- Initiated by Dinur and Nissim
Semantic Security for Confidentiality [Goldwasser-Micali '82]

Vocabulary
- Plaintext: the message to be transmitted
- Ciphertext: the encryption of the plaintext
- Auxiliary information: anything else known to the attacker

The ciphertext leaks no information about the plaintext.

Formalization: Compare the ability of someone seeing aux and ciphertext to guess (anything about) the plaintext, to the ability of someone seeing only aux to do the same thing. The difference should be "tiny".
Semantic Security for Privacy?

Dalenius, 1977: Anything that can be learned about a respondent from the statistical database can be learned without access to the database.

Happily, this formalizes to semantic security.

Unhappily, it is unachievable [Dwork and Naor 2006], both for non-serious and serious reasons.
Semantic Security for Privacy?

The Serious Reason (told as a parable)
- The database teaches average heights of population subgroups.
- "Terry Tao is four inches taller than the average Swedish woman."
- Access to the DB teaches Terry's height: Terry's height is learnable from the DB, not learnable without it.
- The proof extends to "any" notion of privacy breach.

The Attack Works Even if Terry Is Not in the DB!
- Suggests a new notion of privacy: the risk incurred by joining the DB.
- Before/after interacting vs. risk when in / not in the DB.
Differential Privacy: Formal Definition

K gives ε-differential privacy if for all values of DB, DB' differing in a single element, and all S in Range(K):

    Pr[K(DB) ∈ S] / Pr[K(DB') ∈ S] ≤ e^ε ≈ (1 + ε)

(The ratio is bounded.)
Achieving Differential Privacy: f + noise

    K(f, DB) = f(DB) + [Noise]^d,   where f: DB → R^d

E.g., Count(P, DB) = # rows in DB with property P.
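The scheme K(f, DB) = f(DB) + noise can be sketched for the counting query above (a minimal illustration, not the slides' code; the function names, inverse-CDF sampler, and example data are mine):

```python
import math
import random

def count(db, has_property):
    """Count(P, DB): number of rows in DB with property P."""
    return sum(1 for row in db if has_property(row))

def laplace_noise(scale, rng=random):
    """One draw from Lap(scale) via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(db, has_property, scale):
    """K(Count, DB) = Count(P, DB) + Lap(scale)."""
    return count(db, has_property) + laplace_noise(scale)
```

How large `scale` must be is exactly the calibration question the next slides answer.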
Sensitivity of a Function

How much can f(DB + Me) exceed f(DB − Me)?

Recall: K(f, DB) = f(DB) + noise. The question asks: what difference must the noise obscure?

    Δf = max_{H(DB, DB') = 1} ||f(DB) − f(DB')||₁

E.g., ΔCount = 1.
Calibrate Noise to Sensitivity

    Δf = max_{H(DB, DB') = 1} ||f(DB) − f(DB')||₁

Lap(R): density f(x) = (1/2R) exp(−|x|/R)

Theorem: To achieve ε-differential privacy, use scaled symmetric noise [Lap(R)]^d with R = Δf/ε.

Increasing R flattens the curve: more privacy. The noise depends on f, not on the database.
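The theorem translates directly into code (a sketch under my own naming; the analyst must supply the sensitivity Δf):

```python
import math
import random

def laplace_sample(scale, rng=random):
    """Draw from Lap(scale): density (1/2R) exp(-|x|/R), R = scale."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(f, db, sensitivity, epsilon, rng=random):
    """Release f(db) + Lap(R) per coordinate, with R = Δf/ε as in the theorem."""
    value = f(db)
    scale = sensitivity / epsilon
    if isinstance(value, (int, float)):                        # d = 1
        return value + laplace_sample(scale, rng)
    return [v + laplace_sample(scale, rng) for v in value]     # d > 1
```

For a counting query, sensitivity=1 suffices (ΔCount = 1), so the noise scale is 1/ε regardless of the database size.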
Why does it work? (d = 1)

    Δf = max_{DB, DB−Me} |f(DB) − f(DB−Me)|

Theorem: To achieve ε-differential privacy, use scaled symmetric noise Lap(R) with R = Δf/ε.

    Pr[K(f, DB') = t] / Pr[K(f, DB) = t] = exp((|t − f(DB)| − |t − f(DB')|)/R) ≤ exp(Δf/R) = e^ε
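This ratio bound can be checked numerically (an illustrative sketch with made-up values f(DB) = 10 and f(DB') = 11, so the answers differ by exactly Δf = 1):

```python
import math

def lap_pdf(x, scale):
    """Density of Lap(scale): (1/2R) exp(-|x|/R)."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)

epsilon = 1.0
delta_f = 1.0
R = delta_f / epsilon          # theorem: R = Δf/ε
f_db, f_db_prime = 10.0, 11.0  # neighboring databases' answers, differing by Δf

# Worst-case ratio of output densities over a grid of outcomes t.
worst = max(lap_pdf(t - f_db_prime, R) / lap_pdf(t - f_db, R)
            for t in (i / 10.0 for i in range(-100, 300)))
# worst equals e^(Δf/R) = e^ε
```

The maximum is attained for outcomes t on the far side of f(DB'), where the two absolute values differ by the full Δf.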
Differential Privacy: Example

| ID | Claim ('000 $) | ID | Claim ('000 $) |
|----|-------|----|-------|
| 1  | 9.91  | 7  | 10.63 |
| 2  | 9.16  | 8  | 12.54 |
| 3  | 10.59 | 9  | 9.29  |
| 4  | 11.27 | 10 | 8.92  |
| 5  | 10.50 | 11 | 20.00 |
| 6  | 11.89 | 12 | 8.55  |
| …  | …     | …  | …     |

Query: Average claim
Aux input: a fraction C of the entries.
Without noise: the adversary can learn my claim with probability C.

What privacy level is enough for me?
- I feel safe if I'm hidden among n people: Pr[identify] = 1/n.
- If n = 100 and C = 0.01, in this example ε = 1.
- In most applications the same ε gives me a greater sense of privacy, e.g., the adversary's identification is itself probabilistic. Typical choices: ε = 0.1, 1.0, ln(2).
Differential Privacy: Example

(Claims table as above.)

Query: Average claim
Aux input: 12 entries
ε = 1: hide me among 1200 people.

Δf = max_{DB, DB−Me} |Avg(DB) − Avg(DB−Me)| = 11.10 − 10.29 = 0.81

Add noise: Lap(Δf/ε) = Lap(0.81/1) = Lap(0.81)

Var = 1.31. Roughly: respond with the true value ± 1.31.

Avg. is not a sensitive function!
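The slide's numbers can be reproduced directly (a sketch; note that the definition of Δf is a max over all neighboring databases, while here it is computed only over single-row removals from this particular table):

```python
claims = [9.91, 9.16, 10.59, 11.27, 10.50, 11.89,
          10.63, 12.54, 9.29, 8.92, 20.00, 8.55]

def avg(xs):
    return sum(xs) / len(xs)

# Largest change in the average from dropping any single row ("DB - Me").
delta_f = max(abs(avg(claims) - avg(claims[:i] + claims[i + 1:]))
              for i in range(len(claims)))

epsilon = 1.0
scale = delta_f / epsilon        # Lap(Δf/ε) ≈ Lap(0.81)
variance = 2 * scale ** 2        # Var[Lap(R)] = 2R² ≈ 1.31
```

The worst case is dropping the outlier claim of 20, which moves the average from 11.10 to 10.29.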
Differential Privacy: Example

(Claims table as above.)

Query: Max claim
Aux input: 12 entries
ε = 1: hide me among 1200 people.

Δf = max_{DB, DB−Me} |Max(DB) − Max(DB−Me)| = 20 − 12.54 = 7.46

Add noise: Lap(Δf/ε) = Lap(7.46/1) = Lap(7.46)

Var = 111.30. Roughly: respond with the true value ± 111.30.

The Max function is very sensitive!
Useless: the error exceeds the sampling error.
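The same computation for the max query shows why it is so much worse (same caveat as before: sensitivity here is computed only over single-row removals from this table):

```python
claims = [9.91, 9.16, 10.59, 11.27, 10.50, 11.89,
          10.63, 12.54, 9.29, 8.92, 20.00, 8.55]

# Largest change in the max from dropping any single row ("DB - Me").
delta_f = max(abs(max(claims) - max(claims[:i] + claims[i + 1:]))
              for i in range(len(claims)))

epsilon = 1.0
scale = delta_f / epsilon        # Lap(7.46)
variance = 2 * scale ** 2        # Var[Lap(R)] = 2R² ≈ 111.30
```

Dropping the outlier moves the max from 20 to 12.54, so a single person can shift the answer by 7.46: the required noise swamps the signal.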
A Natural Relaxation: (ε, δ)-Differential Privacy

For all DB, DB' differing in at most one element, and all subsets S of Range(K):

    Pr[K(DB) ∈ S] ≤ e^ε · Pr[K(DB') ∈ S] + δ

where δ(n) is negligible.

Cf.: ε-differential privacy is unconditional, independent of n.
Feasibility Results [Dwork Nissim '04] (Here: Revisionist Version)

Database D = {d₁, …, d_n}, rows in {0,1}^k.

T "counting queries" q = (S, g): S ⊆ [1..n], g: rows → {0,1}; "How many rows in S satisfy g?"

    q(D) = Σ_{i ∈ S} g(d_i)

Privacy mechanism: add binomial noise, r = q(D) + B(f(n), ½).

SuLQ Theorem: (δ, T)-differential privacy when T(n) = O(n^c), 0 < c < 1; δ = 1/O(n^{c'}), 0 < c' < c/2; f(n) = (T(n)/δ²) · log⁶(n).

If √T(n)/δ < O(√n), then |noise| < |sampling error|!

E.g., T = n^{3/4} and δ = 1/10: then √T/δ = 10·n^{3/8} ≪ √n.
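A minimal sketch of this SuLQ-style mechanism (the counting query and B(f(n), ½) noise are from the slide; subtracting the noise mean f(n)/2 so answers are unbiased is my addition, and the function names are mine):

```python
import random

def binomial(n, p, rng=random):
    """Sample from B(n, p) as a sum of Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def sulq_answer(db, S, g, f_n, rng=random):
    """Answer counting query q = (S, g): how many rows in S satisfy g?
    Adds B(f(n), 1/2) noise, centered by its mean f(n)/2 (centering is
    an assumption; the slide adds B(f(n), 1/2) as-is)."""
    true_answer = sum(g(db[i]) for i in S)
    return true_answer + binomial(f_n, 0.5, rng) - f_n / 2.0
```

With f(n) chosen as in the theorem, the noise magnitude is on the order of √f(n), which stays below the √n sampling error for the stated parameter range.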
Impossibility Results: Line of Research Initiated by Dinur and Nissim '03

Blatant non-privacy: the adversary correctly guesses all but o(n) rows.

Blatant non-privacy holds if all / all but cn / a (1/2 + c') fraction of the answers are within o(√n) of the true answer, even against an adversary restricted to n / cn / c'n queries and poly(n) / poly(n) / exp(n) computation [Y] / [DMT] / [DMT]. The results are independent of how the noise is distributed. A variant model permits poly(n) computation in the final case [DY].

2^n queries: a noise bound of E permits correctly guessing n − 4E rows [DiNi]. Depressing implications for the non-interactive case.
Thank You!