Download - The State of the Art
![Page 1: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/1.jpg)
The State of the Art
Cynthia Dwork, Microsoft Research
![Page 2: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/2.jpg)
Pre-Modern Cryptography
Propose
Break
![Page 3: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/3.jpg)
Modern Cryptography
Propose Definition
Break Definition
Propose STRONGER
Definition
Break Definition
algorithms satisfying definition
Algs
Propose STRONGER
[GoldwasserMicali82,GoldwasserMicaliRivest85]
![Page 4: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/4.jpg)
Modern Cryptography
Propose Definition
Break Definition
Propose STRONGER
Definition
Break Definition
algorithms satisfying definition
Algs
Propose STRONGER
[GoldwasserMicali82,GoldwasserMicaliRivest85]
![Page 5: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/5.jpg)
No Algorithm?
Propose Definition
?Why?
![Page 6: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/6.jpg)
Provably No Algorithm?
Bad Definition
Propose Definition
?
Propose WEAKER/DIFF
Definition
Alg / ?
![Page 7: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/7.jpg)
The Privacy Dream
Original Database Sanitized data set
?C
Census, financial, medical data; OTC drug purchases; social networks; MOOCs data; call and text records; energy consumption; loan, advertising, and applicant data; ad clicks product correlations, query logs,…
![Page 8: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/8.jpg)
Fundamental Law of Info Recovery
“Overly accurate” estimates of “too many” statistics is blatantly non-private
DinurNissim03; DworkMcSherryTalwar07; DworkYekhanin08, De12, MuthukrishnanNikolov12,…
![Page 9: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/9.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
![Page 10: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/10.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
![Page 11: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/11.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
![Page 12: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/12.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
?
![Page 13: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/13.jpg)
William Weld’s Medical Records
ZIP
birthdate
sex
nameaddress
date reg.
partyaffiliation
last voted
ethnicity
visit datediagnosis
proceduremedication
total charge
voter registrationdata
HMO data
Sweeney97
![Page 14: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/14.jpg)
William Weld’s Medical Records
ZIP
birthdate
sex
nameaddress
date reg.
partyaffiliation
last voted
ethnicity
visit datediagnosis
proceduremedication
total charge
voter registrationdata
HMO data
Can “name” by
(zip, birth, sex)
Sweeney97
![Page 15: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/15.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
??
![Page 16: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/16.jpg)
Anonymization (aka De-Identification) Remove “personally identifying information”
Name sex DOB zip symptoms previous admissions medications family history …
??
![Page 17: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/17.jpg)
NarayananShmatikov08
![Page 18: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/18.jpg)
Can “name” by 3
(title, approx date)
pairs
NarayananShmatikov08
![Page 19: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/19.jpg)
Re-ID Not the Only Worry
k-anonymity
Break Definition
l-diversity
Break Definition
algorithms satisfying definition
Algs
m,t
m-invariancet-closeness
SamaratiSweeney98, MakanvajjhalaGehrkeKiferVenkitasubramaniam06,XiaoTao07,LiLiVenkatasubmraminan07
![Page 20: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/20.jpg)
Culprit: Diverse Background Info
2014
2013
2012
Voter Weld: M, DOB, zipIMDb
![Page 21: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/21.jpg)
Billing for Targeted Advertisements
+
+
[Korolova12]
![Page 22: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/22.jpg)
Product Recommendations
X’s preferences influence Y’s experience Combining evolving similar items lists with a little knowledge (from
your blog) of what you bought, an adversary can infer purchases you did not choose to publicize
CalandrinoKilzerNarayananFeltenShmatikov11
People who bought this also bought…Blog
![Page 23: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/23.jpg)
SNP: Single Nucleotide (A,C,G,T) polymorphism
CT
TT
…
…
…
…
SNP statistics of Case Group
“Can” Test Case Group Membership using target’s DNA and HapMap
GWAS Statistics
Homer+08
![Page 24: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/24.jpg)
HapMap
Culprit: Diverse Background Info
2014
2013
2012
Voter Weld: M, DOB, zipIMDb
People who bought this…Blog
Billing
![Page 25: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/25.jpg)
How Should We Approach Privacy? “Computer science got us into this mess, can computer science
get us out of it?” (Sweeney, 2012)
![Page 26: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/26.jpg)
How Should We Approach Privacy? “Computer science got us into this mess, can computer science
get us out of it?” (Sweeney, 2012)
Complexity of this type requires a mathematically rigorous theory of privacy and its loss.
![Page 27: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/27.jpg)
How Should We Approach Privacy? “Computer science got us into this mess, can computer science
get us out of it?” (Sweeney, 2012)
Complexity of this type requires a mathematically rigorous theory of privacy and its loss. We cannot discuss tradeoffs between privacy and statistical utility
without a measure that captures cumulative harm over multiple uses. Other fields -- economics, ethics, policy -- cannot be brought to bear
without a “currency,” or measure of privacy, with which to work.
![Page 28: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/28.jpg)
Useful Databases that Teach Database teaches that smoking causes cancer.
Smoker S’s insurance premiums rise. Premiums rise even if S not in database!
Learning that smoking causes cancer is the whole point. Smoker S enrolls in a smoking cessation program.
Differential privacy: limit harms to the teachings, not participation
The outcome of any analysis is essentially equally likely, independentof whether any individual joins, or refrains from joining, the database.
![Page 29: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/29.jpg)
Useful Databases that Teach Database teaches that smoking causes cancer.
Smoker S’s insurance premiums rise. Premiums rise even if S not in database!
Learning that smoking causes cancer is the whole point. Smoker S enrolls in a smoking cessation program.
Differential privacy: limit harms to the teachings, not participation
The likelihood of any possible harm to ME is essentially independentof whether I join, or refrain from joining, the database.
![Page 30: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/30.jpg)
Useful Databases that Teach Database teaches that smoking causes cancer.
Smoker S’s insurance premiums rise. Premiums rise even if S not in database!
Learning that smoking causes cancer is the whole point. Smoker S enrolls in a smoking cessation program.
Differential privacy: limit harms to the teachings, not participation
High premiums, busted, purchases revealed to co-worker…Essentially equally likely when I’m in as when I’m out
![Page 31: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/31.jpg)
Differential Privacy [D.,McSherry,Nissim,Smith ‘06]
gives -differential privacy if for all pairs of data sets differing in one element, and all subsets of possible outputs
If a bad event is very unlikely when I’m not in dataset () then it is still very unlikely when I am ()
Randomness introduced by Randomness introduced by
Impossible to know the actual probabilities of bad events.Can still control change in risk due to joining the database.
![Page 32: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/32.jpg)
Differential Privacy [D.,McSherry,Nissim,Smith ‘06]
gives -differential privacy if for all pairs of data sets differing in one element, and all subsets of possible outputs
If a bad event is very unlikely when I’m not in dataset () then it is still very unlikely when I am ()
“Privacy Loss”
Impossible to know the actual probabilities of bad events.Can still control change in risk due to joining the database.
![Page 33: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/33.jpg)
Differential Privacy Nuanced measure of privacy loss
Captures cumulative harm over multiple uses, multiple databases
Adversary’s background knowledge is irrelevant Immune to re-identification attacks, etc.
“Programmable” Construct complicated private analyses from simple
private building blocks
![Page 34: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/34.jpg)
Recall: Fundamental Law
“Overly accurate” estimates of “too many” statistics is blatantly non-private
DinurNissim03; DworkMcSherryTalwar07; DworkYekhanin08, De12, MuthukrishnanNikolov12,…
![Page 35: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/35.jpg)
Answer Only Questions Asked
q1
a1
Database curator data analysts
Cq2
a2
q3
a3
![Page 36: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/36.jpg)
Want to compute Adding pulls
Add random noise to obscure difference vs
Intuition
![Page 37: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/37.jpg)
Want to compute Adding pulls
Add random noise to obscure difference
Intuition
Algorithms, geometry, learning theory, complexity theory,cryptography, statistics, machine learning, programming languages, verification, databases, economics,…
![Page 38: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/38.jpg)
Not a Panacea Fundamental Law of Information Recovery still holds
![Page 39: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/39.jpg)
Challenge: The Meaning of Loss Sometimes the theory gives exactly the right answer
Large loss in differential privacy translates to “obvious” real life privacy breach, under circumstances known to be plausible
Other times? Do all large losses translate to such realizable privacy breaches, or is
the theory too pessimistic?
![Page 40: The State of the Art](https://reader036.vdocument.in/reader036/viewer/2022062411/568168a0550346895ddf3638/html5/thumbnails/40.jpg)
Policy Recommendation Publish all Epsilons!
Penalize when
Combines motivation for data breach notification statutes and environmental laws requiring disclosures of toxic releases with an incentive to start using (minimal) differential privacy
DworkMulligan14