18734: Foundations of Privacy - Carnegie Mellon University (ece734/lectures/17-fairness.pdf, 2019-10-23)
TRANSCRIPT
Fairness in Classification
Giulia Fanti
With many slides from Moritz Hardt and Anupam Datta
Fall 2019
18734: Foundations of Privacy
Administrative
• Nice talk on Thursday: Farinaz Koushanfar (UCSD)
  – Latest in privacy-preserving machine learning over encrypted data
  – Thursday 10/10 @ 3:30 pm EST / 12:30 pm PT in HH1107 (Pittsburgh), Room 1065 (SV)
  – Free food in PIT!
• Mid-semester presentations
  – Wednesday, Oct. 30
  – Monday, Nov. 4
  – Guidelines on Canvas (rubric + points)
  – Sign-up link here
• This Friday (Oct. 25): Day for Community Engagement
  – No recitation
  – Sruti will hold her OH on Friday from 3-4 pm ET
  – My OH are by appointment this week
In-Class Quiz
• On Canvas
Big Data: Seizing Opportunities, Preserving Values ~ 2014
"big data technologies can cause societal harms beyond damages to privacy"
Many notions of “fairness” in CS
• Fair scheduling
• Distributed computing
• Envy-free division (cake cutting)
• Stable matching
Fairness in Classification
• Advertising
• Health care
• Education
• Banking
• Insurance
• Taxation
• Financial aid
• many more...
Concern: Discrimination
• Certain attributes should be irrelevant!
• Population includes minorities
  – Ethnic, religious, medical, geographic
• Protected by law, policy, ethics
Examples
• Word embeddings
  – Important trend in NLP
  – Map word -> vector
  – Related words have similar vectors
  – E.g.: v(king) – v(man) = v(queen) – v(woman)
Bolukbasi et al., “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”, NeurIPS 2016
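The analogy above can be checked directly with vector arithmetic. A minimal sketch with made-up 2-D vectors (real embeddings such as word2vec or GloVe live in hundreds of dimensions, and the relation holds only approximately):

```python
import numpy as np

# Toy 2-D "embeddings"; all numbers are invented for illustration.
v = {
    "king":  np.array([0.8, 0.9]),
    "man":   np.array([0.6, 0.1]),
    "queen": np.array([0.5, 0.9]),
    "woman": np.array([0.3, 0.1]),
}

# Both differences recover the same offset (a "gender-independent
# royalty direction"), by construction in this toy example.
print(v["king"] - v["man"])      # [0.2 0.8]
print(v["queen"] - v["woman"])   # [0.2 0.8]
```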
Overview
• Fairness as a (group) statistical property
• Individual fairness
• Achieving fairness with utility considerations
Discrimination arises even when nobody’s evil
• Google+ tries to classify real vs. fake names
• Fairness problem:
  – Most training examples are standard white American names: John, Jennifer, Peter, Jacob, ...
  – Ethnic names are often unique, with far fewer training examples

Likely outcome: prediction accuracy is worse on ethnic names
“Due to Google's ethnocentricity I was prevented from using my real last name (my nationality is: Tungus and Sami)”
– Katya Casio. Google Product Forums.
Error vs sample size
Sample Size Disparity: In a heterogeneous population, smaller groups face larger error
Credit Application
• User visits capitalone.com
• Capital One uses tracking information provided by the tracking network [x+1] to personalize offers
• Concern: steering minorities into higher rates (illegal)
WSJ 2010
V: individuals, O: outcomes, A: actions

Classifier (e.g., the tracking network): M : V → O, mapping each individual x to an outcome M(x)
Vendor (e.g., Capital One): f : O → A
V: individuals, O: outcomes
Classifier: M : V → O, mapping each individual x to an outcome M(x)

Goal: achieve fairness in the classification step

Assume an unknown, untrusted, un-auditable vendor
What kinds of events do we want to prevent in our definition?
• Blatant discrimination
• Discrimination based on redundant encoding
• Discrimination against a portion of the population with a higher fraction of protected individuals
• Self-fulfilling prophecy
• Reverse tokenism
First attempt…
Fairness through Blindness
Fairness through Blindness
Ignore all irrelevant/protected attributes
“We don’t even look at ‘race’!”
Point of Failure
You don’t need to see an attribute to be able to predict it with high accuracy
E.g.: User visits artofmanliness.com... 90% chance of being male
Fairness through Privacy?
“It's Not Privacy, and It's Not Fair”
Cynthia Dwork & Deirdre K. Mulligan. Stanford Law Review.
Privacy is no Panacea: Can’t hope to have privacy solve our fairness problems.
“At worst, privacy solutions can hinder efforts to identify classifications that unintentionally produce objectionable outcomes—for example, differential treatment that tracks race or gender—by limiting the availability of data about such attributes.“
Group Exercise
• With your partner, come up with a mathematical definition of a fair classifier
• I.e., what properties should 𝑓 exhibit to be considered fair?

Inputs: 𝑋 = (𝑋₁, 𝑋₂, …, 𝑋ₙ) = (Age, Gender, …, Race)
Classifier: 𝑓(𝑋)
Output: 𝑌 ∈ {Credit, No Credit}
𝑆: protected class; 𝑇: unprotected class
Second attempt…
Statistical Parity (Group Fairness)
Equalize two groups 𝑆, 𝑇 at the level of outcomes
– E.g., 𝑆 = minority, 𝑇 = 𝑆ᶜ (its complement)
– For every outcome 𝑜:

P[𝑜 | 𝑆] = P[𝑜 | 𝑇]

“The fraction of people in 𝑆 getting credit is the same as in 𝑇.”
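Statistical parity can be checked empirically from observed decisions. A minimal sketch, where `parity_gap` is a hypothetical helper and all data are made up for illustration (1 = credit granted):

```python
import numpy as np

# parity_gap is a hypothetical helper: |P[credit | S] - P[credit | T]|
# estimated from observed outcomes. A gap of 0 means statistical parity.
def parity_gap(outcomes, in_s):
    outcomes = np.asarray(outcomes, dtype=float)
    in_s = np.asarray(in_s, dtype=bool)
    p_s = outcomes[in_s].mean()    # fraction of S receiving credit
    p_t = outcomes[~in_s].mean()   # fraction of T receiving credit
    return abs(p_s - p_t)

# Made-up outcomes: the first four individuals are in S, the rest in T.
outcomes = [1, 0, 1, 0, 1, 0, 1, 0]
in_s     = [True, True, True, True, False, False, False, False]
print(parity_gap(outcomes, in_s))  # 0.0: both groups get credit at rate 1/2
```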
Not strong enough as a notion of fairness
– Sometimes desirable, but can be abused
• Self-fulfilling prophecy
  – Give credit offers to the 𝑆 persons deemed least credit-worthy.
  – Give credit offers to those in 𝑆 who are not interested in credit.

(Figure: distributions of likelihood of paying back for 𝑆 and 𝑇)
Lesson: Fairness is task-specific
Fairness requires understanding of classification task and protected groups
“Awareness”
• Statistical property vs. individual guarantee
  – Statistical outcomes may be “fair”, but individuals might still be discriminated against
Individual Fairness Approach
Fairness Through Awareness. Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Richard Zemel. 2011
Individual Fairness
Treat similar individuals similarly
Similar for the purpose of the classification task
Similar distribution over outcomes
Metric
• Assume a task-specific similarity metric
  – Extent to which two individuals are similar w.r.t. the classification task at hand
• Ideally captures ground truth
  – Or, society’s best approximation
• Open to public discussion, refinement
  – In the spirit of Rawls
• Typically unrelated to classification!
Examples
• Financial/insurance risk metrics
  – Already widely used (though secret)
• AALIM health care metric
  – A health metric for treating similar patients similarly
• Roemer’s relative effort metric
  – A well-known approach in economics/political theory
How to formalize this?
V: individuals, O: outcomes, with M : V → O

Think of V as a space with metric d(x, y); similar = small d(x, y).
Two individuals x, y at distance d(x, y) map to outcomes M(x), M(y).
How can we compare M(x) with M(y)?
Distributional outcomes: M : V → Δ(O)
Each individual x now maps to a distribution M(x) over outcomes.
How can we compare M(x) with M(y)? Distance over distributions!
V: individuals, O: outcomes
Metric d : V × V → ℝ, classifier M : V → Δ(O)

Fairness Definition (Lipschitz condition): 𝐷(𝑀(𝑥), 𝑀(𝑦)) ≤ 𝑑(𝑥, 𝑦) for all 𝑥, 𝑦 ∈ 𝑉

This talk: 𝐷 is the statistical (total variation) distance, which takes values in [0, 1]
Statistical Distance (Total Variation Distance)
Let 𝑀ₓ and 𝑀ᵧ denote probability measures on a finite domain 𝐴. The statistical distance (or total variation distance) between 𝑀ₓ and 𝑀ᵧ is

𝐷_TV(𝑀ₓ, 𝑀ᵧ) = (1/2) Σ_{𝑎∈𝐴} |𝑀ₓ(𝑎) − 𝑀ᵧ(𝑎)|
Q: What is the minimum TV distance between two distributions?
A: 0, achieved when both are equal
Q: What is the maximum TV distance between two distributions?

A: 1, achieved when their supports are disjoint
Example of intermediate TV distance: two partially overlapping distributions 𝑀ₓ and 𝑀ᵧ
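The minimum, maximum, and intermediate cases can be reproduced numerically. A minimal sketch of the TV-distance definition; the distributions below are made up for illustration:

```python
import numpy as np

# Total variation distance between two distributions on the same
# finite domain: (1/2) * sum_a |p(a) - q(a)|.
def tv_distance(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

uniform = [0.25, 0.25, 0.25, 0.25]
point_a = [1.0, 0.0, 0.0, 0.0]
point_d = [0.0, 0.0, 0.0, 1.0]

print(tv_distance(uniform, uniform))  # 0.0: identical distributions
print(tv_distance(point_a, point_d))  # 1.0: disjoint supports
print(tv_distance(uniform, point_a))  # 0.75: partial overlap
```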
Existence Proof
There exists a classifier that satisfies the Lipschitz condition
• Construction: Map all individuals to the same distribution over outcomes
• Are we done?
Individual Fairness Definition (Lipschitz condition): 𝐷_TV(𝑀(𝑥), 𝑀(𝑦)) ≤ 𝑑(𝑥, 𝑦)
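The construction can be sanity-checked numerically: a classifier that sends every individual to the same output distribution has TV distance 0 between any two outputs, so the Lipschitz condition holds for any nonnegative metric. A sketch with invented numbers:

```python
import numpy as np

def tv_distance(p, q):
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

n = 5                                # number of individuals
shared = np.array([0.2, 0.5, 0.3])   # one fixed distribution over outcomes
M = np.tile(shared, (n, 1))          # M(x) is identical for every x

rng = np.random.default_rng(0)
d = rng.random((n, n))               # arbitrary nonnegative "metric" values
np.fill_diagonal(d, 0.0)

# TV(M(x), M(y)) = 0 <= d(x, y) for every pair, so M is trivially fair --
# and trivially useless, which is why utility enters next.
lipschitz = all(tv_distance(M[x], M[y]) <= d[x, y]
                for x in range(n) for y in range(n))
print(lipschitz)  # True
```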
But… how do we ensure utility?
Utility Maximization
U : V × O → ℝ
Vendor can specify arbitrary utility function
𝑈(𝑣, 𝑜) = vendor’s utility from giving individual 𝑣 the outcome 𝑜
E.g. Accuracy of classifier
Maximize the vendor’s expected utility subject to the Lipschitz condition:

max_{𝑀}  𝔼_{𝑥∼𝑉} 𝔼_{𝑜∼𝑀ₓ} 𝑈(𝑥, 𝑜)          (maximize utility)
s.t.  𝐷_TV(𝑀ₓ, 𝑀ᵧ) ≤ 𝑑(𝑥, 𝑦)  ∀ 𝑥, 𝑦 ∈ 𝑉   (fairness constraint)
Claim: This optimization is a linear program under TV (statistical) distance over distributions
• Need to show 2 things:
  – The objective function is linear in the probability mass vectors 𝑀ₓ
  – The constraints are all linear
• Try to show both
  – You can assume 𝑉 is a set with |𝑉| discrete items
• Why do we care? Linear programs can be solved efficiently!
  – I.e., in time polynomial in the problem dimension
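To make the claim concrete, here is a sketch of the program for the special case of a binary outcome space O = {deny, grant}, where each 𝑀ₓ is described by a single number p[x] = 𝑀ₓ(grant) and the TV distance reduces to |p[x] − p[y]|, which splits into two linear constraints per pair. All data (utilities, metric) are invented, and `scipy` is assumed available:

```python
import numpy as np
from scipy.optimize import linprog

n = 4
# u[x] = U(x, grant) - U(x, deny): the vendor's net utility of granting
# credit to individual x (made-up numbers).
u = np.array([1.0, 0.8, -0.5, 0.6])
# d[x, y]: made-up task-specific similarity metric, symmetric, in [0, 1].
d = np.array([
    [0.0,  0.1,  0.9,  0.2],
    [0.1,  0.0,  0.8,  0.15],
    [0.9,  0.8,  0.0,  0.7],
    [0.2,  0.15, 0.7,  0.0],
])

# Decision variable p[x] = M_x(grant) in [0, 1]. With two outcomes,
# D_TV(M_x, M_y) = |p[x] - p[y]|, so the Lipschitz constraint becomes
#   p[x] - p[y] <= d(x, y)  and  p[y] - p[x] <= d(x, y).
A_ub, b_ub = [], []
for x in range(n):
    for y in range(x + 1, n):
        row = np.zeros(n)
        row[x], row[y] = 1.0, -1.0
        A_ub.append(row);  b_ub.append(d[x, y])
        A_ub.append(-row); b_ub.append(d[x, y])

# Expected utility (1/n) * sum_x u[x] * p[x] is linear in p;
# linprog minimizes, so negate the objective.
res = linprog(c=-u / n, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0.0, 1.0)] * n)
p = res.x
print("fair grant probabilities:", np.round(p, 3))
```

With more than two outcomes the TV constraints are still polyhedral (e.g., via auxiliary variables bounding each |𝑀ₓ(𝑎) − 𝑀ᵧ(𝑎)|), so the problem remains a linear program; the objective stays linear in the probability mass vectors either way.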
What’s the takeaway?
• We can efficiently enforce individual fairness while maximizing overall utility!
• What about our initial notion of group fairness?