![Page 1: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/1.jpg)
Machine Learning & Fairness
Jenn Wortman Vaughan & Hanna Wallach
Microsoft Research New York City
![Page 2: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/2.jpg)
Who?
http://www.microsoft.com/en-us/research/group/fate/
![Page 3: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/3.jpg)
Aether Committee
Bias & Fairness Working Group
Intelligibility Working Group
![Page 4: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/4.jpg)
AI & Machine Learning
![Page 5: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/5.jpg)
NeurIPS Registrations
[Figure: number of NeurIPS registrations per year, 2002 to 2018; y-axis “Number of Registrations”, 0 to 9000]
![Page 6: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/6.jpg)
THE AGE OF AI
![Page 7: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/7.jpg)
OPPORTUNITIES
![Page 8: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/8.jpg)
Microsoft
![Page 9: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/9.jpg)
CHALLENGES
![Page 10: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/10.jpg)
The Media…
![Page 11: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/11.jpg)
Employment
![Page 12: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/12.jpg)
Criminal Justice
![Page 13: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/13.jpg)
Advertising
[Sweeney, 2013]
![Page 14: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/14.jpg)
FAIRNESS
![Page 15: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/15.jpg)
LEARN FROM SECURITY & PRIVACY
![Page 16: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/16.jpg)
SOME HISTORY…
![Page 17: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/17.jpg)
GROWTH MINDSET
![Page 18: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/18.jpg)
This Talk
» What are (some of) the different types of harm?
» Which subpopulations are likely to be affected?
» Where do these harms come from and what are some effective strategies to help mitigate them?
» Which software tools can help mitigate them?
![Page 19: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/19.jpg)
TYPES OF HARM
[Shapiro et al., 2017]
![Page 20: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/20.jpg)
Allocation
![Page 21: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/21.jpg)
Allocation
![Page 22: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/22.jpg)
Quality of Service
[Buolamwini & Gebru, 2018]
![Page 23: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/23.jpg)
Quality of Service
![Page 24: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/24.jpg)
Stereotyping
![Page 25: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/25.jpg)
Stereotyping
[Caliskan et al., 2017]
![Page 26: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/26.jpg)
Stereotyping
![Page 27: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/27.jpg)
Denigration
![Page 28: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/28.jpg)
Denigration
![Page 29: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/29.jpg)
Over- and Under-Representation
[Kay et al., 2015]
![Page 30: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/30.jpg)
Types of Harm
Allocation | Quality of Service | Stereotyping | Denigration | Over- or Under-Repr.
Hiring system does not rank women as highly as men for technical jobs
x x x
Gender classification software misclassifies darker-skinned women
x
Machine translation system exhibits male/female gender stereotypes
x x
Photo management program labels image of black people as “gorillas”
x x
Image searches for “CEO” yield only photos of white men on first page
x x
![Page 31: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/31.jpg)
This Talk
» What are (some of) the different types of harm?
» Which subpopulations are likely to be affected?
» Where do these harms come from and what are some effective strategies to help mitigate them?
» Which software tools can help mitigate them?
![Page 32: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/32.jpg)
WHO?
![Page 33: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/33.jpg)
Subpopulations
» Protected subpopulations, e.g., race, gender, age
» Historically marginalized subpopulations
» Not always easy to identify subpopulations
» 62% of industry practitioners reported it would be very or extremely useful to have support in this area
» Subpopulations may be application-specific
[Holstein et al., 2019]
![Page 34: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/34.jpg)
Subpopulations
“[P]eople start thinking about sensitive attributes like your ethnicity, your religion, your sexuality, your gender. But the biggest problem I found is that these cohorts should be defined based on the domain and problem. For example, for [automated writing evaluation] maybe it should be defined based on [... whether the writer is] a native speaker.”
![Page 35: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/35.jpg)
Intersectionality
![Page 36: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/36.jpg)
Access to Attributes
» Many teams have no access to relevant attributes
» Makes it hard to audit systems for biases
» One option is to collect attributes purely for auditing
» Raises privacy concerns, users may object
» Another option is to use ML to infer relevant attributes
» Shifts the problem, can introduce new biases
![Page 37: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/37.jpg)
Social Constructs
[Buolamwini & Gebru, 2018]
![Page 38: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/38.jpg)
Individual Fairness
![Page 39: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/39.jpg)
Counterfactual Fairness
![Page 40: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/40.jpg)
This Talk
» What are (some of) the different types of harm?
» Which subpopulations are likely to be affected?
» Where do these harms come from and what are some effective strategies to help mitigate them?
» Which software tools can help mitigate them?
![Page 41: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/41.jpg)
ML Pipeline
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 42: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/42.jpg)
Task Definition
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 43: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/43.jpg)
Task Definition
[Wu & Zhang, 2016]
![Page 44: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/44.jpg)
Task Definition
![Page 45: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/45.jpg)
Task Definition
» Clearly define the task & model’s intended effects
» Try to identify any unintended effects & biases
» Involve diverse stakeholders & multiple perspectives
» Try to refine task definition & be willing to abort
» Document any unintended effects & biases
![Page 46: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/46.jpg)
Dataset Construction
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 47: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/47.jpg)
Data: Societal Bias
![Page 48: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/48.jpg)
Data: Societal Bias
![Page 49: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/49.jpg)
Data: Skewed Sample
![Page 50: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/50.jpg)
Data: Skewed Sample
![Page 51: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/51.jpg)
Data: Skewed Sample
“It sounds easy to just say like, ‘Oh, just add some more images in there,’ but [...] there's no person on the team that actually knows what all of [these celebrities] look like [...] If I noticed that there's some celebrity from Taiwan that doesn't have enough images in there, I actually don't know what they look like to go and fix that. It's a non-trivial problem [...] But, Beyoncé, I know what she looks like.”
![Page 52: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/52.jpg)
Data: Source
» Think critically before collecting any data
» Check for biases in data source selection process
» Try to identify societal biases present in data source
» Check for biases in cultural context of data source
» Check that data source matches deployment context
![Page 53: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/53.jpg)
Data: Collection Process
» Check for biases in technology used to collect data
» Check for biases in humans involved in collecting data
» Check for biases in strategy used for sampling
» Ensure sufficient representation of subpopulations
» Check that collection process itself is fair & ethical
![Page 54: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/54.jpg)
Data: Labeler Bias
![Page 55: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/55.jpg)
Data: Labeling & Preprocessing
» Check whether discarding data introduces biases
» Check whether bucketing introduces biases
» Check preprocessing software for biases
» Check labeling/annotation software for biases
» Check that human labelers do not introduce biases
![Page 56: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/56.jpg)
Data: Documentation
![Page 57: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/57.jpg)
DATASHEETS
[Gebru et al., 2018]
![Page 58: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/58.jpg)
Datasheets for Datasets
![Page 59: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/59.jpg)
Motivation
Composition
Collection Process
Preprocessing
Distribution
Maintenance
Legal & Ethical
Questions
![Page 60: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/60.jpg)
Composition
![Page 61: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/61.jpg)
Collection Process
![Page 62: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/62.jpg)
Points to Consider
» What is the right set of questions?
» How best to handle continually evolving datastreams?
» Are there legal or PR risks to creating datasheets?
» What is the right process for making a datasheet?
» How best to incentivize developers & PMs?
» How much (if anything) should be automated?
![Page 63: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/63.jpg)
Model Definition
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 64: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/64.jpg)
What is a Model?
price of house = w1 * number of bedrooms +
w2 * number of bathrooms +
w3 * square feet +
a little bit of noise
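
The formula above can be written as a small function. This is only an illustrative sketch: the weight values, the noise level, and the choice of Gaussian noise are assumptions made for the example, not part of the slide.

```python
import random

def predict_price(bedrooms, bathrooms, square_feet, w1, w2, w3):
    """Noise-free part of the model: a weighted sum of the three features."""
    return w1 * bedrooms + w2 * bathrooms + w3 * square_feet

def sample_price(bedrooms, bathrooms, square_feet, w1, w2, w3, noise_std, rng):
    """Full model: weighted sum plus noise (Gaussian here, by assumption)."""
    base = predict_price(bedrooms, bathrooms, square_feet, w1, w2, w3)
    return base + rng.gauss(0.0, noise_std)

# Hypothetical weights, chosen only for illustration.
W1, W2, W3 = 10_000.0, 5_000.0, 100.0

rng = random.Random(0)  # fixed seed so the noisy sample is reproducible
print(predict_price(3, 2, 1500, W1, W2, W3))  # 190000.0
print(sample_price(3, 2, 1500, W1, W2, W3, 1_000.0, rng))
```

The model is the functional form; the weights w1, w2, w3 are what gets chosen later, during training.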
![Page 65: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/65.jpg)
Model: Assumptions
![Page 66: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/66.jpg)
Model: Assumptions
![Page 67: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/67.jpg)
Model: Structure
[image from Moritz Hardt]
majority minority population
![Page 68: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/68.jpg)
Model: Objective Function
![Page 69: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/69.jpg)
Model Definition
» Clearly define all assumptions about model
» Try to identify biases present in assumptions
» Check whether model structure introduces biases
» Check objective function for unintended effects
» Consider including “fairness” in objective function
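
One way to read the last bullet is as an extra penalty term added to the usual loss. The sketch below is an assumption about what that could look like, not the method proposed in the talk: it adds `lam` times the squared gap in mean score between two hypothetical groups, `"a"` and `"b"`, to a squared-error loss.

```python
def mean(values):
    return sum(values) / len(values)

def penalized_loss(scores, labels, groups, lam):
    """Squared-error loss plus lam * (gap in mean score between groups)**2."""
    error = mean([(s - y) ** 2 for s, y in zip(scores, labels)])
    mean_a = mean([s for s, g in zip(scores, groups) if g == "a"])
    mean_b = mean([s for s, g in zip(scores, groups) if g == "b"])
    return error + lam * (mean_a - mean_b) ** 2

# Hypothetical toy data: model scores, true labels, group membership.
scores = [0.75, 0.75, 0.25, 0.25]
labels = [1, 1, 0, 0]
groups = ["a", "a", "b", "b"]

print(penalized_loss(scores, labels, groups, lam=0.0))  # 0.0625 (error only)
print(penalized_loss(scores, labels, groups, lam=1.0))  # 0.3125 (error + gap penalty)
```

The penalty used here targets only one notion of fairness (similar average scores across groups); other notions would need different terms, and the weight `lam` trades accuracy against the penalty.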
![Page 70: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/70.jpg)
Training Process
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 71: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/71.jpg)
What is Training?
price of house = w1 * number of bedrooms +
w2 * number of bathrooms +
w3 * square feet +
a little bit of noise
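
Training means choosing w1, w2, w3 so that the formula fits observed data. A minimal sketch, assuming ordinary least squares solved through the normal equations; the houses, the "true" weights, and the noise-free prices are all hypothetical and chosen so the result is easy to check.

```python
def fit_least_squares(rows, targets):
    """Return weights minimising the sum of squared errors (normal equations)."""
    n = len(rows[0])
    # Augmented system [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(col + 1, n):
            factor = a[k][col] / a[col][col]
            for j in range(col, n + 1):
                a[k][j] -= factor * a[col][j]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (a[i][n] - tail) / a[i][i]
    return w

# Hypothetical houses: (bedrooms, bathrooms, square feet).
houses = [(2, 1, 900), (3, 2, 1500), (4, 2, 2000), (3, 1, 1200), (5, 3, 2600)]
TRUE_W = (10_000.0, 5_000.0, 100.0)  # hypothetical weights used to generate prices
prices = [sum(w * x for w, x in zip(TRUE_W, h)) for h in houses]

learned = fit_least_squares(houses, prices)
print([round(w, 2) for w in learned])
```

Because the prices here contain no noise, the learned weights should match the generating weights up to floating-point error. With real data they would only approximate whatever relationship the data contains, including any biases in it.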
![Page 72: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/72.jpg)
Training Process
![Page 73: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/73.jpg)
Testing Process
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 74: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/74.jpg)
Testing: Data
![Page 75: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/75.jpg)
Testing: Metrics
![Page 76: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/76.jpg)
Testing: Metrics
![Page 77: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/77.jpg)
Testing: Metrics
Confusion matrix:

| | Unqualified | Qualified |
|---|---|---|
| Reject | TN | FN |
| Hire | FP | TP |
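
The four cells can be counted directly from a list of decisions and a list of true outcomes. A minimal sketch, with hypothetical toy data; `hired` and `qualified` are illustrative names.

```python
from collections import Counter

def confusion_matrix(decisions, labels):
    """Count TN, FN, FP, TP from parallel lists of booleans.

    decisions[i] is True if candidate i was hired;
    labels[i] is True if candidate i was actually qualified.
    """
    names = {(False, False): "TN", (False, True): "FN",
             (True, False): "FP", (True, True): "TP"}
    counts = Counter(names[(d, q)] for d, q in zip(decisions, labels))
    return {name: counts[name] for name in ("TN", "FN", "FP", "TP")}

# Hypothetical toy data: five candidates.
hired     = [True, True, False, False, True]
qualified = [True, False, False, True, True]
print(confusion_matrix(hired, qualified))  # {'TN': 1, 'FN': 1, 'FP': 1, 'TP': 2}
```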
![Page 78: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/78.jpg)
Testing: Metrics
Confusion matrices:

| | Men: Unqualified | Men: Qualified | Women: Unqualified | Women: Qualified |
|---|---|---|---|---|
| Reject | 15 | 5 | 60 | 20 |
| Hire | 20 | 60 | 5 | 15 |
![Page 79: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/79.jpg)
Demographic Parity
| | Men: Unqualified | Men: Qualified | Women: Unqualified | Women: Qualified |
|---|---|---|---|---|
| Reject | 15 | 5 | 60 | 20 |
| Hire | 20 | 60 | 5 | 15 |
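
Under the usual definition, demographic parity asks that each group be hired at the same overall rate. The sketch below computes that rate from the counts on the slide, assuming the group assignment from the earlier slide where each matrix is labelled individually.

```python
# Counts from the slide (assumed assignment: see lead-in).
MEN   = {"TN": 15, "FN": 5,  "FP": 20, "TP": 60}
WOMEN = {"TN": 60, "FN": 20, "FP": 5,  "TP": 15}

def selection_rate(cm):
    """Fraction of all candidates who were hired, qualified or not."""
    return (cm["FP"] + cm["TP"]) / sum(cm.values())

print(selection_rate(MEN), selection_rate(WOMEN))  # 0.8 0.2
```

The rates differ (80% versus 20%), so this example does not satisfy demographic parity.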
![Page 80: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/80.jpg)
Testing: Metrics
![Page 81: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/81.jpg)
Predictive Parity
| | Men: Unqualified | Men: Qualified | Women: Unqualified | Women: Qualified |
|---|---|---|---|---|
| Reject | 15 | 5 | 60 | 20 |
| Hire | 20 | 60 | 5 | 15 |
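Predictive parity is commonly defined as equal precision across groups: among those hired, the same fraction is qualified. A small sketch under that assumption, using the Men/Women labelling from the confusion-matrices slide:

```python
# Counts as labelled on the confusion-matrices slide (assumed group assignment).
MEN = {"TN": 15, "FN": 5, "FP": 20, "TP": 60}
WOMEN = {"TN": 60, "FN": 20, "FP": 5, "TP": 15}

def precision(c):
    """Fraction of hired applicants who are qualified: TP / (TP + FP)."""
    return c["TP"] / (c["TP"] + c["FP"])

ppv_men = precision(MEN)
ppv_women = precision(WOMEN)
print(ppv_men, ppv_women)  # 0.75 0.75
```

Here both groups have precision 0.75, so this example satisfies predictive parity even though the hire rates differ.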
![Page 82: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/82.jpg)
False Positive Rate Balance
| | Men: Unqualified | Men: Qualified | Women: Unqualified | Women: Qualified |
|---|---|---|---|---|
| Reject | 15 | 5 | 60 | 20 |
| Hire | 20 | 60 | 5 | 15 |
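False positive rate balance asks that unqualified applicants be hired at the same rate in every group. A sketch with the usual definition FP / (FP + TN), again assuming the Men/Women labelling from the confusion-matrices slide:

```python
# Counts as labelled on the confusion-matrices slide (assumed group assignment).
MEN = {"TN": 15, "FN": 5, "FP": 20, "TP": 60}
WOMEN = {"TN": 60, "FN": 20, "FP": 5, "TP": 15}

def false_positive_rate(c):
    """Fraction of unqualified applicants who are hired: FP / (FP + TN)."""
    return c["FP"] / (c["FP"] + c["TN"])

fpr_men = false_positive_rate(MEN)
fpr_women = false_positive_rate(WOMEN)
print(round(fpr_men, 3), round(fpr_women, 3))  # 0.571 0.077
```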
![Page 83: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/83.jpg)
False Negative Rate Balance
| | Men: Unqualified | Men: Qualified | Women: Unqualified | Women: Qualified |
|---|---|---|---|---|
| Reject | 15 | 5 | 60 | 20 |
| Hire | 20 | 60 | 5 | 15 |
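False negative rate balance is the mirror image: qualified applicants should be rejected at the same rate in every group. A sketch with the usual definition FN / (FN + TP), assuming the Men/Women labelling from the confusion-matrices slide:

```python
# Counts as labelled on the confusion-matrices slide (assumed group assignment).
MEN = {"TN": 15, "FN": 5, "FP": 20, "TP": 60}
WOMEN = {"TN": 60, "FN": 20, "FP": 5, "TP": 15}

def false_negative_rate(c):
    """Fraction of qualified applicants who are rejected: FN / (FN + TP)."""
    return c["FN"] / (c["FN"] + c["TP"])

fnr_men = false_negative_rate(MEN)
fnr_women = false_negative_rate(WOMEN)
print(round(fnr_men, 3), round(fnr_women, 3))  # 0.077 0.571
```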
![Page 84: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/84.jpg)
Testing: Metrics
![Page 85: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/85.jpg)
Testing: Metrics
![Page 86: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/86.jpg)
Impossibility Theorem
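The slide gives only the title, so which statement the speakers had in mind is an assumption here. One widely cited form rests on an identity that holds for any confusion matrix: FPR = p/(1-p) · (1-PPV)/PPV · (1-FNR), where p is the fraction of qualified applicants in the group. If p differs between groups, then equal PPV, equal FPR and equal FNR cannot all hold, except in degenerate cases such as a perfect classifier. The sketch below checks the identity on the hiring example, assuming the Men/Women labelling from the confusion-matrices slide.

```python
import math

# Counts as labelled on the confusion-matrices slide (assumed group assignment).
MEN = {"TN": 15, "FN": 5, "FP": 20, "TP": 60}
WOMEN = {"TN": 60, "FN": 20, "FP": 5, "TP": 15}

def rates(c):
    """Return base rate, precision, false positive rate, false negative rate."""
    n = sum(c.values())
    p = (c["TP"] + c["FN"]) / n            # fraction of qualified applicants
    ppv = c["TP"] / (c["TP"] + c["FP"])
    fpr = c["FP"] / (c["FP"] + c["TN"])
    fnr = c["FN"] / (c["FN"] + c["TP"])
    return p, ppv, fpr, fnr

summary = {}
for name, c in (("men", MEN), ("women", WOMEN)):
    p, ppv, fpr, fnr = rates(c)
    # FPR implied by the identity; it must match the directly computed FPR.
    implied_fpr = p / (1 - p) * (1 - ppv) / ppv * (1 - fnr)
    assert math.isclose(fpr, implied_fpr)
    summary[name] = (round(p, 2), round(ppv, 2), round(fpr, 3), round(fnr, 3))
    print(name, *summary[name])
```

In this example the precision matches (0.75 for both groups) while the base rates differ (0.65 versus 0.35), and the error rates are correspondingly unequal.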
![Page 87: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/87.jpg)
Testing: Metrics
![Page 88: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/88.jpg)
Testing Process
» Check that test data matches deployment context
» Ensure test data has sufficient representation
» Involve diverse stakeholders & multiple perspectives
» Clearly state all fairness requirements for model
» Use metrics to check that requirements are met
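The last two checklist items can be turned into an automated test. The following is a sketch only: the requirement (a maximum hire-rate gap), the minimum group size, and both default thresholds are hypothetical choices, not values from the talk.

```python
def check_requirements(counts_by_group, min_group_size=30, max_gap=0.1):
    """Return a list of human-readable failures; an empty list means all checks pass."""
    failures = []
    rates = {}
    for group, c in counts_by_group.items():
        n = sum(c.values())
        # Representation check: too few test examples makes metrics unreliable.
        if n < min_group_size:
            failures.append(f"{group}: only {n} test examples")
        if n > 0:
            rates[group] = (c["FP"] + c["TP"]) / n  # hire rate for this group
    if rates:
        # Stated requirement (hypothetical): hire rates within max_gap of each other.
        gap = max(rates.values()) - min(rates.values())
        if gap > max_gap:
            failures.append(f"selection-rate gap {gap:.2f} exceeds {max_gap}")
    return failures

test_counts = {
    "men": {"TN": 15, "FN": 5, "FP": 20, "TP": 60},
    "women": {"TN": 60, "FN": 20, "FP": 5, "TP": 15},
}
failures = check_requirements(test_counts)
print(failures)
```

Which metric and tolerance to encode depends on the requirements agreed with stakeholders. The code only checks what has been written down.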
![Page 89: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/89.jpg)
Deployment Process
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 90: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/90.jpg)
Deployment: Context
[Phillips et al., 2011]
![Page 91: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/91.jpg)
Deployment Process
» Check that data source matches deployment context
» Monitor match between training data & deployment data
» Monitor fairness metrics for unexpected changes
» Invite diverse stakeholders to audit system for biases
» Monitor user reports & user complaints
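One simple way to monitor the match between training and deployment data is to compare the distribution of a categorical feature in each. The sketch below uses total variation distance. The feature values, sample sizes, and any alert threshold you would apply are hypothetical, and other drift measures would serve equally well.

```python
from collections import Counter

def total_variation(train_values, deploy_values):
    """Total variation distance between two non-empty samples of a categorical feature."""
    p, q = Counter(train_values), Counter(deploy_values)
    n_p, n_q = len(train_values), len(deploy_values)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in categories)

# Hypothetical feature values seen during training and after deployment.
train = ["a"] * 50 + ["b"] * 50
deploy = ["a"] * 80 + ["b"] * 20
drift = total_variation(train, deploy)
print(round(drift, 2))  # 0.3
```

A distance of 0 means the two samples have identical category frequencies, and 1 means they share no categories.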
![Page 92: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/92.jpg)
Feedback Loop
task definition
dataset construction
model definition
training process
testing process
deployment process
feedback loop
![Page 93: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/93.jpg)
Feedback: Non-Adversarial
![Page 94: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/94.jpg)
Feedback: Adversarial
![Page 95: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/95.jpg)
Feedback Loop
» Monitor match between training & deployment data
» Monitor fairness metrics for unexpected changes
» Monitor user reports & user complaints
» Monitor users’ interactions with system
» Consider prohibiting some types of interactions
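Monitoring a fairness metric for unexpected changes can start as a comparison against a baseline recorded at test time. In this sketch the metric (a hire-rate gap per week), the baseline, and the tolerance are all hypothetical.

```python
def flag_changes(metric_by_period, baseline, tolerance=0.05):
    """Return the periods whose metric deviates from the baseline by more than tolerance."""
    return [
        period
        for period, value in metric_by_period.items()
        if abs(value - baseline) > tolerance
    ]

# Hypothetical weekly values of a fairness metric after deployment.
weekly_gap = {"week 1": 0.10, "week 2": 0.12, "week 3": 0.21}
flagged = flag_changes(weekly_gap, baseline=0.10)
print(flagged)  # ['week 3']
```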
![Page 96: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/96.jpg)
This Talk
» What are (some of) the different types of harm?
» Which subpopulations are likely to be affected?
» Where do these harms come from and what are some effective strategies to help mitigate them?
» Which software tools can help mitigate them?
![Page 97: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/97.jpg)
SOFTWARE TOOLS
![Page 98: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/98.jpg)
Academic Response
[image from Moritz Hardt]
![Page 99: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/99.jpg)
AUDITING
![Page 100: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/100.jpg)
Aequitas
![Page 101: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/101.jpg)
IBM Fairness 360
![Page 102: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/102.jpg)
Points to Consider
» Fairness is a non-trivial sociotechnical challenge
» Many types of harm relate to a broader cultural context than a single decision-making system
» Many aspects of fairness not captured by metrics
» No free lunch! Can’t simultaneously satisfy all metrics
» Need to make different tradeoffs in different contexts
![Page 103: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/103.jpg)
CLASSIFICATION
[Agarwal et al., 2018]
![Page 104: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/104.jpg)
“Fair” Classification
» Choose fairness metric w/r/t relevant attributes
» ML goal becomes maximizing classifier accuracy while minimizing unfairness according to the metric
» Two technical challenges:
» Choosing an appropriate fairness metric
» Learning an accurate model subject to the metric
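One way to write the learning goal down, as a sketch: $h$ is a classifier from a family $\mathcal{H}$, $\mathrm{err}(h)$ is its error rate, $\mathrm{unf}(h)$ is the chosen unfairness measure, and $\varepsilon$ and $\lambda$ are a tolerance and a trade-off weight. All of these symbols are notation introduced here, not taken from the slides.

```latex
\min_{h \in \mathcal{H}} \ \mathrm{err}(h)
\quad \text{subject to} \quad \mathrm{unf}(h) \le \varepsilon ,
\qquad \text{or, in penalized form,} \qquad
\min_{h \in \mathcal{H}} \ \mathrm{err}(h) + \lambda \, \mathrm{unf}(h),
\quad \lambda \ge 0 .
```

For demographic parity with a relevant attribute $A$, one possible choice is $\mathrm{unf}(h) = \max_a \bigl| \Pr[h(X) = 1 \mid A = a] - \Pr[h(X) = 1] \bigr|$.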
![Page 105: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/105.jpg)
fairness-constrained classification
cost-sensitive classification
➧
A Reductions Approach
![Page 106: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/106.jpg)
Many Benefits
» Works with many different fairness metrics
» Agnostic to form of classifier & training algorithm
» Doesn’t need deployment access to relevant attributes
» Important for teams that have no such access
» Important for “disparate treatment” concerns
» Guaranteed to find most accurate classifier subject to the fairness constraints
![Page 107: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/107.jpg)
In Practice…
![Page 108: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/108.jpg)
Python Library
![Page 109: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/109.jpg)
IBM Fairness 360
![Page 110: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/110.jpg)
Points to Consider
» Need to choose an appropriate fairness metric
» Still need to assess other fairness metrics
» Accuracy–fairness tradeoff may be illusory
» Test data may not match deployment context
» Fairness is a non-trivial sociotechnical challenge
» Many aspects of fairness not captured by metrics
![Page 111: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/111.jpg)
WORD EMBEDDINGS
![Page 112: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/112.jpg)
Data: Societal Bias
[Caliskan et al., 2017]
![Page 113: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/113.jpg)
Bias in Word Embeddings
[Bolukbasi et al., 2016]
![Page 114: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/114.jpg)
Python Library
![Page 115: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/115.jpg)
Points to Consider
» Works with pre-trained word embeddings
» Harder to integrate into systems that learn embeddings
» Not all subpopulations have a definitional “direction”
» Can’t guarantee that you have eliminated biases
» Need to assess downstream effects on performance
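To make the idea of a definitional "direction" concrete, here is a minimal sketch of a projection step in the spirit of Bolukbasi et al.: remove a word vector's component along a bias direction, then renormalize. The 3-dimensional vectors are hypothetical toy values, and taking the direction from a single word pair is a simplification of the published method, which estimates it from several definitional pairs.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def neutralize(v, direction):
    """Remove the component of v along a unit-length direction, then renormalize."""
    proj = dot(v, direction)
    return normalize([x - proj * d for x, d in zip(v, direction)])

# Hypothetical toy embeddings (real embeddings have hundreds of dimensions).
he = [0.8, 0.1, 0.2]
she = [-0.8, 0.1, 0.2]
engineer = [0.3, 0.9, 0.1]

# Simplified bias direction: the normalized difference of one definitional pair.
g = normalize([a - b for a, b in zip(he, she)])
debiased = neutralize(engineer, g)
print(round(dot(engineer, g), 6), round(dot(debiased, g), 6))
```

After the projection the word has no component along the chosen direction. As the points above note, this does not guarantee that other traces of the bias are gone, and a subpopulation with no clear definitional pairs has no obvious direction to project out.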
![Page 116: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/116.jpg)
This Class
» What are (some of) the different types of harm?
» Which subpopulations are likely to be affected?
» Where do these harms come from and what are some effective strategies to help mitigate them?
» Which software tools can help mitigate them?
![Page 117: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/117.jpg)
OPEN QUESTIONS
![Page 118: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/118.jpg)
Semi-Structured Interviews
[Holstein et al., 2019]
![Page 119: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/119.jpg)
Anonymous Survey
[Two bar charts, each with an axis running from 0 to 60]
Chart 1 categories: Natural Language…; Predictive Analytics; Computer Vision; Decision Support; Search / Info. Retrieval; Recommender Systems; Chatbots / Conversational…; Speech and Voice; User Modeling / Adaptive…; Robotics / Cyberphysical…
Chart 2 categories: Data Scientist; Researcher; Software Engineer; Technical Lead / Manager; Project / Program Manager; Domain / Content Expert; Executive / General…; Data Labeler; Social Scientist; Product Manager
![Page 120: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/120.jpg)
High-Level Themes
» Needs for support in auditing systems for biases in a diverse range of applications beyond allocation
» Needs for support in creating “fairer” datasets
» Needs for support in identifying subpopulations
» Needs for support in detecting biases with access only to coarse-grained, partial, or indirect information
![Page 121: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/121.jpg)
TAKEAWAYS
![Page 122: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/122.jpg)
3 Calls to Action
» Prioritize fairness at every stage of ML pipeline
» Fairness should be a first-order priority
» Involve diverse stakeholders & multiple perspectives
» Fairness is a non-trivial sociotechnical challenge
» Adopt a growth mindset & learn from failures
» Can’t solve fairness, can’t debias, can’t neutralize
![Page 123: Machine Learning & Fairness...» Works with pre-trained word embeddings » Harder to integrate into systems that learn embeddings » Not all subpopulations have a definitional “direction”](https://reader033.vdocument.in/reader033/viewer/2022060309/5f0a63297e708231d42b6331/html5/thumbnails/123.jpg)
http://www.microsoft.com/en-us/research/group/fate/
THANKS