FairText: An Algorithm-in-the-Loop Writing Platform for Inclusive Writing
Bhavya Ghai, PhD Candidate, Computer Science Department
Adviser: Klaus Mueller


Page 1

FairText: An Algorithm-in-the-Loop Writing Platform for Inclusive Writing

Bhavya Ghai
PhD Candidate, Computer Science Department
Adviser: Klaus Mueller

Page 2

Where I left off last time …

Big thanks to IACS!

Page 3

Outline

▪ Unconscious Bias
▪ Social Impacts
▪ Word Embeddings
▪ Motivation
▪ Our Approach
▪ Problem Statement
▪ Bias Identification
▪ Ranking Synonyms
▪ Current Status
▪ Usage Scenarios

Page 4

Picture a School Teacher …

Everyone holds unconscious beliefs based on past experiences

Page 5

Unconscious Bias in Text

Unconscious bias in language isn’t always intuitive, but its impact is real.

Male-gendered words: Lead, Analyze, Competitive, Active, Confident, Fearless, Exhaustive
Female-gendered words: Support, Responsible, Understanding, Dependable, Committed

Image: https://textio.ai/watch-your-gender-tone-2728016066ec

Page 6

Social Impact of Unconscious Bias (in Text)

The impact of bias can be felt in all areas, including education, career, healthcare, etc.

Image: https://textio.ai/watch-your-gender-tone-2728016066ec

Page 7

Word Embeddings

Word embeddings are a reflection of society.

Some ML model, via unsupervised learning over a large text corpus, maps each word to a dense vector:

WORD => [0.23, 0.86, 0.19, 0.49, .., .., .., ..]

Queen – King + Man = Woman

Popular examples: word2vec, fastText, GloVe

Used downstream in Sentiment Analysis, Machine Translation, Question Answering, Named Entity Recognition, …
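To make the analogy arithmetic on this slide concrete, here is a minimal sketch with tiny hand-made 4-dimensional vectors. All numbers are invented for illustration only; real word2vec/GloVe embeddings have hundreds of dimensions and come from a trained model.

```python
from math import sqrt

# Toy 4-D embeddings (hypothetical values, NOT from any trained model).
emb = {
    "king":  [0.9, 0.8, 0.1, 0.3],
    "queen": [0.9, 0.1, 0.8, 0.3],
    "man":   [0.1, 0.9, 0.1, 0.2],
    "woman": [0.1, 0.2, 0.9, 0.2],
}

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Analogy arithmetic: Queen - King + Man should land near "woman".
target = [q - k + m for q, k, m in zip(emb["queen"], emb["king"], emb["man"])]
best = max((w for w in emb if w not in {"queen", "king", "man"}),
           key=lambda w: cosine(emb[w], target))
print(best)
```

With these toy vectors, the nearest remaining word to the analogy point is "woman", mirroring the relation on the slide.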

Page 8

[Figure: words most closely associated with each group in word embeddings]

Female: dancer, teacher, lecturer, acid, insecurity, joy, love, humiliation, forgiveness, sadness
Male: engineer, manager, professor, military, soldier, aggression, fury, doubt, terrorist, capable
Black: aggression, insecurity, hunger, alienation, prisoner, hatred, retaliation, gangster
Islam: terror, terrorism, terrorist, missile, bomb, aggression, casualties, slaughter, violence

* Christianity was used as the reference religion for Islam, and White was used as the reference race for Black.

Page 9

Motivation

▪ One of the prime ways to tackle Unconscious bias is to make “the unconscious, conscious”

▪ Multiple research papers have established that ML algorithms capture human-like biases against specific races, genders, etc.

▪ Specifically, can we leverage the bias encoded in word embeddings for more inclusive writing?

Page 10

Our Approach

Pipeline: Humans → Text Editor → Text Corpus → Word Embedding → downstream applications (Sentiment Analysis, Machine Translation, Question Answering, …)

State-of-the-art research is focused on the later stages. We focus on building intelligent text editors that help humans tackle their implicit bias and generate unbiased text.

We need AI-powered text editors to tackle human bias at its core.

Page 11

Problem Statement

Given some text as input:

→ Can we identify which words are more likely to incite unconscious bias in the minds of readers?

→ Can we suggest alternate words that have similar meaning but don’t incite bias?

Return an unbiased version of the original text with the meaning preserved.

Page 12

Proposed Architecture

Human abilities are augmented with knowledge from NLP & psychology research.

Raw Text → [Part-of-speech Tagging, Named Entity Recognition] → Bias Identification Engine → [Neutral Word Recommender System, Rule-Based Engine] → Human → Neutral Text
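As a rough illustration of how the stages could fit together, here is a toy sketch of the pipeline. Both stage implementations are hypothetical stand-ins (a fixed word list and a fixed substitution table), not the actual FairText components.

```python
# Hypothetical stand-in for the Bias Identification Engine:
# flag a fixed list of gender-coded words.
def identify_biased_words(tokens):
    coded = {"fearless", "competitive", "dependable"}
    return [t for t in tokens if t.lower() in coded]

# Hypothetical stand-in for the Neutral Word Recommender System:
# a fixed substitution table.
def recommend_neutral(word):
    return {"fearless": "bold", "competitive": "driven"}.get(word.lower(), word)

text = "We want a fearless, competitive candidate"
tokens = text.replace(",", "").split()
flagged = identify_biased_words(tokens)
suggestions = {w: recommend_neutral(w) for w in flagged}
print(flagged, suggestions)
```

In the real architecture the flagged words would be highlighted in the editor and the suggestions offered to the human, who makes the final choice.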

Page 13

Bias Identification

Ideally, neutral words should be equidistant from either cluster.

Bias_score(word) = distance(word, g1) - distance(word, g2)

[Figure: a female cluster (she, her, herself, lady, mother) with centroid g1 and a male cluster (he, him, himself, father, man) with centroid g2; words such as nurse, teacher, actress, waitress, actor, doctor, warrior, boss, career, and family fall at varying distances between the two.]

Garg et al., "Word embeddings quantify 100 years of gender and ethnic stereotypes." Proceedings of the National Academy of Sciences (2018)
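The bias score above can be sketched directly. The toy 2-D vectors and cluster centroids below are invented for illustration; the real system works with high-dimensional embeddings and the gender clusters from the PNAS paper.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Toy 2-D embeddings (hypothetical values, for illustration only).
female = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]]   # e.g. she, her, mother
male   = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]   # e.g. he, him, father

def centroid(vecs):
    # Component-wise mean of a list of vectors.
    return [sum(c) / len(vecs) for c in zip(*vecs)]

g1, g2 = centroid(female), centroid(male)

def bias_score(vec):
    # Bias_score(word) = distance(word, g1) - distance(word, g2).
    # Negative -> closer to the female centroid g1; positive -> closer to
    # the male centroid g2; near zero -> roughly equidistant, i.e. neutral.
    return dist(vec, g1) - dist(vec, g2)

print(bias_score([0.8, 0.2]))   # near the female cluster -> negative
print(bias_score([0.5, 0.5]))   # equidistant -> ~0
```

Words whose score is far from zero are the candidates for highlighting.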

Page 14

Synonym Retrieval

Thesaurus: good-quality synonyms; lacks a similarity score; limited vocabulary
Word Embedding: fair-quality synonyms; similarity score; big vocabulary; lacks context
Contextualized Word Embedding: fair-quality synonyms; similarity score; big vocabulary; context-aware; slower inference

Contextualized word embeddings are the way to go!

Page 15

Ranking of Synonyms

Finding the right word is a bi-objective optimization problem.

[Scatter plot: candidate synonyms for "teacher" (assistant, coach, educator, instructor, lecturer, professor, scholar, schoolteacher, supervisor, tutor, adviser, guide, mentor, pundit, teach, trainer) plotted by Similarity (x-axis, 0–1.2) against Neutrality (y-axis, 0.15–0.45).]
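One common way to handle such a bi-objective problem is to scalarize it into a single weighted score. The sketch below does that with hypothetical similarity/neutrality values for a few synonyms of "teacher"; both the scores and the weight alpha are illustrative, not FairText's actual numbers.

```python
# Hypothetical (synonym, similarity, neutrality) tuples for "teacher".
candidates = [
    ("educator",   0.85, 0.40),
    ("instructor", 0.80, 0.38),
    ("professor",  0.70, 0.20),
    ("mentor",     0.55, 0.42),
]

def rank(cands, alpha=0.5):
    # Scalarize the two objectives with a convex combination:
    # score = alpha * similarity + (1 - alpha) * neutrality.
    return sorted(cands,
                  key=lambda c: alpha * c[1] + (1 - alpha) * c[2],
                  reverse=True)

for word, sim, neu in rank(candidates):
    print(word, sim, neu)
```

Tuning alpha trades off meaning preservation against neutrality; alternatives such as keeping the whole Pareto front are also possible.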

Page 16

Current State

A basic framework with text highlighting is implemented.

Page 17

Usage Scenarios

We haven’t made any assumptions about the domain, so the possibilities are endless!

▪ Job Postings
▪ Letters of Recommendation
▪ Political Speeches
▪ News Articles
▪ Natural Language Generation
▪ Integration into Mainstream Tools

Page 18

Plan of Research

▪ Improve bias identification using POS tagging & Named Entity Recognition
▪ Improve word recommendations using contextualized word embeddings
▪ Add an overall gender tone meter
▪ Evaluate using a user study:
  ▪ How often do users concur with the word highlighting?
  ▪ Do users adopt the suggestion or figure out a new way?
  ▪ Do user demographics play a role?
▪ Extend to other kinds of bias, like political, racial, etc.
▪ Detect whether phrases/sentences are biased, e.g. “Don’t be such a drama queen.”

It’s uncharted territory & there’s a lot to explore!

Page 19

Conclusion

▪ Unconscious bias is ubiquitous & has serious real-world consequences
▪ Identifying & mitigating bias in language is a tricky interdisciplinary problem
▪ We propose a novel approach that leverages the bias encoded in word embeddings
▪ Our approach works in real time & can have direct real-world impact as a product
▪ In the future, we plan to extend to different kinds of biases and detect bias in sentences

Every word matters, so choose your words carefully!

Image: https://www.dreamstime.com/royalty-free-stock-images-finish-line-image29185929

Page 20

Image: https://depositphotos.com/99431064/stock-photo-man-hand-writing-any-questions.html

Thank You …

Page 21

Why our approach?

▪ Scalable
▪ Works in real time
▪ Makes the unconscious, conscious

Page 22

Writing Platforms

Mechanical Era → Text-Mode Era → Graphical Era → Collaborative Era → Artificial Intelligence Era

AI-powered text editors are the way forward!

Source: https://textio.ai/the-dawn-of-the-augmented-writing-era-a0bb4f4376ff

Page 23

How Is Algorithmic Bias Impacting Society?

▪ Recidivism
▪ Allocative Harms
▪ Representation Harms

Algorithms end up replicating the bias encoded in their data.

Page 24

Sources of Bias

If the training data or the developers are biased, the algorithm will be biased.

Just as humans learn from teachers and books, algorithms learn from training data and developers.

Page 25

In the media …

Page 26

Algorithms vs Humans

* Algorithms are often implemented without any appeals method in place (due to the misconception that algorithms are objective, accurate, and won’t make mistakes)

* Algorithms are often used at a much larger scale than human decision makers, in many cases replicating an identical bias at scale (part of the appeal of algorithms is how cheap they are to use)

* Instead of just focusing on the least-terrible existing option, it is more valuable to ask how we can create better, less biased decision-making tools by leveraging the strengths of humans and machines working together

http://www.fast.ai/2018/08/07/hbr-bias-algorithms/

Human: biased, domain expertise, slow, interpretable, expensive, storytelling
Algorithm: unbiased, no domain knowledge, fast, opaque, economical, non-culpable

Humans and machines have their own pros & cons.

Page 27

Our Approach – Human-Centered AI

Our approach brings the best of both worlds!

▪ Propose an interactive visual interface to identify and tackle bias
▪ Understand underlying structures in the data using an interpretable model like causal inference
▪ Infuse domain knowledge into the system by modifying the causal network
▪ Evaluate debiased data using utility, distortion, individual fairness & group fairness

Human ⇄ Algorithm interaction:

▪ AI systems should understand humans
▪ AI should help humans understand it
▪ Computational creativity

https://www.nytimes.com/2018/03/07/opinion/artificial-intelligence-human.html

Page 28

Causal Networks & Debiasing

Causal Network: e.g. nodes CGPA, GRE Verbal, TOEFL, International, Admitted. Causal networks help identify bias.

Debiasing (for variables x, y, z, with edge weight w1 on x → y and w2 on y → z):

ynew = y – w1x
znew = z – w1w2x

Partial debiasing:

ynew = y – αw1x
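The debiasing step can be written out with toy numbers. The weights and variable values below are invented; they only show the arithmetic of removing the direct x→y effect, the indirect x→y→z effect, and a fraction of the effect under partial debiasing.

```python
# Toy linear causal model: protected attribute x influences y with edge
# weight w1, and z indirectly via y with weight w2 (illustrative numbers).
w1, w2 = 0.6, 0.5
x, y, z = 1.0, 2.0, 1.5

# Full debiasing: subtract the causal contribution of x.
y_new = y - w1 * x            # removes the direct x -> y effect
z_new = z - w1 * w2 * x       # removes the indirect x -> y -> z effect

# Partial debiasing: remove only a fraction alpha of the effect.
alpha = 0.5
y_partial = y - alpha * w1 * x

print(y_new, z_new, y_partial)
```

With alpha = 1 the partial formula reduces to full debiasing; with alpha = 0 the data is left unchanged.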

Page 29

Evaluation Objectives

Preserve utility, maximize fairness & minimize distortion.

▪ Utility: Accuracy, AUC, F1 score
▪ Distortion: SSE, MSE, MAPE, SMAPE
▪ Fairness:
  ▪ Individual fairness: IFM (k-NN)
  ▪ Group fairness: TPR, FPR, GDM
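As one concrete example of a group-fairness metric from this list, the sketch below compares true-positive rates across two groups (an equal-opportunity-style check). The labels are made-up toy data, not results from the system.

```python
def tpr(y_true, y_pred):
    # True-positive rate: fraction of actual positives predicted positive.
    pos = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p for _, p in pos) / len(pos)

# Toy (true label, predicted label) sequences for two groups.
g_a = ([1, 1, 0, 1], [1, 0, 0, 1])   # TPR = 2/3
g_b = ([1, 0, 1, 1], [1, 0, 1, 1])   # TPR = 3/3

# A large TPR gap between groups signals a group-fairness violation.
tpr_gap = abs(tpr(*g_a) - tpr(*g_b))
print(tpr_gap)
```

The same pattern applies to FPR gaps; combining several such metrics is what the slide means by using multiple fairness definitions.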

Page 30

Proposed Architecture

Humans can infuse domain knowledge by interacting with the causal network.

Raw Data → Causal Network → Debias → Debiased Data

Human supervision throughout, supported by semantic suggestions, visualization, and evaluation metrics.

Page 31

Why our Approach?

▪ Accountability: a human in charge can be held accountable
▪ Fairness: multiple fairness definitions are used
▪ Trust: the human expert infuses domain knowledge and brings more trust into the system
▪ Transparency: an interactive visual interface boosts transparency
▪ Multidisciplinary: investigate policies by traversing the causal network
▪ Data-driven storytelling

Introducing the human in the loop is the way forward!