
Bias in bios: fairness in a high-stakes machine-learning setting
Maria De-Arteaga

Joint PhD Student, Machine Learning & Public Policy

Advisors: Artur Dubrawski & Alexandra Chouldechova

Collaborators: Adam Kalai, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Krishnaram Kenthapadi, Sahin Geyik, Max Leiserson, Nathaniel Swinger, Neil Thomas Heffernan IV

What are the biases in our data?

Why do they matter?

What can we do about them?

Copyright © 2019 Maria De-Arteaga

What are the biases in my data?

Why do they matter?

What can we do about them?

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting (FAT* 2019)
Maria De-Arteaga (CMU), Alexey Romanov (UMASS), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Adam Kalai (MSR)

What are the biases in my word embedding? (AIES 2019)
Nathaniel Swinger* (Lexington HS), Maria De-Arteaga* (CMU), Neil Thomas Heffernan IV (Shrewsbury HS), Mark Leiserson (UMD), Adam Kalai (MSR)

What's in a Name? Reducing Bias in Bios without Access to Protected Attributes (NAACL 2019)
Alexey Romanov (UMASS), Maria De-Arteaga (CMU), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Anna Rumshisky (UMASS), Adam Kalai (MSR). Best Thematic Paper :)

Humans and high-stakes predictions

Data → Decision-maker → Prediction-informed decision

Defendant’s record → Judge → Bail?
Candidate’s CV → Recruiter → Interview? Hire?
Patient’s monitoring → Physician → Life-sustaining therapies?

Humans, machines and high-stakes predictions

Data → Machine prediction → Human decision

Machines are better than humans at making predictions! [Meehl’54, Dawes’89, Grove’00]

But what happens when the available data embeds societal biases?

What are the risks of semantic representation bias?

Input data → Semantic representation → Machine learning algorithm

In this talk...

What are the risks of semantic representation bias?

In this talk...

Part 1: Representational harms

What are the biases in my word embedding? (AIES 2019)

Nathaniel Swinger* (Lexington HS), Maria De-Arteaga* (CMU), Neil Thomas Heffernan IV (Shrewsbury HS), Mark Leiserson (UMD), Adam Kalai (MSR)

What are the risks of semantic representation bias?

In this talk...

Part 2: Allocative harms

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting (FAT* 2019)
Maria De-Arteaga (CMU), Alexey Romanov (UMASS), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Adam Kalai (MSR)

What are the risks of semantic representation bias?

In this talk...

Part 3: Mitigating allocative harms

What's in a Name? Reducing Bias in Bios without Access to Protected Attributes (NAACL 2019)
Alexey Romanov (UMASS), Maria De-Arteaga (CMU), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Anna Rumshisky (UMASS), Adam Kalai (MSR). Best Thematic Paper :)

Word embeddings

Slide created by Adam Kalai

Word embeddings

Man :: computer programmer

Woman ::

Slide created by Adam Kalai
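To make the analogy concrete, here is a minimal sketch (not part of the talk) of the vector arithmetic behind this slide; it assumes the gensim library and its downloadable pretrained Google News word2vec model.

import gensim.downloader as api

# Load a pretrained embedding (large download on first use).
model = api.load("word2vec-google-news-300")

# "man is to computer_programmer as woman is to ...?"
# most_similar adds/subtracts the vectors and returns the nearest remaining words.
print(model.most_similar(positive=["computer_programmer", "woman"],
                         negative=["man"], topn=5))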

Word embeddings

Slide created by Adam Kalai

Embedding geometry: proximity and parallelism

Slide created by Adam Kalai


Word embeddings


Slide created by Adam Kalai

Word embeddings

What are the biases in my word embedding?

(beyond gender bias)

Credit: Adam Kalai

Implicit Association Test [Greenwald’98]

Is there an implicit association between categories?

Name groups: X1 = {Amanda, Zoe, Emily} and X2 = {John, Adam, Paul}. Attribute groups: A1 = {kids, family, wedding} and A2 = {salary, office}.


Implicit Association Test [Greenwald’98]

Setting 1: the categories are paired as Female + Career vs. Male + Family. Stimuli such as "salary", "Paul", "Emily", and "wedding" appear one at a time and must be sorted into one of the two pairings.

Setting 2: the pairings are flipped to Female + Family vs. Male + Career, and the same kind of stimuli ("salary", "Emily", "wedding", "John") are sorted again.

Implicit Association Test [Greenwald’98]

Differences in average response time between Setting 1 and Setting 2?

Word Embedding Association Test (WEAT) [Caliskan et al, 2017]

Same setup as the IAT, but measured in the embedding: name groups X1 = {Amanda, Zoe, Emily} and X2 = {John, Adam, Paul}; attribute groups A1 = {kids, family, wedding} and A2 = {salary, office}.

Are there differences in average distances between the groups of words?
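As an illustration of the quantity being asked about, here is a rough WEAT-style statistic: the difference in average cosine similarity between each name group and the two attribute groups. This is a simplified sketch written for this transcript, not the original test code; `emb` is assumed to be any mapping from word to vector (e.g. the gensim model above).

import numpy as np

def cos(u, v):
    # cosine similarity between two vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_sim(word, attrs, emb):
    # average similarity of one word to a set of attribute words
    return np.mean([cos(emb[word], emb[a]) for a in attrs])

def weat_statistic(X1, X2, A1, A2, emb):
    # s(x): how much closer x is to A1 than to A2
    s = lambda x: mean_sim(x, A1, emb) - mean_sim(x, A2, emb)
    # positive value: X1 leans toward A1 while X2 leans toward A2
    return np.mean([s(x) for x in X1]) - np.mean([s(x) for x in X2])

# Example with the groups on this slide (assuming all words are in `emb`):
# weat_statistic(["amanda", "zoe", "emily"], ["john", "adam", "paul"],
#                ["kids", "family", "wedding"], ["salary", "office"], emb)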

Word Embedding Association Test (WEAT) [Caliskan et al, 2017]

1. Which sets X1, X2, A1, A2 should we consider?

2. How do we deal with the combinatorial explosion that arises when considering intersectional groups?

From "Is bias X in my word embedding?" [Caliskan’17] to "What are the biases in my word embedding?" [Swinger* and De-Arteaga* et al, AIES, 2019]: unsupervised bias enumeration.

Generalized Word Embedding Association Test [Swinger* and De-Arteaga* et al, 2018]

Extends the test beyond the original two-group case: it covers n = 2, n = 1, and n > 1 (via decomposition).

Unsupervised Bias Enumeration (UBE) algorithm

Input: a word embedding. The groups and attribute categories are discovered automatically.

Step 1: Discover groups

Group first names by their embeddings, e.g. X1 = {Amanda, Zoe, Erika}, X2 = {Markisha, Latisha, Tyrique}, X3 = {Yael, Moses, Michal} (a clustering sketch follows below).
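A minimal sketch of the group-discovery idea, assuming scikit-learn and an embedding lookup `emb`; the name list and number of clusters are placeholders, and this is not the paper's implementation.

import numpy as np
from sklearn.cluster import KMeans

def discover_name_groups(names, emb, k=3):
    # stack the name vectors and cluster them
    vecs = np.stack([emb[n] for n in names])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(vecs)
    # return the names grouped by cluster id
    return {i: [n for n, lab in zip(names, labels) if lab == i] for i in range(k)}

The same kind of clustering can be reused in Step 2 to group the remaining vocabulary into candidate categories.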

Step 2: Discover word categories

Group the rest of the vocabulary into categories A1, ..., A4, e.g. occupations (nurse, translator, lawyer), foods (potatoes, tortillas), family words (husband, aunt), currencies (pesos, rubles).

Step 3: Partition Aj

Each category Aj (e.g. {tortillas, tequila, kosher, hummus, caviar}) is split across the name groups: Ai,j contains the top t words in Aj most strongly associated with group Xi.
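A sketch of this partition step as I read it (the exact scoring in the paper may differ): score each word in Aj by how much closer it is to group Xi than to the other groups, and keep the top t per group. It reuses cos() from the WEAT sketch above.

import numpy as np

def partition_category(Aj, groups, emb, t=5):
    def mean_sim(word, names):
        return np.mean([cos(emb[word], emb[n]) for n in names])
    parts = {}
    for i, Xi in groups.items():
        others = [n for j, Xj in groups.items() if j != i for n in Xj]
        # words most associated with Xi relative to the other groups
        ranked = sorted(Aj, key=lambda w: mean_sim(w, Xi) - mean_sim(w, others),
                        reverse=True)
        parts[i] = ranked[:t]
    return parts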

Step 4: Establish statistical significance

Is Ai,j significantly closer to Xi than could be expected through sheer randomness?

For each pair (Xi, Ai,j), compute a statistic σi,j that measures the strength of their association. Is σi,j significantly large?

Rotational null hypothesis

1. Rotate the name groups: X → XUr, where Ur is a random rotation.

2. Find the partitions Ai,j,r induced by the rotated groups.

3. Calculate σi,j,r for each rotation.

4. Calculate the p-value over R = 10,000 random rotations, i.e. the fraction of rotations whose statistic is at least as large as the observed one, with add-one smoothing:

pi,j = [ #{ r : σi,j,r ≥ σi,j } + 1 ] / [ R + 1 ],  r = 1, 2, ..., R
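A sketch of this rotational test. Assumptions on my part: `X` is the matrix of name vectors, `sigma_fn` computes the association statistic for a (possibly rotated) copy of X, and random rotations are drawn via the QR decomposition of a Gaussian matrix.

import numpy as np

def random_rotation(d, rng):
    # QR of a Gaussian matrix gives a (Haar-)random orthogonal matrix
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    return Q * np.sign(np.diag(R))

def rotation_p_value(sigma_obs, sigma_fn, X, R=10_000, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    exceed = 0
    for _ in range(R):
        Ur = random_rotation(d, rng)
        if sigma_fn(X @ Ur) >= sigma_obs:   # rotated statistic at least as large
            exceed += 1
    return (exceed + 1) / (R + 1)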

5. Determine the critical p-value, with an α-bounded guarantee on the false discovery rate (Benjamini-Hochberg).

Disclaimer

The following slides contain offensive stereotypes. These do not reflect our views.

Crowdsourcing evaluation

Qualification: 36 names, 3 per group; +1 point per name placed in the correct group. Workers proceed if their accuracy is > 50%.

Is the UBE output consistent with society's stereotypes? For each WEAT:

• The groups in the output, {X1, X2, ..., Xk} and {A1, A2, ..., Ak}, are shown.
• For each name group Xi, which group Ai contains words most stereotypically associated with these names?

Is it offensive? If the most commonly chosen group matches the UBE pairing, workers rate it on a 1 to 7 scale between "politically correct, inoffensive, or just random" and "politically incorrect, possibly very offensive".

Disclaimer

The following slides contain offensive stereotypes. These do not reflect our views or the views of the crowd workers.

Crowdsourcing evaluation

*These associations do not reflect our views or those of the crowd workers.

Why does this matter?

● Representational harms

● Harmful bias encoded in semantic representations used for learning

● Removing names is not enough to get rid of bias!

○ Words in the discovered category clusters may be used as proxies for gender, race, etc. (e.g. hostess vs. cab driver, volleyball vs. cornerback)

What are the risks of semantic representation bias?

In this talk...

Part 2: Allocative harms

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting (FAT* 2019)
Maria De-Arteaga (CMU), Alexey Romanov (UMASS), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Adam Kalai (MSR)

An artificially intelligent headhunter?

Can we quantify the risks of incorporating ML in hiring and recruiting pipelines?

Can we quantify the risks of incorporating ML in hiring and recruiting pipelines? Can we characterize the effects?

Our findings:
● Gender accuracy gap in a large-scale study
● "Scrubbing" gender indicators ≠ gender blindness
● Compounding imbalances

Computer Programmer

BLACK FEMALE

Slides created by Adam Kalai

Bias in bios: Biographies dataset

● 400,000 third-person web bios from Common Crawl.

"Xxx Xxx is a(n) (xxx) [title]... he/she..." with title ∈ {common BLS SOC titles} (a toy extraction pattern is sketched below).

Alexandra Chouldechova is an Assistant Professor of Statistics and Public Policy at Carnegie Mellon University's Heinz College of Information Systems and Public Policy. She received her B.Sc. from the University of Toronto in 2009, and in 2014 she completed her Ph.D. in Statistics at Stanford University. While at Stanford, she also worked at Google and Symantec on developing statistical assessment methods for information retrieval systems.

● Classification problem: predict one of 28 titles from the bio text.
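As an illustration of the bio pattern above, here is a toy extraction sketch. It is not the dataset's actual code; TITLES is a placeholder subset, and the real pipeline handles many more cases.

import re

TITLES = ["professor", "surgeon", "nurse", "attorney", "photographer"]  # placeholder subset
PATTERN = re.compile(
    r"^[A-Z][a-z]+ [A-Z][a-z]+ is an? (?:\w+ )?(" + "|".join(map(re.escape, TITLES)) + r")\b",
    re.IGNORECASE,
)

m = PATTERN.match("Alexandra Chouldechova is an Assistant Professor of Statistics and Public Policy ...")
print(m.group(1) if m else "no match")   # -> "Professor"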

400k total bios

(Histogram of bios per occupation, log-frequency scale. Slide created by Adam Kalai.)

Learning pipeline

Input data: Biographies

Semantic representations:
1. Bag-of-words
2. Word embedding
3. Deep neural network (GRU) with attention

Objective: Predict Y = Occupation (a minimal bag-of-words baseline is sketched below)
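A minimal bag-of-words baseline in the spirit of representation (1). This is an illustrative sketch rather than the paper's code; `bios` and `titles` are assumed to be parallel lists of bio texts and occupation labels.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X_train, X_test, y_train, y_test = train_test_split(bios, titles, random_state=0)

clf = make_pipeline(CountVectorizer(min_df=5),
                    LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))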

Gender sensitivity: How do predictions change if explicit gender indicators are swapped?

[Bertrand, Mullainathan’04]
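One crude way to probe this (an illustration only, not the paper's exact procedure, which also swaps names): flip the gendered pronouns in a bio and compare the classifier's predicted probabilities before and after. `clf` is the pipeline sketched above.

import re

SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
         "his": "her", "hers": "his"}  # crude: ignores case restoration and her/his ambiguity

def swap_gender(text):
    return re.sub(r"\b\w+\b",
                  lambda m: SWAPS.get(m.group(0).lower(), m.group(0)), text)

bio = "She is a surgeon. Her research focuses on trauma care."
print(clf.predict_proba([bio]))
print(clf.predict_proba([swap_gender(bio)]))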


Beyond explicit gender indicators: the gender accuracy gap

(Plot: accuracy on females minus accuracy on males, by occupation. Positive values: more accurate on females; negative values: more accurate on males.)

Compounding gender imbalance

(Same plot: accuracy on females minus accuracy on males, by occupation.)

Compounding imbalance

If the female fraction p < 0.5 and the gender gap < 0 for a title, then the female fraction among true positives is < p (similarly for males).
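To see why the statement holds, write p for the fraction of females among true holders of a title and r_F, r_M for the per-gender recalls (notation mine). The female fraction among true positives is then

\[
  \tilde{p} \;=\; \frac{p\, r_F}{p\, r_F + (1-p)\, r_M},
  \qquad
  \tilde{p} < p \iff r_F < r_M ,
\]

so whenever the gender gap is negative, the female share among true positives shrinks below p.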

Compounding injustice [Hellman’18]

If the initial imbalance constitutes an injustice, the model's prediction is informed by, and compounds, previous injustice.

Compounding imbalances: Surgeons

Females in data: 14.6%. Recall: 71% for males, 54% for females. Females among true positives: 11.6%.
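Plugging the (rounded) surgeon numbers into the expression above:

\[
  \tilde{p} \;\approx\; \frac{0.146 \times 0.54}{0.146 \times 0.54 + 0.854 \times 0.71}
  \;\approx\; \frac{0.079}{0.685} \;\approx\; 0.115,
\]

which is close to the 11.6% reported on the slide (the small difference comes from rounding the recalls).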

"Scrub" explicit gender indicators?

≈ same accuracy with and without explicit gender indicators.

Compounding imbalances

Slide created by Adam Kalai

Can we mitigate this problem?

● Additional challenges:

○ Sensitive attributes may be unavailable, or it may be illegal to use them

○ Need to consider several attributes and their intersections

➢ Race, gender, ethnicity, ...

What are the risks of semantic representation bias?

In this talk...

Part 3: Mitigating allocative harms

What's in a Name? Reducing Bias in Bios without Access to Protected Attributes (NAACL 2019)
Alexey Romanov (UMASS), Maria De-Arteaga (CMU), Hanna Wallach (MSR), Jennifer Chayes (MSR), Christian Borgs (MSR), Alexandra Chouldechova (CMU), Sahin Geyik (LinkedIn), Krishnaram Kenthapadi (LinkedIn), Anna Rumshisky (UMASS), Adam Kalai (MSR). Best Thematic Paper :)

Names encode societal biases, and…

"What's in a name? That which we call a rose

By any other name would smell as sweet."

William Shakespeare, Romeo and Juliet


Main idea

● Leverage biases present in word embeddings

○ Use embeddings of names as "universal proxies"

○ No need to define protected groups

● Embeddings are used only in the loss calculation (see the sketch below)

○ No need for names or protected attributes during deployment

○ Gains extend to individuals who are poorly proxied

Credit: Alexey Romanov
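A rough sketch of the general idea, not the paper's exact losses: add a term that penalizes covariance between the classifier's outputs and the embedding of each bio's name, so the name vectors act as proxies only during training. PyTorch, the batch tensors, and the regularization weight are assumptions on my part.

import torch
import torch.nn.functional as F

def covariance_penalty(logits, name_emb):
    # center log-probabilities and name embeddings over the batch
    logp = F.log_softmax(logits, dim=1)
    p = logp - logp.mean(dim=0, keepdim=True)
    z = name_emb - name_emb.mean(dim=0, keepdim=True)
    # batch covariance between every class score and every embedding dimension
    cov = p.t() @ z / logits.shape[0]
    return cov.pow(2).sum()

def training_loss(logits, labels, name_emb, lam=0.1):
    # standard occupation loss plus the proxy-based regularizer
    return F.cross_entropy(logits, labels) + lam * covariance_penalty(logits, name_emb)

Because the penalty only enters the loss, names (or any protected attribute) are not needed at deployment time, matching the bullet above.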

Names are indeed “universal proxies”

Algorithms: regularize accuracy gaps

Slides created by Adam Kalai

UCI Adult dataset

Slide created by Adam Kalai

Slide created by Alexey Romanov

Summary

● Unsupervised bias enumeration algorithm for word embeddings

○ Problematic societal biases encoded in widely used embeddings

● Link between accuracy gap and compounding injustices

● Large-scale dataset of online bios for occupation classification*

○ Gender imbalance compounded, even if explicit indicators are "scrubbed"

● Bias in word embeddings can be leveraged to mitigate bias without access to protected attributes

*Code to reproduce the dataset is publicly available: aka.ms/biasbios

Slide created by Adam Kalai

We now have some results for Spanish!

Thanks!

[email protected]

