perceptron and multilayer perceptron · applications show bias in favor of male candidate risk...

20
CS109A Introduction to Data Science Pavlos Protopapas, Kevin Rader and Chris Tanner Protopapas Perceptron and Multilayer Perceptron

Upload: others

Post on 19-Mar-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A Introduction to Data SciencePavlos Protopapas, Kevin Rader and Chris Tanner

Protopapas

Perceptron and Multilayer Perceptron

Page 2: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 2

Outline

1. Introduction to Artificial Neural Networks

2. Review of basic concepts

3. Single Neuron Network (‘Perceptron’)

4. Multi-Layer Perceptron (MLP)

Page 3: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER

Page 4: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 5

Today’s news

Stopping Cyberattacks

Detecting tampering with the diagnostic images, or quietly upped the radiation levels.

Skin Conditions

Using Deep Learning in diagnosing skin conditions

Page 5: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 6

Today’s news

Image generation

Katie Bouman’s CHIRP produces the first-ever image of a black hole.

Computer Code Generation

Page 6: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 7

The Potential of Data Science

Gender Bias Racial Bias

Some DS models for evaluate job applications show bias in favor of male candidate

Risk models used in US courts have shown to be biased against non-white defendants

Page 7: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 8

Historical Trends

Page 8: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 9

Historical Trends

Disease prediction Game strategy

Natural Language Processing

2016

2018

Page 9: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 10

Historical Trends

ArXiv papers on Machine Learning and Artificial Intelligence: 2007-2019

Page 10: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 11

Outline

1. Introduction to Artificial Neural Networks

2. Review of basic concepts 3. Single Neuron Network (‘Perceptron’)

4. Multi-Layer Perceptron (MLP)

Page 11: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 12

Classification example: Heart Data

Age Sex ChestPain RestBP Chol Fbs RestECG MaxHR ExAng Oldpeak Slope Ca Thal AHD

63 1 typical 145 233 1 2 150 0 2.3 3 0.0 fixed No

67 1 asymptomatic 160 286 0 2 108 1 1.5 2 3.0 normal Yes

67 1 asymptomatic 120 229 0 2 129 1 2.6 2 2.0 reversable Yes

37 1 nonanginal 130 250 0 0 187 0 3.5 3 0.0 normal No

41 0 nontypical 130 204 0 2 172 0 1.4 1 0.0 normal No

response variable Yis Yes/No

Page 12: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 13

Heart Data: logistic estimation

We'd like to predict whether or not a person has a heart disease. And we'd like to make this prediction, for now, just based on the MaxHR.

𝑃 𝑌 = 1 =1

1 + 𝑒!(#!$#"%)

The logistic regression model uses a function, called the logistic function, to model 𝑃 𝑦 = 1 :

−𝛽!𝛽"

Page 13: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 14

Logistic Regression

Find the coefficients that minimize the loss function

ℒ 𝛽#, 𝛽$ = −)%

[𝑦% log 𝑝% + 1 − 𝑦% log(1 − 𝑝%)]

Page 14: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 15

Outline

1. Introduction to Artificial Neural Networks

2. Review of basic concepts

3. Single Neuron Network (‘Perceptron’)

4. Multi-Layer Perceptron (MLP)

Page 15: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 16

Logistic Regression Revisited

… ……

ℒ(𝛽) =(!

"

ℒ! 𝛽

True y

𝑥" Affine ℎ" = 𝛽! + 𝛽"𝑥" Activation 𝑝" =1

1 + 𝑒#$!ℒ# 𝛽 = −𝑦# ln 𝑝# − 1 − 𝑦# ln(1 − 𝑝#)Loss Fun

𝑥% Affine ℎ% = 𝛽! + 𝛽"𝑥% Activation 𝑝% =1

1 + 𝑒#$"ℒ$ 𝛽 = −𝑦$ ln 𝑝# − 1 − 𝑦$ ln(1 − 𝑝$)Loss Fun

𝑥& Affine ℎ& = 𝛽! + 𝛽"𝑥& Activation 𝑝& =1

1 + 𝑒#$#ℒ" 𝛽 = −𝑦" ln 𝑝" − 1 − 𝑦" ln(1 − 𝑝#")Loss Fun

Page 16: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 17

Build our first ANN

ℒ(𝛽) =0'

&

ℒ' 𝛽Affine𝑋 ℎ = 𝛽! + 𝑋𝛽" Activation 𝑝 =1

1 + 𝑒#$ Loss Fun

ℒ(𝛽) =0'

&

ℒ' 𝛽Affine𝑋 ℎ = 𝑋𝑊 + 𝑏 Activation 𝑦^=

11 + 𝑒#$ Loss Fun

matrix: 𝑛$×𝑛%

vector: 𝑛%×1

𝑛#: number of predictors𝑛$: number of observations

vector: 𝑛&×1

ℒ(𝛽) =0'

&

ℒ' 𝛽Affine𝑋 ℎ = 𝑋𝑊 Activation 𝑦^=

11 + 𝑒#$ Loss Fun

𝑋 =1 𝑋"" …1 ⋮ …1 𝑋)" …

𝑋"*⋮𝑋)*

W =

𝑏𝑊!⋮𝑊"

scalar

Page 17: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 18

Build our first ANN

ℒ(𝑊) =0'

&

ℒ' 𝑊Affine𝑋 ℎ = 𝑋𝑊 Activation 𝑦^=

11 + 𝑒#$ Loss Fun

“Sigmoid activation” 𝝈

𝑋 𝑌σ𝑋𝑊

Page 18: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER 19

Build our first ANN

ℒ(𝑊) =0'

&

ℒ' 𝑊Affine𝑋 ℎ = 𝑋𝑊 Activation 𝑦^=

11 + 𝑒#$ Loss Fun

“Sigmoid activation” 𝝈

𝑋 𝑌σ𝑋𝑊

Page 19: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

CS109A, PROTOPAPAS, RADER, TANNER

Page 20: Perceptron and Multilayer Perceptron · applications show bias in favor of male candidate Risk models used in US courts have shown to be biased against non-white defendants. CS109A,

Today’s lucky student: NOT PAVLOS Build a Single Neuron by Hand

21