8 logistic regressioncourses.cs.vt.edu/cs4824/spring19/slide_pdfs/8 logistic...logistic regression...

21
Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech

Upload: others

Post on 15-Oct-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Logistic Regression

Machine Learning CS4824/ECE4424

Bert Huang Virginia Tech

Page 2: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Outline• Review conditional probability and classification

• Linear parameterization and logistic function

• Gradient descent

• Other optimization methods

• Regularization

Page 3: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Classification and Conditional Probability

• Discriminative learning: maximize p(y|x)

• Naive Bayes: learn p(x|y) and p(y)

• Today: parameterize p(y|x) and directly learn p(y|x)

Page 4: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Parameterizing p(y|x)

p(y |x) := f

f : Rd ! [0, 1]

f (x) :=1

1 + exp(�w>x)

Page 5: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Parameterizing p(y|x)

p(y |x) := f

f : Rd ! [0, 1]

f (x) :=1

1 + exp(�w>x)

Page 6: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Logistic Function�(x) =

1

1 + exp(�x)

Page 7: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

limx!�1

�(x) = limx!�1

1

1 + exp(�x)= 0.0

Logistic Function�(x) =

1

1 + exp(�x)

�(0) =1

1 + exp(�0)=

1

1 + 1= 0.5

limx!1

�(x) = limx!1

1

1 + exp(�x)=

1

1= 1.0

Page 8: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

From Features to Probability

f (x) :=1

1 + exp(�w>x)

Page 9: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Likelihood Functiony 2 {�1,+1} L(w) :=

((1 + exp(�w>x))�1 if y = +1

1� (1 + exp(�w>x))�1 if y = �1

1� (1 + exp(�w>x))�1

= 1� 1

1 + exp(�w>x)=

1 + exp(�w>x)

1 + exp(�w>x)� 1

1 + exp(�w>x)

=exp(�w>x)

1 + exp(�w>x) =

✓1 + exp(�w>x)

exp(�w>x)

◆�1

=

✓1

exp(�w>x)+

exp(�w>x)

exp(�w>x)

◆�1

= (1 + exp(w>x))�1

Page 10: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Likelihood Functiony 2 {�1,+1} L(w) :=

((1 + exp(�w>x))�1 if y = +1

(1 + exp(+w>x))�1 if y = �1

= (1 + exp(�yw>x))�1

L(w) =nY

i=1

(1 + exp(�yiw>xi ))

�1

log L(w) = �nX

i=1

log(1 + exp(�yiw>xi ))

Page 11: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Likelihood Functionlog L(w) = �

nX

i=1

log(1 + exp(�yiw>xi ))

w argminw

nX

i=1

log(1 + exp(�yiw>xi ))

negative log likelihood

:= nll(w)

rwnll = �nX

i=1

✓1� 1

1 + exp(�yiw>xi )

◆yixi

Page 12: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

w argminw

nX

i=1

log(1 + exp(�yiw>xi )) := nll(w)

rwnll = �nX

i=1

✓1� 1

1 + exp(�yiw>xi )

◆yixi

rwnll =nX

i=1

✓1

1 + exp(�yiw>xi )⇥rw

�1 + exp(�yiw

>xi )�◆

0� exp(�yiw>xi )yixi

rwnll =nX

i=1

✓� exp(�yiw>xi )yixi1 + exp(�yiw>xi )

=nX

i=1

�✓

exp(�yiw>xi )

1 + exp(�yiw>xi )

◆yixi

:= g

Page 13: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Gradient Descent

0 0.5 1 1.5 2−0.5

0

0.5

1

1.5

2

2.5

3

Image from Murphy Ch. 8

wt+1 ! wt � ⌘tgt

⌘t =1

t

⌘t =1pt

e.g.,

⌘t = 1

Page 14: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

0 0.5 1 1.5 2−0.5

0

0.5

1

1.5

2

2.5

3

Gradient Descent

Image from Murphy Ch. 8

wt+1 ! wt � ⌘tgt

⌘t =1

t

⌘t =1pt

e.g.,

⌘t = 1

Page 15: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Line Search⌘t = argmin

⌘nll(w � ⌘g)

nll(w

�⌘g)

Page 16: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Line Searchexact line searching 1

0 0.5 1 1.5 2−0.5

0

0.5

1

1.5

2

2.5

3

Image from Murphy Ch. 8

⌘t = argmin⌘

nll(w � ⌘g)

nll(w

�⌘g)

Page 17: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

exact line searching 1

0 0.5 1 1.5 2−0.5

0

0.5

1

1.5

2

2.5

3

Page 18: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Momentum

• Heavy ball method

• Momentum term retains “velocity” from previous steps

• Avoids sharp turns

st �gt + �tst�1

wt wt�1 + ⌘tst

Page 19: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Second-Order Optimization

• Newton’s method

• Approximate function with quadratic and minimize quadratic exactly

• Requires computing Hessian (matrix of second derivatives)

• Various approximation methods (e.g., L-BFGS)

Page 20: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

RegularizationL(w) = p(w)

nY

i=1

(1 + exp(�yiw>xi ))

�1

p(w) = N✓0,

1

�I

◆posterior

prior

neg log posterior (regularized nll)

gradient

� log L(w) = ��

2w>w +

nX

i=1

log(1 + exp(�yiw>xi ))

rwnll = ��w �nX

i=1

✓1� 1

1 + exp(�yiw>xi )

◆yixi

Page 21: 8 logistic regressioncourses.cs.vt.edu/cs4824/Spring19/slide_pdfs/8 logistic...Logistic Regression Machine Learning CS4824/ECE4424 Bert Huang Virginia Tech Outline • Review conditional

Summary• Review conditional probability and classification

• Linear parameterization and logistic function

• Gradient descent

• Other optimization methods

• Regularization