Regression and Gradient Descent - GitHub Pages · 2020. 12. 20. · Linear regression with batch...
TRANSCRIPT
-
Regression and Gradient Descent
Source: Intro. to Machine Learning by Andrew Ng, Stanford, Coursera
-
Notation:
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
Training set of housing prices (Portland, OR):

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …
-
[Figure: Housing Prices (Portland, OR): scatter plot of Price (in 1000s of dollars) vs. Size (feet²).]
Regression Problem: predict real-valued output.
-
Training Set:

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
\theta_i's: parameters
How to choose \theta_i's?
-
Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
Parameters: \theta_0, \theta_1
Cost Function: J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
Goal: \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)
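The cost function above can be sketched directly in code. This is a minimal illustration using the four housing examples from the table in this transcript; the function and variable names are my own, not from the lecture.

```python
# Training examples from the housing-prices table (Portland, OR).
xs = [2104, 1416, 1534, 852]   # size in feet^2
ys = [460, 232, 315, 178]      # price in $1000's

def h(theta0, theta1, x):
    """Hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def J(theta0, theta1):
    """Squared-error cost J = 1/(2m) * sum_i (h(x_i) - y_i)^2."""
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

For instance, J(0, 0) measures how far the all-zero hypothesis is from the data; any slope that brings predictions closer to the prices lowers J.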
-
h_\theta(x) (for fixed \theta_0, \theta_1, this is a function of x)    J(\theta_0, \theta_1) (function of the parameters)
[Figure: Price ($) in 1000's vs. Size in feet² (x), with the corresponding cost J.]
-
Gradient descent
-
Have some function J(\theta_0, \theta_1)
Want \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)

Outline:
• Start with some \theta_0, \theta_1
• Keep changing \theta_0, \theta_1 to reduce J(\theta_0, \theta_1), until we hopefully end up at a minimum
-
[Figure: surface plot of J(\theta_0, \theta_1) over \theta_0 and \theta_1.]
-
If α is too small, gradient descent can be slow.
If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
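The two failure modes above are easy to see on a toy one-dimensional cost. The sketch below is my own illustration, assuming J(θ) = θ² (so dJ/dθ = 2θ); the α values are arbitrary examples.

```python
# Effect of the learning rate alpha on gradient descent for J(theta) = theta^2.
def descend(alpha, theta=1.0, steps=20):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # theta := theta - alpha * dJ/dtheta
    return theta

small = descend(alpha=0.01)   # too small: still far from 0 after 20 steps
good  = descend(alpha=0.1)    # converges quickly toward the minimum at 0
big   = descend(alpha=1.5)    # too large: each step overshoots and diverges
```

With α = 1.5 the multiplier per step is (1 − 2α) = −2, so |θ| doubles every iteration instead of shrinking.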
-
Gradient descent can converge to a local minimum even with the learning rate \alpha held fixed: at local optima the derivative is zero, so the update leaves the current value of \theta_1 unchanged.
-
Gradient descent algorithm:
repeat until convergence {
    \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)   (simultaneously for j = 0 and j = 1)
}
-
Gradient descent algorithm

Correct: simultaneous update
temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
\theta_0 := temp0
\theta_1 := temp1

Incorrect (non-simultaneous):
temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
\theta_0 := temp0
temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
\theta_1 := temp1
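The difference between the two update orders can be shown concretely. This is a sketch on an assumed toy cost J(θ₀, θ₁) = θ₀² + θ₀θ₁ (so the partials are 2θ₀ + θ₁ and θ₀); the function names are mine, not from the lecture.

```python
# Partial derivatives of the toy cost J(t0, t1) = t0^2 + t0*t1.
def dJ0(t0, t1):
    return 2 * t0 + t1

def dJ1(t0, t1):
    return t0

def simultaneous_step(t0, t1, alpha):
    temp0 = t0 - alpha * dJ0(t0, t1)   # both temps read the OLD t0, t1
    temp1 = t1 - alpha * dJ1(t0, t1)
    return temp0, temp1

def sequential_step(t0, t1, alpha):
    t0 = t0 - alpha * dJ0(t0, t1)      # t1's update now sees the NEW t0
    t1 = t1 - alpha * dJ1(t0, t1)
    return t0, t1
```

Starting from (1, 1) with α = 0.1, the two versions already disagree on θ₁ after one step, which is why the simultaneous form is the one called "correct" above.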
-
Gradient descent algorithm:
repeat until convergence { \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) (for j = 0, 1) }

Linear Regression Model:
h_\theta(x) = \theta_0 + \theta_1 x
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
-
Gradient descent algorithm (derivatives plugged in for linear regression):
repeat until convergence {
    \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
    \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
}
update \theta_0 and \theta_1 simultaneously
-
[Figure: surface plot of J(\theta_0, \theta_1) over \theta_0 and \theta_1.]
-
h_\theta(x) (for fixed \theta_0, \theta_1, this is a function of x)    J(\theta_0, \theta_1) (function of the parameters)
[Figure sequence: nine slides stepping through gradient descent, each showing the current fitted line on the left and the corresponding point on the plot of J(\theta_0, \theta_1) on the right.]
-
Linear regression with gradient descent
Repeat {
    \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for every j)
}
-
Linear regression with Batch Gradient Descent
Repeat {
    \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for every j)
}
"Batch": each step of gradient descent uses all m training examples.
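The batch update rule above can be sketched end to end on the housing data. This is a minimal illustration, not the lecture's code; I scale size to 1000s of feet² (a feature-scaling assumption of mine) so a simple constant α converges.

```python
# Batch gradient descent for linear regression on the housing table,
# with sizes rescaled to 1000s of feet^2 to keep alpha simple.
xs = [2.104, 1.416, 1.534, 0.852]   # size in 1000s of feet^2
ys = [460, 232, 315, 178]           # price in $1000's
m = len(xs)

theta0, theta1 = 0.0, 0.0
alpha = 0.1
for _ in range(1000):
    # Each step sums the error over ALL m training examples ("batch").
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / m
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m
    # Simultaneous update of theta0 and theta1.
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
```

After enough iterations the parameters settle near the least-squares fit for this tiny dataset (slope roughly 230 $1000's per 1000 feet², negative intercept).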
-
The learning rate \alpha is typically held constant. It can slowly decrease over time if we want \theta to converge. (E.g. \alpha = \frac{const1}{iterationNumber + const2})
Stochastic gradient descent
1. Randomly shuffle dataset.
2. Repeat {
       for i := 1, \ldots, m {
           \theta_j := \theta_j - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for j = 0, \ldots, n)
       }
   }
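The stochastic variant can be sketched on the same rescaled housing data: one parameter update per training example, with a reshuffle each pass. This is my own illustration of the two steps above, with an arbitrary α and seed.

```python
import random

# Stochastic gradient descent for linear regression: update on each
# example in turn, after randomly shuffling the dataset (step 1).
xs = [2.104, 1.416, 1.534, 0.852]   # size in 1000s of feet^2
ys = [460, 232, 315, 178]           # price in $1000's

random.seed(0)                       # fixed seed for reproducibility
data = list(zip(xs, ys))
theta0, theta1 = 0.0, 0.0
alpha = 0.05
for epoch in range(500):
    random.shuffle(data)             # step 1: randomly shuffle the dataset
    for x, y in data:                # step 2: one update per example
        err = theta0 + theta1 * x - y
        theta0, theta1 = theta0 - alpha * err, theta1 - alpha * err * x
```

Unlike the batch version, each step uses a single example, so the parameters wander around the minimum rather than settling exactly on it; with a constant α they stay in a small neighbourhood of the least-squares fit.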
-
Advice for applying machine learning
Diagnosing bias vs. variance
Machine Learning
-
Bias/variance
High bias (underfit)    "Just right"    High variance (overfit)
[Figure: three Price vs. Size fits, one per case.]
-
Bias/variance
[Figure: error vs. degree of polynomial d, showing the training error and cross validation error curves; inset: price vs. size fit.]
Training error: J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
Cross validation error: J_{cv}(\theta) = \frac{1}{2 m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2
-
Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping. (J_{cv}(\theta) or J_{test}(\theta) is high.) Is it a bias problem or a variance problem?
[Figure: cross validation error and training error vs. degree of polynomial d.]
Bias (underfit): J_{train}(\theta) is high; J_{cv}(\theta) \approx J_{train}(\theta)
Variance (overfit): J_{train}(\theta) is low; J_{cv}(\theta) \gg J_{train}(\theta)
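The diagnostic can be condensed into a tiny helper. The comparison rules mirror the slide, but the concrete thresholds (a `baseline` error level, and "close" meaning the gap is smaller than the training error itself) are illustrative assumptions of mine, not from the lecture.

```python
def diagnose(j_train, j_cv, baseline):
    """Classify a poorly performing model as high-bias or high-variance.
    `baseline` is the error level we would consider acceptable."""
    if j_train > baseline and j_cv - j_train < j_train:
        # Training error high, and cv error close to it -> underfitting.
        return "bias (underfit): J_train high, J_cv close to J_train"
    if j_train <= baseline and j_cv > j_train:
        # Training error low, but cv error much larger -> overfitting.
        return "variance (overfit): J_train low, J_cv much greater than J_train"
    return "neither clearly: inspect learning curves"
```

For example, `diagnose(10.0, 11.0, 1.0)` flags a bias problem, while `diagnose(0.1, 9.0, 1.0)` flags a variance problem.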
-
Advice for applying machine learning
Regularization and bias/variance
Machine Learning
-
Linear regression with regularization
Model: h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]
Large \lambda: High bias (underfit)
Intermediate \lambda: "Just right"
Small \lambda: High variance (overfit)
[Figure: three Price vs. Size fits, one per case.]
-
Choosing the regularization parameter \lambda
-
Choosing the regularization parameter \lambda
Model: h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4
1. Try \lambda = 0
2. Try \lambda = 0.01
3. Try \lambda = 0.02
4. Try \lambda = 0.04
5. Try \lambda = 0.08
   \vdots
12. Try \lambda = 10.24
For each \lambda, minimize J(\theta) to get a fit \theta^{(k)}, and evaluate it on the cross validation set. Pick \theta^{(5)} (say). Test error: J_{test}(\theta^{(5)}).
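The selection loop can be sketched end to end on a toy problem. Everything here is an illustrative assumption: `minimize_J` is a hypothetical one-parameter ridge solver (no intercept, closed form), the data is made up, and the λ grid doubles from 0.01 as in the list above.

```python
def minimize_J(xs, ys, lam):
    """Ridge fit for h(x) = theta1 * x (no intercept, for brevity):
    theta1 = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(theta1, xs, ys):
    """Unregularized squared error on the cross validation set."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Toy split: fit on the training set, select lambda on the cv set.
train_x, train_y = [1.0, 2.0, 3.0], [1.1, 1.9, 3.2]
cv_x, cv_y = [1.5, 2.5], [1.4, 2.6]

lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
fits = [(lam, minimize_J(train_x, train_y, lam)) for lam in lambdas]
best_lam, best_theta = min(fits, key=lambda p: cv_error(p[1], cv_x, cv_y))
```

The chosen λ is whichever fit has the lowest cross validation error; the test set is then held out for the final generalization estimate, as the slide describes.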
-
Bias/variance as a function of the regularization parameter \lambda