Regression and Gradient Descent - GitHub Pages · 2020. 12. 20. · Linear regression with batch...
TRANSCRIPT
-
Regression and Gradient Descent
Source: Intro. to Machine Learning by Andrew Ng, Stanford, Coursera
-
Notation:
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
Training set of housing prices (Portland, OR):

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …
-
[Figure: Housing Prices (Portland, OR): scatter plot of Price (in 1000s of dollars) vs. Size (feet²).]
Regression Problem: predict real-valued output.
-
Training Set:

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
\theta_i's: parameters
How to choose \theta_i's?
-
Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
Parameters: \theta_0, \theta_1
Cost Function: J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
Goal: \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)
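The cost function above can be sketched directly in code. This is a minimal illustration using the four housing examples from the table in this transcript; the function and variable names are my own, not from the lecture.

```python
# Training examples from the housing-prices table (Portland, OR).
xs = [2104, 1416, 1534, 852]   # size in feet^2
ys = [460, 232, 315, 178]      # price in $1000's

def h(theta0, theta1, x):
    """Hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def J(theta0, theta1):
    """Squared-error cost J = 1/(2m) * sum_i (h(x_i) - y_i)^2."""
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

For instance, J(0, 0) measures how far the all-zero hypothesis is from the data; any slope that brings predictions closer to the prices lowers J.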
-
h_\theta(x) (for fixed \theta_0, \theta_1, this is a function of x)    J(\theta_0, \theta_1) (function of the parameters)
[Figure: Price ($) in 1000's vs. Size in feet² (x), with the corresponding cost J.]
-
Gradient descent
-
Have some function J(\theta_0, \theta_1)
Want \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)

Outline:
• Start with some \theta_0, \theta_1
• Keep changing \theta_0, \theta_1 to reduce J(\theta_0, \theta_1), until we hopefully end up at a minimum
-
[Figure: surface plot of J(\theta_0, \theta_1) over \theta_0 and \theta_1.]
-
If α is too small, gradient descent can be slow.
If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
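The two failure modes above are easy to see on a toy one-dimensional cost. The sketch below is my own illustration, assuming J(θ) = θ² (so dJ/dθ = 2θ); the α values are arbitrary examples.

```python
# Effect of the learning rate alpha on gradient descent for J(theta) = theta^2.
def descend(alpha, theta=1.0, steps=20):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # theta := theta - alpha * dJ/dtheta
    return theta

small = descend(alpha=0.01)   # too small: still far from 0 after 20 steps
good  = descend(alpha=0.1)    # converges quickly toward the minimum at 0
big   = descend(alpha=1.5)    # too large: each step overshoots and diverges
```

With α = 1.5 the multiplier per step is (1 − 2α) = −2, so |θ| doubles every iteration instead of shrinking.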
-
Gradient descent can converge to a local minimum even with the learning rate \alpha held fixed: at local optima the derivative is zero, so the update leaves the current value of \theta_1 unchanged.
-
Gradient descent algorithm:
repeat until convergence {
    \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)   (simultaneously for j = 0 and j = 1)
}
-
Gradient descent algorithm

Correct: simultaneous update
temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
\theta_0 := temp0
\theta_1 := temp1

Incorrect (non-simultaneous):
temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
\theta_0 := temp0
temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
\theta_1 := temp1
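The difference between the two update orders can be shown concretely. This is a sketch on an assumed toy cost J(θ₀, θ₁) = θ₀² + θ₀θ₁ (so the partials are 2θ₀ + θ₁ and θ₀); the function names are mine, not from the lecture.

```python
# Partial derivatives of the toy cost J(t0, t1) = t0^2 + t0*t1.
def dJ0(t0, t1):
    return 2 * t0 + t1

def dJ1(t0, t1):
    return t0

def simultaneous_step(t0, t1, alpha):
    temp0 = t0 - alpha * dJ0(t0, t1)   # both temps read the OLD t0, t1
    temp1 = t1 - alpha * dJ1(t0, t1)
    return temp0, temp1

def sequential_step(t0, t1, alpha):
    t0 = t0 - alpha * dJ0(t0, t1)      # t1's update now sees the NEW t0
    t1 = t1 - alpha * dJ1(t0, t1)
    return t0, t1
```

Starting from (1, 1) with α = 0.1, the two versions already disagree on θ₁ after one step, which is why the simultaneous form is the one called "correct" above.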
-
Gradient descent algorithm:
repeat until convergence { \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) (for j = 0, 1) }

Linear Regression Model:
h_\theta(x) = \theta_0 + \theta_1 x
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
-
Gradient descent algorithm (derivatives plugged in for linear regression):
repeat until convergence {
    \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
    \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
}
update \theta_0 and \theta_1 simultaneously
-
[Figure: surface plot of J(\theta_0, \theta_1) over \theta_0 and \theta_1.]
-
h_\theta(x) (for fixed \theta_0, \theta_1, this is a function of x)    J(\theta_0, \theta_1) (function of the parameters)
[Figure sequence: nine slides stepping through gradient descent, each showing the current fitted line on the left and the corresponding point on the plot of J(\theta_0, \theta_1) on the right.]
-
Linear regression with gradient descent
Repeat {
    \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for every j)
}
-
Linear regression with Batch Gradient Descent
Repeat {
    \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for every j)
}
"Batch": each step of gradient descent uses all m training examples.
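The batch update rule above can be sketched end to end on the housing data. This is a minimal illustration, not the lecture's code; I scale size to 1000s of feet² (a feature-scaling assumption of mine) so a simple constant α converges.

```python
# Batch gradient descent for linear regression on the housing table,
# with sizes rescaled to 1000s of feet^2 to keep alpha simple.
xs = [2.104, 1.416, 1.534, 0.852]   # size in 1000s of feet^2
ys = [460, 232, 315, 178]           # price in $1000's
m = len(xs)

theta0, theta1 = 0.0, 0.0
alpha = 0.1
for _ in range(1000):
    # Each step sums the error over ALL m training examples ("batch").
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / m
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m
    # Simultaneous update of theta0 and theta1.
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
```

After enough iterations the parameters settle near the least-squares fit for this tiny dataset (slope roughly 230 $1000's per 1000 feet², negative intercept).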
-
The learning rate \alpha is typically held constant. It can slowly decrease over time if we want \theta to converge. (E.g. \alpha = \frac{const1}{iterationNumber + const2})
Stochastic gradient descent
1. Randomly shuffle dataset.
2. Repeat {
       for i := 1, \ldots, m {
           \theta_j := \theta_j - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}   (for j = 0, \ldots, n)
       }
   }
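The stochastic variant can be sketched on the same rescaled housing data: one parameter update per training example, with a reshuffle each pass. This is my own illustration of the two steps above, with an arbitrary α and seed.

```python
import random

# Stochastic gradient descent for linear regression: update on each
# example in turn, after randomly shuffling the dataset (step 1).
xs = [2.104, 1.416, 1.534, 0.852]   # size in 1000s of feet^2
ys = [460, 232, 315, 178]           # price in $1000's

random.seed(0)                       # fixed seed for reproducibility
data = list(zip(xs, ys))
theta0, theta1 = 0.0, 0.0
alpha = 0.05
for epoch in range(500):
    random.shuffle(data)             # step 1: randomly shuffle the dataset
    for x, y in data:                # step 2: one update per example
        err = theta0 + theta1 * x - y
        theta0, theta1 = theta0 - alpha * err, theta1 - alpha * err * x
```

Unlike the batch version, each step uses a single example, so the parameters wander around the minimum rather than settling exactly on it; with a constant α they stay in a small neighbourhood of the least-squares fit.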
-
Advice for applying machine learning
Diagnosing bias vs. variance
Machine Learning
-
Bias/variance
High bias (underfit)    "Just right"    High variance (overfit)
[Figure: three Price vs. Size fits, one per case.]
-
Bias/variance
[Figure: error vs. degree of polynomial d, showing the training error and cross validation error curves; inset: price vs. size fit.]
Training error: J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
Cross validation error: J_{cv}(\theta) = \frac{1}{2 m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2
-
Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping. (J_{cv}(\theta) or J_{test}(\theta) is high.) Is it a bias problem or a variance problem?
[Figure: cross validation error and training error vs. degree of polynomial d.]
Bias (underfit): J_{train}(\theta) is high; J_{cv}(\theta) \approx J_{train}(\theta)
Variance (overfit): J_{train}(\theta) is low; J_{cv}(\theta) \gg J_{train}(\theta)
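The diagnostic can be condensed into a tiny helper. The comparison rules mirror the slide, but the concrete thresholds (a `baseline` error level, and "close" meaning the gap is smaller than the training error itself) are illustrative assumptions of mine, not from the lecture.

```python
def diagnose(j_train, j_cv, baseline):
    """Classify a poorly performing model as high-bias or high-variance.
    `baseline` is the error level we would consider acceptable."""
    if j_train > baseline and j_cv - j_train < j_train:
        # Training error high, and cv error close to it -> underfitting.
        return "bias (underfit): J_train high, J_cv close to J_train"
    if j_train <= baseline and j_cv > j_train:
        # Training error low, but cv error much larger -> overfitting.
        return "variance (overfit): J_train low, J_cv much greater than J_train"
    return "neither clearly: inspect learning curves"
```

For example, `diagnose(10.0, 11.0, 1.0)` flags a bias problem, while `diagnose(0.1, 9.0, 1.0)` flags a variance problem.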
-
Advice for applying machine learning
Regularization and bias/variance
Machine Learning
-
Linear regression with regularization
Model: h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4
J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]
Large \lambda: High bias (underfit)
Intermediate \lambda: "Just right"
Small \lambda: High variance (overfit)
[Figure: three Price vs. Size fits, one per case.]
-
Choosing the regularization parameter \lambda
-
Choosing the regularization parameter \lambda
Model: h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4
1. Try \lambda = 0
2. Try \lambda = 0.01
3. Try \lambda = 0.02
4. Try \lambda = 0.04
5. Try \lambda = 0.08
   \vdots
12. Try \lambda = 10.24
For each \lambda, minimize J(\theta) to get a fit \theta^{(k)}, and evaluate it on the cross validation set. Pick \theta^{(5)} (say). Test error: J_{test}(\theta^{(5)}).
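The selection loop can be sketched end to end on a toy problem. Everything here is an illustrative assumption: `minimize_J` is a hypothetical one-parameter ridge solver (no intercept, closed form), the data is made up, and the λ grid doubles from 0.01 as in the list above.

```python
def minimize_J(xs, ys, lam):
    """Ridge fit for h(x) = theta1 * x (no intercept, for brevity):
    theta1 = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(theta1, xs, ys):
    """Unregularized squared error on the cross validation set."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# Toy split: fit on the training set, select lambda on the cv set.
train_x, train_y = [1.0, 2.0, 3.0], [1.1, 1.9, 3.2]
cv_x, cv_y = [1.5, 2.5], [1.4, 2.6]

lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
fits = [(lam, minimize_J(train_x, train_y, lam)) for lam in lambdas]
best_lam, best_theta = min(fits, key=lambda p: cv_error(p[1], cv_x, cv_y))
```

The chosen λ is whichever fit has the lowest cross validation error; the test set is then held out for the final generalization estimate, as the slide describes.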
-
Bias/variance as a function of the regularization parameter \lambda