regression and gradient descent - github pages · 2020. 12. 20. · linear regression with batch...

Regression and Gradient Descent. Source: Intro. to Machine Learning by Andrew Ng, Stanford, Coursera


Post on 28-Jan-2021


  • Regression and Gradient Descent

    Source: Intro. to Machine Learning by Andrew Ng, Stanford, Coursera

  • Notation:

    m = number of training examples
    x's = "input" variable / features
    y's = "output" variable / "target" variable

    Training set of housing prices (Portland, OR):

    Size in feet² (x)    Price ($) in 1000's (y)
    2104                 460
    1416                 232
    1534                 315
    852                  178
    …                    …

  • Housing Prices (Portland, OR)

    [Figure: Price (in 1000s of dollars, 0 to 500) plotted against Size (feet², 0 to 3000).]

    Regression problem: predict real-valued output.

  • Training Set

    Size in feet² (x)    Price ($) in 1000's (y)
    2104                 460
    1416                 232
    1534                 315
    852                  178
    …                    …

    Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

    $\theta_i$'s: Parameters

    How to choose the $\theta_i$'s?

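  • Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

    Parameters: $\theta_0, \theta_1$

    Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

    Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$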
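To make the cost function concrete, the sketch below evaluates the squared-error cost $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_i (\theta_0 + \theta_1 x^{(i)} - y^{(i)})^2$ on the four training examples listed in the slides. The function names and the two candidate parameter settings are illustrative choices, not part of the original deck.

```python
# The four training examples from the housing table.
sizes = [2104.0, 1416.0, 1534.0, 852.0]    # x: size in square feet
prices = [460.0, 232.0, 315.0, 178.0]      # y: price in $1000s


def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum of squared residuals."""
    m = len(xs)
    squared = [(hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared) / (2 * m)


# Two hypothetical parameter settings, chosen only for comparison.
j_zero = cost(0.0, 0.0, sizes, prices)     # predicts 0 for every house
j_guess = cost(0.0, 0.2, sizes, prices)    # predicts $200 per square foot
print(j_zero, j_guess)
```

A lower value of `cost` means the line fits the training set better, which is exactly what the goal of minimizing $J$ expresses.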

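  • $h_\theta(x)$ (for fixed $\theta_0, \theta_1$, this is a function of x)    $J(\theta_0, \theta_1)$ (function of the parameters $\theta_0, \theta_1$)

    [Figure: Price ($) in 1000's (0 to 500) plotted against Size in feet² (x) (0 to 3000).]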

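  • Gradient descent

  • Have some function $J(\theta_0, \theta_1)$

    Want $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

    Outline:

    •  Start with some $\theta_0, \theta_1$

    •  Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum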

  • [Figure: $J(\theta_0, \theta_1)$ plotted over $\theta_0$ and $\theta_1$.]

  • If α is too small, gradient descent can be slow.

    If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
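The effect of the learning rate can be seen on a toy problem. The sketch below runs gradient descent on the one-parameter cost $J(\theta) = \theta^2$ (an assumed example, not from the slides), whose derivative is $2\theta$ and whose minimum is at $\theta = 0$. The three values of α are illustrative.

```python
def run_gradient_descent(alpha, theta=1.0, steps=20):
    """Minimize the toy cost J(theta) = theta**2 with a fixed learning rate."""
    for _ in range(steps):
        gradient = 2.0 * theta            # dJ/dtheta
        theta = theta - alpha * gradient  # gradient descent update
    return theta


too_small = run_gradient_descent(alpha=0.01)   # creeps toward 0 very slowly
moderate = run_gradient_descent(alpha=0.4)     # reaches 0 quickly
too_large = run_gradient_descent(alpha=1.1)    # overshoots more on every step

print(abs(too_small), abs(moderate), abs(too_large))
```

On this cost each step multiplies θ by $(1 - 2\alpha)$, so any α above 1 makes the magnitude of θ grow instead of shrink.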

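  • At local optima

    Current value of $\theta_1$: the derivative of $J$ with respect to $\theta_1$ is zero here, so the update $\theta_1 := \theta_1 - \alpha \cdot 0$ leaves $\theta_1$ unchanged.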

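  • Gradient descent algorithm

    repeat until convergence {
      $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$    (for j = 0 and j = 1)
    }

  • Gradient descent algorithm

    Correct: simultaneous update
      $temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
      $temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
      $\theta_0 := temp0$
      $\theta_1 := temp1$

    Incorrect:
      $temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
      $\theta_0 := temp0$
      $temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
      $\theta_1 := temp1$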
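The difference between the two update orders shows up as soon as the partial derivatives depend on both parameters. The sketch below uses an assumed toy cost $J = \theta_0^2 + \theta_0\theta_1 + \theta_1^2$ (not from the slides) and takes one step each way from the same starting point.

```python
def gradient(theta0, theta1):
    """Partial derivatives of the toy cost J = t0^2 + t0*t1 + t1^2."""
    return 2.0 * theta0 + theta1, theta0 + 2.0 * theta1


def simultaneous_step(theta0, theta1, alpha):
    """Correct: both partial derivatives use the old parameter values."""
    g0, g1 = gradient(theta0, theta1)
    temp0 = theta0 - alpha * g0
    temp1 = theta1 - alpha * g1
    return temp0, temp1


def sequential_step(theta0, theta1, alpha):
    """Incorrect: theta1's derivative already sees the updated theta0."""
    g0, _ = gradient(theta0, theta1)
    theta0 = theta0 - alpha * g0
    _, g1 = gradient(theta0, theta1)
    theta1 = theta1 - alpha * g1
    return theta0, theta1


correct = simultaneous_step(1.0, 2.0, alpha=0.1)
incorrect = sequential_step(1.0, 2.0, alpha=0.1)
print(correct, incorrect)
```

Both versions agree on $\theta_0$ after the step, but the sequential version lands on a different $\theta_1$, so it is no longer following the gradient evaluated at a single point.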

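  • Gradient descent algorithm

    repeat until convergence {
      $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$    (for j = 1 and j = 0)
    }

    Linear Regression Model

    $h_\theta(x) = \theta_0 + \theta_1 x$

    $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

  • Gradient descent algorithm

    repeat until convergence {
      $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)$
      $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}$
    }

    update $\theta_0$ and $\theta_1$ simultaneously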
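The two linear regression updates come from differentiating the squared-error cost with respect to each parameter. A short derivation, assuming the hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$ and the cost $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})^2$:

```latex
\begin{align*}
\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
  &= \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m}
     \left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2 \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)
     \frac{\partial}{\partial \theta_j} \left(\theta_0 + \theta_1 x^{(i)}\right) \\[4pt]
j = 0: \quad \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
  &= \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) \\
j = 1: \quad \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
  &= \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}
\end{align*}
```

The factor $\frac{1}{2}$ in the cost cancels the 2 produced by the chain rule, which is why the updates carry $\frac{1}{m}$ rather than $\frac{2}{m}$.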


  • $h_\theta(x)$ (for fixed $\theta_0, \theta_1$, this is a function of x)    $J(\theta_0, \theta_1)$ (function of the parameters $\theta_0, \theta_1$)

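  • Linear regression with gradient descent

    Repeat {
      $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$    (for every $j = 0, \ldots, n$)
    }

  • Linear regression with Batch Gradient Descent

    Repeat {
      $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$    (for every $j = 0, \ldots, n$)
    }

    Each update sums over all m training examples.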
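A minimal sketch of batch gradient descent for linear regression, applied to the four housing examples from the slides. Two choices here are assumptions rather than part of the deck: sizes are expressed in thousands of square feet so that a fixed learning rate of 0.3 converges, and the loop runs a fixed number of iterations instead of testing for convergence. Each example carries a leading feature $x_0 = 1$ so that $\theta_0$ is updated by the same rule as the other parameters.

```python
def predict(theta, features):
    """h_theta(x) = sum_j theta_j * x_j, with x_0 = 1 as the first feature."""
    return sum(t * f for t, f in zip(theta, features))


def batch_gradient_descent(examples, targets, alpha, iterations):
    """Every update uses all m examples; all theta_j change simultaneously."""
    m = len(examples)
    n_params = len(examples[0])
    theta = [0.0] * n_params
    for _ in range(iterations):
        errors = [predict(theta, x) - y for x, y in zip(examples, targets)]
        gradient = [
            sum(e * x[j] for e, x in zip(errors, examples)) / m
            for j in range(n_params)
        ]
        # Building a new list keeps the update simultaneous.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta


# Assumption: sizes rescaled to thousands of square feet.
sizes_kft2 = [2.104, 1.416, 1.534, 0.852]
prices = [460.0, 232.0, 315.0, 178.0]          # in $1000s
examples = [[1.0, s] for s in sizes_kft2]      # prepend x_0 = 1

theta = batch_gradient_descent(examples, prices, alpha=0.3, iterations=5000)
print(theta)
```

With the raw sizes in square feet the same learning rate would diverge, because the curvature of $J$ along $\theta_1$ grows with the square of the feature values.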

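  • Learning rate α is typically held constant. Can slowly decrease α over time if we want θ to converge. (E.g. $\alpha = \frac{\text{const1}}{\text{iterationNumber} + \text{const2}}$)

    Stochastic gradient descent

    1.  Randomly shuffle dataset.
    2.  Repeat {
          for $i = 1, \ldots, m$ {
            $\theta_j := \theta_j - \alpha \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$    (for $j = 0, \ldots, n$)
          }
        }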
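The two steps above, together with the decaying learning rate $\alpha = \text{const1} / (\text{iterationNumber} + \text{const2})$, can be sketched as follows. Several details are assumptions made for illustration: the synthetic noise-free data $y = 1 + 2x$, the values of `const1` and `const2`, the fixed number of passes, and counting `iterationNumber` per single-example update.

```python
import random


def stochastic_gradient_descent(examples, targets, const1, const2, epochs, seed=0):
    """SGD for linear regression with alpha = const1 / (iteration + const2)."""
    data = list(zip(examples, targets))
    random.Random(seed).shuffle(data)          # 1. randomly shuffle dataset
    theta = [0.0] * len(examples[0])
    iteration = 0
    for _ in range(epochs):                    # 2. Repeat {
        for x, y in data:                      #      for i = 1, ..., m {
            alpha = const1 / (iteration + const2)
            error = sum(t * f for t, f in zip(theta, x)) - y
            # Update every theta_j from this single example.
            theta = [t - alpha * error * f for t, f in zip(theta, x)]
            iteration += 1
    return theta


# Assumed synthetic data: y = 1 + 2x exactly, with x_0 = 1 prepended.
inputs = [i / 9 for i in range(10)]
examples = [[1.0, u] for u in inputs]
targets = [1.0 + 2.0 * u for u in inputs]

theta = stochastic_gradient_descent(
    examples, targets, const1=50.0, const2=100.0, epochs=500
)
print(theta)
```

Unlike the batch version, each update touches only one training example, so a pass over the data performs m parameter updates instead of one.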

  • Advice for applying machine learning

    Diagnosing bias vs. variance

    Machine Learning

  • Bias/variance

    [Figure: three plots of Price against Size.]

    High bias (underfit)    "Just right"    High variance (overfit)

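  • Bias/variance

    Training error: $J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

    Cross validation error: $J_{cv}(\theta) = \frac{1}{2m_{cv}} \sum_{i=1}^{m_{cv}} \left(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\right)^2$

    [Figure: error plotted against degree of polynomial d; inset plot of price against size.]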

  • Diagnosing bias vs. variance

    Suppose your learning algorithm is performing less well than you were hoping ($J_{cv}(\theta)$ or $J_{test}(\theta)$ is high). Is it a bias problem or a variance problem?

    [Figure: $J_{cv}(\theta)$ (cross validation error) and $J_{train}(\theta)$ (training error) plotted against degree of polynomial d.]

    Bias (underfit): $J_{train}(\theta)$ will be high, and $J_{cv}(\theta) \approx J_{train}(\theta)$.

    Variance (overfit): $J_{train}(\theta)$ will be low, and $J_{cv}(\theta) \gg J_{train}(\theta)$.
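One way to run this diagnosis is to fit polynomials of increasing degree d and compare training error with cross validation error. The sketch below is a rough illustration on assumed synthetic data (a noisy sine curve, split into alternating training and cross validation points); it uses NumPy's `polyfit` and `polyval`, and the error is the half mean squared error used throughout the slides.

```python
import numpy as np


def half_mse(coeffs, x, y):
    """Error (1 / 2m) * sum of squared residuals for a fitted polynomial."""
    residuals = np.polyval(coeffs, x) - y
    return float(np.mean(residuals ** 2) / 2.0)


# Assumed synthetic data, not the housing set from the slides.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)
x_train, y_train = x[::2], y[::2]      # even indices: training set
x_cv, y_cv = x[1::2], y[1::2]          # odd indices: cross validation set

results = []
for d in range(1, 9):
    coeffs = np.polyfit(x_train, y_train, d)   # least-squares fit of degree d
    j_train = half_mse(coeffs, x_train, y_train)
    j_cv = half_mse(coeffs, x_cv, y_cv)
    results.append((d, j_train, j_cv))
    print(f"d={d}  J_train={j_train:.4f}  J_cv={j_cv:.4f}")

# The degree with the lowest cross validation error is the candidate to keep.
best_degree = min(results, key=lambda r: r[2])[0]
```

A low degree with both errors high points to bias; a high degree with low training error but noticeably higher cross validation error points to variance.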

  • Advice for applying machine learning

    Regularization and bias/variance

    Machine Learning

  • Linear regression with regularization

    Model: $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4$

    $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$

    [Figure: three plots of Price against Size.]

    Large λ: high bias (underfit)

    Intermediate λ: "just right"

    Small λ: high variance (overfit)

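  • Choosing the regularization parameter λ

    $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$

    $J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

    $J_{cv}(\theta) = \frac{1}{2m_{cv}} \sum_{i=1}^{m_{cv}} \left(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\right)^2$

    $J_{test}(\theta) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} \left(h_\theta(x_{test}^{(i)}) - y_{test}^{(i)}\right)^2$

  • Choosing the regularization parameter λ

    Model: $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4$

    1.  Try λ = 0
    2.  Try λ = 0.01
    3.  Try λ = 0.02
    4.  Try λ = 0.04
    5.  Try λ = 0.08
    …
    12. Try λ = 10

    For each λ, minimize $J(\theta)$ to get $\theta^{(k)}$, then evaluate $J_{cv}(\theta^{(k)})$.

    Pick (say) $\theta^{(5)}$. Test error: $J_{test}(\theta^{(5)})$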
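The selection procedure (try a list of λ values, pick the one with the lowest cross validation error, then report test error) can be sketched as below. Everything specific here is an assumption for illustration: the synthetic data, the degree-4 polynomial features, the grid of λ values (0 followed by doublings from 0.01), and the use of the regularized normal equation in place of an iterative minimizer. As in the regularized cost, $\theta_0$ is not penalized.

```python
import numpy as np


def design_matrix(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)


def fit_regularized(X, y, lam):
    """Minimize (1/2m)*||X theta - y||^2 + (lam/2m)*sum_{j>=1} theta_j^2."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0                       # do not regularize theta_0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def half_mse(theta, X, y):
    """Unregularized error (1 / 2m) * sum of squared residuals."""
    return float(np.mean((X @ theta - y) ** 2) / 2.0)


# Assumed synthetic data, split three ways by index.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)
X = design_matrix(x, degree=4)
X_train, y_train = X[0::3], y[0::3]
X_cv, y_cv = X[1::3], y[1::3]
X_test, y_test = X[2::3], y[2::3]

lambdas = [0.0] + [0.01 * 2 ** k for k in range(11)]   # 12 candidates
fits = []
for lam in lambdas:
    theta = fit_regularized(X_train, y_train, lam)
    fits.append((lam, theta, half_mse(theta, X_train, y_train),
                 half_mse(theta, X_cv, y_cv)))

# Select lambda on the cross validation set, then report test error once.
best_lambda, best_theta, _, best_cv = min(fits, key=lambda f: f[3])
test_error = half_mse(best_theta, X_test, y_test)
print(best_lambda, best_cv, test_error)
```

The test set is touched only after λ has been chosen, so `test_error` remains an estimate of generalization error that the selection step has not already been tuned against.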

  • Bias/variance as a function of the regularization parameter λ