![Page 1: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/1.jpg)
UVA CS 4501: Machine Learning
Lecture 5: Non-Linear Regression Models
Dr. Yanjun Qi
University of Virginia, Department of Computer Science
10/6/18
![Page 2: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/2.jpg)
Where are we? → Five major sections of this course
• Regression (supervised)
• Classification (supervised)
• Unsupervised models
• Learning theory
• Graphical models
![Page 3: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/3.jpg)
Regression (supervised)
• Four ways to train / perform optimization for linear regression models
– Normal Equation
– Gradient Descent (GD)
– Stochastic GD
– Newton's method
• Supervised regression models
– Linear regression (LR)
– LR with non-linear basis functions
– Locally weighted LR
– LR with Regularizations
![Page 4: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/4.jpg)
Today → Regression (supervised)
• Four ways to train / perform optimization for linear regression models: Normal Equation, Gradient Descent (GD), Stochastic GD, Newton's method
• Supervised regression models: Linear regression (LR), LR with non-linear basis functions, Locally weighted LR, LR with Regularizations
![Page 5: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/5.jpg)
Today
• Regression Models Beyond Linear
– LR with non-linear basis functions
– Instance-based Regression: K-Nearest Neighbors (later)
– Locally weighted linear regression (extra)
– Regression trees and Multilinear Interpolation (later)
![Page 6: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/6.jpg)
LR with non-linear basis functions
• LR does not mean we can only deal with linear relationships
Plain linear regression:
$$y = \theta^T x$$
With basis functions:
$$y = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \theta^T \phi(x)$$
![Page 7: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/7.jpg)
LR with non-linear basis functions
• We are free to design basis functions (e.g., non-linear features). Here $\phi_j(x)$ are fixed basis functions (we also define $\phi_0(x) = 1$).
• E.g.: polynomial regression:
$$\phi(x) := [1, x, x^2]^T$$
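As a minimal sketch of the polynomial basis expansion, the snippet below maps scalar inputs to rows of the form [1, x, x^2]. The helper name `poly_features` and the sample inputs are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def poly_features(x, degree=2):
    """Map 1-D inputs to rows [1, x, x^2, ..., x^degree] (hypothetical helper)."""
    x = np.asarray(x, dtype=float).reshape(-1)
    # Column j holds x**j, so column 0 is the constant basis phi_0(x) = 1.
    return np.vander(x, N=degree + 1, increasing=True)

# Made-up inputs, only to show the shape of the expanded features.
Phi = poly_features([0.0, 1.0, 2.0, 3.0])
print(Phi)
```

Each row of `Phi` is one expanded sample, so any solver written for plain linear regression can be applied to `Phi` unchanged.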
![Page 8: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/8.jpg)
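e.g. (1) polynomial regression
$$\phi(x) := [1, x, x^2]^T$$
Linear regression: $y = \theta^T x$, with $\theta^* = (X^T X)^{-1} X^T \vec{y}$
With the basis expansion: $y = \theta^T \phi(x)$, with $\theta^* = (\phi^T \phi)^{-1} \phi^T \vec{y}$, where $\phi$ in the normal equation denotes the matrix whose $i$-th row is $\phi(x_i)^T$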
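The normal equation with a polynomial basis can be sketched as follows. The data are made up (a noise-free quadratic), and `fit_normal_equation` is a hypothetical helper name; the sketch solves the linear system rather than forming the inverse explicitly, which is a common numerical choice and not something the slides prescribe.

```python
import numpy as np

def fit_normal_equation(Phi, y):
    """Solve (Phi^T Phi) theta = Phi^T y for theta (hypothetical helper)."""
    # Solving the system avoids computing the matrix inverse explicitly.
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Made-up, noise-free quadratic data so the fit can be checked by eye.
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 + 2.0 * x - 0.5 * x**2

Phi = np.vander(x, N=3, increasing=True)  # rows are [1, x, x^2]
theta = fit_normal_equation(Phi, y)
print(theta)
```

Because the targets here are an exact quadratic, the recovered `theta` should match the generating coefficients up to floating-point error.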
![Page 9: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/9.jpg)
e.g. (1) polynomial regression
KEY: if the bases are given, the problem of learning the parameters is still linear.
$$y = \theta^T x \qquad y = \theta^T \phi(x)$$
![Page 10: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/10.jpg)
Many Possible Basis functions
• There are many basis functions, e.g.:
– Polynomial: $\phi_j(x) = x^{j-1}$
– Radial basis functions: $\phi_j(x) = \exp\left(-\frac{(x-\mu_j)^2}{2s^2}\right)$
– Sigmoidal: $\phi_j(x) = \sigma\left(\frac{x-\mu_j}{s}\right)$
– Splines, Fourier, Wavelets, etc.
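The three basis families with explicit formulas can be written down directly. This is a sketch only: the function names are hypothetical, and it assumes that σ is the logistic sigmoid, which the slide does not state.

```python
import numpy as np

def polynomial_basis(x, j):
    """phi_j(x) = x^(j-1)."""
    return x ** (j - 1)

def gaussian_rbf_basis(x, mu, s):
    """phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2))."""
    return np.exp(-((x - mu) ** 2) / (2.0 * s**2))

def sigmoid_basis(x, mu, s):
    """phi_j(x) = sigma((x - mu_j) / s), assuming sigma is the logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-(x - mu) / s))

x = np.array([-1.0, 0.0, 1.0])              # made-up inputs
poly = polynomial_basis(x, j=3)             # x^2
rbf = gaussian_rbf_basis(x, mu=0.0, s=1.0)  # peaks at x = mu
sig = sigmoid_basis(x, mu=0.0, s=1.0)       # equals 0.5 at x = mu
print(poly, rbf, sig)
```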
![Page 11: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/11.jpg)
Many Possible Basis functions
![Page 12: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/12.jpg)
e.g. (2) LR with radial-basis functions
• E.g.: LR with RBF regression:
$$y = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \phi(x)^T \theta$$
$$\phi(x) := [1, K_{\lambda_1}(x, r_1), K_{\lambda_2}(x, r_2), K_{\lambda_3}(x, r_3), K_{\lambda_4}(x, r_4)]^T$$
$$\theta^* = (\phi^T \phi)^{-1} \phi^T \vec{y}$$
![Page 13: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/13.jpg)
RBF = radial-basis function: a function which depends only on the radial distance from a centre point
Gaussian RBF →
$$K_\lambda(x, r) = \exp\left(-\frac{(x-r)^2}{2\lambda^2}\right)$$
As distance from the center r increases, the output of the RBF decreases
[Figures: 1D case, 2D case]
![Page 14: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/14.jpg)
$$K_\lambda(x, r) = \exp\left(-\frac{(x-r)^2}{2\lambda^2}\right)$$
| x | r | r + λ | r + 2λ | r + 3λ |
|---|---|---|---|---|
| K_λ(x, r) | 1 | 0.6065307 | 0.1353353 | 0.0111090 |
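A quick numerical check of the Gaussian RBF at x = r, r + λ, r + 2λ, r + 3λ, which by the formula equals exp(0), exp(-1/2), exp(-2) and exp(-9/2). The choice r = 0, λ = 1 is an arbitrary assumption; the values depend only on (x - r)/λ.

```python
import numpy as np

def gaussian_rbf(x, r, lam):
    """K_lambda(x, r) = exp(-(x - r)^2 / (2 lambda^2))."""
    return np.exp(-((x - r) ** 2) / (2.0 * lam**2))

r, lam = 0.0, 1.0            # arbitrary centre and width for illustration
xs = r + lam * np.arange(4)  # r, r + lam, r + 2 lam, r + 3 lam
values = gaussian_rbf(xs, r, lam)
print(values)
```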
![Page 15: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/15.jpg)
e.g. another Linear regression with 1D RBF basis functions
(assuming 3 predefined centres and width)
$$\phi(x) := [1, K_{\lambda_1}(x, r_1), K_{\lambda_2}(x, r_2), K_{\lambda_3}(x, r_3)]^T$$
$$\theta^* = (\phi^T \phi)^{-1} \phi^T \vec{y}$$
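A sketch of linear regression on three predefined 1D Gaussian RBFs. The centres, the shared width, and the synthetic sine data are all assumptions made for illustration. `np.linalg.lstsq` is used instead of the explicit normal equation; it minimizes the same squared error.

```python
import numpy as np

def rbf_design_matrix(x, centres, lam):
    """Rows are [1, K(x, r_1), K(x, r_2), ...] with a shared width lam."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    K = np.exp(-((x - centres.reshape(1, -1)) ** 2) / (2.0 * lam**2))
    return np.hstack([np.ones((x.shape[0], 1)), K])

# Synthetic data (assumption): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

centres = np.array([1.0, 3.0, 5.0])  # three predefined centres (assumed values)
lam = 1.0                            # predefined width (assumed value)

Phi = rbf_design_matrix(x, centres, lam)
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least-squares solution
train_mse = np.mean((Phi @ theta - y) ** 2)
print(theta, train_mse)
```

Only `theta` is learned here; the centres and width stay fixed, which is why the fit remains a linear problem.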
![Page 16: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/16.jpg)
e.g. a LR with 1D RBFs (3 predefined centres and width)
• 1D RBF
• After fit:
![Page 17: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/17.jpg)
e.g. Even more possible Basis Func?
![Page 18: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/18.jpg)
(2) Multivariate Linear Regression with basis Expansion
Task: Regression
Representation: Y = Weighted linear sum of (X basis expansion)
Score Function: SSE
Search/Optimization: Linear algebra
Models, Parameters: Regression coefficients
$$y = \theta_0 + \sum_{j=1}^{m} \theta_j \phi_j(x) = \phi(x)^T \theta$$
![Page 19: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/19.jpg)
Two main issues:
• To learn the parameter $\theta^*$
– Almost the same as LR, just X → $\phi(x)$
– Linear combination of basis functions (that can be non-linear)
• How to choose the model order
– E.g., what polynomial degree for polynomial regression?
– E.g., where to put the centers for the RBF kernels? How wide?
![Page 20: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/20.jpg)
e.g. 2D Good and Bad RBF Basis
• A good 2D RBF
• Two bad 2D RBFs
![Page 21: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/21.jpg)
Issue: Overfitting and Underfitting
[Plot of y vs. x]
y = f(x) + noise
Can we learn a regression f from the data?
Let's consider three methods…
![Page 22: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/22.jpg)
Linear Regression
[Plot of y vs. x]
![Page 23: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/23.jpg)
Quadratic Regression
[Plot of y vs. x]
![Page 24: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/24.jpg)
Join-the-dots
[Plot of y vs. x]
Also known as piecewise linear nonparametric regression if that makes you feel better
![Page 25: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/25.jpg)
Which is best?
[Plots of y vs. x]
Why not choose the method with the best fit to the data?
![Page 26: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/26.jpg)
![Page 27: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/27.jpg)
What do we really want?
Why not choose the method with the best fit to the data?
“How well are you going to predict future data drawn from the same distribution?”
[Plots of y vs. x, labeled: Underfit, Good?, Overfit]
![Page 28: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/28.jpg)
Issue: Overfitting and underfitting
Underfit: $y = \theta_0 + \theta_1 x$
Looks good: $y = \theta_0 + \theta_1 x + \theta_2 x^2$
Overfit: $y = \sum_{j=0}^{5} \theta_j x^j$
K-fold Cross Validation / Train-Test
Generalisation: learn function / hypothesis from past data in order to “explain”, “predict”, “model” or “control” new data examples
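The three model orders (degree 1, 2 and 5) can be compared on training error with a short sketch. The data below are synthetic (a noisy quadratic) and the helper name `train_mse` is hypothetical. The point is that training error can only go down as the degree grows, so it cannot by itself reveal overfitting.

```python
import numpy as np

# Synthetic data (assumption): a noisy quadratic sampled at 12 points.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 12)
y = 1.0 - 2.0 * x + 3.0 * x**2 + 0.2 * rng.standard_normal(x.size)

def train_mse(degree):
    """Fit a degree-`degree` polynomial by least squares; return training MSE."""
    Phi = np.vander(x, N=degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return np.mean((Phi @ theta - y) ** 2)

errors = {d: train_mse(d) for d in (1, 2, 5)}
for d, e in errors.items():
    print(f"degree {d}: train MSE = {e:.4f}")
```

Since each lower-degree model is nested inside the higher-degree one, the degree-5 fit always has the smallest training MSE here, which motivates evaluating on held-out data instead.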
![Page 29: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/29.jpg)
![Page 30: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/30.jpg)
The test set method
1. Randomly choose some percentage, like 30%, of the labeled data to be in a test set
2. The remainder is a training set
3. Perform your regression on the training set
(Linear regression example)
Credit: Prof. Andrew Moore
![Page 31: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/31.jpg)
The test set method
1. Randomly choose 30% of the data to be in a test set
2. The remainder is a training set
3. Perform your regression on the training set
4. Estimate your future performance with the test set
(Linear regression example) Mean Squared Error = 2.4
Credit: Prof. Andrew Moore
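The four steps of the test set method can be sketched as below for a straight-line fit. The dataset is synthetic and the helper `design` is a hypothetical name, so the resulting error will not match the figure quoted on the slide.

```python
import numpy as np

# Synthetic labeled data (assumption): a noisy line.
rng = np.random.default_rng(0)
n_total = 50
x = rng.uniform(-1.0, 1.0, n_total)
y = 0.5 + 1.5 * x + 0.3 * rng.standard_normal(n_total)

# Steps 1-2: a random 30% goes to the test set, the remainder is the training set.
perm = rng.permutation(n_total)
n_test = int(round(0.3 * n_total))
test_idx, train_idx = perm[:n_test], perm[n_test:]

def design(x):
    """Design matrix for a straight line: rows [1, x]."""
    return np.column_stack([np.ones_like(x), x])

# Step 3: perform the regression on the training set only.
theta, *_ = np.linalg.lstsq(design(x[train_idx]), y[train_idx], rcond=None)

# Step 4: estimate future performance with the held-out test set.
test_mse = np.mean((design(x[test_idx]) @ theta - y[test_idx]) ** 2)
print(f"test MSE = {test_mse:.3f}")
```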
![Page 32: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/32.jpg)
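Evaluation: e.g. train / test split as follows
$$X_{train} = \begin{bmatrix} \text{--}\, x_1^T \,\text{--} \\ \text{--}\, x_2^T \,\text{--} \\ \vdots \\ \text{--}\, x_n^T \,\text{--} \end{bmatrix}, \qquad \vec{y}_{train} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$
$$X_{test} = \begin{bmatrix} \text{--}\, x_{n+1}^T \,\text{--} \\ \text{--}\, x_{n+2}^T \,\text{--} \\ \vdots \\ \text{--}\, x_{n+m}^T \,\text{--} \end{bmatrix}, \qquad \vec{y}_{test} = \begin{bmatrix} y_{n+1} \\ y_{n+2} \\ \vdots \\ y_{n+m} \end{bmatrix}$$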
![Page 33: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/33.jpg)
Testing MSE Error to report:
$$J_{test} = \frac{1}{m} \sum_{i=n+1}^{n+m} (x_i^T \theta^* - y_i)^2 = \frac{1}{m} \sum_{i=n+1}^{n+m} \epsilon_i^2$$
In Homework, when we ask for plots of training error, we ask for the per-sample MSE training error, because it is comparable to the test MSE error.
![Page 34: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/34.jpg)
• Train MSE Error to observe:
$$J_{train-MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i^T \theta^* - y_i)^2$$
In many situations, visualizing Train-MSE can be helpful to understand the behavior of your method, e.g., the influence of the hyperparameter you chose…
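A small worked example of why per-sample MSE, rather than a summed error, is what makes train and test errors comparable. The tiny arrays and the parameter vector `theta` are made up for illustration, and `mse` / `sse` are hypothetical helper names.

```python
import numpy as np

def mse(X, theta, y):
    """Per-sample mean squared error: (1/N) * sum_i (x_i^T theta - y_i)^2."""
    return np.mean((X @ theta - y) ** 2)

def sse(X, theta, y):
    """Sum of squared errors, which grows with the number of samples."""
    return np.sum((X @ theta - y) ** 2)

# Made-up data: n = 4 training rows and m = 2 test rows, each row is [1, x].
X_train = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_train = np.array([0.0, 1.0, 2.0, 4.0])
X_test = np.array([[1.0, 4.0], [1.0, 5.0]])
y_test = np.array([4.0, 6.0])
theta = np.array([0.0, 1.0])  # hypothetical fitted parameters

j_train = mse(X_train, theta, y_train)
j_test = mse(X_test, theta, y_test)
print(j_train, j_test)
print(sse(X_train, theta, y_train), sse(X_test, theta, y_test))
```

In this example both sets have the same summed error, yet the per-sample errors differ because the sets have different sizes, so only the MSE values can be compared directly.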
![Page 35: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/35.jpg)
The test set method
(Quadratic regression example) Mean Squared Error = 0.9
Credit: Prof. Andrew Moore
![Page 36: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/36.jpg)
The test set method
(Join-the-dots example) Mean Squared Error = 2.2
Credit: Prof. Andrew Moore
![Page 37: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/37.jpg)
The test set method
Good news:
• Very very simple
• Can then simply choose the method with the best test-set score
Bad news:
• What's the downside?
Credit: Prof. Andrew Moore
![Page 38: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/38.jpg)
The test set method
Good news:
• Very very simple
• Can then simply choose the method with the best test-set score
Bad news:
• Wastes data: we get an estimate of the best method to apply to 30% less data
• If we don't have much data, our test-set might just be lucky or unlucky
We say the “test-set estimator of performance has high variance”
Credit: Prof. Andrew Moore
![Page 39: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/39.jpg)
Regression: Complexity versus Goodness of Fit
[Four plots of y vs. x: Training data; Too simple?; Too complex?; About right?]
Too simple: Low Variance / High Bias
Too complex: Low Bias / High Variance
What ultimately matters: GENERALIZATION
![Page 40: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/40.jpg)
![Page 41: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/41.jpg)
![Page 42: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/42.jpg)
![Page 43: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/43.jpg)
![Page 44: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/44.jpg)
LOOCV (Leave-one-out Cross Validation)
For k = 1 to n:
1. Let $(x_k, y_k)$ be the k-th record
2. Temporarily remove $(x_k, y_k)$ from the dataset
3. Train on the remaining n-1 datapoints
4. Note your error on $(x_k, y_k)$
When you've done all points, report the mean error.
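The LOOCV loop can be sketched as follows for polynomial regression. The data are synthetic (a noisy quadratic) and `loocv_mse` is a hypothetical helper name; squared error is assumed as the per-point error.

```python
import numpy as np

def loocv_mse(x, y, degree):
    """Leave-one-out CV: hold out each point once and average its squared error."""
    n = len(x)
    Phi = np.vander(x, N=degree + 1, increasing=True)
    sq_errors = []
    for k in range(n):
        keep = np.arange(n) != k  # temporarily remove record k
        theta, *_ = np.linalg.lstsq(Phi[keep], y[keep], rcond=None)  # train on n-1 points
        sq_errors.append((Phi[k] @ theta - y[k]) ** 2)  # error on the held-out record
    return np.mean(sq_errors)

# Synthetic data (assumption): a noisy quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)
y = 1.0 + x - 2.0 * x**2 + 0.1 * rng.standard_normal(x.size)

mse_linear = loocv_mse(x, y, degree=1)
mse_quadratic = loocv_mse(x, y, degree=2)
print(mse_linear, mse_quadratic)
```

Each call refits the model n times, which is the cost that makes LOOCV expensive on larger datasets.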
![Page 45: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/45.jpg)
LOOCV (Leave-one-out Cross Validation)
[Plots of y vs. x]
MSE_LOOCV = 2.12
Credit: Prof. Andrew Moore
![Page 46: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/46.jpg)
LOOCV for Quadratic Regression
[Plots of y vs. x]
MSE_LOOCV = 0.962
Credit: Prof. Andrew Moore
![Page 47: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/47.jpg)
LOOCV for Join-the-dots
[Plots of y vs. x]
MSE_LOOCV = 3.33
Credit: Prof. Andrew Moore
![Page 48: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/48.jpg)
Which kind of Cross Validation?
| | Downside | Upside |
|---|---|---|
| Test-set | Variance: unreliable estimate of future performance | Cheap |
| Leave-one-out | Expensive. Has some weird behavior | Doesn't waste data |

..can we get the best of both worlds?
Credit: Prof. Andrew Moore
![Page 49: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/49.jpg)
e.g. By k=10 fold Cross Validation
| model | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | train | train | train | train | train | train | train | train | train | test |
| 2 | train | train | train | train | train | train | train | train | test | train |
| 3 | train | train | train | train | train | train | train | test | train | train |
| 4 | train | train | train | train | train | train | test | train | train | train |
| 5 | train | train | train | train | train | test | train | train | train | train |
| 6 | train | train | train | train | test | train | train | train | train | train |
| 7 | train | train | train | test | train | train | train | train | train | train |
| 8 | train | train | test | train | train | train | train | train | train | train |
| 9 | train | test | train | train | train | train | train | train | train | train |
| 10 | test | train | train | train | train | train | train | train | train | train |

• Divide data into 10 equal pieces
• 9 pieces as training set, the remaining 1 as test set
• Collect the scores from the diagonal
• We normally use the mean of the scores
Make sure that the train/test/validation folds are indeed independent samples.
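The 10-fold scheme in the table can be sketched as below: each piece serves as the test set exactly once and the ten scores are averaged. The data are synthetic, `k_fold_mse` is a hypothetical helper name, and the model is a polynomial least-squares fit chosen only for illustration.

```python
import numpy as np

def k_fold_mse(x, y, k, degree, seed=0):
    """k-fold CV for polynomial regression; returns (mean score, per-fold scores)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)  # k roughly equal pieces
    Phi = np.vander(x, N=degree + 1, increasing=True)
    scores = []
    for i in range(k):
        test_idx = folds[i]  # one piece is the test set
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        theta, *_ = np.linalg.lstsq(Phi[train_idx], y[train_idx], rcond=None)
        scores.append(np.mean((Phi[test_idx] @ theta - y[test_idx]) ** 2))
    return np.mean(scores), scores

# Synthetic data (assumption): 50 noisy points on a quadratic.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 50)
y = 1.0 + x - 2.0 * x**2 + 0.1 * rng.standard_normal(50)

mean_score, scores = k_fold_mse(x, y, k=10, degree=2)
print(f"10-fold CV MSE = {mean_score:.4f}")
```

Shuffling before splitting is one simple way to keep the folds from inheriting any ordering in the data; it does not by itself guarantee independence if the samples are correlated.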
![Page 50: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/50.jpg)
![Page 51: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/51.jpg)
![Page 52: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/52.jpg)
![Page 53: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/53.jpg)
![Page 54: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/54.jpg)
k-fold Cross Validation
Randomly break the dataset into k partitions (in our example we'll have k=3 partitions colored Purple, Green and Blue)
For the purple partition: Train on all the points not in the purple partition. Find the test-set sum of errors on the purple points.
For the green partition: Train on all the points not in the green partition. Find the test-set sum of errors on the green points.
For the blue partition: Train on all the points not in the blue partition. Find the test-set sum of errors on the blue points.
Then report the mean error
Linear Regression: MSE_3FOLD = 2.05
Credit: Prof. Andrew Moore
![Page 55: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/55.jpg)
k-fold Cross Validation
Quadratic Regression: MSE_3FOLD = 1.11
Credit: Prof. Andrew Moore
![Page 56: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/56.jpg)
k-fold Cross Validation
Join-the-dots: MSE_3FOLD = 2.93
Credit: Prof. Andrew Moore
![Page 57: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/57.jpg)
Which kind of Cross Validation?
| | Downside | Upside |
|---|---|---|
| Test-set | Variance: unreliable estimate of future performance | Cheap |
| Leave-one-out | Expensive. Has some weird behavior | Doesn't waste data |
| 10-fold | Wastes 10% of the data. 10 times more expensive than test set | Only wastes 10%. Only 10 times more expensive instead of n times. |
| 3-fold | Wastes more data than 10-fold. More expensive than test set | Better than test-set |
| n-fold | Identical to Leave-one-out | |
![Page 58: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/58.jpg)
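CV-based Model Selection
• We're trying to decide which algorithm to use.
• We train each machine and make a table…
| i | f_i | TRAINERR | k-FOLD-CV-ERR | Choice |
|---|---|---|---|---|
| 1 | f1 | | | |
| 2 | f2 | | | |
| 3 | f3 | | | ✓ |
| 4 | f4 | | | |
| 5 | f5 | | | |
| 6 | f6 | | | |

Credit: Prof. Andrew Moore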
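A sketch of how such a table might be built in code. It assumes, purely for illustration, that candidate f_i is polynomial regression of degree i, that k = 5, and that the data are a synthetic noisy quadratic; the helper names are hypothetical. The choice is the candidate with the lowest k-fold CV error, not the lowest training error.

```python
import numpy as np

def features(x, degree):
    """Polynomial features: rows [1, x, ..., x^degree]."""
    return np.vander(x, N=degree + 1, increasing=True)

def fit(Phi, y):
    """Least-squares parameters for the design matrix Phi."""
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta

def k_fold_cv_error(x, y, degree, k=5, seed=0):
    """Mean test MSE over k folds; the same seed gives every candidate the same folds."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(x)), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        theta = fit(features(x[train_idx], degree), y[train_idx])
        resid = features(x[test_idx], degree) @ theta - y[test_idx]
        scores.append(np.mean(resid ** 2))
    return float(np.mean(scores))

# Synthetic data (assumption): a noisy quadratic.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 40)
y = 1.0 + x - 2.0 * x**2 + 0.1 * rng.standard_normal(40)

table = []
for degree in range(1, 7):  # candidates f1..f6, assumed to be degrees 1..6
    Phi = features(x, degree)
    train_err = float(np.mean((Phi @ fit(Phi, y) - y) ** 2))
    table.append((degree, train_err, k_fold_cv_error(x, y, degree)))

best_degree = min(table, key=lambda row: row[2])[0]
for degree, train_err, cv_err in table:
    mark = "<-- choice" if degree == best_degree else ""
    print(f"f{degree}  TRAINERR={train_err:.4f}  CV-ERR={cv_err:.4f} {mark}")
```

After choosing, a common follow-up is to refit the selected model on all the data before using it.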
![Page 59: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/59.jpg)
Credit: Stanford Machine Learning course
![Page 60: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/60.jpg)
References
• Big thanks to Prof. Eric Xing @ CMU for allowing me to reuse some of his slides
• Prof. Nando de Freitas's tutorial slide
• Prof. Andrew Moore's slides @ CMU
![Page 61: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/61.jpg)
Extra: Performance vs. Training Size for an example
• The results from batch GD and online (SGD) updates are almost identical, so the plots coincide.
• The test MSE from the normal equation is higher than that of batch GD and online SGD when the training set is small. This is probably due to overfitting.
• In batch GD and online SGD, at most 2000 iterations (for example) are allowed. This roughly acts as a mechanism that avoids overfitting.
Dr. Eric Xing's tutorial slide
![Page 62: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/62.jpg)
EXTRA: MORE REGRESSION MODELS
![Page 63: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/63.jpg)
Extra Today
• Regression Models Beyond Linear
  – LR with non-linear basis functions
  – Instance-based Regression: K-Nearest Neighbors (later)
  – Locally weighted linear regression (extra)
  – Regression trees and Multilinear Interpolation (later)
![Page 64: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/64.jpg)
K-Nearest Neighbor
• Features
  – All instances correspond to points in a p-dimensional Euclidean space
  – Regression is delayed till a new instance arrives
  – Regression is done by comparing feature vectors of the different points
  – Target function may be discrete or real-valued
• When the target is continuous, the prediction is the mean value of the k nearest training examples
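The prediction rule for a continuous target can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function name `knn_regress` and the toy data are assumptions, and distances are plain Euclidean distances as described above.

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k):
    """Predict y at x_query as the mean target of its k nearest training points."""
    dists = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                     # indices of the k closest points
    return float(np.mean(y_train[nearest]))

# Toy 1D data (assumption): y = x^2 sampled at five points.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0, 16.0])
x_query = np.array([2.2])

pred_k1 = knn_regress(X_train, y_train, x_query, k=1)   # nearest point is x = 2
pred_k3 = knn_regress(X_train, y_train, x_query, k=3)   # mean over x = 2, 3, 1
print(pred_k1, pred_k3)
```

Nothing is computed until the query arrives, which is why the regression is described as "delayed": all the work happens inside the call for a specific `x_query`.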
![Page 65: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/65.jpg)
K=5-Nearest Neighbor (1D input)
![Page 66: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/66.jpg)
K=1-Nearest Neighbor (1D input)
![Page 67: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/67.jpg)
K-Nearest Neighbor
• Task: Regression / classification
• Representation: Local smoothness
• Score Function: NA
• Search/Optimization: NA
• Models, Parameters: Training samples
![Page 68: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/68.jpg)
Variants: Distance-Weighted k-Nearest Neighbor Algorithm
• Assign weights to the neighbors based on their "distance" from the query point
  – Weight "may" be the inverse square of the distances
• All training points may influence a particular instance
  – E.g., Shepard's method / Modified Shepard, … by Geospatial Analysis
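A hedged sketch of the distance-weighted variant, assuming inverse-square-distance weights as suggested above. The function name `weighted_knn_regress`, the `eps` guard for a query that coincides with a training point, and the toy data are illustrative assumptions. Setting `k` to the full training-set size lets every training point influence the prediction.

```python
import numpy as np

def weighted_knn_regress(X_train, y_train, x_query, k, eps=1e-12):
    """Weighted mean of the k nearest targets, with weights 1 / distance^2."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    if dists[nearest[0]] < eps:
        # Query sits on a training point: return its target, avoid dividing by zero.
        return float(y_train[nearest[0]])
    w = 1.0 / dists[nearest] ** 2
    return float(np.sum(w * y_train[nearest]) / np.sum(w))

# Toy 1D data (assumption).
X_train = np.array([[0.0], [1.0], [3.0]])
y_train = np.array([0.0, 10.0, 30.0])
x_query = np.array([2.0])

pred_k2 = weighted_knn_regress(X_train, y_train, x_query, k=2)    # two equidistant neighbors
pred_all = weighted_knn_regress(X_train, y_train, x_query, k=3)   # all points contribute
print(pred_k2, pred_all)
```

Compared with the plain mean, far neighbors still count but contribute much less, so the choice of `k` matters less.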
![Page 69: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/69.jpg)
Instance-based Regression vs. Linear Regression
• Linear Regression Learning
  – Explicit description of the target function on the whole training set
• Instance-based Learning
  – Learning = storing all training instances
  – Referred to as "Lazy" learning
![Page 70: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/70.jpg)
Extra Today
• Regression Models Beyond Linear
  – LR with non-linear basis functions
  – Instance-based Regression: K-Nearest Neighbors (later)
  – Locally weighted linear regression (extra)
  – Regression trees and Multilinear Interpolation (later)
![Page 71: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/71.jpg)
Locally weighted regression
• aka locally weighted regression, local linear regression, LOESS, …
  – A combination of kNN and linear regression
[Figure label: $K_\lambda(x_i, x_0)$]
![Page 72: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/72.jpg)
Locally weighted regression
Use an RBF function to pick out / emphasize the neighbor region of $x_0$ → $K_\lambda(x_i, x_0)$
![Page 73: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/73.jpg)
Locally weighted regression
A linear_func(x) -> y → only to represent the neighbor region of $x_0$
[Figure label: $K_\lambda(x_i, x_0)$]

$$f(x_0) = \hat{\theta}_0(x_0) + \hat{\theta}_1(x_0)\, x_0$$
![Page 74: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/74.jpg)
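Locally weighted linear regression
Instead of minimizing

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{n} \left( \mathbf{x}_i^T \theta - y_i \right)^2$$

now we fit $\theta$ to minimize

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{n} w_i \left( \mathbf{x}_i^T \theta - y_i \right)^2$$

$$w_i = K_\lambda(\mathbf{x}_i, \mathbf{x}_0) = \exp\left( -\frac{(\mathbf{x}_i - \mathbf{x}_0)^2}{2\lambda^2} \right)$$

where $\mathbf{x}_0$ is the query point for which we'd like to know its corresponding y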
![Page 75: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/75.jpg)
Locally weighted linear regression
We fit $\theta$ to minimize

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{n} w_i \left( \mathbf{x}_i^T \theta - y_i \right)^2$$

$w_i$ comes from:

$$w_i = K_\lambda(\mathbf{x}_i, \mathbf{x}_0) = \exp\left( -\frac{(\mathbf{x}_i - \mathbf{x}_0)^2}{2\lambda^2} \right)$$

• $\mathbf{x}_0$ is the query point for which we'd like to know its corresponding y

Essentially we put higher weights on training examples that are close to the query point $\mathbf{x}_0$ (than on those that are further away from $\mathbf{x}_0$).
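The weighting scheme and the weighted objective can be sketched as follows. This is an illustration under stated assumptions: the names `rbf_weights` and `weighted_cost` are hypothetical, each row of `X` is taken to be `[1, x_i]` (a bias column plus one feature), and the query `x0` carries the same leading 1 so the bias column does not affect the distance.

```python
import numpy as np

def rbf_weights(X, x0, lam):
    """w_i = exp(-||x_i - x0||^2 / (2 * lam^2)) for each training row x_i."""
    sq_dist = np.sum((X - x0) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * lam ** 2))

def weighted_cost(theta, X, y, w):
    """J(theta) = 1/2 * sum_i w_i * (x_i^T theta - y_i)^2."""
    residuals = X @ theta - y
    return 0.5 * float(np.sum(w * residuals ** 2))

# Toy design matrix with rows [1, x_i] (assumption).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 1.0, 2.0, 3.0])
x0 = np.array([1.0, 1.0])          # query at x = 1

w_narrow = rbf_weights(X, x0, lam=0.5)     # only points near x0 matter
w_wide = rbf_weights(X, x0, lam=100.0)     # all points weighted almost equally

theta = np.array([0.0, 1.0])               # the line y = x
cost_narrow = weighted_cost(theta, X, y, w_narrow)
print(w_narrow, w_wide, cost_narrow)
```

With the narrow kernel, the misfit at x = 0 is heavily discounted because that point is one unit away from the query; with the wide kernel it would count almost in full.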
![Page 76: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/76.jpg)
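Locally weighted linear regression
• The width of the RBF matters! A small $\lambda$ gives weight only to the training points closest to $x_0$, so the fit follows the data closely and can be wiggly; a large $\lambda$ weights all points almost equally, so the fit approaches ordinary linear regression.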
![Page 77: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/77.jpg)
LEARNING of Locally weighted linear regression
• Separate weighted least squares training and inference at each target point $x_0$

$$f(x_0) = \hat{\theta}_0(x_0) + \hat{\theta}_1(x_0)\, x_0$$
![Page 78: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/78.jpg)
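Locally weighted linear regression
• → Separate weighted least squares error minimization at each target point $x_0$:

$$\theta^*(x_0) = \arg\min_{\theta} \frac{1}{2} \sum_{i=1}^{n} w_i \left( \mathbf{x}_i^T \theta(x_0) - y_i \right)^2 = \arg\min_{\theta} \frac{1}{2} \sum_{i=1}^{n} K_\lambda(\mathbf{x}_i, \mathbf{x}_0) \left( \mathbf{x}_i^T \theta(x_0) - y_i \right)^2$$

$$f(x_0) = \mathbf{x}_0^T \theta^*(x_0)$$

$$\hat{f}(x_0) = \hat{\alpha}(x_0) + \hat{\beta}(x_0)\, x_0$$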
![Page 79: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/79.jpg)
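Extra: Solution of Locally weighted linear / Non-Linear Basis regression

$$W_{N \times N}(x_0) = \mathrm{diag}\left( K_\lambda(x_0, x_i) \right), \quad i = 1, \dots, N$$

LWR:

$$\theta^*(x_0) = \left( B^T W(x_0) B \right)^{-1} B^T W(x_0)\, \mathbf{y}$$

versus LR:

$$\theta^* = \left( X^T X \right)^{-1} X^T \mathbf{y}$$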
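The closed-form solution can be sketched directly from the formula. This is a minimal illustration under assumptions not in the slides: 1D inputs, a basis matrix `B` with rows `[1, x_i]`, a Gaussian RBF kernel, synthetic sine data, and the hypothetical helper names `lwr_predict` and `lr_predict`. The linear system is solved with `np.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np

def lwr_predict(x, y, x0, lam):
    """Locally weighted linear regression prediction at a 1D query point x0."""
    B = np.column_stack([np.ones_like(x), x])        # basis matrix, rows [1, x_i]
    w = np.exp(-(x - x0) ** 2 / (2.0 * lam ** 2))    # K_lam(x0, x_i)
    W = np.diag(w)                                   # W(x0), N x N diagonal
    # Solve (B^T W B) theta = B^T W y for theta*(x0).
    theta = np.linalg.solve(B.T @ W @ B, B.T @ W @ y)
    return float(np.array([1.0, x0]) @ theta)

def lr_predict(x, y, x0):
    """Ordinary (unweighted) linear regression prediction at x0."""
    X = np.column_stack([np.ones_like(x), x])
    theta = np.linalg.solve(X.T @ X, X.T @ y)
    return float(np.array([1.0, x0]) @ theta)

# Synthetic non-linear data (assumption): one period of a sine wave.
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x)
x0 = np.pi / 2                                       # true value is sin(pi/2) = 1

local_pred = lwr_predict(x, y, x0, lam=0.3)
global_pred = lr_predict(x, y, x0)
print(local_pred, global_pred)
```

A new `theta` is solved for every query point, so prediction costs one weighted least squares fit per query. Adding polynomial columns to `B` would give a local polynomial fit with the same formula.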
![Page 80: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/80.jpg)
More → Local Weighted Polynomial Regression
• Local polynomial fits of any degree d

$$\min_{\alpha(x_0),\, \beta_j(x_0),\, j=1,\dots,d} \; \sum_{i=1}^{N} K_\lambda(x_0, x_i) \left[ y_i - \alpha(x_0) - \sum_{j=1}^{d} \beta_j(x_0)\, x_i^{j} \right]^2$$

$$\hat{f}(x_0) = \hat{\alpha}(x_0) + \sum_{j=1}^{d} \hat{\beta}_j(x_0)\, x_0^{j}$$

[Figure legend: Blue: true; Green: estimated]
![Page 81: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/81.jpg)
Extra: Parametric vs. non-parametric
• Locally weighted linear regression is a non-parametric algorithm.
• The (unweighted) linear regression algorithm that we saw earlier is known as a parametric learning algorithm
  – because it has a fixed, finite number of parameters (the $\theta$), which are fit to the data;
  – Once we've fit the $\theta$ and stored them away, we no longer need to keep the training data around to make future predictions.
  – In contrast, to make predictions using locally weighted linear regression, we need to keep the entire training set around.
• The term "non-parametric" (roughly) refers to the fact that the amount of knowledge we need to keep, in order to represent the hypothesis, grows linearly with the size of the training set.
![Page 82: UVA CS 4501: Machine Learning Lecture 5: Non …...RBF= radial-basis function:a function which depends only on the radial distance from a centrepoint Gaussian RBFè as distancefrom](https://reader034.vdocument.in/reader034/viewer/2022042219/5ec4ce9cc620f72ddb6d389c/html5/thumbnails/82.jpg)
(3) Locally Weighted / Kernel Linear Regression
• Task: Regression
• Representation: Y = weighted linear sum of X's

$$\hat{f}(x_0) = \hat{\alpha}(x_0) + \hat{\beta}(x_0)\, x_0$$

• Score Function: Weighted SSE

$$\min_{\alpha(x_0),\, \beta(x_0)} \; \sum_{i=1}^{N} K_\lambda(x_0, x_i) \left[ y_i - \alpha(x_0) - \beta(x_0)\, x_i \right]^2$$

• Search/Optimization: Linear algebra

$$\theta^*(x_0) = \left( B^T W(x_0) B \right)^{-1} B^T W(x_0)\, \mathbf{y}$$

• Models, Parameters: Local regression coefficients (conditioned on each test point)