Advanced Topics I
Nonlinear Regression
Non-parametric Regression
Ridge Regression
11.1
Lecture 11: Advanced Topics I
STAT 8020 Statistical Methods II
September 24, 2020
Whitney Huang
Clemson University
Agenda
1 Nonlinear Regression
2 Non-parametric Regression
3 Ridge Regression
Moving Away From Linear Regression
We have mainly focused on linear regression so far
The class of polynomial regression models can be thought of as a starting point for relaxing the linearity assumption
In this lecture we are going to discuss non-linear and non-parametric regression modeling
Population of the United States
Let’s look at the USPop data set, a built-in data set in R. This is a decennial time series from 1790 to 2000.
[Figure: U.S. population. x-axis: Census year (1800 to 2000); y-axis: Population in millions (0 to 300).]
Logistic Growth Curve
A simple model for population growth is the logistic growth model,
$$Y = m(X, \phi) + \varepsilon = \frac{\phi_1}{1 + \exp\left[-(X - \phi_2)/\phi_3\right]} + \varepsilon$$
[Figure: Logistic growth curve. x-axis from −5 to 10; y-axis from 0 to 10.]
We are going to fit a logistic growth curve to the U.S. population data set
Fitting logistic growth curve to the U.S. population
Estimated parameters: $\hat{\phi}_1 = 440.83$, $\hat{\phi}_2 = 1976.63$, $\hat{\phi}_3 = 46.29$
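To make the fitted curve concrete, here is a minimal Python sketch that evaluates the logistic growth mean function at the parameter values reported above. The function name `logistic_growth` and the choice of years are illustrative assumptions, not part of the lecture; the lecture itself works in R.

```python
import math

def logistic_growth(x, phi1, phi2, phi3):
    """Mean function m(x, phi) = phi1 / (1 + exp(-(x - phi2) / phi3))."""
    return phi1 / (1.0 + math.exp(-(x - phi2) / phi3))

# Parameter values reported on the slide.
PHI1, PHI2, PHI3 = 440.83, 1976.63, 46.29

# phi1 is the upper asymptote; at x = phi2 the curve reaches exactly phi1 / 2.
midpoint = logistic_growth(PHI2, PHI1, PHI2, PHI3)

# Fitted mean population (in millions) at a few illustrative years.
fitted = {year: logistic_growth(year, PHI1, PHI2, PHI3)
          for year in (1800, 1900, 2000, 2100)}

for year, value in fitted.items():
    print(year, round(value, 1))
```

Under this parameterization, the fitted curve levels off at about 440.83 million and passes through half of that value in about 1976.63, while $\phi_3$ sets how quickly the curve rises.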
[Figure: U.S. population data (*) with fitted curves extended to 2100. x-axis: Census year (1800 to 2100); y-axis: Population (millions) (0 to 500). Legend: NLR, PolyR.]
Non-parametric Regression
Let’s use the motorcycle impact data as an illustrative example. This data set is taken from a simulated motorcycle crash experiment conducted to study the efficacy of crash helmets.
[Figure: Motorcycle impact data. x-axis: Time (ms) (10 to 50); y-axis: Acceleration (g) (−100 to 50).]
Non-parametric Regression Fits
The main idea of “non-parametric” regression modeling is to fit the data “locally”. Therefore, no global structural assumption is made when fitting the data.
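One simple way to see what fitting “locally” means is a kernel-weighted average, sketched below in Python. This is a Nadaraya-Watson style smoother chosen for brevity; it is not one of the spline or additive-model fits used in the lecture. The function name `kernel_smooth`, the Gaussian kernel, the bandwidth values, and the synthetic sine data are all assumptions made for illustration.

```python
import math

def kernel_smooth(x0, xs, ys, bandwidth):
    """Locally weighted average of ys at x0, using Gaussian kernel weights."""
    # Observations close to x0 get weights near 1; distant ones get weights near 0.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Synthetic illustration data (not the motorcycle data): a noiseless sine wave.
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) for x in xs]

# The estimate at each point depends mostly on nearby observations.
fit_at_peak = kernel_smooth(math.pi / 2, xs, ys, bandwidth=0.2)
fit_constant = kernel_smooth(1.0, xs, [3.0] * len(xs), bandwidth=0.5)
```

A smaller bandwidth makes the fit more local and wigglier, while a larger bandwidth averages over more of the data and gives a smoother curve.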
[Figure: Motorcycle impact data with non-parametric fits. x-axis: Time (ms) (10 to 50); y-axis: Acceleration (g) (−100 to 50). Legend: Regression Spline, Generalized Additive Model, Smoothing Spline.]
Regression Tree
Partition the X-space into sub-regions and fit a simple model to each sub-region
The partitioning pattern is encoded in a tree structure
We will use the Major League Baseball Hitters data from the 1986–1987 season to give you a quick idea of what a regression tree might look like
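The partitioning idea can be sketched with a single split: search over thresholds on one predictor and keep the one that minimizes the total within-region sum of squares, fitting a constant (the mean) in each region. The Python sketch below is a simplified illustration under those assumptions, not the tree-growing routine used in the lecture; the helper names and the toy numbers are made up and are not the Hitters data.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the rule x < threshold."""
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct x values to split")
    return best[1:]

# Made-up toy data: the response jumps once x reaches 5.
x_toy = [1, 2, 3, 4, 5, 6, 7, 8]
y_toy = [200, 210, 190, 220, 700, 680, 720, 690]
threshold, left_mean, right_mean = best_split(x_toy, y_toy)
```

A full regression tree applies the same search recursively within each resulting sub-region and over all predictors, which is what produces the nested rules of a tree diagram.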
Regression Tree
[Figure: Regression tree for the Hitters data. The root split is Years < 5; further splits use Years (at 4, 6, 7) and Hits (at 42, 118, 142, 152, 160, 185). Each node is labeled with a fitted value and the percentage of observations it contains, starting from 536 (100%) at the root.]
Longley’s Economic Regression Data
We are going to use Longley’s data set, which provides a well-known example of multicollinearity, to illustrate ridge regression.
Linear Regression Fit
The Predictor Variables are Highly Correlated
Ridge Regression as Multicollinearity Remedy
Recall that least squares suffers because $X^T X$ is almost singular, which results in highly unstable parameter estimates
Ridge regression is a modification of least squares that overcomes the multicollinearity problem:
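$$\hat{\beta}_{\text{ridge}} = \underset{\beta}{\operatorname{argmin}} \; (Y - Z\beta)^T (Y - Z\beta) \quad \text{s.t.} \quad \sum_{j=1}^{p-1} \beta_j^2 \le t,$$
where Z is assumed to be standardized and Y is assumed to be centered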
Ridge regression results in (slightly) biased but more stable estimates, which can give better prediction performance
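The constrained problem above has an equivalent penalized form: for each bound t there is a penalty $\lambda \ge 0$ such that the same solution minimizes $(Y - Z\beta)^T (Y - Z\beta) + \lambda \sum_j \beta_j^2$, which gives the closed form $(Z^T Z + \lambda I)^{-1} Z^T Y$. The NumPy sketch below uses that closed form on synthetic, nearly collinear predictors. The data are simulated for illustration (they are not Longley’s data), and the function name `ridge_fit` and the value `lam=1.0` are assumptions.

```python
import numpy as np

def ridge_fit(Z, y, lam):
    """Solve (Z'Z + lam * I) beta = Z'y for standardized Z and centered y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

# Synthetic data with two nearly identical predictors.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)  # almost a copy of x1
X = np.column_stack([x1, x2])
y_raw = x1 + x2 + rng.normal(size=n)

# Standardize the predictors and center the response, as assumed above.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
y = y_raw - y_raw.mean()

beta_ols = ridge_fit(Z, y, lam=0.0)    # lam = 0 reduces to least squares
beta_ridge = ridge_fit(Z, y, lam=1.0)  # lam > 0 shrinks the coefficients

print("least squares:", beta_ols)
print("ridge:", beta_ridge)
```

Adding $\lambda I$ moves $Z^T Z$ away from singularity, so the ridge coefficients have a smaller sum of squares than the least squares ones. In practice $\lambda$ is usually chosen by a data-driven criterion such as cross-validation.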
Ridge Regression Fit