Unit 3: Multiple Linear Regression
TRANSCRIPT
NYUWIRELESS
Unit 3: Multiple Linear Regression
EL-GY 6143 / CS-GY 6923: Introduction to Machine Learning
PROF. PEI LIU
Learning Objectives
- Formulate a machine learning model as a multiple linear regression model.
  - Identify the prediction vector and target for the problem.
- Write the regression model in matrix form; write the feature matrix.
- Compute the least-squares solution for the regression coefficients on training data.
- Derive the least-squares formula from minimization of the RSS.
- Manipulate 2D arrays in Python (indexing, stacking, computing shapes, …).
- Compute the LS solution using Python linear algebra and machine learning packages.
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Example: Blood Glucose Level
- Diabetes patients must monitor their glucose level.
- What causes blood glucose levels to rise and fall?
- Many factors.
- We know the mechanisms qualitatively.
- But quantitative models are difficult to obtain:
  - Hard to derive from first principles
  - Difficult to model the physiological processes precisely
- Can machine learning help?
Data from the AIM 94 Experiment
- Data collected as a series of events:
  - Eating
  - Exercise
  - Insulin dosage
- Target variable: glucose level, monitored over time
Demo on GitHub
- All code is available on GitHub:
  https://github.com/pliugithub/MachineLearning/blob/master/unit03_mult_lin_reg/demo_glucose.ipynb
Using Google Colaboratory
- Two options for running the demos:
- Option 1:
  - Clone the GitHub repository to your local machine and run it using Jupyter notebook.
  - Need to install all the software correctly.
- Option 2: Run in the cloud on Google Colaboratory.
- For Option 2:
  - Go to https://colab.research.google.com/
  - File -> Open Notebook
  - Select the GitHub tab
  - Enter the GitHub URL: https://github.com/sdrangan/introml
  - Select unit03_mult_lin_reg/demo1_glucose.ipynb
Demo on Google Colab
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Simple vs. Multiple Regression
- Simple linear regression: one predictor (feature)
  - Scalar predictor $x$
  - Linear model: $\hat{y} = \beta_0 + \beta_1 x$
  - Can only account for one variable
- Multiple linear regression: multiple predictors (features)
  - Vector predictor $\boldsymbol{x} = (x_1, \ldots, x_k)$
  - Linear model: $\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
  - Can account for multiple predictors
  - Reduces to simple linear regression when $k = 1$
Comparison to Single-Variable Models
- We could compute models for each variable separately:
  $$y = a_1 + b_1 x_1, \quad y = a_2 + b_2 x_2, \quad \ldots$$
- But this doesn't provide a way to account for joint effects.
- Example: consider three linear models for predicting longevity:
  - A: Longevity vs. some factor in diet (e.g. amount of fiber consumed)
  - B: Longevity vs. exercise
  - C: Longevity vs. diet AND exercise
  - What does C tell you that A and B do not?
Special Case: Single Variable
- Suppose $k = 1$ predictor.
- Feature matrix and coefficient vector:
  $$A = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}$$
- LS solution:
  $$\boldsymbol{\beta} = \left(\tfrac{1}{n} A^T A\right)^{-1} \tfrac{1}{n} A^T \boldsymbol{y} = P^{-1} r, \quad P = \begin{bmatrix} 1 & \bar{x} \\ \bar{x} & \overline{x^2} \end{bmatrix}, \quad r = \begin{bmatrix} \bar{y} \\ \overline{xy} \end{bmatrix}$$
- Obtain the single-variable solutions for the coefficients (after some algebra):
  $$\beta_1 = \frac{s_{xy}}{s_x^2}, \quad \beta_0 = \bar{y} - \beta_1 \bar{x}, \quad R^2 = \frac{s_{xy}^2}{s_x^2 s_y^2}$$
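The closed-form coefficients above are easy to check numerically. A minimal sketch in NumPy, using a small made-up dataset (not the course data):

```python
import numpy as np

# Hypothetical data (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample statistics
xm, ym = x.mean(), y.mean()
sxy = np.mean((x - xm) * (y - ym))   # sample cross-covariance s_xy
sxx = np.mean((x - xm) ** 2)         # sample variance s_x^2
syy = np.mean((y - ym) ** 2)         # sample variance s_y^2

beta1 = sxy / sxx            # slope
beta0 = ym - beta1 * xm      # intercept
R2 = sxy ** 2 / (sxx * syy)  # coefficient of determination
```

The slope and intercept agree with what a generic degree-1 polynomial fit would return.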
Loading the Data
- scikit-learn package:
  - Many methods for machine learning
  - Datasets
  - Will use throughout this class
- The diabetes dataset is one example.
Simple Linear Regression for Diabetes Data
- Try a fit of each variable individually.
- Compute the $R_j^2$ coefficient for each variable.
- Use the formula on the previous slide.
- Even the "best" individual variable is a poor fit:
  - $R_j^2 \approx 0.34$ for the best individual variable
Scatter Plot
- No one variable explains glucose well.
- Multiple linear regression could be much better.
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Finding a Mathematical Model
- Goal: Find a function to predict glucose level from the 10 attributes.
- Problem: several attributes
  - Need a multi-variable function
- Attributes: $x_1$: Age, $x_2$: Sex, $x_3$: BMI, $x_4$: BP, $x_5$: S1, …, $x_{10}$: S6
- Target: $y$ = glucose level
- Model: $y \approx \hat{y} = f(x_1, \ldots, x_{10})$
Matrix Representation of Data
- Data is a matrix (samples $\times$ attributes) and a target vector:
  $$X = \begin{bmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nk} \end{bmatrix}, \quad \boldsymbol{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \;\text{(target vector)}$$
- $n$ samples: one sample per row
- $k$ features / attributes / predictors: one feature per column
- This example:
  - $y_i$ = blood glucose measurement of the $i$-th sample
  - $x_{ij}$: $j$-th feature of the $i$-th sample
  - $\boldsymbol{x}_i^T = [x_{i1}, x_{i2}, \ldots, x_{ik}]$: feature or predictor vector
  - The $i$-th sample is the pair $(\boldsymbol{x}_i, y_i)$
In-Class Exercise
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Multivariable Linear Model for Glucose
- Goal: Find a function to predict glucose level from the 10 attributes.
- Features (10): Age, Sex, BMI, BP, S1, …, S6, collected as $\boldsymbol{x} = [x_1, \ldots, x_{10}]$
- Target: $y$ = glucose level, $y \approx \hat{y} = f(x_1, \ldots, x_{10})$
- Linear Model: Assume glucose is a linear function of the predictors:
  $$\text{glucose} \approx \text{prediction} = \beta_0 + \beta_1\,\text{Age} + \cdots + \beta_4\,\text{BP} + \beta_5\,\text{S1} + \cdots + \beta_{10}\,\text{S6}$$
- General form (with intercept $\beta_0$):
  $$y \approx \hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_4 x_4 + \beta_5 x_5 + \cdots + \beta_{10} x_{10}$$
Multiple Variable Linear Model
- Vector of features: $\boldsymbol{x} = [x_1, \ldots, x_k]$
  - $k$ features (also known as predictors, independent variables, attributes, covariates, …)
- Single target variable $y$:
  - What we want to predict
- Linear model: make a prediction $\hat{y}$:
  $$y \approx \hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$
- Data for training:
  - Samples are $(\boldsymbol{x}_i, y_i)$, $i = 1, 2, \ldots, n$.
  - Each sample has a vector of features $\boldsymbol{x}_i = [x_{i1}, \ldots, x_{ik}]$ and a scalar target $y_i$.
- Problem: Learn the best coefficients $\boldsymbol{\beta} = [\beta_0, \beta_1, \ldots, \beta_k]$ from the training data.
Example: Heart Rate Increase
- Linear Model: HR increase $\approx \beta_0 + \beta_1$ (mins exercise) $+ \beta_2$ (exercise intensity)
- Data: measuring fitness of athletes

| Subject number | HR before | HR after | Mins on treadmill | Speed (min/km) | Days exercise / week |
|---|---|---|---|---|---|
| 123 | 60 | 90 | 1 | 5.2 | 3 |
| 456 | 80 | 110 | 2 | 4.1 | 1 |
| 789 | 70 | 130 | 5 | 3.5 | 2 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

https://www.mercurynews.com/2017/10/29/4851089/
Why Use a Linear Model?
- Many natural phenomena have a linear relationship.
- Predictor has small variation:
  - Suppose $y = f(x)$.
  - If the variation of $x$ is small around some value $x_0$, then
    $$y \approx f(x_0) + f'(x_0)(x - x_0) = \beta_0 + \beta_1 x, \quad \beta_0 = f(x_0) - f'(x_0)\,x_0, \quad \beta_1 = f'(x_0)$$
- Simple to compute.
- Easy to interpret the relation:
  - Coefficient $\beta_j$ indicates the importance of feature $j$ for the target.
- Advanced: Gaussian random variables:
  - If two variables are jointly Gaussian, the optimal predictor of one from the other is a linear predictor.
Matrix Review
- Consider
  $$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}, \quad x = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
- Compute (computations on the board):
  - Matrix-vector multiply: $Ax$
  - Transpose: $A^T$
  - Matrix multiply: $AB$
  - Solution to linear equations: solve $x = Bu$ for $u$
  - Matrix inverse: $B^{-1}$
Slopes, Intercepts, and Inner Products
- Model with coefficients $\boldsymbol{\beta}$: $\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
- Sometimes we use the weight-bias version:
  $$\hat{y} = b + w_1 x_1 + \cdots + w_k x_k$$
  - $b = \beta_0$: bias or intercept
  - $\boldsymbol{w} = \boldsymbol{\beta}_{1:k} = [\beta_1, \ldots, \beta_k]$: weights or slope vector
- Can write either with an inner product:
  $$\hat{y} = \beta_0 + \boldsymbol{\beta}_{1:k} \cdot \boldsymbol{x} \quad \text{or} \quad \hat{y} = b + \boldsymbol{w} \cdot \boldsymbol{x}$$
- Inner product: $\boldsymbol{w} \cdot \boldsymbol{x} = \sum_{j=1}^k w_j x_j$
  - We will also use the alternate notation $\boldsymbol{w}^T\boldsymbol{x} = \langle \boldsymbol{w}, \boldsymbol{x} \rangle$.
Matrix Form of Linear Regression
- Data: $(\boldsymbol{x}_i, y_i)$, $i = 1, \ldots, n$
- Predicted value for the $i$-th sample: $\hat{y}_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$
- Matrix form:
  $$\begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}$$
- Matrix equation: $\hat{\boldsymbol{y}} = \boldsymbol{A}\boldsymbol{\beta}$
  - $\boldsymbol{\beta}$: coefficient vector with $p = k + 1$ entries
  - $\boldsymbol{A}$: an $n \times (k+1)$ feature matrix
  - $\hat{\boldsymbol{y}}$: the $n$ predicted values
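The feature matrix $\boldsymbol{A}$ can be sketched in NumPy by prepending a column of ones to the data matrix; the data values below are made up for illustration:

```python
import numpy as np

# Hypothetical data: n = 4 samples, k = 2 features
X = np.array([[0.5, 1.2],
              [1.0, 0.7],
              [1.5, 2.3],
              [2.0, 1.9]])
n, k = X.shape

# Feature matrix A: column of ones (intercept) followed by the features
A = np.column_stack([np.ones(n), X])   # shape (n, k+1)

beta = np.array([1.0, 2.0, -0.5])      # [beta0, beta1, beta2], illustrative
yhat = A @ beta                        # predictions, shape (n,)
```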
In-Class Exercise
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Least Squares Model Fitting
- How do we select the parameters $\boldsymbol{\beta} = (\beta_0, \ldots, \beta_k)$?
- Define $\hat{y}_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$:
  - The predicted value on sample $i$ for parameters $\boldsymbol{\beta} = (\beta_0, \ldots, \beta_k)$
- Define the residual sum of squares:
  $$\mathrm{RSS}(\boldsymbol{\beta}) := \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
  - Note that $\hat{y}_i$ is implicitly a function of $\boldsymbol{\beta} = (\beta_0, \ldots, \beta_k)$.
  - Also called the sum of squared residuals (SSR) and sum of squared errors (SSE).
- Least squares solution: find $\boldsymbol{\beta}$ to minimize the RSS.
Variants of RSS
- We often use some variant of RSS:
  - Note: these names are not standard.
- Residual sum of squares: $\mathrm{RSS} = \sum_{i=1}^n (y_i - \hat{y}_i)^2$
- RSS per sample, or Mean Squared Error:
  $$\mathrm{MSE} = \frac{\mathrm{RSS}}{n} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2$$
- Normalized RSS or normalized MSE:
  $$\frac{\mathrm{RSS}/n}{s_y^2} = \frac{\mathrm{MSE}}{s_y^2} = \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\sum_{i=1}^n (y_i - \bar{y})^2}$$
Finding Parameters via Optimization: A General ML Recipe

General ML problem:
- Pick a model with parameters.
- Get data.
- Pick a loss function:
  - Measures goodness of fit of the model to the data
  - A function of the parameters
- Find the parameters that minimize the loss.

Multiple linear regression:
- Linear model: $\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
- Data: $(\boldsymbol{x}_i, y_i)$, $i = 1, 2, \ldots, n$
- Loss function: $\mathrm{RSS}(\beta_0, \ldots, \beta_k) := \sum_i (y_i - \hat{y}_i)^2$
- Select $\boldsymbol{\beta} = (\beta_0, \ldots, \beta_k)$ to minimize $\mathrm{RSS}(\boldsymbol{\beta})$.
RSS as a Vector Norm
- RSS is given by the sum:
  $$\mathrm{RSS} = \sum_{i=1}^n (y_i - \hat{y}_i)^2$$
- Define the norm of a vector:
  - $\|\boldsymbol{x}\| = (x_1^2 + \cdots + x_n^2)^{1/2}$
  - The standard Euclidean norm
  - Sometimes called the $\ell_2$ norm ($\ell$ is for Lebesgue)
- Write RSS in vector form:
  $$\mathrm{RSS} = \|\boldsymbol{y} - \hat{\boldsymbol{y}}\|^2$$
Least Squares Solution
- Consider the RSS cost function:
  $$\mathrm{RSS}(\boldsymbol{\beta}) = \sum_{i=1}^n (y_i - \hat{y}_i)^2, \quad \hat{y}_i = \sum_{j=0}^k A_{ij}\beta_j$$
  - The vector $\boldsymbol{\beta}$ that minimizes the RSS is called the least-squares solution.
- Least squares solution: the vector $\boldsymbol{\beta}$ that minimizes the RSS is
  $$\hat{\boldsymbol{\beta}} = (\boldsymbol{A}^T\boldsymbol{A})^{-1}\boldsymbol{A}^T\boldsymbol{y}$$
  - Can compute the best coefficient vector analytically
  - Just solve a linear set of equations
  - Will show the proof below
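A minimal sketch of computing $\hat{\boldsymbol{\beta}}$ in NumPy, on illustrative random data; the normal-equations route and NumPy's built-in least-squares solver give the same answer:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative feature matrix with an intercept column, and random targets
A = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = rng.normal(size=50)

# Normal equations: solve (A^T A) beta = A^T y
beta_ne = np.linalg.solve(A.T @ A, A.T @ y)

# Library least-squares solver (preferred numerically)
beta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
```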
Proving the LS Formula
- Least squares formula: the vector $\boldsymbol{\beta}$ that minimizes the RSS is
  $$\hat{\boldsymbol{\beta}} = (\boldsymbol{A}^T\boldsymbol{A})^{-1}\boldsymbol{A}^T\boldsymbol{y}$$
- To prove this formula, we will:
  - Review gradients of multi-variable functions
  - Compute the gradient $\nabla \mathrm{RSS}(\boldsymbol{\beta})$
  - Solve $\nabla \mathrm{RSS}(\boldsymbol{\beta}) = 0$
Gradients of Multi-Variable Functions
- Consider a scalar-valued function of a vector: $f(\boldsymbol{\beta}) = f(\beta_1, \ldots, \beta_n)$
- The gradient is the column vector:
  $$\nabla f(\boldsymbol{\beta}) = \begin{bmatrix} \partial f(\boldsymbol{\beta})/\partial\beta_1 \\ \vdots \\ \partial f(\boldsymbol{\beta})/\partial\beta_n \end{bmatrix}$$
- Represents the direction of maximum increase.
- At a local minimum or maximum: $\nabla f(\boldsymbol{\beta}) = 0$
  - Solve $n$ equations in $n$ unknowns.
- Example: $f(\beta_1, \beta_2) = \beta_1\sin\beta_2 + \beta_1^2\beta_2$.
  - Compute $\nabla f(\boldsymbol{\beta})$. Solution on the board.
Proof of the LS Formula
- Consider the RSS cost function:
  $$\mathrm{RSS} = \sum_{i=1}^n (y_i - \hat{y}_i)^2, \quad \hat{y}_i = \sum_{j=0}^k A_{ij}\beta_j$$
  - The vector $\boldsymbol{\beta}$ that minimizes the RSS is called the least-squares solution.
- Compute partial derivatives via the chain rule:
  $$\frac{\partial\,\mathrm{RSS}}{\partial\beta_j} = -2\sum_{i=1}^n (y_i - \hat{y}_i)A_{ij}, \quad j = 0, 1, \ldots, k$$
- Matrix form: $\mathrm{RSS} = \|A\boldsymbol{\beta} - \boldsymbol{y}\|^2$, $\nabla\mathrm{RSS} = -2A^T(\boldsymbol{y} - \boldsymbol{A\beta})$
- Solution: $A^T(\boldsymbol{y} - A\boldsymbol{\beta}) = 0 \;\Rightarrow\; \boldsymbol{\beta} = (A^TA)^{-1}A^T\boldsymbol{y}$ (the least squares solution of the equation $A\boldsymbol{\beta} = \boldsymbol{y}$)
- Minimum RSS: $\mathrm{RSS} = \boldsymbol{y}^T\left(I - A(A^TA)^{-1}A^T\right)\boldsymbol{y}$
  - Proof on the board.
LS Solution via Auto-Correlation Functions
- Each data sample has a linear feature vector:
  $$A_i = [A_{i0}, \cdots, A_{ik}] = [1, x_{i1}, \cdots, x_{ik}]$$
- Define the sample auto-correlation matrix and cross-correlation vector:
  - $R_{AA} = \frac{1}{n}A^TA$, $\;R_{AA}(\ell, m) = \frac{1}{n}\sum_{i=1}^n A_{i\ell}A_{im}$ (correlation of feature $\ell$ and feature $m$)
  - $R_{Ay} = \frac{1}{n}A^T\boldsymbol{y}$, $\;R_{Ay}(\ell) = \frac{1}{n}\sum_{i=1}^n A_{i\ell}\,y_i$ (correlation of feature $\ell$ and the target)
- The least squares solution is: $\boldsymbol{\beta} = R_{AA}^{-1}R_{Ay}$
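A sketch of the correlation form on synthetic data; since $R_{AA}^{-1}R_{Ay} = (A^TA)^{-1}A^T\boldsymbol{y}$ (the factors of $1/n$ cancel), it matches the ordinary least-squares solution:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data: intercept column plus 2 features, noisy linear targets
A = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = A @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
n = len(y)

RAA = (A.T @ A) / n      # sample auto-correlation matrix
RAy = (A.T @ y) / n      # sample cross-correlation vector
beta = np.linalg.solve(RAA, RAy)
```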
Mean-Removed Form of the LS Solution
- Often useful to remove the mean from the data before fitting.
- Sample means: $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$, $\;\bar{x}_j = \frac{1}{n}\sum_{i=1}^n x_{ij}$, $\;\bar{\boldsymbol{x}} = [\bar{x}_1, \cdots, \bar{x}_k]$
- Define the mean-removed data: $\tilde{X}_{ij} = x_{ij} - \bar{x}_j$, $\;\tilde{y}_i = y_i - \bar{y}$
- Sample covariance matrix and cross-covariance vector:
  - $S_{xx}(\ell, m) = \frac{1}{n}\sum_{i=1}^n (x_{i\ell} - \bar{x}_\ell)(x_{im} - \bar{x}_m)$, so $S_{xx} = \frac{1}{n}\tilde{\boldsymbol{X}}^T\tilde{\boldsymbol{X}}$
  - $S_{xy}(\ell) = \frac{1}{n}\sum_{i=1}^n (x_{i\ell} - \bar{x}_\ell)(y_i - \bar{y})$, so $S_{xy} = \frac{1}{n}\tilde{\boldsymbol{X}}^T\tilde{\boldsymbol{y}}$
- Mean-removed form of the least squares solution:
  $$\hat{y} = \boldsymbol{\beta}_{1:k} \cdot \boldsymbol{x} + \beta_0, \quad \boldsymbol{\beta}_{1:k} = S_{xx}^{-1}S_{xy}, \quad \beta_0 = \bar{y} - \boldsymbol{\beta}_{1:k} \cdot \bar{\boldsymbol{x}}$$
- Proof: on the board.
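The mean-removed form can be sketched as follows on synthetic data; `Sxx` and `Sxy` are the sample covariance quantities defined above:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic data: 3 features, noisy linear targets with intercept 3.0
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -0.7, 2.0]) + 3.0 + 0.1 * rng.normal(size=200)
n = len(y)

xbar, ybar = X.mean(axis=0), y.mean()
Xt, yt = X - xbar, y - ybar              # mean-removed data
Sxx = (Xt.T @ Xt) / n                    # sample covariance matrix
Sxy = (Xt.T @ yt) / n                    # cross-covariance vector

w = np.linalg.solve(Sxx, Sxy)            # beta_{1:k}
beta0 = ybar - w @ xbar                  # intercept beta_0
```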
$R^2$: Goodness of Fit
- Multiple variable coefficient of determination:
  $$R^2 = \frac{s_y^2 - \mathrm{MSE}}{s_y^2} = 1 - \frac{\mathrm{MSE}}{s_y^2}$$
  - $\mathrm{MSE} = \frac{\mathrm{RSS}}{n} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2$
  - The sample variance is $s_y^2 = \frac{1}{n}\sum_{i=1}^n (y_i - \bar{y})^2$, with $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$.
- Interpretation:
  - $\dfrac{\mathrm{MSE}}{s_y^2} = \dfrac{\text{error with the linear predictor}}{\text{error predicting by the mean}}$
  - $R^2$ = fraction of the variance reduced or "explained" by the model.
- On the training data (not necessarily on the test data):
  - $R^2 \in [0, 1]$ always
  - $R^2 \approx 1 \Rightarrow$ the linear model provides a good fit
  - $R^2 \approx 0 \Rightarrow$ the linear model provides a poor fit
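Computing $R^2$ directly from the definitions above, with made-up targets and predictions:

```python
import numpy as np

# Hypothetical targets and model predictions (illustrative values only)
y = np.array([3.0, 5.0, 7.0, 9.0])
yhat = np.array([2.8, 5.3, 6.9, 9.1])

mse = np.mean((y - yhat) ** 2)   # MSE = RSS / n
sy2 = np.var(y)                  # s_y^2 = (1/n) sum (y_i - ybar)^2
R2 = 1 - mse / sy2               # coefficient of determination
```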
In-Class Exercise
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing the solutions in Python
- Special case: simple linear regression
- Extensions
Arrays and Vectors in Python and MATLAB
- There are some key differences between MATLAB and Python that you need to get used to.
- MATLAB:
  - All arrays have at least 2 dimensions.
  - Vectors are $1 \times N$ (row) or $N \times 1$ (column) vectors.
  - Matrix-vector multiplication syntax depends on whether the vector is on the left or right: x'*A or A*x
- Python:
  - Arrays can have 1, 2, 3, … dimensions.
  - Vectors can be 1D arrays; matrices are generally 2D arrays.
  - Vectors that are 1D arrays are neither row nor column vectors.
  - If x is 1D and A is 2D, then left and right multiplication use the same syntax: x.dot(A) and A.dot(x)
- Lecture notes: we will generally treat $x$ and $x^T$ the same.
  - Can write $x = (x_1, \ldots, x_n)$ and still multiply by a matrix on the left or right.
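A small sketch of the 1D-array behavior described above:

```python
import numpy as np

x = np.array([1.0, 2.0])         # 1D array: neither row nor column vector
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

left = x.dot(A)    # x treated as a row vector:    x^T A
right = A.dot(x)   # x treated as a column vector: A x
# Both results are again 1D arrays of shape (2,)

x_row = x.reshape(1, 2)          # an explicit 2D row vector, MATLAB-style
```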
Fitting Using sklearn
- Return to the diabetes data example.
- All code is in the demo.
- Divide the data into two portions:
  - Training data: first 300 samples
  - Test data: remaining 142 samples
- Train the model on the training data.
- Test the model (i.e., measure the RSS) on the test data.
- The reason for splitting the data is discussed in the next lecture.
Manually Computing the Solution
- Use numpy linear algebra routines to solve
  $$\boldsymbol{\beta} = (A^TA)^{-1}A^T\boldsymbol{y}$$
- Common mistake:
  - Compute the matrix inverse $P = (A^TA)^{-1}$,
  - then compute $\boldsymbol{\beta} = PA^T\boldsymbol{y}$.
  - A full matrix inverse is VERY slow, and not needed.
  - Can directly solve the least-squares system $A\boldsymbol{\beta} \approx \boldsymbol{y}$.
  - Numpy has routines to solve this directly.
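A sketch of the recommended approach on illustrative random data; `np.linalg.lstsq` solves the least-squares problem directly, without forming an explicit inverse:

```python
import numpy as np

rng = np.random.default_rng(3)
# Illustrative data: 300 samples, intercept column plus 10 features
A = np.column_stack([np.ones(300), rng.normal(size=(300, 10))])
y = rng.normal(size=300)

# Avoid: P = np.linalg.inv(A.T @ A); beta = P @ A.T @ y   (slow, less stable)
beta = np.linalg.lstsq(A, y, rcond=None)[0]   # solves A beta ≈ y directly
```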
Calling the sklearn Linear Regression Method
- Construct a linear regression object.
- Run it on the training data.
- Predict values on the test data.
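The three steps above can be sketched as follows; the dataset, variable names, and 70/30 split here are illustrative stand-ins, not the course demo:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data with known coefficients
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0

Xtr, ytr = X[:70], y[:70]        # training portion
Xts, yts = X[70:], y[70:]        # test portion

regr = LinearRegression()        # fits an intercept by default
regr.fit(Xtr, ytr)               # learn beta from the training data
yhat = regr.predict(Xts)         # predictions on the test data
rss_test = np.sum((yts - yhat) ** 2)
```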
In-Class Exercise
Outline
- Motivating Example: Understanding glucose levels in diabetes patients
- Multiple variable linear models
- Least squares solutions
- Computing in Python
- Extensions
Transformed Linear Models
- Standard linear model: $\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
- The linear model may be too restrictive:
  - The relation between $x$ and $y$ can be nonlinear.
- Useful to look at models in transformed form:
  $$\hat{y} = \beta_1\phi_1(\boldsymbol{x}) + \cdots + \beta_p\phi_p(\boldsymbol{x})$$
  - Each function $\phi_j(\boldsymbol{x}) = \phi_j(x_1, \ldots, x_d)$ is called a basis function.
  - Each basis function may be nonlinear and a function of multiple variables.
- Can write in vector form: $\hat{y} = \boldsymbol{\phi}(\boldsymbol{x}) \cdot \boldsymbol{\beta}$
  - $\boldsymbol{\phi}(\boldsymbol{x}) = [\phi_1(\boldsymbol{x}), \ldots, \phi_p(\boldsymbol{x})]$, $\;\boldsymbol{\beta} = [\beta_1, \ldots, \beta_p]$
![Page 50: Unit 3 Multiple Linear Regression](https://reader031.vdocument.in/reader031/viewer/2022021021/62047da1901936518a08f564/html5/thumbnails/50.jpg)
NYUWIRELESS
Fitting Transformed Linear Models
- Consider the transformed linear model
  $$\hat{y} = \beta_1\phi_1(\boldsymbol{x}) + \cdots + \beta_p\phi_p(\boldsymbol{x})$$
- We can fit this model exactly as before:
  - Given data $(\boldsymbol{x}_i, y_i)$, $i = 1, \ldots, n$
  - Want to fit the model from the transformed variables $\phi_j(\boldsymbol{x})$ to the target $y$
  - Define the transformed matrix:
    $$A = \begin{bmatrix} \phi_1(\boldsymbol{x}_1) & \cdots & \phi_p(\boldsymbol{x}_1) \\ \vdots & & \vdots \\ \phi_1(\boldsymbol{x}_n) & \cdots & \phi_p(\boldsymbol{x}_n) \end{bmatrix}$$
  - Predictions: $\hat{\boldsymbol{y}} = A\boldsymbol{\beta}$
  - Least squares fit: $\hat{\boldsymbol{\beta}} = (A^TA)^{-1}A^T\boldsymbol{y}$
Example: Polynomial Fitting
- Suppose $y$ depends only on a single variable $x$.
- Want to fit a polynomial model:
  $$y \approx \beta_0 + \beta_1 x + \cdots + \beta_d x^d$$
- Given data $(x_i, y_i)$, $i = 1, \ldots, n$.
- Take basis functions $\phi_j(x) = x^j$, $j = 0, \ldots, d$.
- Transformed model: $\hat{y} = \beta_0\phi_0(x) + \cdots + \beta_d\phi_d(x)$
- The transformed matrix is:
  $$\boldsymbol{A} = \begin{bmatrix} 1 & x_1 & \cdots & x_1^d \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^d \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \vdots \\ \beta_d \end{bmatrix}$$
  - $p = d + 1$ transformed features from 1 original feature
- Will discuss how to select $d$ in the next lecture.
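Polynomial fitting can be sketched with `np.vander`, which builds exactly the matrix $\boldsymbol{A}$ above; the cubic coefficients and noise level below are made up for illustration:

```python
import numpy as np

# Synthetic 1D data with a cubic trend plus small noise
rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, size=50)
y = 1.0 - 2.0 * x + 0.5 * x ** 3 + 0.01 * rng.normal(size=50)

d = 3
A = np.vander(x, d + 1, increasing=True)       # columns [1, x, x^2, x^3]
beta = np.linalg.lstsq(A, y, rcond=None)[0]    # [beta0, ..., beta3]
```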
Other Nonlinear Examples
- Multinomial model: $\hat{y} = a + b_1x_1 + b_2x_2 + c_1x_1^2 + c_2x_1x_2 + c_3x_2^2$
  - Contains all second-order terms
  - Define the parameter vector $\boldsymbol{\beta} = [a, b_1, b_2, c_1, c_2, c_3]$
  - Transformed vector $\boldsymbol{\phi}(x_1, x_2) = [1, x_1, x_2, x_1^2, x_1x_2, x_2^2]$
  - Note that the features are nonlinear functions of $\boldsymbol{x} = [x_1, x_2]$
- Exponential model: $\hat{y} = a_1e^{-b_1x} + \cdots + a_de^{-b_dx}$
  - If the parameters $b_1, \ldots, b_d$ are fixed, then the model is linear in the parameters $a_1, \ldots, a_d$:
    - Parameter vector $\boldsymbol{\beta} = [a_1, \ldots, a_d]$
    - Transformed vector $\boldsymbol{\phi}(x) = [e^{-b_1x}, \ldots, e^{-b_dx}]$
  - But if the parameters $b_1, \ldots, b_d$ are not fixed, the model is nonlinear in $b_1, \ldots, b_d$.
Linear Models via Re-Parametrization
- Sometimes models can be made linear via re-parametrization.
- Example: consider the model $\hat{y} = Ax_1(1 + Be^{-x_2})$ with parameters $(A, B)$.
- This is nonlinear in $(A, B)$ due to the product $AB$: $\hat{y} = Ax_1 + ABx_1e^{-x_2}$
- But we can define a new set of parameters:
  - $\beta_1 = A$ and $\beta_2 = AB$
- Then $\hat{y} = \beta_1x_1 + \beta_2x_1e^{-x_2}$
- Basis functions: $\boldsymbol{\phi}(x_1, x_2) = [x_1, x_1e^{-x_2}]$
- After we solve for $\beta_1, \beta_2$, we can recover $A, B$ by inverting the equations:
  $$A = \beta_1, \quad B = \frac{\beta_2}{A}$$
Example: Learning Linear Systems
- Linear system: $y_t = a_1y_{t-1} + \cdots + a_dy_{t-d} + b_0x_t + \cdots + b_mx_{t-m} + w_t$
- Transfer function: $H(z) = \dfrac{b_0 + \cdots + b_mz^{-m}}{1 - a_1z^{-1} - \cdots - a_dz^{-d}}$
- Given input and output sequences for $T$ samples, how do we determine $\boldsymbol{\beta} = [a_1, \cdots, a_d, b_0, \cdots, b_m]^T$?
- Can be solved using linear regression!
- Write $\boldsymbol{y} = A\boldsymbol{\beta} + \boldsymbol{w}$ and define $A$, $\boldsymbol{y}$:
  - See the homework problem.
- Many applications:
  - Learning dynamics in robots / mechanical systems
  - Modeling responses in neural systems
  - Stock market time series
  - Speech modeling: fit a model every 25 ms
One-Hot Encoding
- Suppose that one feature $x_j$ is a categorical variable.
- Example: predict the price of a car, $y$, given the model $x_1$ and interior space $x_2$:
  - Suppose there are 3 different models of car (Ford, BMW, GM).
  - Bad idea: arbitrarily assign an index to each possible car model.
  - Can give unreasonable relations.
- One-hot encoding example:
  - With 3 possible categories, represent $x_1$ using 3 binary features $(\phi_1, \phi_2, \phi_3)$:

    | Model | $\phi_1$ | $\phi_2$ | $\phi_3$ |
    |---|---|---|---|
    | Ford | 1 | 0 | 0 |
    | BMW | 0 | 1 | 0 |
    | GM | 0 | 0 | 1 |

  - Model: $y = \beta_0 + \beta_1\phi_1 + \beta_2\phi_2 + \beta_3\phi_3 + \beta_4x_2$
  - Essentially obtain 3 different models:
    - Ford: $y = \beta_0 + \beta_1 + \beta_4x_2$
    - BMW: $y = \beta_0 + \beta_2 + \beta_4x_2$
    - GM: $y = \beta_0 + \beta_3 + \beta_4x_2$
  - Allows different intercepts (or mean values) for different categories!
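The encoding above can be sketched manually in NumPy; the car data values below are made up for illustration:

```python
import numpy as np

# Hypothetical car data: categorical model and interior space x2
models = np.array(["Ford", "BMW", "GM", "Ford", "BMW"])
x2 = np.array([4.1, 4.8, 4.5, 3.9, 5.0])

categories = ["Ford", "BMW", "GM"]
# One-hot columns (phi1, phi2, phi3): one indicator per category
Phi = np.array([[m == c for c in categories] for m in models], dtype=float)

# Feature matrix for y = b0 + b1*phi1 + b2*phi2 + b3*phi3 + b4*x2
A = np.column_stack([np.ones(len(x2)), Phi, x2])
```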
Lab: Robot Calibration
- Predict the current draw:
  - Needed to predict power consumption
- Predictors:
  - Joint angles, velocities, and accelerations
  - Strain gauge readings (a measure of load)
- Full website at TU Dortmund, Germany:
  - http://www.rst.e-technik.tu-dortmund.de/cms/en/research/robotics/TUDOR_engl/index.html
  - http://www.rst.e-technik.tu-dortmund.de/forschung/robot-toolbox/MERIt/MERIt_Documentation.pdf