Linear Regression with multiple variables
Multiple features
Machine Learning
Andrew Ng
Multiple features (variables).

Size (feet2)   Price ($1000)
2104           460
1416           232
1534           315
852            178
…              …
Multiple features (variables).

Size (feet2)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178
…              …                    …                  …                     …
Notation:
n = number of features
x^(i) = input (features) of the i-th training example
x_j^(i) = value of feature j in the i-th training example
Hypothesis:
Previously (n = 1): h_θ(x) = θ_0 + θ_1 x
Now: h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + … + θ_n x_n

For convenience of notation, define x_0 = 1, so that
h_θ(x) = θ_0 x_0 + θ_1 x_1 + … + θ_n x_n = θᵀx.

Multivariate linear regression.
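In code, the x_0 = 1 convention amounts to prepending a column of ones to the data matrix, after which the hypothesis for every example is a single matrix-vector product. A minimal NumPy sketch (function and variable names are ours, not from the course):

```python
import numpy as np

def hypothesis(X, theta):
    """h_theta(x) = theta^T x, computed for all training examples at once."""
    return X @ theta

# Two features, four training examples (values from the slides' table).
features = np.array([[2104.0, 5.0],
                     [1416.0, 3.0],
                     [1534.0, 3.0],
                     [ 852.0, 2.0]])
m = features.shape[0]
X = np.hstack([np.ones((m, 1)), features])  # prepend the x_0 = 1 column
theta = np.zeros(3)                          # theta_0, theta_1, theta_2
print(hypothesis(X, theta))                  # [0. 0. 0. 0.] for theta = 0
```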
Linear Regression with multiple variables
Gradient descent for multiple variables
Machine Learning
Hypothesis: h_θ(x) = θᵀx = θ_0 x_0 + θ_1 x_1 + … + θ_n x_n

Parameters: θ_0, θ_1, …, θ_n

Cost function:
J(θ_0, …, θ_n) = (1/2m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))²

Gradient descent:
Repeat {
  θ_j := θ_j − α (∂/∂θ_j) J(θ)
} (simultaneously update θ_j for every j = 0, …, n)
Gradient Descent

Previously (n = 1):
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
  θ_1 := θ_1 − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x^(i)
} (simultaneously update θ_0, θ_1)

New algorithm (n ≥ 1):
Repeat {
  θ_j := θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i)
} (simultaneously update θ_j for j = 0, …, n)
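The update rule above can be sketched in vectorized NumPy. This is our illustration, not the course's Octave code; the simultaneous update falls out naturally because the whole θ vector is replaced at once:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for multivariate linear regression.
    X must already include the x_0 = 1 column."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        errors = X @ theta - y            # h_theta(x^(i)) - y^(i)
        gradient = (X.T @ errors) / m     # (1/m) sum of error * x_j^(i)
        theta = theta - alpha * gradient  # simultaneous update of all theta_j
    return theta

# Tiny noiseless example: y = 1 + 2*x1, which gradient descent recovers.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = gradient_descent(X, y, alpha=0.1, num_iters=5000)
print(theta)  # approximately [1. 2.]
```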
Linear Regression with multiple variables
Gradient descent in practice I: Feature Scaling
Machine Learning
Feature Scaling
Idea: make sure features are on a similar scale.
E.g. x_1 = size (0–2000 feet2)
     x_2 = number of bedrooms (1–5)

[Figure: contour plots of J(θ) with axes labelled size (feet2) and number of bedrooms — unscaled features give tall, skinny elliptical contours on which gradient descent oscillates; similarly-scaled features give rounder contours and a more direct path to the minimum.]
Feature Scaling
Get every feature into approximately a −1 ≤ x_j ≤ 1 range.

Mean normalization
Replace x_j with x_j − μ_j to make features have approximately zero mean (do not apply to x_0 = 1).
E.g. x_1 = (size − 1000) / 2000, x_2 = (#bedrooms − 2) / 5.
In general, x_j := (x_j − μ_j) / s_j, where μ_j is the mean of feature j and s_j is its range (max − min) or its standard deviation.
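Mean normalization can be sketched in a few lines of NumPy, here dividing by the per-feature range (one of the two choices of s_j; names are ours):

```python
import numpy as np

def mean_normalize(features):
    """Scale each feature column to roughly zero mean and [-1, 1] range."""
    mu = features.mean(axis=0)                        # per-feature mean
    s = features.max(axis=0) - features.min(axis=0)   # per-feature range
    return (features - mu) / s, mu, s

features = np.array([[2104.0, 5.0],
                     [1416.0, 3.0],
                     [1534.0, 3.0],
                     [ 852.0, 2.0]])
scaled, mu, s = mean_normalize(features)
print(scaled.mean(axis=0))          # approximately [0. 0.]
print(scaled.min(), scaled.max())   # both within [-1, 1]
```

Keeping mu and s around matters in practice: any new example must be scaled with the training set's mu and s before prediction.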
Linear Regression with multiple variables
Gradient descent in practice II: Learning rate
Machine Learning
Gradient descent
- “Debugging”: how to make sure gradient descent is working correctly.
- How to choose the learning rate α.
Making sure gradient descent is working correctly.
[Figure: J(θ) plotted against the number of iterations (0, 100, 200, 300, 400); J(θ) should decrease on every iteration and flatten out as gradient descent converges.]

Example automatic convergence test:
declare convergence if J(θ) decreases by less than 10⁻³ in one iteration.
[Figure: plots of J(θ) against the number of iterations in which J(θ) increases or oscillates — gradient descent is not working; use a smaller α.]
- For sufficiently small α, J(θ) should decrease on every iteration.
- But if α is too small, gradient descent can be slow to converge.
Summary:
- If α is too small: slow convergence.
- If α is too large: J(θ) may not decrease on every iteration; may not converge.

To choose α, try a sequence of values spaced roughly 3× apart, e.g. …, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, …
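The debugging recipe — record J(θ) every iteration and stop once the decrease falls below a threshold (10⁻³ here) — can be sketched as follows (a Python illustration; names are ours):

```python
import numpy as np

def cost(X, y, theta):
    """J(theta) = (1/2m) * sum of squared errors."""
    m = len(y)
    errors = X @ theta - y
    return (errors @ errors) / (2 * m)

def descend_with_monitor(X, y, alpha, max_iters=10000, tol=1e-3):
    """Gradient descent that records J(theta) per iteration and applies
    the automatic convergence test (stop when J drops by less than tol)."""
    m, n = X.shape
    theta = np.zeros(n)
    history = [cost(X, y, theta)]
    for _ in range(max_iters):
        theta = theta - alpha * (X.T @ (X @ theta - y)) / m
        history.append(cost(X, y, theta))
        if history[-2] - history[-1] < tol:   # convergence test
            break
    return theta, history

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta, history = descend_with_monitor(X, y, alpha=0.1)
# For this sufficiently small alpha, J decreases on every iteration.
print(all(a >= b for a, b in zip(history, history[1:])))
```

Plotting `history` against the iteration index gives exactly the J(θ)-vs-iterations curve from the slides.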
Linear Regression with multiple variables
Features and polynomial regression
Machine Learning
Housing prices prediction
Polynomial regression
[Figure: Price (y) plotted against Size (x) with a curved fit.]
A polynomial model such as h_θ(x) = θ_0 + θ_1 x + θ_2 x² + θ_3 x³ can be fit with the machinery of multivariate linear regression by defining new features x_1 = size, x_2 = size², x_3 = size³. Feature scaling then becomes especially important, since the powers have very different ranges.
Choice of features
[Figure: Price (y) plotted against Size (x).]
The choice of features matters: a quadratic in size eventually curves back down as size grows, whereas a model such as h_θ(x) = θ_0 + θ_1 (size) + θ_2 √(size) keeps increasing.
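Building polynomial features and scaling them is a few lines of NumPy (a standard construction; this sketch standardizes by mean and standard deviation, and the names are ours):

```python
import numpy as np

def cubic_features(size):
    """Expand a size vector into scaled features x_1 = size,
    x_2 = size^2, x_3 = size^3 (scaling is essential here, because
    the three powers have wildly different ranges)."""
    feats = np.column_stack([size, size**2, size**3])
    mu = feats.mean(axis=0)
    s = feats.std(axis=0)
    return (feats - mu) / s

size = np.array([852.0, 1416.0, 1534.0, 2104.0])
X = cubic_features(size)
print(X.shape)         # (4, 3): three engineered features per example
print(X.mean(axis=0))  # approximately [0. 0. 0.] after scaling
```

The resulting matrix (plus an x_0 = 1 column) can be fed to the same gradient descent as any other multivariate problem.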
Linear Regression with multiple variables
Normal equation
Machine Learning
Gradient Descent: an iterative algorithm that takes many steps to minimize J(θ).
Normal equation: method to solve for θ analytically, in one step.
Intuition: if 1D (θ ∈ ℝ), J(θ) is a quadratic function of θ; set the derivative (d/dθ) J(θ) = 0 and solve for θ.
For θ ∈ ℝ^(n+1): set (∂/∂θ_j) J(θ) = 0 (for every j) and solve for θ_0, θ_1, …, θ_n.
Examples: m = 4.

Size (feet2)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178

With x_0 = 1 added, the design matrix X and target vector y are

X = [ 1  2104  5  1  45 ]     y = [ 460 ]
    [ 1  1416  3  2  40 ]         [ 232 ]
    [ 1  1534  3  2  30 ]         [ 315 ]
    [ 1   852  2  1  36 ]         [ 178 ]
m training examples; n features.
The design matrix X (m × (n+1)) stacks the inputs x^(i) (with x_0 = 1) as rows; y is the m-vector of targets.
E.g. if each x^(i) = [1; size^(i)], then X is an m × 2 matrix whose i-th row is [1, size^(i)].

Normal equation: θ = (XᵀX)⁻¹ Xᵀ y
(XᵀX)⁻¹ is the inverse of the matrix XᵀX.
Octave: pinv(X'*X)*X'*y
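A Python equivalent of the Octave one-liner, using NumPy's pseudo-inverse (our sketch, not the course's code):

```python
import numpy as np

def normal_equation(X, y):
    """theta = pinv(X^T X) X^T y, mirroring Octave's pinv(X'*X)*X'*y."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# x_0 = 1 column plus one feature; the data fit y = 1 + 2*x1 exactly.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = normal_equation(X, y)
print(theta)  # approximately [1. 2.]
```

Unlike gradient descent, this needs no learning rate and no iterations, but it does require forming and (pseudo-)inverting the (n+1) × (n+1) matrix XᵀX.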
m training examples, n features.

Gradient Descent:
• Need to choose α.
• Needs many iterations.
• Works well even when n is large.

Normal Equation:
• No need to choose α.
• Don’t need to iterate.
• Need to compute (XᵀX)⁻¹.
• Slow if n is very large.
Linear Regression with multiple variables
Normal equation and non-invertibility (optional)
Machine Learning
Normal equation: θ = (XᵀX)⁻¹ Xᵀ y
- What if XᵀX is non-invertible (singular/degenerate)?
- Octave: pinv(X'*X)*X'*y still computes a θ, since pinv returns the pseudo-inverse even when XᵀX is singular.
What if XᵀX is non-invertible?
• Redundant features (linearly dependent).
  E.g. x_1 = size in feet2 and x_2 = size in m2: since 1 m = 3.28 feet, x_1 = (3.28)² · x_2, so the columns of X are linearly dependent.
• Too many features (e.g. m ≤ n).
  - Delete some features, or use regularization.
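The redundant-feature case can be demonstrated directly. For a clean floating-point demo this sketch duplicates a feature outright (the slides' feet²-vs-m² example carries the same information in two columns); XᵀX becomes singular, yet the pseudo-inverse still returns a usable minimum-norm θ:

```python
import numpy as np

# Redundant (linearly dependent) features: the same size column twice.
size = np.array([2104.0, 1416.0, 1534.0, 852.0])
m = len(size)
X = np.column_stack([np.ones(m), size, size])   # duplicated feature
y = np.array([460.0, 232.0, 315.0, 178.0])

print(np.linalg.matrix_rank(X.T @ X))           # 2, not 3: singular

# pinv handles the singular matrix; a plain inverse would be meaningless.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print(theta[1], theta[2])   # the weight is split evenly across duplicates
```

The predictions X @ theta agree with an ordinary least-squares fit; only the split of weight between the two dependent columns is arbitrary, which is why deleting one of them (or regularizing) is the cleaner fix.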