07 The Problem of Overfitting - Introduction to Machine Learning
The problem of overfitting
• So far we've seen a few learning algorithms.
• These work well for many applications, but can suffer from the problem of overfitting.
Overfitting with linear regression
Example: linear regression (housing prices)
Overfitting: if we have too many features, the learned hypothesis may fit the training set very well (cost J(θ) ≈ 0), but fail to generalize to new examples (i.e., fail to predict prices on new examples). The hypothesis space is just too large and too variable, and we don't have enough data to constrain it to give us a good hypothesis.
[Figure: three plots of Price vs. Size, showing hypotheses of increasing complexity, from an underfit straight line to an overfit high-order polynomial]
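As a quick illustration of what those panels show, here is a minimal Octave sketch (made-up data, not from the lecture): training error falls toward zero as the polynomial degree grows, even though the high-degree curve generalizes poorly.

sizes  = [1 2 3 4 5 6]';                 % house size (arbitrary units)
prices = [1.1 1.9 3.2 3.8 5.1 5.8]';     % observed prices (made up)

for degree = [1 3 5]
  p = polyfit(sizes, prices, degree);    % least-squares polynomial fit
  train_err = mean((polyval(p, sizes) - prices).^2);
  printf('degree %d: training MSE = %.4f\n', degree, train_err);
end
% The degree-5 fit passes through every point (MSE ~ 0) but oscillates
% between them, which is the overfitting shown in the third panel.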
Addressing overfitting:
[Figure: Price vs. Size]
Example features for the housing problem:
• size of house
• no. of bedrooms
• no. of floors
• age of house
• average income in neighborhood
• kitchen size
• Plotting the hypothesis is one way to decide whether overfitting is occurring.
• But with lots of features and little data we cannot visualize the fit, and therefore it is:
― Hard to select the degree of the polynomial.
― Hard to decide which features to keep and which to drop.
Addressing overfitting:
Options:
1. Reduce the number of features (but this means losing information).
― Manually select which features to keep.
― Model selection algorithm (later in the course).
2. Regularization.
― Keep all the features, but reduce the magnitude/values of the parameters θ_j.
― Works well when we have a lot of features, each of which contributes a bit to predicting y.

Small values for the parameters:
― “Simpler” hypothesis
― Less prone to overfitting
Regularization.
Housing:
― Features: x_1, x_2, …, x_n
― Parameters: θ_0, θ_1, …, θ_n
Unlike the polynomial example, we don't know in advance which are the high-order terms. How do we pick the ones that need to be shrunk?
With regularization, we take the cost function and modify it to shrink all the parameters:

J(θ) = (1/(2m)) [ Σ_{i=1..m} (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1..n} θ_j² ]

By convention we don't penalize θ_0: the regularization sum (and hence the shrinking) runs from θ_1 onwards.
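As a concrete reference, here is a minimal Octave sketch of this regularized cost (the function name regularizedCost and the matrix form X*theta are my own framing, not the lecture's code):

function J = regularizedCost(theta, X, y, lambda)
  % X is the m-by-(n+1) design matrix, theta is (n+1)-by-1,
  % and theta(1) corresponds to theta_0, which is not penalized.
  m = length(y);
  err = X * theta - y;                              % h_theta(x) - y
  J = (err' * err) / (2*m) ...
      + (lambda / (2*m)) * sum(theta(2:end).^2);    % skip theta_0
end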
Regularization.
[Figure: Price vs. Size of house, regularized fit]
• Using the regularized objective (i.e., the cost function with the regularization term), we get a much smoother curve that fits the data and gives a much better hypothesis.
• λ is the regularization parameter. It controls the trade-off between our two goals:
1) We want to fit the training set well.
2) We want to keep the parameters small.
In regularized linear regression, we choose θ to minimize

J(θ) = (1/(2m)) [ Σ_{i=1..m} (h_θ(x^(i)) − y^(i))² + λ Σ_{j=1..n} θ_j² ]

What if λ is set to an extremely large value (perhaps too large for our problem, say λ = 10^10)?
― Algorithm works fine; setting λ to be very large can't hurt it.
― Algorithm fails to eliminate overfitting.
― Algorithm results in underfitting (fails to fit even the training data well).
― Gradient descent will fail to converge.
In regularized linear regression, we choose θ to minimize the same objective. What if λ is set to an extremely large value (say λ = 10^10)? The penalty dominates, so all of θ_1, …, θ_n are driven close to zero, leaving h_θ(x) ≈ θ_0: a flat line that underfits the data.

[Figure: Price vs. Size of house, flat hypothesis h_θ(x) = θ_0]
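A small Octave sketch of this effect (illustrative data; solving via the regularized normal equation is just one convenient way to minimize J(θ) here):

X = [ones(5,1), (1:5)'];                 % [1, size] design matrix
y = [2 4 5 4 6]';                        % prices (made up)

L = eye(2);  L(1,1) = 0;                 % identity, but theta_0 unpenalized
for lambda = [0 1e10]
  theta = (X'*X + lambda*L) \ (X'*y);    % regularized normal equation
  printf('lambda = %g: theta = [%.4f, %.4f]\n', lambda, theta(1), theta(2));
end
% With lambda = 1e10, theta(2) is crushed to ~0 and the hypothesis
% collapses to the flat line h_theta(x) = theta_0, i.e. underfitting.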
Gradient descent (regularized)
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_0^(i)
  θ_j := θ_j − α [ (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ]      (regularized; j = 1, …, n)
}

The θ_0 update is the usual ∂J(θ)/∂θ_0 step, same as before, since θ_0 is not penalized. The θ_j update can be rewritten as

θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_j^(i)

The interesting term is (1 − α λ/m): usually the learning rate α is small and m is large, so this factor is a number just below 1 (e.g., 0.99). Each iteration therefore shrinks θ_j slightly and then applies the same gradient step as before.
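The update above translates directly into a vectorized Octave loop; a minimal sketch (variable names are mine, not the lecture's):

function theta = gradientDescentReg(X, y, theta, alpha, lambda, iters)
  m = length(y);
  for it = 1:iters
    err  = X * theta - y;                % h_theta(x) - y for all examples
    grad = (X' * err) / m;               % unregularized gradient
    reg  = (lambda / m) * theta;
    reg(1) = 0;                          % theta_0 is not shrunk
    theta = theta - alpha * (grad + reg);
  end
end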
Advanced optimization

function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(θ)];
  gradient(1) = [code to compute ∂J(θ)/∂θ_0];
  gradient(2) = [code to compute ∂J(θ)/∂θ_1];
  gradient(3) = [code to compute ∂J(θ)/∂θ_2];
  ⋮
  gradient(n+1) = [code to compute ∂J(θ)/∂θ_n];
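Once costFunction returns both the cost and its gradient, it can be handed to an advanced optimizer such as Octave's fminunc. A usage sketch (initialTheta, n, and the option values are assumptions, not from the slide):

n = 10;                                  % number of features (placeholder)
options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(n+1, 1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);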