CSSE463: Image Recognition, Day 14

CSSE463: Image Recognition, Day 14. Lab due Weds. These solutions assume that you don't threshold the shapes.ppt image: Shape1: elongation = 1.632636, C1 = 19.2531, C2 = 5.0393. This week: Tuesday: Support Vector Machine (SVM) introduction and derivation. Thursday: Project info, SVM demo. Friday: SVM lab.

Upload: byron-allen

Post on 19-Jan-2018


DESCRIPTION

SVMs: “Best” decision boundary. Consider a 2-class problem. Start by assuming each class is linearly separable. There are many separating hyperplanes… Which would you choose?

TRANSCRIPT

Page 1: CSSE463: Image Recognition, Day 14

Lab due Weds.
These solutions assume that you don't threshold the shapes.ppt image: Shape1: elongation = 1.632636, C1 = 19.2531, C2 = 5.0393

This week:
Tuesday: Support Vector Machine (SVM) introduction and derivation
Thursday: Project info, SVM demo
Friday: SVM lab

Page 2: Feedback on feedback

Delta
Want to see more code
Math examples caught off guard, but OK now.
Tough if labs build on each other b/c no feedback until lab returned.
Project + lab in same week is slightly tough
Include more examples
Application in MATLAB takes time.

Plus
Really like the material (lots)
Covering lots of ground
Labs!
Quizzes 2
Challenging and interesting
Enthusiasm
Slides
Groupwork
Want to learn more

Pace:
Lectures and assignments: OK – slightly fast

Page 3: SVMs: “Best” decision boundary

Consider a 2-class problem.
Start by assuming the two classes are linearly separable.
There are many separating hyperplanes…
Which would you choose?

Page 4: SVMs: “Best” decision boundary

The “best” hyperplane is the one that maximizes the margin between the classes.
Some training points will always lie on the margin.
These are called “support vectors” (points #2, 4, and 9 in the figure to the left).
Why does this name make sense intuitively?

[Figure label: margin]

Q1

Page 5: Support vectors

The support vectors are the toughest points to classify.
What would happen to the decision boundary if we moved one of them, say #4?
A different margin would have maximal width!

Q2
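The effect of moving a support vector can be sketched in one dimension. This is a toy illustration, not the lecture's example: it assumes separable 1-D data with every negative point to the left of every positive point, so the maximum-margin boundary is simply the midpoint between the two closest opposite-class points. The function name `max_margin_1d` and the data values are hypothetical.

```python
def max_margin_1d(negatives, positives):
    """Return (boundary, margin_width, support_vectors) for separable 1-D data.

    Assumes every negative point lies to the left of every positive point.
    """
    left = max(negatives)    # negative point closest to the positive class
    right = min(positives)   # positive point closest to the negative class
    if left >= right:
        raise ValueError("classes are not separable in this order")
    boundary = (left + right) / 2.0
    margin_width = right - left
    return boundary, margin_width, (left, right)


negatives = [1.0, 2.0, 3.0]
positives = [7.0, 8.0, 9.0]
base = max_margin_1d(negatives, positives)

# Moving a point that is NOT a support vector (1.0 -> 0.0) changes nothing.
moved_other = max_margin_1d([0.0, 2.0, 3.0], positives)

# Moving a support vector (3.0 -> 5.0) shifts the boundary and narrows the margin.
moved_sv = max_margin_1d([1.0, 2.0, 5.0], positives)

print("original:            ", base)
print("non-support moved:   ", moved_other)
print("support vector moved:", moved_sv)
```

Only the points on the margin determine the boundary, which is one way to read the name "support vectors".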

Page 6: Problem

Maximize the margin width while classifying all the data points correctly…

Page 7: Mathematical formulation of the hyperplane

On paper
Key ideas:
Optimum separating hyperplane: $w_{op}^T x + b_{op} = 0$
Distance to margin: $g(x) = w_{op}^T x + b_{op}$
Can show the margin width = $\frac{2}{\|w_{op}\|}$
Want to maximize margin

Q3-4
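The quantities on this slide can be evaluated numerically. The sketch below assumes a hypothetical weight vector `w_op = [3, 4]` and bias `b_op = -5`, chosen only because the norm is a round number. It also uses the standard result that the signed distance from a point to the hyperplane is g(x) divided by the norm of w, which the slide derives on paper.

```python
import math


def g(w, b, x):
    """Discriminant g(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def norm(w):
    """Euclidean norm ||w||."""
    return math.sqrt(sum(wi * wi for wi in w))


def signed_distance(w, b, x):
    """Signed distance from x to the hyperplane g(x) = 0."""
    return g(w, b, x) / norm(w)


def margin_width(w):
    """Margin width 2 / ||w||."""
    return 2.0 / norm(w)


# Hypothetical values for illustration; ||w_op|| = 5.
w_op = [3.0, 4.0]
b_op = -5.0
x = [3.0, 4.0]

print("g(x)         =", g(w_op, b_op, x))
print("distance     =", signed_distance(w_op, b_op, x))
print("margin width =", margin_width(w_op))
```

Because the width is 2 divided by the norm, a smaller weight vector means a wider margin, which is why maximizing the margin becomes a minimization over w.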

Page 8: Finding the optimal hyperplane

We need to find w and b that satisfy the system of inequalities:
$d_i (w^T x_i + b) \geq 1$ for $i = 1, 2, \ldots, N$
where w minimizes the cost function:
$\Phi(w) = \frac{1}{2} w^T w$
(Recall that we want to minimize $\|w_{op}\|$, which is equivalent to minimizing $\|w_{op}\|^2 = w_{op}^T w_{op}$)

Quadratic programming problem
Use Lagrange multipliers
Switch to the dual of the problem
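The constraints and cost function can be checked directly for a candidate (w, b). The sketch below does not solve the quadratic program; it only evaluates feasibility and cost on a hypothetical four-point data set, and it reports as support vectors the points whose constraint holds with equality. All helper names and data values are assumptions for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cost(w):
    """Cost function: 1/2 * w^T w."""
    return 0.5 * dot(w, w)


def constraint_values(w, b, points, labels):
    """d_i * (w^T x_i + b) for every training point."""
    return [d * (dot(w, x) + b) for x, d in zip(points, labels)]


def is_feasible(w, b, points, labels, tol=1e-9):
    """True if every point satisfies d_i * (w^T x_i + b) >= 1."""
    return all(v >= 1.0 - tol for v in constraint_values(w, b, points, labels))


def support_vector_indices(w, b, points, labels, tol=1e-9):
    """Indices of points that lie exactly on the margin (constraint == 1)."""
    values = constraint_values(w, b, points, labels)
    return [i for i, v in enumerate(values) if abs(v - 1.0) <= tol]


# Hypothetical separable data: class -1 at x1 = 0, class +1 at x1 >= 2.
points = [[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [3.0, 1.0]]
labels = [-1, -1, 1, 1]

w, b = [1.0, 0.0], -1.0              # boundary x1 = 1, margin width 2
w_big, b_big = [2.0, 0.0], -2.0      # same boundary, larger ||w||
w_small, b_small = [0.5, 0.0], -0.5  # same boundary, violates constraints

print(is_feasible(w, b, points, labels), cost(w))
print(is_feasible(w_big, b_big, points, labels), cost(w_big))
print(is_feasible(w_small, b_small, points, labels))
print(support_vector_indices(w, b, points, labels))
```

All three candidates describe the same boundary; the constraints rule out the smallest w, and the cost function prefers the smallest of the feasible ones.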

Page 9: Non-separable data

Allow data points to be misclassified,
but assign a cost to each misclassified point.
The cost is bounded by the parameter C (which you can set).
You can set different bounds for each class. Why?
Can weigh false positives and false negatives differently
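The slide does not give a formula for the cost, so the sketch below assumes the common soft-margin form: the usual 1/2 w^T w term plus C times a hinge slack, max(0, 1 - d_i (w^T x_i + b)), for each point. The per-class parameters `c_pos` and `c_neg`, the data, and the two candidate boundaries are all hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def soft_margin_objective(w, b, points, labels, c_pos, c_neg):
    """1/2 w^T w + sum over points of C_class * slack (assumed formulation)."""
    total = 0.5 * dot(w, w)
    for x, d in zip(points, labels):
        slack = max(0.0, 1.0 - d * (dot(w, x) + b))  # 0 if outside the margin
        total += (c_pos if d > 0 else c_neg) * slack
    return total


# Non-separable 1-D data: a negative point (3) sits to the right of a positive (2).
points = [[0.0], [3.0], [2.0], [5.0]]
labels = [-1, -1, 1, 1]

boundary_a = ([1.0], -2.5)  # boundary at x = 2.5: one error in each class
boundary_b = ([1.0], -1.0)  # boundary at x = 1.0: only the negative at 3 is wrong

for c_pos, c_neg in [(1.0, 1.0), (10.0, 1.0), (1.0, 10.0)]:
    cost_a = soft_margin_objective(*boundary_a, points, labels, c_pos, c_neg)
    cost_b = soft_margin_objective(*boundary_b, points, labels, c_pos, c_neg)
    print("C+ =", c_pos, "C- =", c_neg, "cost A =", cost_a, "cost B =", cost_b)
```

With equal costs the two boundaries tie in this example. Raising the cost for the positive class favors boundary B, which misclassifies no positives, and raising the cost for the negative class favors boundary A.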

Page 10: Can we do better?

Cover’s Theorem from information theory says that we can map nonseparable data in the input space to a feature space where the data is separable, with high probability, if:
The mapping is nonlinear
The feature space has a higher dimension

The mapping is applied implicitly through a kernel function, which computes inner products in the feature space.
Lots of math would follow here
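A small example of the idea is XOR, which no line can separate in the 2-D input space. The sketch below assumes a hypothetical nonlinear map `phi` that appends the product x1*x2 as a third feature, and a hand-picked hyperplane in that 3-D feature space. Neither comes from the slides.

```python
def phi(x):
    """Hypothetical nonlinear map from 2-D input space to 3-D feature space."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def classify(w, b, z):
    """Sign of w^T z + b, returned as +1 or -1."""
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else -1


# XOR: not linearly separable in the original 2-D space.
xor_points = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [-1, 1, 1, -1]

# Hand-picked separating hyperplane in the feature space (an assumption).
w, b = (1.0, 1.0, -2.0), -0.5

predictions = [classify(w, b, phi(x)) for x in xor_points]
print(predictions)
print(predictions == xor_labels)
```

The map is nonlinear and raises the dimension from 2 to 3, matching the two conditions listed on the slide.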

Page 11: Most common kernel functions

Polynomial: $K(x, x_i) = (x^T x_i + 1)^p$
Gaussian radial-basis function (RBF): $K(x, x_i) = \exp\left(-\frac{1}{2\sigma^2} \|x - x_i\|^2\right)$
Two-layer perceptron: $K(x, x_i) = \tanh(\beta_0 x^T x_i + \beta_1)$

You choose $p$, $\sigma$, or $\beta_i$

My experience with real data: use Gaussian RBF!

Difficulty of problem, from easy to hard: p=1, p=2, higher p, RBF

Q5
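The three kernels translate directly into code. This is a minimal sketch using plain Python lists as vectors; the function and parameter names (`p`, `sigma`, `beta0`, `beta1`) are chosen here to mirror the usual symbols and are not taken from the course materials.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def polynomial_kernel(x, xi, p):
    """K(x, xi) = (x^T xi + 1)^p."""
    return (dot(x, xi) + 1.0) ** p


def rbf_kernel(x, xi, sigma):
    """K(x, xi) = exp(-||x - xi||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def perceptron_kernel(x, xi, beta0, beta1):
    """K(x, xi) = tanh(beta0 * x^T xi + beta1)."""
    return math.tanh(beta0 * dot(x, xi) + beta1)


x = [1.0, 2.0]
xi = [3.0, 4.0]

print("polynomial, p=1:", polynomial_kernel(x, xi, 1))
print("polynomial, p=2:", polynomial_kernel(x, xi, 2))
print("RBF, sigma=1:   ", rbf_kernel(x, xi, 1.0))
print("RBF, same point:", rbf_kernel(x, x, 1.0))
print("perceptron:     ", perceptron_kernel(x, xi, 0.1, -1.0))
```

The RBF kernel equals 1 when the two points coincide and decays toward 0 as they move apart, with `sigma` controlling how fast.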