Computational Learning Theory
• PAC
• IID
• VC Dimension
• SVM
Kunstmatige Intelligentie / RuG
Marius Bulacu
The Problem
• Why does learning work?
• How do we know that the learned hypothesis h is close to the target function f if we do not know what f is?
answer provided by
computational learning theory
The Answer
• Any hypothesis h that is consistent with a sufficiently large number of training examples is unlikely to be seriously wrong.
Therefore it must be:
Probably Approximately Correct
PAC
The Stationarity Assumption
• The training and test sets are drawn randomly from the same population of examples using the same probability distribution.
Therefore training and test data are
Independently and Identically Distributed
IID
“the future is like the past”
How many examples are needed?
• m: number of examples (the sample complexity)
• ε: probability that h and f disagree on an example
• δ: probability that a wrong hypothesis consistent with all examples exists
• |H|: size of the hypothesis space

m ≥ (1/ε) (ln(1/δ) + ln |H|)
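The bound m ≥ (1/ε)(ln(1/δ) + ln |H|) can be evaluated directly. A minimal sketch (the function name and the example numbers are illustrative, not from the slides):

```python
import math

def sample_complexity(epsilon, delta, h_size):
    """Smallest m satisfying the PAC bound m >= (1/eps)(ln(1/delta) + ln|H|)."""
    return math.ceil((1.0 / epsilon) * (math.log(1.0 / delta) + math.log(h_size)))

# Boolean functions on 10 attributes: |H| = 2^(2^10)
m = sample_complexity(epsilon=0.1, delta=0.05, h_size=2 ** (2 ** 10))
# m grows only logarithmically in |H| and 1/delta, but linearly in 1/epsilon
```

Note the logarithms: doubling |H| adds only (ln 2)/ε extra examples to the requirement.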
Formal Derivation
• H: the set of all possible hypotheses (containing the target function f)
• Hbad: the set of “wrong” hypotheses, those whose error exceeds ε:
  P(x : h(x) ≠ f(x)) > ε  for every h ∈ Hbad
• A wrong hypothesis agrees with one random example with probability at most 1 − ε, so it is consistent with m independent examples with probability at most (1 − ε)^m.
• The probability that Hbad contains a hypothesis consistent with all m examples is then
  P(∃ h ∈ Hbad consistent with m examples) ≤ |Hbad| (1 − ε)^m ≤ |H| (1 − ε)^m
• Requiring this probability to be at most δ and using 1 − ε ≤ e^(−ε):
  |H| (1 − ε)^m ≤ δ  ⟹  m ≥ (1/ε) (ln(1/δ) + ln |H|)
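The derivation can be sanity-checked empirically. The sketch below (the tiny 3-bit domain and all names are illustrative) takes H to be all 256 Boolean functions on 3 inputs, draws m examples according to the bound, and estimates how often a seriously wrong hypothesis stays consistent with every example:

```python
import math
import random

random.seed(0)
N_POINTS = 8                       # domain: the 8 points of {0,1}^3
H_SIZE = 2 ** N_POINTS             # each int encodes one Boolean function as a truth table
EPS, DELTA = 0.25, 0.05

def value(h, x):                   # output of hypothesis h on domain point x
    return (h >> x) & 1

def error(h, f):                   # fraction of the uniform domain where h and f disagree
    return sum(value(h, x) != value(f, x) for x in range(N_POINTS)) / N_POINTS

m = math.ceil((1 / EPS) * (math.log(1 / DELTA) + math.log(H_SIZE)))

TRIALS = 300
bad_survives = 0
for _ in range(TRIALS):
    f = random.randrange(H_SIZE)                          # random target function
    examples = [random.randrange(N_POINTS) for _ in range(m)]
    # does any hypothesis with error > EPS agree with f on every drawn example?
    if any(error(h, f) > EPS and all(value(h, x) == value(f, x) for x in examples)
           for h in range(H_SIZE)):
        bad_survives += 1

rate = bad_survives / TRIALS       # empirically stays (well) below DELTA
```

The bound is loose: the observed rate is typically far smaller than δ, because the union bound over Hbad is pessimistic.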
What if the hypothesis space is infinite?
• We can’t use our result for finite H; we need some other measure of the complexity of H:
  – the Vapnik-Chervonenkis (VC) dimension: the size of the largest set of points that hypotheses in H can label in all possible ways (“shatter”)
SVM (1): Kernels
• Complicated separation boundary in the original feature space (f1, f2)
• Simple separation boundary, a hyperplane, in the mapped space (f1, f2, f3)
(figure: the same data plotted in (f1, f2) and, after the mapping, in (f1, f2, f3))
• Kernels: polynomial, radial basis, sigmoid
• A kernel performs an implicit mapping to a higher-dimensional space where linear separation is possible.
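The “implicit mapping” can be written out for the degree-2 polynomial kernel; this sketch shows that the kernel computes the same inner product the explicit 3-D feature map φ would, without ever constructing φ(x):

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel on 2-D inputs: K(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map to 3-D: phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

x, y = (1.0, 2.0), (3.0, 0.5)
implicit = poly_kernel(x, y)                            # stays in the 2-D space
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))   # inner product in 3-D
# implicit and explicit agree up to floating-point rounding
```

This is why the kernel trick is cheap: the algorithm only ever needs inner products, so the high-dimensional space never has to be materialized.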
SVM (2): Max Margin
• From all the possible separating hyperplanes, select the “best” one: the one that gives the max margin.
• The training points closest to that hyperplane are the support vectors.
• The solution is found by quadratic optimization; this is the “learning” step.
• Max margin gives good generalization.
(figure: points in (f1, f2) with the max-margin separating hyperplane and its support vectors)
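In the simplest case, one support vector per class, the quadratic optimization has a closed-form answer: the max-margin hyperplane is the perpendicular bisector of the segment joining the two support vectors. A sketch under that assumption (the points and the helper name are illustrative):

```python
def max_margin_two_points(x_pos, x_neg):
    """Hard-margin hyperplane w.x + b = 0 when x_pos and x_neg are the only support vectors.
    Take w = 2 (x_pos - x_neg) / ||x_pos - x_neg||^2 and pick b so that
    w.x_pos + b = +1 and w.x_neg + b = -1 (the canonical margin constraints)."""
    d = [p - n for p, n in zip(x_pos, x_neg)]
    d2 = sum(c * c for c in d)
    w = [2 * c / d2 for c in d]
    b = -sum(wi * (p + n) for wi, p, n in zip(w, x_pos, x_neg)) / 2
    return w, b

w, b = max_margin_two_points((2.0, 2.0), (0.0, 0.0))
pos_val = w[0] * 2.0 + w[1] * 2.0 + b          # +1 at the positive support vector
neg_val = b                                    # -1 at the negative support vector
margin = 1 / sum(wi * wi for wi in w) ** 0.5   # geometric margin 1/||w|| = half the gap
```

With more points, the general quadratic program picks out which points are support vectors; here they are given, which is what makes the closed form possible.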