Machine Learning Using Support Vector Machines (Paper Review) Presented to: Prof. Dr. Mohamed Batouche Prepared By: Asma B. Al-Saleh Amani A. Al-Ajlan 427220094 427220111 King Saud University The College of Computer & Information Science Computer Science Department (Master) Neural Networks and Machine Learning Applications (CSC 563 ) Spring 2008


Page 1:

Machine Learning Using Support Vector Machines

(Paper Review)

Presented to: Prof. Dr. Mohamed Batouche

Prepared By: Asma B. Al-Saleh Amani A. Al-Ajlan

427220094 427220111

King Saud University

The College of Computer & Information Science

Computer Science Department (Master)

Neural Networks and Machine Learning Applications (CSC 563)

Spring 2008

Page 2:

Paper Information

Title: Machine Learning Using Support Vector Machines.

Authors: Abdul Rahim Ahmad, Marzuki Khalid, Rubiyah Yusof.

Publisher: MSTC 2002, Johor Bahru.

Date: September 2002.

Page 3:

Review Outline

• Introduction
• Artificial Neural Network (ANN)
• Support Vector Machine (SVM)
   • Support Vectors
   • Theory of SVM
   • Quadratic Programming
   • Non-linear SVM
   • SVM Implementations
   • SVM for Multi-class Classification
• Handwriting Recognition
• Experimental Results
• ANN vs. SVM
• Conclusion

Page 4:

Introduction

Page 5:

Introduction

The aim of this paper is to:

• Present SVM in comparison to ANN.
• Explain the concept of SVM by providing some details about it.

Page 6:

Machine Learning

Page 7:

Machine Learning (ML)

Constructing computer programs that automatically improve their performance with experience.

[Figure: diagram labeled "Data" and "Learned"]

Page 8:

Machine Learning (ML) Applications

1. Data mining programs.
2. Information filtering systems.
3. Autonomous vehicles.
4. Pattern recognition systems:
   • Speech recognition.
   • Handwriting recognition.
   • Face recognition.
   • Text categorization.

Page 9:

Artificial Neural Network

Page 10:

Artificial Neural Network (ANN)

Massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections.

Page 11:

Artificial Neural Network (ANN)

The main characteristics of ANN are:

1. The ability to learn complex nonlinear input-output relationships.

2. The use of sequential training procedures to update (adapt) the network architecture and connection weights so that the network can work efficiently.

Page 12:

Artificial Neural Network (ANN)

In the area of pattern classification, feed-forward networks are the most widely used.

• Pattern classification: Multilayer perceptron (MLP) and Radial Basis Function (RBF) networks.
• Data clustering: Kohonen Self-Organizing Map (SOM).

Page 13:

ANN and Pattern Recognition

ANN depends little on domain-specific knowledge compared to rule-based approaches.

Efficient learning algorithms are available for it.

Page 14:

Support Vector Machine

Page 15:

Support Vector Machine (SVM)

• SVM was introduced in 1992 by Vapnik and his coworkers.
• SVM in its original form:
   • Binary classifier: separates two classes.
   • Designed for linearly separable data sets.
• SVM is used for classification and regression.

Page 16:

Support Vector Machine – (SVM)

Page 17:

Theory of SVM

[Figure: separating hyperplane H between margin hyperplanes H1 and H2, with normal vector w]

Constraints:

1. No data points lie between H1 and H2.

2. The margin between H1 and H2 is maximized.

Page 18:

Support Vectors

[Figure: hyperplanes H1, H and H2 with normal vector w]

The solution is expressed as a linear combination of support vectors, which are:

• A subset of the training patterns.
• Close to the decision boundary.

Page 19:

Theory of SVM

Training data: $\{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$, where:

• $x_i \in \mathbb{R}^d$ are the input features.
• $y_i \in \{-1, +1\}$ is the class or label.

Page 20:

Theory of SVM

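[Figure: separating hyperplane H between margin hyperplanes H1 and H2, with normal vector w]

Class 1: $w \cdot x + b \ge +1$

Class 2: $w \cdot x + b \le -1$

Combined: $y_i (w \cdot x_i + b) \ge 1$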
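The two class conditions on this slide collapse into one inequality once each label is taken as +1 or -1. The following is a minimal Python sketch of that check, assuming (as is conventional) that Class 1 carries the label +1 and Class 2 the label -1. The weight vector, bias and points are made-up illustrative values, not data from the paper.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_margin(w, b, points, labels):
    """Return True if every pair (x_i, y_i) satisfies y_i * (w . x_i + b) >= 1."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in zip(points, labels))


# Hypothetical 2-D example: separating hyperplane x1 + x2 - 3 = 0.
w, b = [1.0, 1.0], -3.0
points = [[3.0, 1.0], [2.0, 2.0], [1.0, 1.0], [0.0, 2.0]]
labels = [+1, +1, -1, -1]

print(satisfies_margin(w, b, points, labels))  # True: all four points lie on H1 or H2
```

A point lying strictly between H1 and H2, such as (1.5, 1.5) here, fails the check whichever label it is given.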

Page 21:

Theory of SVM

Learn a linear separating hyperplane classifier:

$f(x) = \operatorname{sgn}(w \cdot x + b)$
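As an illustration of the linear classifier on this slide, here is a small Python sketch of the sign rule. The function name `classify` and the example values are hypothetical, and mapping a score of exactly zero to +1 is an arbitrary convention chosen for the sketch.

```python
def classify(w, b, x):
    """Linear decision rule f(x) = sgn(w . x + b), returned as +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: a point exactly on the hyperplane (score == 0) is assigned to +1.
    return 1 if score >= 0 else -1


# Hypothetical hyperplane x1 + x2 - 3 = 0.
w, b = [1.0, 1.0], -3.0
print(classify(w, b, [3.0, 1.0]))  # positive side of the hyperplane
print(classify(w, b, [0.0, 0.0]))  # negative side of the hyperplane
```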

Page 22:

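Quadratic Programming

To maximize the margin $\frac{2}{\|w\|}$, we need to minimize $\frac{1}{2}\|w\|^2$.

This quadratic programming problem is solved by introducing Lagrange multipliers $\alpha_i$:

$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (x_i \cdot w + b) + \sum_{i=1}^{N} \alpha_i$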
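For readers who want the step behind the objective, the margin width follows from the distance between the two parallel hyperplanes H1 and H2. The derivation below is the standard textbook one, written with the symbols already used in these slides:

```latex
\begin{aligned}
H_1 &: \; w \cdot x + b = +1, \qquad H_2 : \; w \cdot x + b = -1 \\[4pt]
\text{margin} &= \frac{|(+1) - (-1)|}{\|w\|} = \frac{2}{\|w\|} \\[4pt]
\max_{w,\,b} \; \frac{2}{\|w\|}
 \;&\Longleftrightarrow\;
 \min_{w,\,b} \; \frac{1}{2}\|w\|^{2}
 \quad \text{subject to } y_i \,(w \cdot x_i + b) \ge 1, \;\; i = 1, \dots, N
\end{aligned}
```

The squared norm is used because it gives a convex quadratic objective with linear constraints, which is what makes the problem a quadratic program.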

Page 23:

Lagrange Multipliers

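Starting from the Lagrangian

$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i y_i (x_i \cdot w + b) + \sum_{i=1}^{N} \alpha_i$

maximize the dual $L_D$, in which $w$ and $b$ are eliminated:

$L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j$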
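The slide does not show how $w$ and $b$ are eliminated. The standard route, sketched here with the same symbols, sets the partial derivatives of $L$ to zero and substitutes the results back:

```latex
\begin{aligned}
\frac{\partial L}{\partial w} = 0
 &\;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i \\[4pt]
\frac{\partial L}{\partial b} = 0
 &\;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \\[4pt]
L_D &= \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j
      \;-\; \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j
      \;-\; b \sum_{i=1}^{N} \alpha_i y_i
      \;+\; \sum_{i=1}^{N} \alpha_i \\[4pt]
    &= \sum_{i=1}^{N} \alpha_i
      \;-\; \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j
\end{aligned}
```

The maximization is carried out subject to $\alpha_i \ge 0$ and $\sum_{i} \alpha_i y_i = 0$. Only the training patterns with $\alpha_i > 0$ contribute to $w$, and these are the support vectors.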

Page 24:

Theory of SVM

Discriminant function:

$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i (x_i \cdot x) + b\right)$

Page 25:

Non-linear SVM

1. The SVM maps the data from the input space into a higher-dimensional feature space.

2. The linear large-margin learning algorithm is then applied in that feature space.

Page 26:

Non-linear SVM

[Figure: the mapping $x \mapsto \phi(x)$ takes the input (data) space, where the problem is non-linear, to the feature space, where it is linear]

Page 27:

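Non-linear SVM

If the mapping function is $\phi$, we just solve:

$L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, \phi(x_i) \cdot \phi(x_j)$

However, the mapping can be done implicitly by kernel functions:

$L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j)$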
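The point of the kernel is that $k(x_i, x_j)$ returns the feature-space dot product $\phi(x_i) \cdot \phi(x_j)$ without ever building $\phi$. The Python sketch below checks this for one concrete case chosen for illustration (it is not taken from the paper): the degree-2 kernel $(x \cdot z)^2$ on 2-D inputs, whose explicit feature map is $(x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map for k(x, z) = (x . z)**2 on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Same quantity computed implicitly, staying in the 2-D input space."""
    return dot(x, z) ** 2


# Hypothetical input points.
x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # dot product in the 3-D feature space
implicit = poly2_kernel(x, z)    # kernel evaluated in the input space
print(explicit, implicit, math.isclose(explicit, implicit))
```

Both routes agree up to floating-point rounding, but the kernel route never constructs the higher-dimensional vectors. That matters when the feature space is very large or, as with the Gaussian kernel, infinite-dimensional.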

Page 28:

Non-linear SVM

Discriminant function:

$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i \, k(x_i, x) + b\right)$
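The kernel discriminant function can be evaluated using only the support vectors, their labels and their multipliers. The following is a minimal Python sketch under stated assumptions: the "trained" values (`support_vectors`, `alphas`, `b`) are invented for illustration rather than obtained from a QP solver, and a Gaussian kernel with `sigma = 1.0` is an arbitrary choice.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2)); sigma is an assumed value."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Evaluate f(x) = sgn(sum_i alpha_i * y_i * k(x_i, x) + b) as +1 or -1."""
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1


# Hypothetical model: one support vector per class, equal multipliers, zero bias.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, +1]
alphas = [1.0, 1.0]
b = 0.0

print(decision((1.9, 2.1), support_vectors, labels, alphas, b))   # near the +1 support vector
print(decision((0.1, -0.2), support_vectors, labels, alphas, b))  # near the -1 support vector
```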

Page 29:

Kernel

There are many kernels that can be used that way.

Any kernel that satisfies Mercer’s condition can be used.

Page 30:

Kernel - Examples

• Polynomial kernels
• Hyperbolic tangent
• Radial basis function (Gaussian kernel)
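The formulas for these kernels did not survive in the transcript, so the Python sketch below uses their commonly cited forms. The parameter names (`degree`, `coef0`, `kappa`, `theta`, `sigma`) and default values are assumptions made for illustration and may differ from the parameterization used in the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """Polynomial kernel (x . z + coef0) ** degree."""
    return (dot(x, z) + coef0) ** degree


def tanh_kernel(x, z, kappa=0.1, theta=-1.0):
    """Hyperbolic tangent kernel tanh(kappa * (x . z) + theta).

    Note: this form meets Mercer's condition only for some (kappa, theta) values.
    """
    return math.tanh(kappa * dot(x, z) + theta)


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical input points.
x, z = (1.0, 2.0), (3.0, 4.0)
for kernel in (polynomial_kernel, tanh_kernel, rbf_kernel):
    print(kernel.__name__, kernel(x, z))
```

All three take two input vectors and return a single similarity value, so any of them can stand in for $k(x_i, x_j)$ in the dual problem and in the discriminant function.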

Page 31:

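Non-separable Input Space

In real-world problems, there is always noise, and noise leads to non-separable data.

Slack variables $\xi_i \ge 0$ are introduced for each input:

$y_i (w \cdot x_i + b) \ge 1 - \xi_i$

Penalty parameter $C$: controls overfitting by setting how heavily margin violations are penalized.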
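To make the roles of the slack variables and of C concrete, the Python sketch below computes the smallest slack that satisfies the relaxed constraint for each point, then evaluates the usual soft-margin cost, one half of the squared norm of w plus C times the sum of the slacks. That cost function is the standard formulation and is not spelled out on the slide. All numeric values are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def slacks(w, b, points, labels):
    """Smallest xi_i >= 0 with y_i * (w . x_i + b) >= 1 - xi_i, for each point."""
    return [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(points, labels)]


def soft_margin_objective(w, b, points, labels, C):
    """Standard soft-margin cost 0.5 * ||w||^2 + C * sum(xi_i)."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, points, labels))


# Hypothetical hyperplane and data; the third point is a "noisy" Class -1 sample
# that sits on the positive side of the hyperplane.
w, b = [1.0, 1.0], -3.0
points = [[3.0, 1.0], [1.0, 1.0], [2.0, 1.5]]
labels = [+1, -1, -1]

print(slacks(w, b, points, labels))                          # [0.0, 0.0, 1.5]
print(soft_margin_objective(w, b, points, labels, C=10.0))   # 16.0
```

A larger C makes the noisy point more expensive, which pushes the optimizer toward fitting it. A smaller C tolerates the violation in exchange for a wider margin.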

Page 32:

Non-separable Input Space

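[Figure: hyperplanes H1 ($w \cdot x + b = +1$), H ($w \cdot x + b = 0$) and H2 ($w \cdot x + b = -1$), with slack variables $\xi_i$ and $\xi_j$ marked for individual points]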

Page 33:

SVM for Multi-class Classification

The basic SVM is a binary classifier: it separates two classes.

In the real world, more than two classes are usually needed.

Example: handwriting recognition.

Page 34:

SVM for Multi-class Classification

Methods (K is the number of classes):

• Modifying the binary classifier to incorporate multi-class learning.
• Combining binary classifiers:
   • One vs. One: K(K-1)/2 classifiers.
   • One vs. All: K classifiers.
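The classifier counts for the two combination schemes can be checked by enumerating the binary problems each one creates. This is a small Python sketch with hypothetical helper names, using the ten digit classes as the example.

```python
from itertools import combinations


def one_vs_one_pairs(classes):
    """One binary problem per unordered pair of classes: K(K-1)/2 in total."""
    return list(combinations(classes, 2))


def one_vs_all_tasks(classes):
    """One binary problem per class (that class against all the others): K in total."""
    return [(c, [o for o in classes if o != c]) for c in classes]


digits = list(range(10))  # K = 10 classes for handwritten digits
print(len(one_vs_one_pairs(digits)))   # 45
print(len(one_vs_all_tasks(digits)))   # 10
```

One vs. One trains more classifiers, but each sees only the samples of two classes, whereas every One vs. All classifier is trained on the full data set.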

Page 35:

SVM for Multi-class Classification

One vs. One and DAGSVM (Directed Acyclic Graph SVM) are the best choices for practical use (Tapia et al., 2005). They are:

• Less complex.
• Easy to construct.
• Faster to train.
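DAGSVM reuses the One vs. One classifiers but, at prediction time, walks a directed acyclic graph in which each pairwise test eliminates one class, so only K-1 classifiers are evaluated. The Python sketch below follows the usual list-based description of that procedure. Here `pairwise_winner` is a hypothetical stand-in for a trained binary SVM, and `toy_winner` is an invented rule used only to exercise the code.

```python
def dag_predict(classes, pairwise_winner):
    """Eliminate one class per pairwise test until a single class remains.

    pairwise_winner(a, b) stands in for the trained binary SVM for the pair
    (a, b) and must return either a or b.
    Returns the predicted class and the number of classifiers evaluated.
    """
    candidates = list(classes)
    evaluations = 0
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        winner = pairwise_winner(first, last)
        evaluations += 1
        if winner == first:
            candidates.pop()       # 'last' lost, so it is eliminated
        else:
            candidates.pop(0)      # 'first' lost, so it is eliminated
    return candidates[0], evaluations


def toy_winner(a, b, target=7):
    """Invented rule: the class closer to 'target' wins (ties go to a)."""
    return a if abs(a - target) <= abs(b - target) else b


print(dag_predict(range(10), toy_winner))  # (7, 9): class 7 after K - 1 = 9 tests
```

Plain One vs. One voting would evaluate all 45 classifiers for the ten digit classes, which is why the DAG arrangement is cheaper at test time.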

Page 36:

SVM Implementation

SVM training requires solving a quadratic programming (QP) problem, which is computationally intensive.

However, many decomposition methods have been proposed that avoid solving the full QP problem at once and make SVM learning practical for many current problems.

Example: Sequential Minimal Optimization (SMO).

Page 37:

Results of Experimental Studies

Page 38:

Data

Handwritten digit databases:

• MNIST dataset.
• USPS dataset: more difficult; the human recognition error rate is as high as 2.5%.

Page 39:

[Table: Error rate comparison of ANN, SVM and other algorithms on the MNIST and USPS databases.]

Page 40:

Results of Experimental Studies

1. The SVM error rate is significantly lower than that of most other algorithms, except for the LeNet 5 neural network.

2. Training for SVM was significantly slower, but the higher recognition rate (lower error rate) justifies its use.

3. SVM usage should increase and replace ANN in handwriting recognition, since faster methods of implementing SVM have been introduced recently.

Page 41:

SVM vs. ANN

• Multi-class classification: ANN handles it naturally; for SVM, a multi-class implementation needs to be performed.
• Overfitting: ANN is known to overfit data unless cross-validation is applied; SVM does not overfit data (Structural Risk Minimization).
• Solution: ANN reaches a local minimum; SVM reaches the global minimum.

Page 42:

Conclusion

• SVM is powerful and is a useful alternative to neural networks.
• SVM finds a global, unique solution.
• Two key concepts of SVM: maximizing the margin and the choice of kernel.
• Performance depends on the choice of kernel and parameters, which is still a research topic.
• Training is memory-intensive due to QP.

Page 43:

Conclusion

Much active research is taking place in areas related to SVM.

Many SVM implementations are available on the Web:

• SVMLight
• LIBSVM

Page 44:

Thank you…..

Questions?