topic 10 (bayesian classifiers)

CSE 473: Digital Image Processing CSE 473: Digital Image Processing and Pattern Recognitionand Pattern Recognition

Spring 2015Spring 2015

Course Teacher:Course Teacher:Md. Tarek HabibMd. Tarek HabibAssistant ProfessorAssistant Professor

Department of Computer Science and Department of Computer Science and EngineeringEngineering

Green University of BangladeshGreen University of Bangladesh

Topic – 10: Topic – 10: Bayesian ClassifiersBayesian Classifiers

Classification TechniquesClassification Techniques Bayesian ClassifiersBayesian Classifiers

Bayes TheoremBayes Theorem Using the Bayes Theorem for Using the Bayes Theorem for ClassificationClassification Naïve Bayes ClassifierNaïve Bayes Classifier

2

Lecture OutlineLecture Outline

Md. Tarek HabibMd. Tarek Habib

Classification Techniques

A classification technique (or classifier) is

a systematic approach to building classification

models from an input data set. Each technique employs a learning algorithm

to identify a model that best fits the relationship

between the attribute set and class label of the

input data. The model generated by a learning algorithm

should both fit the input data well and correctly

predict the class labels of records it has never

seen before.

3



Figure 4.3 shows a general approach for

solving classification problems.

4



5



First, a training set consisting of records

whose class labels are known must be provided. The training set is used to build a classification

model, which is subsequently applied to the test

set, which consists of records with unknown

class labels. Evaluation of the performance of a

classification model is based on the counts of

test records correctly and incorrectly predicted

by the model. A single number would make it more

convenient to compare the performance of

different models.

6



This can be done using a performance

metric such as accuracy, which is defined as

follows:

Equivalently, the performance of a model can

be expressed in terms o f its error rate, which is

given by the following equation:

7



Most classification algorithms seek models

that attain the highest accuracy, or equivalently,

the lowest error rate when applied to the test

set.

8



Typical examples of classification

techniques or classifiers are: Bayesian classifiers Nearest-neighbor classifiers Decision tree classifiers Rule-based classifiers Neural networks Support vector machines

9


Bayesian Classifiers

Consider the task of predicting whether a

person is at risk for heart disease based on the

person’s diet and workout frequency. Although most people who eat healthily and

exercise regularly have less chance of

developing heart disease, they may still do so be

cause of other factors such as heredity,

excessive smoking, and alcohol abuse. This section presents an approach for

modeling probabilistic relationships between the

attribute set and the class variable.10


Bayes Theorem

Consider a football game between two rival

teams: Team 0 and Team 1. Suppose Team 0 wins

65% o f the time and Team 1 wins the remaining

matches. Among the games won by Team 0, only

30% of them come from playing on Team 1 ’s

football field. On the other hand, 75% of the

victories for Team 1 are obtained while playing at

home. If Team 1 is to host the next match between

the two teams, which team will most likely emerge

as the winner? This question can be answered by using the

well-known Bayes theorem.

11


Bayes Theorem

12Md. Tarek HabibMd. Tarek Habib

Bayes Theorem

13


Bayes Theorem

14


Using the Bayes Theorem for Classification

15


Let X denote the attribute set and Y denote

the class variable. We can treat X and Y as random variables. P(Y|X) is also known as the posterior

probability for Y, as opposed to its prior

probability, P(Y).


16



17



18



19



20Md. Tarek HabibMd. Tarek Habib

Naïve Bayes ClassifierNaïve Bayes Classifier

21



22



23



24



25



26



27



28


topic 10 (bayesian classifiers)

Education

tarek habibmd

classification algorithms

classification problems

building classification

test set

input data set

training set

class labels of records