Machine Learning Tutorial 1: Optimal Bayes Classification
TRANSCRIPT
Theory Review
We assume all variables are random variables with known distributions.

Notation:
- $\Omega$: a finite set of classes (categories), $\omega_i \in \Omega$, $i = 1, \dots, N$
- $X$: the input space (patterns), $x \in X$
- A classifier $f : X \to \Omega$ maps $x \in X$ to $\omega \in \Omega$
Basic Assumption
The following distributions are known:
• $p(\omega_1), p(\omega_2), \dots, p(\omega_N)$ - the prior probability of each class.
• $p(x \mid \omega_i)$, $i = 1, \dots, N$ - the conditional probability of the input, given that the class is $\omega_i$.
• If $X$ is a continuous space, $p(x \mid \omega_i)$ denotes the probability density.
Reminder: Fish Classification
• Two classes
• Prior probability can be estimated from relative frequency
• Class conditional probability can be estimated by a frequency histogram, as sketched below
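A minimal Python sketch of both estimates. The feature (fish length), the sample sizes, and the bin count are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Hypothetical labeled sample: a 1-D feature (fish length) per example,
# with class labels 0 and 1 for the two kinds of fish.
rng = np.random.default_rng(0)
lengths = np.concatenate([rng.normal(10, 2, 300), rng.normal(14, 2, 700)])
labels = np.array([0] * 300 + [1] * 700)

# Prior estimated from relative frequency.
priors = np.bincount(labels) / len(labels)   # e.g. [0.3, 0.7]

# Class conditional probability estimated by a normalized frequency histogram.
bins = np.linspace(lengths.min(), lengths.max(), 21)
cond_hist = [np.histogram(lengths[labels == c], bins=bins, density=True)[0]
             for c in (0, 1)]

print(priors)
print(cond_hist[0])  # histogram estimate of p(x | class 0), one value per bin
```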
Optimal Bayes Classifier
Bayes Rule:

$$p(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, p(\omega_i)}{p(x)} = \frac{p(x \mid \omega_i)\, p(\omega_i)}{\sum_{j=1}^{N} p(x \mid \omega_j)\, p(\omega_j)}$$

Optimal Bayes Classifier:

$$f(x) = \underset{i=1,\dots,N}{\arg\max}\; p(\omega_i \mid x) = \underset{i=1,\dots,N}{\arg\max}\; p(x \mid \omega_i)\, p(\omega_i)$$

This classifier minimizes both the conditional error probability and the average error probability.
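As a concrete illustration, here is a minimal Python sketch of the argmax rule. The two-class priors, the Gaussian likelihoods, and the function names are assumptions made for the example; the slide only fixes the rule itself:

```python
import numpy as np
from scipy.stats import norm

def bayes_classify(x, priors, likelihoods):
    """Return argmax_i of p(x | w_i) * p(w_i).

    Dividing by p(x) is unnecessary: it is the same for every class,
    so it does not change the argmax.
    """
    scores = [lik(x) * prior for lik, prior in zip(likelihoods, priors)]
    return int(np.argmax(scores))

# Illustrative example: p(w_0) = 0.3, p(w_1) = 0.7,
# p(x|w_0) = N(10, 2^2), p(x|w_1) = N(14, 2^2).
priors = [0.3, 0.7]
likelihoods = [norm(10, 2).pdf, norm(14, 2).pdf]
print(bayes_classify(11.0, priors, likelihoods))  # -> 0: w_0 wins at x = 11
```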
Exercise 1

It is given that $X = \mathbb{R}^2$ and $\Omega = \{\omega_1, \omega_2, \omega_3\}$.

The prior probability is uniform:

$$p(\omega_1) = p(\omega_2) = p(\omega_3) = \frac{1}{3}$$

And the class conditional probability is Gaussian:

$$p(x \mid \omega_i) \sim N(\mu_i, \Sigma)$$

where

$$\mu_1 = \begin{pmatrix} 0 \\ 3 \end{pmatrix}, \quad \mu_2 = \begin{pmatrix} 3 \\ 0 \end{pmatrix}, \quad \mu_3 = \begin{pmatrix} -3 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}$$

a. What is the Bayes optimal decision rule? What are the decision boundaries in the plane?
b. Does the decision boundary for the Gaussian case always have the same form? What does it depend on?
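A quick numerical sketch for part (a), using the fact that with equal priors and one shared spherical covariance $\Sigma = \frac{1}{2}I$, maximizing $p(x \mid \omega_i)\,p(\omega_i)$ is the same as minimizing the Euclidean distance $\|x - \mu_i\|$. The test points are arbitrary:

```python
import numpy as np

# Means from the exercise; equal priors, shared covariance Sigma = I/2.
mus = np.array([[0.0, 3.0], [3.0, 0.0], [-3.0, 0.0]])

def classify(x):
    # With equal priors and a shared spherical covariance, the Bayes
    # rule reduces to picking the nearest mean in Euclidean distance.
    return int(np.argmin(np.linalg.norm(mus - x, axis=1))) + 1  # class 1..3

print(classify(np.array([0.0, 1.0])))   # -> 1 (closest to mu_1)
print(classify(np.array([2.0, -1.0])))  # -> 2 (closest to mu_2)
```

This also hints at the shape of the boundaries here: each pairwise boundary is the perpendicular bisector of the segment joining two means, i.e. a straight line.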
Exercise 2

It is given that the input space is the set of binary vectors of length $d$, meaning that $x = (x_1, \dots, x_d) \in \{0,1\}^d$ with $x_i \in \{0,1\}$. The output space is $\Omega = \{\omega_0, \omega_1\}$, with a general prior $p(\omega_0), p(\omega_1)$.

We define the per-coordinate class conditional probabilities:

$$p_i = p(x_i = 1 \mid \omega_0), \qquad q_i = p(x_i = 1 \mid \omega_1)$$

In addition, given the class, each coordinate is statistically independent of all the other coordinates.

a. What is the optimal decision rule?
b. What happens if $p_i = q_i$ for some $i$?
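Under this conditional-independence assumption the likelihood of a whole vector factorizes, $p(x \mid \omega) = \prod_i p(x_i \mid \omega)$, which the following sketch uses. The specific values of $p_i$, $q_i$, and the prior are illustrative assumptions:

```python
import numpy as np

def log_posterior_ratio(x, p, q, prior0, prior1):
    """log[ p(w_1|x) / p(w_0|x) ] for a binary vector x whose coordinates
    are independent given the class.

    p[i] = p(x_i = 1 | w_0), q[i] = p(x_i = 1 | w_1).
    """
    ll0 = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    ll1 = np.sum(x * np.log(q) + (1 - x) * np.log(1 - q))
    return (ll1 + np.log(prior1)) - (ll0 + np.log(prior0))

# Decide w_1 exactly when the log-ratio is positive.
p = np.array([0.2, 0.5, 0.9])
q = np.array([0.7, 0.5, 0.1])
x = np.array([1, 0, 0])
print(log_posterior_ratio(x, p, q, 0.5, 0.5) > 0)  # -> True: decide w_1
```

Note that the middle coordinate, where $p_i = q_i$, contributes identical terms to both log-likelihoods and cancels from the ratio, which is the situation part (b) asks about.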
Exercise 3
Two classes $\omega_0, \omega_1$ are given, with uniform prior. In addition it is given that:

$$p(x \mid \omega_0) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{(x-2)^2}{2} \right\}$$

$$p(x \mid \omega_1) = \frac{1}{2\sqrt{2\pi}} \exp\left\{ -\frac{(x-4)^2}{8} \right\}$$

What is the Bayes optimal decision rule?
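Assuming the densities as reconstructed above (i.e. $p(x \mid \omega_0)$ is a $N(2,1)$ density and $p(x \mid \omega_1)$ is $N(4,4)$), here is a short sketch that locates the decision boundary numerically. With a uniform prior, comparing posteriors reduces to comparing the two densities:

```python
import numpy as np

def log_ratio(x):
    # log p(x|w_1) - log p(x|w_0); with equal priors, decide w_1 when > 0.
    return (-np.log(2) - (x - 4) ** 2 / 8) - (-(x - 2) ** 2 / 2)

# Setting the log-ratio to zero and simplifying gives the quadratic
# 3x^2 - 8x - 8 ln 2 = 0, so the boundary is (at most) two points:
roots = np.roots([3.0, -8.0, -8.0 * np.log(2)])
print(np.sort(roots))        # approx. [-0.57, 3.24]
print(log_ratio(0.0) > 0)    # -> False: between the roots, w_0 wins
print(log_ratio(5.0) > 0)    # -> True: in the tails, w_1 wins
```

Because the two variances differ, the log-ratio is quadratic in $x$, so the boundary is a pair of points rather than a single threshold: the wider class $\omega_1$ wins in both tails.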