pr_unit-iii

Upload: joyce-george

Post on 04-Apr-2018


7/30/2019 PR_UNIT-III
12/31/2010 SAM_PR_UNIT-II

    UNIT-III The Normal Density

    Univariate density

    Density which is analytically tractable

    Continuous density

    A lot of processes are asymptotically Gaussian

Handwritten characters and speech sounds can be modeled as an ideal or prototype pattern corrupted by a random process (central limit theorem)

The univariate normal density is

P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right]

where:

μ = mean (or expected value) of x

σ² = expected squared deviation, or variance
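The univariate density can be evaluated directly from its definition. A minimal Python sketch (the function name is illustrative, not from the slides):

```python
import math

def univariate_normal_pdf(x, mu, sigma):
    """Evaluate the univariate normal density at x (mean mu, std dev sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

# The density peaks at the mean, where it equals 1 / (sqrt(2*pi) * sigma).
peak = univariate_normal_pdf(0.0, 0.0, 1.0)
```

The density is symmetric about μ, so points equidistant from the mean get equal values.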



    Multivariate density

The multivariate normal density in d dimensions is:

P(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\exp\left[-\frac{1}{2}(x-\mu)^{t}\Sigma^{-1}(x-\mu)\right]

where x = (x₁, x₂, …, x_d)ᵗ (t stands for the transpose vector form)

μ = (μ₁, μ₂, …, μ_d)ᵗ is the mean vector

Σ is the d×d covariance matrix

|Σ| and Σ⁻¹ are its determinant and inverse, respectively
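The multivariate density can be computed the same way with NumPy. A minimal sketch, assuming Σ is invertible (the function name is illustrative):

```python
import numpy as np

def multivariate_normal_pdf(x, mu, sigma):
    """Evaluate the d-dimensional normal density at x.
    x, mu: length-d vectors; sigma: d x d covariance matrix (assumed invertible)."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(sigma)            # Sigma^{-1}
    det = np.linalg.det(sigma)            # |Sigma|
    norm = 1.0 / ((2.0 * np.pi) ** (d / 2) * np.sqrt(det))
    return float(norm * np.exp(-0.5 * diff @ inv @ diff))
```

For d = 1 with unit variance this reduces to the univariate density above.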


    Discriminant Functions for the Normal Density

Bayes Decision Theory: Discrete Features

    Bayesian Decision Theory


    Discriminant Functions for the Normal Density

The minimum error-rate classification can be achieved by the discriminant function

g_i(x) = \ln p(x\,|\,\omega_i) + \ln P(\omega_i)

Case of multivariate normal:

g_i(x) = -\frac{1}{2}(x-\mu_i)^{t}\Sigma_i^{-1}(x-\mu_i) - \frac{d}{2}\ln 2\pi - \frac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)
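The multivariate-normal discriminant can be evaluated term by term. A minimal sketch (names are illustrative; `slogdet` is used for ln|Σ| for numerical stability):

```python
import numpy as np

def discriminant(x, mu, sigma, prior):
    """g_i(x) = -1/2 (x-mu)^t Sigma^{-1} (x-mu) - d/2 ln 2pi - 1/2 ln|Sigma| + ln P(omega_i)."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)              # ln|Sigma|
    maha = diff @ np.linalg.inv(sigma) @ diff          # squared Mahalanobis distance
    return float(-0.5 * maha - 0.5 * d * np.log(2.0 * np.pi) - 0.5 * logdet + np.log(prior))
```

By construction this equals ln p(x|ωᵢ) + ln P(ωᵢ), so classification picks the class with the largest gᵢ(x).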


Case 1: Σᵢ = σ²I (all features have equal variance)

g_i(x) = -\frac{(x-\mu_i)^{t}(x-\mu_i)}{2\sigma^{2}} + \ln P(\omega_i)

where (x − μᵢ)ᵗ(x − μᵢ) is the squared Euclidean distance from x to μᵢ.

Expanding the quadratic form yields a linear discriminant function:

g_i(x) = w_i^{t}x + w_{i0}

where:

w_i = \frac{\mu_i}{\sigma^{2}}; \qquad w_{i0} = -\frac{1}{2\sigma^{2}}\mu_i^{t}\mu_i + \ln P(\omega_i)

(w_{i0} is called the threshold for the i-th category.)
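Case 1 can be sketched as code: compute the weight vector and threshold for each class, then assign x to the class with the largest linear score (function names are illustrative):

```python
import numpy as np

def linear_discriminant_params(mu, sigma2, prior):
    """Case Sigma_i = sigma^2 I: weights of g_i(x) = w_i^t x + w_i0."""
    w = mu / sigma2
    w0 = -float(mu @ mu) / (2.0 * sigma2) + np.log(prior)
    return w, w0

def classify(x, params):
    """Assign x to the category with the largest linear discriminant value."""
    scores = [w @ x + w0 for w, w0 in params]
    return int(np.argmax(scores))
```

With equal priors this reduces to a minimum-Euclidean-distance classifier: a point is assigned to the nearest mean.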


A classifier that uses linear discriminant functions is called a linear machine.

The decision surfaces for a linear machine are pieces of hyperplanes defined by:

g_i(x) = g_j(x)


The hyperplane separating Rᵢ and Rⱼ passes through the point

x_0 = \frac{1}{2}(\mu_i+\mu_j) - \frac{\sigma^{2}}{\|\mu_i-\mu_j\|^{2}}\ln\frac{P(\omega_i)}{P(\omega_j)}\,(\mu_i-\mu_j)

and is always orthogonal to the line linking the means!

If P(\omega_i) = P(\omega_j), then x_0 = \frac{1}{2}(\mu_i+\mu_j).


Case 2: Σᵢ = Σ (covariances of all classes are identical but arbitrary)

The covariance matrices are arbitrary, but equal to each other for all classes. Features then form hyperellipsoidal clusters of equal size and shape. This also results in linear discriminant functions whose decision boundaries are again hyperplanes.

The hyperplane separating Rᵢ and Rⱼ passes through

x_0 = \frac{1}{2}(\mu_i+\mu_j) - \frac{\ln\left[P(\omega_i)/P(\omega_j)\right]}{(\mu_i-\mu_j)^{t}\Sigma^{-1}(\mu_i-\mu_j)}\,(\mu_i-\mu_j)

(The hyperplane separating Rᵢ and Rⱼ is generally not orthogonal to the line between the means!)
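Case 2 can be sketched by dropping the class-independent terms of the full discriminant, leaving a squared Mahalanobis distance plus the log prior, and by computing the point x₀ on the separating hyperplane (names are illustrative):

```python
import numpy as np

def case2_discriminant(x, mu, sigma_inv, prior):
    """Shared-covariance case: g_i(x) = -1/2 (x-mu_i)^t Sigma^{-1} (x-mu_i) + ln P(omega_i)."""
    diff = x - mu
    return float(-0.5 * diff @ sigma_inv @ diff + np.log(prior))

def hyperplane_point(mu_i, mu_j, sigma_inv, p_i, p_j):
    """Point x_0 through which the hyperplane separating R_i and R_j passes."""
    diff = mu_i - mu_j
    denom = float(diff @ sigma_inv @ diff)             # (mu_i - mu_j)^t Sigma^{-1} (mu_i - mu_j)
    return 0.5 * (mu_i + mu_j) - (np.log(p_i / p_j) / denom) * diff
```

At x₀ the two discriminants are equal, which is exactly the condition gᵢ(x) = gⱼ(x) defining the boundary.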


Case 3: Σᵢ = arbitrary

The covariance matrices are different for each category. The discriminant functions are now, in general, quadratic (not linear):

g_i(x) = x^{t}W_i x + w_i^{t}x + w_{i0}

where:

W_i = -\frac{1}{2}\Sigma_i^{-1}

w_i = \Sigma_i^{-1}\mu_i

w_{i0} = -\frac{1}{2}\mu_i^{t}\Sigma_i^{-1}\mu_i - \frac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)

In the two-class case, the decision boundaries are hyperquadrics: hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, or hyperhyperboloids.
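Case 3 can be sketched as code: compute Wᵢ, wᵢ, and w_{i0} per class, then evaluate the quadratic form (function names are illustrative):

```python
import numpy as np

def quadratic_params(mu, sigma, prior):
    """Case 3 weights: W_i, w_i, w_i0 of g_i(x) = x^t W_i x + w_i^t x + w_i0."""
    sigma_inv = np.linalg.inv(sigma)
    W = -0.5 * sigma_inv
    w = sigma_inv @ mu
    _, logdet = np.linalg.slogdet(sigma)               # ln|Sigma_i|
    w0 = -0.5 * float(mu @ sigma_inv @ mu) - 0.5 * logdet + np.log(prior)
    return W, w, w0

def quadratic_discriminant(x, W, w, w0):
    """Evaluate the quadratic discriminant at x."""
    return float(x @ W @ x + w @ x + w0)
```

Expanding the quadratic form recovers −½(x−μᵢ)ᵗΣᵢ⁻¹(x−μᵢ) − ½ln|Σᵢ| + ln P(ωᵢ), i.e. the full discriminant minus the class-independent (d/2)ln 2π term.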


For the multi-class case, the decision boundaries look even more complicated.

[Figure: decision boundaries]


Bayes Decision Theory: Discrete Features

Components of x are binary or integer valued; x can take only one of m discrete values v₁, v₂, …, v_m.

Case of independent binary features in a two-category problem:

Let x = [x₁, x₂, …, x_d]ᵗ, where each xᵢ is either 0 or 1, with probabilities:

p_i = P(x_i = 1\,|\,\omega_1)

q_i = P(x_i = 1\,|\,\omega_2)


    The discriminant function in this case is:

g(x) = \sum_{i=1}^{d} w_i x_i + w_0

where:

w_i = \ln\frac{p_i(1-q_i)}{q_i(1-p_i)}, \qquad i = 1,\dots,d

and:

w_0 = \sum_{i=1}^{d}\ln\frac{1-p_i}{1-q_i} + \ln\frac{P(\omega_1)}{P(\omega_2)}

Decide ω₁ if g(x) > 0 and ω₂ if g(x) < 0.
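The independent-binary-feature discriminant can be sketched directly from these weight formulas (function names are illustrative; probabilities are assumed strictly between 0 and 1 so the logs are defined):

```python
import math

def binary_bayes_weights(p, q, prior1, prior2):
    """Weights for g(x) = sum_i w_i x_i + w_0 with independent binary features.
    p[i] = P(x_i = 1 | omega_1), q[i] = P(x_i = 1 | omega_2)."""
    w = [math.log(pi * (1.0 - qi) / (qi * (1.0 - pi))) for pi, qi in zip(p, q)]
    w0 = sum(math.log((1.0 - pi) / (1.0 - qi)) for pi, qi in zip(p, q))
    w0 += math.log(prior1 / prior2)
    return w, w0

def decide(x, w, w0):
    """Return 1 for omega_1 (g(x) > 0), else 2 for omega_2."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if g > 0 else 2
```

With pᵢ > qᵢ, a feature value xᵢ = 1 is evidence for ω₁, and the magnitude of wᵢ measures how strongly.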


Compound Bayesian Decision Theory and Context

Consecutive states of nature ω(1), ω(2), ω(3), … may be statistically dependent; this statistical dependence (context) can be exploited to gain improved performance.

Compound decision problem

Sequential compound decision problem


The posterior probability of ω:

P(\omega\,|\,X) = \frac{p(X\,|\,\omega)\,P(\omega)}{p(X)}

where ω = (ω(1), …, ω(n)) takes one of cⁿ possible values and X = (x₁, …, xₙ) is the observed sequence.

The optimal procedure is to minimize the compound conditional risk. If there is no loss for being correct and all errors are equally costly, the procedure is to compute P(ω|X) for all ω and select the ω whose posterior probability is maximum.

In practice this is an enormous task (cⁿ values) and depends on the prior P(ω).

If each xᵢ depends only on ω(i), then

p(X\,|\,\omega) = \prod_{i=1}^{n} p(x_i\,|\,\omega(i))
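The brute-force compound decision rule can be sketched directly: enumerate all cⁿ state sequences (the enormous task noted in the text), score each with the factored likelihood, and keep the best. The callables `likelihood(x, s)` and `prior(omega)` are assumed inputs, not from the slides:

```python
import math
from itertools import product

def compound_map(X, c, likelihood, prior):
    """Brute-force compound Bayes decision: enumerate all c**n sequences omega,
    score p(X|omega) P(omega) assuming p(X|omega) = prod_i p(x_i | omega(i)),
    and return the maximum-posterior sequence (normalization by p(X) is irrelevant)."""
    best, best_score = None, -math.inf
    for omega in product(range(c), repeat=len(X)):
        score = prior(omega) * math.prod(likelihood(x, s) for x, s in zip(X, omega))
        if score > best_score:
            best, best_score = omega, score
    return best
```

The exponential cost of this enumeration is what motivates sequential and context-exploiting approximations.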
