
Page 1:

Lecture Slides for
INTRODUCTION TO MACHINE LEARNING, 3RD EDITION
ETHEM ALPAYDIN © The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e

Page 2:

CHAPTER 3: BAYESIAN DECISION THEORY

Page 3:

Probability and Inference

Result of tossing a coin is in {Heads, Tails}. Random variable X in {1, 0}.
Bernoulli: P{X = x} = p_o^x (1 − p_o)^(1 − x)
Sample: X = {x^t}_{t=1}^N
Estimation: p_o = #{Heads} / #{Tosses} = Σ_t x^t / N
Prediction of next toss: Heads if p_o > 1/2, Tails otherwise
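To make the estimate concrete, here is a minimal Python sketch (not from the slides; the true head probability 0.6 and the sample size are made up) that estimates p_o from simulated tosses and predicts the next toss:

```python
import random

def estimate_po(sample):
    # Maximum likelihood estimate: fraction of heads (1s) in the sample
    return sum(sample) / len(sample)

# Simulate N tosses of a coin with hypothetical true P(Heads) = 0.6
random.seed(0)
N = 1000
sample = [1 if random.random() < 0.6 else 0 for _ in range(N)]

po_hat = estimate_po(sample)
prediction = "Heads" if po_hat > 0.5 else "Tails"
print(f"Estimated p_o = {po_hat:.3f}, predict next toss: {prediction}")
```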

Page 4:

Classification

Credit scoring: Inputs are income and savings. Output is low-risk vs high-risk.
Input: x = [x_1, x_2]^T, Output: C in {0, 1}
Prediction:
choose C = 1 if P(C = 1 | x_1, x_2) > 0.5, and C = 0 otherwise
or equivalently
choose C = 1 if P(C = 1 | x_1, x_2) > P(C = 0 | x_1, x_2), and C = 0 otherwise

Page 5:

Bayes' Rule

P(C | x) = p(x | C) P(C) / p(x)

posterior = (prior × likelihood) / evidence

P(C = 0) + P(C = 1) = 1
p(x) = p(x | C = 1) P(C = 1) + p(x | C = 0) P(C = 0)
P(C = 0 | x) + P(C = 1 | x) = 1
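A minimal numerical sketch of Bayes' rule for the two-class case, tied to the decision rule on the previous slide; the prior and the likelihood values below are made-up illustrations, not from the slides:

```python
def posterior_c1(prior_c1, likelihood_c1, likelihood_c0):
    # Bayes' rule for two classes: P(C=1|x) = p(x|C=1) P(C=1) / p(x)
    prior_c0 = 1.0 - prior_c1
    evidence = likelihood_c1 * prior_c1 + likelihood_c0 * prior_c0
    return likelihood_c1 * prior_c1 / evidence

# Hypothetical numbers: prior P(C=1) = 0.3, and for a given x,
# class-conditional densities p(x|C=1) = 0.8 and p(x|C=0) = 0.2
p_c1_given_x = posterior_c1(prior_c1=0.3, likelihood_c1=0.8, likelihood_c0=0.2)
decision = 1 if p_c1_given_x > 0.5 else 0   # choose C=1 if P(C=1|x) > 1/2
print(f"P(C=1|x) = {p_c1_given_x:.3f}, choose C = {decision}")
```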

Page 6:

Bayes' Rule: K > 2 Classes

P(C_i | x) = p(x | C_i) P(C_i) / p(x)
           = p(x | C_i) P(C_i) / Σ_{k=1}^{K} p(x | C_k) P(C_k)

P(C_i) ≥ 0 and Σ_{i=1}^{K} P(C_i) = 1

choose C_i if P(C_i | x) = max_k P(C_k | x)
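The same computation generalizes directly to K classes; a short sketch with hypothetical priors and likelihoods for K = 3:

```python
def posteriors(priors, likelihoods):
    # Bayes' rule for K classes: P(C_i|x) is proportional to p(x|C_i) P(C_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)                      # p(x) = sum_k p(x|C_k) P(C_k)
    return [j / evidence for j in joint]

# Hypothetical 3-class problem: priors and p(x|C_k) values for one observed x
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.3]
post = posteriors(priors, likelihoods)
chosen = max(range(len(post)), key=lambda i: post[i])   # argmax_k P(C_k|x)
print(post, "-> choose class", chosen)
```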

Page 7:

Losses and Risks

Actions: α_i
Loss of α_i when the state is C_k: λ_ik
Expected risk (Duda and Hart, 1973):

R(α_i | x) = Σ_{k=1}^{K} λ_ik P(C_k | x)

choose α_i if R(α_i | x) = min_k R(α_k | x)
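A brief sketch of choosing the minimum-risk action; the loss matrix and posteriors are hypothetical, chosen so that misclassifying a high-risk customer is far costlier than turning away a low-risk one:

```python
def expected_risks(loss_matrix, posteriors):
    # R(alpha_i | x) = sum_k lambda_ik P(C_k | x)
    return [sum(lam * p for lam, p in zip(row, posteriors))
            for row in loss_matrix]

# Hypothetical 2-action, 2-class credit example: accepting a high-risk
# customer (C_2) costs 10; denying a low-risk one (C_1) costs 1
losses = [[0, 10],    # action alpha_1: accept the application
          [1,  0]]    # action alpha_2: deny the application
posteriors = [0.8, 0.2]
risks = expected_risks(losses, posteriors)
print(risks, "-> choose action", risks.index(min(risks)) + 1)
```

With these unequal losses the minimum-risk action is to deny the application even though the low-risk class is the more probable one; with a 0/1 loss matrix the rule reduces to picking the most probable class, as the next slide shows.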

Page 8:

Losses and Risks: 0/1 Loss

λ_ik = 0 if i = k, 1 if i ≠ k

R(α_i | x) = Σ_{k=1}^{K} λ_ik P(C_k | x) = Σ_{k ≠ i} P(C_k | x) = 1 − P(C_i | x)

For minimum risk, choose the most probable class.

Page 9:

Losses and Risks: Reject

λ_ik = 0 if i = k, λ if i = K + 1 (reject), 1 otherwise, with 0 < λ < 1

R(α_{K+1} | x) = λ Σ_{k=1}^{K} P(C_k | x) = λ
R(α_i | x) = Σ_{k ≠ i} P(C_k | x) = 1 − P(C_i | x)

choose C_i if P(C_i | x) > P(C_k | x) for all k ≠ i and P(C_i | x) > 1 − λ
reject otherwise
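A sketch of the reject rule with hypothetical posteriors and rejection loss λ = 0.2, so a class is chosen only if its posterior exceeds 1 − λ = 0.8:

```python
def decide_with_reject(posteriors, lam):
    # Reject (return None) unless the best class clears the 1 - lambda threshold
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    if posteriors[best] > 1.0 - lam:
        return best
    return None  # reject

# Hypothetical posteriors for two different inputs x
print(decide_with_reject([0.45, 0.40, 0.15], lam=0.2))   # None: no class exceeds 0.8
print(decide_with_reject([0.90, 0.07, 0.03], lam=0.2))   # 0: confident enough to classify
```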

Page 10:

Different Losses and Reject

[Figure: decision boundaries for the two-class example under equal losses, unequal losses, and with reject]

Page 11:

Discriminant Functions

g_i(x), i = 1, ..., K

choose C_i if g_i(x) = max_k g_k(x)

g_i(x) can be taken as −R(α_i | x), P(C_i | x), or p(x | C_i) P(C_i)

K decision regions R_1, ..., R_K:
R_i = {x | g_i(x) = max_k g_k(x)}
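As an illustration of decision regions, a sketch with made-up quadratic discriminants for K = 3 classes on a one-dimensional input; scanning a few points shows which region R_i each falls into:

```python
# Hypothetical discriminants for K = 3 classes on a 1-D input
discriminants = [
    lambda x: -(x - 1.0) ** 2,   # g_1: peaks at x = 1
    lambda x: -(x - 4.0) ** 2,   # g_2: peaks at x = 4
    lambda x: -(x - 7.0) ** 2,   # g_3: peaks at x = 7
]

def region(x):
    # Decision region: the index i maximizing g_i(x)
    scores = [g(x) for g in discriminants]
    return scores.index(max(scores))

for x in [0.0, 2.0, 3.0, 5.0, 6.0, 8.0]:
    print(f"x = {x}: region R_{region(x) + 1}")
```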

Page 12:

K = 2 Classes

Dichotomizer (K = 2) vs polychotomizer (K > 2)

g(x) = g_1(x) − g_2(x)
choose C_1 if g(x) > 0, C_2 otherwise

Log odds: log [P(C_1 | x) / P(C_2 | x)]
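A small sketch of the log-odds dichotomizer, using made-up posterior values P(C_1 | x):

```python
import math

def log_odds(p_c1_given_x):
    # g(x) = log P(C1|x) - log P(C2|x) for two classes, with P(C2|x) = 1 - P(C1|x)
    return math.log(p_c1_given_x) - math.log(1.0 - p_c1_given_x)

for p in [0.9, 0.5, 0.2]:          # hypothetical posteriors P(C1|x)
    g = log_odds(p)
    print(f"P(C1|x) = {p}: g(x) = {g:+.3f} -> choose {'C1' if g > 0 else 'C2'}")
```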

Page 13:

Utility Theory

Probability of state k given evidence x: P(S_k | x)
Utility of α_i when the state is k: U_ik

Expected utility: EU(α_i | x) = Σ_k U_ik P(S_k | x)

Choose α_i if EU(α_i | x) = max_j EU(α_j | x)
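A short sketch of picking the action with maximum expected utility; the utility matrix and state posteriors are hypothetical:

```python
def expected_utilities(utility_matrix, state_posteriors):
    # EU(alpha_i | x) = sum_k U_ik P(S_k | x)
    return [sum(u * p for u, p in zip(row, state_posteriors))
            for row in utility_matrix]

# Hypothetical example: 2 actions, 3 states, made-up utilities and posteriors
U = [[10, 0, -5],     # utilities of action alpha_1 in states S_1, S_2, S_3
     [ 2, 2,  2]]     # utilities of action alpha_2 (a safe, constant payoff)
P_S = [0.5, 0.3, 0.2]
eu = expected_utilities(U, P_S)
print(eu, "-> choose action", eu.index(max(eu)) + 1)   # maximize expected utility
```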

Page 14:

Association Rules

Association rule: X → Y
People who buy/click/visit/enjoy X are also likely to buy/click/visit/enjoy Y.
A rule implies association, not necessarily causation.

Page 15:

Association measures

Support (X → Y): P(X, Y) = #{customers who bought X and Y} / #{customers}

Confidence (X → Y): P(Y | X) = P(X, Y) / P(X)
  = #{customers who bought X and Y} / #{customers who bought X}

Lift (X → Y): P(X, Y) / (P(X) P(Y)) = P(Y | X) / P(Y)
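These three measures can be computed directly by counting over transactions; a sketch with made-up shopping baskets:

```python
def association_measures(transactions, x, y):
    # Support, confidence, and lift of the rule X -> Y from a list of baskets
    n = len(transactions)
    n_x = sum(1 for t in transactions if x in t)
    n_y = sum(1 for t in transactions if y in t)
    n_xy = sum(1 for t in transactions if x in t and y in t)
    support = n_xy / n                      # P(X, Y)
    confidence = n_xy / n_x                 # P(Y | X)
    lift = confidence / (n_y / n)           # P(Y | X) / P(Y)
    return support, confidence, lift

# Made-up shopping baskets
baskets = [{"milk", "bread"}, {"milk", "diapers", "beer"},
           {"bread", "diapers"}, {"milk", "bread", "diapers"},
           {"bread", "beer"}]
print(association_measures(baskets, "milk", "bread"))
```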

Page 16:

Example

Page 17:

Apriori algorithm (Agrawal et al., 1996)

For (X, Y, Z), a 3-item set, to be frequent (have enough support), (X, Y), (X, Z), and (Y, Z) should be frequent.
If (X, Y) is not frequent, none of its supersets can be frequent.
Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, ... and X → Y, Z, ...
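A compact sketch of the Apriori idea for the frequent-itemset step (rule generation omitted); the baskets and the min_support threshold are made up for illustration:

```python
from itertools import combinations

def apriori(transactions, min_support):
    # Returns frequent itemsets (as frozensets) with support >= min_support
    n = len(transactions)
    support = lambda items: sum(1 for t in transactions if items <= t) / n
    singletons = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    current = {s for s in singletons if support(s) >= min_support}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        # Candidate (k+1)-item sets: unions of frequent k-item sets, kept only
        # if every k-item subset is frequent (the Apriori pruning step)
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent for sub in combinations(c, k))
                   and support(c) >= min_support}
        k += 1
    return frequent

baskets = [frozenset(b) for b in (["milk", "bread"], ["milk", "diapers", "beer"],
                                  ["bread", "diapers"], ["milk", "bread", "diapers"])]
print(apriori(baskets, min_support=0.5))
```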