Learning Decision Trees using the Fourier Spectrum

Post on 09-Jan-2016


Haim Kermany

By Eyal Kushilevitz and Yishay Mansour

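What is Learning

[Figure: the learner queries an unknown function $f(x)$ at a point $a$, written $(a, ?)$, receives $f(a)$, and outputs a hypothesis $h(x)$.]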

What are we Learning?

Boolean function: $f : \{0,1\}^n \to \{-1, +1\}$

1 2 3 4 5 1 2 3 4 1: ( , , , , ) ( ) ( )ex f x x x x x x x x x x

•Term (and of literals)

1 2 3 4 5 1 2 4 1 2 4: ( , , , , )ex f x x x x x x x x x x x

(01011) (0 1) (0 1 0) 1f

•DNF (or of terms)

1 2 3 4 5 1 2 4 2 3 4 5: ( , , , , )ex f x x x x x x x x x x x x

•DT: see later

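Randomized Algorithm

Does $h(x) = f(x)$ always, on all inputs?

We want: $\mathrm{Prob}_x[h(x) \neq f(x)] \le \varepsilon$

[Figure: the sets $h(x) = 1$ and $f(x) = 1$; the region where $h(x) \neq f(x)$ has probability at most $\varepsilon$.]

Do we always succeed? $\mathrm{Prob}[\text{fail}] \le \delta$

$\varepsilon = 0 \Rightarrow h(x) = f(x)$

$\delta = 0 \Rightarrow$ deterministic algorithm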

What we want

$g$ $\varepsilon$-approximates $f$: $E[(f(x) - g(x))^2] \le \varepsilon$

$\mathrm{sign}(g(x)) \in \{-1, 1\}$

If $f$ is boolean and $g$ $\varepsilon$-approximates $f$, then $\mathrm{Prob}[f(x) \neq \mathrm{sign}(g(x))] \le \varepsilon$
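A short derivation may help explain why an $\varepsilon$-approximation $g$ of a boolean $f$ gives a good classifier $\mathrm{sign}(g)$. This is the standard argument, sketched here rather than taken from the slides: whenever $\mathrm{sign}(g(x)) \neq f(x)$, the value $g(x)$ lies on the wrong side of 0, so $|f(x) - g(x)| \ge 1$ and the indicator of a mistake is at most $(f(x) - g(x))^2$.

```latex
\Pr_x\big[f(x) \neq \mathrm{sign}(g(x))\big]
  = E_x\big[\mathbf{1}[f(x) \neq \mathrm{sign}(g(x))]\big]
  \le E_x\big[(f(x) - g(x))^2\big]
  \le \varepsilon
```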

Fourier Transform

$\chi_z(x) = +1$ if $\sum_i x_i z_i \bmod 2 = 0$

$\chi_z(x) = -1$ if $\sum_i x_i z_i \bmod 2 = 1$

where $x = (x_1 \dots x_n)$ and $z = (z_1 \dots z_n)$

Example: $\chi_{01110}(01101) = +1$, since $(1 + 1) \bmod 2 = 0$

$f(x) = \sum_{z \in \{0,1\}^n} \hat{f}_z \chi_z(x)$

$\hat{f}_z = E_x[f(x) \chi_z(x)]$
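The transform can be checked by brute force on a small function. The sketch below is illustrative only: the names `chi`, `fourier_coefficients` and `majority3` are made up for this example, and enumerating all $2^n$ inputs is feasible only for tiny $n$.

```python
from itertools import product

def chi(z, x):
    """Character chi_z(x): +1 if sum_i x_i*z_i is even, -1 if it is odd."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def fourier_coefficients(f, n):
    """Return {z: E_x[f(x) * chi_z(x)]} by enumerating all 2^n inputs."""
    points = list(product((0, 1), repeat=n))
    return {z: sum(f(x) * chi(z, x) for x in points) / len(points) for z in points}

def majority3(x):
    """Hypothetical boolean function: -1 when at least two bits are 1, else +1."""
    return -1 if sum(x) >= 2 else 1

coeffs = fourier_coefficients(majority3, 3)
# Rebuild f from its spectrum: f(x) = sum_z f_hat[z] * chi_z(x)
rebuilt = {x: sum(c * chi(z, x) for z, c in coeffs.items()) for x in coeffs}
```

Here `rebuilt[x]` should agree with `majority3(x)` on every input, which is exactly the expansion of $f$ in the basis $\chi_z$.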

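t-phase function

t-phase function: a function that has at most $t$ nonzero Fourier coefficients

If there is a $g$ such that:
1. $g$ is a t-phase function
2. $g$ $\varepsilon$-approximates $f$
then there is an $h$ that $O(\varepsilon)$-approximates $f$:

$f(x) = \sum_{z \in \{0,1\}^n} \hat{f}_z \chi_z(x)$

$h(x) = \sum_{z : |\hat{f}_z| \ge \varepsilon/t} \hat{f}_z \chi_z(x)$ (only big coefficients)

The dropped coefficients, with $|\hat{f}_z| < \varepsilon/t$, are very small.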

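$f_\alpha(x)$ has all coefficients that start with $\alpha$

For $\alpha \in \{0,1\}^k$: $f_\alpha : \{0,1\}^{n-k} \to \mathbb{R}$

$f_\alpha(x) = \sum_{\beta \in \{0,1\}^{n-k}} \hat{f}_{\alpha\beta} \chi_\beta(x)$

$f_\lambda(x) = f(x)$, where $\lambda$ is the empty string

all coefficients in $f(x)$ = all coefficients in $f_0(x)$ + all coefficients in $f_1(x)$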

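Tree

[Figure: binary tree with root $f(x)$, children $f_0(x)$ and $f_1(x)$, grandchildren $f_{00}(x)$, $f_{01}(x)$, $f_{10}(x)$, $f_{11}(x)$, and so on down to depth $n$, where the $2^n$ leaves are the functions $f_\alpha(x)$ with $|\alpha| = n$.]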
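To make the prefix functions concrete, the following sketch builds $f_\alpha$ directly from a brute-force spectrum and checks two facts about the tree: the root $f_\lambda$ is $f$ itself, and at every level $k$ the energies $E[f_\alpha^2]$ over all $\alpha \in \{0,1\}^k$ add up to $E[f^2] = 1$. The helper names and the target `example_f` are hypothetical, chosen only for illustration.

```python
from itertools import product

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def spectrum(f, n):
    """All Fourier coefficients of f, by enumeration."""
    pts = list(product((0, 1), repeat=n))
    return {z: sum(f(x) * chi(z, x) for x in pts) / len(pts) for z in pts}

def f_alpha(coeffs, alpha, x):
    """f_alpha(x) = sum over beta of f_hat[alpha+beta] * chi_beta(x)."""
    k = len(alpha)
    return sum(c * chi(z[k:], x) for z, c in coeffs.items() if z[:k] == alpha)

def energy(coeffs, alpha, n):
    """E_x[f_alpha(x)^2], averaging over all x in {0,1}^(n-k)."""
    pts = list(product((0, 1), repeat=n - len(alpha)))
    return sum(f_alpha(coeffs, alpha, x) ** 2 for x in pts) / len(pts)

def example_f(x):
    # Hypothetical target: -1 iff x1 AND (x2 XOR x3), else +1
    return -1 if x[0] and (x[1] ^ x[2]) else 1

n = 3
coeffs = spectrum(example_f, n)
# Total energy at each level k = 0..n of the tree
level_sums = [sum(energy(coeffs, a, n) for a in product((0, 1), repeat=k))
              for k in range(n + 1)]
```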

What we need

Finding only big coefficients

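Coef($\alpha$):
  if $\exists \beta : |\hat{f}_{\alpha\beta}| \ge \theta$ then
    if $|\alpha| = n$ then output $\alpha$
    else Coef($\alpha 0$); Coef($\alpha 1$);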

Analyzing coef()

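for any $\alpha \in \{0,1\}^k$, $\beta \in \{0,1\}^{n-k}$:

if $|\hat{f}_{\alpha\beta}| \ge \theta$ then $E[f_\alpha^2] \ge \theta^2$

$E[fg] = \frac{1}{2^n} \sum_{x \in \{0,1\}^n} f(x) g(x)$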

Changing coef()

Coef($\alpha$):
  $B = E[f_\alpha^2]$
  if $B \ge \theta^2$ then
    if $|\alpha| = n$ then output $\alpha$
    else Coef($\alpha 0$); Coef($\alpha 1$);
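The recursion can be sketched with the exact value of $E[f_\alpha^2]$ in place of an estimate. This is a toy version under stated assumptions: `exact_energy` computes the energy by brute force as $\sum_\beta \hat{f}_{\alpha\beta}^2$ (exponential time, so only the shape of the recursion is realistic), and `example_f` is a hypothetical target.

```python
from itertools import product

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def exact_energy(f, n, alpha):
    """E[f_alpha^2], computed as the sum of f_hat[alpha+beta]^2 over all beta."""
    pts = list(product((0, 1), repeat=n))
    total = 0.0
    for beta in product((0, 1), repeat=n - len(alpha)):
        z = alpha + beta
        coeff = sum(f(x) * chi(z, x) for x in pts) / len(pts)
        total += coeff ** 2
    return total

def coef(f, n, theta, alpha=()):
    """Return every z in {0,1}^n with f_hat[z]^2 >= theta^2."""
    if exact_energy(f, n, alpha) < theta ** 2:
        return []                      # prune: no coefficient >= theta below alpha
    if len(alpha) == n:
        return [alpha]
    return coef(f, n, theta, alpha + (0,)) + coef(f, n, theta, alpha + (1,))

def example_f(x):
    # Hypothetical target: the parity of x1 and x3, as a +1/-1 value
    return chi((1, 0, 1), x)

big = coef(example_f, 3, theta=0.5)
```

For this target the only nonzero coefficient sits at $z = 101$, so every subtree that does not lead to 101 is pruned at its root.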

Approximating $E[f_\alpha^2(x)]$

$f_\alpha(x) = E_{y \in \{0,1\}^k}[f(yx) \chi_\alpha(y)]$

$E_x[f_\alpha^2(x)] = E_{x,y,z}[f(yx) f(zx) \chi_\alpha(y) \chi_\alpha(z)]$

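Proof:

$E_x[f_\alpha^2(x)]$
$= E_x\big[(E_y[f(yx) \chi_\alpha(y)])^2\big]$
$= E_x\big[E_y[f(yx) \chi_\alpha(y)] \cdot E_y[f(yx) \chi_\alpha(y)]\big]$
$= E_x\big[E_y[f(yx) \chi_\alpha(y)] \cdot E_z[f(zx) \chi_\alpha(z)]\big]$
$= E_{x,y,z}[f(yx) \chi_\alpha(y) f(zx) \chi_\alpha(z)]$
$= E_{x,y,z}[f(yx) f(zx) \chi_\alpha(y) \chi_\alpha(z)]$

with $x \in \{0,1\}^{n-k}$ and $y, z \in \{0,1\}^k$.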

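Approx($\alpha$):
  do $m$ times:
    choose $x_i \in \{0,1\}^{n-k}$;
    choose $y_i, z_i \in \{0,1\}^k$;
  Let $B_\alpha = \mathrm{AVG}_{1 \le i \le m}\, f(y_i x_i) f(z_i x_i) \chi_\alpha(y_i) \chi_\alpha(z_i)$;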
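A minimal sketch of the sampling estimator, assuming $f$ is available as a Python callable that answers membership queries on $n$-bit tuples. The function name `approx`, the seed and the target `example_f` are illustrative choices, not part of the slides.

```python
import random

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def approx(f, n, alpha, m, rng):
    """Estimate E[f_alpha^2] from m samples, using only membership queries to f."""
    k = len(alpha)
    total = 0
    for _ in range(m):
        x = tuple(rng.randint(0, 1) for _ in range(n - k))   # shared suffix
        y = tuple(rng.randint(0, 1) for _ in range(k))       # first prefix
        z = tuple(rng.randint(0, 1) for _ in range(k))       # second prefix
        total += f(y + x) * f(z + x) * chi(alpha, y) * chi(alpha, z)
    return total / m

def example_f(x):
    # Hypothetical target: the parity of x1 and x3 on 4 bits
    return chi((1, 0, 1, 0), x)

rng = random.Random(0)
b_hit = approx(example_f, 4, (1, 0), m=2000, rng=rng)    # 10 is a prefix of 1010
b_miss = approx(example_f, 4, (0, 1), m=2000, rng=rng)   # 01 is not
```

For this target every sample under the prefix 10 equals 1, so `b_hit` is exactly 1, while `b_miss` averages random signs and should be close to 0.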

[Figure: Approx($\alpha$) approximates $E[f_\alpha^2(x)]$ using membership queries (MQ) to $f$.]

Changing coef()

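Coef($\alpha$):
  $B_\alpha$ = Approx($\alpha$)
  if $B_\alpha \ge \theta^2/2$ then (safe side)
    if $|\alpha| = n$ then output $\alpha$
    else Coef($\alpha 0$); Coef($\alpha 1$);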

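Finding $m$

$\Pr\left[\,|E[f_\alpha^2] - B_\alpha| \ge \frac{\theta^2}{4}\right] \le 2 e^{-\frac{m}{2} \left(\frac{\theta^2}{4}\right)^2}$

$m = O\left(\frac{1}{\theta^4} \log \frac{1}{\delta}\right)$

with probability $1 - \delta$:

$E[f_\alpha^2] \ge \theta^2 \Rightarrow B_\alpha \ge \theta^2/2$

$E[f_\alpha^2] < \theta^2/4 \Rightarrow B_\alpha < \theta^2/2$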
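The sample size can be computed explicitly. The sketch below assumes a Hoeffding bound of the form $2 e^{-m\theta^4/32}$ for the average of $m$ samples in $[-1, 1]$ deviating by $\theta^2/4$; the constant 32 comes from that assumption, and `samples_needed` is a made-up helper name.

```python
import math

def samples_needed(theta, delta):
    """Smallest m with 2*exp(-m*theta^4/32) <= delta."""
    return math.ceil(32 * math.log(2 / delta) / theta ** 4)

theta, delta = 0.5, 0.01
m = samples_needed(theta, delta)
# Failure probability guaranteed by the bound for this m
failure_bound = 2 * math.exp(-m * theta ** 4 / 32)
```

The result grows like $\frac{1}{\theta^4} \log \frac{1}{\delta}$, which matches the $O(\cdot)$ form on the slide.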

Coef() output

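$|\hat{f}_z| \ge \theta \Rightarrow E[f_z^2] \ge \theta^2 \Rightarrow B_z \ge \theta^2/2 \Rightarrow z$ will be output

(the same holds for every prefix $\alpha$ of $z$, so the recursion reaches $z$)

every $z$ with $|\hat{f}_z| \ge \theta$ will be output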

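Coef() time

for any $k$: $\sum_{\alpha \in \{0,1\}^k} E[f_\alpha^2] = E[f^2] = 1$

$E[f_\alpha^2] < \theta^2/4 \Rightarrow B_\alpha < \theta^2/2 \Rightarrow$ there won't be a recursive call

for any $k$: at most $4/\theta^2$ strings $\alpha \in \{0,1\}^k$ have $E[f_\alpha^2] \ge \theta^2/4$

There will be at most $4n/\theta^2$ recursive calls (every call does $m$ work)

There will be $\mathrm{poly}(1/\theta)$ coefficients

Running time is $\mathrm{poly}(n, 1/\theta, \log(1/\delta))$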

Finding $h(x)$

$Z$ = Coef($\lambda$)

approximate $\hat{f}_z$ for every $z \in Z$, using $\hat{f}_z = E[f \chi_z]$

output $h(x) = \sum_{z \in Z} \hat{f}_z \chi_z(x)$
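A sketch of this last step, assuming the set $Z$ has already been found. Each coefficient is estimated as an empirical average of $f(x)\chi_z(x)$ over random inputs, and the hypothesis is the resulting sparse sum. The list `Z` below is a hand-written stand-in for the output of Coef, and all names are illustrative.

```python
import random

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def estimate_coefficient(f, n, z, m, rng):
    """Estimate f_hat[z] = E[f(x) * chi_z(x)] from m uniformly random inputs."""
    total = 0
    for _ in range(m):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        total += f(x) * chi(z, x)
    return total / m

def build_hypothesis(f, n, Z, m, rng):
    """Return h(x) = sum over z in Z of (estimated f_hat[z]) * chi_z(x)."""
    est = {z: estimate_coefficient(f, n, z, m, rng) for z in Z}
    return lambda x: sum(c * chi(z, x) for z, c in est.items())

def example_f(x):
    # Hypothetical target: -1 iff at least two of the three bits are 1
    return -1 if sum(x) >= 2 else 1

rng = random.Random(1)
Z = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]   # stand-in for the output of Coef
h = build_hypothesis(example_f, 3, Z, m=4000, rng=rng)
sign = lambda v: 1 if v >= 0 else -1
inputs = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
agreement = [sign(h(x)) == example_f(x) for x in inputs]
```

Taking the sign of `h(x)` turns the real-valued approximation into a boolean prediction.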

Conclusion

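If there is a $g$ such that:
1. $g$ is a t-phase function
2. $g$ $\varepsilon$-approximates $f$
then there exists a randomized algorithm that outputs a function $h$ such that, with probability $1 - \delta$, $h$ $O(\varepsilon)$-approximates $f$.

The algorithm runs in $\mathrm{poly}(n, t, 1/\varepsilon, \log(1/\delta))$ time.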

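$L_1(f) = \sum_z |\hat{f}_z|$

$f(x) = \sum_{z \in \{0,1\}^n} \hat{f}_z \chi_z(x)$

$h(x) = \sum_{z : |\hat{f}_z| \ge \varepsilon / L_1(f)} \hat{f}_z \chi_z(x)$

For every $f$: $h$ $\varepsilon$-approximates $f$

How to find $h(x)$: use Coef() again, with $\theta = \varepsilon / L_1(f)$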
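The $L_1$ bound can be checked numerically on a small function. By Parseval, the error $E[(f - h)^2]$ is the sum of the squared dropped coefficients, and each dropped coefficient is smaller than $\varepsilon / L_1(f)$, so the sum is at most $\varepsilon$. The sketch below uses brute-force enumeration and a hypothetical target `example_f`.

```python
from itertools import product

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

def spectrum(f, n):
    """All Fourier coefficients of f, by enumeration."""
    pts = list(product((0, 1), repeat=n))
    return {z: sum(f(x) * chi(z, x) for x in pts) / len(pts) for z in pts}

def l1_norm(coeffs):
    """L1(f): the sum of the absolute values of the Fourier coefficients."""
    return sum(abs(c) for c in coeffs.values())

def truncation_error(coeffs, threshold):
    """E[(f - h)^2] when h keeps only the coefficients with |f_hat[z]| >= threshold."""
    return sum(c ** 2 for c in coeffs.values() if abs(c) < threshold)

def example_f(x):
    # Hypothetical target: -1 iff x1 AND x2 AND x3 AND x4, else +1
    return -1 if all(x) else 1

coeffs = spectrum(example_f, 4)
eps = 0.5
L1 = l1_norm(coeffs)
err = truncation_error(coeffs, eps / L1)
```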

Best algorithm ever

There is a randomized algorithm that, for any boolean function $f$, outputs a function $h$ such that, with probability $1 - \delta$, $h$ $\varepsilon$-approximates $f$.

The algorithm runs in $\mathrm{poly}(n, L_1(f), 1/\varepsilon, \log(1/\delta))$ time.

But $L_1(f)$ can be $\exp(n)$.

Decision Tree (DT)

[Figure: a decision tree with leaves labeled +1 or -1, shown together with the 5-bit strings 01011, 11100, 10101, 11001, 10010 and 00010.]

$f \in DT \Rightarrow L_1(f) \le m$

$m$ = the number of nodes in the tree
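The bound $L_1(f) \le m$ can be checked on a small tree. The sketch assumes an ordinary decision tree that tests a single input bit at each internal node; the tree `TREE` and its tuple encoding are hypothetical and are not the tree drawn on the slide.

```python
from itertools import product

def chi(z, x):
    """Character chi_z(x) = (-1)^(sum_i x_i*z_i)."""
    return 1 if sum(a * b for a, b in zip(z, x)) % 2 == 0 else -1

# A node is either a leaf value (+1 or -1) or a tuple
# (bit_index, subtree_if_bit_is_0, subtree_if_bit_is_1).
TREE = (0, (1, +1, -1), (2, (3, -1, +1), -1))

def evaluate(tree, x):
    """Follow the tested bits of x from the root down to a leaf value."""
    while not isinstance(tree, int):
        i, low, high = tree
        tree = high if x[i] else low
    return tree

def count_nodes(tree):
    """Number of nodes (internal nodes plus leaves)."""
    if isinstance(tree, int):
        return 1
    return 1 + count_nodes(tree[1]) + count_nodes(tree[2])

n = 4
pts = list(product((0, 1), repeat=n))
coeffs = {z: sum(evaluate(TREE, x) * chi(z, x) for x in pts) / len(pts) for z in pts}
L1 = sum(abs(c) for c in coeffs.values())
m = count_nodes(TREE)
```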

Decision Tree (DT)

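There is a randomized algorithm that, for any $f \in DT$, outputs a function $h$ such that, with probability $1 - \delta$, $h$ $\varepsilon$-approximates $f$.

The algorithm runs in $\mathrm{poly}(n, m, 1/\varepsilon, \log(1/\delta))$ time.

Proof: substitute $L_1(f) \le m$ into the running time $\mathrm{poly}(n, L_1(f), 1/\varepsilon, \log(1/\delta))$.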

Exacting

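$d(T_f)$ = depth of the tree $T_f$

$\hat{f}_z = \frac{k}{2^{d(T_f)}}$ for some integer $k$

$g$ $\varepsilon$-approximates $f$ $\Rightarrow |\hat{f}_z - \hat{g}_z| \le \sqrt{\varepsilon}$

choose: $\varepsilon < \left(\frac{1}{2} \cdot \frac{1}{2^{d(T_f)}}\right)^2$

then $|\hat{f}_z - \hat{g}_z| < \frac{1}{2} \cdot \frac{1}{2^{d(T_f)}}$, so $\hat{f}_z = \mathrm{round}(\hat{g}_z)$, rounding to the nearest multiple of $1/2^{d(T_f)}$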
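The rounding step is simple enough to show directly. A minimal sketch, assuming each true coefficient is a multiple of $1/2^d$ and the estimate is off by less than half that spacing; `round_to_grid` and the sample numbers are illustrative.

```python
def round_to_grid(value, depth):
    """Round value to the nearest multiple of 1 / 2**depth."""
    step = 2 ** depth
    return round(value * step) / step

depth = 3
true_coeff = -3 / 8                  # a multiple of 1/2^depth
noisy = true_coeff + 0.05            # error below 1/2^(depth+1) = 0.0625
recovered = round_to_grid(noisy, depth)
```

Because the error is below half the grid spacing, the nearest grid point is the true coefficient, so the approximate hypothesis becomes exact.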

Final Result

There is a randomized algorithm that, for any $f \in DT$, outputs a function $h$ such that, with probability $1 - \delta$, $h = f$.

The algorithm runs in $\mathrm{poly}(n, m, \log(1/\delta))$ time.
