
Statistical Neurodynamics of Deep Networks

Shun‐ichi Amari

RIKEN Brain Science Institute

Statistical Neurodynamics

Rozonoer (1969)
Amari (1971; 197
Amari et al (2013)
Toyoizumi et al (2015)
Poole, …, Ganguli (2016)

$w_{ij} \sim N(0, 1)$

Macroscopic behaviors common to almost all (typical) networks

Macroscopic variables

activity: $A = \frac{1}{n} \sum_i x_i^2$

distance: $D = D[\mathbf{x} : \mathbf{x}']$

curvature

$A_{l+1} = F(A_l)$

$D_{l+1} = K(D_l)$

Deep Networks

$x_i^{l+1} = \varphi\left( \sum_j w_{ij}\, x_j^{l} + w_{i0} \right)$

$A_l = \frac{1}{n} \sum_i \left( x_i^{l} \right)^2$

$A_{l+1} = F(A_l)$

$w_{ij} \sim N(0, 1/n)$

0, 1'(0) =const
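The layered random network on this slide can be simulated directly. The following is a minimal sketch, not the author's code: it assumes equal layer widths, `tanh` as the activation, and an optional bias standard deviation `bias_std`, none of which are stated on the slide. The function name `simulate_activity` is illustrative.

```python
import numpy as np

def simulate_activity(x0, n_layers, phi=np.tanh, bias_std=0.0, seed=0):
    """Propagate x0 through random layers and return the activity of every layer.

    The activity of a layer is the mean of its squared outputs, (1/n) * sum_i x_i^2.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    n = x.size
    activities = [float(np.mean(x**2))]
    for _ in range(n_layers):
        # Fresh weights per layer, w_ij ~ N(0, 1/n) (standard deviation 1/sqrt(n)).
        W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
        # Optional bias term (assumption); zero when bias_std == 0.
        b = rng.normal(0.0, bias_std, size=n) if bias_std > 0 else np.zeros(n)
        x = phi(W @ x + b)
        activities.append(float(np.mean(x**2)))
    return activities

if __name__ == "__main__":
    x0 = np.random.default_rng(1).normal(size=1000)
    for layer, A in enumerate(simulate_activity(x0, n_layers=5)):
        print(f"layer {layer}: A = {A:.4f}")
```

Running the sketch with different seeds should give nearly the same activities for wide layers, which is the sense in which the macroscopic behavior is common to typical networks.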

Pullback Metric

$ds^2 = \sum_{a,b} g_{ab}\, dx^a dx^b = \frac{1}{n_l}\, d\mathbf{x}^l \cdot d\mathbf{x}^l$

$g_{ab} = \frac{1}{n_l}\, \mathbf{e}_a \cdot \mathbf{e}_b$

1l ln n

Poole et al (2016)

Deep neural networks

Dynamics of Activity

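$\tilde{y} = \varphi\left( \sum_k w_k y_k \right) = \varphi(u), \quad u \sim N(0, A)$

$\tilde{A} = \frac{1}{n} \sum \tilde{y}^2 = E[\varphi(u)^2] = \chi_0(A)$

$\chi_0(A) = \int \varphi(\sqrt{A}\, v)^2\, Dv, \quad v \sim N(0, 1)$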

0

(0) (0) 1

( )

convergei

A A

x
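The layer-to-layer activity recursion can be explored numerically. The following is a minimal sketch under stated assumptions: the activation is `tanh`, and a hypothetical `gain` factor scales the pre-activation standard deviation (with `gain=1.0` corresponding to weights of variance 1/n). The Gaussian expectation over v ~ N(0, 1) is evaluated by Gauss-Hermite quadrature; the names `activity_map` and `iterate_activity` are illustrative.

```python
import numpy as np

# Gauss-Hermite nodes and weights for expectations over v ~ N(0, 1).
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(61)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # rescale so the weights sum to 1

def activity_map(A, phi=np.tanh, gain=1.0):
    """Return E[phi(gain * sqrt(A) * v)^2] for v ~ N(0, 1); requires A >= 0."""
    return float(np.sum(WEIGHTS * phi(gain * np.sqrt(A) * NODES) ** 2))

def iterate_activity(A0, n_layers, phi=np.tanh, gain=1.0):
    """Apply the activity map layer by layer and return [A_0, ..., A_L]."""
    history = [float(A0)]
    for _ in range(n_layers):
        history.append(activity_map(history[-1], phi, gain))
    return history

if __name__ == "__main__":
    for A0 in (0.1, 1.0, 5.0):
        print(A0, iterate_activity(A0, n_layers=50, gain=2.0)[-1])
```

With `tanh` and a gain above 1, different starting activities should settle on the same value, illustrating convergence of the activity; the gain value used here is only an example.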

Dynamics of Metric

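$d\tilde{\mathbf{y}} = B\, d\mathbf{y}, \quad B = (B_{kj}), \quad B_{kj} = \varphi'(u_k)\, w_{kj}$

$\tilde{\mathbf{e}}_a = B\, \mathbf{e}_a$

$\tilde{g}_{ab} = \frac{1}{n}\, \tilde{\mathbf{e}}_a \cdot \tilde{\mathbf{e}}_b = \frac{1}{n}\, (B \mathbf{e}_a) \cdot (B \mathbf{e}_b)$

mean field approximation:

$E[\varphi'(u_k)^2\, w_{kj} w_{kj'}] = E[\varphi'(u_k)^2]\, E[w_{kj} w_{kj'}]$

$\chi_1(A) = \int \varphi'(\sqrt{A}\, v)^2\, Dv$

$\tilde{g}_{ab} = \chi_1(A)\, g_{ab}$

conformal transformation!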

rotation, expansion
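The conformal scaling of the metric can be checked on a finite random layer. The following is a rough sketch, assuming `tanh` as the activation, a square layer with weights of variance 1/n, and a handful of tangent directions that are few compared with the layer width (the mean-field factorization is only expected to hold in that regime). The names `chi1`, `tanh_prime`, and `pulled_back_metric` are illustrative.

```python
import numpy as np

def tanh_prime(u):
    """Derivative of tanh."""
    return 1.0 - np.tanh(u) ** 2

def chi1(A, dphi=tanh_prime, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[dphi(sqrt(A) * v)^2] for v ~ N(0, 1)."""
    v = np.random.default_rng(seed).normal(size=n_samples)
    return float(np.mean(dphi(np.sqrt(A) * v) ** 2))

def pulled_back_metric(x, E, dphi=tanh_prime, seed=0):
    """Metric on the tangent vectors (columns of E) before and after one random layer.

    Returns (g_in, g_out) with g = (1/n) * E^T E, where the output tangent
    vectors are B e_a and B_kj = dphi(u_k) * w_kj.
    """
    rng = np.random.default_rng(seed)
    n = x.size
    W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))  # w_kj ~ N(0, 1/n)
    u = W @ x
    BE = dphi(u)[:, None] * (W @ E)
    return E.T @ E / n, BE.T @ BE / n

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 2000
    x = rng.normal(size=n)
    E, _ = np.linalg.qr(rng.normal(size=(n, 3)))  # three orthonormal directions
    g_in, g_out = pulled_back_metric(x, E)
    print("ratio g_out / g_in on the diagonal:", np.diag(g_out) / np.diag(g_in))
    print("mean-field factor:", chi1(float(np.mean(x**2))))
```

For wide layers the diagonal ratios should be close to the mean-field factor and the off-diagonal entries of `g_out` close to zero, up to fluctuations that shrink with the width.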

Dynamics of Curvature

$H_{ab} = \partial_a \partial_b \mathbf{y} = \partial_a \mathbf{e}_b$

$(\tilde{H}_{ab})_k = \varphi''(u_k)\, (\mathbf{w}_k \cdot \mathbf{e}_a)(\mathbf{w}_k \cdot \mathbf{e}_b) + \varphi'(u_k)\, \mathbf{w}_k \cdot H_{ab}$

$\chi_2(A) = \int \varphi''(\sqrt{A}\, v)^2\, Dv$

exponential expansion!

Dynamics of Distance (Amari, 1974)

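$D(\mathbf{x}, \mathbf{x}') = \frac{1}{n} \sum_i (x_i - x_i')^2$

$C(\mathbf{x}, \mathbf{x}') = \frac{1}{n} \sum_i x_i x_i' = \frac{1}{n}\, \mathbf{x} \cdot \mathbf{x}'$

$D = A + A' - 2C$

$u = \sum_k w_k y_k, \quad u' = \sum_k w_k y_k'$

$(u, u') \sim N(0, V), \quad V = \begin{pmatrix} A & C \\ C & A' \end{pmatrix}$

$\tilde{C} = E[\varphi(u)\, \varphi(u')]$

$D_{l+1} = K(D_l)$

$\frac{dD_{l+1}}{dD_l}$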
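The distance recursion can be iterated numerically by tracking the two activities and the overlap together. The following is a minimal sketch, assuming `tanh` as the activation and a hypothetical `gain` factor on the pre-activation covariance; the two-dimensional Gaussian expectation is evaluated with a product Gauss-Hermite rule, and both activities are assumed strictly positive. The names `layer_map` and `distance_trajectory` are illustrative.

```python
import numpy as np

NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(61)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)          # weights now sum to 1
V1, V2 = np.meshgrid(NODES, NODES, indexing="ij")  # independent N(0, 1) pair
W2D = np.outer(WEIGHTS, WEIGHTS)

def layer_map(A, A2, C, phi=np.tanh, gain=1.0):
    """Map (A, A', C) of one layer to the next layer's values.

    (u, u') is zero-mean Gaussian with covariance gain^2 * [[A, C], [C, A']].
    """
    rho = np.clip(C / np.sqrt(A * A2), -1.0, 1.0)  # correlation of u and u'
    u = gain * np.sqrt(A) * V1
    u2 = gain * np.sqrt(A2) * (rho * V1 + np.sqrt(1.0 - rho**2) * V2)
    A_new = float(np.sum(W2D * phi(u) ** 2))
    A2_new = float(np.sum(W2D * phi(u2) ** 2))
    C_new = float(np.sum(W2D * phi(u) * phi(u2)))
    return A_new, A2_new, C_new

def distance_trajectory(A, A2, C, n_layers, phi=np.tanh, gain=1.0):
    """Return [D_0, ..., D_L] with D = A + A' - 2C at every layer."""
    out = [A + A2 - 2.0 * C]
    for _ in range(n_layers):
        A, A2, C = layer_map(A, A2, C, phi, gain)
        out.append(A + A2 - 2.0 * C)
    return out

if __name__ == "__main__":
    near = distance_trajectory(1.0, 1.0, 0.99, n_layers=100, gain=2.0)
    far = distance_trajectory(1.0, 1.0, 0.20, n_layers=100, gain=2.0)
    print("initially close pair:", near[0], "->", near[-1])
    print("initially far pair:  ", far[0], "->", far[-1])
```

In this example setting, pairs that start at very different distances should end at the same distance after many layers, while identical inputs stay at distance zero.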

Poole et al (2016)

Deep neural networks

Problem!

$D_l(\mathbf{x}, \mathbf{x}') \to \bar{D} \quad (l \to \infty), \qquad \bar{D} = K(\bar{D})$

equidistance property

Shuttering

Multiplicity

Dynamics of recurrent net

Dropout and backprop

Multilayer Perceptrons

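$y = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$f(\mathbf{x}, \boldsymbol{\theta}) = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$\mathbf{x} = (x_1, x_2, \ldots, x_n)$

$\boldsymbol{\theta} = (\mathbf{w}_1, \ldots, \mathbf{w}_m; v_1, \ldots, v_m)$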

Multilayer Perceptron

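$y = f(\mathbf{x}, \boldsymbol{\theta}) = \sum_i v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$\boldsymbol{\theta} = (\mathbf{w}_1, \ldots, \mathbf{w}_m; v_1, \ldots, v_m)$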

neuromanifold ( )x

space of functions S

singularities

Geometry of singular model

$y = v\, \varphi(\mathbf{w} \cdot \mathbf{x}) + n$

$v\, |\mathbf{w}| = 0$

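Natural Gradient Stochastic Descent

$\boldsymbol{\theta}_{t+1} = \boldsymbol{\theta}_t - \eta_t\, G^{-1}\, \nabla l(\mathbf{x}_t, y_t; \boldsymbol{\theta}_t)$

$G = E[\nabla l\, \nabla l^{\top}]$: Fisher Information Matrix

invariant; steepest descent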
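The natural gradient update, a gradient step preconditioned by the inverse Fisher information matrix, can be illustrated on a toy problem. The following is a minimal sketch, not the method as used in the talk: it uses a batch (not stochastic) step on a linear model with unit-variance Gaussian noise, for which the Fisher matrix is simply the second-moment matrix of the inputs. The function names and the `damping` parameter are hypothetical.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=1.0, damping=0.0):
    """Return theta - lr * G^{-1} grad, solving a linear system instead of inverting G."""
    G = fisher + damping * np.eye(len(theta))  # damping is an optional regularizer
    return theta - lr * np.linalg.solve(G, grad)

def linear_gaussian_loss_grad(theta, X, y):
    """Gradient of the average negative log-likelihood for y = X theta + N(0, 1) noise."""
    return -X.T @ (y - X @ theta) / len(y)

def linear_gaussian_fisher(X):
    """Fisher information of the same model, averaged over the inputs: mean of x x^T."""
    return X.T @ X / len(X)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 0.1])  # badly scaled inputs
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)
    theta = np.zeros(3)
    theta = natural_gradient_step(
        theta, linear_gaussian_loss_grad(theta, X, y), linear_gaussian_fisher(X)
    )
    print("after one natural gradient step:", theta)
```

For this particular model a single step with `lr=1.0` lands on the least-squares solution regardless of how the inputs are scaled, which is one way to see the invariance of the update.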

 


model: 2 hidden neurons

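$f(\mathbf{x}, \boldsymbol{\theta}) = w_1\, \varphi(\mathbf{J}_1 \cdot \mathbf{x}) + w_2\, \varphi(\mathbf{J}_2 \cdot \mathbf{x})$

$y = f(\mathbf{x}, \boldsymbol{\theta}) + \varepsilon$

$\varphi(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-t^2/2}\, dt$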

Singular Region in Parameter Space

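$R(w, \mathbf{J}) = \{ \mathbf{J}_1 = \mathbf{J}_2 = \mathbf{J},\ w_1 + w_2 = w \}$

$\quad \cup\ \{ w_1 = 0,\ \mathbf{J}_2 = \mathbf{J},\ w_2 = w \}$

$\quad \cup\ \{ w_1 = w,\ w_2 = 0,\ \mathbf{J}_1 = \mathbf{J} \}$

$f(\mathbf{x}, \boldsymbol{\theta}) = w_1\, \varphi(\mathbf{J}_1 \cdot \mathbf{x}) + w_2\, \varphi(\mathbf{J}_2 \cdot \mathbf{x})$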

Coordinate transformation

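$\mathbf{v} = \frac{w_1 \mathbf{J}_1 + w_2 \mathbf{J}_2}{w_1 + w_2}$

$w = w_1 + w_2$

$z = \frac{w_1 - w_2}{w_1 + w_2}$

$\mathbf{u} = \mathbf{J}_2 - \mathbf{J}_1$

$(w_1, w_2, \mathbf{J}_1, \mathbf{J}_2) \to (w, z, \mathbf{v}, \mathbf{u})$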
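The change of coordinates for the two-hidden-neuron model can be written out and checked numerically. The following is a sketch under stated assumptions: the activation is taken to be the standard normal CDF, and the sign convention z = (w1 - w2) / (w1 + w2) is an assumption (the opposite sign would only swap the roles of z = 1 and z = -1). The function names `to_new` and `from_new` are illustrative.

```python
import math
import numpy as np

def phi(u):
    """Standard normal CDF, used here as the sigmoidal activation (an assumption)."""
    return 0.5 * (1.0 + math.erf(float(u) / math.sqrt(2.0)))

def f(x, w1, w2, J1, J2):
    """Two-hidden-neuron model: w1 * phi(J1 . x) + w2 * phi(J2 . x)."""
    return w1 * phi(np.dot(J1, x)) + w2 * phi(np.dot(J2, x))

def to_new(w1, w2, J1, J2):
    """(w1, w2, J1, J2) -> (w, z, v, u); requires w1 + w2 != 0."""
    w = w1 + w2
    z = (w1 - w2) / w
    v = (w1 * J1 + w2 * J2) / w
    u = J2 - J1
    return w, z, v, u

def from_new(w, z, v, u):
    """Inverse map (w, z, v, u) -> (w1, w2, J1, J2)."""
    w1 = 0.5 * w * (1.0 + z)
    w2 = 0.5 * w * (1.0 - z)
    J1 = v - (w2 / w) * u
    J2 = v + (w1 / w) * u
    return w1, w2, J1, J2

if __name__ == "__main__":
    x = np.array([0.3, -1.2])
    v = np.array([0.5, 0.8])
    # On u = 0 the output does not depend on z.
    for z in (-0.7, 0.0, 0.3):
        print("u = 0, z =", z, "->", f(x, *from_new(2.0, z, v, np.zeros(2))))
    # On z = 1 the output does not depend on u.
    for u in (np.zeros(2), np.array([1.0, -2.0])):
        print("z = 1, u =", u, "->", f(x, *from_new(2.0, 1.0, v, u)))
```

The printed values within each group should coincide, showing that along u = 0 and along z = ±1 different parameters realize the same function.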

Singular Region $R(w, \mathbf{J})$: $\mathbf{u} = 0$ or $z = \pm 1$

Milnor attractor

Topology of singular R

2 21

2 32

blow-down coordinates , ,

1 ,

1 ,

, 1n

c z u u

c z z u

S

: = e

u

ue eu

Dynamic vector fields: Redundant case
