Deep Learning and Physics -- 2019: Deep Random Neural Field. Shun-ichi Amari, RIKEN Center for Brain Science; Araya
Page 1: Deep Random Neural Fieldkabuto.phys.sci.osaka-u.ac.jp/~koji/workshop/DLAP2019/...Statistical Neurodynamics Rozonoer (1969 ) Amari (1969, 1971; 1973) Sompolinski Amari et al (2013)

Deep Learning and Physics -- 2019

Deep Random Neural Field

Shun-ichi Amari

RIKEN Center for Brain Science; Araya

Page 2:

Brief History of AI and NN

First Boom: start (1956~)

AI: Dartmouth Conference; symbol, logic

Neural networks: perceptron; universal computation, learning machine

Dark period (late 1960s to 1970s): stochastic gradient descent learning (1967) for MLP

Page 3:

Perceptron

F. Rosenblatt, Principles of Neurodynamics, 1961

McCulloch-Pitts neuron: 0, 1 binary; learning

Multilayer; lateral & feedback connections

[Figure: perceptron mapping input x to output z]

Page 4:

Deep Neural Networks

Rosenblatt: multilayer perceptron

$z = f(\mathbf{x}, W)$

Differentiable: analog neuron

$L(\mathbf{x}, W) = \left(y - f(\mathbf{x}, W)\right)^2$

$\mathbf{w} \to \mathbf{w} + \Delta\mathbf{w}, \quad \Delta\mathbf{w} = -c\, \frac{\partial L(\mathbf{x}, W)}{\partial \mathbf{w}}$

Learning of hidden neurons: analog neuron

Stochastic gradient learning: Amari, Tsypkin, 1966~67; error back-prop, 1976
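The gradient rule on this slide can be sketched in a few lines. What follows is only a minimal illustration under assumptions that are not in the talk: a two-layer network with tanh hidden units and a linear output, a toy target function, a step size `c = 0.02`, and hypothetical function names (`forward`, `sgd_step`, `train`).

```python
import math
import random

def forward(x, W, v):
    """Hidden activities h = tanh(W x) and scalar output z = v . h."""
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    return h, sum(vi * hi for vi, hi in zip(v, h))

def sgd_step(x, y, W, v, c):
    """One step w -> w - c * dL/dw for L = (y - f(x, W))^2 on a single sample."""
    h, z = forward(x, W, v)
    g = 2.0 * (z - y)                          # dL/dz
    for i, row in enumerate(W):
        back = g * v[i] * (1.0 - h[i] ** 2)    # error sent back to hidden unit i
        v[i] -= c * g * h[i]                   # output weight update
        for j in range(len(row)):
            row[j] -= c * back * x[j]          # hidden weight update

def mean_loss(data, W, v):
    return sum((y - forward(x, W, v)[1]) ** 2 for x, y in data) / len(data)

def train(seed=0, hidden=8, epochs=50, c=0.02):
    rng = random.Random(seed)
    # Assumed toy task: y = sin(x1 - x2) on the square [-1, 1]^2.
    data = []
    for _ in range(100):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, math.sin(x[0] - x[1])))
    W = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(hidden)]
    v = [rng.gauss(0, 1 / math.sqrt(hidden)) for _ in range(hidden)]
    before = mean_loss(data, W, v)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            sgd_step(x, y, W, v, c)
    return before, mean_loss(data, W, v)

if __name__ == "__main__":
    print(train())   # (loss before training, loss after training)
```

The point of the sketch is that differentiable (analog) hidden units are what allow the output error to be sent back to the hidden weights; with binary McCulloch-Pitts units the factor `1 - h**2` would have no counterpart.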

Page 5:

Information Theory II--Geometrical Theory of Information

Shun-ichi Amari, University of Tokyo

Kyoritsu Press, Tokyo, 1968

First stochastic descent learning of MLP (1967;1968)

Page 6:

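$f(\mathbf{x}, \boldsymbol{\theta}) = v_1 \max\{\mathbf{w}_1 \cdot \mathbf{x},\ \mathbf{w}_2 \cdot \mathbf{x}\} + v_2 \min\{\mathbf{w}_3 \cdot \mathbf{x},\ \mathbf{w}_4 \cdot \mathbf{x}\}$

[Figure: network diagram with input x, hidden weights w1 to w4, max units, and output weights v1, v2 producing y]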

Page 7:

Page 8:

Second Boom

AI (1970~): expert system (MYCIN); stochastic inference (Bayes); chess (1997)

Neural networks (1980~): MLP (backprop); associative memory

Page 9:

Third Boom (2010~)

Deep learning

Stochastic inference (graphical model; Bayesian; WATSON)

Deep learning: pattern recognition (vision, auditory), sentence analysis, machine translation, AlphaGo

Language processing; sequence and dynamics (word2vec, deep learning with rec. net)

Integration of (symbol, logic) vs (pattern, dynamics)

Page 10:

Deep Learning

Self-Organization + Supervised Learning

RBM: Restricted Boltzmann Machine

Auto-Encoder, Recurrent Net

Dropout; Contrastive divergence; Convolution; ResNet; ReLU; Adversarial net

Page 11:

Victory of Deep Neural Networks

Hinton 2005, 2006 ~ 2012

many others

visual pattern, auditory pattern

Go game

sentence analysis, machine translation

adversarial network, pattern generation

Page 12:

Mathematical Neuroscience searches for the principles

Mathematical studies using simple idealistic models (not realistic)

Computational neuroscience

AI: technological realization

Page 13:

Mathematical Neuroscience and Brain

Brain has found and implemented the principles

through evolution (random search)

historical restriction

material restriction

Very complex (not smartly designed)

Page 14:

Theoretical Problems on Learning: 1

Local solution and global solution

Simulated annealing

Quantum annealing

[Figure: loss $L(\Theta)$ plotted against $\Theta$]

Page 15:

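Theoretical Problem of Learning: 2

Training loss and generalization loss: overtraining

$y = f(x, \theta) + \varepsilon$

$L_{\mathrm{emp}} = \frac{1}{N} \sum_i |y_i - f(x_i, \theta)|^2$

$L_{\mathrm{gen}} = \mathrm{E}\left[|y - f(x, \theta)|^2\right]$

$L_{\mathrm{gen}} \approx L_{\mathrm{emp}} + \frac{P}{N}$

[Figure: generalization loss and training loss curves]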

Page 16:

Extremely wide network

$P \to \infty$: $P \gg N$

Local minimum = global minimum

Kawaguchi, 2019

Page 17:

Learning curve P>>N

Double descent

Belkin et al. 2019; Hastie et al. 2019

Page 18:

Random Neural Network

Random is excellent !! Random is magic!!

Statistical dynamics

Random code

Page 19:

Random Deep Networks

Poole et al., 2016

Schoenholz et al., 2017~

Signal propagation

Error back propagation

Page 20:

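Jacot et al.: Neural tangent kernel

$y = f(x, \theta); \quad l(x, \theta) = \frac{1}{2}\left(y - f(x, \theta)\right)^2$

$l(x, \theta) = \frac{1}{2}\{e(x, \theta)\}^2; \quad e(x, \theta) = f(x, \theta) - f(x, \theta^*)$

$\partial_t \theta_t = -\eta\, \partial_\theta l = -\eta\, \partial_\theta f(x')\left(f(x', \theta) - f(x', \theta^*)\right)$

$\partial_t f(x, \theta_t) = \partial_\theta f(x) \cdot \partial_t \theta_t = -\eta\, \partial_\theta f(x) \cdot \partial_\theta f(x')\, e(x', \theta)$

K: Gaussian kernel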

Page 21:

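$K(x, x'; \theta) = \partial_\theta f(x) \cdot \partial_\theta f(x')$

$\partial_t f(x, \theta) = -\eta \left\langle K(x, x'; \theta)\, e(x', \theta) \right\rangle$

$K(x, x'; \theta_t) \approx K(x, x')$: Gaussian kernel

$\theta_t \approx \theta_{\mathrm{ini}}$ (initial)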
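The kernel identity above can be checked numerically on a small network. The sketch below is an illustration under stated assumptions, not the construction used in the talk: it assumes a two-layer model $f(x, \theta) = m^{-1/2} \sum_i v_i \tanh(\mathbf{w}_i \cdot x)$, and the names `make_net`, `grad`, `tangent_kernel` and `gradient_step` are hypothetical. It computes $K(x, x'; \theta) = \partial_\theta f(x) \cdot \partial_\theta f(x')$ and compares the change of $f$ after one small gradient step with the first-order prediction $-\eta \sum_{x'} K(x, x')\, e(x')$.

```python
import math
import random

def make_net(m, d, seed=0):
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    v = [rng.gauss(0, 1) for _ in range(m)]
    return W, v

def f(x, W, v):
    """f(x, theta) = m^{-1/2} * sum_i v_i tanh(w_i . x)."""
    return sum(vi * math.tanh(sum(w * xj for w, xj in zip(row, x)))
               for row, vi in zip(W, v)) / math.sqrt(len(v))

def grad(x, W, v):
    """Flat gradient of f with respect to theta = (W, v)."""
    s = 1.0 / math.sqrt(len(v))
    hs = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    g = []
    for h, vi in zip(hs, v):
        g.extend(s * vi * (1.0 - h * h) * xj for xj in x)   # df/dw_ij
    g.extend(s * h for h in hs)                             # df/dv_i
    return g

def tangent_kernel(xs, W, v):
    """K(x, x'; theta) = grad f(x) . grad f(x') for every pair in xs."""
    gs = [grad(x, W, v) for x in xs]
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in gs] for gi in gs]

def gradient_step(xs, ys, W, v, eta):
    """theta <- theta - eta * sum_x' e(x') grad f(x'), with e = f - y."""
    errs = [f(x, W, v) - y for x, y in zip(xs, ys)]
    gs = [grad(x, W, v) for x in xs]
    m, d = len(v), len(xs[0])
    total = [sum(e * g[k] for e, g in zip(errs, gs)) for k in range(len(gs[0]))]
    new_W = [[W[i][j] - eta * total[i * d + j] for j in range(d)] for i in range(m)]
    new_v = [v[i] - eta * total[m * d + i] for i in range(m)]
    return new_W, new_v, errs

if __name__ == "__main__":
    xs, ys = [[0.5, -1.0], [1.0, 0.3], [-0.7, 0.8]], [1.0, -1.0, 0.5]
    W, v = make_net(16, 2)
    K = tangent_kernel(xs, W, v)
    new_W, new_v, errs = gradient_step(xs, ys, W, v, 1e-5)
    for a, x in enumerate(xs):
        predicted = -1e-5 * sum(K[a][b] * errs[b] for b in range(len(xs)))
        print(f(x, new_W, new_v) - f(x, W, v), predicted)
```

For a finite width the kernel depends on $\theta$; the slide's approximation $K(x, x'; \theta_t) \approx K(x, x')$ concerns the wide limit, which this small example does not test.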

Page 22:

Theorem ($P \gg N$): The optimal solution lies near a random network.

Bailey et al., 2019

$w_{ij} = O\left(\frac{1}{\sqrt{n}}\right)$ (random)

$\Delta w_{ij} = O\left(\frac{1}{n}\right)$

Page 23:

Random Neural Field

$u^{l}(z') = \int w(z', z)\, x^{l-1}(z)\, dz + b(z')$

$x^{l}(z') = \varphi\left(u^{l}(z')\right)$

$w(z', z)$: random (zero-mean Gaussian; correlated)

Page 24:

Statistical Neurodynamics

microdynamics

$\mathbf{x}(t+1) = T_W\, \mathbf{x}(t) = \mathrm{sgn}\left(W \mathbf{x}(t)\right)$

macrodynamics

$X_{t+1} = F(X_t)$

$X_2 = X(\mathbf{x}_2) = X(T_W \mathbf{x}_1)$

$X_3 = X(\mathbf{x}_3) = X(T_W T_W \mathbf{x}_1)$ ?

$X$: macrostate

Page 25:

Statistical Neurodynamics

Rozonoer (1969)

Amari (1969, 1971; 1973)

Sompolinsky

Amari et al. (2013)

Toyoizumi et al. (2015)

Poole, …, Ganguli (2016)

Schoenholz et al. (2017)

Yang & Schoenholz (2017)

Karakida et al. (2019)

Jacot et al. (2019) ……

$w_{ij} \sim N(0, 1)$

Macroscopic behaviors common to almost all (typical) networks

Page 26:

Random Deep Networks

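$x_i^{l+1} = \varphi\left(\sum_j w_{ij}^{l}\, x_j^{l} + w_{i0}^{l}\right)$

$A^{l} = \frac{1}{n_l} \sum_i \left(x_i^{l}\right)^2$

$A^{l+1} = F(A^{l})$

$w_{ij}^{l} \sim N(0, \sigma^2 / n_l)$

$w_{i0}^{l} \sim N(0, \sigma_b^2)$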

Page 27:

Macroscopic variables

activity: $A = \frac{1}{n} \sum_i x_i^2$

distance: $D = D[\mathbf{x} : \mathbf{x}']$

$A^{l+1} = F(A^{l})$

$D^{l+1} = K(D^{l})$

metric, curvature & Fisher information

Page 28:

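Dynamics of Activity: law of large numbers

$\tilde{x}_i = \varphi\left(\sum_k w_{ik} x_k + b_i\right) = \varphi(u_i)$ : $\tilde{\mathbf{x}} = \varphi(W \mathbf{x} + \mathbf{b})$

$u_i \sim N(0, A)$

$\tilde{A} = \frac{1}{n} \sum_i \varphi(u_i)^2 = \mathrm{E}\left[\varphi(u_i)^2\right] = \chi_0(A)$

$\chi_0(A) = \int \varphi\left(\sqrt{A}\, v\right)^2 Dv, \quad v \sim N(0, 1)$

$A^{l+1} = \chi_0(A^{l})$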
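The activity recursion can be iterated numerically and compared with one sampled network. The sketch below is a hedged illustration: it assumes $\varphi = \tanh$, takes the pre-activation variance to be $\sigma_w^2 A + \sigma_b^2$ (which follows from the weight and bias statistics of the random deep network; the shorthand $u_i \sim N(0, A)$ is the case $\sigma_w = 1$, $\sigma_b = 0$), and evaluates the Gaussian integral with a plain trapezoidal rule. The function names and the parameter values $\sigma_w = 2$, $\sigma_b = 0.3$ are assumptions made for the example.

```python
import math
import random

def gauss_expect(g, half_width=8.0, steps=400):
    """E[g(v)] for v ~ N(0, 1), by the trapezoidal rule on [-8, 8]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        v = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(v) * math.exp(-0.5 * v * v)
    return total * h / math.sqrt(2.0 * math.pi)

def activity_map(A, sigma_w, sigma_b, phi=math.tanh):
    """Mean-field map A -> E[phi(u)^2], u ~ N(0, sigma_w^2 A + sigma_b^2)."""
    tau = math.sqrt(sigma_w ** 2 * A + sigma_b ** 2)
    return gauss_expect(lambda v: phi(tau * v) ** 2)

def iterate_activity(A0, sigma_w, sigma_b, layers):
    """Activities A^0, A^1, ..., A^layers predicted by the mean-field map."""
    history = [A0]
    for _ in range(layers):
        history.append(activity_map(history[-1], sigma_w, sigma_b))
    return history

def simulate_activity(x, sigma_w, sigma_b, layers, seed=0):
    """Activities (1/n) sum_i x_i^2 in one sampled tanh network of constant width."""
    rng = random.Random(seed)
    n = len(x)
    scale = sigma_w / math.sqrt(n)
    out = [sum(xi * xi for xi in x) / n]
    for _ in range(layers):
        x = [math.tanh(sum(rng.gauss(0, scale) * xj for xj in x) + rng.gauss(0, sigma_b))
             for _ in range(n)]
        out.append(sum(xi * xi for xi in x) / n)
    return out

if __name__ == "__main__":
    print(iterate_activity(1.0, 2.0, 0.3, 5))
    print(simulate_activity([1.0] * 200, 2.0, 0.3, 5))
```

The two printed sequences should agree up to fluctuations that shrink as the width grows, which is the law-of-large-numbers statement in the slide title.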

Page 29:

$\chi_0'(0) > 1$

$A = \chi_0(A)$

$\sum_i x_i^2$ converges

Page 30:

Pullback Metric & Curvature

$ds^2 = \sum g_{ij}\, dx^i dx^j = \frac{1}{n_l}\, d\mathbf{x}^{l} \cdot d\mathbf{x}^{l}$

$\mathbf{x}^{l} = \varphi\left(W \mathbf{x}^{l-1}\right)$

Page 31:

Basis vectors

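$d\mathbf{x}^{l} = \varphi'(\mathbf{u}^{l})\, W^{l}\, d\mathbf{x}^{l-1} = B^{l}\, d\mathbf{x}^{l-1}$

$B^{l} = \left(B^{i_l}_{i_{l-1}}\right), \quad B^{i_l}_{i_{l-1}} = \varphi'\left(u^{l}_{i_l}\right) W^{i_l}_{i_{l-1}}$ : Jacobian

$d\mathbf{x}^{l} = B^{l}\, d\mathbf{x}^{l-1} = B^{l} B^{l-1} \cdots B^{m+1}\, d\mathbf{x}^{m}$

$\mathbf{e}^{l}_a = B^{l}\, \mathbf{e}^{l-1}_a = B^{l} B^{l-1} \cdots B^{m+1}\, \mathbf{e}^{m}_a$

$\mathbf{x}^{l} = \varphi\left(W \mathbf{x}^{l-1}\right)$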

Page 32:

$g_{ab} = \frac{1}{n_l}\, \mathbf{e}_a \cdot \mathbf{e}_b$

Page 33:

Dynamics of Metric

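$d\tilde{x}^a = \sum_k B^a_k\, dx^k$

$B^a_k = \varphi'(u_a)\, w_{ak}$

$g_{kj} = \sum_{a, b} B^a_k B^b_j\, \tilde{g}_{ab}$

$\mathrm{E}\left[\varphi'(u_a)^2\, w_{ak} w_{aj}\right] = \mathrm{E}\left[\varphi'(u_a)^2\right] \mathrm{E}\left[w_{ak} w_{aj}\right]$ : mean field approximation

$\chi_1(A) = \int \left(\varphi'\left(\sqrt{A}\, v\right)\right)^2 Dv$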

Page 34:

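Metric

$ds^2 = g^{l}_{ab}\, dx^a dx^b$

$g^{l}_{ab} = \frac{1}{n_l}\, \mathbf{e}^{l}_a \cdot \mathbf{e}^{l}_b = \frac{1}{n_l} \left(B^{l} \mathbf{e}^{l-1}_a\right) \cdot \left(B^{l} \mathbf{e}^{l-1}_b\right)$

$\sum_{i_l} B^{i_l}_{i_{l-1}} B^{i_l}_{i'_{l-1}} = \sum_{i_l} w_{i_l i_{l-1}}\, w_{i_l i'_{l-1}}\, \varphi'\left(u_{i_l}\right)^2 \approx \sigma_l^2\, \mathrm{E}\left[\varphi'^2\right] \delta_{i_{l-1} i'_{l-1}}$

$\chi_1 = \sigma_l^2\, \mathrm{E}\left[\varphi'(u_i)^2\right]$

Law of large numbers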

Page 35:

$g^{l}_{ij}(x) = \left(\prod \chi_1(x)\right) g_{ij}(x)$

conformal geometry

Page 36:

$\tilde{g}_{ij}(x) = \chi_1(A)\, g_{ij}(x)$ : conformal transformation!

$g_{ij} = \delta_{ij} \;\Rightarrow\; g^{l}_{ij} = (\chi_1)^{l}\, \delta_{ij}$

rotation, expansion

Page 37:

Domino Theorem

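$B^{l}_{m} = \frac{\partial \mathbf{x}^{l}}{\partial \mathbf{x}^{m}} = \frac{\partial \mathbf{x}^{l}}{\partial \mathbf{x}^{l-1}} \cdots \frac{\partial \mathbf{x}^{m+1}}{\partial \mathbf{x}^{m}} = B^{l} B^{l-1} \cdots B^{m+1}$

$\sum_{i_L, i'_L} \delta_{i_L i'_L}\, B^{i_L}_{i_{L-1}} B^{i'_L}_{i'_{L-1}} = \chi_1\, \delta_{i_{L-1} i'_{L-1}}$

$\sum \delta_{i_L i'_L}\, B^{i_L}_{i_{L-1}} B^{i_{L-1}}_{i_{L-2}} \cdots\; B^{i'_L}_{i'_{L-1}} B^{i'_{L-1}}_{i'_{L-2}} \cdots = \chi_1 \chi_1 \cdots \chi_1\, \delta$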

Page 38:

Dynamics of Curvature

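$H_{ab} = \nabla_a \mathbf{e}_b = \partial_a \partial_b\, \mathbf{x}$

$\tilde{H}^{i}_{ab} = \varphi''(u_i) \left(\mathbf{w}_i \cdot \mathbf{e}_a\right) \left(\mathbf{w}_i \cdot \mathbf{e}_b\right) + \varphi'(u_i)\, \mathbf{w}_i \cdot \partial_a \mathbf{e}_b$

$H_{ab} = H^{\perp}_{ab} + H^{\parallel}_{ab}$

$|H_{ab}|^2$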

Page 39:

$\chi_2(A) = \int \left(\varphi''\left(\sqrt{A}\, v\right)\right)^2 Dv$

$\left|H^{l+1}_{ab}\right|^2 = \frac{1}{n}\, \chi_2(A)\, (2\delta_{ab} + 1) + \chi_1(A) \left|H^{l}_{ab}\right|^2$

$\chi_1 > 1$: exponential expansion! creation is small!

Page 40:

Poole et al. (2016)

Deep neural networks

Page 41:

Distance

$D[\mathbf{x}, \mathbf{y}] = \frac{1}{n} \sum_i (x_i - y_i)^2$

Page 42:

Dynamics of Distance (Amari, 1974)

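$D(\mathbf{x}, \mathbf{x}') = \frac{1}{n} \sum_i (x_i - x_i')^2$

$C(\mathbf{x}, \mathbf{x}') = \frac{1}{n}\, \mathbf{x} \cdot \mathbf{x}' = \frac{1}{n} \sum_i x_i x_i'$

$D = A + A' - 2C$

$u_i = \sum_k w_{ik}\, y_k, \quad u_i' = \sum_k w_{ik}\, y_k'$

$(u_i, u_i') \sim N(0, V), \quad V = \begin{pmatrix} A & C \\ C & A' \end{pmatrix}$

$\tilde{C} = \mathrm{E}\left[\varphi\left(\sqrt{A - C}\, \varepsilon + \sqrt{C}\, \nu\right) \varphi\left(\sqrt{A' - C}\, \varepsilon' + \sqrt{C}\, \nu\right)\right]$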

Page 43:

$D^{l+1} = K(D^{l})$

$\frac{d D^{l+1}}{d D^{l}} = \chi_1 > 1$
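The distance map can be iterated numerically to see how two inputs separate and then settle at a common distance. The sketch below is an illustration under assumptions that are not in the slides: $\varphi = \tanh$, both inputs already at the fixed-point activity ($A = A'$), pre-activation variance $\sigma_w^2 A + \sigma_b^2$ and covariance $\sigma_w^2 C + \sigma_b^2$, the example values $\sigma_w = 2.5$, $\sigma_b = 0.3$, and a trapezoidal quadrature on a fixed grid. All function names are hypothetical.

```python
import math

def _grid(half_width=6.0, steps=80):
    """Nodes z and weights p approximating integration against N(0, 1)."""
    h = 2.0 * half_width / steps
    pts = []
    for k in range(steps + 1):
        z = -half_width + k * h
        w = (0.5 if k in (0, steps) else 1.0) * h
        pts.append((z, w * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)))
    return pts

GRID = _grid()

def activity_map(A, sigma_w, sigma_b):
    """A -> E[tanh(u)^2] with u ~ N(0, sigma_w^2 A + sigma_b^2)."""
    tau = math.sqrt(sigma_w ** 2 * A + sigma_b ** 2)
    return sum(p * math.tanh(tau * z) ** 2 for z, p in GRID)

def chi1(A, sigma_w, sigma_b):
    """chi_1 = sigma_w^2 E[tanh'(u)^2]: growth factor of small distances."""
    tau = math.sqrt(sigma_w ** 2 * A + sigma_b ** 2)
    return sigma_w ** 2 * sum(p * (1.0 - math.tanh(tau * z) ** 2) ** 2 for z, p in GRID)

def overlap_map(C, A, sigma_w, sigma_b):
    """C -> E[tanh(u) tanh(u')] for jointly Gaussian (u, u') with equal variance."""
    var = sigma_w ** 2 * A + sigma_b ** 2
    rho = max(-1.0, min(1.0, (sigma_w ** 2 * C + sigma_b ** 2) / var))
    tau, s = math.sqrt(var), math.sqrt(1.0 - rho * rho)
    total = 0.0
    for z1, p1 in GRID:
        inner = sum(p2 * math.tanh(tau * (rho * z1 + s * z2)) for z2, p2 in GRID)
        total += p1 * math.tanh(tau * z1) * inner
    return total

def distance_fixed_point(D0, sigma_w, sigma_b, layers=120):
    """Iterate D = 2A - 2C through the layers, starting from distance D0."""
    A = 0.5
    for _ in range(200):                 # settle the activity first
        A = activity_map(A, sigma_w, sigma_b)
    C = A - 0.5 * D0                     # from D = A + A' - 2C with A = A'
    for _ in range(layers):
        C = overlap_map(C, A, sigma_w, sigma_b)
    return 2.0 * A - 2.0 * C, A

if __name__ == "__main__":
    for D0 in (0.05, 1.0):
        print(D0, distance_fixed_point(D0, 2.5, 0.3))
```

With these assumed parameters $\chi_1 > 1$, so a small initial distance grows, and both starting distances end at the same limiting value.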

Page 44:

Problem!

$D^{l}(\mathbf{x}, \mathbf{x}') \to \bar{D}$ : $(l \to \infty)$

$\bar{D} = K(\bar{D})$

equi-distance property

Page 45:

dynamics of distance

$\lim_{n \to \infty} D^{L}(x, y)$

$\lim_{n \to \infty} \lim_{L \to \infty} D^{L}(x, y) \neq \lim_{L \to \infty} \lim_{n \to \infty} D^{L}(x, y)$

Page 46:

Feedback Path

Error backprop

Fisher Information

Page 47:

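Stochastic model: parameter space

manifold of probability distributions

$y = \varphi\left(W \varphi(W \mathbf{x}\, \ldots)\ldots\right) + \varepsilon; \quad \varepsilon \sim N(0, 1)$

$p(y, x; W) = c \exp\left\{-\frac{1}{2}\left(y - \varphi(x; W)\right)^2\right\} q(x)$

$G = \mathrm{E}_x\left[\nabla_W \log p(y, x; W)\; \nabla_W \log p(y, x; W)\right]$

$ds^2 = dW^{\top} G\, dW$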

Page 48:

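Learning: stochastic gradient descent

Steepest direction: natural gradient

$\nabla l = \left(\frac{\partial l}{\partial \theta_1}, \ldots, \frac{\partial l}{\partial \theta_n}\right)$

$\tilde{\nabla} l = G^{-1} \nabla l$

[Figure: loss $l(\boldsymbol{\theta})$ with a point $\boldsymbol{\theta}$ and a step $d\boldsymbol{\theta}$]

$\Delta \boldsymbol{\theta}_t = -\eta_t\, \tilde{\nabla} l(x_t, y_t; \boldsymbol{\theta}_t)$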

Page 49:

Natural Gradient

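$\max_{d\boldsymbol{\theta}}\; dl = l(\boldsymbol{\theta} + d\boldsymbol{\theta}) - l(\boldsymbol{\theta})$

$|d\boldsymbol{\theta}|^2 = \mathrm{KL}\left[p(x, \boldsymbol{\theta}) : p(x, \boldsymbol{\theta} + d\boldsymbol{\theta})\right] = \varepsilon$

$\tilde{\nabla} l = G^{-1} \nabla l$

$\Delta \boldsymbol{\theta}_t = -\eta_t\, \tilde{\nabla} l(x_t, y_t; \boldsymbol{\theta}_t)$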
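A two-parameter example makes the difference between the ordinary and the natural gradient concrete. The sketch below rests on a simplifying assumption that is not in the slides: a quadratic loss $l(\boldsymbol{\theta}) = \frac{1}{2} (\boldsymbol{\theta} - \boldsymbol{\theta}^*)^{\top} G\, (\boldsymbol{\theta} - \boldsymbol{\theta}^*)$ whose Hessian equals the metric $G$, as in linear regression with Gaussian noise. The matrix entries and the helper names are illustrative only.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def loss_gradient(theta, theta_star, G):
    """Gradient of l(theta) = 0.5 (theta - theta*)^T G (theta - theta*)."""
    return mat_vec(G, [t - s for t, s in zip(theta, theta_star)])

def gradient_step(theta, grad, eta):
    """Ordinary gradient step: theta <- theta - eta * grad."""
    return [t - eta * g for t, g in zip(theta, grad)]

def natural_gradient_step(theta, grad, G, eta):
    """Natural gradient step: theta <- theta - eta * G^{-1} grad."""
    return gradient_step(theta, mat_vec(inverse_2x2(G), grad), eta)

if __name__ == "__main__":
    G = [[5.0, 2.0], [2.0, 1.0]]          # assumed ill-conditioned metric
    theta, theta_star = [1.0, -1.0], [0.0, 0.0]
    grad = loss_gradient(theta, theta_star, G)
    print(gradient_step(theta, grad, 0.1))             # still far from theta*
    print(natural_gradient_step(theta, grad, G, 1.0))  # lands on theta*
```

In this special case the natural gradient direction points straight at the optimum, whereas the ordinary gradient is bent by the anisotropy of $G$; for a general network $G$ depends on $\boldsymbol{\theta}$ and the step is only locally the steepest direction.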

Page 50:

Information Geometry of MLP

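Natural Gradient Learning: S. Amari; H.Y. Park

$\Delta \boldsymbol{\theta} = -\eta\, G^{-1} \frac{\partial l}{\partial \boldsymbol{\theta}}$

$G^{-1}_{t+1} = (1 + \varepsilon)\, G^{-1}_{t} - \varepsilon\, G^{-1}_{t}\, \nabla f\, (\nabla f)^{\top} G^{-1}_{t}$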
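The recursion for $G^{-1}_t$ avoids inverting $G$ explicitly. The sketch below illustrates the update under a loudly stated simplification: the sample outer product $\nabla f\, (\nabla f)^{\top}$ is replaced by a fixed expected matrix `G`, so the iteration becomes deterministic and its limit can be compared with the exact inverse. The matrix, the step $\varepsilon = 0.01$ and the function names are assumptions for the example.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def update_inverse(G_inv, outer, eps):
    """G_inv <- (1 + eps) G_inv - eps G_inv (grad f grad f^T) G_inv."""
    correction = mat_mul(mat_mul(G_inv, outer), G_inv)
    n = len(G_inv)
    return [[(1.0 + eps) * G_inv[i][j] - eps * correction[i][j] for j in range(n)]
            for i in range(n)]

def adaptive_inverse(G, eps=0.01, steps=5000):
    """Run the recursion from the identity, using G in place of sampled outer products."""
    n = len(G)
    G_inv = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        G_inv = update_inverse(G_inv, G, eps)
    return G_inv

if __name__ == "__main__":
    G = [[5.0, 2.0], [2.0, 1.0]]     # exact inverse is [[1, -2], [-2, 5]]
    print(adaptive_inverse(G))
```

At a fixed point the update requires $G^{-1}_t\, G\, G^{-1}_t = G^{-1}_t$, which is satisfied by the true inverse; with sampled gradients the estimate fluctuates around that value at a scale set by $\varepsilon$.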

Page 51:

Fisher Information

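$G = \mathrm{E}_{\mathbf{x}}\left[\frac{\partial \varphi}{\partial W}\, \frac{\partial \varphi}{\partial W}\right]$

$\frac{\partial \mathbf{x}^{l}}{\partial \mathbf{w}^{m}} = \frac{\partial \mathbf{x}^{l}}{\partial \mathbf{x}^{m}}\, \frac{\partial \mathbf{x}^{m}}{\partial \mathbf{w}^{m}} = B^{l} \cdots B^{m+1}\, \varphi'(u^{m})\, \mathbf{x}^{m-1}$

$G(W^{l}, W^{l}) = \chi\, \mathrm{E}_{\mathbf{x}}\left[\varphi'^2\, \mathbf{x} \mathbf{x}^{\top}\right] + O_p\left(1/\sqrt{n}\right)$

$G(W^{l}, W^{m}) = 0 \sim O_p\left(1/\sqrt{n}\right), \quad l \neq m$

$G(\mathbf{w}_i, \mathbf{w}_j) = 0 \sim O_p\left(1/\sqrt{n}\right), \quad i \neq j$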

Page 52:

Unitwise natural gradient

$\Delta W = -\eta\, G^{-1} \nabla_W l$

Y. Ollivier; Marceau-Caron

Page 53:

Good news and bad news

$G^*$: unitwise-diagonal matrix

$G \to G^*$ : $n \to \infty$

$G^{-1} \to G^{*-1}$ : $n \to \infty$

Page 54:

Karakida theory

eigenvalues of G

( )21 , 1i i On

λ λ= =∑ ∑

distorted Riemannian metric2G

Page 55:

References:

Poole, …, Ganguli (2016); Schoenholz et al. (2017); Yang & Schoenholz (2017), ……

S. Amari, R. Karakida & M. Oizumi, Statistical neurodynamics of deep networks: Geometry of signal spaces. arXiv:1808.07169v1, 2018.

S. Amari, R. Karakida & M. Oizumi, Fisher information and natural gradient learning of random deep networks. arXiv:1808.07172v1, 2018 (AISTATS-19).

R. Karakida, S. Akaho & S. Amari, Universal statistics of Fisher information in deep neural networks: Mean field approach. arXiv:1806.01316, 2018 (AISTATS-19).

