SurReal: Fréchet Mean and Distance Transform for Complex-Valued Deep Learning
Rudrasis Chakraborty, Jiayun Wang, Stella X. Yu
UC Berkeley / ICSI
Overview
1 New complex-valued deep learning theory that handles scaling ambiguity with equivariance and invariance properties on a manifold.
2 Sur-real experimental validation with significant performance gain (94% → 98%) at a fraction (8%) of the baseline model size.
New Deep Learning Theory
1 New manifold representation $\mathbb{R}^+ \times SO(2)$:
$$a + ib \;\mapsto\; (r, R(\theta)),$$
where
$$r = |a + ib| = \sqrt{a^2 + b^2}, \qquad \theta = \arg(a + ib) = \mathrm{atan2}(b, a),$$
$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
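The map above can be sketched in a few lines of numpy (function name is ours, for illustration):

```python
import numpy as np

def to_manifold(z):
    """Map a complex number z = a + ib to (r, R(theta)) in R+ x SO(2)."""
    r = np.abs(z)                       # r = sqrt(a^2 + b^2)
    theta = np.arctan2(z.imag, z.real)  # theta = atan2(b, a)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return r, R

r, R = to_manifold(3 + 4j)  # r = 5.0, R rotates (1, 0) onto (3/5, 4/5)
```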
2 New group action G for complex scaling: the product of planar rotation and scaling.
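Concretely, a group element $g = (s, R(\varphi))$ acts on a complex number by scaling and rotating it, i.e. $g \cdot z = s\, e^{i\varphi} z$. A minimal illustration (the function name is ours):

```python
import numpy as np

def act(s, phi, z):
    """Complex scaling: scale by s > 0 and rotate by angle phi."""
    return s * np.exp(1j * phi) * z

gz = act(2.0, np.pi / 2, 1 + 1j)  # scale by 2, rotate 90 degrees -> -2 + 2j
```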
3 New convolution that is equivariant to G: weighted Fréchet mean (wFM) filtering:
$$\mathrm{wFM}\left(\{z_i\}, \{w_i\}\right) = \operatorname*{argmin}_{m \in \mathbb{C}} \sum_{i=1}^{K} w_i \, d^2(z_i, m),$$
where $\sum_{i=1}^{K} w_i = 1$, $w_i \in (0, 1]\ \forall i$, and
$$d(z_1, z_2) = \sqrt{\log^2\!\left(\frac{r_2}{r_1}\right) + \left\lVert \operatorname{logm}\!\left(R_1^{-1} R_2\right) \right\rVert^2}.$$
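Under this metric both factors of $\mathbb{R}^+ \times SO(2)$ are one-dimensional and flat, so when all phases lie within a common geodesic ball the wFM has a closed form: the weighted geometric mean of the magnitudes paired with the weighted arithmetic mean of the phases. A minimal numpy sketch (function names are ours, not the paper's):

```python
import numpy as np

def d(z1, z2):
    """Geodesic distance: sqrt(log^2(r2/r1) + ||logm(R1^-1 R2)||_F^2).
    For SO(2), logm(R(t)) = t * [[0, -1], [1, 0]], so the norm is sqrt(2)|t|."""
    dlog = np.log(np.abs(z2) / np.abs(z1))
    dtheta = np.angle(z2 / z1)  # angle of R1^{-1} R2, wrapped to (-pi, pi]
    return np.sqrt(dlog ** 2 + 2.0 * dtheta ** 2)

def wfm(z, w):
    """Weighted Frechet mean, assuming the phases of z avoid wrap-around."""
    w = np.asarray(w, float)
    w = w / w.sum()
    r_m = np.exp(np.sum(w * np.log(np.abs(z))))  # weighted geometric mean of radii
    theta_m = np.sum(w * np.angle(z))            # weighted mean of phases
    return r_m * np.exp(1j * theta_m)

z = np.array([2 + 0j, 0 + 2j])
m = wfm(z, [0.5, 0.5])  # magnitude 2, phase pi/4
```

The closed form is exact here because the optimization decouples into two scalar least-squares problems in $(\log r, \theta)$ coordinates.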
4 New fully connected layer operator that is invariant to G: distance to the wFM:
$$m = \mathrm{wFM}\left(\{t_i\}, \{v_i\}\right), \qquad u_i = d(t_i, m).$$
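Invariance holds because a global complex scaling moves the wFM along with every $t_i$, leaving each distance $u_i$ unchanged. A self-contained numpy sketch (our own closed-form wFM and distance, valid when phases avoid wrap-around):

```python
import numpy as np

def wfm(z, w):
    """Closed-form wFM on R+ x SO(2) (no phase wrap-around assumed)."""
    w = np.asarray(w, float)
    w = w / w.sum()
    return np.exp(np.sum(w * np.log(np.abs(z)))) * np.exp(1j * np.sum(w * np.angle(z)))

def d(z1, z2):
    """sqrt(log^2(r2/r1) + ||logm(R1^-1 R2)||^2); the SO(2) term is sqrt(2)|dtheta|."""
    return np.sqrt(np.log(np.abs(z2) / np.abs(z1)) ** 2
                   + 2.0 * np.angle(z2 / z1) ** 2)

def invariant_layer(t, v):
    """u_i = d(t_i, m) with m = wFM({t_i}, {v_i})."""
    m = wfm(t, v)
    return np.array([d(ti, m) for ti in t])

t = np.array([1 + 1j, 2 - 1j, -0.5 + 2j])
v = np.array([0.2, 0.5, 0.3])
g = 3.0 * np.exp(1j * 0.7)
u1 = invariant_layer(t, v)      # original inputs
u2 = invariant_layer(g * t, v)  # all inputs rotated and scaled by g
# u1 ≈ u2: the layer output is invariant to g
```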
5 New nonlinear activation function: ReLU in the tangent space, with log/exp maps back to the manifold:
$$(r, R) \;\mapsto\; \left( \exp\left(\mathrm{ReLU}\left(\log r\right)\right),\ \operatorname{expm}\left(\mathrm{ReLU}\left(\operatorname{logm}(R)\right)\right) \right).$$
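One concrete reading of this activation identifies the $SO(2)$ tangent component with the angle $\theta$, so that $\operatorname{expm}(\mathrm{ReLU}(\operatorname{logm} R(\theta)))$ becomes clamping $\theta$ at zero. A sketch under that assumption (the interpretation is ours, not stated explicitly on the poster):

```python
import numpy as np

def tangent_relu(z):
    """ReLU in the tangent space of R+ x SO(2), mapped back to the manifold.
    Assumes theta in (-pi, pi]; ReLU on logm(R(theta)) is read as clamping theta."""
    r, theta = np.abs(z), np.angle(z)
    r_new = np.exp(np.maximum(np.log(r), 0.0))  # exp(ReLU(log r)): r clamped to >= 1
    theta_new = np.maximum(theta, 0.0)          # expm(ReLU(logm R)): theta clamped to >= 0
    return r_new * np.exp(1j * theta_new)

a = tangent_relu(0.5 * np.exp(-0.3j))  # both tangent parts negative -> 1 + 0j
b = tangent_relu(2.0 * np.exp(0.5j))   # both positive -> unchanged
```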
Schematic of Our CNN
[Figure: our CNN cascades wFM conv-layers on the complex input $x$, followed by an invariant distance layer producing real features $t_1, \dots, t_d$, and a final FC layer yielding class outputs $y_1, \dots, y_c$.]
Equivariance of Fréchet Mean Filtering / Invariance of Distance Transform
[Figure: rotating and scaling points $z_1, \dots, z_5$ by $g$ moves their mean to $g \cdot \mathrm{wFM}(z, w)$ (equivariance), while the distances $d_1, \dots, d_5$ from each point to the wFM are unchanged (invariance).]
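The equivariance pictured above can be checked numerically: acting on every input by the same $g$ moves the wFM by exactly $g$. A self-contained sketch using our closed-form wFM (valid when phases avoid wrap-around):

```python
import numpy as np

def wfm(z, w):
    """Closed-form wFM on R+ x SO(2), assuming no phase wrap-around."""
    w = np.asarray(w, float)
    w = w / w.sum()
    return np.exp(np.sum(w * np.log(np.abs(z)))) * np.exp(1j * np.sum(w * np.angle(z)))

z = np.array([1 + 0.2j, 0.8 + 0.5j, 1.2 - 0.1j, 0.9 + 0.3j, 1.1 + 0.1j])
w = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
g = 2.0 * np.exp(1j * 0.4)  # rotate by 0.4 rad, scale by 2

lhs = wfm(g * z, w)  # wFM of the transformed points
rhs = g * wfm(z, w)  # transformed wFM
# lhs ≈ rhs: wFM filtering is equivariant to g
```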
Tangent ReLU for the Complex Plane
Our Results on MSTAR
[Figure: baseline real-valued CNN with CBRM blocks (feature maps 100×100×1 → 49×49×16 → 23×23×32 → 11×11×64 → 4×4×128, FC layers of width 512 and 256, 10-way softmax) vs. our complex-valued SurReal CNN with CCtR blocks (100×100×1 → 20×20×50 → 4×4×100, dFM distance layer of width 200, FC, 10-way softmax). Bar chart of test accuracy across input representations (a,b), r, (a,b,r), (r,θ), z on raw and unit-norm data; reported accuracies include 98.2%, 97.0%, 96.9%, 94.5%, 93.5%, 89.8%, 46.0%.]
Our Results on RadioML
[Figure: baseline CNN with CBRM blocks (input 128×2, feature map 130×256, FC layers of width 256, 11-way softmax) vs. our SurReal CNN with CCtR blocks (complex input 128×1, feature maps 25×64 → 5×128, dFM layer of width 640, FC, 11-way softmax).]