Dynamic Routing Between Capsules
Explainable Machine Learning
6/19/18 Michael Dorkenwald
Introduction
Dynamic Routing Between Capsules
by Sara Sabour, Nicholas Frosst, Geoffrey Hinton (October 2017)
Geoffrey Hinton
● Significant contributions to the backpropagation algorithm
● Idea for AlexNet
● Invented Dropout
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Convolutional Neural Networks (CNN)
● Special type of multi-layer neural networks, constructed to recognize visual patterns directly from pixel images
Feature Maps of CNNs
Max-Pooling Layer
● Dimension reduction
● Selective routing of features
● Loses positional information
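A tiny numpy sketch (not from the slides) makes the point concrete: two inputs that differ only in the exact position of a feature can produce identical max-pooled outputs.

```python
import numpy as np

def max_pool_2x2(x):
    # Non-overlapping 2x2 max pooling on a 2D feature map.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.zeros((4, 4)); a[0, 0] = 1.0   # feature in the top-left corner
b = np.zeros((4, 4)); b[1, 1] = 1.0   # same feature shifted one pixel
# Both inputs pool to the same 2x2 output: the exact position is lost.
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True
```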
Achievements of CNNs
Spatial Relation
Motivation
Hinton: “The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.”
Looking for equivariance: changes in viewpoint lead to corresponding changes in the neural activities
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Computer Graphics
● Construct a visual image (rendering) from abstract representation of an object
Inverse Graphics
● Reverse process: start from the image and recover the parameters through inverse rendering
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Capsule Network
A capsule network is a neural network that tries to perform inverse graphics
Capsule
● A group of neurons
● Goal: predict the presence and the instantiation parameters of a specific entity at a given location
● Presence is represented by the length of the activity vector (a probability)
● Instantiation parameters are:
– position, size, orientation, deformation, hue, texture, etc.
Primary Capsule Activities
Input: Image features
1st layer: Convolutional layer with ReLU activation
2nd layer: Convolutional capsule layer
Squashing function
Squashing Function
● The length of the output vector represents the probability that the entity is present
● Applies a non-linearity to the whole capsule vector
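As a sketch, the squashing function v = (||s||² / (1 + ||s||²)) · s/||s|| can be written in a few lines of numpy; the eps term is my addition for numerical stability at the zero vector.

```python
import numpy as np

def squash(s, eps=1e-8):
    # v = ||s||^2 / (1 + ||s||^2) * s / ||s||
    # Long vectors get length close to 1, short vectors shrink toward 0.
    sq_norm = np.sum(s ** 2)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)
```

So the direction of the capsule vector is preserved while its length becomes a valid probability in [0, 1).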
Capsules
Inverse Rendering
Image
Capsule activations
Capsules
Inverse Rendering
Image
Equivariance
Capsule activations
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Dynamic Routing
● Prediction vector: û_{j|i} = W_ij u_i
– with the previous capsule output u_i and the transformation matrix W_ij
● Capsule output of the next layer: s_j = Σ_i c_ij û_{j|i}, v_j = squash(s_j)
– with v_j the output of capsule j in the next layer
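A minimal numpy illustration of one prediction vector; the 8-D and 16-D sizes match the paper's PrimaryCaps/DigitCaps dimensions, while the weights here are random placeholders, not learned values.

```python
import numpy as np

# Capsule i (8-D output u_i) predicts the output of capsule j in the
# next layer via a learned transformation matrix W_ij.
u_i = np.full(8, 0.1)                                    # lower-level capsule output
W_ij = np.random.default_rng(0).standard_normal((16, 8)) * 0.1
u_hat_ji = W_ij @ u_i                                    # prediction vector
print(u_hat_ji.shape)  # (16,)
```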
Dynamic Routing
● Coupling coefficients c_ij
– determined during the iterative dynamic routing process
– computed by a ‘routing softmax’ whose initial logits b_ij are the log prior probabilities that capsule i should be coupled to capsule j
● Agreement
– simply the scalar product û_{j|i} · v_j
Dynamic Routing
Routing Algorithm:
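The routing-by-agreement procedure can be sketched in numpy as follows; the shapes and the softmax-over-output-capsules convention follow the paper, while the concrete sizes are illustrative.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # v = ||s||^2 / (1 + ||s||^2) * s / ||s||, applied per capsule vector.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    """u_hat: prediction vectors, shape (num_in, num_out, dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                  # routing logits b_ij
    for _ in range(num_iterations):
        # Coupling coefficients: softmax of b_i over the output capsules j.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)        # weighted sum s_j
        v = squash(s)                                # output capsules v_j
        b = b + np.einsum('ijd,jd->ij', u_hat, v)    # agreement update
    return v, c
```

With predictions that agree on one output capsule and cancel out on another, the coupling coefficients shift toward the agreeing capsule after a few iterations.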
Dynamic Routing
Loss function:
● For each digit capsule k, the loss is the margin loss
L_k = T_k max(0, m⁺ − ||v_k||)² + λ (1 − T_k) max(0, ||v_k|| − m⁻)²
● where T_k = 1 when digit k is present, m⁺ = 0.9, m⁻ = 0.1, and λ = 0.5 by default
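A direct numpy transcription of the margin loss, with λ = 0.5 down-weighting the loss for absent digit classes as in the paper:

```python
import numpy as np

def margin_loss(v_norms, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """v_norms: lengths of the digit-capsule vectors, shape (num_classes,).
    targets: one-hot vector, T_k = 1 if digit k is present."""
    present = targets * np.maximum(0.0, m_pos - v_norms) ** 2
    absent = lam * (1 - targets) * np.maximum(0.0, v_norms - m_neg) ** 2
    return np.sum(present + absent)
```

A confident correct capsule (length above 0.9) and quiet wrong capsules (lengths below 0.1) incur zero loss.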
Capsules
Hierarchy of parts
Capsules
Inverse Rendering
Capsules
Inverse Rendering
Predicted outputs
Capsules
Inverse Rendering
Agreement: should only be routed to the 7
Predicted outputs
Capsules
Inverse Rendering
Predicted outputs
Dynamic Routing:
● b_ij = 0 for all i, j
● c_i = softmax(b_i)
(all coupling coefficients start at 0.5)
Capsules
Inverse Rendering
Predicted outputs
s_j = weighted sum
v_j = squash(s_j)
Output for round #1 (all coupling coefficients 0.5)
Capsules
Inverse Rendering
Predicted outputs
s_j = weighted sum
v_j = squash(s_j)
Output for round #1 (all coupling coefficients 0.5)
Huge agreement
Capsules
Inverse Rendering
Predicted outputs
s_j = weighted sum
v_j = squash(s_j)
Output for round #1 (all coupling coefficients 0.5)
Small disagreement
Capsules
Inverse Rendering
Predicted outputs
Dynamic Routing:
● b_ij updated by the agreement û_{j|i} · v_j
● c_i = softmax(b_i)
(updated coupling coefficients: 0.8, 0.2, 0.1, 0.9)
Capsules
Inverse Rendering
Predicted outputs
s_j = weighted sum
v_j = squash(s_j)
Output for round #2 (coupling coefficients 0.8, 0.2, 0.1, 0.9)
Clustering on agreement
What really happens:
Mean
Clustering on agreement
What really happens:
Weighted mean
Clustering on agreement
What really happens:
Weighted mean
Classification
Inverse Rendering
Loss function
Capsule Network Architecture
Reconstruction
Inverse Rendering
Loss function
Decoder
Neural Net
Reconstruction
Reconstruction
● Decoder structure to reconstruct a digit from the DigitCaps layer
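In the paper the decoder is three fully connected layers: 512 and 1024 ReLU units, then 784 sigmoid outputs (one per MNIST pixel). A forward-pass sketch with randomly initialized placeholder weights (not trained values):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Input: the 16-D activity vectors of all 10 digit capsules (16 * 10 = 160),
# with every capsule except the correct one masked to zero.
W1 = rng.standard_normal((160, 512)) * 0.01
W2 = rng.standard_normal((512, 1024)) * 0.01
W3 = rng.standard_normal((1024, 784)) * 0.01

def decode(masked_caps):
    h1 = relu(masked_caps @ W1)
    h2 = relu(h1 @ W2)
    return sigmoid(h2 @ W3)   # 784 pixel intensities in (0, 1)
```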
Reconstruction as a regularization method
● Forces the digit capsules to encode the instantiation parameters of the input digit
● Minimize the sum of squared differences between the outputs of the logistic units and the pixel intensities
● Loss = margin loss + α · reconstruction loss, with α = 0.0005
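Combining the two terms is then one line; α = 0.0005 comes from the slide, and the reconstruction term is the sum of squared pixel differences:

```python
import numpy as np

def total_loss(margin_loss_val, reconstruction, image, alpha=5e-4):
    # Reconstruction term is scaled down by alpha so it does not
    # dominate the margin loss during training.
    recon_loss = np.sum((reconstruction - image) ** 2)
    return margin_loss_val + alpha * recon_loss
```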
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Capsules on MNIST
● Images have been shifted by up to 2 pixels in each direction with zero padding; no other data augmentation or model averaging
● Baseline: a standard CNN with three conv layers (256, 256, 128 channels, 5×5 kernels, stride 1) followed by two fully connected layers (328 and 192 units, with dropout)
● Number of parameters: baseline 35.4M, CapsNet 8.2M, and 6.8M without the reconstruction subnetwork
Individual Dimensions of a Capsule
Robustness to Affine Transformations
● Trained a CapsNet and a traditional CNN (with pooling) on a padded and translated MNIST training set
● Tested the networks on affNIST (MNIST digits with random small affine transformations)
● Under-trained CapsNet (99.23% on MNIST) achieved 79%
● Traditional CNN (99.22%) with a similar number of parameters achieved 66%
MultiMNIST
● Create the MultiMNIST dataset by overlaying a digit on top of another digit from a different class
● For each digit in MNIST they generate 1K MultiMNIST examples
● Training set size 60M, test set size 10M
MultiMNIST
CIFAR10
● Slight modification of the simple model used for MNIST, with 3 routing iterations
● Achieved 10.6% test error
● About what standard CNNs achieved when they were first applied to CIFAR10
List of Content
● Motivation for Capsules
● Idea of Inverse Graphics
● Capsules
● Dynamic Routing Between Capsules
● Capsules on MNIST
● Conclusion
Conclusion
● Achieved state-of-the-art accuracy on MNIST
● Spatial relations are preserved (equivariance)
– Promising for object detection and segmentation
● Dynamic routing works well for overlapping digits
● Robust to affine transformations
● Activation vectors are easier to interpret (scale, thickness, rotation, etc.)
● Ability to analyze the hierarchy of objects
Conclusion
● Not state of the art on CIFAR10
● No results on larger datasets (e.g. ImageNet)
● Slow to train, because of the inner loop in dynamic routing
Sources
● Slide 3: https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
● Slide 5: https://www.mathworks.com/discovery/convolutional-neural-network.html
● Slide 6 and 7: https://cs231n.github.io/convolutional-networks/#pool
● Slide 11: https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/
● Slide 7: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/pooling_layer.html
● Slide 13 and 14 : https://kndrck.co/posts/capsule_networks_explained/
● Slide 8: https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
● Slide 9: https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
Thank you !