Apache Singa: A General Distributed Deep Learning Platform

Md Johirul Islam, Department of Computer Science, Iowa State University ([email protected])
March 3, 2016
Outline: Overview, System Architecture, Distributed Training Framework, NeuralNet, Training, Summary

Overview
SINGA is a general distributed deep learning platform for training big deep learning models over large datasets.
It is designed with an intuitive programming model based on a layer abstraction.
SINGA is integrated with Mesos, so distributed training can be started as a Mesos framework.
SINGA can run on top of a distributed storage system to achieve scalability; the current version supports HDFS.
Work Flow
The training goal is to find the optimal parameters of the transformation functions that generate good features for specific tasks.
The parameters are randomly initialized and then updated iteratively with the SGD algorithm.
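Concretely, each SGD step moves the parameters against the gradient of the loss on a mini-batch (this is the standard SGD update; the notation is ours, not from the slides):

```latex
w_{t+1} = w_t - \eta \, \nabla_w L(w_t; \mathcal{B}_t)
```

where $\eta$ is the learning rate and $\mathcal{B}_t$ is the mini-batch at step $t$.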
Work Flow
The training workload is distributed over the workers and servers.
In each iteration, every worker calls the TrainOneBatch function to compute parameter gradients.
TrainOneBatch takes a NeuralNet object representing a neural network and visits all the layers in a certain order.
The resulting gradients are aggregated by the local stub, which forwards them to the corresponding servers for updating.
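A minimal sketch of this per-iteration control flow (the Worker, NeuralNet, and Stub names follow the slides, but every signature here is our assumption, not SINGA's actual API):

```cpp
// Sketch of one training iteration (hypothetical signatures).
// The worker computes gradients with TrainOneBatch; the local stub
// aggregates them and forwards them to the servers owning each Param.
void RunIteration(Worker& worker, NeuralNet& net, Stub& stub) {
  worker.TrainOneBatch(&net);        // visit layers, fill parameter gradients
  for (Param* p : net.params()) {
    stub.Aggregate(p);               // sum gradients from local workers
    stub.ForwardToServer(p);         // send the update request to its server
  }
}
```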
System Architecture

Logical Architecture
Worker Group
Made up of one or more workers.
Each worker group trains a complete model replica for a particular dataset, computing the parameter gradients.
A worker group communicates with only one server group. Worker groups communicate with their server groups asynchronously, while workers inside a worker group communicate synchronously.

Server Group
Made up of a number of servers.
Each server manages a partition of the model parameters and handles get/update requests.
Neighboring server groups synchronize from time to time.
Parallelism
Figure: Hybrid Parallelism.
Communication
In SINGA, workers and servers run in separate threads, and several workers and servers can reside in one process. A main thread in each process serves as the stub. Communication between them occurs through messages.
The stub aggregates all local messages and forwards them to the appropriate threads.
The SINGA communication library consists of two components: Message and Socket.
Message
The message header contains the sender and receiver IDs.
Each ID comprises a group ID and a worker/server ID.
The stub forwards messages by looking up these IDs in its address table.
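A small sketch of how such an address might be packed and resolved; the bit layout and table shape are our illustrative assumptions, not SINGA's actual format:

```cpp
#include <unordered_map>

// Hypothetical address encoding: group ID in the high 16 bits,
// worker/server ID in the low 16 bits.
inline int MakeAddr(int group_id, int id) { return (group_id << 16) | id; }

// The stub resolves a receiver's address to a local thread or remote endpoint.
struct AddressTable {
  std::unordered_map<int, int> table;  // addr -> socket/thread handle
  int Route(int recv_group, int recv_id) const {
    return table.at(MakeAddr(recv_group, recv_id));
  }
};
```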
Creating Address
(The original slides show the address-creation code in figures; compare the MakeAddr sketch above.)
Sockets
There are two types of sockets: the Dealer socket and the Router socket.
Communication between dealers and routers is asynchronous.
The basic functions of sockets are to send and receive messages.
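This is the DEALER/ROUTER pattern from ZeroMQ, which (to our understanding) SINGA's socket classes wrap; the sketch below shows the underlying ZeroMQ C API rather than SINGA's own wrappers:

```cpp
// Dealer/Router pattern with the plain ZeroMQ C API: a dealer connects to
// one router's endpoint and they exchange messages asynchronously.
#include <string.h>
#include <zmq.h>

int main() {
  void* ctx = zmq_ctx_new();

  void* router = zmq_socket(ctx, ZMQ_ROUTER);  // accepts many dealers
  zmq_bind(router, "tcp://*:5555");

  void* dealer = zmq_socket(ctx, ZMQ_DEALER);  // connects to one router
  zmq_connect(dealer, "tcp://localhost:5555");

  const char* msg = "get_param";
  zmq_send(dealer, msg, strlen(msg), 0);       // asynchronous send

  char buf[64];                                // router sees [identity, body]
  zmq_recv(router, buf, sizeof(buf), 0);       // identity frame
  zmq_recv(router, buf, sizeof(buf), 0);       // message body

  zmq_close(dealer);
  zmq_close(router);
  zmq_ctx_term(ctx);
}
```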
Poller
The Poller class provides asynchronous communication between the dealers and the routers.
One can register a set of SocketInterface objects with a Poller instance by calling its add method, and then call the poller's wait method to wait for a registered SocketInterface to be ready for sending or receiving messages.
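We assume the Poller's add/wait pattern wraps a mechanism like ZeroMQ's zmq_poll; continuing the ZeroMQ sketch above (dealer and router are the sockets created there):

```cpp
// Polling several sockets: the analogue of registering SocketInterface
// objects with a Poller and waiting for one to become ready.
zmq_pollitem_t items[] = {
  { dealer, 0, ZMQ_POLLIN, 0 },
  { router, 0, ZMQ_POLLIN, 0 },
};
int rc = zmq_poll(items, 2, 1000 /* timeout in ms */);
if (rc > 0 && (items[0].revents & ZMQ_POLLIN)) {
  char buf[64];
  zmq_recv(dealer, buf, sizeof(buf), 0);  // dealer is ready to receive
}
```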
In SINGA, a Dealer socket can connect to only one Router socket. The connection is set up by connecting the Dealer socket to the Router socket's endpoint.
A Router socket can connect to one or more Dealer sockets. Upon receiving a message, the router forwards it to the appropriate dealer according to the receiver ID of the message.
Distributed Training Framework
The SINGA cluster topology supports different distributed training frameworks.
The cluster topology of SINGA is configured in the cluster field of JobProto.
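A hedged sketch of what such a cluster configuration might look like in protobuf text format (the field names follow SINGA 0.x documentation as we recall it; treat them as illustrative, not authoritative):

```protobuf
# Hypothetical cluster field of JobProto. Varying the group counts selects
# the training framework described on the following slides:
#   one worker group + one server group      -> synchronous (SandBlaster-style)
#   many worker groups running independently -> asynchronous (Downpour-style)
#   workers and servers co-located per node  -> AllReduce-style
cluster {
  nworker_groups: 2             # worker groups (model replicas)
  nworkers_per_group: 4         # workers in a group run synchronously
  nserver_groups: 1             # server groups holding the parameters
  nservers_per_group: 2         # each server manages a parameter partition
  server_worker_separate: true  # run servers apart from workers
}
```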
Types of Topology
SandBlaster
This is a synchronous framework used by Google Brain. A single server group is launched to handle all requests from the workers. A worker computes on its partition of the model and communicates only with the servers handling the related parameters.
Figure: SandBlaster topology
AllReduce
This is a synchronous framework used by Baidu's DeepImage.
Each worker is bound to a server on the same node, so that each node is responsible for maintaining a partition of the parameters and collecting updates from all other nodes.
Figure: AllReduce topology
Downpour
This is an asynchronous framework used by Google Brain.
Figure: Downpour topology
NeuralNet
NeuralNet represents a user-defined neural network model.
A model must be converted into a NeuralNet configuration: users configure a NeuralNet by listing all the layers of the network and specifying each layer's source layers by name.
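A hedged sketch of such a configuration in protobuf text format (the layer type names and fields such as srclayers follow our recollection of SINGA 0.x and are illustrative only):

```protobuf
# Hypothetical NeuralNet configuration: layers are listed in order and each
# names its source layers via srclayers.
neuralnet {
  layer { name: "data" type: kRecordInput }
  layer {
    name: "fc1"
    type: kInnerProduct
    srclayers: "data"
    innerproduct_conf { num_output: 100 }
    param { name: "w1" }
    param { name: "b1" }
  }
  # a loss layer would follow, naming "fc1" (and a label source) in srclayers
}
```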
Types of Neural Network
Feed Forward
These networks do not have any cycles. Examples: MLP, CNN.
Figure: A Simple MLP
Energy Models
In energy models the connections are undirected. To convert these models into a NeuralNet, we replace each undirected connection with two directed connections.
RNN Models
For recurrent neural networks, the first step is to unroll the recurrent layers.
Layer
Layer is the core abstraction in SINGA. A layer performs a variety of feature transformations to obtain high-level features.
Built-in Layers
Input layers: load data from HDFS, disk, or the network into memory.
Neuron layers: perform feature transformations, e.g. convolution, pooling, dropout.
Loss layers: measure the training objective loss, e.g. cross-entropy loss, Euclidean loss.
Output layers: write prediction output to disk, HDFS, etc.
Connection layers: connect partitions when a NeuralNet is partitioned.
Input Layers
A base layer for loading data from a data store. It has many subclasses, such as SingleLabelRecordLayer, RecordInputLayer, CSVInputLayer, and ImagePreprocessLayer.
Output Layers
An output layer gets data from its source layer and converts it into records of type RecordProto. Records are written as (key, value) tuples into a Store.
Neuron Layers
Neuron layers perform feature transformations; for example, ConvolutionLayer conducts the convolution transformation.
Connection Layers
ConcateLayer: connects to more than one source layer to concatenate their feature blobs along a given dimension.
SliceLayer: connects to more than one destination layer to slice its feature blob along a given dimension.
SplitLayer: connects to more than one destination layer to replicate its feature blob.
Base Layer Class
(The original slide shows the fields and methods of the base Layer class in a figure.)
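As a stand-in for the missing figure, here is a hedged sketch of a base Layer class along the lines SINGA 0.x documents (the method names Setup/ComputeFeature/ComputeGradient follow those docs; the signatures and the LayerProto/Blob stand-ins are our assumptions):

```cpp
#include <string>
#include <vector>

// Minimal stand-ins for SINGA types (assumed, not the real definitions).
struct LayerProto { std::string name; };
struct Blob { std::vector<float> data; std::vector<int> shape; };
class Param;

// Hypothetical base Layer: owns a feature blob plus its gradient and the
// Params it trains, and transforms features forward and backward.
class Layer {
 public:
  virtual ~Layer() {}
  // Initialize shapes and params from the layer configuration.
  virtual void Setup(const LayerProto& conf,
                     const std::vector<Layer*>& srclayers) = 0;
  // Forward pass: compute this layer's features from its source layers.
  virtual void ComputeFeature(int flag,
                              const std::vector<Layer*>& srclayers) = 0;
  // Backward pass: compute gradients for params and source layers.
  virtual void ComputeGradient(int flag,
                               const std::vector<Layer*>& srclayers) = 0;

  const Blob& data() const { return data_; }

 protected:
  std::string name_;            // layer name from the configuration
  Blob data_, grad_;            // feature blob and its gradient
  std::vector<Param*> params_;  // parameters owned by this layer
};
```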
Creating Custom Layer
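The original slides show these steps as code figures; as a substitute, here is a hedged sketch of defining and registering a custom layer against the base class above (the registration call mimics the driver-based pattern in SINGA docs; all names and signatures are assumptions):

```cpp
// Hypothetical custom layer: subclass Layer, override the virtual methods,
// then register it under a user-defined type ID.
class MyReluLayer : public Layer {
 public:
  void Setup(const LayerProto& conf,
             const std::vector<Layer*>& srclayers) override {
    data_.shape = srclayers[0]->data().shape;  // same shape as the source
  }
  void ComputeFeature(int flag,
                      const std::vector<Layer*>& srclayers) override {
    // ReLU: copy the source features, clamping negatives to zero.
    data_.data.clear();
    for (float v : srclayers[0]->data().data)
      data_.data.push_back(v > 0.0f ? v : 0.0f);
  }
  void ComputeGradient(int flag,
                       const std::vector<Layer*>& srclayers) override {
    // propagate grad_ back through the ReLU mask (omitted for brevity)
  }
};

// Registration with the driver (assumed API):
//   driver.RegisterLayer<MyReluLayer, int>(kMyRelu);
```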
Param
A Param object in SINGA represents a set of parameters, e.g. a weight matrix or a bias vector, configured inside a layer configuration.
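A hedged example of a param block inside a layer configuration (field names such as init, type, and std follow our recollection of SINGA 0.x; illustrative only):

```protobuf
# Hypothetical Param configuration inside a layer.
param {
  name: "w1"
  init {
    type: kGaussian   # random Gaussian initialization
    mean: 0
    std: 0.01
  }
  lr_scale: 1.0       # multiplier on the global learning rate
}
```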
Different Parameter Types
(The original slide lists the built-in Param types in a figure.)
Creating Custom Parameter Type
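The slide's code figure is missing; a minimal sketch of the likely pattern (subclass Param, override initialization, register under a new type ID; all names here are assumptions):

```cpp
// Hypothetical custom Param: subclass Param and override how values are
// initialized; other behavior is inherited.
class MySparseParam : public Param {
 public:
  void InitValues() override {
    // e.g. set most entries to zero and a few to small random values
  }
};

// Registration with the driver (assumed API):
//   driver.RegisterParam<MySparseParam>(kMySparseParam);
```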
Training

TrainOneBatch
In each SGD iteration, every worker calls the TrainOneBatch function to compute the gradients of the parameters associated with its local layers.
SINGA implements two algorithms for TrainOneBatch:
BP (Back-Propagation): used by feed-forward and RNN models.
CD (Contrastive Divergence): used by energy models.
Implementing New Algorithms
To implement a new algorithm for TrainOneBatch, we have to create a subclass of Worker.
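A hedged sketch of this pattern with a BP-style loop inside TrainOneBatch (the Worker and NeuralNet names follow the slides; the method names, the kTrain flag, and the registration call are assumptions):

```cpp
// Hypothetical Worker subclass implementing a BP-style TrainOneBatch.
class MyWorker : public Worker {
 public:
  void TrainOneBatch(int step, NeuralNet* net) override {
    auto& layers = net->layers();  // layers in topological order
    // Forward pass: compute features layer by layer.
    for (auto* layer : layers)
      layer->ComputeFeature(kTrain, net->srclayers(layer));
    // Backward pass: compute gradients in reverse order.
    for (auto it = layers.rbegin(); it != layers.rend(); ++it)
      (*it)->ComputeGradient(kTrain, net->srclayers(*it));
    // The gradients are then handed to the stub for forwarding to servers.
  }
};

// Registration with the driver (assumed API):
//   driver.RegisterWorker<MyWorker>(kMyAlgorithm);
```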
Updater
Every server in SINGA has an updater instance. There are many updaters, all of which are subclasses of the Updater class.
The base Updater implements the vanilla SGD algorithm.
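A hedged sketch of that vanilla SGD step, reusing the Blob/Param stand-ins from the Layer sketch above (the Update(step, param) shape follows SINGA docs as we recall them; everything else is assumption):

```cpp
// Hypothetical base Updater: vanilla SGD, w <- w - lr * grad.
class Updater {
 public:
  virtual ~Updater() {}
  virtual void Update(int step, Param* param) {
    float lr = GetLearningRate(step);       // per-step learning rate
    auto& w = param->mutable_data()->data;  // parameter values
    const auto& g = param->grad().data;     // aggregated gradients
    for (size_t i = 0; i < w.size(); ++i)
      w[i] -= lr * g[i];
  }
  virtual float GetLearningRate(int step) { return base_lr_; }

 protected:
  float base_lr_ = 0.01f;
};
```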
Learning Rate
The learning rate can follow different change methods, such as kFixed, kLinear, kExponential, kInverseT, kStep, and kFixedStep.
Different change methods require different configurations.
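A hedged example of such a configuration (the updater and learning_rate field names follow our recollection of SINGA 0.x; illustrative only):

```protobuf
# Hypothetical updater configuration with a kStep schedule: the learning
# rate is multiplied by gamma every change_freq steps.
updater {
  type: kSGD
  learning_rate {
    type: kStep
    base_lr: 0.01
    step_conf {
      change_freq: 60
      gamma: 0.997
    }
  }
}
```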
Implementing a Custom Updater
Figure: the base Updater class (compare the sketch above).
Summary
SINGA can be used without much programming experience.
To add custom layers, parameters, or algorithms, we need to change the code.
Apache SINGA is still in the development phase; many features are being added.
It currently has a Python binding following Keras.
It currently supports training on GPUs.