Date post: | 21-Jan-2018 |
Category: |
Data & Analytics |
Upload: | felipe-almeida |
Keras 2
“You have just found Keras”
Felipe Almeida
Rio Machine Learning Meetup / June 2017
First Steps

Content
● Intro
● Neural Networks
● Keras
● Examples
● Keras concepts
● Resources

Intro
● Neural nets are versatile, but there was a need for a simple framework to design and experiment with them.
● Neural nets (particularly those with multiple layers) take a long time to train.
● Recent advances in algorithms (layer-wise training, contrastive divergence, etc.) and in hardware (leveraging GPUs for tensor operations), as well as the massive amounts of available data, have made deep learning popular.

Neural Networks
● Generally speaking, neural networks are nonlinear machine learning models.
● They can be used for supervised or unsupervised learning.
● Deep learning refers to training neural nets with multiple layers.
○ They are more powerful, but only if you have lots of data to train them on.
● Keras is used to create neural network models.

Neural Networks - Sample Architectures
[Figures: sample neural network architectures. Sources: neuralnetworksanddeeplearning.com (four figures), University of Bonn, AI GitBook]

Keras
● Models created by Keras can be executed on a backend:
○ TensorFlow (default)
○ Theano
○ CNTK (Beta)
○ MXNet (Beta)
● Keras has built-in GPU support with CUDA
○ CUDA is a framework for using the GPU on Nvidia video cards for mathematical (tensor) operations

Keras
● Keras is the de facto deep learning frontend
Source: @fchollet, Jun 3 2017

Keras
● Keras is among the libraries supported by Apple’s CoreML
Source: @fchollet, Jun 5 2017

Example #1
● The MNIST dataset contains 60,000 labelled handwritten digits for training and 10,000 for testing.

Example #1
● We can train a neural net to classify a digit’s pixels into one of the 10 digit classes:
● NOTEBOOK - MNIST MLP

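The notebook itself is not reproduced in these slides. As a rough sketch of what an MNIST-style MLP could look like in Keras, the block below builds a small fully-connected classifier. The layer sizes, dropout rate, optimizer and the helper name `build_mlp` are illustrative assumptions, and random arrays stand in for the real digits so the example runs without downloading anything.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import to_categorical


def build_mlp(num_pixels=784, num_classes=10):
    """Small fully-connected classifier (sizes are illustrative, not from the talk)."""
    model = Sequential()
    # One hidden layer over the flattened 28x28 = 784 pixel values.
    model.add(Dense(128, activation="relu", input_shape=(num_pixels,)))
    model.add(Dropout(0.2))
    # One output probability per digit class.
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss="categorical_crossentropy",
                  optimizer="sgd",
                  metrics=["accuracy"])
    return model


# Random stand-in data shaped like flattened MNIST digits (assumption for the demo).
rng = np.random.RandomState(0)
x_demo = rng.rand(64, 784).astype("float32")
y_demo = to_categorical(rng.randint(0, 10, size=64), num_classes=10)

mlp = build_mlp()
mlp.fit(x_demo, y_demo, epochs=1, batch_size=32, verbose=0)
probs = mlp.predict(x_demo[:5], verbose=0)  # one row of 10 class probabilities per sample
```

With the real data, `keras.datasets.mnist.load_data()` returns the train and test splits; the images would need to be reshaped to 784 values and scaled to the 0-1 range before calling `fit()`.
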
Example #2
● The MNIST dataset can also be used to train multi-layer convolutional neural networks (CNNs).
○ The results with a regular NN are already good, but it is still useful to show how to train a CNN.
● NOTEBOOK - MNIST CNN

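A minimal sketch of a CNN for 28x28 grayscale digits follows. It is not the notebook from the talk: the filter counts, kernel size, dropout rate and the helper name `build_cnn` are assumptions, and the input is assumed to be in channels-last layout (height, width, channels).

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout


def build_cnn(input_shape=(28, 28, 1), num_classes=10):
    """Small convolutional classifier (hyperparameters are illustrative)."""
    model = Sequential()
    # Convolution extracts local features; pooling downsamples the feature maps.
    model.add(Conv2D(32, (3, 3), activation="relu", input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    # Flatten the feature maps so dense layers can classify them.
    model.add(Flatten())
    model.add(Dense(64, activation="relu"))
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    return model


cnn = build_cnn()
# Random stand-in images, only to check that shapes flow through the network.
fake_images = np.random.rand(4, 28, 28, 1).astype("float32")
cnn_probs = cnn.predict(fake_images, verbose=0)
```
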
Example #2 - What are CNNs
● While the model is being trained, let’s understand what a CNN looks like and what it’s good for.
● CNNs use convolutional operations to extract features that are position invariant.
○ In other words, they make it possible to train models that detect features no matter what position they are in the input samples.

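To make the position-invariance idea concrete, here is a small NumPy-only sketch (no Keras involved). A hand-written vertical-edge filter is slid over two toy images that contain the same bar at different columns. The filter's response moves with the bar, and taking the maximum over the response map (as a pooling layer would) gives the same value in both cases. The helper names and the toy images are made up for illustration.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def image_with_bar(col, size=6):
    """Toy image: all zeros except a vertical bar of ones at the given column."""
    img = np.zeros((size, size))
    img[:, col] = 1.0
    return img


# Filter that responds strongly where a bright column is followed by a dark one.
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])

resp_left = correlate2d_valid(image_with_bar(1), kernel)
resp_right = correlate2d_valid(image_with_bar(4), kernel)

# The peak response is identical; only its location differs.
peak_left, peak_right = resp_left.max(), resp_right.max()
col_left, col_right = int(np.argmax(resp_left[0])), int(np.argmax(resp_right[0]))
```
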
Example #2 - What are CNNs
● For this reason, they are often used for image classification.

Example #3
● CNNs can also be used for text classification
○ In fact, they produce state-of-the-art results in tasks such as:
■ Text classification
■ Sentiment analysis
● Let’s train a CNN model to classify documents in the newsgroup_20 dataset
● NOTEBOOK IMDB CNN

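For text, a common pattern is an embedding layer followed by a 1D convolution over the word sequence. The sketch below is an assumption about what such a model might look like, not the notebook's code: the vocabulary size, embedding width, filter settings, the 20 output classes and the helper name `build_text_cnn` are all illustrative, and random integer word ids stand in for tokenized documents.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense


def build_text_cnn(vocab_size=5000, num_classes=20):
    """1D CNN over word-id sequences (all sizes are illustrative)."""
    model = Sequential()
    # Map each word id to a dense 50-dimensional vector.
    model.add(Embedding(vocab_size, 50))
    # Each filter looks at windows of 5 consecutive words.
    model.add(Conv1D(64, 5, activation="relu"))
    # Keep the strongest response of each filter, wherever it occurs in the text.
    model.add(GlobalMaxPooling1D())
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    return model


text_cnn = build_text_cnn()
# Three fake "documents" of 100 word ids each.
fake_docs = np.random.randint(0, 5000, size=(3, 100))
text_probs = text_cnn.predict(fake_docs, verbose=0)
```
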
Keras: Models
● Models are the most important part of Keras.
● Model = layers, a loss and an optimizer
● These are the objects that you add layers to, and call compile() and fit() on.
● Models can be saved and checkpointed for later use

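As a hedged sketch of the save-and-restore step, the block below builds a tiny model, saves it to a temporary HDF5 file and loads it back. The file name and model shape are made up for the example, and the `.h5` extension is an assumption (recent Keras releases also offer a native `.keras` format).

```python
import os
import tempfile

import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Dense

# Model = layers + a loss + an optimizer.
tiny = Sequential()
tiny.add(Dense(8, activation="relu", input_shape=(4,)))
tiny.add(Dense(3, activation="softmax"))
tiny.compile(loss="categorical_crossentropy", optimizer="sgd")

x = np.random.rand(5, 4).astype("float32")
before = tiny.predict(x, verbose=0)

# Save architecture, weights and optimizer state to a single file, then restore it.
path = os.path.join(tempfile.mkdtemp(), "demo_model.h5")  # hypothetical file name
tiny.save(path)
restored = load_model(path)
after = restored.predict(x, verbose=0)  # should match the original predictions
```

For checkpointing during training, Keras also provides the `ModelCheckpoint` callback, which can be passed to `fit()` through its `callbacks` argument.
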
Keras: Layers
● Layers are used to define what your architecture looks like
● Examples of layers are:
○ Dense layers (this is the normal, fully-connected layer)
○ Convolutional layers (apply convolution operations to the previous layer)
○ Pooling layers (used after convolutional layers)
○ Dropout layers (used for regularization, to avoid overfitting)

Keras: Loss Functions
● Loss functions are used to compare the network’s predicted output with the real output, in each pass of the backpropagation algorithm
○ Loss functions are used to tell the model how the weights should be updated
● Common loss functions are:
○ Mean squared error
○ Cross-entropy
○ etc.

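To show what these two losses actually compute, here is a NumPy-only sketch. These are simplified versions written for illustration, not Keras's own implementations; the sample predictions are made up. A prediction close to the true label yields a small loss, a wrong one a larger loss.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between target and prediction."""
    return np.mean((y_true - y_pred) ** 2)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class, averaged over samples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return np.mean(-np.sum(y_true * np.log(y_pred), axis=1))


# One sample whose true class is index 1 (one-hot encoded).
y_true = np.array([[0.0, 1.0, 0.0]])
good_pred = np.array([[0.1, 0.8, 0.1]])  # most probability on the right class
bad_pred = np.array([[0.7, 0.2, 0.1]])   # most probability on the wrong class

mse_good = mean_squared_error(y_true, good_pred)
mse_bad = mean_squared_error(y_true, bad_pred)
ce_good = categorical_crossentropy(y_true, good_pred)
ce_bad = categorical_crossentropy(y_true, bad_pred)
```

In Keras the same choices are made by name when compiling, e.g. `loss="mean_squared_error"` or `loss="categorical_crossentropy"`.
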
Keras: Optimizers
● Optimizers are strategies used to update the network’s weights in the backpropagation algorithm.
● The simplest optimizer is stochastic gradient descent (SGD), but there are many others you can choose from, such as:
○ RMSProp
○ Adagrad

Keras: Optimizers
● Most optimizers can be tuned using hyperparameters, such as:
○ The learning rate to use
○ Whether or not to use momentum

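A short sketch of how these hyperparameters might be set: instead of passing the optimizer by name, an optimizer object is created and handed to `compile()`. The values below are arbitrary, and the `learning_rate` keyword is an assumption about the installed version (older Keras 2 releases spelled it `lr`).

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD, RMSprop

# SGD with an explicit learning rate and momentum (values are illustrative).
sgd = SGD(learning_rate=0.01, momentum=0.9)
# An alternative optimizer, configured the same way.
rmsprop = RMSprop(learning_rate=0.001)

regressor = Sequential()
regressor.add(Dense(1, input_shape=(3,)))
regressor.compile(loss="mean_squared_error", optimizer=sgd)

# Tiny random regression problem, just to exercise the optimizer.
x = np.random.rand(32, 3).astype("float32")
y = np.random.rand(32, 1).astype("float32")
history = regressor.fit(x, y, epochs=2, batch_size=8, verbose=0)
losses = history.history["loss"]  # one loss value per epoch
```
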
Keras: CPU / GPU
● If your computer has a good graphics card, it can be used to speed up model training
● All models up to now were trained using the GPU.
○ Let’s see what happens if we disable the GPU and force Keras to use the CPU instead.

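The slides do not say how the GPU was disabled in the demo. One common approach, shown here as an assumption, is to hide all CUDA devices from the process through the `CUDA_VISIBLE_DEVICES` environment variable before the backend is imported.

```python
import os

# Hide every CUDA device from this process. This has to happen before the
# backend is imported, because the backend looks for GPUs when it starts up.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Import Keras only after the variable is set; models now run on the CPU.
from keras.models import Sequential
from keras.layers import Dense

cpu_model = Sequential()
cpu_model.add(Dense(1, input_shape=(2,)))
cpu_model.compile(loss="mean_squared_error", optimizer="sgd")
```

Timing the same `fit()` call with and without this setting gives a rough idea of the speed-up the GPU provides.
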
Keras: Other information
● Feature preprocessing
○ Although you can use any other method for feature preprocessing, Keras has a couple of utilities to help, such as:
■ to_categorical (to one-hot encode data)
■ Text preprocessing utilities, such as tokenizing

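As a quick illustration of `to_categorical`, the snippet below one-hot encodes a few made-up integer class labels. Each label becomes a row with a single 1 in the column of its class.

```python
import numpy as np
from keras.utils import to_categorical

# Integer class labels for four samples, with three possible classes.
labels = np.array([0, 2, 1, 2])

# Each label becomes a row with a 1 at the label's index and 0 elsewhere.
one_hot = to_categorical(labels, num_classes=3)
```
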
Keras: Other information
● You can integrate Keras models into a scikit-learn Pipeline.
○ There are special wrapper classes available in Keras that implement the methods expected of a scikit-learn classifier, such as fit(), predict(), predict_proba(), etc.
○ You can also use tools like scikit-learn’s grid search to do model selection on Keras models and find the best hyperparameters for a given task.

Keras: Other information
● Nearly everything in Keras can be regularized. In addition to the Dropout layer, there are all sorts of other regularizers available, such as:
○ Weight regularizers
○ Bias regularizers
○ Activity regularizers

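A minimal sketch of how the three kinds of regularizer can be attached to a single layer, alongside Dropout. The penalty strengths and layer sizes are arbitrary choices for the example.

```python
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Dropout

hidden = Dense(
    32,
    activation="relu",
    input_shape=(10,),
    kernel_regularizer=regularizers.l2(0.01),     # penalizes large weights
    bias_regularizer=regularizers.l2(0.01),       # penalizes large biases
    activity_regularizer=regularizers.l1(0.001),  # penalizes large layer outputs
)

reg_model = Sequential()
reg_model.add(hidden)
reg_model.add(Dropout(0.5))  # randomly drops half of the activations during training
reg_model.add(Dense(1, activation="sigmoid"))
reg_model.compile(loss="binary_crossentropy", optimizer="sgd")
```

The regularization penalties are added to the loss during training, so no other change to `compile()` or `fit()` is needed.
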
Resources
● Keras Cheat Sheet by DataCamp