CS-E4050 - Deep Learning Session 3: Theano · 2016-11-02 · Table of Contents Theano: basic...


CS-E4050 - Deep Learning Session 3: Theano

Jyri Kivinen

Aalto University

22 September 2015

Jyri Kivinen (Aalto University) CS-E4050 - Deep Learning Session 3: Theano 22 September 2015 1 / 32

Table of Contents

Theano: basic information, use in the course, computational resources

Theano setup test

Tutorials and coding

Home exercises

Theano: basic information

- http://deeplearning.net/software/theano/ [Welcome]

- http://deeplearning.net/software/theano/introduction.html [Theano at a Glance]

The use of Theano in the course

- There will be several computer experiments during the course.

- They should be done in Python, and the use of Theano is encouraged (and in some cases required).

- Neural network toolboxes or libraries (e.g. Blocks, PyLearn, Lasagne) are not allowed.

- This session provides a short tutorial on Theano.

- Before that, we will describe the computational resources available for the course and do some checks.

Theano at Aalto

- Aalto IT Services has installed Theano on most, if not all, Linux classroom computers in the Maari and Kandi buildings (https://inside.aalto.fi/display/ITServices/IT+facilities+at+Otaniemi) and in this classroom. Thanks to Aalto IT Services for doing the installations for us, awesome job!

- The computers in the Maari and Kandi buildings have GPUs suitable for GPGPU processing with Theano.

- Remote use of them is generally possible, but extreme care is needed, because someone might be using the desktops locally! (How would you like your user experience to be there?)

Setup in this session

- For the most part, the computations are expected to be done using only CPUs; some GPGPU computation will be done.

- For GPGPU computations (and to speed things up), everyone will be given the hostname of a remote computer that allows them; a list should be circulating, so mark down your hostname.

- Use will be remote, via ssh (as is often the case in practice), and you need to take care not to disturb the local users; use the '-X' flag to enable X11 forwarding (to get graphical output).

- We expect to use the same procedure in the other sessions here.

Let’s test our setups

1. Local desktop Theano check: start a terminal and enter the command

% THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32,\
> nvcc.flags=-D_FORCE_INLINES python -c 'import theano'

2. Remote desktop Theano checks: connect to the remote host using ssh (via kosh or lyta). Then do:

2.1 CPU computation check: same as above.

2.2 GPU computation check: same as above, but with device=gpu.

Report any problems; it does not matter if you get '(CNMeM is disabled, cuDNN not available)'. If successful, disconnect.

Guided revisit to Theano-tutorials

1. http://deeplearning.net/software/theano/tutorial/index.html (Basics: Baby Steps - Algebra, More Examples)

2. 'Introduction to Theano' tutorial by Pascal Lamblin at the Deep Learning Summer School, Montreal 2016: http://videolectures.net/deeplearning2016_lamblin_theano/

Onto coding: first steps together

- Starting Python with a suitable Theano configuration

- Importing libraries (Theano, NumPy, Matplotlib)

- Simple use of the libraries

Function differentiation: manually

- A function will be differentiated with respect to some input variables; the function is provided by the teacher.

- We will first do the maths.

- Then we will implement the gradient computation using Theano without automatic differentiation (that is, without the theano.tensor.grad function).

Function differentiation: automagically (via theano.tensor.grad)

- We will first go through a tutorial: http://deeplearning.net/software/theano/tutorial/gradients.html

- Then we implement the gradient computation using the theano.tensor.grad function.

Function differentiation: comparison of the results, visualization

Assuming appropriate input data (and variables):

- Measure the difference between the gradient computations.

- Visualize the function and the gradients.

Differences between the gradient evaluations should be small.
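
The comparison step can be sketched in plain Python. This is only an illustration under stated assumptions: the function `f` below is a hypothetical stand-in for the teacher-provided function, and central finite differences stand in for the second gradient computation (in the session that role is played by theano.tensor.grad), so that the sketch runs without Theano.

```python
import math

def f(x, y):
    """Hypothetical stand-in for the teacher-provided function."""
    return x ** 2 * y + math.sin(y)

def manual_grad(x, y):
    """Gradient derived by hand: (df/dx, df/dy)."""
    return (2.0 * x * y, x ** 2 + math.cos(y))

def numeric_grad(x, y, h=1e-5):
    """Central finite differences, used here as an independent second gradient."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return (dfdx, dfdy)

def max_abs_diff(g1, g2):
    """Largest absolute component-wise difference between two gradients."""
    return max(abs(a - b) for a, b in zip(g1, g2))

# Arbitrary evaluation points (an assumption; any valid inputs would do).
points = [(0.5, -1.0), (1.0, 2.0), (-2.0, 0.3)]
diffs = [max_abs_diff(manual_grad(x, y), numeric_grad(x, y)) for x, y in points]
for (x, y), d in zip(points, diffs):
    print("x=%.1f y=%.1f  max |manual - numeric| = %.2e" % (x, y, d))
```

A small difference at every point suggests the hand derivation is right; a large one usually points to a sign or index mistake in the manual gradient.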

XOR

- Implement the XOR-solver network described in Section 6.1 of Goodfellow, Bengio, and Courville (2016); URL: http://www.deeplearningbook.org/contents/mlp.html.

- Reproduce the results, producing a visualization similar to Figure 6.1.

- Compute its partial derivatives with respect to the model parameters to get the parameter gradient.

- Evaluate the gradient on (the) data and check the result.
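
As a starting point, the forward pass of that network can be sketched in plain Python. The parameter values are the hand-specified solution given in Section 6.1 of the book; the function names are hypothetical, and the sketch covers neither the visualization nor the gradient, which the exercise asks for in Theano.

```python
# Hand-specified solution from Section 6.1 of Goodfellow, Bengio, and Courville.
W = [[1.0, 1.0], [1.0, 1.0]]   # hidden-layer weights
c = [0.0, -1.0]                # hidden-layer biases
w = [1.0, -2.0]                # output weights
b = 0.0                        # output bias

def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)

def hidden(x):
    """Hidden representation h = max(0, W^T x + c)."""
    return [relu(sum(W[i][j] * x[i] for i in range(2)) + c[j]) for j in range(2)]

def xor_net(x):
    """Network output f(x) = w^T h + b."""
    h = hidden(x)
    return sum(w[j] * h[j] for j in range(2)) + b

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
outputs = [xor_net(x) for x in X]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

Plotting the values of `hidden(x)` for the four inputs gives the learned-space view that the figure in the book illustrates.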

15 minutes break

Logistic regression: Tutorial revisit, implementation

- Logistic regression tutorial at http://deeplearning.net/software/theano/tutorial/examples.html

- Implementation without using shared variables.
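
One way to picture the version without shared variables is a training step that takes the parameters as explicit inputs and returns the updated ones, instead of updating state held elsewhere. The sketch below shows that pattern in plain Python rather than Theano; the toy data, learning rate, and function names are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(w, b, X, y):
    """Mean cross-entropy and its gradient with respect to w and b."""
    n = len(X)
    loss, gw, gb = 0.0, [0.0] * len(w), 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        loss -= (yi * math.log(p) + (1 - yi) * math.log(1.0 - p)) / n
        for j, xj in enumerate(xi):
            gw[j] += (p - yi) * xj / n
        gb += (p - yi) / n
    return loss, gw, gb

def train_step(w, b, X, y, lr=0.5):
    """One gradient-descent step; parameters go in and come out explicitly."""
    loss, gw, gb = loss_and_grad(w, b, X, y)
    new_w = [wj - lr * gj for wj, gj in zip(w, gw)]
    return new_w, b - lr * gb, loss

# Hypothetical toy data: the label is 1 when the first feature exceeds the second.
X = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]]
y = [1, 1, 0, 0]

w, b = [0.0, 0.0], 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, X, y)
    losses.append(loss)
print("first loss %.4f, last loss %.4f" % (losses[0], losses[-1]))
```

The caller owns the parameters throughout, which is the main practical difference from the shared-variable style used in the tutorial.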

Further tutorial visits

Deep Learning Tutorials: familiarize yourself with

- 'Classifying MNIST digits using Logistic Regression': http://deeplearning.net/tutorial/logreg.html

- 'Multilayer Perceptron': http://deeplearning.net/tutorial/mlp.html

Tutorial (extension): logistic regression (network)

- Alternative 1: Logistic regression similar to that in http://deeplearning.net/software/theano/tutorial/examples.html

- Alternative 2: Logistic regression similar to that in http://deeplearning.net/tutorial/logreg.html

- Let's make the model deeper.

Home exercises:

- (Ideally for tomorrow) Read specific parts of Chapter 9 - Convolutional Networks: Introduction, 9.11, 9.1, 9.2, 9.3, 9.7.

- Reported task: Implementation of, and experimentation with, gradient-descent-based training of a feed-forward neural network (with no parameter tying/sharing) using Theano; details on the next slide.

Home exercises:

Reported task: Implementation of, and experimentation with, gradient-descent-based training of a feed-forward neural network (with no parameter tying/sharing) using Theano.

- Choose the details: data, network structure, etc.

- Use the theano.tensor.grad function for gradient computation.

- Experiment with the approach, providing the following visualization: the objective function evaluated on the training data as a function of the number of training epochs. You could also plot further things, such as learned network parameters, unit activations (in response to some data), and (other) important training diagnostics.

- Describe your approach and the results in the report (providing the visualization(s) and the most important lines of code).

On home exercises that will be given tomorrow (contents may change slightly):

- Familiarize yourself with the Convolutional Neural Networks (LeNet) tutorial: http://deeplearning.net/tutorial/lenet.html.

- Reported task: Implement and experiment with gradient-descent-based training of a convolutional (feed-forward) network; details on the next slide.

On home exercises that will be given tomorrow (contents may change slightly):

Reported task: Implement and experiment with gradient-descent-based training of a convolutional (feed-forward) network:

- Choose the details (data, network structure, etc.), with these requirements: have at least one "convolutional" layer, and have pooling.

- Experiment with the approach, providing the following visualization: the objective function evaluated on the training data as a function of the number of training epochs. You could also plot further things, such as learned network parameters, unit activations (in response to some data), and (other) important training diagnostics.

- Provide a description of your approach and the results in the report (providing the visualization(s) and the most important lines of code).

Exercises: Optimization for Deep Learning

- For the Presenters: Implement training of a convolutional feed-forward network via Stochastic Gradient Descent (SGD) with momentum. Using mini-batch sizes of 1 and 100, and also the full training data, do the following: plot the norm of the parameter gradient and the training error/main cost function as a function of the number of parameter updates and also of training epochs. Evaluate the cost using the data used to compute the gradient, and using the full training set both including and excluding that data, each normalized by the number of data points in the evaluation. One epoch visits all training points once; assume the data is shuffled between epochs.

- For the Others: Implement RMSProp for training a feed-forward neural network. Plot the norm of the gradient and the training error on the full training set as a function of the number of training epochs.

Exercises: Optimization for Deep Learning

- For Everyone: Describe your approach and the results in the report (providing the visualization(s) and the most important lines of code).
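
The two update rules named in these exercises can be sketched on a toy problem. This is a minimal illustration, not the exercise solution: it uses the full gradient of a hypothetical two-parameter quadratic instead of a network with mini-batches, the hyperparameter values are assumptions, and the placement of the small constant `eps` in RMSProp varies between references. Each run records the objective and the gradient norm per update, which are the quantities the exercises ask to plot.

```python
import math

A = [1.0, 10.0]  # curvatures of the toy quadratic (an assumption)

def objective(theta):
    """Hypothetical objective f(theta) = 0.5 * sum(a_i * theta_i^2)."""
    return 0.5 * sum(a * t * t for a, t in zip(A, theta))

def grad(theta):
    return [a * t for a, t in zip(A, theta)]

def grad_norm(g):
    return math.sqrt(sum(gi * gi for gi in g))

def sgd_momentum(theta, steps, lr=0.02, alpha=0.9):
    """Momentum: v <- alpha * v - lr * g; theta <- theta + v."""
    v = [0.0] * len(theta)
    history = []
    for _ in range(steps):
        g = grad(theta)
        history.append((objective(theta), grad_norm(g)))
        v = [alpha * vi - lr * gi for vi, gi in zip(v, g)]
        theta = [ti + vi for ti, vi in zip(theta, v)]
    return theta, history

def rmsprop(theta, steps, lr=0.05, rho=0.9, eps=1e-8):
    """RMSProp: r <- rho * r + (1 - rho) * g^2; theta <- theta - lr * g / (sqrt(r) + eps)."""
    r = [0.0] * len(theta)
    history = []
    for _ in range(steps):
        g = grad(theta)
        history.append((objective(theta), grad_norm(g)))
        r = [rho * ri + (1.0 - rho) * gi * gi for ri, gi in zip(r, g)]
        theta = [ti - lr * gi / (math.sqrt(ri) + eps)
                 for ti, gi, ri in zip(theta, g, r)]
    return theta, history

start = [1.0, 1.0]
theta_m, hist_m = sgd_momentum(start, 300)
theta_r, hist_r = rmsprop(start, 300)
print("momentum: objective %.3e -> %.3e" % (hist_m[0][0], objective(theta_m)))
print("rmsprop:  objective %.3e -> %.3e" % (hist_r[0][0], objective(theta_r)))
```

For the actual exercises, `grad` would be replaced by a mini-batch gradient of the network cost, with the training data reshuffled at the start of every epoch.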

Further information and guidance on exercises and their presentation

- We have a few copies of a Python book (Python for Data Analysis by Wes McKinney, O'Reilly, 2013 [2012]) available for the exercise presenters to borrow. One book per group is handed out at the session where the exercise presentation task is given, and it is to be returned at the time of the presentation.

- The recommended presentation time is 45 minutes. That leaves a 15-minute break plus 45 minutes for the others (e.g. to finalize their exercises and get help with them).

Further information and guidance on exercises and their presentation

Help on getting organized with the exercises and experiments:

- Read 'Principles of Research Code' by Charles Sutton at http://www.theexclusive.org/2012/08/principles-of-research-code.html; see also the slides.

- An example layout: a main project folder with the subfolders data, code, configFiles, submitFiles, results, experiments, exercises, presentations, and reports; see the next slide for some additional information.

On the example directory layout and setup

data: holds input data for the algorithms (e.g. MNIST data).

code: contains the main code (Python files).

configFiles: configuration files specifying the (varying) experiment details, such as hyperparameter values; they are inputs to the main code.

results: the experiment results are written to subdirectories specified in the configuration files; use matching names for the configuration file and the results directory.

submitFiles: shell scripts for running the experiments, e.g. starting an experiment using some Python code with the configuration read as input from a configuration file; name them to match the configuration files (in terms of the prefix, etc.).

experiments: describes the experiments conducted, which are started by running the submitFiles.
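
The layout and the matching-names convention can be sketched with a short script. The subfolder names come from the slides; everything else is an assumption made for illustration, including the JSON configuration format, the helper names, the experiment name "exp001", and the hyperparameter keys.

```python
import json
import tempfile
from pathlib import Path

# Subfolder names as listed in the example layout.
SUBFOLDERS = ["data", "code", "configFiles", "submitFiles", "results",
              "experiments", "exercises", "presentations", "reports"]

def create_layout(root):
    """Create the main project folder with the example subfolders."""
    root = Path(root)
    for name in SUBFOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root

def add_experiment(root, name, hyperparameters):
    """Write a config file and create a results directory with a matching name."""
    root = Path(root)
    results_dir = root / "results" / name
    results_dir.mkdir(parents=True, exist_ok=True)
    # The config records where the results go (JSON is an assumed format).
    config = dict(hyperparameters, results_dir=str(results_dir))
    config_path = root / "configFiles" / (name + ".json")
    config_path.write_text(json.dumps(config, indent=2))
    return config_path, results_dir

# A temporary directory keeps the sketch from touching real project files.
project = create_layout(Path(tempfile.mkdtemp()) / "project")
config_path, results_dir = add_experiment(
    project, "exp001", {"learning_rate": 0.1, "epochs": 20})
print(sorted(p.name for p in project.iterdir()))
print(config_path.name, "->", results_dir.name)
```

A script in submitFiles would then take the path of one configuration file as its only argument, which keeps every run reproducible from its configuration.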

Further tips

- See 'Informal guidelines on literature and exercise presentation, and giving feedback on them' on the 'Materials' page in MyCourses.

References

- Welcome – Theano 0.8.2 documentation. URL: http://www.deeplearning.net/software/theano/ [page generated 19 Sep., 2016].

- J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. Theano: A CPU and GPU Math Expression Compiler. In Proc., Python for Scientific Computing Conference (SciPy), 2010.

- Deep Learning Tutorials, Deep Learning 0.1 documentation. URL: http://deeplearning.net/tutorial/ [page generated 20 Sep., 2016].

- I. Goodfellow, Y. Bengio, and A. Courville. Deep learning. Book in preparation for MIT Press, 2016. URL: http://www.deeplearningbook.org.

- Pascal Lamblin. 'Introduction to Theano' tutorial at the Deep Learning Summer School, Montreal 2016. Videolectures.NET, URL: http://videolectures.net/deeplearning2016_lamblin_theano/ [page generated 18 Sep., 2016].

- Wes McKinney. Python for Data Analysis. O'Reilly, 2013 (2012).

- Theano Development Team. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints, abs/1605.02688, May 2016.
