www.hdm-stuttgart.de
Deep Learning @ HdM 2018
UNDERSTANDING THE WORLD, BY LEARNING HOW TO MODEL IT
2Deep Learning @ HdM Stuttgart | Johannes Theodoridis | 09.08.2018
› Johannes Theodoridis
› Audiovisuelle Medien @ HdM
› Computer Science and Media @ HdM
› Exchange @ KTH Stockholm
› Currently working with Johannes Maucher on AI and ML @ HdM
› Email: [email protected]
About me
deepart.io
(Image first slide: https://i.redd.it/2ag4n25oq02y.jpg)
What do you do?
"Irgendwas mit Medien" (something with media)
What today is not about
But don’t be fooled!
Details matter in Deep Learning.
2017 in AI: Poker
Name             Rank  Results (in chips)
Dong Kim         1     -$85,649
Daniel MacAulay  2     -$277,657
Jimmy Chou       3     -$522,857
Jason Les        4     -$880,087
Total:                 -$1,766,250
› Brains Vs. AI - January 2017 @ Rivers Casino Pittsburgh
› AI wins a 20-day heads-up no-limit Texas Hold'em tournament against 4 top-class human poker players.
› ~10^161 different decision points in Texas Hold'em.
› Infeasible to pre-compute a strategy for each of these decision points.
Libratus: The Superhuman AI for No-Limit Poker [Brown, Sandholm – IJCAI 2017]
"I didn't realize how good it was until today. I felt like I was playing against someone who was cheating, like it could see my cards. I'm not accusing it of cheating. It was just that good." – Dong Kim
(Source: https://www.wired.com/2017/01/ai-conquer-poker-not-without-human-help/)
› 2016 AlphaGo
› learned from expert games + self-play
› defeats Lee Sedol (world champion) 4:1
› 2017 AlphaGo Zero
› learned entirely on its own
› defeats AlphaGo 100:0
2017 in AI: Board Games
(Credit: Photo courtesy of Google)
Mastering the game of Go with deep neural networks and tree search [Silver et al. – Nature 2016]
Mastering the game of Go without human knowledge [Silver et al. – Nature 2017]
› 2015
› 2017
2017 in AI: Video Games
Human-level control through deep reinforcement learning [Mnih et al. – Nature 2015]
OpenAI bot wins 1vs1 against Dendi in a best-of-three match.
https://blog.openai.com/dota-2/
https://blog.openai.com/more-on-dota-2/
https://openai.com/the-international/
› August 5, 2018
› Long time horizons: ~20,000 moves (Chess ~40, Go ~150)
› Action Space: ~1000 valid actions each tick (Chess ~35, Go ~250)
› Observation Space: 20,000 numbers representing all game information (Chess 70, Go 400)
› Learned via self-play: "OpenAI Five plays 180 years worth of games against itself every day."
› Hardware: Training is running on 256 GPUs and 128,000 CPU cores.
2018 in AI: Video Games
OpenAI Five wins 2 out of 3 games against a semi-pro team
https://blog.openai.com/openai-five/
Images: blog.openai.com/
› Dermatologist-level classification of skin
cancer with deep neural networks [Esteva et al. – Nature 2017]
› Trained on 129,450 clinical images
› Performance on par when tested against
21 board-certified dermatologists
2017 in AI: Healthcare
› April 11, 2018 - FDA Permits Marketing of First AI-based Medical Device: IDx-DR.
› Diagnostic system that autonomously analyzes images of the retina for signs of diabetic retinopathy.
› "Machines can help the doctor make a better diagnosis, but they are not good at making medical decisions afterward."
[EyeNet: Artificial Intelligence: The Next Step in Diagnostics - American Academy of Ophthalmology (AAO), Nov 2017]
2018 in AI: Healthcare
Source: https://www.eyediagnosis.net
2017 in AI: Systems
› The Case for Learned Index Structures [Kraska et al. – arXiv:1712.01208]
› Replace a B-Tree index or hash index with a neural network
› Up to 70% faster than cache-optimized B-Trees
› While saving an order of magnitude in memory (over several real-world data sets)
› Authors argue that "replacing core components of a data management system through learned models has far reaching implications for future systems designs"
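The learned-index idea can be illustrated with a toy sketch. This is not the architecture from the paper: a single linear model (fitted by least squares) predicts where a key sits in a sorted array, and a bounded binary search corrects the prediction error. The class name `LinearLearnedIndex` and all parameters are assumptions made for illustration.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: position ~ slope * key + intercept (illustrative only)."""

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        positions = range(n)
        mean_k = sum(self.keys) / n
        mean_p = sum(positions) / n
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(self.keys, positions))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The worst prediction error on the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - p) for k, p in zip(self.keys, positions))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


keys = [3, 8, 15, 16, 23, 42, 77, 100]
index = LinearLearnedIndex(keys)
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less searching is left to do, which is the intuition behind the reported speed and memory savings.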
› "I have a terrible confession to make. AI systems today suck." – Yann LeCun at Brown University 2017
› "All of these AI systems we see, none of them is 'real' AI." – Josh Tenenbaum at CCN 2017
Wait what?
› Strong AI (or Artificial General Intelligence, AGI): can solve every task. This is what everyone in the media is worried about (Singularity etc.), but we are not even close!
› Weak AI (or narrow AI): can solve a specific task. This is everything you have seen so far. It works really well for some tasks, like image and speech recognition.
A rough distinction
› The brain learns with an efficiency that none of our machine learning methods can match.
› Our supervised learning systems require large numbers of examples.
› Our reinforcement learning systems require millions of trials.
› That is why we don't have robots that are as agile as a cat or a rat.
› That is why we don't have dialog systems that have common sense.
› What is missing?
› Learning paradigms that build (predictive) models of the world through observation and action.
Why are we “not even close“ to AGI?
Slide copied from: Dr. Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?"
https://www.youtube.com/watch?v=uYwH4TSdVYs
› Machine Learning is the subfield of artificial intelligence concerned with programs that learn from experience.
[Russell and Norvig - Artificial intelligence: a modern approach]
What is Machine Learning?
› Task: Tell if there is an apple in the image
What is Machine Learning?
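Approach 1: write code
def contains_apple(image, min_red_pixels=300):
    # image: iterable of (r, g, b) pixels with values 0..255.
    # What counts as "red" is a hand-picked rule (an assumption for this sketch).
    red_pixels = sum(1 for r, g, b in image if r > 150 and g < 100 and b < 100)
    return red_pixels > min_red_pixels
-> YES / NO
Does not scale
Approach 2: learn from data (Machine Learning)
-> YES / NO
Does scale: with enough compute power and training samples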
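As a minimal sketch of "learn from data", the threshold that Approach 1 hard-codes can instead be chosen from labeled examples. The training pairs below are made up for illustration, and a real deep learning system would also learn the features themselves rather than rely on a red-pixel count.

```python
def learn_threshold(samples):
    """Pick the red-pixel threshold that misclassifies the fewest labeled samples.

    samples: list of (red_pixel_count, has_apple) pairs (hypothetical training data).
    """
    candidates = sorted({count for count, _ in samples})
    best_threshold, best_errors = None, len(samples) + 1
    for t in candidates:
        errors = sum((count >= t) != has_apple for count, has_apple in samples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Hypothetical labeled data: (number of red pixels, image contains an apple?)
training_data = [(20, False), (90, False), (150, False),
                 (410, True), (520, True), (800, True)]
threshold = learn_threshold(training_data)


def contains_apple_learned(red_pixels):
    """Classify with the threshold learned from the data above."""
    return red_pixels >= threshold
```

With more (and more varied) training samples the learned rule changes automatically, whereas the hand-written rule has to be re-tuned by a programmer.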
What is Deep Learning?
› Traditional Pattern Recognition: Fixed/Handcrafted Feature Extractor -> Trainable Classifier
› Deep Learning: Representations are hierarchical and trained: Low-Level Features -> Mid-Level Features -> High-Level Features -> Trainable Classifier
Understanding Neural Networks Through Deep Visualization [Yosinski et al. – ICML 2015]
Slide Credit: Yann LeCun
How do we train these things?
Training Data – Labeled by category (e.g. Label: Fruits, Label: Vehicles)
› Select a random mini-batch of data
› Predict labels P
› Calculate the error by comparing predicted labels P and true labels T
› Update the pipeline towards less error
› Because of the labels we call this SUPERVISED LEARNING.
› These labels need to be generated somehow (mostly by humans).
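The training loop described on this slide can be sketched with a tiny logistic model in plain Python. The two input features, the data values, and the hyperparameters are invented for illustration; a deep network replaces the single linear layer with many trainable layers, but the four steps stay the same.

```python
import math
import random


def predict(weights, bias, x):
    """Probability that x belongs to class 1 (logistic model)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data, steps=500, batch_size=4, lr=0.5, seed=0):
    """data: list of (features, true_label) pairs with labels 0 or 1."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)       # 1. select a random mini-batch
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, t in batch:
            p = predict(weights, bias, x)          # 2. predict label P
            error = p - t                          # 3. compare P with true label T
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / batch_size
            grad_b += error / batch_size
        # 4. update the parameters towards less error (gradient step)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical two-feature samples: label 1 = "Fruits", label 0 = "Vehicles".
data = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.95), 1), ((0.85, 0.7), 1),
        ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0), ((0.15, 0.15), 0)]
weights, bias = train(data)
```

After training, `predict(weights, bias, x) > 0.5` can be read as "Fruits" and anything else as "Vehicles".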
› CNN architecture that was used by [Mnih et al. – Nature 2015] to play Atari Games (Deep Q-Networks - DQN)
What is in the boxes?
Input: Current game screen
Convolutional Neural Network – CNN (note: no pooling layers in this architecture)
Output: Best action to choose
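Because the architecture has no pooling layers, the strided convolutions alone shrink the image. The arithmetic can be sketched as follows; the layer sizes (84 x 84 input, three convolutions) are recalled from Mnih et al. 2015 and are not stated on the slide, so treat them as an assumption.

```python
def conv_output_size(input_size, kernel, stride):
    """Spatial size after a convolution without padding."""
    return (input_size - kernel) // stride + 1


# (filters, kernel, stride) per convolutional layer.
# Assumption: values as recalled from Mnih et al. 2015, not taken from the slide.
DQN_CONV_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def dqn_feature_count(input_size=84, layers=DQN_CONV_LAYERS):
    """Number of values handed from the last conv layer to the dense layers."""
    size, channels = input_size, None
    for filters, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        channels = filters
    return size * size * channels
```

Under these assumptions the spatial size goes 84 -> 20 -> 9 -> 7, so the final feature map holds 7 * 7 * 64 = 3136 values.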
› Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex [Hubel & Wiesel 1962]
A bit of CNN history: Thank you cats :)
AlexNet [Krizhevsky, Sutskever, Hinton 2012]
Neocognitron [Fukushima 1980]
LeNet-5 [LeCun, Bottou, Bengio, Haffner 1998]
Deep Learning
(Photo by Bertil Videt CC BY-SA 3.0)
Large Scale Visual Recognition
Challenge (ILSVRC)
› ½ Nobel Prize in Physiology or Medicine 1981: David H. Hubel and Torsten N. Wiesel "for their discoveries concerning information processing in the visual system".
What does “deep“ mean?
(Figure: VGG layer stack [Simonyan, Zisserman 2014]: Input, then blocks of conv64, conv128, conv256 and conv512 layers separated by maxpool layers, then FC4096, FC4096, FC1000 and softmax.)
Slide Credit: Yann LeCun
GoogLeNet [Szegedy et al. 2014]
ResNet [He et al. 2015]
DenseNet [Huang et al. 2017]
› Image Classification
› Image Retrieval
› Machine Translation
Supervised Learning
ImageNet Classification with Deep Convolutional Neural Networks [Krizhevsky, Sutskever, Hinton 2012]
Convolutional Sequence to Sequence Learning [Gehring et al. 2017]
German: "Sie stimmen zu"
English: "They agree"
› Image Caption Generation
Supervised Learning
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention [Xu et al. 2015]
› Instance Segmentation
Supervised Learning
Mask R-CNN [He et al. 2017]
› Instance Segmentation in traffic
Supervised Learning
Mask R-CNN [He et al. 2017]
(Source: 4K Mask RCNN COCO Object detection and segmentation #2, https://www.youtube.com/watch?v=OOT3UIXZztE)
› Pose Estimation
Supervised Learning
Mask R-CNN [He et al. 2017]
› Play SNES games (Bachelor Thesis @ HdM)
› Learn Locomotion Behaviours @ DeepMind
Reinforcement Learning
Emergence of Locomotion Behaviours in Rich Environments [Heess et al. 2017] (Video: https://www.youtube.com/watch?v=hx_bgoTF7bs)
› Obstacles to AI
› Learning models of the world
› Learning to reason and plan
Yann LeCun at CCN 2017
(but he made this point in many talks)
What are we missing?
› Image Caption Fails.
› The teddy doesn't fit into the brown suitcase because it's too [small/large]. What is too [small/large]? Answers: the suitcase / the teddy. (Winograd Schemas)
› "Tom picked up his bag and left the room".
› These questions are easy for us because we have a model of the
world.
Common Sense Knowledge
(Sources: https://techcrunch.com/2016/11/08/shining-light-on-facebooks-ai-strategy/ ,
http://www.reactiongifs.com/wp-content/uploads/2013/02/nwld.gif , http://images.memes.com/meme/999039 )
› Common Sense is the ability to fill in the blanks
› Filling in the visual field at the retinal blind spot.
› Filling in occluded images, missing segments in speech.
› Intuitive Physics + Intuitive Psychology
› track objects over time
› discount physically implausible trajectories
› distinguish animate agents from inanimate objects
› understand that other people have mental states like goals and beliefs
› Where can this come from? -> Unsupervised Learning
› Most of the learning performed by animals and humans is unsupervised. (no teacher)
› We learn how the world works by observing it.
› We learn that the world is 3-dimensional.
› We learn object permanence.
› We build a model of the world through predictive unsupervised learning. (This predictive model gives us “common sense“)
Common Sense Knowledge
(Slide is a composition from: Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?" https://www.youtube.com/watch?v=uYwH4TSdVYs ; Sources: Baby http://www.mommyshorts.com/wp-content/uploads/2014/09/6a0133f30ae399970b0192aa1b4c77970d-800wi.jpg , Retina by Jerry CrimsonMann CC-BY-SA 3.0)
› Task: Predict in which direction the Mikado sticks will fall
› Problem: Invariant prediction: The training samples are merely representatives of a whole set of possible outputs (e.g. a manifold of outputs)
› We need to represent a distribution. But how do you represent a distribution in high-dimensional space?
› Solution (one): Energy-Based Unsupervised Learning
› Idea: Learn an energy function that takes low values on the data manifold and higher values everywhere else
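A toy sketch of the energy idea, assuming the "direction a stick falls" is encoded as a 2-D unit vector, so that the data manifold is a circle. The energy function here is hand-chosen with one fitted parameter, not a trained energy-based model, and the observations are made up. It shows why averaging the training outputs is a poor prediction: the average lands far from the manifold and gets a high energy.

```python
import math


def fit_radius(samples):
    """Estimate the radius of the circle that the 2-D samples lie on."""
    return sum(math.hypot(x, y) for x, y in samples) / len(samples)


def energy(point, radius):
    """Low on the fitted circle (the data manifold), higher everywhere else."""
    return (math.hypot(*point) - radius) ** 2


# Hypothetical observations: unit vectors for directions in which a stick fell.
angles = [0.0, 0.5 * math.pi, math.pi, 1.5 * math.pi]
observed = [(math.cos(a), math.sin(a)) for a in angles]
radius = fit_radius(observed)

# Averaging all observed outcomes (what a squared-error predictor tends to do).
mean_prediction = (
    sum(x for x, _ in observed) / len(observed),
    sum(y for _, y in observed) / len(observed),
)

unseen_direction = (math.cos(1.0), math.sin(1.0))  # never observed, but plausible
```

Here `energy(unseen_direction, radius)` is close to zero, while `energy(mean_prediction, radius)` is close to one: every plausible outcome is accepted, the blurry average is rejected.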
Learning Predictive Forward Models of the world.
observation 1 observation 2 …
Y1
Y2
Slide Credit: Yann LeCun
Thx: Raphy for playing Mikado with me
› The Generator network will try to generate fake images that fool the discriminator.
› The Discriminator network will try to distinguish between a real and a generated image.
Generative Adversarial Networks (GAN) [Goodfellow et al. 2014]
"Noise" -> Generator (Neural Network) -> generated images
Real-world images + generated images -> Discriminator (Neural Network) -> Real / Fake
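The two competing objectives can be written down as loss functions over the discriminator's outputs. This is a minimal sketch that assumes the discriminator returns a probability that an image is real; it uses the commonly used non-saturating form of the generator loss and leaves out the networks and the optimizer entirely.

```python
import math


def discriminator_loss(d_real, d_fake):
    """d_real / d_fake: discriminator outputs in (0, 1), read as P(image is real).

    Small when real images are rated near 1 and generated images near 0.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Non-saturating generator loss: small when fakes are rated as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

Training alternates between a step that lowers `discriminator_loss` and a step that lowers `generator_loss`; when the discriminator can no longer tell the two sources apart, its outputs hover around 0.5.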
› Generate bedrooms - 2016
Welcome to the GAN Zoo
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [Radford et al. ICLR 2016]
› Generate bedrooms, buildings, cats - 2017
GAN Zoo
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks [Zhang et al. 2017]
› Generate celebrities - 2018
GAN Zoo
IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis [Huang et al. 2018]
Progressive Growing of GANs for Improved Quality, Stability, and Variation [Karras et al. 2018]
High resolution: 1024 x 1024 pixels
GAN Zoo
› Face arithmetic
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [Choi et al. 2017]
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [Radford et al. ICLR 2016]
› Next Frame Prediction
GAN Zoo
Deep multi-scale video prediction beyond mean square error [Mathieu et al. 2016]
Predicting Deeper into the Future of Semantic Segmentation [Luc and Neverova et al. 2017]
(Sources: https://cs.nyu.edu/~mathieu/iclr2016.html , https://github.com/facebookresearch/SegmPred )
› Image-to-Image translation
GAN Zoo
Image-to-Image Translation with Conditional Adversarial Networks [Isola et al. 2017]
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks [Zhu and Park et al. 2017]
Image-to-Image Demo: https://affinelayer.com/pixsrv/
GAN Zoo
› Text-to-Image translation
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks [Zhang et al. 2017]
› Image Colorization
GAN Zoo
Scribbler: Controlling Deep Image Synthesis with Sketch and Color [Sangkloy et al. 2017]
Style2Paints 2.1: https://github.com/lllyasviel/style2paints
Colorful Image Colorization [Zhang, Isola, Efros 2016]
GAN Zoo
› Interactive drawing
Generative Visual Manipulation on the Natural Image Manifold [Zhu et al. 2016]
› We will see a lot more real-world applications of Supervised Learning in many (new) domains.
› We will see more efficient Reinforcement Learning (good for robotics).
› Research in Unsupervised Learning has "just started".
› Key to "stronger" AI: Prediction + Planning = Reasoning.
What's next? (my prediction)
› We have been doing AI and ML since 2006/2007 (Medieninformatik / Mobile Medien)
› Applied approach: How can we bring AI into production?
› Lectures are split ~50/50 between theory and programming
› Constantly growing number of students in AI lectures (last ML course had 60+)
› NEW: ML specialization within the Computer Science and Media Master program.
› Many AI-related projects in: Gaming, Apps, Websites, Embedded Systems
› 10 - 15 degree theses per semester (in-house and with industry: Daimler, Bosch, Porsche etc.)
› We go to Hackathons :)
› Visit us:
www.hdm-stuttgart.de/~maucher
› or come to the HdM Media Night!
(next one is end of Winter Term 18/19 ~ end of January)
› Thank you!
AI @ HdM Stuttgart
Daimler TSS Artificial Intelligence Garage – November 2017