
www.hdm-stuttgart.de

Deep Learning @ HdM 2018

UNDERSTANDING THE WORLD, BY LEARNING HOW TO MODEL IT

Deep Learning @ HdM Stuttgart | Johannes Theodoridis | 09.08.2018

› Johannes Theodoridis

› Audiovisuelle Medien @ HdM

› Computer Science and Media @ HdM

› Exchange @ KTH Stockholm

› Currently working with Johannes Maucher on AI and ML @ HdM

› Email: [email protected]

About me

deepart.io

(Image first slide: https://i.redd.it/2ag4n25oq02y.jpg)


What do you do?

"Irgendwas mit Medien" ("Something with media")



What today is not about

But don’t be fooled!

Details matter in Deep Learning.


2017 in AI: Poker

Name              Rank   Results (in chips)
Dong Kim          1      -$85,649
Daniel MacAulay   2      -$277,657
Jimmy Chou        3      -$522,857
Jason Les         4      -$880,087
Total:                   -$1,766,250

› Brains Vs. AI - January 2017 @ Rivers Casino Pittsburgh

› AI wins 20-day heads-up, no-limit Texas hold’em tournament against 4 top-class human poker players.

› ~10^161 different decision points in Texas hold’em.

› Infeasible to pre-compute a strategy for each of them.

Libratus: The Superhuman AI for No-Limit Poker [Brown, Sandholm – IJCAI 2017]

“I didn’t realize how good it was until today. I felt like I was playing against someone who was cheating, like it could see my cards. I’m not accusing it of cheating. It was just that good.” – Dong Kim

(Source: https://www.wired.com/2017/01/ai-conquer-poker-not-without-human-help/)


› 2016 AlphaGo

› learned from expert games + self-play

› defeats Lee Sedol (world champion) 4:1

› 2017 AlphaGo Zero

› learned entirely on its own

› defeats AlphaGo 100:0

2017 in AI: Board Games

(Credit: Photo courtesy of Google)

Mastering the game of Go with deep neural networks and tree search [Silver et al. – Nature 2016]

Mastering the game of Go without human knowledge [Silver et al. – Nature 2017]


› 2015

› 2017

2017 in AI: Video Games

Human-level control through deep reinforcement learning [Mnih et al. – Nature 2015]

OpenAI bot wins 1vs1 against Dendi in a best-of-three match.

https://blog.openai.com/dota-2/

https://blog.openai.com/more-on-dota-2/

https://openai.com/the-international/


› August 5, 2018

› Long time horizons: ~20,000 moves (Chess ~40, Go ~150)

› Action Space: ~1,000 valid actions each tick (Chess ~35, Go ~250)

› Observation Space: 20,000 numbers representing all game information (Chess 70, Go 400)

› Learned via self-play: “OpenAI Five plays 180 years worth of games against itself every day.”

› Hardware: Training is running on 256 GPUs and 128,000 CPU cores.

2018 in AI: Video Games

OpenAI Five wins 2 out of 3 games against a semi-pro team

https://blog.openai.com/openai-five/

Images: blog.openai.com/


› Dermatologist-level classification of skin cancer with deep neural networks [Esteva et al. – Nature 2017]

› Trained on 129,450 clinical images

› Performance on par when tested against 21 board-certified dermatologists

2017 in AI: Healthcare


› April 11, 2018 - FDA Permits Marketing of First AI-based Medical Device: IDx-DR.

› Diagnostic system that autonomously analyzes images of the retina for signs of diabetic retinopathy.

› “Machines can help the doctor make a better diagnosis, but they are not good at making medical decisions afterward.” [EyeNet: Artificial Intelligence: The Next Step in Diagnostics - American Academy of Ophthalmology (AAO), Nov 2017]

2018 in AI: Healthcare

Source: https://www.eyediagnosis.net


2017 in AI: Systems

› The Case for Learned Index Structures [Kraska et al. – arXiv 1712.01208]

› Replace a B-Tree index or hash index with a neural network

› Up to 70% faster

› Saves an order of magnitude in memory (over several real-world data sets)

› Authors argue that “replacing core components of a data management system through learned models has far reaching implications for future systems designs”


› “I have a terrible confession to make. AI systems today suck” – Yann LeCun at Brown University 2017

› “All of these AI systems we see, none of them is ‘real’ AI” – Josh Tenenbaum at CCN 2017

Wait what?


› Strong AI (or Artificial General Intelligence, AGI) - can solve every task. This is what everyone in the media is worried about (Singularity etc.), but we are not even close!

› Weak AI (or narrow AI) – can solve a specific task. This is everything you have seen so far. Works really well for some tasks like image and speech recognition.

A rough distinction


› The brain learns with an efficiency that none of our machine learning methods can match.

› Our supervised learning systems require large numbers of examples.

› Our reinforcement learning systems require millions of trials.

› That is why we don’t have robots that are as agile as a cat or a rat.

› That is why we don’t have dialog systems that have common sense.

› What is missing?

› Learning paradigms that build (predictive) models of the world through observation and action.

Why are we “not even close“ to AGI?

Slide copied from: Dr. Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?"

https://www.youtube.com/watch?v=uYwH4TSdVYs


› Machine Learning is the subfield of artificial intelligence concerned with programs that learn from experience.

[Russell and Norvig - Artificial intelligence: a modern approach]

What is Machine Learning?


› Task: Tell if there is an apple in the image

What is Machine Learning?

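Approach 1: write code

    import numpy as np

    def contains_apple(image):
        # image: H x W x 3 RGB array; the colour thresholds are illustrative
        r, g, b = image[..., 0], image[..., 1], image[..., 2]
        red_pixels = np.count_nonzero((r > 150) & (g < 100) & (b < 100))
        return red_pixels > 300

YES / NO

Does not scale

Approach 2: learn from data (Machine Learning)

YES / NO

Does scale: with enough compute power and training samples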


› Traditional Pattern Recognition: Fixed/Handcrafted Feature Extractor

› Deep Learning: Representations are hierarchical and trained

What is Deep Learning?

Traditional: Feature Extractor -> Trainable Classifier

Deep: Low-Level Features -> Mid-Level Features -> High-Level Features -> Trainable Classifier

Understanding Neural Networks Through Deep Visualization [Yosinski et al. – ICML 2015]

Slide Credit: Yann LeCun


› Because of the labels we call this SUPERVISED LEARNING.

› These labels need to be generated somehow (by humans mostly).

How do we train these things?

Training Data – Labeled by category (Label: Fruits, Label: Vehicles)

› Select a random mini-batch of data

› Predict Labels P

› Calculate the error by comparing predicted labels (P) and true labels (T)

› Update the pipeline towards less error
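The training loop from this slide can be sketched with a very small model. This is an illustration under assumptions that are not in the slides: the "pipeline" is a logistic regression instead of a deep network, the data are two synthetic 2-D clusters standing in for the two label categories, and the batch size, learning rate and step count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled data: two 2-D clusters, class 0 and class 1 (stand-ins for the two labels).
X = np.vstack([rng.normal(-2.0, 1.0, size=(200, 2)),
               rng.normal(+2.0, 1.0, size=(200, 2))])
T = np.concatenate([np.zeros(200), np.ones(200)])   # true labels T

w = np.zeros(2)        # trainable weights of the toy pipeline
b = 0.0                # trainable bias
learning_rate = 0.1    # assumed value, chosen for this toy problem

def predict(X, w, b):
    """Predicted probability of class 1 (logistic regression)."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

for step in range(500):
    # 1. Select a random mini-batch of data.
    idx = rng.choice(len(X), size=32, replace=False)
    x_batch, t_batch = X[idx], T[idx]
    # 2. Predict labels P.
    p_batch = predict(x_batch, w, b)
    # 3. Calculate the error by comparing predicted and true labels (cross-entropy).
    error = -np.mean(t_batch * np.log(p_batch + 1e-12)
                     + (1 - t_batch) * np.log(1 - p_batch + 1e-12))
    # 4. Update the parameters towards less error (one gradient descent step).
    grad_logits = (p_batch - t_batch) / len(idx)
    w -= learning_rate * (x_batch.T @ grad_logits)
    b -= learning_rate * grad_logits.sum()

accuracy = np.mean((predict(X, w, b) > 0.5) == (T == 1))
print(f"training accuracy: {accuracy:.3f}")
```

A deep network follows the same four steps; only the prediction function and the way the gradient is computed (backpropagation) become more involved.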


› CNN architecture that was used by [Mnih et al. – Nature 2015] to play Atari Games (Deep Q-Networks - DQN)

What is in the boxes?

Input: Current game screen

Convolutional Neural Network – CNN (note: no pooling layers in this architecture)

Output: Best action to choose
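Because this architecture has no pooling layers, the strided convolutions alone shrink the image. The short calculation below shows how, using the layer settings reported by Mnih et al. 2015 (84 x 84 input frames, three convolutions without padding); the helper name `conv_output_size` is only for illustration.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution without padding."""
    return (input_size - kernel_size) // stride + 1

# (number of filters, kernel size, stride) of the three DQN convolutions.
conv_layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

size = 84            # the network sees 84 x 84 preprocessed game frames
sizes = [size]
for filters, kernel, stride in conv_layers:
    size = conv_output_size(size, kernel, stride)
    sizes.append(size)

# Number of values fed into the fully connected layer after the last convolution.
flattened = size * size * conv_layers[-1][0]
print(sizes, flattened)   # [84, 20, 9, 7] 3136
```

The stride does the downsampling that pooling layers would otherwise do: 84 -> 20 -> 9 -> 7 pixels per side.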


› Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex [Hubel & Wiesel 1962]

A bit of CNN history: Thank you cats :)

AlexNet [Krizhevsky, Sutskever, Hinton 2012]

Neocognitron [Fukushima 1980]

LeNet-5 [LeCun, Bottou, Bengio, Haffner 1998]

Deep Learning

(Photo by Bertil Videt CC BY-SA 3.0)

Large Scale Visual Recognition Challenge (ILSVRC)

› ½ Nobel Prize in Physiology or Medicine 1981: David H. Hubel and Torsten N. Wiesel "for their discoveries concerning information processing in the visual system".


What does “deep“ mean?

Figure: VGG layer stack [Simonyan, Zisserman 2014]: Input, stacked conv64 / conv128 / conv256 / conv512 blocks separated by maxpool, then FC4096, FC4096, FC1000, softmax


GoogLeNet [Szegedy et al. 2014]

ResNet [He et al. 2015]

DenseNet [Huang et al. 2017]


› Image Classification, Image Retrieval

› Machine Translation

Supervised Learning

ImageNet Classification with Deep Convolutional Neural Networks [Krizhevsky, Sutskever, Hinton 2012]

Convolutional Sequence to Sequence Learning [Gehring et al. 2017]

German: “Sie stimmen zu”

English: “They agree”


› Image Caption Generation

Supervised Learning

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention [Xu et al. 2015]


› Instance Segmentation

Supervised Learning

Mask R-CNN [He et al. 2017]


› Instance Segmentation in traffic

Supervised Learning


(Source: 4K Mask RCNN COCO Object detection and segmentation #2, https://www.youtube.com/watch?v=OOT3UIXZztE)


› Pose Estimation

Supervised Learning



› Play SNES games (Bachelor Thesis @ HdM)

› Learn Locomotion Behaviours @ DeepMind

Reinforcement Learning

Emergence of Locomotion Behaviours in Rich Environments [Heess et al. 2017] (Video: https://www.youtube.com/watch?v=hx_bgoTF7bs)


› Obstacles to AI

› Learning models of the world

› Learning to reason and plan

Yann LeCun at CCN 2017

(but he made this point in many talks)

What are we missing?


› Image Caption Fails.

› The teddy doesn't fit into the brown suitcase because it's too [small/large]. What is too [small/large]? Answers: the suitcase / the teddy. (Winograd Schemas)

› “Tom picked up his bag and left the room”.

› These questions are easy for us because we have a model of the world.

Common Sense Knowledge

(Sources: https://techcrunch.com/2016/11/08/shining-light-on-facebooks-ai-strategy/ ,

http://www.reactiongifs.com/wp-content/uploads/2013/02/nwld.gif , http://images.memes.com/meme/999039 )


› Common Sense is the ability to fill in the blanks

› Filling in the visual field at the retinal blind spot.

› Filling in occluded images, missing segments in speech.

› Intuitive Physics + Intuitive Psychology

› track objects over time

› discount physically implausible trajectories

› distinguish animate agents from inanimate objects

› understand that other people have mental states like goals and beliefs

› Where can this come from? -> Unsupervised Learning

› Most of the learning performed by animals and humans is unsupervised. (no teacher)

› We learn how the world works by observing it.

› We learn that the world is 3-dimensional.

› We learn object permanence.

› We build a model of the world through predictive unsupervised learning. (This predictive model gives us “common sense“)

Common Sense Knowledge

(Slide is a composition from: Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?" https://www.youtube.com/watch?v=uYwH4TSdVYs , Sources: Baby http://www.mommyshorts.com/wp-content/uploads/2014/09/6a0133f30ae399970b0192aa1b4c77970d-800wi.jpg , Retina by Jerry CrimsonMann CC-BY-SA 3.0)


› Task: Predict in which direction the Mikado sticks will fall

› Problem: Invariant prediction: The training samples are merely representatives of a whole set of possible outputs (e.g. a manifold of outputs)

› We need to represent a distribution. But how do you represent a distribution in high dimensional space?

› Solution (one): Energy-Based Unsupervised Learning

› Idea: Learn an energy function that takes low values on the data manifold and higher values everywhere else
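The energy idea can be made concrete with a toy example. This sketch is an assumption-laden stand-in, not a learned energy-based model: the "data manifold" is the unit circle (all directions a stick could fall in), and the energy is simply the squared distance to the nearest training sample. A real system would learn the energy function with a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training samples: points on the unit circle, standing in for the data manifold.
angles = rng.uniform(0.0, 2.0 * np.pi, size=500)
samples = np.stack([np.cos(angles), np.sin(angles)], axis=1)

def energy(y):
    """Toy energy: squared distance from y to the nearest training sample."""
    diffs = samples - np.asarray(y, dtype=np.float64)
    return float(np.min(np.sum(diffs ** 2, axis=1)))

on_manifold = energy([np.cos(1.0), np.sin(1.0)])   # a plausible outcome: low energy
off_manifold = energy([0.0, 0.0])                  # an implausible outcome: high energy
print(on_manifold, off_manifold)
```

Every point on the circle gets a low energy, so the model does not have to commit to one single prediction; it only has to rank plausible outcomes below implausible ones.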

Learning Predictive Forward Models of the world.

Figure: observation 1 (Y1), observation 2 (Y2), …


Thx: Raphy for playing Mikado with me


› The Generator network will try to generate fake images that fool the discriminator.

› The Discriminator network will try to distinguish between a real and a generated image.

Generative Adversarial Networks (GAN) [Goodfellow et al. 2014]

”Noise” -> Generator (Neural Network) -> Fake images

Real-world images + Fake images -> Discriminator (Neural Network) -> Real / Fake
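The two competing objectives can be written down in a few lines. This is a sketch under strong simplifications: the generator and discriminator are tiny hand-written functions on 1-D data instead of neural networks on images, no training is performed, and the generator loss uses the non-saturating variant. All function names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z, theta):
    """Toy generator: maps noise z to fake 1-D samples via scale and shift."""
    return theta[0] * z + theta[1]

def discriminator(x, phi):
    """Toy discriminator: probability that x is real (logistic model)."""
    return 1.0 / (1.0 + np.exp(-(phi[0] * x + phi[1])))

def gan_losses(real, noise, theta, phi, eps=1e-12):
    fake = generator(noise, theta)
    d_real = discriminator(real, phi)
    d_fake = discriminator(fake, phi)
    # Discriminator: call real samples real and generated samples fake.
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # Generator (non-saturating form): make the discriminator call fakes real.
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss

real = rng.normal(4.0, 0.5, size=256)    # stand-in for real-world images
noise = rng.normal(0.0, 1.0, size=256)   # the "noise" input of the generator

# An undecided discriminator outputs 0.5 for every input.
d_undecided, g_undecided = gan_losses(real, noise, theta=(1.0, 0.0), phi=(0.0, 0.0))
# A discriminator that separates the two distributions well (threshold at x = 2).
d_sharp, g_sharp = gan_losses(real, noise, theta=(1.0, 0.0), phi=(5.0, -10.0))
print(d_undecided, g_undecided, d_sharp, g_sharp)
```

When the discriminator gets better, its own loss drops and the generator's loss rises; training alternates between the two so that the generator is pushed towards samples the discriminator can no longer tell apart from real data.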


› Generate bedrooms - 2016

Welcome to the GAN Zoo

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [Radford et al. ICLR 2016]


› Generate bedrooms, buildings, cats - 2017

GAN Zoo

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks [Zhang et al. 2017]


› Generate celebrities - 2018

GAN Zoo

IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis [Huang et al. 2018]

Progressive Growing of GANs for Improved Quality, Stability, and Variation [Karras et al. 2018]

High resolution: 1024 x 1024 pixels


GAN Zoo

› Face arithmetic

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation [Choi et al. 2017]

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks [Radford et al. ICLR 2016]


› Next Frame Prediction

GAN Zoo

Deep multi-scale video prediction beyond mean square error [Mathieu et al. ICLR 2016]

Predicting Deeper into the Future of Semantic Segmentation [Luc and Neverova et al. 2017]

(Sources: https://cs.nyu.edu/~mathieu/iclr2016.html, https://github.com/facebookresearch/SegmPred )


› Image-to-Image translation

GAN Zoo

Image-to-Image Translation with Conditional Adversarial Networks [Isola et al. 2017]

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks [Zhu and Park et al. 2017]

Image-to-Image Demo: https://affinelayer.com/pixsrv/


GAN Zoo

› Text-to-Image translation

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks [Zhang et al. 2017]


› Image Colorization

GAN Zoo

Scribbler: Controlling Deep Image Synthesis with Sketch and Color [Sangkloy et al. 2017]

Style2Paints 2.1: https://github.com/lllyasviel/style2paints

Colorful Image Colorization [Zhang, Isola, Efros 2016]


GAN Zoo

› Interactive drawing

Generative Visual Manipulation on the Natural Image Manifold [Zhu et al. 2016]


› We will see a lot more real world applications of Supervised Learning in many (new) domains.

› We will see more efficient Reinforcement Learning (good for robotics).

› Research in Unsupervised Learning “just started”.

› Key to “stronger” AI: Prediction + Planning = Reasoning.

What’s next? (my prediction)


› We have been doing AI and ML since 2006 / 2007 (Medieninformatik / Mobile Medien)

› Applied approach: How can we bring AI into production?

› Lectures are split ~50/50 between theory and programming

› Constantly growing number of students in AI lectures (last ML course was 60+)

› NEW: ML specialization within the Computer Science and Media Master program.

› Many AI related projects in: Gaming, Apps, Websites, Embedded Systems

› 10 - 15 degree theses per semester (inhouse and with industry: Daimler, Bosch, Porsche etc.)


› We go to Hackathons :)

› Visit us:

www.hdm-stuttgart.de/~maucher

› or come to the HdM Media Night!

(next one is end of Winter Term 18/19 ~ end of January)

› Thank you!

AI @ HdM Stuttgart

Daimler TSS Artificial Intelligence Garage – November 2017
