Unsupervised Video Object Segmentation for Deep Reinforcement Learning. Machine Learning and Data Analytics Symposium, Doha, Qatar, April 1, 2019. Vikash Goel, Jameson Weng, Pascal Poupart.

Unsupervised Video Object Segmentation for Deep Reinforcement Learning

Machine Learning and Data Analytics Symposium, Doha, Qatar, April 1, 2019

Vikash Goel, Jameson Weng, Pascal Poupart


Pascal: RBC Borealis AI Research Director

• Research institute funded by RBC

• 5 research centers

– Montreal, Toronto, Waterloo, Edmonton and Vancouver

• 80 researchers

– Integrated (applied & fundamental) research model

• ML, RL, NLP, computer vision, private AI, knowledge graphs

• We are hiring!


Pascal: ML Professor at U of Waterloo

• Deep Learning

– Automated structure learning, sum-product networks, transfer learning

• Reinforcement learning

– Constrained RL, motion-oriented RL, sport analytics

• NLP

– Conversational agents, machine translation, automated proofreading

• Theory

– Convex relaxations of sum-product networks, characterization of local optima in mixture models, consistent approximate Bayesian techniques


Outline

• Background

– Reinforcement learning: data inefficiency

– Solution: self-supervised learning

• MOREL: Motion-Oriented REinforcement Learning

– Unsupervised object & motion recognition

– Faster policy optimization & interpretability

Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.


Reinforcement Learning

Games, robotics, automated trading, autonomous driving, recommender systems, conversational agents, operations research, data center optimization

[Figure: agent-environment loop. The agent sends an action to the environment, which returns an observation and a reward.]
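The agent-environment interaction on this slide can be sketched as a short loop. This is a minimal illustration only: `EchoEnv`, `run_episode` and the reward rule are hypothetical names and choices made for this sketch, not part of the talk.

```python
import random


class EchoEnv:
    """Hypothetical toy environment: reward 1 when the action matches the observed bit."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.steps = 0
        self.bit = 0

    def reset(self):
        self.steps = 0
        self.bit = self.rng.randint(0, 1)
        return self.bit  # first observation

    def step(self, action):
        reward = 1.0 if action == self.bit else 0.0
        self.steps += 1
        self.bit = self.rng.randint(0, 1)  # next observation
        done = self.steps >= self.horizon
        return self.bit, reward, done


def run_episode(env, policy):
    """Run one episode of the observation -> action -> reward loop."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward
```

A policy that copies the observation, `run_episode(EchoEnv(), lambda obs: obs)`, collects the maximum return in this toy setting; a fixed action collects less on average.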


Data Inefficiency

• Most RL successes: simulated environments

• Atari baselines: 40M frames (Schulman et al., 2017)

[Images: Atari, MuJoCo, VizDoom, Computer Go]


Image-based RL

[Figure: image → deep neural network → actions or values, trained from a sparse reward]


Self-supervised learning

• Auxiliary tasks and objectives

– Future observation/reward prediction

– Past observation prediction (inverse dynamics)

– Observation reconstruction (auto-encoder)

[Figure: agent-environment loop with observation, reward and action]
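One common way to use an auxiliary task is to add its loss to the RL loss with a weight, so the network receives a training signal at every step even when rewards are rare. The sketch below assumes a future-observation prediction task with a mean squared error; the function names and `aux_weight` are hypothetical and chosen for illustration.

```python
def mse(prediction, target):
    """Mean squared error between two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


def loss_with_auxiliary_task(rl_loss, predicted_next_obs, next_obs, aux_weight=0.5):
    """Combine the RL loss with a weighted auxiliary prediction loss."""
    auxiliary_loss = mse(predicted_next_obs, next_obs)
    return rl_loss + aux_weight * auxiliary_loss
```

For example, `loss_with_auxiliary_task(1.0, [0.0, 1.0], [0.0, 0.0])` adds half of an auxiliary error of 0.5 to the RL loss.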


Image-based RL

• Deep RL:

[Figure: image → deep neural network → actions or values, trained from a sparse reward]

• Self-supervised RL (auxiliary tasks):

[Figure: image → deep neural network → next image, trained from a dense signal]


Prior knowledge

• What do you see?

– Humans: moving objects

– RL agent: sequence of pixels

[Images: Seaquest, Space Invaders, Breakout]


Discovery of relevant features slows down learning

[Figure: image → deep neural network → actions or values, trained from a sparse reward. The network is labeled with two stages: feature extraction and policy optimization.]


Faster Learning

Can we learn a policy that automatically segments moving objects and identifies relevant objects?

[Images: Seaquest, Space Invaders, Breakout]


Outline

• Background

– Reinforcement learning: data inefficiency

– Solution: self-supervised learning

• MOREL: Motion-Oriented REinforcement Learning

– Unsupervised object & motion recognition

– Faster policy optimization & interpretability

Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.


MOREL: Motion-Oriented RL

Phase 1: Unsupervised object segmentation

– Only 1% of the frames (random actions)

Phase 2: Faster policy optimization

– Based on object segmentation and motion


Motion Consistency

• Supervised segmentation: labor-intensive labeling

• Idea: leverage optical flow (structure from motion)
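One way to read the motion-consistency idea: pixels that belong to the same object move together, so a dense flow field can be written as a sum of per-object masks, each multiplied by that object's 2D translation. The sketch below assumes the soft masks and translations are already given (in an SfM-Net-style model a network would predict them); `compose_flow` and its argument layout are hypothetical.

```python
def compose_flow(masks, translations):
    """Build a dense flow field from K soft masks and K object translations.

    masks: list of K grids (H x W) with values in [0, 1]
    translations: list of K (dx, dy) pairs, one per mask
    Returns an H x W grid of (dx, dy) flow vectors.
    """
    height, width = len(masks[0]), len(masks[0][0])
    flow = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for mask, (dx, dy) in zip(masks, translations):
        for i in range(height):
            for j in range(width):
                fx, fy = flow[i][j]
                # Each pixel moves with the objects that cover it.
                flow[i][j] = (fx + mask[i][j] * dx, fy + mask[i][j] * dy)
    return flow
```

With two one-pixel objects on a 2 x 2 frame, the flow is non-zero only where a mask is active, and pixels outside every mask stay static.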


SfM-Net

Vijayanarasimhan, Ricco, Schmid, Sukthankar, Fragkiadaki, SfM-Net: Learning of Structure and Motion from Video, arXiv, 2017.


SfM-Net predictions (KITTI 2015)


Simplified 2D SfM-Net

• No skip connection

• Reconstruction loss: DSSIM (structural dissimilarity)

• Flow regularization: L1 loss

• Curriculum: gradually increase $\lambda_{reg}$ from 0 to 1

$L_{reconstruct} = \mathrm{DSSIM}$

$L_{reg} = \sum_k \lVert M^{(k)} \times t_k \rVert_1$

$L_{all} = L_{reconstruct} + \lambda_{reg} L_{reg}$
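The loss on this slide combines a reconstruction term with a flow regularizer whose weight is ramped up over training. The sketch below is a simplified illustration under stated assumptions: SSIM is computed over the whole image as a single window (practical implementations use local windows), DSSIM is taken as (1 − SSIM) / 2, the regularizer is the L1 norm of a flattened flow field, and the ramp is linear. All function names and `ramp_steps` are hypothetical.

```python
def ssim_global(x, y, data_range=1.0):
    """Single-window SSIM over two equal-length lists of pixel intensities."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def dssim(x, y):
    """Structural dissimilarity: 0 for identical images."""
    return (1.0 - ssim_global(x, y)) / 2.0


def flow_l1(flow):
    """L1 norm of a flattened flow field given as (dx, dy) pairs."""
    return sum(abs(dx) + abs(dy) for dx, dy in flow)


def lambda_reg(step, ramp_steps):
    """Curriculum weight: assumed linear ramp from 0 to 1 over ramp_steps."""
    return min(1.0, max(0.0, step / ramp_steps))


def total_loss(reconstruction, target, flow, step, ramp_steps):
    """Reconstruction loss plus curriculum-weighted flow regularization."""
    return dssim(reconstruction, target) + lambda_reg(step, ramp_steps) * flow_l1(flow)
```

At step 0 only the reconstruction term is active; once the ramp finishes, non-zero flow is penalized at full weight, which favors sparse motion.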


Simplified 2D SfM-Net

[Figure: results on Breakout and Pong (rows). Columns: Frame 1, Frame 2, Masks (summed), Most salient mask, Optical flow]


Unsupervised object segmentation

[Figure: results on Space Invaders, Beam Rider and Seaquest (rows). Columns: Frame 1, Frame 2, Masks (summed), Most salient mask, Optical flow]


MOREL: Motion-Oriented RL

Multi-objective: max rewards and min optical flow error

Comparison with PPO

• Better: 25 games

• Similar: 25 games

• Worse: 9 games

Comparison with A2C

• Better: 26 games

• Similar: 30 games

• Worse: 3 games
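The game counts above can be turned into shares for easier comparison. The small helper below is only an arithmetic aid; `outcome_shares` is a hypothetical name, and the counts are the ones listed on the slide.

```python
def outcome_shares(better, similar, worse):
    """Return the total number of games and the percentage in each outcome."""
    total = better + similar + worse
    counts = (("better", better), ("similar", similar), ("worse", worse))
    return total, {name: round(100.0 * count / total, 1) for name, count in counts}


ppo_total, ppo_shares = outcome_shares(25, 25, 9)
a2c_total, a2c_shares = outcome_shares(26, 30, 3)
```

Both comparisons cover 59 games: against PPO, MOREL is better on about 42.4% and worse on about 15.3%; against A2C, it is better on about 44.1% and worse on about 5.1%.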


Videos

[Videos: Pong, Breakout, Seaquest, Beamrider]


Performance Curves

[Figure: episode rewards vs. frames for Breakout, Seaquest, Pong and Beamrider]


Ablation Study

[Figure: episode rewards vs. frames for Breakout, Seaquest, Pong and Beamrider]


Conclusion

• MOREL: Motion-Oriented REinforcement Learning

– Unsupervised object & motion recognition

– Faster policy optimization & interpretability

• Future work

– 3D environments, physics-based dynamics, object-oriented RL, model-based RL

Reference: Goel, Weng, Poupart (2018) Unsupervised Video Object Segmentation for Deep Reinforcement Learning, NeurIPS.


RBC Borealis AI

• Graduating soon?

– Join RBC Borealis AI (https://www.borealisai.com)

– Email: [email protected]

• Research Institute

– Fundamental research (publications)

– Applied research (products)

• Topics

– RL: automated trading

– NLP: news filtering, information extraction, text generation

– Computer Vision: satellite-based house valuation

– Privacy: differential privacy

– Knowledge graphs: recommender systems

