Depth Images Prediction from a Single RGB Image
Using Deep Learning
Deep Learning
May 2017
Soubhi Hadri
Table of Contents:
Introduction
Existing Solutions
Dataset and Model
Project Code and Results
Introduction
- In 3D computer graphics, a depth map is an image or image channel
that contains information about the distance of the surfaces of
scene objects from a viewpoint.
- RGB-D image: an RGB image and its corresponding depth image.
- A depth image is an image channel in which each pixel encodes the
distance between the image plane and the corresponding object in the
RGB image.
To approximate the depth of objects:
• Stereo camera: a camera with two or more lenses that simulates human binocular vision.
• RealSense or Kinect sensors, which capture RGB-D images directly.
• Deep learning: predict the depth map from a single RGB image.
Existing Solutions
Deep Learning for depth estimation :
Many recent works estimate the depth map of a scene from a single RGB image.
For example: Learning Fine-Scaled Depth Maps from Single RGB Images (7 Feb 2017).
Dataset & Model
Dataset : NYU Depth V2
The NYU-Depth V2 dataset consists of video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.
The dataset consists of:
• 1449 labeled pairs of aligned RGB and depth images (2.8 GB).
• 407,024 new unlabeled frames: raw RGB and depth (428 GB).
• Toolbox: useful functions for manipulating the data and labels.
Different parts of the dataset can be downloaded individually.
Authors: Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus (2012).
For this project:
• Office 1-2 dataset (part of the whole dataset).
• 15 GB after processing the raw data.
• 3522 RGB-D images.
Split the data (3522 images):
• 80% (2817 images) for training and validation: 2414 training, 403 validation.
• 20% (705 images) for testing.
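The split figures above can be reproduced with a short sketch. It assumes a random shuffle, a test share of 20% rounded up, and the validation count of 403 taken directly from the slide; the function name `split_indices` and the fixed seed are illustrative and not taken from the project code.

```python
import math

import numpy as np


def split_indices(n_samples, test_fraction, n_validation, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.

    Assumption: the test share is rounded up to a whole number of images.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = math.ceil(n_samples * test_fraction)
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_validation]
    train_idx = order[n_test + n_validation:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(3522, 0.2, 403)
print(len(train_idx), len(val_idx), len(test_idx))  # 2414 403 705
```

Splitting by index rather than copying the images keeps the three parts disjoint and makes it cheap to save each part to its own .npy file afterwards.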
Samples of the data:
The Model for Depth Estimation:
Model proposed by Jan Ivanecký in his master's thesis (2016).
He derived his model from Eigen et al., "Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture" (17 Dec 2015).
Global context network estimates a rough depth map of the whole scene from the input RGB image.
Gradient network estimates horizontal and vertical gradients of the depth map globally, for the whole RGB image.
Refining network improves the rough estimate from the global context network, using the gradients estimated by the gradient network and the input RGB image.
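To make "horizontal and vertical gradients of the depth map" concrete, here is a minimal sketch that computes them from a ground-truth depth map with forward differences. This is only one common discretization and an assumption on our part; the thesis may define the gradient targets differently, and the name `depth_gradients` is hypothetical.

```python
import numpy as np


def depth_gradients(depth):
    """Forward-difference gradients of a 2D depth map.

    Returns (grad_x, grad_y) with the same shape as the input; the last
    column of grad_x and the last row of grad_y are left at zero.
    """
    depth = np.asarray(depth, dtype=np.float32)
    grad_x = np.zeros_like(depth)
    grad_y = np.zeros_like(depth)
    grad_x[:, :-1] = depth[:, 1:] - depth[:, :-1]  # change along image width
    grad_y[:-1, :] = depth[1:, :] - depth[:-1, :]  # change along image height
    return grad_x, grad_y


# A depth map that increases by 1 per pixel from left to right.
ramp = np.tile(np.arange(4, dtype=np.float32), (3, 1))
grad_x, grad_y = depth_gradients(ramp)
```

For this ramp the horizontal gradient is 1 everywhere except the zero-padded last column, and the vertical gradient is 0, which is the kind of edge and slope information the refining network can use to sharpen the rough estimate.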
Global context network
Architecture of the global context network
The model is derived from AlexNet.
Loss Function:
Root mean squared error in log space (RMS-log)
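As a sketch of this loss, assuming the usual definition sqrt(mean((log d - log d*)^2)) over all pixels, where d is the predicted depth and d* the ground truth. The small `eps` clamp that guards against log(0) is our own assumption and not something the slides specify; the project itself computes the loss in TensorFlow, whereas NumPy is used here only for illustration.

```python
import numpy as np


def rms_log(pred, target, eps=1e-6):
    """Root mean squared error between log-depths.

    eps is an assumed safeguard so that zero depths do not produce -inf.
    """
    pred = np.maximum(np.asarray(pred, dtype=np.float64), eps)
    target = np.maximum(np.asarray(target, dtype=np.float64), eps)
    diff = np.log(pred) - np.log(target)
    return float(np.sqrt(np.mean(diff ** 2)))


depth_true = np.array([[1.0, 2.0], [4.0, 8.0]])
print(rms_log(depth_true, depth_true))        # 0.0
print(rms_log(2.0 * depth_true, depth_true))  # log(2), about 0.693
```

Because the error is taken on log-depths, predicting twice the true depth costs the same for near and far pixels, which is why this loss is popular for depth estimation.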
Training the Network:
1- Scale the label (depth) images to [0, 1].
2- Subtract 127 from the input images to center the data (a simple form of normalization).
3- Initialize the convolution layers from an AlexNet pre-trained CNN (transfer learning).
4- Train the network in batches (batch size = 32) for 35 epochs.
5- Save the session and model at the end of each epoch.
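Steps 1 and 2 above can be sketched as follows. The slides do not say how the labels are scaled, so min-max scaling over the batch is an assumption here, and the function name `preprocess` is hypothetical.

```python
import numpy as np


def preprocess(rgb_batch, depth_batch):
    """Center RGB inputs and scale depth labels to [0, 1].

    Assumption: labels are min-max scaled over the whole batch.
    """
    rgb = rgb_batch.astype(np.float32) - 127.0  # step 2: center the inputs
    depth = depth_batch.astype(np.float32)
    d_min, d_max = depth.min(), depth.max()
    depth = (depth - d_min) / max(d_max - d_min, 1e-8)  # step 1: scale labels
    return rgb, depth


rgb_in = np.full((2, 4, 4, 3), 127, dtype=np.uint8)
depth_in = np.linspace(0.5, 10.0, 2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
rgb_out, depth_out = preprocess(rgb_in, depth_in)
```

Converting to float before subtracting matters: subtracting 127 from unsigned 8-bit pixels directly would wrap around instead of producing negative values.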
Project Functions:
1- split_data: split the data and save it to training/validation/test .npy files.
2- load_data: load the data from the .npy files.
3- plot_imgs: plot a pair of images.
4- get_next_batch: get the next batch from the training data.
5- loss: calculate the loss function.
6- model: create the model (network structure).
7- train: start training.
8- evaluate: evaluate new data after restoring the model.
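As an illustration of what a function like get_next_batch might do, here is a minimal sketch. The class name `BatchIterator`, its wrap-around behaviour, and the tiny placeholder arrays are assumptions for the example and are not taken from the project code.

```python
import numpy as np


class BatchIterator:
    """Hypothetical helper that serves consecutive mini-batches."""

    def __init__(self, images, depths, batch_size=32):
        assert len(images) == len(depths), "images and depths must be paired"
        self.images = images
        self.depths = depths
        self.batch_size = batch_size
        self.position = 0

    def get_next_batch(self):
        start = self.position
        end = min(start + self.batch_size, len(self.images))
        # Wrap around once the last (possibly smaller) batch has been served.
        self.position = 0 if end == len(self.images) else end
        return self.images[start:end], self.depths[start:end]


images = np.zeros((70, 8, 8, 3), dtype=np.uint8)   # placeholder RGB data
depths = np.zeros((70, 8, 8), dtype=np.float32)    # placeholder depth data
batches = BatchIterator(images, depths, batch_size=32)
sizes = [len(batches.get_next_batch()[0]) for _ in range(4)]
print(sizes)  # [32, 32, 6, 32]
```

With 70 samples and a batch size of 32, the third batch holds the 6 leftover samples and the fourth call starts the next epoch from the beginning.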
Project Tools and Libraries:
1- TensorFlow.
2- Slim: a lightweight library for defining, training and evaluating complex models in TensorFlow.
3- TensorBoard.
4- NumPy.
5- Matplotlib.
Project Results:
Training loss:
Samples of new data:
Explanation :
• Training data is not sufficient.
In Jan's experiment:
• Full NYU dataset and 3 datasets generated from the original one.
• Network was trained for 100,000 iterations.
This experiment:
• It took ~26 hours for 30 Epochs.
Project :
The project code and data will be available on GitHub:
https://github.com/SubhiH/Depth-Estimation-Deep-Learning
Resources :
-https://arxiv.org/pdf/1607.00730.pdf
-http://janivanecky.com/
-http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html
Thank You