Minimizing Annotation Effort
Dr. Antonio M. López
antonio@cvc.uab.es
June 9th, 2019
ACKNOWLEDGMENTS
ICREA Academia Programme
MICINN Project TIN2017-88709-R ("DANA")
AGAUR 2017-SGR-01597
CERCA (Centres de Recerca de Catalunya)
ACCIÓ (Generalitat de Catalunya)
Divide-&-Conquer Engineering View: Modular approach (Perception → Local Maneuver)
Deep CNNs Need Annotated Data
Let’s label data for fun!
2008-10 2011 2012-14 2015-16 2017-19
1st object detector fully trained using videogame data.
ECCV’14, ICCV’15, ECCV’16, ICCV’17, ECCV’18*
(*) Friday, Full day at room N1095ZG, VisDA Challenge
DA Virtual→Real for DPM
Virtual/Augmented Reality
for Visual Artificial Intelligence (VARVAI)
Deep Learning “starts”
for Computer Vision
Transferring & Adapting Source Knowledge
in Computer Vision (TASK-CV)
ECCV’16 & ACM-MM’16
’18: Computer Graphics for Autonomous Driving
Explosion in the use of synthetic
data in Computer Vision: GTA-V,
Internet Models, ...
AD Challenge
@ CVPR’19
Pure Data-Driven AI View & Naturalistic View: End-to-End Autonomous Driving
Imitation Learning: No manual supervision
ALVINN (1988)¹ DAVE (2005)²
1.D. Pomerleau. ALVINN: An autonomous land vehicle in a neural network. NIPS, 1988.
2.Y. LeCun, U. Muller, J. Ben, E. Cosatto, and B. Flepp. Off-road obstacle avoidance through end-to-end learning. NIPS, 2005.
Pure Data-Driven AI View & Naturalistic View: End-to-End Autonomous Driving (P&LP)
Still, many diverse experiences are required!
Index
• SYNTHIA: co-training object detectors
• CARLA: multimodal end-to-end driving
Self-Learning, under domain shift
(source: SYNTHIA, target: real-world dataset)
Diagram: Unlabelled Real-world Data → (Detect, Object Detector) → Detections as Labelled Data → Self-labelled Real-world Data → (Train) → Object Detector
Basic assumption:
The source model is relatively good at detecting
on target data.
Basic idea:
1. Start with a detector trained on SYNTHIA.
2. Use the detector to process images of an
unlabelled real-world dataset (e.g. KITTI).
3. Select the M images with the highest detection
scores (threshold set for high precision, low recall).
4. Use detections and backgrounds from these
M images as self-labelled real-world data.
5. Retrain the detector with the SYNTHIA
data and the self-labelled data.
6. Keep doing 2-5 for C cycles.
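The cycle above can be sketched as a short loop. Everything here is a hypothetical stand-in for the real pipeline: `train_fn` and `detect_fn` abstract the actual detector training and inference, and the data structures are placeholders.

```python
def self_training(train_fn, detect_fn, source_data, unlabelled_images, M, C):
    """Self-training under domain shift (sketch).

    train_fn(labelled_data) -> detector
    detect_fn(detector, image) -> (detections, confidence score)
    """
    detector = train_fn(source_data)                 # 1. train on SYNTHIA (source)
    for _ in range(C):                               # 6. repeat steps 2-5 for C cycles
        # 2. run the current detector on the unlabelled target images
        scored = [(img,) + tuple(detect_fn(detector, img)) for img in unlabelled_images]
        # 3. keep the M images with the highest detection scores
        scored.sort(key=lambda t: t[2], reverse=True)
        # 4. treat those detections as labels: self-labelled target data
        self_labelled = [(img, dets) for img, dets, _ in scored[:M]]
        # 5. retrain on the source data plus the self-labelled data
        detector = train_fn(source_data + self_labelled)
    return detector
```

Note that step 4 reuses the detector's own outputs as ground truth, which is why the high-precision threshold in step 3 matters: confident false positives would otherwise be amplified across cycles.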
Co-Training, under domain shift
(source: SYNTHIA, target: real-world dataset)
Diagram: Unlabelled Real-world Data → (Detect, Object Detector #1) → Detections as Labelled Data → Self-labelled Real-world Data #1 → (Train) → Object Detector #2; symmetrically, Object Detector #2 → (Detect) → Self-labelled Real-world Data #2 → (Train) → Object Detector #1
Basic assumptions:
1. Source models are good at
detecting on target data.
2. The two detectors behave
essentially differently.
Basic idea:
1. ~ Self-learning: one detector
(#1) sends the other (#2)
the M images with the most
confident detections.
2. ~ Discrepancy: from these M
images, the other detector
(#2) keeps only the N on which
it is least confident, N < M.
3. Parallel training.
4. Keep doing 1-3 for C cycles.
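The exchange between the two detectors can be sketched as follows; as before, `train_fn` and `detect_fn` are hypothetical stand-ins for the real training and inference code.

```python
def co_training(train_fn, detect_fn, source_data, unlabelled, M, N, C):
    """Co-training under domain shift (sketch); all interfaces hypothetical.

    detect_fn(detector, image) -> (detections, confidence); N < M.
    """
    det = [train_fn(source_data), train_fn(source_data)]
    self_labelled = [[], []]              # target data gathered for each detector
    for _ in range(C):                    # 4. repeat steps 1-3 for C cycles
        for i, j in ((0, 1), (1, 0)):
            # 1. ~self-learning: detector i picks its M most confident images
            by_conf_i = sorted(unlabelled,
                               key=lambda im: detect_fn(det[i], im)[1], reverse=True)
            top_m = by_conf_i[:M]
            # 2. ~discrepancy: detector j keeps only the N of those M images
            #    on which it is LEAST confident, i.e. where it can still learn
            kept = sorted(top_m, key=lambda im: detect_fn(det[j], im)[1])[:N]
            self_labelled[j] = [(im, detect_fn(det[i], im)[0]) for im in kept]
        # 3. parallel training on source data plus the exchanged self-labels
        det = [train_fn(source_data + self_labelled[0]),
               train_fn(source_data + self_labelled[1])]
    return det
```

The discrepancy filter in step 2 is what distinguishes this from plain self-learning: each detector is fed the examples its peer is sure about but it is not, which only helps if assumption 2 (the detectors behave differently) holds.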
Index
• SYNTHIA: co-training object detectors
• CARLA: multimodal end-to-end driving
Pure Data-Driven AI View & Naturalistic View: End-to-End Autonomous Driving (P&LP)
… by Imitation/demonstration (behavior cloning)
Figure: Straight, Left, Right, or Nothing? The camera image alone cannot resolve the choice.
Trajectory Planning
Branched Architecture
“End to End Driving via Conditional Imitation Learning”, Codevilla et al., ICRA’2018
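The branched idea can be sketched in a few lines: a shared trunk extracts features, and the high-level command acts as a switch selecting which output head drives. The callables below are hypothetical stand-ins for trained sub-networks, not the paper's actual layers.

```python
def branched_policy(perception, branches, image, speed, command):
    """Branched conditional imitation learning head (sketch).

    A shared perception trunk encodes the camera image (and measured
    speed); the high-level navigation command does not enter the network
    as a regular input but selects one dedicated control branch.
    """
    features = perception(image, speed)   # shared trunk
    head = branches[command]              # command acts as a switch
    return head(features)                 # e.g. (steer, throttle, brake)
```

Because each branch only ever sees demonstrations for its own command, the network never has to average over contradictory expert actions at intersections.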
“Monocular Depth Estimation by Learning from Heterogeneous Datasets”,
A. Gurram, O. Urfalioglu, I. Halfaoui, F. Bouzaraa, A.M. Lopez,
IEEE Intelligent Vehicles Symposium, 2018
Depth ground truth: KITTI LiDAR
Semantic ground truth: Cityscapes semantic segmentation
Phase 1 – Discrete depth estimation (i.e. classification).
Phase 1 – Semantic segmentation (classification).
Phase 2 – Depth regression.
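Phase 1 hinges on turning metric depth into per-pixel classes so it can be trained jointly with semantic segmentation on heterogeneous ground truths, before Phase 2 refines with regression. A minimal sketch of such a discretization; log-spaced bins are an assumption here, not necessarily the paper's exact scheme.

```python
import math

def depth_to_bin(depth_m, n_bins=50, d_min=1.0, d_max=80.0):
    """Discretize metric depth into a class index so that Phase 1 can be
    trained as per-pixel classification. Log spacing (an assumed choice)
    gives finer resolution near the vehicle, where accuracy matters most."""
    d = min(max(depth_m, d_min), d_max)                  # clamp to valid range
    t = math.log(d / d_min) / math.log(d_max / d_min)    # position in [0, 1], log space
    return min(int(t * n_bins), n_bins - 1)

def bin_to_depth(b, n_bins=50, d_min=1.0, d_max=80.0):
    """Map a predicted class back to a representative depth (log-space bin centre),
    e.g. to initialize or supervise the Phase 2 regression head."""
    t = (b + 0.5) / n_bins
    return d_min * math.exp(t * math.log(d_max / d_min))
```

The 1-80 m range matches the evaluation ranges reported in the quantitative results below.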
KITTI: Training set (LiDAR ground truth) & Testing set
Quantitative results
Eigen et al. KITTI split. DRN: depth regression network; DC-DRN: depth regression network with a pre-trained classification network; DSC-DRN: depth regression network trained with the conditional flow approach, for depth ranges 1-80 m and 1-50 m. In the Godard et al. approaches, "K" means using KITTI for training and "CS + K" means also using Cityscapes. Bold stands for best, italics for second best.
Cityscapes Testing! (cross-domain generalization)
Photo-realistic SYNTHIA
Multimodal end-to-end driving: RGB+D multisensory / single-sensor (monocular)
Xiao et al. (arXiv:1906.03199)
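One way to feed RGB+D to a single driving network is early fusion: stacking the depth map as a fourth channel next to R, G, B. Whether the cited work uses early, mid, or late fusion is not stated on the slide; the sketch below is just the simplest variant, with plain nested lists standing in for tensors.

```python
def early_fusion(rgb, depth):
    """Early fusion of RGB and depth (sketch).

    rgb:   H x W x 3 nested lists (one [r, g, b] triple per pixel).
    depth: H x W nested lists (one depth value per pixel).
    Returns an H x W x 4 image with depth appended as a 4th channel.
    """
    return [[rgb[y][x] + [depth[y][x]] for x in range(len(rgb[0]))]
            for y in range(len(rgb))]
```

In the single-sensor (monocular) setting, the depth channel would come from an estimator like the one above rather than from a physical depth sensor, so the same fused network serves both configurations.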
Address
Edifici O, Campus UAB
08193 Bellaterra
Barcelona
Phone & Fax
Direct Line: +34 93 581 2561
Fax: +34 93 581 1670
www.cvc.uab.es
E-contact
www.cvc.uab.es/~antonio
antonio@cvc.uab.es
Dr. Antonio M. López, Principal Investigator UAB & CVC ADAS Group
In conclusion, we are
lazy annotators!!!
Many Thanks!!! Questions?