Deep Metric Learning
Jiwen Lu
Department of Automation, Tsinghua University, China
http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/
Outline
• Introduction
• Mahalanobis Deep Metric Learning
• Hamming Deep Metric Learning
• Multi-Modal Deep Metric Learning
• Conclusions
Visual Content Understanding
Visual Recognition
Visual Content Understanding
Visual Tracking
Visual Content Understanding
Visual Search
Distance Metric Learning
Learning a positive semi-definite (PSD) matrix that parameterizes a Mahalanobis distance
Relationship with subspace learning
where .
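The relationship stated on this slide can be checked numerically. The sketch below assumes the standard definitions, which may differ in notation from the slide: $d_M(x_i, x_j) = \sqrt{(x_i - x_j)^\top M (x_i - x_j)}$ with $M = W^\top W$. Under that assumption the Mahalanobis distance equals the Euclidean distance after projecting both samples with $W$, which is the link to subspace learning. The symbols `M`, `W`, `x_i`, `x_j` and all helper names are illustrative, not taken from the slide.

```python
import math

def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matmul(A, B):
    """Matrix-matrix product A B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a PSD matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(d * m for d, m in zip(diff, matvec(M, diff))))

def projected_euclidean(x, y, W):
    """Euclidean distance between W x and W y."""
    px, py = matvec(W, x), matvec(W, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))

# Hypothetical 2x3 projection; M = W^T W is PSD by construction.
W = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
M = matmul(transpose(W), W)

x_i = [1.0, 0.0, 2.0]
x_j = [0.0, 1.0, 1.0]
d_m = mahalanobis(x_i, x_j, M)
d_w = projected_euclidean(x_i, x_j, W)
```

Here `d_m` and `d_w` agree up to floating-point error, so learning `M` and learning a linear projection `W` are two views of the same problem.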
Distance Metric Learning
The structure of the input
• Linear
• Kernel
• Tensor
The label type of training samples
• Supervised
• Unsupervised
• Semi-supervised
Distance Metric Learning
The supervision type of training samples
• Weakly-supervised
• Strongly-supervised
The architecture of models
• Shallow
• Deep
The number of metrics
• Single-Metric Learning
• Multi-Metric Learning
Distance Metric Learning
Large Margin Nearest Neighbor (LMNN) [Weinberger et al., NIPS 2005]
Information-Theoretic Metric Learning (ITML) [Davis et al., ICML 2007]
where .
The optimization function can be re-formulated as
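For reference, the two classical methods named on this slide are usually written as follows. The notation is an assumption and may differ from the slide's own symbols: $j \rightsquigarrow i$ marks $x_j$ as a target neighbor of $x_i$, $y_{il} = 1$ when $x_i$ and $x_l$ share a label, $\mathcal{S}$ and $\mathcal{D}$ are the similar and dissimilar pair sets, $M_0$ is a prior metric, $m$ is a fixed mean, $d$ is the feature dimension, and $\lambda$, $u$, $l$ are user-set constants.

```latex
% LMNN: pull target neighbors close, push differently labeled samples beyond a unit margin
\min_{M \succeq 0} \;
  \sum_{i,\, j \rightsquigarrow i} d_M^2(x_i, x_j)
  + \lambda \sum_{i,\, j \rightsquigarrow i} \sum_{l} (1 - y_{il})
    \left[ 1 + d_M^2(x_i, x_j) - d_M^2(x_i, x_l) \right]_+

% ITML: stay close to a prior metric M_0 while satisfying pairwise constraints
\min_{M \succeq 0} \; \mathrm{KL}\!\left( p(x; M_0) \,\|\, p(x; M) \right)
\quad \text{s.t.} \quad
  d_M^2(x_i, x_j) \le u, \; (i, j) \in \mathcal{S}; \qquad
  d_M^2(x_i, x_j) \ge l, \; (i, j) \in \mathcal{D},
\qquad \text{where } p(x; M) = \tfrac{1}{Z} \exp\!\left( -\tfrac{1}{2} d_M^2(x, m) \right)

% ITML re-formulated with the LogDet divergence (equal to the KL objective up to a factor of 1/2)
\min_{M \succeq 0} \; D_{\ell d}(M, M_0)
  = \operatorname{tr}\!\left( M M_0^{-1} \right) - \log\det\!\left( M M_0^{-1} \right) - d
\quad \text{subject to the same constraints}
```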
Mahalanobis Deep Metric Learning
Final representation:
The distance of a pair is:
Illustration at the top layer
[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.
[2] Jiwen Lu, Junlin Hu, and Yap-Peng Tan, Discriminative deep metric learning for face and kinship verification, IEEE Trans. on Image Processing, vol. 26, no. 9, pp. 4269-4282, 2017.
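The slide above names a final representation and a pair distance computed at the top layer of a network. A minimal sketch of that idea follows, under the formulation commonly given for [1]: each layer computes $h^{(m)} = \tanh(W^{(m)} h^{(m-1)} + b^{(m)})$, the final representation is $f(x) = h^{(M)}$, the pair distance is $d_f^2(x_i, x_j) = \|f(x_i) - f(x_j)\|_2^2$, and each pair contributes a smoothed hinge penalty on $1 - \ell_{ij}(\tau - d_f^2)$. The layer sizes, weights, threshold `tau`, sharpness `beta`, and all function names are hypothetical, and no training loop is shown.

```python
import math

def layer(W, b, h):
    """One fully connected layer with tanh activation: tanh(W h + b)."""
    return [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
            for row, bias in zip(W, b)]

def embed(x, params):
    """Pass x through every layer; the top-layer output is the final representation."""
    h = x
    for W, b in params:
        h = layer(W, b, h)
    return h

def sq_dist(x_i, x_j, params):
    """Squared Euclidean distance between the two top-layer representations."""
    f_i, f_j = embed(x_i, params), embed(x_j, params)
    return sum((a - b) ** 2 for a, b in zip(f_i, f_j))

def pair_loss(d2, label, tau=2.0, beta=10.0):
    """Smoothed hinge (1/beta) * log(1 + exp(beta * z)), z = 1 - label * (tau - d2).

    label is +1 for a similar pair and -1 for a dissimilar pair.
    """
    a = beta * (1.0 - label * (tau - d2))
    # Numerically stable softplus: max(a, 0) + log1p(exp(-|a|)).
    return (max(a, 0.0) + math.log1p(math.exp(-abs(a)))) / beta

# Hypothetical 3 -> 2 -> 2 network.
params = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
]
x_a = [1.0, 0.5, -0.3]
x_b = [-0.4, 0.9, 0.7]
d2_ab = sq_dist(x_a, x_b, params)
```

A similar pair is penalized only when its distance exceeds `tau - 1`, and a dissimilar pair only when its distance falls below `tau + 1`, so the threshold `tau` separates the two kinds of pairs with a margin.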
Mahalanobis Deep Metric Learning
Face Verification
Mahalanobis Deep Metric Learning
Basic idea of the proposed DTML method.
[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.
[4] Junlin Hu, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep transfer metric learning, IEEE Trans. on Image Processing, vol. 25, no. 12, pp. 5576-5588, 2016.
Mahalanobis Deep Metric Learning
Cross-Dataset Person Re-identification
Top r matched results of different methods on the VIPeR dataset
Mahalanobis Deep Metric Learning
Visual Tracking
Main procedure of our proposed DML tracker.
[5] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, vol. 26, no. 11, pp. 2056-2068, 2016.
Mahalanobis Deep Metric Learning
Image Set Classification
[6] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.
Mahalanobis Deep Metric Learning
Image Set Classification
Average classification rates of different methods on different datasets
Mahalanobis Deep Metric Learning
Cross-Modal Matching
[7] Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep coupled metric learning for cross-modal matching, IEEE Trans. on Multimedia, vol. 19, no. 6, pp. 1234-1244, 2017.
Mahalanobis Deep Metric Learning
Face and Person Recognition
[8] Yueqi Duan, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Deep localized metric learning, IEEE Trans. on Circuits and Systems for Video Technology, 2017, accepted.
Mahalanobis Deep Metric Learning
Multi-view Deep Metric Learning
[9] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Sharable and individual multi-view metric learning, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2017, accepted.
Mahalanobis Deep Metric Learning
Label-Sensitive Deep Metric Learning
[10] Hao Liu, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Label-sensitive deep metric learning for facial age estimation, IEEE Trans. on Information Forensics and Security, 2017, accepted.
Mahalanobis Deep Metric Learning
Facial Age Estimation
Mahalanobis Deep Metric Learning
Adversarial Deep Metric Learning
[11] Yongming Rao, Ji Lin, Jiwen Lu, and Jie Zhou, Learning discriminative aggregation network for video-based face recognition, ICCV, pp. 3781-3790, 2017.
Hamming Deep Metric Learning
Scalable Image Search
[12] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary codes learning, CVPR, pp. 2475-2483, 2015.
[13] Jiwen Lu, Venice Erin Liong, and Jie Zhou, Deep hashing for scalable image search, IEEE Trans. on Image Processing, vol. 26, no. 5, pp. 2352-2367, 2017.
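Hamming deep metric learning replaces real-valued embeddings with compact binary codes, so that search reduces to XOR and bit counting. The sketch below illustrates only that retrieval step. It assumes the network's top-layer outputs are already computed and are binarized by their sign, as is common in deep hashing; the feature values and all function names are hypothetical.

```python
def binarize(h):
    """Quantize a real-valued vector to bits: 1 where the entry is positive, else 0."""
    return [1 if v > 0 else 0 for v in h]

def pack(bits):
    """Pack a list of bits into one integer, most significant bit first."""
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return code

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def search(query, database, k=2):
    """Indices of the k database codes closest to the query in Hamming distance."""
    ranked = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return ranked[:k]

# Hypothetical 4-dimensional top-layer outputs for three database images and one query.
db_features = [
    [0.9, -0.3, 0.2, -0.7],
    [0.8, -0.1, -0.4, -0.6],
    [-0.5, 0.6, -0.2, 0.9],
]
db_codes = [pack(binarize(f)) for f in db_features]
query_code = pack(binarize([0.7, -0.2, 0.1, -0.9]))
top = search(query_code, db_codes)
```

Because each comparison is one XOR plus a bit count, the cost per database item is independent of the original feature dimension, which is what makes binary codes attractive for large-scale search.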
Hamming Deep Metric Learning
Scalable Image Search
Results on CIFAR.
Hamming Deep Metric Learning
Scalable Image Search
[14] Zhixiang Chen, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Nonlinear discrete hashing, IEEE Trans. on Multimedia, vol. 19, no. 1, pp. 123-135, 2017.
Hamming Deep Metric Learning
Scalable Image Search
Results on CIFAR.
Hamming Deep Metric Learning
Scalable Video Search
[15] Zhixiang Chen, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Nonlinear structural hashing for scalable video search, IEEE Trans. on Circuits and Systems for Video Technology, 2017, accepted.
Hamming Deep Metric Learning
Scalable Video Search
[16] Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep video hashing, IEEE Trans. on Multimedia, vol. 19, no. 6, pp. 1209-1219, 2017.
Hamming Deep Metric Learning
Image Matching
[17] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.
Main procedure of our proposed approach.
Hamming Deep Metric Learning
Image Matching
[18] Yueqi Duan, Jiwen Lu, Ziwei Wang, Jianjiang Feng, and Jie Zhou, Learning deep binary descriptor with multi-quantization, CVPR, pp. 1183-1192, 2017.
Multi-Modal Deep Metric Learning
RGB-D Object Recognition
Our proposed multi-modal deep metric learning framework.
[19] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.
Multi-Modal Deep Metric Learning
RGB-D Object Recognition
RGB-D object dataset: 51 classes, 207,920 RGB-D image frames, with roughly 600 images per object.
10-fold split: for each split, one object from each class was sampled, resulting in 51 test objects. After subsampling every 5th frame from the videos, there were about 34,000 images for training and 6,900 images for testing.
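The evaluation protocol described above (hold out one object per class, then keep every 5th frame) can be sketched as follows. This is a minimal illustration under assumptions: the nested dictionary layout, the function name `make_split`, and the toy class and object names are hypothetical, and the real dataset's file organization is not described in the source.

```python
import random

def make_split(objects_by_class, frame_step=5, seed=0):
    """Hold out one randomly chosen object per class for testing.

    objects_by_class maps class name -> {object name -> list of frame ids}.
    Every frame_step-th frame of each object is kept.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls, objects in objects_by_class.items():
        held_out = rng.choice(sorted(objects))  # one test object per class
        for obj, frames in objects.items():
            subsampled = frames[::frame_step]
            target = test if obj == held_out else train
            target.extend((cls, obj, frame) for frame in subsampled)
    return train, test

# Toy stand-in: 2 classes, 3 objects each, 10 frames per object.
toy = {
    "apple": {"apple_%d" % i: list(range(10)) for i in (1, 2, 3)},
    "mug": {"mug_%d" % i: list(range(10)) for i in (1, 2, 3)},
}
train, test = make_split(toy)
test_objects = {obj for _, obj, _ in test}
train_objects = {obj for _, obj, _ in train}
```

Calling `make_split` with ten different seeds would give ten such splits. As a rough consistency check on the quoted figures, 207,920 / 5 is about 41,584, in line with the approximately 34,000 + 6,900 = 40,900 images reported.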
Multi-Modal Deep Metric Learning
[20] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.
Multi-Modal Deep Metric Learning
RGB-D Object Recognition
RGB-D object dataset:
Deep Metric Learning
[21] Jiwen Lu, Junlin Hu, and Jie Zhou, Deep metric learning for visual understanding, IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 76-84, 2017.
Summary
• Deep metric learning is effective for many visual understanding tasks, including visual recognition, visual tracking, and visual search.
• More visual cues can be exploited to help develop more elegant deep metric learning methods for visual understanding.
• Theoretical analysis of deep metric learning is required to show how it improves various visual understanding tasks.
Thank you!