Date posted: 22-Jan-2018
Category: Engineering
Uploaded by: taeoh-kim
Yonsei University MVP Lab.
[Figure: RoI from Selective Search -> RoI Pooling (Fixed Size Representation) -> Bbox Regression + Classification]
[Figure: RPN (Region Proposal Network: Bbox Regression + Objectness) -> RoI Pooling (Fixed Size Representation) -> Bbox Regression + Classification]
Input: 32x32x3
Conv1 + Pool1 -> 16x16x64
Conv2 + Pool2 -> 8x8x128
Conv3 + Pool3 -> 4x4x256
Conv4 + Pool4 -> 2x2x512
Conv5 + Pool5 -> 1x1x512
1x1x512 Conv -> 1x1 Heatmap -> x32 Upsample -> Softmax
Remove Pooling, 1x1 Conv for Heatmap Output
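The x32 upsampling factor follows from the five pooling stages. A minimal sketch of the shape arithmetic, assuming each pooling is 2x2 with stride 2 and each convolution preserves spatial size (the slide does not state either); `fcn_shape_trace` is a hypothetical helper name:

```python
def fcn_shape_trace(input_size=32, channels=(64, 128, 256, 512, 512)):
    """Return (stage, spatial_size, channels) after each conv + pool stage.

    Assumption: every pooling layer halves height and width, and the
    convolutions in between keep the spatial size unchanged.
    """
    size = input_size
    trace = []
    for i, c in enumerate(channels, start=1):
        size //= 2  # 2x2 pooling with stride 2 halves each spatial dimension
        trace.append((f"Pool{i}", size, c))
    return trace


trace = fcn_shape_trace()
for stage, size, c in trace:
    print(f"{stage}: {size}x{size}x{c}")

# Five halvings shrink 32 down to 1, so the heatmap must be upsampled
# by the same overall factor to return to input resolution.
total_stride = 32 // trace[-1][1]
print("upsample factor:", total_stride)
```

Under these assumptions the trace reproduces the sizes on the slide, and the overall stride equals the upsampling factor.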
Slide from Mask R-CNN Tutorial, K. He. ICCV 2017
[Figure: example image with objects labeled Sheep, Dog, and Human]
Faster R-CNN: BBox + Classification
• Can Separate
• Cannot Segment
FCN: Segmentation + Classification
• Cannot Separate
• Can Segment
Faster R-CNN + FCN = FCN on BBox!
• Segmentation in BBox + Classification
FCN
• Pixel-level Classification
• Per Pixel Softmax (Multinomial) -> Per Pixel Sigmoid (Binary)
• Multi Instance
Faster R-CNN
• Classification
• Instance Level RoI
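The difference between a per-pixel softmax and a per-pixel sigmoid can be sketched with NumPy. This is an illustration only, assuming class logits laid out as a `(K, H, W)` array; the function names are hypothetical:

```python
import numpy as np


def per_pixel_softmax(logits):
    """logits: (K, H, W). Classes compete, so probabilities sum to 1 per pixel."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


def per_pixel_sigmoid(logits):
    """logits: (K, H, W). Each class channel is an independent binary mask."""
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 4))
soft = per_pixel_softmax(logits)
sig = per_pixel_sigmoid(logits)

# Raising one class's logits changes the softmax output of the other
# classes, but leaves their sigmoid outputs untouched.
changed = logits.copy()
changed[1] += 5.0
print(np.allclose(per_pixel_sigmoid(changed)[0], sig[0]))
print(np.allclose(per_pixel_softmax(changed)[0], soft[0]))
```

With the sigmoid, the mask for one class does not depend on the scores of the others, which is what lets classification be handled separately per RoI.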
DB: BBox + Class + Mask
$L = L_{cls} + L_{box} + L_{mask}$
• $L_{cls}$: Softmax Cross Entropy
• $L_{box}$: Regression
• $L_{mask}$: Binary Cross Entropy
Training Phase
Not $L_{mask} = L_{c_1} + L_{c_2} + \cdots + L_{c_k}$, but $L_{mask} = L_{c_3}$ if the GT class is 3
Mask Branch Only Learns How to Mask, independent of Class
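A minimal NumPy sketch of a mask loss that uses only the ground-truth class channel. It assumes mask logits of shape `(K, m, m)` and a 0-based class index (so the slide's class 3 would be index 2); `mask_loss` is a hypothetical name, not taken from any library:

```python
import numpy as np


def mask_loss(mask_logits, gt_mask, gt_class):
    """Binary cross entropy on the ground-truth class channel only.

    mask_logits: (K, m, m) raw scores, one channel per class.
    gt_mask:     (m, m) array of 0/1 targets.
    gt_class:    0-based index of the ground-truth class (an assumption;
                 the slide counts classes from 1).
    """
    z = mask_logits[gt_class]  # all other channels are ignored
    # Numerically stable BCE with logits:
    # max(z, 0) - z * y + log(1 + exp(-|z|))
    per_pixel = np.maximum(z, 0) - z * gt_mask + np.log1p(np.exp(-np.abs(z)))
    return per_pixel.mean()


rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5, 5))
gt = (rng.random((5, 5)) > 0.5).astype(float)
base = mask_loss(logits, gt, gt_class=2)

# Perturbing the other class channels leaves the loss unchanged.
other = logits.copy()
other[0] += 10.0
other[3] -= 10.0
print(np.isclose(mask_loss(other, gt, gt_class=2), base))
```

Because the other channels contribute nothing, they receive no gradient from this RoI, so the mask branch does not have to decide between classes.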
Test Phase
• Predicts Human Mask
• Predicts Car Mask
• Predicts Horse Mask
• Predicts ...
Winner Takes All
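At test time, one way to read "Winner Takes All" is that the classification branch picks the class and only that class's mask channel is kept. A small sketch under that reading; the 0.5 binarization threshold and the name `select_mask` are assumptions, not stated on the slide:

```python
import numpy as np


def select_mask(class_scores, mask_probs, threshold=0.5):
    """Keep only the mask channel of the highest-scoring class.

    class_scores: (K,) scores from the classification branch.
    mask_probs:   (K, m, m) per-class sigmoid mask probabilities.
    threshold:    assumed cut-off for turning probabilities into a binary mask.
    """
    k = int(np.argmax(class_scores))
    return k, mask_probs[k] >= threshold


class_scores = np.array([0.1, 0.7, 0.2])
mask_probs = np.stack([
    np.full((2, 2), 0.9),                  # class 0 mask, ignored
    np.array([[0.8, 0.2], [0.4, 0.6]]),    # class 1 mask, selected
    np.full((2, 2), 0.1),                  # class 2 mask, ignored
])
k, binary_mask = select_mask(class_scores, mask_probs)
print(k)
print(binary_mask)
```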
Faster R-CNN, S. Ren, NIPS 2015
Deconv 2x2, stride 2
3x3 Conv, 4 Layers
1x1 Conv
[Figure: RoI Pooling (Fixed Size Representation, 7x7 Pooled Feature) -> Bbox Regression + Classification]
RoI Pooling (Fast R-CNN)
• Input: Each RoI
• Output: 7x7 Pooled Feature
RoI Align (Mask R-CNN)
• Input: Each RoI
• Output: 7x7 Pooled Feature
[Figure: RoI on the Feature Map]
Note: Region Proposal Network RoI Prediction = Floating Point Representation
[Figure: RoI on the Feature Map, Max Pooling]
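Because the RoI arrives as floating-point coordinates, RoI Pooling has to snap it to the integer grid before max pooling. A simplified single-channel sketch, assuming the RoI is already given in feature-map coordinates as `(x1, y1, x2, y2)`; `roi_pool` is a hypothetical helper and its rounding scheme is one possible choice, not necessarily the one used in any particular implementation:

```python
import numpy as np


def roi_pool(feature, roi, out_size=2):
    """Quantized RoI max pooling on a single-channel (H, W) feature map."""
    H, W = feature.shape
    # First quantization: snap the floating-point RoI to integer cells.
    x1, y1, x2, y2 = (int(round(v)) for v in roi)
    x1, y1 = min(max(x1, 0), W - 1), min(max(y1, 0), H - 1)
    x2, y2 = min(max(x2, x1 + 1), W), min(max(y2, y1 + 1), H)
    # Second quantization: integer bin edges inside the snapped RoI.
    xs = np.floor(np.linspace(x1, x2, out_size + 1)).astype(int)
    ys = np.floor(np.linspace(y1, y2, out_size + 1)).astype(int)
    out = np.empty((out_size, out_size), dtype=feature.dtype)
    for i in range(out_size):
        for j in range(out_size):
            # Each bin covers at least one cell.
            y_lo, y_hi = ys[i], max(ys[i + 1], ys[i] + 1)
            x_lo, x_hi = xs[j], max(xs[j + 1], xs[j] + 1)
            out[i, j] = feature[y_lo:y_hi, x_lo:x_hi].max()
    return out


feature = np.arange(36, dtype=float).reshape(6, 6)
pooled_a = roi_pool(feature, (0.6, 0.6, 4.4, 4.4))
pooled_b = roi_pool(feature, (1.4, 1.4, 3.6, 3.6))
print(pooled_a)
print(pooled_b)
```

In this example both RoIs snap to the same integer box, so they produce identical pooled features even though they cover different regions. That loss of sub-cell position matters little for classification but hurts pixel-level masks.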
[Figure: RoI on the Feature Map]
2x2 Subcells for Precision
= 0.15 + 0.25 + 0.25 + 0.35
2x2 Subcell Max Pooling
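RoI Align keeps the floating-point RoI, samples each bin at 2x2 sub-cell centres with bilinear interpolation, and pools over those samples. A simplified single-channel sketch, assuming `feature[r, c]` is the value at point `(x=c, y=r)` (real implementations may use a half-pixel offset); `bilinear`, `roi_align`, and the `reduce` parameter are hypothetical names:

```python
import numpy as np


def bilinear(feature, y, x):
    """Bilinearly interpolate a (H, W) feature map at a fractional point."""
    H, W = feature.shape
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    # Weighted sum of the four neighbouring cells.
    return ((1 - wy) * (1 - wx) * feature[y0, x0]
            + (1 - wy) * wx * feature[y0, x1]
            + wy * (1 - wx) * feature[y1, x0]
            + wy * wx * feature[y1, x1])


def roi_align(feature, roi, out_size=2, samples=2, reduce=np.max):
    """RoI Align without quantization; roi = (x1, y1, x2, y2) as floats."""
    x1, y1, x2, y2 = roi
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    out = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            vals = []
            for sy in range(samples):
                for sx in range(samples):
                    # Centre of each sub-cell inside bin (i, j).
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    vals.append(bilinear(feature, y, x))
            out[i, j] = reduce(vals)  # max over the 2x2 sub-cell samples
    return out


feature = np.arange(36, dtype=float).reshape(6, 6)
aligned = roi_align(feature, (1.4, 1.4, 3.6, 3.6))
aligned_shifted = roi_align(feature, (0.6, 0.6, 4.4, 4.4))
aligned_mean = roi_align(feature, (1.4, 1.4, 3.6, 3.6), reduce=np.mean)
print(aligned)
print(aligned_shifted)
```

Each interpolated sample is a four-term weighted sum like the one on the slide, although the slide does not say what its numbers stand for. Here two different RoIs give different outputs, since no coordinate is rounded.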
[Figure: RPN (Bbox Regression + Objectness) -> RoI Align -> Bbox Regression + Classification + Binary Mask -> Paste Back]
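The paste-back step can be sketched as resizing the small per-RoI mask to the box size and writing it into an image-sized canvas. This is an illustration under stated assumptions: nearest-neighbour resizing is swapped in for simplicity, the 0.5 threshold is assumed, the box is given as integer pixel coordinates with exclusive `x2, y2`, and `paste_back` is a hypothetical name:

```python
import numpy as np


def paste_back(mask_prob, box, image_shape, threshold=0.5):
    """Paste an (m, m) mask into a boolean canvas of shape image_shape.

    mask_prob:   (m_h, m_w) mask probabilities predicted for one RoI.
    box:         (x1, y1, x2, y2) integer pixel box, x2 and y2 exclusive.
    image_shape: (H, W) of the full image.
    """
    H, W = image_shape
    x1, y1, x2, y2 = box
    bw, bh = x2 - x1, y2 - y1
    m_h, m_w = mask_prob.shape
    # Nearest-neighbour resize: map each box pixel centre to a mask cell.
    rows = np.minimum((np.arange(bh) + 0.5) * m_h / bh, m_h - 1).astype(int)
    cols = np.minimum((np.arange(bw) + 0.5) * m_w / bw, m_w - 1).astype(int)
    resized = mask_prob[rows[:, None], cols[None, :]] >= threshold
    canvas = np.zeros((H, W), dtype=bool)
    # Clip the box to the image before writing.
    ya, yb = max(y1, 0), min(y2, H)
    xa, xb = max(x1, 0), min(x2, W)
    canvas[ya:yb, xa:xb] = resized[ya - y1:yb - y1, xa - x1:xb - x1]
    return canvas


mask_prob = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
canvas = paste_back(mask_prob, box=(2, 1, 6, 5), image_shape=(6, 8))
print(canvas.astype(int))
```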
• Faster R-CNN + ResNet: Deep Residual Learning for Image Recognition, K. He, CVPR 2016
• Faster R-CNN + FPN: Feature Pyramid Networks for Object Detection, T.-Y. Lin, CVPR 2017
Faster R-CNN + Binary Mask Prediction + FCN + RoIAlign
Detection Performance Improvement
Q&A?