
Temporal Action Localization in Untrimmed Videos via Multi-Stage CNNs

Date post: 12-Jan-2017
Upload: xavier-giro
Temporal Action Localization in Untrimmed Videos via Multi-Stage CNNs. Slides by Alberto Montes, Computer Vision Group Reading Group, June 13th, 2016. [arXiv] [code] Zheng Shou, Dongang Wang and Shih-Fu Chang
Transcript
Page 1: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Temporal Action Localization in Untrimmed Videos via Multi-Stage CNNs

Slides by Alberto Montes

Computer Vision Group Reading Group, June 13th, 2016

[arXiv] [code]

Zheng Shou, Dongang Wang and Shih-Fu Chang

Page 2: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Introduction

Page 3: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Previous Work

Improved Dense Trajectory (iDT)

Fisher Vector

2D Convolution

Page 4: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Segment-CNN

Page 8: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Problem Definition

Video: [formula not preserved; labeled terms: frame, # frames]

Annotations and Candidates: [formulas not preserved; labeled terms: action category, start and ending frame]

Page 9: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Multi-Scale Segment Generation

◉ Each frame resized to 171x128 pixels

◉ Temporal sliding windows:

○ 16, 32, 64, 128, 256, 512 frames

○ 75% overlap

◉ Construct segment s by uniformly sampling 16 frames
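
The segment generation described on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and two details are assumptions the slide does not state (windows that would run past the end of the video are dropped, and the 16 frames are chosen by evenly spaced index truncation).

```python
WINDOW_LENGTHS = (16, 32, 64, 128, 256, 512)  # window lengths from the slide
OVERLAP = 0.75                                 # 75% overlap from the slide
FRAMES_PER_SEGMENT = 16                        # frames sampled per segment


def sliding_windows(num_frames, lengths=WINDOW_LENGTHS, overlap=OVERLAP):
    """Yield (start, end) frame ranges, end exclusive, for every window length."""
    for length in lengths:
        # 75% overlap means consecutive windows are shifted by length / 4.
        stride = max(1, int(length * (1 - overlap)))
        # Assumption: windows that do not fit entirely in the video are skipped.
        for start in range(0, num_frames - length + 1, stride):
            yield start, start + length


def sample_frame_indices(start, end, n=FRAMES_PER_SEGMENT):
    """Pick n evenly spaced frame indices from the range [start, end)."""
    step = (end - start) / n
    return [start + int(i * step) for i in range(n)]


if __name__ == "__main__":
    for start, end in sliding_windows(64):
        print(start, end, sample_frame_indices(start, end))
```

A 16-frame window keeps every frame, while a 512-frame window keeps every 32nd frame, so every segment fed to the network has the same temporal size regardless of window length.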

Page 10: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Network Architecture

C3D Network

Page 11: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Training Proposal and Classification Network

◉ lr=0.0001 except fc8 lr=0.01, momentum=0.9, weight decay factor=0.0005

◉ Drop lr by factor of 2 every 10K iterations
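
As a small illustration of the step schedule above, the learning rate at a given iteration might be computed as below. The function name is hypothetical, and applying the same halving to the fc8 layer's higher base rate is an assumption, since the slide does not say so explicitly.

```python
def learning_rate(iteration, base_lr=1e-4, drop_factor=2.0, step=10_000):
    """Step schedule: divide base_lr by drop_factor once every `step` iterations."""
    return base_lr / (drop_factor ** (iteration // step))


if __name__ == "__main__":
    for it in (0, 10_000, 20_000, 30_000):
        # fc8 starts from 0.01; all other layers start from 0.0001.
        print(it, learning_rate(it), learning_rate(it, base_lr=0.01))
```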

Proposal Network:

● fc8: 2 nodes

Classification Network:

● fc8: K+1 nodes

Page 12: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Localization Network

Add Custom Loss function

Page 13: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Localization Network

true class label

overlap sensitivity

Try to boost segments with high overlap

Works best with: λ = 1, α = 0.25
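
The loss formula itself did not survive in this transcript. The sketch below assumes the form given in the Shou et al. paper as best it can be recalled: a softmax loss plus λ times an overlap term that, for non-background segments, averages 0.5 · ((P of the true class)² / v^α − 1), where v is the segment's overlap with the ground truth. Treat that formula, the function name, and the input layout as assumptions rather than a transcription of the slide.

```python
import math


def localization_loss(probs, labels, overlaps, lam=1.0, alpha=0.25):
    """Softmax loss plus lam times an overlap loss (assumed form, see lead-in).

    probs:    per-segment class probabilities, index 0 = background
    labels:   true class label k_n for each segment (0 = background)
    overlaps: overlap v_n of each segment with its ground-truth instance
    """
    n = len(labels)
    softmax_loss = 0.0
    overlap_loss = 0.0
    for p, k, v in zip(probs, labels, overlaps):
        p_true = p[k]
        softmax_loss += -math.log(p_true)
        if k > 0:  # the overlap term is only applied to action segments
            overlap_loss += 0.5 * (p_true ** 2 / v ** alpha - 1.0)
    return softmax_loss / n + lam * overlap_loss / n


if __name__ == "__main__":
    probs = [[0.5, 0.5]]
    # Same prediction, different overlap with the ground truth.
    print(localization_loss(probs, [1], [0.5]))
    print(localization_loss(probs, [1], [1.0]))
```

With the prediction held fixed, the segment with higher overlap receives the lower loss, which matches the slide's stated goal of boosting segments with high overlap.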

Page 14: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Localization Network

Learning target:

Page 15: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Localization Network

Page 16: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Prediction and Post-processing

◉ Keep segments with P_pro > 0.7

◉ Remove background segments

◉ Multiply P_loc by the class-specific frequency of occurrence for each window length in the training data, to leverage window length distribution patterns

◉ NMS based on P_loc to remove redundancy (overlap threshold θ - 0.1)
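
The post-processing steps on this slide could be strung together roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the dictionary layout, the `length_prior` lookup table, and the function names are hypothetical, θ is taken to be the evaluation overlap threshold, and suppression is applied only between segments of the same class.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) frame intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def post_process(segments, length_prior, theta=0.5, pro_threshold=0.7):
    """segments: dicts with start, end, p_pro, label (0 = background), p_loc.

    length_prior maps (label, window_length) to the frequency of that window
    length for that class in the training data (hypothetical lookup table).
    """
    scored = []
    for s in segments:
        # Keep only confident proposals that are not background.
        if s["p_pro"] <= pro_threshold or s["label"] == 0:
            continue
        length = s["end"] - s["start"]
        prior = length_prior.get((s["label"], length), 0.0)
        scored.append({**s, "score": s["p_loc"] * prior})

    # Greedy NMS: visit segments by descending score, drop heavy overlaps.
    scored.sort(key=lambda s: s["score"], reverse=True)
    kept = []
    for s in scored:
        span = (s["start"], s["end"])
        if all(
            r["label"] != s["label"]
            or temporal_iou(span, (r["start"], r["end"])) <= theta - 0.1
            for r in kept
        ):
            kept.append(s)
    return kept
```

A segment covering frames 8 to 40 overlaps one covering frames 0 to 32 with a temporal IoU of 0.6, so with θ = 0.5 the lower-scoring of the two is suppressed.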

Page 17: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Experiments

Page 18: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Datasets

MEXaction2

“Bull Charge Cape” and “Horse Riding” videos

77 hours of videos

Training set: 1336 instances

Validation set: 310 instances

Test set: 329 instances

THUMOS 2014

Temporal Action Detection Task

20 categories

Training set: 2755 videos

Validation set: 1010 videos and 3007 instances

Test set: 1574 videos and 3358 instances

Page 19: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Results MEXaction2

DTF: Dense Trajectory Features + SVM

Page 20: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Results MEXaction2

Page 21: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Results MEXaction2

Page 22: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Evaluation

Page 23: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Evaluation

Page 24: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Evaluation

Impact of individual networks:

Page 25: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Conclusions

Propose a multi-stage framework, Segment-CNN, to address temporal action localization

Page 26: Temporal Action Localization in Untrimmed Videos via Multi Stage CNNs

Thank you!

Questions?

