SIAM 2014 InvitedTalk

Date post: 02-Jun-2018

Transcript
  • 8/10/2019 SIAM 2014 InvitedTalk

    1/19

    Learning hierarchical invariant spatio-temporal features for human action and activity recognition

    Binu M Nair, Vijayan K Asari

    07/08/2014

    Introduction

    Applications of activity/action recognition

    Gaming (Kinect)

    Autonomous visual control of fighter jets by air crew hand gestures

    Research Objectives

    To detect and recognize harmful activities of individuals of interest at long range, using a set or pair of surveillance cameras.

    Motivation: how security personnel monitor a crowded environment and locate suspicious activities

    Security personnel create a temporary signature of each person in the scene (type of clothing, shape, etc.)

    Identify the action of each person (walking, running, etc.)

    Locate the individual performing a suspicious action, then observe that individual closely to see what he or she is doing (from joint movements, etc.)

    Introduction

    An automated system that performs these tasks needs 4 components:

    Automatic pedestrian unique-ID tagger

    Corresponds to security personnel knowing roughly what each person looks like

    Human action recognition

    Seeing what action each person performs: walking, running, bending, etc.

    Automatic detection and tracking of specific body joints

    Closely examining what a particular individual (one performing a suspicious action) is doing

    Inference of the activity performed, by joint trajectory analysis based on context

    E.g., bending down to place a suitcase, pick up a box, or tie a shoelace

    Motivation

    Need a real-time system

    Recognize an action or an activity from 15-20 frames of a streaming video

    Should not depend on the initialization of action/gait cycle states (starting/ending points of an action cycle)

    Should be invariant to speed of motion

    Applications

    Air crew hand gesture recognition for autonomous visual control of fighter jet

    Decision to follow a person based on activity in surveillance.

    Typical Data-flow for Generic Action Recognition System

    Feature Extraction: posture/motion cues (hierarchical invariant features)

    Action Segmentation: segmenting out action instances consistent with the training set

    Action Learning and Classification: learn statistical models to classify new feature observations (based on PCA-Generalized Regression Neural Networks)

    [Figure: data-flow diagram. Blocks: Video, Action Segmentation, Feature Extraction, Action Learning, Action Model Database, Action Classification]
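The learning and classification stage names PCA together with Generalized Regression Neural Networks (GRNN). The slides do not give the details, so the following is only a minimal sketch under stated assumptions: PCA is done with an SVD of the centred data, the GRNN is written in its usual form as a Gaussian-kernel weighted average of training targets, and classification is done by regressing one-hot class targets and taking the largest output. The function names, the `sigma` bandwidth and the one-hot scheme are illustrative choices, not the authors' implementation.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T

def grnn_predict(train_x, train_y, query_x, sigma=1.0):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    norm = np.maximum(w.sum(axis=1, keepdims=True), 1e-12)  # guard against underflow
    return (w @ train_y) / norm

# Toy data (hypothetical): two well-separated classes in a 3-D feature space.
X = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [5.0, 5.0, 0.0], [5.0, 6.0, 0.0]])
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # one-hot targets (assumption)
mean, comps = pca_fit(X, n_components=2)
Z = pca_transform(X, mean, comps)
queries = pca_transform(np.array([[0.1, 0.2, 0.0], [5.1, 5.4, 0.0]]), mean, comps)
predicted = grnn_predict(Z, Y, queries, sigma=1.0).argmax(axis=1)
```

A GRNN has no iterative training step: the training samples themselves are the model, which is one reason it pairs naturally with a dimensionality reduction step such as PCA.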

    Feature Extraction and Feature Fusion

    [Figure: feature extraction pipeline. Blocks: Input Frame, Optical Flow, Optical Flow Mag/Dir, Masked Region, Hierarchical Histogram of Oriented Flow (one HOF (N) block and four HOF (N/2) blocks), Quantized Local Binary Pattern, R-Transform, Feature Fusion (+), Action Feature]

    Feature Fusion

    Assumption: HHOF, LBFP and RT are independent of each other, so they can be concatenated one after the other to form the complete feature vector (as in feature fusion in biometric systems)
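The hierarchical histogram of oriented flow and the concatenation-based fusion can be sketched as below. The slide only shows one HOF (N) block and four HOF (N/2) blocks, so this sketch makes several assumptions that are not confirmed by the talk: N is read as the number of direction bins, the four N/2 histograms are taken over the four quadrants of the region, histograms are weighted by flow magnitude, and each is L1-normalised. The names `hof`, `hierarchical_hof` and `fuse` are hypothetical.

```python
import numpy as np

def hof(mag, ang, n_bins):
    """Magnitude-weighted histogram of flow directions, L1-normalised."""
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, 2.0 * np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

def hierarchical_hof(flow, mask, n_bins=8):
    """Two-level descriptor: HOF(N) on the whole region, HOF(N/2) on each quadrant."""
    fx, fy = flow[..., 0], flow[..., 1]
    mag = np.hypot(fx, fy) * mask                  # ignore flow outside the masked region
    ang = np.mod(np.arctan2(fy, fx), 2.0 * np.pi)  # directions in [0, 2*pi)
    h, w = mag.shape
    parts = [hof(mag, ang, n_bins)]
    for rows in (slice(0, h // 2), slice(h // 2, h)):
        for cols in (slice(0, w // 2), slice(w // 2, w)):
            parts.append(hof(mag[rows, cols], ang[rows, cols], n_bins // 2))
    return np.concatenate(parts)

def fuse(*descriptors):
    """Feature fusion by concatenating the individual descriptors."""
    return np.concatenate(descriptors)

# Toy input (hypothetical): uniform rightward flow on an 8x8 fully masked region.
flow = np.zeros((8, 8, 2))
flow[..., 0] = 1.0
mask = np.ones((8, 8))
descriptor = hierarchical_hof(flow, mask, n_bins=8)
fused = fuse(descriptor, np.zeros(3), np.ones(2))  # stand-ins for the LBFP and RT parts
```

Concatenation keeps each descriptor's elements in a fixed position in the fused vector, so a later feature selection step can keep or drop individual elements regardless of which descriptor they came from.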

    Feature Selection

    Feature Set

    3-level HHOF (140 elements), 2-level LBFP (295 elements), 2-level R-Transform (180 elements): total feature set

    With the full set, the regression model for each action class overfits: it tunes to irrelevant and redundant feature elements, which lowers accuracy.

    Methodology: Fast Correlation-Based Feature Selection (FCBF)

    Identify relevant features with large correlation values

    Remove redundant features and choose a subset of features

    Correlation measure based on information theory

    Symmetrical Uncertainty (SU) between two random variables X and Y

    SU(X, Y) = 2 IG(X|Y) / (H(X) + H(Y))

    H(X): entropy of X; IG(X|Y): information about X gained from the knowledge provided by Y
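The selection step can be illustrated with a short sketch. It uses the standard definition of symmetrical uncertainty, SU(X, Y) = 2 IG(X|Y) / (H(X) + H(Y)) with IG(X|Y) = H(X) + H(Y) - H(X, Y), and a simplified FCBF loop: rank features by SU with the class label, then drop any feature whose SU with an already selected feature is at least as large as its SU with the class. It assumes the feature values are already discretised (real-valued descriptors would need binning first); the `threshold` parameter and the toy data are illustrative, not values from the talk.

```python
import numpy as np

def entropy(x):
    """Shannon entropy (bits) of a discrete 1-D array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def joint_entropy(x, y):
    """Shannon entropy (bits) of the pair (x, y)."""
    _, counts = np.unique(np.stack([x, y], axis=1), axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0
    info_gain = hx + hy - joint_entropy(x, y)
    return 2.0 * info_gain / (hx + hy)

def fcbf(features, labels, threshold=0.0):
    """Return indices of selected columns of a discrete (n_samples, n_features) array."""
    n_features = features.shape[1]
    su_class = np.array([symmetrical_uncertainty(features[:, j], labels)
                         for j in range(n_features)])
    # Relevance: keep features above the threshold, most relevant first.
    order = [j for j in np.argsort(-su_class, kind="stable") if su_class[j] > threshold]
    selected = []
    for j in order:
        # Redundancy: drop j if a selected feature explains it at least as well as the class does.
        redundant = any(symmetrical_uncertainty(features[:, j], features[:, k]) >= su_class[j]
                        for k in selected)
        if not redundant:
            selected.append(int(j))
    return selected

# Toy data (hypothetical): f0 equals the label, f1 duplicates f0, f2 is unrelated.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
features = np.stack([labels, 1 - labels, np.array([0, 1, 0, 1, 0, 1, 0, 1])], axis=1)
chosen = fcbf(features, labels)
```

In the toy data the second feature is perfectly relevant but fully redundant with the first, and the third carries no information about the label, so only one feature survives.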

    Algorithm (Training / Testing)

    RESULTS

    Weizmann dataset

    10 different actions performed by 9 different persons

    Low resolution video at 30 fps

    Static background

    Weizmann Dataset

    Testing strategy: leave-10-out (the 10 sequences corresponding to one person)

    Partial sequence: 15 frames with an overlap of 10 frames
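The evaluation protocol above can be sketched in a few lines. A window of 15 frames with an overlap of 10 frames means consecutive windows start 5 frames apart, and leaving out the 10 sequences of one person gives one fold per person. The function names and the `person_ids` layout are hypothetical; how the original experiments handled a trailing partial window is not stated, so this sketch simply drops it.

```python
def sliding_windows(n_frames, window=15, overlap=10):
    """Return (start, end) frame ranges, end exclusive, for overlapping partial sequences."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap
    return [(start, start + window)
            for start in range(0, n_frames - window + 1, stride)]

def leave_one_person_out(person_ids):
    """Yield (held_out_person, train_indices, test_indices), one fold per person."""
    for held_out in sorted(set(person_ids)):
        train = [i for i, p in enumerate(person_ids) if p != held_out]
        test = [i for i, p in enumerate(person_ids) if p == held_out]
        yield held_out, train, test

# Toy layout (assumption): 9 persons, 10 sequences each, as in the Weizmann description.
person_ids = [person for person in range(9) for _ in range(10)]
folds = list(leave_one_person_out(person_ids))
windows = sliding_windows(30)
```

With a 30-frame clip this yields four windows starting at frames 0, 5, 10 and 15.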

    Robustness Test (Test for Deformity)

    [Figure: sample frames of the test sequences. Panels: With bag, With dog, Knees Up, Limping, Moonwalk, Legs Occluded, Normal Walk, With Briefcase, With Pole, With Skirt]

    Test Seq                1st Best      2nd Best      Median to all actions
    Swinging a bag          Walk 2.508    Skip 3.094    3.939
    Carrying a briefcase    Walk 1.866    Skip 2.170    3.641
    Walking with a dog      Walk 1.806    Skip 2.338    3.824
    Knees Up                Walk 2.894    Side 3.270    4.091
    Limping Man             Walk 2.224    Skip 2.922    3.821
    Sleepwalking            Walk 1.892    Skip 2.132    3.663
    Occluded Legs           Walk 1.883    Skip 2.594    2.624
    Normal Walk             Walk 1.886    Skip 2.624    3.633
    Occluded by a pole      Walk 2.149    Skip 2.945    3.880
    Walking in a skirt      Walk 1.855    Skip 2.159    3.540

    Cambridge Hand gesture

    9 different hand gestures

    Different combinations of shape and motion

    5 different illumination conditions

    KTH Action Dataset

    6 human actions

    25 subjects

    4 different scenarios

    600 sequences divided into 2391 subsequences

    Low resolution: 160 × 120 at 25 fps

    11/10/2014 Binu M Nair 14

    Results on 4 sets using the proposed feature set

    Results on all sets with STIP features

    UCF Sports Dataset

    High resolution: 720 × 480

    200 video sequences

    Contains 9 actions

    Challenges:

    Complex and varying backgrounds

    Wide range of scenes and viewpoint variations

    Tested on 8 actions: dive, golf swing, lift, ride, run, skate, swing and walk

    Tested with a window size of 15 frames and an overlap of 10 frames

    Future work in action recognition

    Testing on the UCF ARG dataset

    Multi-view human action dataset

    Set of actions: boxing, carrying, clapping, digging, jogging, open-close trunk, running, throwing, walking, waving

    Challenges

    Different resolutions across cameras

    Different kinds of features

    Thank You

    Questions?

