Cluster

Date post: 21-May-2015
Upload: dcu
Spatio-temporal local feature clusters

Iveel

Intro

• Overview
• Video as a cloud of feature points
• Clusters of feature points
• Video representation
• Classification
• Decision making
• Result

Overview

• In the Bag-of-Features (BOF) representation, the spatio-temporal configuration of the video is ignored.

• The proposed approach integrates spatio-temporal structure into the video representation.
– Local features are grouped (each group is referred to as a cluster) based on their spatio-temporal proximity.
– Each cluster is represented independently as a BOF (referred to as cluster-level BOF).

• This makes it possible to localize the action within the video segment.

Video

• A video segment can be viewed as a cloud of local features in 3D space (x, y, t).

Local feature grouping

• Intuition: features that lie close together in the spatio-temporal domain are more likely to correspond to the same object, and features that lie far apart are less likely to.

• To exploit this idea, a cluster tree is used to group local features based on their spatio-temporal proximity.

In this example, the local feature points are grouped into two clusters (red and blue).
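The grouping step can be sketched as follows. The slides do not specify the tree-clustering algorithm or the distance measure, so this is only an illustration under stated assumptions: bottom-up (agglomerative) merging by centroid distance, with plain Euclidean distance over (x, y, t) and no rescaling of the time axis. The function name `group_features` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist


def group_features(points, num_clusters):
    """Merge (x, y, t) feature points bottom-up until num_clusters groups remain.

    Assumption: centroid linkage with unweighted Euclidean distance;
    the original slides do not state which linkage or metric was used.
    """
    clusters = [[p] for p in points]  # start with one cluster per feature point

    def centroid(cluster):
        n = len(cluster)
        return tuple(sum(p[i] for p in cluster) / n for i in range(3))

    while len(clusters) > num_clusters:
        # Find the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: dist(centroid(clusters[pair[0]]),
                                  centroid(clusters[pair[1]])),
        )
        # Merge cluster j into cluster i (i < j, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical feature points: three near the start of the clip, two much later.
points = [(10, 12, 1), (11, 13, 2), (12, 11, 3), (80, 75, 40), (82, 77, 41)]
groups = group_features(points, num_clusters=2)
```

Stopping the merge at different values of `num_clusters` yields coarser or finer groupings of the same point cloud. This brute-force loop is only practical for small point sets.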

Cluster-level BOF

• Once local features are grouped into clusters, each cluster is represented using the BOF approach (referred to as cluster-level BOF).
– A frequency histogram is generated over the local descriptors that belong to a particular cluster.
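A minimal sketch of the cluster-level histogram, assuming each local descriptor has already been quantized to a visual-word index in a vocabulary of known size (the slides do not describe the vocabulary or the quantization step). The function name `cluster_bof`, the L1 normalization, and the toy word indices are illustrative choices, not taken from the slides.

```python
from collections import Counter


def cluster_bof(word_ids, vocab_size, normalize=True):
    """Build a frequency histogram over the visual words of one cluster.

    Assumption: word_ids holds integer visual-word indices in
    [0, vocab_size); indices outside that range are ignored.
    """
    counts = Counter(word_ids)
    hist = [counts.get(w, 0) for w in range(vocab_size)]
    total = sum(hist)
    if normalize and total > 0:
        # L1-normalize so clusters with different point counts are comparable.
        hist = [h / total for h in hist]
    return hist


# Hypothetical visual-word indices for the two clusters of the example.
clusters = {"red": [0, 2, 2, 3], "blue": [1, 1, 1, 1]}
histograms = {name: cluster_bof(ids, vocab_size=4)
              for name, ids in clusters.items()}
```

Each cluster gets its own histogram, so a video segment is described by several BOF vectors rather than one.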

Training & Learning

• At each scale, an SVM classifier is trained on the cluster-level BOFs.

Experimental study

• Action segments from TRECVID SED are used for training and testing.
– 7 action classes: CellToEar, Embrace, ObjectPut, PeopleMeet, PeopleSplitUp, PersonRuns, Pointing.
• Training: 210 video segments in total
– 30 video segments per action class
• Testing: 138 video segments in total
– approx. 20 video segments per action class

Experimental study

• A spatio-temporal bounding box is manually drawn for each segment in both the training and test sets.

Experiment 1 - Cluster number vs performance

• The optimal number of clusters is studied.
– In the experiment, 6 different cluster numbers are chosen: 1, 2, 4, 8, 16 and 32.
– For example, if the cluster number is 16, the video segment is divided into 16 sub-regions (clusters) and each has its own BOF histogram (cluster-BOF). Each cluster-BOF is annotated using the bounding box information.
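The slides do not say how the bounding box is turned into a cluster label. One plausible rule, shown here purely as an assumption, is to give a cluster the action label when at least half of its feature points fall inside the manually drawn spatio-temporal box, and a background label otherwise. The names `inside` and `annotate_cluster`, the box layout, the 0.5 threshold, and the "background" label are all hypothetical.

```python
def inside(point, box):
    """Return True if an (x, y, t) point lies within the spatio-temporal box."""
    x, y, t = point
    return (box["x0"] <= x <= box["x1"]
            and box["y0"] <= y <= box["y1"]
            and box["t0"] <= t <= box["t1"])


def annotate_cluster(points, box, action, min_fraction=0.5):
    """Label a cluster with the action if enough of its points are in the box.

    Assumption: the overlap rule and threshold are illustrative; the
    original annotation procedure is not described in the slides.
    """
    if not points:
        return "background"
    fraction = sum(inside(p, box) for p in points) / len(points)
    return action if fraction >= min_fraction else "background"


# Hypothetical bounding box and clusters.
box = {"x0": 0, "x1": 20, "y0": 0, "y1": 20, "t0": 0, "t1": 10}
cluster_a = [(10, 12, 1), (11, 13, 2), (12, 11, 3)]
cluster_b = [(80, 75, 40), (82, 77, 41)]
label_a = annotate_cluster(cluster_a, box, action="PersonRuns")
label_b = annotate_cluster(cluster_b, box, action="PersonRuns")
```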

Experiment 1 - Cluster number vs performance: per-class results

• Result plots (figures not preserved in this transcript), one per action class: CellToEar, Embrace, ObjectPut, PeopleMeet, PeopleSplitUp, PersonRuns, Pointing.

Conclusion

• The results are based on cluster-level BOF.
• To give a segment-level result, the cluster-BOFs that belong to the same video segment must be properly aggregated.
– The naïve approach is to assign to the parent segment the action class that receives the most votes from its clusters.
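The naïve voting scheme can be sketched in a few lines. The input is assumed to be the list of class labels predicted for the clusters of one segment; the function name `segment_label` and the tie-breaking behaviour (the label seen first among equally voted classes wins) are illustrative choices, not taken from the slides.

```python
from collections import Counter


def segment_label(cluster_predictions):
    """Assign the segment the action class with the most cluster votes.

    Returns None for a segment with no clusters. Ties go to the label
    that appears first in cluster_predictions.
    """
    if not cluster_predictions:
        return None
    votes = Counter(cluster_predictions)
    return votes.most_common(1)[0][0]


# Hypothetical per-cluster predictions for one video segment.
predictions = ["Embrace", "PeopleMeet", "Embrace", "Pointing"]
label = segment_label(predictions)
```

A plain majority vote treats every cluster equally, regardless of classifier confidence or cluster size, which is one reason it is described as naïve.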

