Learning and Matching of Dynamic Shape Manifolds for Human Action Recognition

Liang Wang and David Suter

2007 IEEE

Outline

• Introduction

• Contribution

• LPP

• Activity Classification

• Results

Introduction

• The silhouettes can be regarded as points in a high-dimensional visual space, and these points may lie on a low-dimensional manifold embedded in the high-dimensional image space.

Introduction

• Exploit locality preserving projections (LPP) for dimensionality reduction.

• The associated sequences of dynamic silhouettes are used to learn the activity space using LPP.

• To match activity trajectories in the low-dimensional embedding space, two kinds of motion similarity measures are used.

Contribution

• The proposed method has several desirable properties:

– It is easier to comprehend and implement, without requiring explicit feature tracking or complex probabilistic modeling of motion patterns.

– Being based on binary silhouette analysis, it naturally avoids some problems arising in most previous methods, e.g., unreliable 2-D or 3-D tracking and expensive, sensitive optical flow.

– It obtains good results on a large and challenging database and exhibits considerable robustness.

Methods of Dimensionality Reduction

• Curse of dimensionality.

• LPP is based on linear projective maps that arise from solving a variational problem that optimally preserves the neighborhood structure of the data set.

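Brief Introduction to LPP

• Constructing the adjacency graph

– The nodes of the graph are all of the data points. There are two ways to decide whether two nodes are connected by an edge; choose whichever suits the application.

• ε-neighborhoods: two points are connected by an edge if the distance between them is less than some constant ε.

• k-nearest neighbors: two points are connected by an edge if either one is among the k nearest neighbors of the other within the whole data set.

• Choosing the weights

• Eigenmaps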
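The two edge rules above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `epsilon_graph` and `knn_graph`, the Euclidean distance, and the toy points are assumptions made for the example, not something given on the slides.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def epsilon_graph(points, eps):
    """Connect i and j when their distance is below eps (no self-loops)."""
    n = len(points)
    return [[i != j and euclidean(points[i], points[j]) < eps
             for j in range(n)] for i in range(n)]

def knn_graph(points, k):
    """Connect i and j when either one is among the k nearest neighbours of the other."""
    n = len(points)
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: euclidean(points[i], points[j]))
        for j in others[:k]:
            adj[i][j] = adj[j][i] = True  # symmetric: one direction is enough
    return adj

# Toy data (hypothetical): three nearby points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
eps_adj = epsilon_graph(points, 1.5)
knn_adj = knn_graph(points, 1)
```

Note that the far-away point stays isolated under the ε rule but still receives an edge under the k-nearest-neighbor rule, which is one practical difference between the two choices.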

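• Constructing the adjacency graph

• Choosing the weights

– If two points are not connected by an edge in the graph from the first step, their weight is 0; otherwise a weight is assigned in one of two ways. W is a sparse, symmetric matrix.

• More simply: $W_{ij} = 1$

• Or heat kernel: $W_{ij} = e^{-\|x_i - x_j\|^2 / t}$

• Eigenmaps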
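As a small illustration of the weighting step, the sketch below fills a weight matrix from a given adjacency matrix. It assumes the usual LPP heat kernel form exp(-||xi - xj||^2 / t); the function names, the parameter `t`, and the toy chain graph are hypothetical choices for this example.

```python
import math

def simple_weights(adjacency):
    """W_ij = 1 for connected pairs, 0 otherwise."""
    return [[1.0 if connected else 0.0 for connected in row] for row in adjacency]

def heat_kernel_weights(points, adjacency, t):
    """W_ij = exp(-||x_i - x_j||^2 / t) for connected pairs, 0 otherwise."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adjacency[i][j]:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-sq_dist / t)
    return W

# Toy chain graph 0 - 1 - 2 (hypothetical data).
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
adjacency = [[False, True, False],
             [True, False, True],
             [False, True, False]]
W_simple = simple_weights(adjacency)
W_heat = heat_kernel_weights(points, adjacency, t=2.0)
```

Because the adjacency matrix is symmetric and most entries are unconnected, both results are symmetric and mostly zero, matching the slide's remark that W is sparse and symmetric.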

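• Constructing the adjacency graph

• Choosing the weights

• Eigenmaps (finding the projection matrix)

– The optimal projection matrix, which preserves locality, is obtained by minimizing the following objective function: $\sum_{i,j} (y_i - y_j)^2 W_{ij}$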

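Brief Introduction to LPP

• Find the projection matrix A according to the weights W.

• Each projection vector $a_j$ acts independently, so consider instead $\min_a \sum_{i,j} (a^T x_i - a^T x_j)^2 W_{ij}$

• Here a denotes a single projection vector. Letting $y_i = a^T x_i$ and $X = [x_1, \dots, x_n]$, we have $\frac{1}{2} \sum_{i,j} (y_i - y_j)^2 W_{ij} = a^T X L X^T a$

D is a diagonal matrix with $D_{ii} = \sum_j W_{ij}$; $L = D - W$ is called the Laplacian matrix; $y = X^T a$ holds the projections of all data points onto the direction a.

• So far only the relationships between the transformed points are constrained, so a constraint on the transformed coordinates is also needed.

– $D_{ii}$ reflects how many points are connected to point i and therefore how important that point is, so we impose the constraint $y^T D y = 1$, i.e. $a^T X D X^T a = 1$

– This constraint pushes the transformed coordinates of more important points closer to 0, i.e. it places the origin in the densest region. The problem becomes $\arg\min_a a^T X L X^T a$ subject to $a^T X D X^T a = 1$

– The minimum occurs at a stationary point of the Lagrangian $a^T X L X^T a - \lambda (a^T X D X^T a - 1)$

– Matrix differentiation gives $X L X^T a = \lambda X D X^T a$

– The problem thus reduces to a generalized eigenvalue and eigenvector problem.

• Moreover, because we want the objective to be as small as possible, a is taken to be the eigenvectors corresponding to the M smallest nonzero eigenvalues.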
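The three steps (adjacency graph, weights, eigenmaps) can be put together in a short NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: samples are stored as rows of `X` (so the slide's X L X^T becomes `X.T @ L @ X`), the graph uses k nearest neighbors with heat kernel weights, and the function names, `k`, `t`, `tol`, and the random data are all hypothetical. The generalized eigenproblem is solved by Cholesky whitening, which requires `X.T @ D @ X` to be positive definite (for instance, more samples than features).

```python
import numpy as np

def heat_kernel_knn_weights(X, k, t):
    """Symmetric kNN graph with heat-kernel weights; X has shape (n_samples, n_features)."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
    masked = sq.copy()
    np.fill_diagonal(masked, np.inf)              # a point is not its own neighbour
    nearest = np.argsort(masked, axis=1)[:, :k]   # indices of the k nearest points
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    adj |= adj.T                                  # edge if either point is a kNN of the other
    return np.where(adj, np.exp(-sq / t), 0.0)

def lpp(X, W, n_components, tol=1e-10):
    """Return projection vectors (as columns) for the smallest nonzero eigenvalues."""
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # graph Laplacian
    left = X.T @ L @ X
    right = X.T @ D @ X
    # Whitening with the Cholesky factor of `right` turns the generalized
    # problem left a = lambda right a into an ordinary symmetric one.
    C = np.linalg.cholesky(right)
    C_inv = np.linalg.inv(C)
    M = C_inv @ left @ C_inv.T
    evals, evecs = np.linalg.eigh((M + M.T) / 2)  # eigenvalues in ascending order
    keep = np.flatnonzero(evals > tol)[:n_components]
    A = C_inv.T @ evecs[:, keep]                  # map back to the original coordinates
    return A, evals[keep]

# Hypothetical data: 40 samples in 5 dimensions, reduced to 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
W = heat_kernel_knn_weights(X, k=5, t=10.0)
A, evals = lpp(X, W, n_components=2)
Y = X @ A                                         # low-dimensional embedding, shape (40, 2)
```

Since the map is linear, a new sample `x_new` can be embedded with `x_new @ A` without rebuilding the graph.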

Representations of Visual Inputs

• The result of the distance transform is a grayscale image that looks similar to the input image.

• The distance transform assigns to each pixel a number that is the distance between that pixel and the nearest nonzero pixel of the raw image.
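A brute-force version of the distance transform described above fits in a few lines. This sketch assumes Euclidean distance and a binary image given as a list of rows; the function name and the 3×3 example are hypothetical, and the quadratic cost makes it suitable only for illustration.

```python
import math

def distance_transform(binary):
    """For each pixel, return the Euclidean distance to the nearest nonzero pixel."""
    on_pixels = [(r, c) for r, row in enumerate(binary)
                 for c, value in enumerate(row) if value]
    if not on_pixels:
        raise ValueError("image contains no nonzero pixel")
    return [[min(math.hypot(r - pr, c - pc) for pr, pc in on_pixels)
             for c in range(len(row))]
            for r, row in enumerate(binary)]

# Hypothetical 3x3 image with a single nonzero pixel in the centre.
image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
dt = distance_transform(image)
```

The output is real-valued rather than binary, which is why it can be viewed as a grayscale image.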

Activity Classification

• Assume that two action sequences are respectively mapped into A1 (L×T1) and A2 (L×T2).

– L: the reduced dimensionality.

– T1 and T2: the durations of these two complete actions, respectively.

• We select two kinds of distance metrics to measure the motion similarity.

• Classifier

Motion Similarity

• Similarity-I: Normalized spatiotemporal correlation

• The computation usually requires knowing the temporal duration of each action for such an approximate frame-to-frame matching.

• s and b account for time stretching and shifting, respectively, and T = max(T1, T2).

• We warp each action trajectory matrix to the same temporal duration T by bicubic interpolation.
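The idea of warping two trajectories to a common duration and then correlating them can be sketched as follows. Several things here are assumptions rather than the slide's definition: the exact formula for Similarity-I is not reproduced in this transcript, so the correlation below is a generic normalized inner product; linear interpolation stands in for the bicubic interpolation mentioned above; the stretch and shift parameters s and b are left out; and trajectories are stored as lists of T frames with L coordinates each (the transpose of the L×T layout).

```python
import math

def resample(trajectory, T):
    """Linearly resample a trajectory (list of frames, each a list of floats) to T frames."""
    T0 = len(trajectory)
    out = []
    for t in range(T):
        pos = t * (T0 - 1) / (T - 1) if T > 1 else 0.0
        lo = int(math.floor(pos))
        hi = min(lo + 1, T0 - 1)
        w = pos - lo                      # interpolation weight between frames lo and hi
        out.append([(1 - w) * a + w * b
                    for a, b in zip(trajectory[lo], trajectory[hi])])
    return out

def normalized_correlation(A1, A2):
    """Warp both trajectories to T = max(T1, T2) and return their normalized inner product."""
    T = max(len(A1), len(A2))
    B1, B2 = resample(A1, T), resample(A2, T)
    dot = sum(x * y for f1, f2 in zip(B1, B2) for x, y in zip(f1, f2))
    norm1 = math.sqrt(sum(x * x for f in B1 for x in f))
    norm2 = math.sqrt(sum(x * x for f in B2 for x in f))
    return dot / (norm1 * norm2)

# Hypothetical trajectories: the same path sampled with 3 and with 5 frames.
A1 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
A2 = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [1.5, 1.5], [2.0, 2.0]]
similarity = normalized_correlation(A1, A2)
```

A value near 1 indicates that the two trajectories follow nearly the same path at the same relative pace.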

Motion Similarity

• Similarity-II: Median Hausdorff Distance

• A means of determining the resemblance of one point set to another, by examining the fraction of points in one set that lie near points in the other set.
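A median Hausdorff distance between two point sets can be sketched as below. The directed part (median over one set of the distance to the nearest point in the other set) follows the usual definition; combining the two directions by summing them is an assumption made for this example, as are the function names and the toy points.

```python
import math
from statistics import median

def directed_median_hausdorff(A, B):
    """Median, over points a in A, of the distance from a to its nearest point in B."""
    return median(min(math.dist(a, b) for b in B) for a in A)

def median_hausdorff(A, B):
    """Symmetric version; summing the two directions is an assumption of this sketch."""
    return directed_median_hausdorff(A, B) + directed_median_hausdorff(B, A)

# Hypothetical trajectories treated as point sets (no temporal alignment needed).
A = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
B = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
d_clean = median_hausdorff(A, B)

# A single outlier barely affects the median-based distance.
A_outlier = A + [(100.0, 0.0)]
d_outlier = directed_median_hausdorff(A_outlier, B)
```

Unlike a frame-to-frame correlation, this measure does not need the two trajectories to have the same duration, and the median makes it tolerant of a few badly embedded frames.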

Classifier

• Action classification is performed in a nearest-neighbour framework:

– TA: a test action sequence

– Ri: the ith reference action sequence

– d: similarity measure

• Classify the test sequence as the class c of the reference pattern that minimizes the similarity distance to the test sequence.

• d is the similarity measure d1 or d2 defined above
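The nearest-neighbour rule above amounts to a one-line minimization. In this sketch the function name `classify`, the `(label, sequence)` layout of the references, and the toy scalar distance are hypothetical; in practice `distance` would be one of the two trajectory measures d1 or d2.

```python
def classify(test_sequence, references, distance):
    """Return the label of the reference sequence closest to the test sequence.

    references: list of (label, sequence) pairs.
    distance:   callable d(test, reference); smaller means more similar.
    """
    best_label, _ = min(references,
                        key=lambda item: distance(test_sequence, item[1]))
    return best_label

# Toy illustration with scalar "sequences" and absolute difference as the distance.
references = [("walk", 1.0), ("run", 5.0), ("jump", 9.0)]
predicted = classify(4.2, references, lambda a, b: abs(a - b))
```

If a similarity score is used in place of a distance (larger meaning more similar), the same function works after negating the score.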

Results - evaluation dataset

Results - data processing

• The goal is to estimate the action cycle of each silhouette sequence.

• The self-similarity of the object Ot is computed at times t1 and t2, based on the absolute correlation similarity measure.

• Bt1 is the bounding box of the object Ot1.

• In order to account for tracking error, the minimal S is found by translating over a small search radius r.
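The search over a small radius can be sketched as follows. The formula for S is not reproduced in this transcript, so this example assumes a sum of absolute pixel differences minimized over all shifts within radius r; the function name, the treatment of pixels outside the image as 0, and the toy images are likewise assumptions.

```python
def self_similarity(img1, img2, r):
    """Minimum, over shifts (dx, dy) with |dx|, |dy| <= r, of sum |img1(x+dx, y+dy) - img2(x, y)|."""
    height, width = len(img2), len(img2[0])

    def pixel(img, y, x):
        # Pixels outside the image are treated as background (0).
        return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0

    best = None
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            total = sum(abs(pixel(img1, y + dy, x + dx) - img2[y][x])
                        for y in range(height) for x in range(width))
            if best is None or total < best:
                best = total
    return best

# Hypothetical silhouettes: img2 is img1 shifted one pixel to the right.
img1 = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
img2 = [[0, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 0]]
s_no_search = self_similarity(img1, img2, r=0)
s_with_search = self_similarity(img1, img2, r=1)
```

With no search the one-pixel tracking error shows up as a nonzero S, while a search radius of 1 absorbs it.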

Results – results and analysis

• Identification mode:

– The classifier determines which class a given measurement belongs to in the nearest-neighbor framework.

• Verification mode:

– The classifier is asked to verify whether a new measurement really belongs to a certain claimed class.

• Reduced dimension:

– Examine the relationship between the reduced dimensions and the recognition rates.

• Confusion matrix:

– For the unsupervised LPP, there are a few false classifications; the confusion matrix is used to analyze which actions are incorrectly classified.

Results – robustness test

• Corrupted silhouettes by real occlusions:

• Other challenging factors:

– viewpoints

– different clothes

– motion styles

Results - comparisons

• Different dimension reduction methods
