
Distributed Multi-Dimensional Hidden Markov Model: Theory and Application in Multiple-Object Trajectory Classification and Recognition

Xiang Ma, Dan Schonfeld and Ashfaq Khokhar

Department of Electrical and Computer Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL, U.S.A.

ABSTRACT

In this paper, we propose a novel distributed multi-dimensional hidden Markov model (DHMM). The proposed model can represent, for example, multiple motion trajectories of objects and their interaction activities in a scene; it is capable of conveying not only the dynamics of each trajectory, but also the interaction information between multiple trajectories, which can be critical in many applications. We first provide a solution for the non-causal, multi-dimensional hidden Markov model (HMM) by distributing the non-causal model into multiple distributed causal HMMs. We approximate the simultaneous solution of multiple HMMs on a sequential processor by an alternate updating scheme. Subsequently, we provide three algorithms for the training and classification of our proposed model. A new Expectation-Maximization (EM) algorithm suitable for estimation of the new model is derived, in which a novel General Forward-Backward (GFB) algorithm is proposed for recursive estimation of the model parameters. A new conditionally independent subset-state sequence structure decomposition of state sequences is proposed for the 2D Viterbi algorithm. The new model can be applied to many other areas such as image segmentation and image classification. Simulation results in the classification of multiple interacting trajectories demonstrate the superior performance and higher accuracy of our distributed HMM in comparison to previous models.

Keywords: Trajectory Modelling, Activity Recognition, Hidden Markov Models

1. INTRODUCTION

Motion trajectories of objects have been shown to provide critical information in the representation of object dynamics for retrieval and classification. In many applications, the simultaneous interaction among multiple objects provides information that is invaluable in retrieval and classification, e.g. characterizing interaction activities and group dynamics. Examples of interactions of multiple objects are depicted in Fig. 1. It is important to point out that the classification of motion activities involving multiple interacting objects can only be ascertained by modelling their trajectories simultaneously. In particular, classifying the motion trajectory of each object separately cannot extract interactive events. For example, in Fig. 1(a), if we look at one person only, we can say "one person walks." However, the underlying event "two people walk and meet" cannot be determined.

The hidden Markov model (HMM) is a very powerful tool for modelling the temporal dynamics of processes, and has been successfully applied to many applications such as speech recognition [1], gesture recognition [2], and musical score following [3]. F. Bashir et al. [4] presented a novel classification algorithm for object motion trajectories based on the 1D HMM. They segmented a single trajectory into atomic segments, called subtrajectories, based on the curvature of the trajectory; the subtrajectories are then represented by their principal component analysis (PCA) coefficients, and the temporal relationships among subtrajectories are captured by fitting a 1D HMM. However, all of the above applications rely on a one-dimensional HMM structure. Simple combinations of 1D HMMs cannot be used to characterize multiple trajectories, since 1D models fail to convey the interaction information of multiple interacting objects. The major challenge here is to develop a new model that semantically preserves and conveys this "interaction" information.

Further author information: (Send correspondence to X. Ma.) E-mails: {mxiang, ds, ashfaq}@ece.uic.edu


Figure 1. Examples of multiple interactive trajectories: (a) "Two people walk and meet". (b) "Two airplanes fly towards each other and pass by".

Various HMM model structures have been proposed as extensions of 1D HMMs. Early efforts devoted to extending 1D-HMMs to higher dimensions were presented by pseudo 2D-HMMs [5, 6]. The model is called "pseudo 2D" in the sense that it is not a fully connected 2D-HMM. The basic assumption is that there exists a set of "superstates" that are Markovian, and within each superstate there is a set of simple Markovian states. To illustrate this model for higher-dimensional systems, let us consider a two-dimensional image. The transition between superstates is modeled as a first-order Markov chain and each superstate is used to represent an entire row of the image; a simple Markov chain is then used to generate observations in the column, as depicted in Fig. 2(a). Thus, superstates relate to rows while simple states relate to columns of the image. Later efforts to represent 2D data using 1D HMMs were proposed using coupled HMMs (CHMMs) [7, 8]. In this framework, each state of the 1D HMM is used as a meta-state to represent a collection of states, as depicted in Fig. 2(b). For example, image representation based on CHMMs would rely on a 1D HMM where each state represents an entire column of the image. In certain applications, these models perform better than the classical 1D-HMM [5]. However, the performance of pseudo 2D-HMMs and CHMMs remains limited, since these models capture only part of the two-dimensional hidden state information.

The first analytic solution to true two-dimensional HMMs was presented by Li, Najmi and Gray [9]. They proposed a causal two-dimensional HMM and presented its application to image classification. In this model, the state transition probability for each node is conditioned on the states of the nearest neighboring nodes in the horizontal and vertical directions, as depicted in Fig. 2(c). The limitation of this approach is that the state dependence of a specific node may arise from any direction and from any of its neighbors. Thus, the analytic solution to the two-dimensional model presented in [9] will only capture partial information. In particular, the training and classification algorithms presented in [9] rely on the causality of the model. Hence, direct extension of these algorithms to general 2D-HMMs, which can represent state dependencies from neighbors in all directions, is not possible, since such a model is inherently non-causal.

We propose a novel distributed multi-dimensional hidden Markov model (DHMM) for the modelling of interacting trajectories involving multiple objects. In our model, each object trajectory is modelled as a separate hidden Markov process, while "interactions" between objects are modelled as dependencies of the state variables of one process on the states of the others. The intuition behind our work is that an HMM is a very powerful tool for modelling the temporal dynamics of each process (trajectory); each process has its own dynamics, while it may influence, or be influenced by, the others. In our proposed model, "influence" or "interaction" among processes (trajectories) is modelled as dependencies of state variables among processes. Our model is capable of conveying not only the dynamics of each trajectory, but also the interaction information between multiple trajectories, while requiring no semantic analysis.

We first provide a solution for non-causal, multi-dimensional HMMs by distributing the non-causal model into multiple distributed causal HMMs. We approximate the simultaneous solution of multiple distributed HMMs on a sequential processor by an alternate updating scheme; one possible alternate updating scheme is depicted in Fig. 3, where the numbers {1, 2, 3, 4, ...} indicate the order in which the model parameters are updated.


Figure 2. Various two-dimensional hidden Markov models: (a) Pseudo 2D-HMM. (b) Coupled HMM (CHMM). (c) Causal 2D-HMM [9] with two nearest neighbors in the vertical and horizontal directions. (d) Proposed general non-causal 2D-HMM.

Figure 3. Sequential alternate updating scheme of multiple distributed HMMs

Subsequently, we extend the training and classification algorithms presented in [9] to a general causal model. A new Expectation-Maximization (EM) algorithm for estimation of the new model is derived, in which a novel General Forward-Backward (GFB) algorithm is proposed for recursive estimation of the model parameters. A new conditionally independent subset-state sequence structure decomposition of state sequences is proposed for the 2D Viterbi algorithm. The new model can be applied to many problems in pattern analysis and classification. For simplicity, the presentation in this paper will focus primarily on a special case of our model in two dimensions, which we refer to as the distributed 2D hidden Markov model (2D DHMM).


Figure 4. Distributed 2D hidden Markov models: (a) Non-causal 2D hidden Markov model. (b) Distributed 2D hidden Markov model 1. (c) Distributed 2D hidden Markov model 2.

2. DISTRIBUTED 2D HIDDEN MARKOV MODEL

Suppose there are M ∈ N interacting objects in a scene. Recall that in our model, each object trajectory is modelled as a hidden Markov process in time, while "interactions" between object trajectories are modelled as dependencies of the state variables of one process on those of the others. We constrain the probabilistic dependencies of the state of one process (trajectory) at time t to its own state at time t−1, as well as to the states of the other processes (trajectories) that "interact" with, or influence, it at times t and t−1, i.e.

$$\Pr\big(s(m,t)\,\big|\,s(l,t),\, s(n,1:t-1)\big) = \Pr\big(s(m,t)\,\big|\,s(n,t-1),\, s(l,t)\big) \tag{1}$$

where m, n, l ∈ {1, ..., M} are indices of processes (trajectories) and l ≠ m. The above constraint on state dependencies makes the desired model non-causal: since each process (trajectory) can influence the others, there is no guarantee that the influence is directional or causal. Fig. 4(a) shows an example of our non-causal 2D HMM, which is used to model two interacting trajectories. Each node S(i, t) in the figure represents one state at a specific time t for trajectory i, where t ∈ {1, 2, ..., T} and i ∈ {1, 2}; each node O(i, t) represents the observations corresponding to S(i, t), and each arrow indicates a transition of states (its reverse direction indicates a dependency of states). The first row of states is the state sequence for trajectory 1, and the second row corresponds to trajectory 2. As can be seen, each state in one HMM chain (trajectory) depends on its own past state, the past state of the other HMM chain, and the concurrent state of the other HMM chain.
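To make the dependency structure of eq. (1) concrete, the following is a minimal sketch (not the authors' code; the state count `N`, the array name `trans`, and the helper `transition_prob` are illustrative assumptions) of how the conditional transition table for one process in an M = 2 system could be stored and queried:

```python
import numpy as np

N = 4  # assumed number of hidden states per process (trajectory)

# trans[i, j, k, l] = Pr(s(m,t) = l | s(m,t-1) = i, s(m',t-1) = j, s(m',t) = k):
# the state of process m at time t is conditioned on its own past state,
# the other process's past state, and the other process's concurrent state.
rng = np.random.default_rng(0)
trans = rng.random((N, N, N, N))
trans /= trans.sum(axis=3, keepdims=True)  # normalize over the new state l

def transition_prob(own_prev, other_prev, other_now, own_now):
    """Probability that process m moves to own_now, per eq. (1)."""
    return trans[own_prev, other_prev, other_now, own_now]

print(transition_prob(0, 1, 2, 3))
```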

The above model is capable of modelling multiple processes and their interactions, but it is intractable since it is non-causal. We propose a novel and effective solution in which we "decompose" it into M causal 2D hidden Markov models with multiple state dependencies, such that each HMM can be executed in parallel in a distributed framework. In each of the distributed causal HMMs, state transitions (or state dependencies) must follow the same causality rule. For example, we distribute the non-causal 2D HMM in Fig. 4(a) into two causal 2D HMMs, shown in Figs. 4(b) and 4(c), respectively. In Fig. 4(b), all state transitions follow the same rule, as do the state transitions in Fig. 4(c). These rules ensure the homogeneous structure of each distributed HMM, which further enables us to develop relatively tractable training and classification algorithms.

When the number of trajectories is M = 3, our non-causal 2D hidden Markov model is depicted in Fig. 5(a); we use the same distributing scheme to obtain 3 distributed 2D hidden Markov models, as shown in Figs. 5(b), 5(c), and 5(d), respectively. Using the same distributing scheme, we can distribute any non-causal 2D hidden Markov model characterizing M (> 3) trajectories into M distributed causal 2D hidden Markov models.

3. DHMM TRAINING AND CLASSIFICATION

Define the observed feature vector set O = {o(m, t), m = 1, 2, ..., M; t = 1, 2, ..., T} and the corresponding hidden state set S = {s(m, t), m = 1, 2, ..., M; t = 1, 2, ..., T}, and assume each state can take N possible values.


Figure 5. Distributed 2D hidden Markov models with application to 3 trajectories: (a) Non-causal 2D hidden Markov model that treats the 3 object trajectories as one system (only 2 adjacent time slots of the system states are shown). (b) Distributed 2D hidden Markov model 1 for object trajectory 1. (c) Distributed 2D hidden Markov model 2 for object trajectory 2. (d) Distributed 2D hidden Markov model 3 for object trajectory 3. (Note that in figures (b), (c) and (d), only the state transitions into one state point are shown; the other state points follow the same rules, respectively.)

The model parameters are defined as a set Θ = {Π, A, B}, where Π is the set of initial state probabilities Π = {π(m, n)}; A is the set of state transition probabilities A = {a_{i,j,k,l}(m)}, with

$$a_{i,j,k,l}(m) = \Pr\big(s(m,t) = l \,\big|\, s(m',t) = k,\; s(m,t-1) = i,\; s(m',t-1) = j\big);$$

and B is the set of probability density functions (PDFs) of the observed feature vectors given the corresponding states. We assume B is a set of Gaussian distributions with means μ_{m,n} and covariances Σ_{m,n}, where m, m' = 1, ..., M; m ≠ m'; n, i, j, k, l = 1, ..., N; t = 1, ..., T. Due to space limitations, we discuss the case M = 2 in detail.
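As a concrete illustration of the parameter set Θ = {Π, A, B} for one distributed 2D HMM in the N-state, d-dimensional case, a simple container might look like the hedged sketch below (the class and field names are our assumptions, not the authors'):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DHMMParams:
    """Parameter set Theta = {Pi, A, B} of one distributed 2D HMM (a sketch).

    pi:    initial state probabilities, shape (N,)
    a:     state transition probabilities a[i, j, k, l], shape (N, N, N, N),
           normalized over the last axis (the new state l)
    mu:    Gaussian emission means mu_{m,n}, shape (N, d)
    sigma: Gaussian emission covariances Sigma_{m,n}, shape (N, d, d)
    """
    pi: np.ndarray
    a: np.ndarray
    mu: np.ndarray
    sigma: np.ndarray
```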

Note that in all the following discussions, the time durations and the number of trajectories in one scene are assumed fixed. This assumption is easily violated, since different objects appear and disappear at different moments; however, since our aim is to model multiple interacting trajectories, we are only interested in those time durations in which the interacting objects co-exist and interact with each other. For a long time duration within which the number of objects changes, we can divide it into smaller time durations within which the number of interacting trajectories remains the same. Moreover, the single-object portions of a trajectory can easily be modelled by a 1D HMM, so that only multiple interacting trajectories are modelled as DHMMs.

3.1. Expectation-Maximization Algorithm

We propose a new Expectation-Maximization (EM) algorithm suitable for estimating the parameters of the M distributed 2D hidden Markov models in an M-trajectory system; it is analogous to, but distinct from, the EM algorithm for the 1D HMM [10]. The EM algorithm was first proposed by Dempster, Laird and Rubin [11], and there are many applications of the EM algorithm to hidden Markov models [10, 1]. We take the DHMM applied to 2 trajectories, specifically the DHMM in Fig. 4(c), as an example to explain how the EM algorithm works for the DHMM; the estimation of the parameters of the DHMM applied to M (> 2) trajectories proceeds in the same way. Recall the distributed 2D hidden Markov model in Fig. 4(c), and assume the duration of the trajectories is T (the case where trajectories have different durations was discussed above). We have the observation sequence O = {o(i, j), i = 1, 2; j = 1, ..., T} and hidden states S = {s(i, j), i = 1, 2; j = 1, ..., T}, where o(1, j), o(2, j) refer to the observation (feature) sets of objects 1 and 2, and s(1, j), s(2, j) refer to the state sets of objects 1 and 2. Here S refers to the union of these two state sets. Assume the number of states is N, so s(1, j), s(2, j) ∈ {1, 2, ..., N} for j = 1, 2, ..., T.


The incomplete data is O, and the complete data is (O, S). The incomplete-data likelihood function is P(O|Θ); the complete-data likelihood function is P(O, S|Θ). We would like to maximize the complete-data likelihood function P(O, S|Θ). According to the EM algorithm, the Q function is:

$$Q(\Theta,\Theta') = \sum_{S\in V} \log\big(P(O,S\,|\,\Theta)\big)\, P(S\,|\,O,\Theta'), \tag{2}$$

where

$$P(O,S\,|\,\Theta) = \pi_{s(2,0)} \prod_{t=1}^{T} a_{m,n,k,l}\, b_{s(1,t)}(o(1,t))\, b_{s(2,t)}(o(2,t)). \tag{3}$$

Here Θ′ is the current (known) estimate of the parameters, Θ is the new (unknown) estimate of the parameters that maximizes the likelihood function, and V is the space of all possible state sequences of length T. The joint probability of the observations O and states S is P(O, S|Θ).

Substituting (3) into (2), we get

$$
\begin{aligned}
Q(\Theta,\Theta') = {}& \underbrace{\sum_{S\in V} \log\big(\pi_{s(2,0)}\big)\, P(O,S\,|\,\Theta')}_{A^*}
+ \underbrace{\sum_{S\in V}\sum_{t=1}^{T} \log\big(a_{m,n,k,l}\big)\, P(S\,|\,O,\Theta')}_{B^*} \\
& + \underbrace{\sum_{S\in V}\sum_{t=1}^{T} \log\big(b_{s(2,t)}(o(2,t))\big)\, P(S\,|\,O,\Theta')}_{C^*}
+ \underbrace{\sum_{S\in V}\sum_{t=1}^{T} \log\big(b_{s(1,t)}(o(1,t))\big)\, P(S\,|\,O,\Theta')}_{D^*}.
\end{aligned} \tag{4}
$$

In the above equations, $a_{m,n,k,l}$ is the transition probability from states s(2, t−1), s(1, t−1), s(1, t) to state s(2, t), when s(2, t−1) is in state m, s(1, t−1) is in state n, s(1, t) is in state k, and s(2, t) is in state l, with m, n, k, l ∈ {1, 2, ..., N}; $b_{s(m,t)}(o(m,t))$ is the probability of observation o(m, t) from trajectory m given the corresponding state s(m, t), m = 1, 2. We assume the observations follow a d-dimensional Gaussian distribution when the corresponding state is i (i ∈ {1, 2, ..., N}), i.e.

$$b_{m,i}(o(m,t)) = \frac{1}{(2\pi)^{d/2}\, |\Sigma_{m,i}|^{1/2}}\, e^{-\frac{1}{2}\,(o(m,t)-\mu_{m,i})^{T}\,\Sigma_{m,i}^{-1}\,(o(m,t)-\mu_{m,i})} \tag{5}$$

where $\mu_{m,i}$ is the d-dimensional mean vector, $\Sigma_{m,i}$ is the d × d covariance matrix, and d is the dimensionality of the observation (feature) vector.
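A direct transcription of eq. (5) into code, assuming numpy (the function name is ours; in practice one would evaluate the log-density for numerical stability):

```python
import numpy as np

def gaussian_emission(o, mu, sigma):
    """Evaluate the d-dimensional Gaussian density b_{m,i}(o(m,t)) of eq. (5).

    o: observation vector, shape (d,); mu: mean vector, shape (d,);
    sigma: covariance matrix, shape (d, d).
    """
    d = o.shape[0]
    diff = o - mu
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    quad = diff @ np.linalg.solve(sigma, diff)  # (o-mu)^T Sigma^{-1} (o-mu)
    return float(np.exp(-0.5 * quad) / norm)
```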

To maximize Q(Θ, Θ′), we maximize each of (A∗), (B∗), (C∗) and (D∗). Maximizing (A∗), (B∗), (C∗) and (D∗) yields the iterative updating formulas for the parameters of our distributed 2D HMM.

Define $F^{(p)}_{m,n,k,l}(i,j)$ as the probability that the state corresponding to observation o(i−1, j) is m, the state corresponding to observation o(i−1, j−1) is n, the state corresponding to observation o(i, j−1) is k, and the state corresponding to observation o(i, j) is l, given the observations and model parameters:

$$F^{(p)}_{m,n,k,l}(i,j) = P\big(m = s(i-1,j),\; n = s(i-1,j-1),\; k = s(i,j-1),\; l = s(i,j)\,\big|\,O,\Theta^{(p)}\big), \tag{6}$$

and define $G^{(p)}_{m}(i,j)$ as the probability that the state corresponding to observation o(i, j) is m:

$$G^{(p)}_{m}(i,j) = P\big(s(i,j) = m\,\big|\,O,\Theta^{(p)}\big). \tag{8}$$

The iterative updating formulas for the parameters of the proposed model are:

$$\pi^{(p+1)}_{m} = P\big(G^{(p)}_{m}(1,1)\,\big|\,O,\Theta^{(p)}\big), \tag{9}$$

$$a^{(p+1)}_{m,n,k,l} = \frac{\sum_{i}^{I}\sum_{j}^{J} F^{(p)}_{m,n,k,l}(i,j)}{\sum_{l=1}^{N}\sum_{i}^{I}\sum_{j}^{J} F^{(p)}_{m,n,k,l}(i,j)}, \tag{10}$$

$$\mu^{(p+1)}_{m} = \frac{\sum_{i}^{I}\sum_{j}^{J} G^{(p)}_{m}(i,j)\, o(i,j)}{\sum_{i}^{I}\sum_{j}^{J} G^{(p)}_{m}(i,j)}, \tag{11}$$

$$\Sigma^{(p+1)}_{m} = \frac{\sum_{i}^{I}\sum_{j}^{J} G^{(p)}_{m}(i,j)\,\big(o(i,j)-\mu^{(p+1)}_{m}\big)\big(o(i,j)-\mu^{(p+1)}_{m}\big)^{T}}{\sum_{i}^{I}\sum_{j}^{J} G^{(p)}_{m}(i,j)}. \tag{12}$$

In eqns. (6)-(12), p is the iteration step number. $F^{(p)}_{m,n,k,l}(i,j)$ and $G^{(p)}_{m}(i,j)$ are unknown in the above formulas; next, we propose a General Forward-Backward (GFB) algorithm for estimating them.
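For orientation, here is a hedged sketch of the M-step updates (10)-(12), assuming the posteriors F and G have already been estimated (e.g., by the GFB algorithm of the next subsection) and are stored as dense arrays; the names, shapes, and the omission of the initial-probability update (9) are our simplifications:

```python
import numpy as np

def m_step(F, G, obs):
    """Re-estimate a, mu, Sigma via eqns. (10)-(12) (a sketch).

    F:   posterior transition probabilities, shape (I, J, N, N, N, N),
         indexed as F[i, j, m, n, k, l]
    G:   posterior state marginals, shape (I, J, N), indexed as G[i, j, m]
    obs: observations o(i, j), shape (I, J, d)
    """
    # Eq. (10): sum F over positions, then normalize over the target state l.
    num = F.sum(axis=(0, 1))                              # (N, N, N, N)
    a = num / num.sum(axis=3, keepdims=True)

    # Eq. (11): G-weighted mean of the observations for each state m.
    w = G.sum(axis=(0, 1))                                # (N,)
    mu = np.einsum('ijm,ijd->md', G, obs) / w[:, None]

    # Eq. (12): G-weighted scatter around the updated means.
    diff = obs[:, :, None, :] - mu[None, None, :, :]      # (I, J, N, d)
    sigma = np.einsum('ijm,ijmd,ijme->mde', G, diff, diff) / w[:, None, None]
    return a, mu, sigma
```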

3.2. General Forward-Backward (GFB) Algorithm

The Forward-Backward algorithm was first proposed by Baum et al. [12] for the 1D hidden Markov model and later modified by Jia Li et al. in [9]. Here, we generalize the Forward-Backward algorithm of [12] and [9] so that it can be applied to any HMM; the proposed algorithm is called the General Forward-Backward (GFB) algorithm. The GFB algorithm can be applied to any HMM whose state sequence satisfies the following property: the probability of the all-state sequence S can be decomposed as a product of probabilities of conditionally independent subset-state sequences U_0, U_1, ..., i.e., P(S) = P(U_0) P(U_1|U_0) ⋯ P(U_i|U_{i−1}) ⋯, where U_0, U_1, ..., U_i, ... are subsets of the all-state sequence of the HMM system, which we call subset-state sequences. Define the observation sequence corresponding to each subset-state sequence U_i as O_i. The subset-state sequences for our model are shown in Fig. 6(b)(c). This new structure enables us to use the GFB algorithm to estimate the model parameters.

3.2.1. Forward and Backward Probability

Define the forward probability $\alpha_{U_u}(u)$, u = 1, 2, ..., as the probability of observing the observation sequences $O_v$ (v ≤ u) corresponding to the subset-state sequences $U_v$ (v ≤ u) and having the u-th subset-state sequence in the decomposition equal to $U_u$, given model parameters Θ, i.e. $\alpha_{U_u}(u) = P\{S(u) = U_u,\, O_v,\, v \le u \,|\, \Theta\}$; and the backward probability $\beta_{U_u}(u)$, u = 1, 2, ..., as the probability of observing the observation sequences $O_v$ (v > u) corresponding to the subset-state sequences $U_v$ (v > u), given that the u-th subset-state sequence is $U_u$ and model parameters Θ, i.e. $\beta_{U_u}(u) = P(O_v,\, v > u \,|\, S(u) = U_u, \Theta)$. The recursive updating formulas of the forward and backward probabilities can be obtained as

$$\alpha_{U_u}(u) = \Big[\sum_{U_{u-1}} \alpha_{U_{u-1}}(u-1)\, P\{U_u \,|\, U_{u-1},\Theta\}\Big]\, P\{O_u \,|\, U_u,\Theta\}, \tag{13}$$

$$\beta_{U_u}(u) = \sum_{U_{u+1}} P(U_{u+1} \,|\, U_u,\Theta)\, P(O_{u+1} \,|\, U_{u+1},\Theta)\, \beta_{U_{u+1}}(u+1). \tag{14}$$

Then, the estimation formulas for $F_{m,n,k,l}(i,j)$ and $G_{m}(i,j)$ are:

$$G_{m}(i,j) = \frac{\alpha_{U_u}(u)\,\beta_{U_u}(u)}{\sum_{u:\,U_u(i,j)=m} \alpha_{U_u}(u)\,\beta_{U_u}(u)}, \tag{15}$$

$$F_{m,n,k,l}(i,j) = \frac{\alpha_{U_{u-1}}(u-1)\, P(U_u \,|\, U_{u-1},\Theta)\, P(O_u \,|\, U_u,\Theta)\, \beta_{U_u}(u)}{\sum_{u}\sum_{u-1}\big[\alpha_{U_{u-1}}(u-1)\, P(U_u \,|\, U_{u-1},\Theta)\, P(O_u \,|\, U_u,\Theta)\, \beta_{U_u}(u)\big]}. \tag{16}$$
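Since the GFB recursions (13)-(14) run over whole subset-state sequences, each "state" of the recursion is one joint configuration U_u. Below is a minimal log-space sketch under that view, with the joint configurations enumerated explicitly (tractable only when the blocks are small); the input encoding and the folding of the initial probabilities into the first emission term are our assumptions:

```python
import numpy as np
from scipy.special import logsumexp

def gfb(log_trans, log_emit):
    """General Forward-Backward over subset-state sequences (a sketch).

    log_trans[u][p, q]: log P(U_u = q | U_{u-1} = p, Theta), with p, q indexing
                        enumerated joint configurations of U_{u-1} and U_u.
    log_emit[u][q]:     log P(O_u | U_u = q, Theta); the initial probabilities
                        are assumed folded into log_emit[0].
    Returns log alpha and log beta per eqns. (13) and (14).
    """
    U = len(log_emit)
    alpha = [None] * U
    beta = [None] * U
    alpha[0] = log_emit[0]
    for u in range(1, U):  # eq. (13): sum over U_{u-1}, then multiply by emission
        alpha[u] = logsumexp(alpha[u - 1][:, None] + log_trans[u], axis=0) + log_emit[u]
    beta[U - 1] = np.zeros_like(log_emit[U - 1])
    for u in range(U - 2, -1, -1):  # eq. (14): sum over U_{u+1}
        beta[u] = logsumexp(log_trans[u + 1] + (log_emit[u + 1] + beta[u + 1])[None, :], axis=1)
    return alpha, beta
```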


Figure 6. Various HMM models and their corresponding conditionally independent subset-state sequence decomposition structures for the GFB algorithm: (a) Distributed 2D hidden Markov model 1 for 3 trajectories. (b) Distributed 2D hidden Markov model 2 for 3 trajectories. (c) Causal 2D hidden Markov model. (d) 1D hidden Markov model.

3.3. Viterbi Algorithm

For classification, we employ a two-dimensional Viterbi algorithm [13] to search for the combination of states with the maximum a posteriori probability and map each block to a class. This process is equivalent to searching for the state of each block using an extension of the variable-state Viterbi algorithm presented in [9], based on the new structure in Fig. 6(b)(c). If we were to search over all combinations of states, then, with w(u) denoting the number of states in each subset-state sequence U_u, the number of possible sequences of states at every position would be M^{w(u)}, which is computationally infeasible. To reduce the computational complexity, we keep only the N sequences of states with the highest likelihoods out of the M^{w(u)} possible ones.
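The pruning step can be sketched as a beam search in which, after each subset-state sequence, only the highest-likelihood joint configurations survive; the `beam` parameter below plays the role of N above, and the input encoding matches the GFB sketch (all names are our assumptions):

```python
def pruned_viterbi(log_trans, log_emit, beam=8):
    """Variable-state Viterbi with beam pruning (a sketch).

    Keeps only the `beam` joint configurations with the highest accumulated
    log-likelihood at each subset-state sequence, instead of all candidates.
    Returns the best (log-likelihood, configuration path) pair found.
    """
    U = len(log_emit)
    hyps = [(log_emit[0][q], [q]) for q in range(len(log_emit[0]))]
    hyps = sorted(hyps, key=lambda h: -h[0])[:beam]
    for u in range(1, U):
        cand = []
        for ll, path in hyps:
            for q in range(len(log_emit[u])):
                cand.append((ll + log_trans[u][path[-1], q] + log_emit[u][q],
                             path + [q]))
        hyps = sorted(cand, key=lambda h: -h[0])[:beam]  # prune to beam width
    return hyps[0]
```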

3.4. Summary of DHMM Training and Classification Algorithms

-Training:

1. Assign initial values to {π_{m,n}, a_{i,j,k,l}, μ_{m,n}, Σ_{m,n}}.

2. Update the forward and backward probabilities according to eqns. (13) and (14) using the proposed GFB algorithm; calculate the old log P(O|Θ_0).

3. Update F_{i,j,k,l}(m, t) and G_n(m, t) according to eqns. (15) and (16).

4. Update π_{m,n}, a_{i,j,k,l}(m), μ_{m,n} and Σ_{m,n} according to eqns. (9)-(12) using the proposed EM algorithm.

5. Go back to step 2 and calculate the new log P(O|Θ); stop if log P(O|Θ) − log P(O|Θ_0) falls below a pre-set threshold.

-Classification: Use the two-dimensional Viterbi algorithm to search for the combination of states with the maximum a posteriori (MAP) probability.
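Assembled into one loop, the training procedure reads as below; `e_step` and `m_step_update` are hypothetical stand-ins for the GFB estimation of eqns. (15)-(16) and the updates of eqns. (9)-(12), so this is only a hedged sketch of the control flow:

```python
def train_dhmm(params, obs, tol=1e-4, max_iter=100):
    """DHMM training loop (a sketch of steps 1-5 above).

    e_step and m_step_update are hypothetical helpers: the E-step would run
    GFB and eqns. (15)-(16); the M-step would apply eqns. (9)-(12).
    """
    old_ll = -float('inf')
    for _ in range(max_iter):
        F, G, ll = e_step(params, obs)              # steps 2-3
        params = m_step_update(params, F, G, obs)   # step 4
        if ll - old_ll < tol:                       # step 5: stopping rule
            break
        old_ll = ll
    return params
```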

4. EXPERIMENTAL RESULTS: MULTIPLE-OBJECT TRAJECTORY CLASSIFICATION

For simplicity, we only tested our DHMM model-based multiple-trajectory classification algorithm on the M = 2 trajectory case. We test the classification performance of the proposed distributed 2D HMM-based classifier, a causal 2D HMM-based classifier, and a traditional 1D HMM-based classifier on two datasets: (A) a synthetic multiple-trajectory dataset; (B) a subset of the Context Aware Vision using Image-based Active Recognition (CAVIAR) dataset*, which contains video clips of multiple trajectories with interactions.


Figure 7. Multiple-trajectory samples of two classes in the synthetic dataset: (a)(b) two multiple-trajectory (M=2) samples from class 1; (c)(d) two multiple-trajectory (M=2) samples from class 2.

Figure 8. Multiple-trajectory samples of two classes in the CAVIAR dataset: (a) multiple-trajectory (M=2) sample and one frame from class 1: "Two people meet and walk together"; (b) multiple-trajectory (M=2) sample and one frame from class 2: "Two people meet, fight and run away".


Figure 9. ROC curve of DHMM, Strictly Causal 2D HMM and 1D HMM for Synthetic data

The results are reported in terms of three criteria:

1. The average Receiver Operating Characteristic (ROC) curve.

The ROC curve captures the trade-off between the false positive rate and the true positive rate as the threshold on the likelihood at the output of the classifier is varied. The resulting ROC curves are shown in Figs. 9 and 10. As a baseline, the performance of a uniformly distributed random classifier is also presented in each ROC curve.

2. The Area Under the Curve (AUC).

The AUC is a convenient way of comparing classifiers; it varies from 0.5 (random classifier) to 1.0 (ideal classifier). The AUCs for the two datasets are shown in Figs. 9 and 10, respectively.

3. Classification Accuracy.

The classification accuracy is defined as

$$P_{\mathrm{Accuracy}} = 1 - \frac{|F|}{|S|}, \tag{17}$$

where |F| is the cardinality of the set of false positives and |S| is the cardinality of the whole dataset.

We first test on the synthetic dataset. We construct a synthetic 2-trajectory dataset with 45 classes of 2-trajectories, each of which has 30 samples, for a total of 1350 two-trajectory samples. Fig. 7 shows four multiple-trajectory (M=2) samples from 2 of the classes in our synthetic dataset. We use 50% of the samples as training data and the rest as testing data. The ROC curve is shown in Fig. 9. The test results show that our DHMM-based classifier achieves a high classification accuracy of 91.25%, followed by the strictly causal 2D HMM-based classifier, which is 8% lower than ours, and the 1D HMM-based classifier, which is 15% lower than ours, as shown in Table 1. We then test on our CAVIAR dataset: we select the data classes that have 2 people interacting with each other, and use 50% of the ground-truth trajectory samples as training data and the rest as testing data. There are 9 classes comprising 180 two-people interacting trajectories. Fig. 8 shows multiple-trajectory (M=2) samples from two of the 9 classes in our CAVIAR dataset. The ROC curve is shown in Fig. 10. Here we can see that the performance of the DHMM, which models the two object trajectories as one system, is better than that of the causal 2D HMM and the 1D HMM. As shown in Table 1, the average classification accuracy of our DHMM-based classifier reaches 92.04%, which is 8% higher than the strictly causal 2D HMM-based classifier and 12% higher than the 1D HMM-based classifier.

*The CAVIAR dataset is from the EC Funded CAVIAR project / IST 2001 37540 (http://homepages.inf.ed.ac.uk/rbf/CAVIAR/).


Figure 10. ROC curve of DHMM, Strictly Causal 2D HMM and 1D HMM for CAVIAR data

Table 1. Average classification accuracy

Method                     SYNTHETIC (1350)    CAVIAR (180)
1D HMM                     0.7654              0.8097
Strictly Causal 2D HMM     0.8319              0.8420
DHMM                       0.9125              0.9204

5. CONCLUSION

We propose a novel distributed 2D hidden Markov model (DHMM) for the modelling of multiple-object trajectories and their interactions. This model can capture multi-trajectory interaction information that is lost in previously proposed models such as the 1D HMM and the causal 2D HMM. For estimation of the DHMM model parameters, we derived a new EM algorithm suitable for our model, and a novel General Forward-Backward (GFB) algorithm is proposed for recursive calculation of the model parameters. Simulation results on 1530 two-trajectory samples from the synthetic and CAVIAR datasets show the superior performance and higher accuracy of our proposed distributed 2D hidden Markov model.

REFERENCES

1. L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE 77, pp. 257–286, 1989.
2. T. Starner and A. Pentland, "Real-time American Sign Language recognition from video using hidden Markov models," Technical Report 375, MIT Media Lab, Perceptual Computing Group, 1995.
3. C. Raphael, "Automatic segmentation of acoustic musical signals using hidden Markov models," IEEE Transactions on Pattern Analysis and Machine Intelligence 21, pp. 360–370, 1999.
4. F. I. Bashir, A. A. Khokhar, and D. Schonfeld, "HMM-based motion recognition system using segmented PCA," IEEE International Conference on Image Processing (ICIP'05) 3, pp. 1288–1291, 2005.
5. S. S. Kuo and O. E. Agazzi, "Machine vision for keyword spotting using pseudo 2D hidden Markov models," Proceedings of the International Conference on Acoustics, Speech and Signal Processing 5, pp. 81–84, 1993.
6. C. C. Yen and S. S. Kuo, "Degraded documents recognition using pseudo 2D hidden Markov models in gray-scale images," Proceedings of SPIE 2277, pp. 180–191, 1994.
7. M. Brand, "Coupled hidden Markov models for modeling interacting processes," Technical Report 405, MIT Media Lab, Perceptual Computing Group, 1997.
8. M. Brand, N. Oliver, and A. Pentland, "Coupled hidden Markov models for complex action recognition," IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'97), p. 994, 1997.
9. J. Li, A. Najmi, and R. M. Gray, "Image classification by a two-dimensional hidden Markov model," IEEE Trans. on Signal Processing 48, pp. 517–533, 2000.
10. J. A. Bilmes, "A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models," Technical Report TR-97-021, Dept. of EECS, U. C. Berkeley, 1998.
11. A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society: Series B 39, pp. 1–38, 1977.
12. L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," Ann. Math. Stat. 41, pp. 164–171, 1970.
13. D. Schonfeld and N. Bouaynaya, "A new method for multidimensional optimization and its application in image and video processing," IEEE Signal Processing Letters 13, pp. 485–488, 2006.

