
Pattern Recognition Letters 34 (2013) 1829–1839

Contents lists available at SciVerse ScienceDirect

Pattern Recognition Letters

journal homepage: www.elsevier.com/locate/patrec

Discriminative functional analysis of human movements

0167-8655/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.patrec.2012.12.018

* Corresponding author. Address: Department of Electrical and Computer Engineering, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1. Tel.: +1 519 888 4567x31474.

E-mail addresses: [email protected] (A. Samadani), [email protected] (A. Ghodsi), [email protected] (D. Kulic).

URL: http://ece.uwaterloo.ca/~asamadan/ (A. Samadani).

Ali-Akbar Samadani a,*, Ali Ghodsi b, Dana Kulic a

a Department of Electrical and Computer Engineering, University of Waterloo, Canada
b Department of Statistics and Actuarial Science, University of Waterloo, Canada

Article info

Article history: Available online 7 January 2013

Keywords: Human movement time-series analysis; Dimensionality reduction for human movement analysis; Affective movement analysis; Human behaviour analysis

Abstract

This paper investigates the use of statistical dimensionality reduction (DR) techniques for discriminative low dimensional embedding to enable affective movement recognition. Human movements are defined by a collection of sequential observations (time-series features) representing body joint angle or joint Cartesian trajectories. In this work, these sequential observations are modelled as temporal functions using B-spline basis function expansion, and dimensionality reduction techniques are adapted to enable application to the functional observations. The DR techniques adapted here are: Fisher discriminant analysis (FDA), supervised principal component analysis (PCA), and Isomap. These functional DR techniques along with functional PCA are applied on affective human movement datasets and their performance is evaluated using leave-one-out cross validation with a one-nearest neighbour classifier in the corresponding low-dimensional subspaces. The results show that functional supervised PCA outperforms the other DR techniques examined in terms of classification accuracy and time resource requirements.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The perception of human behaviour arises from a combination of observable functional and expressive cues. The functional cues communicate explicit information about the nature of the activity being performed (e.g., knocking), while the expressive cues communicate information about the feeling and intention of the demonstrator performing the activity (e.g., knocking angrily). Humans infer and ascribe affective meaning to an observed motion even if none is intended (Wallbott, 1998; Blake and Shiffrar, 2007). Eliminating the expressive cues from human behaviour analysis can result in misinterpretation of the demonstrated activity. Therefore, it is important to study the contribution of the expressive cues in human activity recognition and develop a system that can infer the affective expression encoded in human movements.

There are many challenges to developing such a system due to the subconscious nature of affective movements, and interpersonal differences in conveying and perceiving affective expressions. Human movements can be subtle and are understood by humans sometimes without being consciously noticed and often without explicit consciousness of the features that communicate affective expressions (Cowie, 1535). Furthermore, human movements are highly variable in terms of intensity, timing, and the flexibility of body, even when the same demonstrator repeats a single movement multiple times (i.e., the stochastic nature of human movement execution (Faisal et al., 2008)). It is unlikely, therefore, that humans can precisely tell us how to generate or recognize a specific affective movement.

Many of the works on human movement analysis focus on characterizing general human movement, rather than focusing on the affective component of movement. The main categories of approaches to human movement analysis include: dynamical modelling (e.g., Nakanishi et al., 2004), neural networks (e.g., Ogata et al., 2005), and dimensionality reduction (e.g., Inamura et al., 2004; Losch et al., 2007). The work presented in this paper falls under the dimensionality reduction category. Statistical dimensionality reduction (DR) techniques transform high-dimensional data to a lower-dimensional subspace spanned by a set of feature transformations suitable for discriminative analysis. The resulting lower-dimensional embedding also helps to visualize the high-dimensional data, which in turn aids interpretation of a given dataset along its intrinsic degrees of freedom represented by the dimensions of the reduced subspace. DR techniques can be categorized into supervised (e.g., FDA) or unsupervised (e.g., PCA), and linear (e.g., PCA) or nonlinear (e.g., Isomap) techniques. DR techniques are often used for human movement analysis. For instance, principal component analysis (PCA) (Santello et al., 2002; Urtasun et al., 2006), Fisher discriminant analysis (FDA) (Dick and Brooks, 2003) and independent component analysis (Ivanenko et al., 2005) have been used to obtain a lower dimensional representation of human movements. In (Jenkins and Mataric, 2004), spatio-temporal Isomap (ST-Isomap) is used to embed motion


capture data into a lower dimensional subspace for the purpose of extracting motion vocabularies. Application of Hidden Markov Models (HMMs) in motion modelling is frequently reported in the literature (e.g., Bernardin et al., 2005; Kulic et al., 2008; Iba et al., 1999). HMMs are well suited for modelling the stochastic nature of human movements. However, HMM-based motion modelling suffers from reduced accuracy as the similarity between the motions increases. To be able to distinguish between similar motions, a more complex HMM is required, which comes at the expense of increased memory and time resource requirements (Kulic et al., 2008).
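
As context for the HMM-based modelling mentioned above, the likelihood of an observation sequence under an HMM is computed with the forward algorithm. The sketch below is a minimal discrete-emission version (not the authors' code; all parameter names are illustrative), with per-step rescaling to avoid numerical underflow on long motion sequences.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs) under a discrete-emission HMM.

    obs: sequence of symbol indices
    pi:  initial state distribution, shape (n_states,)
    A:   transition matrix, shape (n_states, n_states)
    B:   emission matrix, shape (n_states, n_symbols)
    """
    alpha = pi * B[:, obs[0]]          # forward variable at t = 0
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()               # rescale to prevent underflow
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]
        log_lik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_lik
```

Evaluating this log-likelihood under per-class HMMs and picking the highest-scoring class is the usual recognition scheme the cited works build on.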

Unlike research into functional activity perception from human motion, research on affect perception focuses on movement features independent from the functional aspect of the movement. Research into human perception of affective movement has reported on multiple, sometimes conflicting movement features which may be important for affect perception. Mainly, velocity, acceleration, and jerk (rate of change of acceleration) are suggested as the key contributing movement features in conveying affect (Roether et al., 2009). For instance, depression is associated with slow movements and elation is characterized by fast and expansive movements (Argyle, 1988). In (Crane and Gross, 2007), velocity, head orientation, and shoulder and elbow range of motion are found to be significant features in affect perception from gait. Boone et al. tested the ability of children (4–8 years) and adults (17–22 years) to correctly perceive affect encoded in expressive dance movements (Boone and Cunningham, 1998). The perception rate was above chance for both children and adults, and they report the use of the following six movement cues by participants for perception of affective expressions: changes in tempo (anger), directional changes in face and torso (anger), frequency of arms up (happiness), duration of arms away from torso (happiness), muscle tension (fear), and duration of time leaning forward (sadness). In a similar study, Camurri et al. (2003) tested human movement perception in four emotional categories (anger, fear, grief and happiness) conveyed through the same dance movement. They report that human observers were able to detect the transmitted emotions through the dance movement.
They also suggest that the duration of the movement, quantity of the movement (the amount of observed movement relative to the velocity and movement energy, represented as a rough approximation of the physical momentum) and contraction index (measured as the amount of body contraction/expansion) play key roles in the perception of affect from dance movement. In (Pollick et al., 2001), different affective expressions conveyed through arm and hand drinking and knocking movements displayed as point-light animations were recognized by human observers. Faster movements were perceived as conveying high arousal emotions (arousal is an emotion dimension that corresponds to the level of activation, mental alertness, and physical activity, in short the "call to action" (Russell and Mehrabian, 1977)). Among three affective hand movements used in a perceptual study (anger, happiness, sadness), only angry movements were reliably perceived (Samadani et al., 2011). Furthermore, the arousal level for different affective hand movements used in (Samadani et al., 2011) was correctly perceived. These studies suggest that body movements carry critical information about their demonstrators' affective state.

The results of these perceptual studies have been applied to implement automated approaches for affect recognition from motion. Bernhardt and Robinson used speed, acceleration, jerk, and distance of hand from body to distinguish between neutral, happy, angry, and sad knocking movements using support vector machines (SVMs) (Bernhardt and Robinson, 2007). The knocking movements were performed by 30 individuals and a recognition rate of 50% was achieved. Camurri et al. used decision trees to distinguish between choreographed dance movements expressing

anger, fear, grief, and joy and report a correct recognition rate of 40% on test data (Camurri et al., 2004). Boredom, confusion, delight, flow, and frustration were detected at a rate of 39% from sitting using a combination of classifiers including Bayesian, SVM, decision tree, artificial neural network (ANN), and k-nearest neighbour (KNN) classifiers (D'Mello and Graesser, 2009). In (Kapur et al., 2005), five different classification approaches: logistic regression, decision trees, naive Bayes, ANN, and SVM were evaluated for automatic recognition of affective body movements conveying anger, sadness, joy, and fear, using a dataset collected with an optical motion capture system. Mean marker velocity and acceleration values and standard deviations of the marker position, velocity and acceleration were manually selected as the movement features. ANN and SVM were reported as the most accurate classifiers with a demonstrator-specific recognition rate as high as 92%. Gunes and Piccardi (2009) studied the temporal/phase synchrony of affective face and body expressions for enhancing the automatic recognition of affect using a video-recorded movement dataset. They considered 12 expressions: anxiety, boredom, uncertainty, puzzlement, neutral/positive/negative surprise, anger, disgust, fear, happiness and sadness. Different classification approaches including SVM, decision trees, ANN, and AdaBoost were applied and the best inter-individual recognition rate (77%) was obtained using AdaBoost with decision tree classifiers. ANN was used in (Janssen et al., 2008) to recognize affective expressions (neutral, happy, sad, angry) from gait using kinetic features (measured using a force plate) and kinematic features (joint angle trajectory and angular velocity of arm, hip, knee, and ankle). Person-specific recognition of 98.5% and between-individual recognition of around 80% were reported.
In (Rett, 2009), Bayesian networks are used to model movements based on relationships between Laban movement analysis (LMA) descriptors (Laban, 1947) and physical movement characteristics (e.g., acceleration and curvature). They report an inter-individual recognition rate of 77% for an expressive movement dataset consisting of the following movements: lunging for a ball, maestro (conducting an orchestra), stretching to yawn, making the OK-sign, pointing, waving bye bye, reaching for someone's hand to shake, and waving sagittally (approach sign) (Rett, 2009). In (Karg et al., 2010), PCA, kernel PCA, linear discriminant analysis, and general discriminant analysis were used for extracting a set of features to improve automatic recognition of discrete emotions (anger, sadness, happiness, and neutral) from gait using ANN, naive Bayes and SVM classifiers. An inter-individual recognition accuracy of 69% was observed for discrete emotions.

Despite the diverse literature on affective movement recognition, the movement features critical to affect recognition are not yet precisely known, and the selection of features for affective movement recognition is usually done in an ad hoc manner. The current paper presents a systematic approach for automatic identification of the features most salient for affective movement recognition from raw movement measurements. Affective body movements are defined by a collection of joint angles or Cartesian positions evolving over time. Therefore, it is important to consider the temporal characteristics of the movements in designing an automatic affect recognition model. This study presents an approach to capture and represent both spatial and temporal features of the affective movements, which are then used to obtain a discriminative subspace for affective movement recognition using adapted dimensionality reduction (DR) techniques.

To apply DR techniques to sequential observations such as affective movements, a fixed-length representation of these observations is needed. An approach for fixed-length representation of sequential observations is basis function expansion (BFE). BFE expresses the sequential observations as temporal functions computed as a linear combination of a fixed number of basis functions (e.g., Fourier basis functions). After transforming sequential


observations into basis function space using BFE, DR techniques need to be adapted to enable application to the resulting functional datasets. In (Ramsay, 1997), functional principal component analysis (FPCA) is introduced, an approach for applying PCA to functional datasets. In (Rossi et al., 2005), an extension of radial basis function networks and multilayer perceptrons to functional inputs is presented. Kernel-based functional nonparametric regression is introduced in (Ferraty and Vieu, 2006) and applied to chemometrics, speech recognition and econometrics. In (Biau et al., 2005), functional k-nearest neighbour classification is introduced and tested with labelled speech samples. Functional fitting of gesture trajectories is exploited in (Bandera et al., 2009) to extract local features (corners of curvature functions), which are used along with global features (gross geometric and structural characteristics of gestures) for gesture matching using different pairwise distance functions.

In this work, affective movements are represented by a collection of functional features using BFE. Then, an approach similar to Ramsay (1997) is employed to extend Fisher discriminant analysis (FDA) (Fisher, 1936), supervised principal component analysis (SPCA) (Barshan et al., 2011) and Isomap (Tenenbaum et al., 2000) to functional datasets. These functional techniques along with FPCA are then used to obtain a lower-dimensional embedding of the movements, and their performance is evaluated using leave-one-out cross validation (LOOCV) with a one-nearest neighbour (1NN) classifier in the resulting reduced subspaces.
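
The evaluation protocol described here (LOOCV with a 1NN classifier in the embedded subspace) can be sketched as follows; `Z` is a hypothetical array of low-dimensional embeddings, one row per movement, and the function name is illustrative.

```python
import numpy as np

def loocv_1nn_accuracy(Z, labels):
    """Leave-one-out 1-nearest-neighbour accuracy in an embedding Z (n, p)."""
    n = len(Z)
    correct = 0
    for i in range(n):
        d = np.linalg.norm(Z - Z[i], axis=1)  # Euclidean distances to sample i
        d[i] = np.inf                         # exclude the held-out sample itself
        correct += labels[np.argmin(d)] == labels[i]
    return correct / n
```

Each DR technique would be scored by running this on its own low-dimensional embedding of the same movement set.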

This paper is organized as follows: in Section 2, the proposed approach for obtaining a discriminative lower-dimensional embedding of sequential observations is presented. Section 3 describes the experiments on two affective movement datasets, and Section 4 presents the experimental results. Section 5 provides discussion of the results. Conclusions and directions for future work are overviewed in Section 6.

2. Proposed approach

In this work, we present a systematic approach for fixed-length time-series representation and automatic recognition of affective movements using BFE and DR techniques, respectively. Sequential observation data such as affective movements are different from classical multivariate datasets in which all the datapoints are conventionally represented by a vector of d discrete dimensions (features). Affective movements are inherently characterized as multidimensional time-series data that might vary in length due to temporal variability in human movement execution. Each movement observation X_i is described by a collection of features evolving over a temporal interval T_i. For instance, if we have n movements defined by m time-series features, then a movement X_i at a time-frame t ∈ T_i is defined as:

$$X_i(t) = \{f_{1i}(t),\, f_{2i}(t),\, \ldots,\, f_{mi}(t)\}, \qquad t = 1, 2, \ldots, T_i, \quad i = 1, 2, \ldots, n,$$

where f_ji(t) is the value of the j-th time-series feature of movement X_i at time t. To enable the application of DR techniques, the movement trajectories need to be represented by an equal number of discrete features, i.e., a fixed-size vectorial representation. In the following, a principled approach for fixed-length representation of the movement observations based on basis function expansion (BFE) is presented.

2.1. Basis function expansion

BFE is a common method for representing sequential observations as temporal functions computed at every time step t (Ramsay, 1997). In BFE, time-series features are computed as a weighted linear combination of a set of basis functions $\{\phi_k\}_{k=1}^{K}$, where K is the total number of basis functions. Considering a multivariate time-series observation X_i(t) with t = {1, 2, ..., T_i}, the BFE for its j-th feature f_ij is:

$$f_{ij} = \sum_{k=1}^{K} c_{ijk}\,\phi_k \;\rightarrow\; f_{ij} = c_{ij}^{T}\,\Phi. \tag{1}$$

Φ is a matrix of size T_i × K containing basis function values φ_k(t), and c_ij is a vector of basis function coefficients for the j-th time-series feature of X_i(t). The BFE coefficients of individual features are then concatenated in a single vector to represent the multivariate time-series observation X_i. This forms a dataset X = [x_1, x_2, ..., x_n], with x_i being a processed multivariate time-series movement sequence, which is now represented as a vector.
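
As an illustration of Eq. (1) and the concatenation step, the sketch below fits each feature of a variable-length multivariate time series onto a small Fourier basis (one of the basis choices mentioned earlier; the paper itself uses B-splines) by ordinary least squares and concatenates the per-feature coefficients into one fixed-length vector. Function and parameter names are illustrative.

```python
import numpy as np

def bfe_vector(X, n_basis):
    """Fixed-length representation of a multivariate time series X (T, m):
    fit each feature onto a Fourier basis by least squares and concatenate
    the per-feature coefficients into one vector of length m * n_basis."""
    T, m = X.shape
    t = np.linspace(0.0, 1.0, T)
    # basis matrix Phi (T, n_basis): constant term, then sine/cosine pairs
    cols = [np.ones(T)]
    for k in range(1, n_basis):
        fn = np.cos if k % 2 == 0 else np.sin
        cols.append(fn(2 * np.pi * ((k + 1) // 2) * t))
    Phi = np.column_stack(cols)
    C, *_ = np.linalg.lstsq(Phi, X, rcond=None)   # (n_basis, m) coefficients
    return C.T.ravel()                            # concatenate per-feature rows
```

Movements of different lengths T map to vectors of identical length, which is exactly what the DR techniques below require.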

B-splines are chosen as the basis functions as they are suited for representing observations that lack any strong cyclic variations (Ramsay, 1997). B-spline basis functions are defined as piecewise polynomials with smoothly connected polynomial pieces, and they provide two types of independent controls, a set of fixed control points (knots) and the spline degree (Araki et al., 2009), that help in modelling less-structured observations. Individual B-splines are constructed recursively using the Cox-de Boor recursion formula (De Boor, 2001; Lee, 1982). For a B-spline of degree n_spline, there are n_spline + 1 polynomials of degree n_spline joined at n_spline inner knots. Except at the boundaries, each B-spline overlaps with 2n_spline polynomial pieces and is positive over an interval of n_spline + 2 knots and zero elsewhere. The degree of the B-spline used depends on the desired order of derivative. For instance, if we require a smooth acceleration curve, then B-splines of degree 3 and above should be used.
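
The Cox-de Boor recursion referenced above can be written directly; this is a generic textbook sketch (not tied to the paper's implementation) evaluating one B-spline basis function of degree d on a given knot vector.

```python
def bspline_basis(k, d, t, knots):
    """Value of the k-th B-spline of degree d at t, via the Cox-de Boor recursion."""
    if d == 0:
        # degree-0 B-spline: indicator of the half-open knot interval
        return 1.0 if knots[k] <= t < knots[k + 1] else 0.0
    left = 0.0
    if knots[k + d] != knots[k]:
        left = ((t - knots[k]) / (knots[k + d] - knots[k])
                * bspline_basis(k, d - 1, t, knots))
    right = 0.0
    if knots[k + d + 1] != knots[k + 1]:
        right = ((knots[k + d + 1] - t) / (knots[k + d + 1] - knots[k + 1])
                 * bspline_basis(k + 1, d - 1, t, knots))
    return left + right
```

Stacking these values over a time grid gives the basis matrix Φ of Eq. (1); away from the boundary knots the basis functions are non-negative and sum to one.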

After choosing the type of basis function, a least squares regression criterion regularized by a roughness penalty is used to obtain an optimal set of coefficients that result in a good fit. Here, the square of the second derivative of the curve at each time instance t is used as the roughness penalty (Ramsay, 1997). Regularizing the least squares criterion with the roughness penalty helps to avoid overfitting and increases curve differentiability.
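
A roughness-penalized fit of this kind reduces to ridge-style regularized least squares. The sketch below approximates the integrated squared second derivative with second differences of the sampled fitted curve (an approximation of the penalty used in Ramsay (1997); the function and names are illustrative).

```python
import numpy as np

def penalized_bfe(Phi, f, lam):
    """Coefficients minimising ||f - Phi c||^2 + lam * roughness(c),
    where roughness is approximated by the squared second differences
    of the fitted curve Phi c."""
    D2 = np.diff(np.eye(Phi.shape[0]), n=2, axis=0)  # second-difference operator
    R = (D2 @ Phi).T @ (D2 @ Phi)                    # roughness penalty matrix
    return np.linalg.solve(Phi.T @ Phi + lam * R, Phi.T @ f)
```

With lam = 0 this is the ordinary least-squares fit; as lam grows the fitted curve is pulled toward a straight line, trading fidelity for smoothness.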

2.2. Functional dimensionality reduction

To enable the use of DR techniques on the class of functional datasets, this paper develops an adaptation of statistical DR techniques to functional datasets. First, the functional formulation of PCA proposed by Ramsay (1997) is reviewed. An extension of this methodology is then proposed for functional Fisher discriminant analysis (FFDA), functional supervised PCA (FSPCA), and functional Isomap (FIsomap).

2.2.1. Functional PCA

PCA provides a very informative way to interpret variation in the data by projecting it to a lower-dimensional space formed by p principal components (PCs; directions of maximum variation in the data), where p < d (the dimensionality of the data) (Jolliffe, 2002). Unlike conventional PCA, in FPCA (Ramsay, 1997), the PCs are a set of orthonormal eigenfunctions expressed by a weighted linear combination of basis functions. In multivariate PCA, the eigenequation to be solved is:

$$S W = \lambda W, \tag{2}$$

where S is the sample covariance matrix and W is a set of eigenvectors with λ being the eigenvalues. For functional observations x_i, the variation in the dataset is approximated using the bivariate covariance function v:

$$v(s, t) = \frac{1}{n-1} \sum_{i} \big(x_i(s) - \bar{x}(s)\big)\big(x_i(t) - \bar{x}(t)\big), \tag{3}$$


where x̄ is the mean of the sequential observations in the dataset and is obtained by:

$$\bar{x}(t) = \frac{1}{n} \sum_{i} x_i(t). \tag{4}$$

In order to obtain the FPCA feature transformation w_1(t), the cost function that corresponds to the variance of the FPCA embedding (FPCA scores) along w_1(t) should be maximized:

$$\max_{w_1} \sum_{i} (S_{i1})^2 \quad \text{s.t.} \quad \int w_1^2(s)\,ds = |w_1|^2 = 1, \tag{5}$$

where S_i1 is the FPCA embedding for a time-series observation x_i(t) along w_1(t) and is derived using:

$$S_{i1} = \int w_1(s)\, x_i(s)\, ds. \tag{6}$$

Similarly, the functional feature transformation w_2(t) can be obtained using Eq. (5), introducing an additional orthogonality constraint on the eigenfunctions:

$$\int w_1(s)\, w_2(s)\, ds = 0. \tag{7}$$

This procedure is continued to obtain as many functional feature transformations as required. In the following, a solution for the above maximization is described. In FPCA, the eigenequation to be solved is:

$$\int v(s, t)\, W(s)\, ds = \lambda W(t). \tag{8}$$

W is a matrix of size K × p with its columns representing functional PCs or eigenfunctions w_r(t), r = {1, 2, ..., p}. Suppose that we have a time-series dataset X = {x_1, x_2, ..., x_n} that contains n time-series observations represented by vectors x_i = c_i^T Φ. Using BFE, w_r(t), the r-th eigenfunction (PC) of FPCA applied on X, is expressed as:

$$w_r(t) = \sum_{k=1}^{K} b_{rk}\,\phi_k \;\rightarrow\; w_r = b_r^{T}\,\Phi, \tag{9}$$

where Φ is a matrix of size T × K containing basis function values φ_k(t), and b_r represents the coefficients corresponding to successive basis functions φ_k used to obtain a functional estimation of w_r(t). T is the number of sample points in the original time-series observation. Let C be a matrix of size K × n containing the coefficients for BFE of the time-series observations. Each column of C represents the BFE coefficients for a single time-series observation x_i. The covariance function in Eq. (3) can be re-written as:

$$v(s, t) = (n-1)^{-1}\, \Phi^{T} C C^{T} \Phi. \tag{10}$$

Hence, the corresponding eigenequation (8) is:

$$\int (n-1)^{-1}\, \Phi^{T}(t)\, C C^{T}\, \Phi(s)\, \Phi^{T}(s)\, b_r\, ds = \lambda\, \Phi^{T}(t)\, b_r. \tag{11}$$

Eq. (11) holds for all temporal arguments t, and therefore can be reduced to:

$$(n-1)^{-1}\, C C^{T} M\, b_r = \lambda\, b_r, \tag{12}$$

where $M = \int \Phi(s)\,\Phi^{T}(s)\,ds$. Furthermore, the constraint |w_r|^2 = 1 from Eq. (5) is equivalent to:

$$|w_r|^2 = b_r^{T} M b_r = 1. \tag{13}$$

If we define u_r = M^{1/2} b_r, we can rewrite the FPCA eigenequation (12) as:

$$(n-1)^{-1}\, M^{1/2} C C^{T} M^{1/2}\, u_r = \lambda\, u_r \quad \text{s.t.} \quad u_r^{T} u_r = 1. \tag{14}$$

Eq. (14) is a generalized eigenvalue problem and can be solved for the eigenfunctions u_r. The resulting u_r can be used to compute b_r, which in turn is used to compute the eigenfunction w_r(t) using Eq. (9). Unlike multivariate PCA, where the maximum number of PCs is equal to the dimensionality of the dataset, here the maximum number of PCs is T − 1 if K > T, and K otherwise, where T is the number of discrete sampling points in the original time-series observations and K is the total number of basis functions used to expand these observations. In the case of a concatenated multivariate time-series dataset obtained as explained in the previous section, the eigenfunction corresponding to each individual time-series feature can be obtained by breaking down the resulting eigenfunctions from FPCA into pieces of length equal to K.
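
The FPCA eigenproblem of Eq. (14) can be sketched numerically as follows, using the Gram matrix M of the basis. This is an illustrative implementation, not the authors' code; the centring step and all names are assumptions.

```python
import numpy as np

def functional_pca(C, M, p):
    """Top-p FPCA eigenfunction coefficients, solving Eq. (14):
    (n-1)^{-1} M^{1/2} C C^T M^{1/2} u_r = lambda u_r,  b_r = M^{-1/2} u_r.

    C: (K, n) basis coefficients, one column per observation
    M: (K, K) Gram matrix of the basis, M = int Phi(s) Phi(s)^T ds
    Returns B (K, p) whose columns are the b_r of Eq. (9).
    """
    K, n = C.shape
    w, V = np.linalg.eigh(M)                   # M is symmetric positive definite
    Mhalf = V @ np.diag(np.sqrt(w)) @ V.T      # matrix square root M^{1/2}
    Cc = C - C.mean(axis=1, keepdims=True)     # centre the observations
    A = Mhalf @ Cc @ Cc.T @ Mhalf / (n - 1)
    vals, U = np.linalg.eigh(A)                # ascending eigenvalues
    U = U[:, ::-1][:, :p]                      # eigenvectors u_r of the top-p eigenvalues
    return np.linalg.solve(Mhalf, U)           # b_r = M^{-1/2} u_r
```

The returned columns satisfy the constraint of Eq. (13), b_r^T M b_r = 1, by construction.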

2.2.2. Functional FDA

For multivariate datasets of dimensionality d, Fisher discriminant analysis (Fisher, 1936) projects a K_f-class dataset into a (K_f − 1)-dimensional space in an attempt to maximize the distance between projected means and minimize the variance of each projected class. From this intuition, the FDA objective function in a K_f-class multivariate problem can be formulated as:

$$\max_{W} \frac{\mathrm{Tr}(W^{T} S_B W)}{\mathrm{Tr}(W^{T} S_W W)} \;\equiv\; \max_{W} \mathrm{Tr}(W^{T} S_B W) \quad \text{s.t.} \quad W^{T} S_W W = I, \tag{15}$$

where W is a weight matrix of size d × (K_f − 1), and S_B and S_W are the between-class and within-class covariances, respectively. Introducing a Lagrange multiplier λ and setting the first derivative of the Lagrange function with respect to W to zero results in a generalized eigenvalue problem:

$$S_W^{-1} S_B W = \lambda W. \tag{16}$$

Therefore, W is a matrix of eigenvectors associated with the eigenvalues of S_W^{-1} S_B. Here, an approach similar to FPCA is employed to adapt FDA for sequential observations estimated using BFE (i.e., functional observations). Assume that there are n multivariate time-series movements X_i, i = 1, ..., n, belonging to K_f classes, and each movement X_i is defined over a temporal sequence t = {1, 2, ..., T_i}. As before, these movements are estimated as temporal functions using a fixed number of basis functions, resulting in a set of functional observations X = {x_1, x_2, ..., x_n}. The within-class covariance for the k_f-th class of the functional observations is computed as:

$$v_g(s, t) = (n_g - 1)^{-1}\, \Phi^{T} C_{k_f} C_{k_f}^{T} \Phi, \tag{17}$$

where n_g is the number of movements belonging to the k_f-th class, Φ is a matrix of size T_i × K containing basis function values φ_k(t), and C_{k_f} is a matrix carrying the coefficients corresponding to the basis functions φ_k, k = 1, ..., K, of the movements in class k_f. Similar to conventional FDA, the between-class covariance (S_B) can be estimated by subtracting the within-class covariances (S_W) from the total covariance (S_T). The total covariance S_T is computed using Eq. (3). Now, if we solve the FDA optimization problem by introducing a Lagrange multiplier λ, we get the below eigenequation for the functional observations:

$$\int S_B\, W(s)\, ds = \lambda \int S_W\, W(s)\, ds, \tag{18}$$

where W(s) is a matrix containing the eigenfunctions (columns) associated with the K_f − 1 largest eigenvalues λ. The r-th eigenfunction is $w_r(t) = \sum_{k=1}^{K} b_{rk}\,\phi_k$, r = {1, 2, ..., K_f − 1}. The set of the top K_f − 1 eigenfunctions is represented in matrix form as W = Φ^T B. Using the expressions obtained for the within-class covariances and the between-class covariance, Eq. (18) can be written for eigenfunction w_r as follows:

A. Samadani et al. / Pattern Recognition Letters 34 (2013) 1829–1839 1833

∫ ( z_B Φ^T(s) C_B C_B^T Φ(s) ) ds Φ^T(t) b_r = λ ∫ ( z_W Φ^T(s) C_W C_W^T Φ(s) ) ds Φ^T(t) b_r,      (19)

which can be reduced to:

z_B C_B C_B^T M b_r = λ z_W C_W C_W^T M b_r,   M = ∫ Φ(s) Φ^T(s) ds.      (20)

z_B and z_W are the normalizing terms for the between-class and within-class covariances. Similar to FPCA, the constraint associated with eigenequation (20) is:

|w_r|² ≡ b_r^T M b_r = 1.      (21)

Letting u_r = M^{1/2} b_r, eigenequation (20) can be written as:

z_B C_B C_B^T M^{1/2} u_r = λ z_W C_W C_W^T M^{1/2} u_r,
( z_W C_W C_W^T M^{1/2} )^{-1} ( z_B C_B C_B^T M^{1/2} ) u_r = λ u_r.      (22)

The above generalized eigenvalue problem can be solved for the eigenfunctions u_r, which are used to compute the FFDA feature transformations w_r. These feature transformations are then used to embed the movements into the lower-dimensional FFDA embedding using Eq. (6).
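In matrix form, Eqs. (20)-(22) act on the basis coefficients alone. The sketch below illustrates this computation; the quadrature approximation of M and the small ridge term for numerical stability are implementation choices assumed here, not part of the paper's formulation:

```python
import numpy as np
from scipy.linalg import eigh, sqrtm

def ffda_directions(C, y, Phi, dt, reg=1e-8):
    """Functional FDA on basis coefficients; a sketch of Eqs. (20)-(22).

    C:   (n, K) basis coefficients of n movements (rows c_i)
    y:   (n,) class labels
    Phi: (T, K) basis function values phi_k(t) on a regular time grid
    dt:  grid spacing, used to approximate M = int Phi(s) Phi(s)^T ds
    Returns B whose columns hold the eigenfunction coefficients b_r.
    """
    M = Phi.T @ Phi * dt                      # Gram matrix of the basis
    Ms = np.real(sqrtm(M))                    # M^{1/2}, so that u_r = M^{1/2} b_r
    mu = C.mean(axis=0)
    K = C.shape[1]
    Sw = np.zeros((K, K))
    Sb = np.zeros((K, K))
    for c in np.unique(y):                    # class-wise coefficient scatter
        Cc = C[y == c]
        mc = Cc.mean(axis=0)
        Sw += (Cc - mc).T @ (Cc - mc)
        Sb += len(Cc) * np.outer(mc - mu, mc - mu)
    # symmetric form of S_B M b = lambda S_W M b under the change u = M^{1/2} b
    evals, U = eigh(Ms @ Sb @ Ms, Ms @ Sw @ Ms + reg * np.eye(K))
    B = np.linalg.solve(Ms, U[:, np.argsort(evals)[::-1]])
    return B          # score of movement i along direction r: c_i @ M @ B[:, r]
```

The projection of a movement onto an eigenfunction, ∫ w_r(t) x_i(t) dt, reduces to the bilinear form c_i^T M b_r, which is why only M and the coefficient matrices are needed.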

2.2.3. Functional Supervised PCA based on HSIC (FSPCA)
When there are non-linear relations between two random variables, non-linear dependency measures are needed to explore the correlation between them. The Hilbert–Schmidt independence criterion (HSIC) (Gretton et al., 2005) provides an efficient tool for examining dependencies between two random variables. For this measure, each random variable is first projected into a higher-dimensional feature (Hilbert) space, and the cross-covariance operator between the resulting feature spaces is then used to derive the HSIC measure between the given variables. Suppose that two random variables x and y are mapped to the Hilbert spaces F and G using the mapping functions φ(x) and ψ(y), respectively. Let K(x, x′) = ⟨φ(x), φ(x′)⟩_F and L(y, y′) = ⟨ψ(y), ψ(y′)⟩_G be the unique kernels associated with the Hilbert spaces F and G. The empirical estimate of the HSIC measure between x and y is expressed as:

HSIC(x, y) = (1/n²) Tr(K H L H),      (23)

where H is a constant matrix used to centre K and L. In this work, linear and Gaussian RBF kernels are used.

H = I − (1/n) e e^T,      (24)

where e is a column vector of 1's and n is the number of observations.
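For illustration, the empirical estimate in Eq. (23) can be computed directly; the median-distance bandwidth for the RBF kernel below is a common heuristic assumed here, not prescribed by the text:

```python
import numpy as np

def hsic(x, y, sigma=None):
    """Empirical HSIC, (1/n^2) Tr(K H L H), with a Gaussian RBF kernel
    on x and a linear kernel on y (the two kernel types used in the text).

    x, y: (n, d) arrays of paired samples.
    """
    n = x.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n              # centring matrix, Eq. (24)
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if sigma is None:                                # median heuristic (assumed)
        sigma = np.sqrt(np.median(d2[d2 > 0]))
    K = np.exp(-d2 / (2.0 * sigma ** 2))             # RBF kernel on x
    L = y @ y.T                                      # linear kernel on y
    return np.trace(K @ H @ L @ H) / n ** 2
```

A dependent pair of variables yields a larger HSIC value than an independent pair, which is the property supervised PCA exploits below.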

Suppose X is a matrix carrying a set of multivariate datapoints in its columns and y is a vector carrying the datapoint labels. HSIC-based supervised PCA (Barshan et al., 2011) is a nonlinear dimensionality reduction technique that aims to find a set of feature transformations w_r maximizing Tr(H K H B); i.e., it finds w_r such that the transformation w_r^T X is highly dependent on y (r = 1, 2, …, p, where p is the desired dimensionality of the reduced subspace). In the following, supervised PCA is formulated with a linear kernel on X and any type of kernel on y:

w_r^T X  → (linear kernel)  X^T w_r w_r^T X,
y  → (any kernel)  B,
max_{w_r}  Tr( w_r^T X H B H X^T w_r )   s.t.  w_r^T w_r = 1.      (25)

Similar to classical PCA, the constraint w_r^T w_r = 1 is introduced to make the optimization problem well-posed. It can be shown that the solutions to the maximization problem (25) are the eigenvectors of X H B H X^T associated with its largest eigenvalues (Barshan et al., 2011). A special case of the above formulation is B = I, which recovers conventional PCA. Here, a modification of supervised PCA is presented that allows the application of this nonlinear DR technique to functional data. Eq. (25) is modified to accommodate the basis function representation of the observations as follows:

b_r^T Φ C^T Φ  → (linear kernel)  Φ^T C Φ^T b_r b_r^T Φ C^T Φ,
y  → (any kernel)  B,
max_{u_r}  Tr( u_r^T D u_r ),   D = Φ C^T Φ H B H Φ^T C Φ^T,   s.t.  u_r^T u_r = 1.      (26)

Similar to FPCA and FFDA, the above maximization can be formulated as a generalized eigenequation by introducing a Lagrange multiplier λ. The resulting eigenequation is solved for u_r, which is used to obtain w_r. Finally, the FSPCA embedding is obtained using Eq. (6).
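A compact multivariate sketch of supervised PCA may help make the construction concrete: the transformation directions are the top eigenvectors of X H B H X^T. The delta kernel on the labels is one illustrative choice of B assumed here, and passing B = I recovers conventional PCA:

```python
import numpy as np

def supervised_pca(X, y=None, p=2, B=None):
    """HSIC-based supervised PCA (Barshan et al., 2011), multivariate case.

    X: (d, n) data matrix with observations in columns; y: (n,) labels.
    Returns W of shape (d, p), the top-p eigenvectors of X H B H X^T.
    With B = I this reduces to conventional PCA.
    """
    n = X.shape[1]
    H = np.eye(n) - np.ones((n, n)) / n              # centring matrix
    if B is None:                                    # delta kernel on the labels
        B = (y[:, None] == y[None, :]).astype(float)
    Q = X @ H @ B @ H @ X.T
    evals, evecs = np.linalg.eigh(Q)
    return evecs[:, np.argsort(evals)[::-1][:p]]     # descending eigenvalues
```

Embedding new observations then amounts to the matrix product W^T X, which is the property exploited for the test observations in the experiments.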

2.2.4. Functional Isomap (F-Isomap)
Isomap is a non-linear extension of the Multidimensional Scaling (MDS) (Chen et al., 2008) dimensionality reduction technique; it performs MDS in the geodesic space of the non-linear data manifold, preserving the pairwise geodesic distances between datapoints in the reduced subspace (Tenenbaum et al., 2000). Isomap embedding is performed in three steps: (1) finding the neighbours of each point (e.g., k-nearest neighbours) and constructing a graph M representing the neighbourhood relationships, (2) computing the pairwise geodesic distances between all points, and (3) embedding the data with MDS based on the geodesic distances between the datapoints. To enable the application of Isomap to sequential observations, functional estimations of these observations are first obtained and then the k-nearest neighbours of each observation are found. The geodesic distance between two sequential observations is computed as the length of the shortest path between the two observations in the neighbourhood graph, passing through neighbouring observations:

d_N(a, b) = min_O Σ_{i=1}^{l−1} d(x_i, x_{i+1}).      (27)

In Eq. (27), O includes two or more connected movements in the neighbourhood graph, with x_1 = a and x_l = b; x_i and x_{i+1} are k-nearest neighbours and x_i = c_i^T Φ (the functional representation of the movement). l is the number of movements in the geodesic path between a and b, including a and b. Next, the matrix N is formed with entries corresponding to the pairwise geodesic distances d_N. Eigenvalue decomposition is performed on N to obtain the eigenfunctions w_r(s) = b_r^T Φ corresponding to the top p eigenvalues. The functional Isomap lower-dimensional embedding is computed using the top eigenfunctions as follows:

S_r(x_i) = √λ_r  w_{ir},      (28)

where S(x_i) is the lower-dimensional embedding of x_i, S_r(x_i) is the r-th dimension of S(x_i), and w_{ir} is the i-th component of the r-th eigenfunction w_r, with λ_r the corresponding eigenvalue. Unlike the other DR techniques in this work, Isomap embedding does not provide a parametric transformation that can be used to map previously-unseen high-dimensional observations into the low-dimensional space. In (Bengio et al., 2004), a non-parametric estimation of the Isomap low-dimensional transformation is introduced to test the Isomap embedding. It is shown that the Isomap embedding of a test observation x_t, denoted S(x_t), can be approximated as:

S_r(x_t) = (1/√λ_r) Σ_i w_{ir} K̃(x_t, x_i),      (29)

where (λ_r, w_r) are the eigenvalue–eigenvector pairs obtained by performing eigenvalue decomposition on the neighbourhood matrix N, and K̃(a, b) is the kernel function that produces the neighbourhood matrix N for Isomap embedding, defined as:

K̃(a, b) = −(1/2) [ d_N²(a, b) − E_x[d²(x, b)] − E_{x′}[d²(a, x′)] + E_{x,x′}[d²(x, x′)] ].      (30)

Fig. 1. Screenshots of an animated hand movement used in this study. Local Euler angles were collected for the wrist (root) and three joints (A, B, and C) along each finger. Joints in each finger are named as shown on the index finger of the far right hand (A: proximal joint, B: intermediate joint, and C: distal joint).

In the case of sequential observations, the eigenvector w_r is approximated using BFE as b_r^T Φ. Considering X = {x_1, x_2, …, x_n} as the set of training observations, the Isomap embedding of the test observation x_t is approximated as:

S_r(x_t) = (1/(2√λ_r)) Σ_i ( b_{ir} φ_i(t) ) ( E_{x′}[d²(x′, x_i)] − d²(x_i, x_t) ),      (31)

where E_{x′} denotes an average over the training observations. A detailed discussion and proof of the Isomap out-of-sample extension can be found in (Bengio et al., 2004). Here, Eq. (31) is used to test the F-Isomap embedding.
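The three Isomap steps translate directly into code. A minimal sketch on vectorized observations is given below (applying it to the basis-coefficient vectors c_i yields the functional variant); the choice of k and the use of a Dijkstra shortest-path routine are implementation details assumed here:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=6, p=2):
    """Isomap embedding of the rows of X.

    (1) k-nearest-neighbour graph, (2) geodesic distances as graph shortest
    paths, cf. Eq. (27), (3) classical MDS on the geodesic distance matrix,
    with coordinates scaled by sqrt(eigenvalue), cf. Eq. (28).
    """
    n = X.shape[0]
    D = squareform(pdist(X))
    W = np.zeros((n, n))
    for i in range(n):                        # keep the k nearest neighbours
        idx = np.argsort(D[i])[1:k + 1]
        W[i, idx] = D[i, idx]
    W = np.maximum(W, W.T)                    # symmetrize; zeros mean "no edge"
    G = shortest_path(W, method='D', directed=False)   # geodesic distances
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = -0.5 * H @ (G ** 2) @ H              # double-centred MDS kernel
    evals, evecs = np.linalg.eigh(Kc)
    order = np.argsort(evals)[::-1][:p]
    return evecs[:, order] * np.sqrt(np.maximum(evals[order], 0))
```

On data sampled from a one-dimensional manifold (e.g., points along an arc), the first Isomap coordinate recovers the arc-length ordering, which is the behaviour the geodesic step is designed to achieve.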

¹ The markers are placed on the following body landmarks: left front head, right front head, left back head, right back head, top chest, centre chest, left front waist, right front waist, left back waist, right back waist, top of spine, middle of back, left outer metatarsal, right outer metatarsal, left toe, right toe, left shoulder, right shoulder, left outer elbow, right outer elbow, left hand, right hand, left wrist inner near thumb, right wrist inner near thumb, left wrist outer opposite thumb, right wrist outer opposite thumb, left knee, right knee, left ankle, right ankle, left heel, right heel.

3. Experiments

Two affective movement datasets are used to assess the proposed approaches: a simple dataset consisting of a single hand movement performed by a single demonstrator (?), and a much larger dataset consisting of a variety of full-body movements performed by multiple demonstrators (Kleinsmith et al., 2006). The hand movement dataset considers one movement type, closing and opening the hand, mainly involving phalangeal and carpo-metacarpal joint movements. Three sets of 10 trials were collected, where each set conveys a different expression. Three expressions were considered: sadness, happiness, and anger. For each expression, 5 trials were performed on the right hand and 5 on the left hand. A demonstrator performed the hand movements while wearing a Dataglove (ShapeHand from Measurand, 2009). Videos of these movements are available in Samadani (2011). The Cartesian coordinates of the root joint (wrist) and three joints A, B, and C along each finger (Fig. 1) were collected at 84 frames per second.

Next, a challenging dataset of full-body affective movements was used to further assess the discriminative and computational qualities of the functional DR techniques. This dataset contains 183 acted full-body affective movements obtained from thirteen demonstrators who freely expressed movements conveying anger, happiness, fear, and sadness (Kleinsmith et al., 2006), creating a dataset with varied within-class movements (movements differing in structure and physical execution while expressing the same affect). There are 32 markers attached to body landmarks¹, and their 3D Cartesian coordinates are collected using a motion capture system; hence, a total of 96 time-series Cartesian trajectories describe a movement. There are 46 sad, 47 happy, 49 fearful, and 41 angry movements in the full-body dataset.

For both datasets, the affective movements are preprocessed through BFE before the application of the functional DR techniques. Two hundred B-splines of 4th degree are chosen to represent the affective movement time series. BFE is performed using the MATLAB code provided in (Ramsay, 2008). Next, FPCA, FFDA, FSPCA, and F-Isomap are applied to obtain discriminative lower-dimensional embeddings of the transformed affective movements (i.e., the functional estimations of the movements). For FSPCA, two types of kernels are applied to the movement labels: a linear kernel and the Gaussian radial basis function (GRBF) kernel. The MATLAB code provided in (Tenenbaum et al., 2000) is modified to generate the F-Isomap embedding. For FPCA, FFDA, and FSPCA, Eq. (6) is used to obtain the lower-dimensional embeddings of test observations; for F-Isomap, they are computed using Eq. (31). The performance of the functional DR techniques in discriminating between affective movements is examined with leave-one-out cross validation (LOOCV) using the one-nearest-neighbour misclassification error.
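For illustration, a least-squares B-spline expansion of a single trajectory can be computed as below; 20 basis functions are used to keep the example small (the experiments above use 200), and the clamped knot construction is an implementation choice:

```python
import numpy as np
from scipy.interpolate import BSpline

def bfe(series, n_basis=20, degree=4):
    """Least-squares B-spline basis function expansion of a 1-D trajectory.

    Returns (c, Phi): the fixed-length coefficient vector of length n_basis
    and the T x n_basis design matrix, so that Phi @ c smoothly
    approximates the series.
    """
    T = len(series)
    x = np.linspace(0.0, 1.0, T)
    n_inner = n_basis - degree - 1                 # interior knot count
    inner = np.linspace(0.0, 1.0, n_inner + 2)[1:-1]
    # clamped knot vector: boundary knots repeated degree+1 times
    knots = np.r_[np.zeros(degree + 1), inner, np.ones(degree + 1)]
    Phi = BSpline.design_matrix(x, knots, degree).toarray()
    c, *_ = np.linalg.lstsq(Phi, np.asarray(series), rcond=None)
    return c, Phi
```

Applying the routine to each joint trajectory of a movement yields the fixed-length coefficient vectors on which the functional DR techniques operate, regardless of the original trajectory length T.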

4. Results

The two-dimensional embeddings of the affective hand movements obtained by FPCA, FFDA, FSPCA, and F-Isomap are shown in Fig. 2. The LOOCV training and testing recognition rates for the different functional DR techniques, along with their training times on the hand dataset, are shown in Table 1.

The functional DR techniques are next applied to the more challenging full-body movement dataset. Table 2 shows the LOOCV training and testing recognition rates obtained using the 1NN classifier in the resulting reduced spaces for the full-body dataset. For the full-body movements, 3D subspaces of the functional DR techniques are used to compute the LOOCV recognition rates, due to their discriminative advantage over 2D subspaces. The 2D embeddings of the training and testing full-body movements are shown in Fig. 3 to illustrate the ability of the functional DR techniques to discriminatively embed the high-dimensional affective movements in a low-dimensional space.
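The evaluation protocol, leave-one-out cross validation with a one-nearest-neighbour classifier in the embedded space, amounts to the following generic sketch (not the authors' code):

```python
import numpy as np

def loocv_1nn_rate(Z, y):
    """Leave-one-out 1-NN recognition rate in an embedding.

    Z: (n, p) low-dimensional coordinates; y: (n,) class labels.
    Each point is classified by the label of its nearest other point.
    """
    D = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(D, np.inf)      # a point may not be its own neighbour
    y = np.asarray(y)
    return float(np.mean(y[np.argmin(D, axis=1)] == y))
```

A high rate indicates that same-class movements fall close together in the reduced subspace, which is exactly the discriminative quality being compared across the DR techniques.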

Fig. 2. Affective hand movement embedding in the resulting 2D subspaces (panels: FPCA, FFDA, F-Isomap, FSPCA-Linear, and FSPCA-GRBF; classes: sad, angry, happy). For FPCA, F-Isomap, and FSPCA-Linear, movements from the same emotional category performed on the left and right hands are clustered separately in the resulting embedding.

Table 1
Leave-one-out cross validation training and testing recognition rates for functional DR techniques applied on the affective hand movements.

                Training recognition rate (%)   Testing recognition rate (%)   Elapsed training time (s)
FPCA            87.1                            66.7                           0.07
FFDA            100                             30.0                           474.2
F-Isomap        95.5                            56.7                           0.13
FSPCA-Linear    97.0                            96.7                           0.09
FSPCA-GRBF      99.1                            93.3                           0.09


5. Discussion

FFDA performs perfectly on the training observations, as it collapses different observations belonging to the same affective class to nearly a single point (FFDA embeddings in Figs. 2 and 3). However, it fails to accurately separate the test observations (low FFDA testing recognition rates in Tables 1 and 2). The FFDA overfitting observed for the hand movements (Fig. 2) and for the full-body movements (Fig. 3) is due to the small number of high-dimensional observations used (fewer observations than the dimensionality of the observations), and confirms earlier findings that FDA performs poorly on high-dimensional problems when few training points are available (Martinez and Kak, 2001).

Separated clusters of hand movements belonging to the same affective class are clearly observable in the reduced subspaces of FPCA, F-Isomap, and FSPCA (Fig. 2). These distinct clusters correspond to the movements performed on the left and right hands. Furthermore, in the FPCA and F-Isomap embeddings, angry and happy hand movements overlap to some extent, while sad hand movements form distinct clusters (Fig. 2).

F-Isomap achieves good performance on the training hand movements; however, poor performance is obtained on the testing exemplars. As Isomap does not provide a parametric transformation that can be used for evaluating the generalizability of the resulting reduced subspace to out-of-sample movements, the approximation of the Isomap out-of-sample embedding proposed in (Bengio et al., 2004) is adapted here to test the generalizability of F-Isomap to unseen movements. This approximation might be the reason for the poor testing performance of F-Isomap (Table 1). Furthermore, the performance of Isomap deteriorates if the datapoints belong to disjoint underlying manifolds, which might be the case here (Geng et al., 2005).

For the full-body dataset, among the functional DR techniques, the FSPCA-GRBF embedding shows dense and more distinct single-class clusters of movements in the resulting lower-dimensional space. By visual inspection, it is easy to associate different subintervals of the dimensions of the FSPCA-GRBF subspace with distinct affective movements. For instance, lower values of the first dimension of the FSPCA-GRBF subspace are occupied by sad movements, whereas happy movements are distributed along the higher values of the first dimension. As can be seen from the LOOCV recognition rates (Table 2), FSPCA-GRBF results in the highest training and testing recognition rates. The FPCA, F-Isomap, and FSPCA-Linear embeddings show substantial overlap between full-body movements from different classes, resulting in poor discrimination between the training and testing affective movements (Table 2).

As overviewed in Section 1, in the affective movement recognition literature, automatic inter-individual emotion recognition rates range from 40% to 77% (Bernhardt and Robinson, 2007; Camurri et al., 2004; D'Mello and Graesser, 2009; Gunes and Piccardi, 2009; Karg et al., 2010), depending on the number of intended emotions, the number of demonstrators, and the amount of within-class variation in the movements. Using apex postures from 108 of the movements in the full-body dataset used in our study, Kleinsmith et al. (2006) tested human perception of the intended emotions. The overall recognition rate was 54.7%, with the least recognized postures being the fearful ones (49.9% recognition rate) and the most recognized being the sad postures (63.4% recognition rate). FSPCA-GRBF applied on the full-body dataset achieves an overall training recognition rate of 59.1% (sad: 65.9%, happy: 63.9%, fearful: 50.9%, angry: 55.6%) and a testing recognition rate of 53.6% (sad: 60.9%, happy: 61.7%, fearful: 44.9%, angry: 46.4%), which are comparable to the human recognition rates on the same dataset as reported in the Kleinsmith et al. perceptual study (Kleinsmith et al., 2006).

If we consider the extent to which each class spreads in the reduced subspaces as a measure of the quality of the embeddings, one can argue by visual inspection that the FSPCA-GRBF kernel results in the most compact embedding of the classes for both the hand and full-body movements. In the case of FFDA, despite the compact embedding of each class to nearly a single point, poor embedding of the test observations is observed due to overfitting, as discussed above (FFDA embeddings in Fig. 3). Compact embeddings of the high-dimensional movements also facilitate the interpretation of the reduced subspace dimensions, as distinct subintervals of these dimensions can be associated with distinct affective movement classes (e.g., in the FSPCA-GRBF embedding shown in Fig. 2, sad movements are uniquely characterized by lower values of dimension 1).

Next, the functional transformations obtained by FPCA, FFDA, and FSPCA for the hand dataset are plotted as perturbations of the overall mean of the feature, μ(t) ± a_p T_r(t), where μ(t) is the functional feature mean across the movements, a_p is the perturbing constant, and T_r(t) is the functional transformation corresponding to that feature. Fig. 4 shows examples of perturbation plots for the hand movements resulting from the FPCA, FFDA, and FSPCA techniques: (a) the Y-trajectory of joint C of the middle finger, corresponding to the first dimension of the reduced subspaces, and (b) the Z-trajectory of joint A of the thumb, corresponding to the second dimension of the reduced subspaces. The perturbation plots for the other hand coordinates can be obtained similarly. These perturbation plots help to evaluate the importance of different movement functional features in constructing the discriminative reduced spaces, either as a whole or over subintervals.

Fig. 3. Affective full-body movement embedding for (a) training data and (b) testing data in the resulting 2D subspaces.

Table 2
Leave-one-out cross validation training and testing recognition rates for functional DR techniques applied on the affective full-body movements. Highest training and testing recognition rates are highlighted.

                Training recognition rate (%)   Testing recognition rate (%)   Elapsed training time (s)
FPCA            44.0                            43.2                           0.70
FFDA            100                             36.6                           564.59
F-Isomap        47.0                            43.7                           1.81
FSPCA-Linear    44.2                            43.7                           1.68
FSPCA-GRBF      59.1                            53.6                           1.71


For the Y-trajectory of joint C of the middle finger, the perturbation plots for FPCA and for FSPCA with linear and GRBF kernels are quite similar, while differing from the one corresponding to the FFDA embedding (Fig. 4(a)). According to Fig. 4(a), for the Y-trajectory of joint C of the middle finger, functional feature variations mainly at the beginning and at the end of the temporal interval of the movement execution play an important role in producing the FPCA and FSPCA subspaces. If a functional feature has little effect in producing the discriminative subspaces, its mean overlaps with its positive and negative functional transformation perturbations, as is the case in the FFDA perturbation plot for the Y-trajectory of joint C of the middle finger. Therefore, the contribution of the Y-trajectory of joint C of the middle finger to the FFDA embedding is not significant. As can be seen in Fig. 2, the 2D subspaces generated using FPCA and FSPCA are similar in their first dimension: sad movements are embedded along the lower values, while happy and angry movements are characterized by higher values of the first dimension; hence, similar perturbation plots are obtained for the first dimension of the FPCA and FSPCA subspaces. Differences between the FPCA and FSPCA embeddings occur along the second dimension.
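Constructing such a perturbation plot requires only the basis design matrix and the coefficient vectors of the feature mean and of the transformation; a minimal sketch, with illustrative names:

```python
import numpy as np

def perturbation_curves(mu_coef, tr_coef, Phi, a_p=1.0):
    """Curves mu(t), mu(t) + a_p*Tr(t), and mu(t) - a_p*Tr(t) on the time
    grid underlying the basis design matrix Phi (T x K).

    mu_coef, tr_coef: basis coefficients of the functional feature mean
    and of one functional transformation, respectively.
    """
    mean = Phi @ mu_coef
    delta = a_p * (Phi @ tr_coef)
    return mean, mean + delta, mean - delta
```

Plotting the three returned curves over t reproduces the layout of Fig. 4: where the perturbed curves hug the mean, the feature contributes little to that transformation; where they diverge, it contributes strongly.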

An example of the perturbation plots for the Z-trajectory of joint A of the thumb, corresponding to the second dimension of the reduced subspaces produced by FPCA, FFDA, and FSPCA, is shown in Fig. 4(b). For the Z-trajectory of joint A of the thumb, the perturbation plot for FPCA demonstrates that functional feature variations mainly at the beginning and at the end of the temporal interval of the movements play an important role in producing the FPCA embedding. For the FFDA embedding, the functional feature transformation introduces a highly variable trend of weights starting at one-third of the movement feature and attenuating toward the end (FFDA perturbation plot in Fig. 4(b)). This demonstrates the FFDA search for a direction in the high-dimensional movement space that maximally separates the different movement classes while forming compact classes along that direction, through weighting the individual basis functions constructing the functional feature. The contribution of the Z-trajectory of joint A of the thumb to the FSPCA-Linear embedding is not significant. The perturbation plot for FSPCA-GRBF shows that the entire Z-trajectory of joint A of the thumb plays an important role in constructing the discriminative FSPCA-GRBF embedding. Therefore, FPCA and the two variations of FSPCA result in different functional feature transformations for the discriminative embedding of the affective movements. The superiority of the FSPCA techniques over FPCA in the discriminative analysis (Table 1) is likely due to the fact that FSPCA benefits from the movement labels in constructing the discriminative lower-dimensional subspace.

Fig. 4. Perturbation plots corresponding to the reduced subspaces for the hand movements produced by different functional DR techniques: (a) Y-trajectory of joint C of the middle finger, corresponding to the first dimension of the reduced subspaces, and (b) Z-trajectory of joint A of the thumb, corresponding to the second dimension of the reduced subspaces.

Among the functional DR techniques covered here, FFDA is the most computationally expensive, due to the requirement of computing the overall covariance as well as the individual class covariances. The least computationally expensive is FPCA, followed by FSPCA. The computational complexity of the F-Isomap algorithm is dominated by the computation of the pairwise geodesic distances (Tenenbaum et al., 2000).

The presented discriminative approach is not limited to pairwise comparison (as is the case in Bandera et al., 2009); instead, it systematically identifies a subspace in which discriminative analysis of all the movements can be performed at once. Furthermore, newly observed movements can be classified by embedding them in the resulting lower-dimensional space. A by-product of the identified lower-dimensional spaces is the set of features spanning these spaces, which are the movement features critical for discriminating between different affective movements. Therefore, there is no need to hand-pick and estimate movement features that might be important for affective movement recognition (as done in Rett, 2009; Bandera et al., 2009).

6. Conclusion and future work

Different DR techniques are used in the context of functional data analysis to find a discriminative low-dimensional embedding of a set of sequential affective movement observations. First, the FDA, supervised PCA, and Isomap DR techniques are adapted to enable application to sequential functional observations. The sequential observations are first modelled as temporal functions using BFE with a fixed number of B-spline basis functions. Then, functional versions of the DR techniques (FPCA, FFDA, FSPCA, and F-Isomap) are applied to the BFE representation of the affective movements, and the corresponding lower-dimensional embeddings are obtained.


Leave-one-out cross validation using a 1NN classifier was applied in the reduced embeddings, and the training and testing recognition rates were computed as the performance measure for the functional DR techniques. Overall, considering the testing recognition rates for both datasets (the testing recognition rate demonstrates the capability to discriminate unseen movements) and the elapsed training time as assessment criteria, FSPCA-GRBF outperforms the other functional DR techniques tested here. Furthermore, for the full-body dataset, considering the large number of freely expressed affective movements demonstrated by 13 different actors, the FSPCA-GRBF technique shows promising performance when compared with perceptual and automatic inter-personal affective movement recognition studies.

The presented movement recognition approach is particularly useful since it uses a minimal set of systematically obtained feature transformations (the dimensions spanning the lower-dimensional subspaces), rather than trying to recognize the movements in their original high-dimensional time-series format, which is most likely characterized by many features redundant and irrelevant to the recognition task. BFE is an efficient way to represent high-dimensional and variable-length sequential observations as functions estimated by weighted linear combinations of a fixed number of basis functions, which satisfies the discriminative DR techniques' requirement for a fixed-length vectorial representation of the sequential observations. BFE can also be regarded as an intermediate dimensionality reduction step, as it produces a smoother, down-sampled version of the original temporal observations.

The long-term goal of this project is to develop a systematic approach for identifying a subset of features that can be used to distinguish between different movements (i.e., to optimize movement recognition). Using more variants of movements conveying an affective expression will help identify a more generalized set of features (feature transformations) and movement qualities associated with that affective expression, which consequently will help to develop a more robust and accurate computational model for human affective movement recognition.

Acknowledgement

We would like to thank Dr. Andrea Kleinsmith for making the full-body dataset (Kleinsmith et al., 2006) available to us. This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

Araki, Y., Konishi, S., Kawano, S., Matsui, H., 2009. Functional regression modelingvia regularized gaussian basis expansions. Ann. Inst. Stat. Math. 61 (4), 811–833.

Argyle, M., 1988. Bodily Communication. Methuen.Bandera, J., Marfil, R., Bandera, A., Rodríguez, J., Molina-Tanco, L., Sandoval, F., 2009.

Fast gesture recognition based on a two-level representation. PatternRecognition Lett. 30 (13), 1181–1189.

Barshan, E., Ghodsi, A., Azimifar, Z., Zolghadri Jahromi, M., 2011. Supervisedprincipal component analysis: Visualization, classification and regression onsubspaces and submanifolds. Pattern Recognition 44 (7), 1357–1371.

Bengio, Y., Paiement, J., Vincent, P., Delalleau, O., Le Roux, N., Ouimet, M., 2004. Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps, and spectral clustering.Adv. Neural Inf. Process. Systems 16, 177–184.

Bernardin, K., Ogawara, K., Ikeuchi, K., Dillmann, R., 2005. A sensor fusion approachfor recognizing continuous human grasping sequences using hidden markovmodels. IEEE Trans. Rob. 21 (1), 47–57.

Bernhardt, D., Robinson, P., 2007. Detecting affect from non-stylised body motions.In: Paiva, A., Prada, R., Picard, R. (Eds.), Affective Computing and IntelligentInteraction, Lecture Notes in Computer Science, vol. 4738. Springer, Berlin/Heidelberg, pp. 59–70.

Biau, G., Bunea, F., Wegkamp, M., 2005. Functional classification in hilbert spaces.IEEE Trans. Inf. Theory 51 (6), 2163–2172.

Blake, R., Shiffrar, M., 2007. Perception of human motion. Ann. Rev. Psychol. 58, 47–73.

Boone, R., Cunningham, J., 1998. Children’s decoding of emotion in expressive bodymovement: The development of cue attunement. Dev. Psychol. 34 (5), 1007–1016.

Camurri, A., Lagerlöf, I., Volpe, G., 2003. Recognizing emotion from dancemovement: Comparison of spectator recognition and automated techniques.Internat. J. Hum.-Comput. Stud. 59 (1), 213–225.

Camurri, A., Mazzarino, B., Volpe, G., 2004. Expressive interfaces. Cognition Technol.Work 6, 15–22.

Chen, C.-h., Hrdle, W., Unwin, A., Cox, M.A.A., Cox, T.F., 2008. Multidimensionalscaling. In: Handbook of Data Visualization. Springer Handbooks ofComputational Statistics. Springer, Berlin, Heidelberg, pp. 315–347.

Cowie, R., 1535. Perceiving emotion: Towards a realistic understanding of the task.Philos. Trans. Roy. Soc. B: Biol. Sci. 364 (1535), 3515–3525.

Crane, E., Gross, M., 2007. Motion capture and emotion: Affect detection in wholebody movement. In: Internat. Conf. on ACII. pp. 95–101.

De Boor, C., 2001. A Practical Guide to Splines, vol. 27. Springer Verlag.Dick, A., Brooks, M., 2003. Issues in automated visual surveillance. In: Internat. Conf.

on Digital Image Computing: Techniques and Applications. pp. 195–204.D’Mello, S., Graesser, A., 2009. Automatic detection of learner’s affect from gross

body language. Appl. Artif. Intell. 23 (2), 123–150.Faisal, A., Selen, L., Wolpert, D., 2008. Noise in the nervous system. Nat. Rev.

Neurosci. 9 (4), 292–303.Ferraty, F., Vieu, P., 2006. Nonparametric Functional Data Analysis: Theory and

Practice. Springer.Fisher, R., 1936. The use of multiple measurements in taxonomic problems. Ann.

Hum. Genet. 7 (2), 179–188.Geng, X., Zhan, D.-C., Zhou, Z.-H., 2005. Supervised nonlinear dimensionality

reduction for visualization and classification. IEEE Trans. Systems ManCybernet. Part B Cybernet. 35 (6), 1098–1107.

Gretton, A., Bousquet, O., Smola, A., Schölkopf, B., 2005. Measuring statisticaldependence with Hilbert–Schmidt norms. In: Algorithmic Learning Theory.Springer, pp. 63–77.

Gunes, H., Piccardi, M., 2009. Automatic temporal segment detection and affectrecognition from face and body display. IEEE Trans. Systems Man Cybernet. PartB Cybernet. 39 (1), 64–84.

Iba, S., Weghe, J., Paredis, C., Khosla, P., 1999. An architecture for gesture-basedcontrol of mobile robots. Proc. IEEE/RDJ Internat. Conf. on Intelligent Robots andSystems, IROS’99, vol. 2. IEEE, pp. 851–857.

Inamura, T., Toshima, I., Tanie, H., Nakamura, Y., 2004. Embodied symbol emergencebased on mimesis theory. Internat. J. Rob. Res. 23 (4–5), 363–377.

Ivanenko, Y., Cappellini, G., Dominici, N., Poppele, R., Lacquaniti, F., 2005.Coordination of locomotion with voluntary movements in humans. J.Neurosci. 25 (31), 7238–7253.

Janssen, D., Schöllhorn, W., Lubienetzki, J., Fölling, K., Kokenge, H., Davids, K., 2008.Recognition of emotions in gait patterns by means of artificial neural nets. J.Nonverbal Behav. 32 (2), 79–92.

Jenkins, O.C., Mataric, M.J., 2004. A spatio-temporal extension to Isomap nonlineardimension reduction. In: Proc. 21st Internat. Conf. on Machine Learning, ICML’04. ACM, New York, NY, USA, pp. 441–448.

Jolliffe, I., 2002. MyiLibrary Principal Component Analysis, vol. 2. Wiley OnlineLibrary.

Kapur, A., Virji-Babul, N., Tzanetakis, G., Driessen, P., 2005. Gesture-based affective computing on motion capture data. In: Internat. Conf. on ACII, pp. 1–7.

Karg, M., Kühnlenz, K., Buss, M., 2010. Recognition of affect based on gait patterns. IEEE Trans. Systems Man Cybernet. Part B Cybernet. 40 (4), 1050–1061.

Kleinsmith, A., De Silva, P., Bianchi-Berthouze, N., 2006. Cross-cultural differences in recognizing affect from body posture. Interact. Comput. 18 (6), 1371–1389.

Kulic, D., Takano, W., Nakamura, Y., 2008. Incremental learning, clustering and hierarchy formation of whole body motion patterns using adaptive hidden Markov chains. Internat. J. Rob. Res. 27 (7), 761–784.

Laban, R., Lawrence, F., 1947. Effort. Macdonald and Evans.

Lee, E., 1982. A simplified B-spline computation routine. Computing 29 (4), 365–371.

Losch, M., Schmidt-Rohr, S., Knoop, S., Vacek, S., Dillmann, R., 2007. Feature set selection and optimal classifier for human activity recognition. In: The 16th IEEE Internat. Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE, pp. 1022–1027.

Martinez, A., Kak, A., 2001. PCA versus LDA. IEEE Trans. Pattern Anal. Machine Intell. 23 (2), 228–233.

Measurand, 2009. Motion capture systems. <http://www.measurand.com>.

Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., Kawato, M., 2004. Learning from demonstration and adaptation of biped locomotion. Rob. Autonom. Systems 47 (2–3), 79–91.

Ogata, T., Sugano, S., Tani, J., 2005. Open-end human–robot interaction from the dynamical systems perspective: Mutual adaptation and incremental learning. Adv. Rob. 19 (6), 651–670.

Pollick, F., Paterson, H., Bruderlin, A., Sanford, A., 2001. Perceiving affect from arm movement. Cognition 82 (2), B51–B61.

Ramsay, J., 1997. Functional Data Analysis, second ed. Springer Science+Business Media, New York.

Ramsay, J., 2008. Functional data analysis software. <http://www.psych.mcgill.ca/misc/fda/software.html>.

Rett, J., 2009. Robot–human interface using Laban movement analysis inside a Bayesian framework. Ph.D. thesis, University of Coimbra.

A. Samadani et al. / Pattern Recognition Letters 34 (2013) 1829–1839 1839

Roether, C., Omlor, L., Christensen, A., Giese, M., 2009. Critical features for the perception of emotion from gait. J. Vision 9 (6), 1–32.

Rossi, F., Delannay, N., Conan-Guez, B., Verleysen, M., 2005. Representation of functional data in neural networks. Neurocomputing 64, 183–210.

Russell, J., Mehrabian, A., 1977. Evidence for a three-factor theory of emotions. J. Res. Pers. 11, 273–294.

Samadani, A., 2011. Questionnaire videos. <https://ece.uwaterloo.ca/asamadan/JulyVideos.htm>.

Samadani, A., DeHart, B., Robinson, K., Kulic, D., Kubica, E., Gorbet, R., 2011. A study of human performance in recognizing expressive hand movements. In: 20th IEEE Internat. Symposium on Robot and Human Interactive Communication, pp. 93–100.

Santello, M., Flanders, M., Soechting, J., 2002. Patterns of hand motion during grasping and the influence of sensory guidance. J. Neurosci. 22 (4), 1426–1435.

Tenenbaum, J.B., Silva, V.d., Langford, J.C., 2000. A global geometric framework for nonlinear dimensionality reduction. Science 290 (5500), 2319–2323.

Tenenbaum, J.B., Silva, V.d., Langford, J.C., 2000. Isomap software. <http://isomap.stanford.edu/>.

Urtasun, R., Fleet, D., Fua, P., 2006. Temporal motion models for monocular and multiview 3D human body tracking. Computer Vision and Image Understanding 104 (2), 157–177.

Wallbott, H., 1998. Bodily expression of emotion. Eur. J. Soc. Psychol. 28 (6), 879–896.

