Paper introduction: Probabilistic SFA for Behavior Analysis

Posted on 12-Jan-2017


Zafeiriou, Lazaros, et al.

IEEE Transactions on Neural Networks and Learning Systems

Probabilistic Slow Feature Analysis for Behavior Analysis

Presenter : S5lab. Shuuji Mihara

Abstract

This paper proposes a number of extensions to both the deterministic and the probabilistic SFA optimization frameworks, in particular EM-SFA.

It sheds further light on the relation between two-sequence EM-SFA and CCA (Canonical Correlation Analysis).

The proposed EM-SFA with DTW (Dynamic Time Warping) algorithms are applied to facial behavior analysis, demonstrating their usefulness for this task.

Index

1. Introduction – What’s SFA?

2. Deterministic SFA

3. Probabilistic SFA

4. EM-SFA

5. EM-SFA with DTW

6. Experiments

7. Conclusion

1. Introduction – What’s SFA?

Slow Feature Analysis (Wiskott, 2002)

Objective: extract slowly varying features from time-series data.

[Figure: Slow Feature Analysis transforms the observations into latent variables, the slow features.]

2. Deterministic SFA

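Deterministic SFA (1)

[Figure: Slow Feature Analysis transforms the observations $x_{1,1:T}, x_{2,1:T}, \dots, x_{M,1:T}$ into the latent variables (slow features) $y_{1,1:T}, y_{2,1:T}, \dots, y_{N,1:T}$.]

$X = [x_{1,1:T};\ x_{2,1:T};\ \dots;\ x_{M,1:T}]$, $Y = [y_{1,1:T};\ y_{2,1:T};\ \dots;\ y_{N,1:T}]$

$Y = V^{T} X$ ($V$: $M \times N$ matrix)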

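Deterministic SFA (2)

The deterministic SFA problem is formulated as the following optimization problem:

$\min_{V} \operatorname{tr}[\dot{Y}\dot{Y}^{T}] \quad \text{s.t.} \quad Y\mathbf{1} = \mathbf{0},\ YY^{T} = I$

Constraints: $Y\mathbf{1} = \mathbf{0}$ (zero mean), $YY^{T} = I$ (unit variance and decorrelation)

$\dot{Y}$: first-order time difference of $Y$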
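The optimization problem above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the linear case $Y = V^{T}X$ and a full-rank data covariance; the function name `linear_sfa` and the toy signals are hypothetical and not taken from the paper. It whitens the data so that $YY^{T} = I$ holds, then picks the directions in which the first-order time difference has the least power.

```python
import numpy as np

def linear_sfa(X, n_features):
    """Linear SFA sketch.

    X: array of shape (M, T), one row per observed channel.
    Returns V (M, N), Y = V^T X (N, T) and the N smallest values of
    the objective, one per extracted feature.
    """
    X = X - X.mean(axis=1, keepdims=True)   # zero mean, so that Y 1 = 0
    B = X @ X.T                             # (unnormalised) covariance of X
    Xdot = np.diff(X, axis=1)               # first-order time difference
    A = Xdot @ Xdot.T                       # covariance of the differences
    d, E = np.linalg.eigh(B)                # B = E diag(d) E^T
    S = E / np.sqrt(d)                      # whitening matrix: S^T B S = I
    w, U = np.linalg.eigh(S.T @ A @ S)      # eigenvalues ascending: slowest first
    V = S @ U[:, :n_features]
    return V, V.T @ X, w[:n_features]

# Toy data (hypothetical): a slow and a fast sine, linearly mixed.
T = 2000
t = np.linspace(0.0, 2.0 * np.pi, T)
sources = np.vstack([np.sin(t), np.sin(37.0 * t)])
mixing = np.array([[1.0, 0.5], [-0.3, 1.2]])
X_demo = mixing @ sources
V_demo, Y_demo, slowness_demo = linear_sfa(X_demo, n_features=1)
```

On this toy input the single extracted feature should follow the slow sine up to sign, since the whitening step enforces the constraints and the eigen-decomposition minimises the trace term.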

3. Probabilistic SFA

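Probabilistic Slow Feature Analysis (pSFA)

System model

$\mathbf{y}_t = \boldsymbol{\lambda}\,\mathbf{y}_{t-1} + \mathbf{v}_t, \quad \mathbf{v}_t \sim N(\mathbf{0}, \Sigma)$

Observation model

$\mathbf{x}_t = \mathbf{W}^{-1}\mathbf{y}_t + \mathbf{w}_t, \quad \mathbf{w}_t \sim N(\mathbf{0}, \sigma_x^{2}\mathbf{I})$

$\mathbf{y}_t$: latent variable, $\mathbf{x}_t$: observed data

$\boldsymbol{\lambda}$: dependency on $\mathbf{y}_{t-1}$

$\mathbf{W}^{-1}$: observation matrix

$\mathbf{v}_t, \mathbf{w}_t$: Gaussian noise

Constraints

$\lambda_n^{2} + \sigma_n^{2} = 1$ (where $\sigma_n^{2}$ is the variance of the $n$-th component of $\mathbf{v}_t$)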

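Probabilistic Slow Feature Analysis (pSFA)

$\mathbf{y}_t = \boldsymbol{\lambda}\,\mathbf{y}_{t-1} + \mathbf{v}_t, \quad \lambda_n^{2} + \sigma_n^{2} = 1$

$\lambda_n$ large, $\sigma_n$ small: slow feature

$\lambda_n$ small, $\sigma_n$ large: fast feature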
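To make the role of $\lambda_n$ concrete, here is a small simulation of one latent dimension under the constraint $\lambda_n^{2} + \sigma_n^{2} = 1$. It is only a sketch: the function name `simulate_latent`, the two values of $\lambda$ and the sequence length are arbitrary choices for illustration. For a stationary unit-variance chain the mean squared time difference is $2(1 - \lambda)$ in expectation, so a larger $\lambda$ gives a slower signal.

```python
import numpy as np

def simulate_latent(lam, T, rng):
    """Sample y_t = lam * y_{t-1} + v_t with v_t ~ N(0, 1 - lam^2)."""
    sigma = np.sqrt(1.0 - lam ** 2)        # constraint: lam^2 + sigma^2 = 1
    y = np.empty(T)
    y[0] = rng.standard_normal()           # unit-variance (stationary) start
    for t in range(1, T):
        y[t] = lam * y[t - 1] + sigma * rng.standard_normal()
    return y

rng = np.random.default_rng(0)
y_slow = simulate_latent(0.99, 5000, rng)  # lambda large, sigma small
y_fast = simulate_latent(0.10, 5000, rng)  # lambda small, sigma large

# Empirical slowness: mean squared first-order time difference.
slowness_slow = np.mean(np.diff(y_slow) ** 2)
slowness_fast = np.mean(np.diff(y_fast) ** 2)
```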

4. EM-SFA

EM-SFA (1)

Previous method

Kalman smoother: estimate $\mathbf{y}$

ML: estimate $\Lambda$, $W^{-1}$

✕ Can’t estimate $\sigma_x^{2}$

Proposed method

Kalman smoother: learn the sufficient statistics

EM algorithm: estimate $\Lambda$, $W^{-1}$, $\sigma_x^{2}$

Update the sufficient statistics

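Kalman Smoother (1)

State Space Model

$\mathbf{x}_k = A_{k-1}\mathbf{x}_{k-1} + \boldsymbol{\eta}, \quad \mathbf{y}_k = H_k\mathbf{x}_k + \boldsymbol{\epsilon}$

$\Leftrightarrow \quad \mathbf{x}_k \sim N(A_{k-1}\mathbf{x}_{k-1}, Q_{k-1}), \quad \mathbf{y}_k \sim N(H_k\mathbf{x}_k, \Sigma_k)$

Here $\mathbf{x}_k$ is the latent state and $\mathbf{y}_k$ the observation, the reverse of the pSFA notation.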
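For reference, the smoother for this state-space model can be sketched as a forward Kalman filter followed by a backward Rauch-Tung-Striebel pass. The sketch below assumes time-invariant matrices `A`, `H`, `Q`, `Sigma` and a Gaussian prior `(m0, P0)` on the first state; the function name and the scalar demo are hypothetical and only meant to show the recursion.

```python
import numpy as np

def kalman_smoother(Y, A, H, Q, Sigma, m0, P0):
    """Y: (T, d_obs) observations. Returns smoothed means (T, d) and covariances (T, d, d)."""
    T, d = Y.shape[0], A.shape[0]
    m_pred = np.zeros((T, d)); P_pred = np.zeros((T, d, d))
    m_filt = np.zeros((T, d)); P_filt = np.zeros((T, d, d))
    for k in range(T):                      # forward pass: Kalman filter
        if k == 0:
            m_pred[k], P_pred[k] = m0, P0   # prior on the first state
        else:
            m_pred[k] = A @ m_filt[k - 1]
            P_pred[k] = A @ P_filt[k - 1] @ A.T + Q
        S = H @ P_pred[k] @ H.T + Sigma     # innovation covariance
        K = np.linalg.solve(S, H @ P_pred[k]).T   # gain K = P H^T S^{-1}
        m_filt[k] = m_pred[k] + K @ (Y[k] - H @ m_pred[k])
        P_filt[k] = P_pred[k] - K @ S @ K.T
    m_s, P_s = m_filt.copy(), P_filt.copy()
    for k in range(T - 2, -1, -1):          # backward pass: RTS smoother
        G = np.linalg.solve(P_pred[k + 1], A @ P_filt[k]).T  # G = P_filt A^T P_pred^{-1}
        m_s[k] = m_filt[k] + G @ (m_s[k + 1] - m_pred[k + 1])
        P_s[k] = P_filt[k] + G @ (P_s[k + 1] - P_pred[k + 1]) @ G.T
    return m_s, P_s

# Scalar demo (hypothetical values): a slow latent state observed in unit-variance noise.
rng = np.random.default_rng(0)
T = 500
A = np.array([[0.95]]); Q = np.array([[1.0 - 0.95 ** 2]])
H = np.array([[1.0]]); Sigma = np.array([[1.0]])
x_true = np.zeros((T, 1))
x_true[0] = rng.standard_normal(1)
for k in range(1, T):
    x_true[k] = A @ x_true[k - 1] + np.sqrt(Q[0, 0]) * rng.standard_normal(1)
y_obs = x_true @ H.T + rng.standard_normal((T, 1))
m_s, P_s = kalman_smoother(y_obs, A, H, Q, Sigma, np.zeros(1), np.eye(1))
```

The smoothed means and covariances are the quantities from which expectations such as $E[\mathbf{x}_k]$ and $E[\mathbf{x}_k\mathbf{x}_k^{T}]$ can be formed for an EM procedure.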

Kalman Smoother (2)

Estimate by Kalman Smoother

[Figure: the observations $x_{t,1}, x_{t,2}, x_{t,3}$, corrupted by noise $\sigma_x^{2}$, are used to estimate the slow features $y_{t,1}, y_{t,2}, y_{t,3}$.]

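Probabilistic Slow Feature Analysis (pSFA)

System model

$\mathbf{y}_t = \boldsymbol{\Lambda}\,\mathbf{y}_{t-1} + \mathbf{v}_t, \quad \mathbf{v}_t \sim N(\mathbf{0}, \Sigma)$

Observation model

$\mathbf{x}_t = \mathbf{V}\mathbf{y}_t + \mathbf{w}_t, \quad \mathbf{w}_t \sim N(\mathbf{0}, \sigma_x^{2}\mathbf{I})$

$\mathbf{y}_t$: latent variable, $\mathbf{x}_t$: observed data

$\boldsymbol{\Lambda}$: dependency on $\mathbf{y}_{t-1}$

$\mathbf{V}$: observation matrix

$\mathbf{v}_t, \mathbf{w}_t$: Gaussian noise

Constraints

$\lambda_n^{2} + \sigma_n^{2} = 1$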

Inference in SFA

Parameter

Sufficient Statistics for EM

Kalman Smoother

EM-SFA Algorithm
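As a rough illustration of how the sufficient statistics from the Kalman smoother feed the parameter updates, the following sketch shows an M-step for the observation matrix $\mathbf{V}$ and the noise variance $\sigma_x^{2}$. It assumes the textbook EM updates for a linear-Gaussian state-space model with isotropic observation noise, which is not necessarily the exact update used in the paper. The function name `m_step_observation` and the argument names `Ey` (for $E[\mathbf{y}_t]$) and `Eyy` (for $E[\mathbf{y}_t\mathbf{y}_t^{T}]$) are hypothetical.

```python
import numpy as np

def m_step_observation(X, Ey, Eyy):
    """M-step sketch for x_t = V y_t + w_t, w_t ~ N(0, sigma2 * I).

    X:   (T, M) observations.
    Ey:  (T, N) smoothed means E[y_t].
    Eyy: (T, N, N) smoothed second moments E[y_t y_t^T].
    """
    T, M = X.shape
    Sxy = X.T @ Ey                          # sum_t x_t E[y_t]^T, shape (M, N)
    Syy = Eyy.sum(axis=0)                   # sum_t E[y_t y_t^T], shape (N, N)
    V = np.linalg.solve(Syy, Sxy.T).T       # V = Sxy Syy^{-1}
    # Expected squared residual sum_t E||x_t - V y_t||^2, expanded term by term.
    resid = np.sum(X ** 2) - 2.0 * np.trace(V.T @ Sxy) + np.trace(V.T @ V @ Syy)
    sigma2 = resid / (T * M)
    return V, sigma2

# Sanity check on synthetic data (hypothetical): with the latents known exactly,
# E[y_t] = y_t and E[y_t y_t^T] = y_t y_t^T, so the update reduces to least squares.
rng = np.random.default_rng(0)
T, M, N = 5000, 4, 2
V_true = rng.standard_normal((M, N))
Y_lat = rng.standard_normal((T, N))
X_obs = Y_lat @ V_true.T + 0.1 * rng.standard_normal((T, M))
Eyy_demo = np.einsum('ti,tj->tij', Y_lat, Y_lat)
V_hat, sigma2_hat = m_step_observation(X_obs, Y_lat, Eyy_demo)
```

Estimating $\sigma_x^{2}$ this way is what distinguishes an EM treatment from the earlier maximum-likelihood approach, which the slides note cannot estimate it.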

5. EM-SFA with DTW

Time Alignment

Dynamic Time Warping (DTW)

Dynamic time warping (DTW) is an algorithm for measuring the similarity between two temporal sequences that may vary in timing or speed.
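The definition above corresponds to a short dynamic-programming recursion. The sketch below is the classic DTW for one-dimensional sequences with an absolute-difference cost; the function name `dtw` and the example sequences are chosen here for illustration only.

```python
import numpy as np

def dtw(a, b):
    """Return the DTW distance between 1-D sequences a and b and the warping path."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)     # D[i, j]: best cost aligning a[:i] with b[:j]
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from the end to recover which samples were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return D[n, m], path

# The second sequence is the first one played at a varying speed.
a_demo = [0.0, 1.0, 2.0, 3.0]
b_demo = [0.0, 0.0, 1.0, 2.0, 2.0, 3.0]
dist_demo, path_demo = dtw(a_demo, b_demo)
```

Because `b_demo` only repeats samples of `a_demo`, the warping path can match every sample exactly, which is the sense in which DTW is insensitive to differences in speed.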

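Canonical Correlation Analysis

Canonical correlation analysis (CCA) is a way of making sense of cross-covariance matrices: it finds linear projections of two multivariate data sets that are maximally correlated.

$x = [x_1, x_2, \dots]$, $y = [y_1, y_2, \dots]$ (multivariate data)

$u = a'x$, $v = b'y$ (univariate)

$(a, b) = \arg\max_{a,\,b} \operatorname{Cor}[u, v]$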
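One common way to solve this maximisation is to whiten both data sets and take the leading singular vectors of the whitened cross-covariance. The sketch below follows that route; the function name `cca_first_pair`, the small ridge term `reg` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-8):
    """X: (n, p), Y: (n, q), rows are samples. Returns a, b and Cor[Xa, Yb]."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])   # ridge keeps the inverse stable
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        d, E = np.linalg.eigh(C)
        return E @ np.diag(d ** -0.5) @ E.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)        # singular values = canonical correlations
    return Kx @ U[:, 0], Ky @ Vt[0], s[0]

# Synthetic data (hypothetical): both views contain the same hidden signal z.
rng = np.random.default_rng(0)
z = rng.standard_normal(1000)
X_demo = np.column_stack([z + 0.1 * rng.standard_normal(1000), rng.standard_normal(1000)])
Y_demo = np.column_stack([rng.standard_normal(1000), -z + 0.1 * rng.standard_normal(1000)])
a_demo, b_demo, rho_demo = cca_first_pair(X_demo, Y_demo)
```

The projections `X_demo @ a_demo` and `Y_demo @ b_demo` play the role of $u$ and $v$ above, and `rho_demo` is their correlation.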

EM-SFA with DTW

The proposed EM-SFA is more suitable for aligning time series, since it incorporates temporal constraints (via the first-order Markov prior), while CCA incorporates a fully connected MRF prior over the latent space.

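EM-SFA for Two Sequences

The complete joint log-likelihood:

$\log P(X_1, X_2, Y \mid \theta) = \log P(y_1 \mid 0, \Sigma_1) + \sum_{t=2}^{T} \log P(y_t \mid y_{t-1}, \Lambda) + \sum_{t=1}^{T} \log P(x_t^{1} \mid y_t, V_1, \sigma_{x,1}^{2}) + \sum_{t=1}^{T} \log P(x_t^{2} \mid y_t, V_2, \sigma_{x,2}^{2})$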
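The four terms of this log-likelihood can be evaluated directly once every factor is written as a Gaussian. The sketch below assumes a diagonal $\Lambda$ with transition noise variances $1 - \lambda_n^{2}$ (the pSFA constraint) and a diagonal $\Sigma_1$ that defaults to the identity; these choices, the helper `diag_gauss_logpdf` and the function name `joint_loglik` are assumptions for illustration.

```python
import numpy as np

def diag_gauss_logpdf(x, mean, var):
    """Summed log density of independent Gaussians N(mean, var) over all entries of x."""
    var = np.broadcast_to(var, x.shape)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def joint_loglik(X1, X2, Y, lam, V1, V2, s2_1, s2_2, Sigma1=None):
    """X1: (T, M1), X2: (T, M2), Y: (T, N); lam: (N,) diagonal of Lambda."""
    sig2 = 1.0 - lam ** 2                          # constraint lam_n^2 + sigma_n^2 = 1
    if Sigma1 is None:
        Sigma1 = np.ones_like(lam)                 # assumed unit-variance first state
    ll = diag_gauss_logpdf(Y[0], 0.0, Sigma1)              # log P(y_1 | 0, Sigma_1)
    ll += diag_gauss_logpdf(Y[1:], Y[:-1] * lam, sig2)     # sum_t log P(y_t | y_{t-1}, Lambda)
    ll += diag_gauss_logpdf(X1, Y @ V1.T, s2_1)            # first sequence
    ll += diag_gauss_logpdf(X2, Y @ V2.T, s2_2)            # second sequence
    return ll
```

Both sequences share the same latent $Y$ but have their own observation matrix and noise variance, which is what ties the two sequences together in the two-sequence model.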

EM-SFA with DTW

6. Experiments

Experiment A – Synthetic Data

Action Unit (AU)

Experiment B – Real Data 1: Unsupervised AU Temporal Phase Segmentation

Experiment B – Real Data 2: Temporal Alignment, CTW vs. EM-SFA with DTW

Experiment B – Real Data 3: Conflict Detection

Experiment B – Real Data 3: Conflict Detection (continued)

7. Conclusion

Conclusion

This paper proposes a number of extensions to both the deterministic and the probabilistic SFA optimization frameworks, in particular EM-SFA.

It sheds further light on the relation between two-sequence EM-SFA and CCA (Canonical Correlation Analysis).

The proposed EM-SFA with DTW (Dynamic Time Warping) algorithms are applied to facial behavior analysis, demonstrating their usefulness for this task.

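State Space Model (1)

State Space Model

[Figure: graphical model of a state-space model. The latent variables $x_1, x_2, \dots, x_T$ are linked by the system equation (sys-eq), and each observed variable $y_1, y_2, \dots, y_T$ is generated from the corresponding latent variable by the observation equation (obs-eq).]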