Post on 11-Jan-2016

Human Emotion Synthesis

David Oziem, Lisa Gralewski, Neill Campbell, Colin Dalton, David Gibson, Barry Thomas

University of Bristol, Motion Ripper, 3CR Research

Synthesising Facial Emotions – University of Bristol – 3CR Research

Project Group

• Motion Ripper Project

– Methods of motion capture.
– Re-using captured motion signatures.
– Synthesising new or extended motion sequences.
– Tools to aid animation.

• Collaboration between University of Bristol CS, Matrix Media & Granada.

Introduction

• What is an emotion?

• Ekman outlined 6 basic emotions.
– Joy, disgust, surprise, fear, anger and sadness.

• Emotional states relate to one's expression and movement.

• Synthesising video footage of an actress expressing different emotions.

Video Textures

• Video textures or temporal textures are textures with motion. (Szummer’96)

• Schödl’00 reordered frames from the original footage to produce loops or continuous sequences.

– Doesn’t produce new footage.

• Campbell’01, Fitzgibbon’01 and Reissell’01 used an autoregressive process (ARP) to synthesise frames.

Examples of Video Textures

Autoregressive Process

• Statistical model

• Calculating the model involves working out the parameter vector $(a_1, \dots, a_n)$ and $w$.

• n is known as the order of the sequence.

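$$y(t) = -a_1 y(t-1) - a_2 y(t-2) - \dots - a_n y(t-n) + w \cdot \varepsilon$$

Here $y(t)$ is the current value at time $t$, $(a_1, \dots, a_n)$ is the parameter vector and $w \cdot \varepsilon$ is the noise term.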
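As a minimal sketch of the scalar model above, the parameters can be estimated from a sequence and then used to generate new values. The least-squares estimator and the names `fit_arp` and `simulate_arp` are assumptions made for illustration; the slides do not say how the model was actually fitted.

```python
import numpy as np

def fit_arp(y, n):
    """Least-squares estimate of (a_1..a_n) and w for
    y(t) = -a_1 y(t-1) - ... - a_n y(t-n) + w * eps."""
    y = np.asarray(y, dtype=float)
    # The row for time t holds the n previous values y(t-1), ..., y(t-n).
    X = np.column_stack([y[n - k:len(y) - k] for k in range(1, n + 1)])
    target = y[n:]
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    a = -coeffs                      # sign convention used on the slide
    w = np.std(target - X @ coeffs)  # noise scale taken from the residuals
    return a, w

def simulate_arp(a, w, seed_values, length, rng):
    """Generate `length` new values, starting from the last n seed values."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    out = list(seed_values[-n:])
    for _ in range(length):
        past = np.array(out[-1:-n - 1:-1])   # y(t-1), ..., y(t-n)
        out.append(float(-a @ past + w * rng.standard_normal()))
    return np.array(out[n:])
```

In this sketch the noise is drawn from a standard normal distribution, which is also an assumption.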

Autoregressive Process

• Statistical model

• Increasing the dimensionality of $y$ drastically increases the complexity of calculating $(a_1, \dots, a_n)$.

$$y(t) = -a_1 y(t-1) - a_2 y(t-2) - \dots - a_n y(t-n) + w \cdot \varepsilon$$

Autoregressive Process

[Figure: PCA analysis of Sad footage in 2D; axes: primary mode, secondary mode]

• Principal Components Analysis (PCA) is used to reduce the number of dimensions in the original sequence.
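The reduction step can be sketched as follows, assuming each frame has been flattened into one row of a matrix. Computing the modes with an SVD is one common way to do PCA; the function names are illustrative and not taken from the original work.

```python
import numpy as np

def pca_reduce(frames, k):
    """Project frames (one flattened frame per row) onto the first k modes."""
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    centred = frames - mean
    # Rows of vt are the principal modes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    modes = vt[:k]
    coords = centred @ modes.T      # (num_frames, k) trajectory in mode space
    return coords, modes, mean

def pca_reconstruct(coords, modes, mean):
    """Map reduced coordinates back to (approximate) frames."""
    return coords @ modes + mean
```

The first two columns of `coords` correspond to the primary and secondary modes plotted on the slides.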

Autoregressive Process

[Figure: PCA analysis of Sad footage in 2D (left) and generated sequence using an ARP (right); axes: primary mode, secondary mode]

• A non-Gaussian distribution is incorrectly modelled by an ARP.

Face Modelling

• Campbell’01 synthesised a talking head.

• Cootes and Taylor’00: combined appearance model.
– Isolates shape and texture.

• Requires labelled frames.
– Important features on the face must be labelled.

[Figure: face with labelled points]

Combined Appearance

[Diagram: shape space]

Hand-labelled video footage provides a point set that represents the shape space of the clip.

Combined Appearance

[Diagram: shape space, texture space]

Warping each frame into a standard pose creates the texture space.

The standard pose is the mean position of the points.

Combined Appearance

[Diagram: shape space and texture space joined into the combined space]

Joining the shape and texture space and then re-analysing using PCA produces the combined space.
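The joining step can be sketched as below, assuming the labelled point sets and the pose-normalised textures are already available as one row per frame. The scalar `shape_weight`, which balances the two sets of parameters before the second PCA, is a hypothetical parameter; the slides do not say whether or how the two spaces were weighted.

```python
import numpy as np

def pca(data, k):
    """Return the first k mode coordinates, the modes and the mean of `data`."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    modes = vt[:k]
    return (data - mean) @ modes.T, modes, mean

def combined_appearance(shapes, textures, k_shape, k_texture, k_combined,
                        shape_weight=1.0):
    """Reduce shape and texture separately, join them, then apply PCA again."""
    b_s, _, _ = pca(shapes, k_shape)        # shape-space parameters
    b_t, _, _ = pca(textures, k_texture)    # texture-space parameters
    joined = np.hstack([shape_weight * b_s, b_t])
    c, modes, mean = pca(joined, k_combined)
    return c, modes, mean
```

Each row of `c` is one frame expressed in the combined space, which is the trajectory an ARP would then be trained on.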

Combined Appearance

[Diagram: shape space and texture space joined into the combined space]

Reconstruction of the original sequence from the combined space.

Combined Appearance

[Figure: original sequence in 2D (left) and combined appearance sequence (right); axes: primary mode, secondary mode]

Change in distribution after applying the combined appearance technique.

Combined Appearance

[Figure: original sequence (left) and sequence generated by the ARP model (right); axes: primary mode, secondary mode]

• Visually, the generated plot appears to come from the same stochastic process as the original.

Copying and ARP

• Combine the benefits of copying with ARP.
– New motion signatures.
– Handles non-Gaussian distributions.

Copying and ARP

[Diagram: original input → PCA → reduced input]

• Important to reduce the complexity of the search process.

• Need around 30 to 40 dimensions in this example.

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments]

• Temporal segments of between 15 and 30 frames.

• Each segment must be reduced before an ARP can be trained on it.

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments → ARP → synthesised segments]

• Many of the learned models are unstable.

• 10-20% are usable.
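One way to screen out unstable models is sketched below for the scalar form of the ARP, where the model is stable when every root of $z^n + a_1 z^{n-1} + \dots + a_n$ lies inside the unit circle. This is a standard test for autoregressive models, not necessarily the criterion the authors used; a vector-valued model would need the analogous check on its companion matrix.

```python
import numpy as np

def is_stable(a):
    """True if the ARP y(t) = -a_1 y(t-1) - ... - a_n y(t-n) + w*eps
    has all characteristic roots strictly inside the unit circle."""
    # Characteristic polynomial: z^n + a_1 z^(n-1) + ... + a_n.
    poly = np.concatenate(([1.0], np.asarray(a, dtype=float)))
    roots = np.roots(poly)
    return bool(np.all(np.abs(roots) < 1.0))

def usable_models(models):
    """Keep only the parameter vectors that pass the stability test."""
    return [a for a in models if is_stable(a)]
```

A sequence generated by a model that fails this test grows without bound, so such a model cannot be used for synthesis.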

Copying and ARP

[Diagram: original input → PCA → reduced input → segmented input → PCA → reduced segments → ARP → synthesised segments → segment selection → output sequence]

Example

[Figure: first mode against time t, showing the end of the generated sequence, the possible segments and the compared section]

Example

[Figure: first mode against time t]

The closest 3 segments are chosen.

Example

[Figure: first mode against time t]

The segment to be copied is randomly selected from the closest 3.

Example

[Figure: first mode against time t]

Segments are blended together using a small overlap and averaging the overlapping pixels.
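The selection and blending steps from the example slides can be sketched as follows. Comparing the last few frames of the generated sequence with the first few frames of each candidate, and using Euclidean distance to do so, are assumptions; the slides only state that the closest 3 segments are found, one is picked at random, and the overlap is averaged.

```python
import numpy as np

def choose_segment(generated, candidates, overlap, rng, n_closest=3):
    """Pick at random one of the n_closest candidates whose first `overlap`
    frames best match the last `overlap` frames of `generated`."""
    tail = generated[-overlap:]
    dists = [np.linalg.norm(seg[:overlap] - tail) for seg in candidates]
    closest = np.argsort(dists)[:n_closest]
    return int(rng.choice(closest))

def append_blended(generated, segment, overlap):
    """Append `segment`, averaging it with `generated` over the overlap."""
    blended = 0.5 * (generated[-overlap:] + segment[:overlap])
    return np.concatenate([generated[:-overlap], blended, segment[overlap:]])
```

Calling these two functions in a loop extends the output one segment at a time for as long as required.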

Copying and ARP

[Figure: PCA analysis of Sad footage in 2D (left) and sequence generated by the copying and ARP model (right); axes: primary mode, secondary mode]

• Potentially infinitely long.

• Includes novel motions.

Results (Angry)

[Video panels: source footage, copying with ARP, combined appearance ARP]

• Combined appearance produces higher resolution frames.

• The copying and ARP approach produces better motion.

Results (Sad)

[Video panels: source footage, copying with ARP, combined appearance ARP]

• Similar results to the angry footage.
– Copied approach is less blurred due to the reduced variance.

Comparison Results

[Plot legend: combined appearance, segment copying]

• Simple objective comparison.
– Randomly selected temporal segments.

Comparison

• Perceptually, is it better to have good motion or higher resolution?

[Video panels: combined appearance, segment copying with ARP]

Other potential uses

• Self Organising Map

• Uses combined appearance, as each ARP model provides a minimal representation of the given emotion.

• Can navigate between emotions to create new interstates.

Angry Sad Happy

Conclusions

• Both methods can produce synthesised clips of a given emotion.

• Combined appearance produces higher definition frames.

• Copying with ARPs generates more natural movements.

Questions