Helsinki University of Technology
Laboratory of Computational Engineering
Modeling facial expressions for Finnish talking head
Michael Frydrych, LCE, 11.6.2004
Michael Frydrych, 11.6.2004 Laboratory of Computational Engineering
Finnish talking head
- Computer-animated model of a talking person
- Synchronized A/V speech
- Model of emotional facial expressions
User interface of “old” talking head
What has been done with it?
- Studies in audiovisual speech perception
- Kiosk interface at the University of Tampere
- Cultural activities: major role in the play Kyberias at Kellariteatteri (2001)
Talking Head
Content
- Talking heads – why?
- Animation methods
- Controlling animation
- Making them speak
- Practicals
--------------------------------------------------
- Making the head smile
- Emotions – why?
- Practicals
Why talking heads?
- Entertainment
- Information services: Ananova, information kiosks
- Education services: learning foreign languages, …
- Agents in spoken dialogue systems: nonverbal signals, comfort
Aids in communication
- Speech is both heard and seen
- Improves intelligibility in noisy environments
- Aid for hearing-impaired people: Synface
Synface (telephone -> animated face)
Figure by KTH Stockholm
… applications
- Language training: speech training for the profoundly deaf
- Diagnostics and therapy: EU projects VEPSY and VREPAR (assess and treat anxiety disorders and specific phobias)
Audiovisual speech integration = combining auditory and visual percepts into a single speech percept.

The strength of the integration is demonstrated by the McGurk effect: when the sound /pa/ is combined with a face “saying” /ka/, the speech percept is often /ta/ (McGurk & MacDonald, 1976, Nature).
A study in audio-visual speech perception

Result: a computer-animated talking face improves the intelligibility of auditory speech.
… application in research
- Psychophysical and psychophysiological experiments: audiovisual speech perception, emotion research, …
- Benefits: natural stimuli may contain unwanted features, whereas the synthetic head offers full controllability and quick creation of stimuli
Building on realism

Realism:
1) Objective: topography, animation, texture, synchronization, ...
2) Subjective (communication): audio-visual speech; facial expressions and nonverbal behavior (prosody, eye movements)

Evaluation: objective and subjective
Making the head speak

Issues:
- Voice – speech synthesizer
- Animation – parameterization
- Synchronization
Acoustic Speech Generation
- Based on the Festival platform, developed at the Centre for Speech Technology Research, University of Edinburgh, Scotland; its Scheme programming language allows the behaviour to be programmed
- Finnish voice, prosody, text expansion (numerals, etc.) by the Department of Phonetics, University of Helsinki
- Issues: production of articulatory parameters, synchronization
Animation methods – representation
- Polygonal
- Keyframing: libraries of postures, interpolation
- Parametric deformations: deformations are grouped under parameters meaningful to the animator
- Muscle-based deformations
- Interactive deformations: numerous control points, deformation propagation
- Free-form deformations: deformation associated with a deformation box
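Keyframing with interpolation reduces to blending stored vertex postures. A minimal sketch (the three-vertex "mesh" and both postures are made-up data):

```python
# Minimal keyframing sketch: linear interpolation between two stored
# postures of a polygonal mesh. Vertices are invented example data.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
smile   = [(0.0, 0.2, 0.0), (1.2, 0.1, 0.0), (0.0, 1.0, 0.1)]

def interpolate(pose_a, pose_b, t):
    """Blend pose_a toward pose_b; t=0 gives pose_a, t=1 gives pose_b."""
    return [tuple((1.0 - t) * a + t * b for a, b in zip(va, vb))
            for va, vb in zip(pose_a, pose_b)]

halfway = interpolate(neutral, smile, 0.5)
```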
- Splines
- Implicit surfaces
- Physics-based models: physical models of the skin, volume preservation, deformations by inducing forces
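A physics-based skin model can be illustrated with a single mass-spring node: a spring pulls the node back to its rest position, damping dissipates energy, and an external "muscle" force deforms it. All constants below are arbitrary, not from any real skin model.

```python
# One skin node under spring, damping and muscle forces,
# integrated with semi-implicit Euler steps.
def step(pos, vel, rest=0.0, force=0.0, k=50.0, c=4.0, m=1.0, dt=0.01):
    acc = (k * (rest - pos) - c * vel + force) / m  # F = ma
    vel = vel + acc * dt
    pos = pos + vel * dt
    return pos, vel

pos, vel = 0.0, 0.0
for _ in range(500):                 # settle under a constant muscle pull
    pos, vel = step(pos, vel, force=5.0)
# at equilibrium k * (rest - pos) + force = 0, i.e. pos = force / k = 0.1
```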
Hooks to data

Needed:
- the geometry of faces
- rendering properties
- deformation for facial expressions or speech

How? 2D and 3D techniques
3D Input
- A 3D digitizer is the most direct way, fairly automatic (Optotrak)
- 3D trackers – digitizing a projected/marked mesh, rather manual
- CT (computed tomography) and MRI (magnetic resonance imaging)
- and … 3D modeling programs
2D Input
- Photogrammetry: two images of an object are taken from different viewpoints, and corresponding points are found
- The 3D shape of a face can be determined from a single 2D image after projection of a regular pattern
- A generic facial model is prepared and transformed to “match” a photograph; the 3rd dimension can be approximated by acquiring a face model (set priors) and Bayesian inference
Data for articulation and expressions
- Keyframing -> expression libraries
- Real-time/performance data
- Parameterization:
  - articulatory parameters – jaw opening, lip rounding, lip protrusion, …
  - facial expressions – FACS
  - statistical models from expression libraries or real-time data
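A statistical parameterization of the kind mentioned last can be illustrated by extracting the dominant principal component from toy marker data. The data and the single-component power iteration below are invented for illustration; they are not Reveret's actual method.

```python
import math
import random

random.seed(0)

# Toy "marker" observations: one 2D coordinate pair per frame, where the
# second coordinate follows the first (e.g. jaw opening dragging a lip
# marker along a fixed direction) plus a little measurement noise.
frames = [(t, 2.0 * t + random.gauss(0.0, 0.05))
          for t in (i / 20.0 for i in range(40))]

# Center the data.
mx = sum(x for x, _ in frames) / len(frames)
my = sum(y for _, y in frames) / len(frames)
centered = [(x - mx, y - my) for x, y in frames]

# 2x2 covariance matrix entries.
n = len(centered)
cxx = sum(x * x for x, _ in centered) / n
cxy = sum(x * y for x, y in centered) / n
cyy = sum(y * y for _, y in centered) / n

# Power iteration: the dominant eigenvector of the covariance matrix is
# the first (and here only meaningful) statistical control parameter.
vx, vy = 1.0, 0.0
for _ in range(100):
    nx, ny = cxx * vx + cxy * vy, cxy * vx + cyy * vy
    norm = math.hypot(nx, ny)
    vx, vy = nx / norm, ny / norm
# (vx, vy) now points roughly along (1, 2) normalized: the learned axis
```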
Statistical parameterization
Parameterized model learned from 3D performance data (Reveret)
Figure by ICP Grenoble
… three control parameters
Figure by ICP Grenoble
… and the results

Plot labels: jaw, rounding, opening, raising
Figure by ICP Grenoble
Finnish talking head

Audiovisual database:
- recorded with the MacReflex 3D optical tracker (at Linköping Univ.)
- multiple IR cameras, reflexive markers; reconstruction from stereo
- covers coarticulation, lips, visual prosody
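The stereo reconstruction step can be sketched for the simplest case of two rectified, calibrated cameras; the real multi-camera IR rig uses full calibration, and the focal length, baseline and pixel coordinates below are invented.

```python
def triangulate(x_left, x_right, y, f=800.0, baseline=0.3):
    """Depth from disparity for a rectified stereo pair:
    Z = f * B / (x_left - x_right), then back-projection to X, Y in the
    left camera frame. Pixel coordinates are measured from the principal
    point; f is in pixels, baseline and the result in meters."""
    disparity = x_left - x_right
    Z = f * baseline / disparity
    X = x_left * Z / f
    Y = y * Z / f
    return X, Y, Z

# A reflexive marker seen at x=120 px (left), x=80 px (right), y=40 px:
X, Y, Z = triangulate(120.0, 80.0, 40.0)
```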
Point-light positions
Demo – live recording at Linköping
How to create “visemes”?
Demo – reconstructed motion
10 fps vs. 40 fps