Virtual Characters: Design, Implementation and Responses
Marco Gillies, Vinoba Vinayagamoorthy
Outline
• Why Virtual Human Representations?
• Avatars and Agents
• Representing a Person in a VE
• Forward and Inverse Kinematics, Morph targets
• Designing Virtual Humans
• Emotion, Personality and Social Intelligence
• Believable behaviour
• Conclusions
Virtual Human Representations
• Useful and interesting applications are with other people
– Simulation of real events
– Training
– Entertainment
– Shared VEs
• The Others are entirely ‘virtual’
• The Others are entirely ‘real’
– As in shared (networked) VEs
Networked VEs
• Need some representation of the other people in the shared VE
• Typically called ‘avatars’
• Avatars represent the real tracked person
– Spatial representation
• Where they are, what they are looking at
– Behavioural representation
• What they are doing
Avatar in Shared VE
Avatar Representations
• Spatial• Behavioural
Virtual Humans - Agents
http://www.miralab.unige.ch
Agents are entirely program-controlled, rather than representing an on-line human. These are examples from virtual fashion shows.
Different Aspects
• Graphics– Polygon meshes, rendering
• Animation– Skeletal animation, mesh morphing, physical simulation
• Behaviour
Graphics
Joao Oliveira (UCL CS)
A scanned body results in a huge mesh, which can be rendered at different resolutions (numbers of polygons)
Skeletal Animation
• The fundamental aspect of human body motion is the motion of the skeleton
• The motion of rigid bones linked by rotational joints (first approximation)
• I will discuss other elements of body motion such as muscle and fat briefly later
Typical Skeleton
• Circles are rotational joints; lines are rigid links (bones)
• The red circle is the root (position and rotation offset from the origin)
• The character is animated by rotating joints and moving and rotating the root
Forward Kinematics (FK)
• The position of a link is calculated by concatenating rotations and offsets
[Figure: a kinematic chain with link offsets O0, O1, O2, joint rotations R0, R1 and end point P2]
Forward Kinematics (FK)
• First choose a position on a link (the end point)
• This position is rotated by the rotation of the joint above the link
• Translate by the length (offset) of the parent link and then rotate by its joint; go up to its parent and iterate until you reach the root
• Finally, rotate and translate by the root position
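The steps above can be sketched for a planar (2D) chain; the function name, the tuple representation and the restriction to one rotation angle per joint are illustrative simplifications:

```python
import math

def fk_positions(angles, lengths, root=(0.0, 0.0), root_angle=0.0):
    """Planar forward kinematics: walk down the chain from the root,
    accumulating each joint rotation and offsetting along each bone."""
    x, y = root
    theta = root_angle
    positions = [(x, y)]           # root position first
    for angle, length in zip(angles, lengths):
        theta += angle             # concatenate this joint's rotation
        x += length * math.cos(theta)
        y += length * math.sin(theta)
        positions.append((x, y))   # end point of this link
    return positions
```

A two-link chain with both joints at zero lies along the x axis; rotating the first joint by 90° swings the whole chain upwards.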
Forward Kinematics (FK)
• Simple and efficient
• Comes for free in a scene graph architecture
• Difficult to animate with,
– often we want to specify the positions of a character’s hands, not the rotations of its joints
• The Inverse Kinematics problem:
– Calculating the joint rotations needed to put a hand (or other body part) in a given position
Inverse Kinematics
• A number of ways of doing it
• Matrix methods (hard)
• Cyclic Coordinate Descent (CCD)
– A geometric method (secretly matrices underneath)
[Figure: a kinematic chain with joint rotations R0, R1, link offsets O1, O2 and target position Pt]
Inverse Kinematics
• Start with the final link
Inverse Kinematics
• Rotate it towards the target
Inverse Kinematics
• Then go to the next link up
Inverse Kinematics
• Rotate it so that the end effector points towards the target
Inverse Kinematics
• And the next…
Inverse Kinematics
• And iterate until you reach the target
Inverse Kinematics
• IK is a very powerful tool
• However, it’s computationally intensive
• IK is generally used in animation tools and for applying specific constraints
• FK is used for the majority of real-time animation systems
Minimal Tracking for IK in VR
• Badler et al. showed a minimal tracking configuration for using IK to reproduce the movements of a human in VR
– www.cis.upenn.edu/~hollick/presence/presence.html
• It was shown that 4 sensors are sufficient to reasonably reconstruct the approximate body configuration in real time
Representation
• Layered representation
– The skeleton structure forms a scene graph
– The scene graph embodies a set of joints
– A mesh overlays the scene graph
– As the skeletal structure moves, the mesh must deform appropriately (otherwise there are holes)
MPEG4 example: http://ligwww.epfl.ch/~maurel/Thesis98.html
Facial Animation
• Faces don’t have a common underlying structure like a skeleton
• Faces are generally animated as meshes of vertices
• Animate by moving individual vertices
Morph Targets
• Have a number of facial expressions, each represented by a separate mesh
• Each of these meshes must have the same number of vertices as the original mesh but with different positions
• Build new facial expressions out of these base expressions (called Morph Targets)
Morph Targets
Morph Targets
• Smoothly blend between targets
• Give each target a weight between 0 and 1
• Do a weighted sum of the vertices in all the targets to get the output mesh
v_i = Σ_t w_t · v_{t,i}   with   Σ_t w_t = 1,   t ∈ morph targets
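The weighted sum can be written directly; this sketch assumes each morph target is just a list of (x, y, z) vertex tuples, all of the same length:

```python
def blend_morph_targets(targets, weights):
    """Weighted sum of morph-target vertex positions.
    `targets` is a list of meshes (each a list of (x, y, z) vertices,
    all with the same vertex count); `weights` should sum to 1."""
    n = len(targets[0])
    assert all(len(t) == n for t in targets), "targets must match in size"
    blended = []
    for i in range(n):  # blend each vertex independently
        x = sum(w * t[i][0] for t, w in zip(targets, weights))
        y = sum(w * t[i][1] for t, w in zip(targets, weights))
        z = sum(w * t[i][2] for t, w in zip(targets, weights))
        blended.append((x, y, z))
    return blended
```

With weights 0.5 and 0.5, each output vertex sits at the midpoint of the two corresponding target vertices.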
Using Morph Targets
• Morph targets are a good low-level animation technique
• We also need ways of choosing morph targets
• Could let the animator choose (nothing wrong with that)
• But there are also more principled ways
Summary
• Virtual human avatars are necessary to represent people to themselves and to others in shared VEs.
• Virtual human agents are necessary to represent social situations.
• VHs are typically represented as ‘skinned’ skeletal scene graphs, representing sets of joints.
• Forward kinematics determines the overall configuration given joint angles; inverse kinematics determines joint angles from requirements on end-effectors.
• Representations typically need to be a mixture based on tracking data and inferred state.
• Morph targets are a method of mesh deformation often used for facial animation.
• Later we will go on to consider more sophisticated models of behaviour determination, and also social intelligence.
Believable Behaviour
• For agents the behaviour is completely programmed.
• For avatars the behaviour is ideally completely determined by the behaviour of the real tracked human.
• In practice the human cannot be fully tracked: typically in VR only head and one hand movements are tracked!
Controlling/Inferring Behaviour
• In practice some elements of avatar behaviour are programmed not tracked
• E.g., breathing and eye blinking at the least
• Ideally can use information about ‘mood’ to determine aspects of avatar behaviour.
• Impossible to track every aspect of the human’s behaviour so much must be inferred and programmed.
• Real avatars are mixed.
[Diagram: a spectrum of behaviour control, from Avatar (tracking) through a mixture of both to Agent (programming)]
Behaviour
• Autonomously deciding what action to take at a given time
• Not necessary for film but vital for real time interaction
• At the interface between Graphics, AI and A-Life
• The subject of the rest of this talk
Behaviour Outline
• An overview of early (landmark) behavioural simulation techniques
• An overview of social behaviour simulation taking in– Control algorithms– Psychological theories– How social behaviour is expressed in animation
Example
Craig Reynolds - flocking
• The first behavioural simulation
• Simulates the behaviour of flocks of birds (‘boids’), schools of fish or herds of animals
• Extensively used in films and other applications
• “Flocks, Herds, and Schools: A Distributed Behavioral Model”, Craig Reynolds, SIGGRAPH 1987
Craig Reynolds - flocking
• Simulated the behaviour of flocking birds with three rules:
– Separation
• avoid crowding local flockmates
– Alignment
• align heading to the average of local flockmates
– Cohesion
• head towards the average position of local flockmates
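The three rules can be sketched for a single boid; the rule weights and the dict-based boid representation are illustrative choices, not Reynolds’ original formulation:

```python
def boid_acceleration(boid, neighbours, sep=1.5, ali=1.0, coh=1.0):
    """One steering step for a 2D boid given its local flockmates.
    Each boid is a dict with 'pos' and 'vel' (x, y) tuples."""
    n = len(neighbours)
    if n == 0:
        return (0.0, 0.0)
    # Separation: steer away from the average of the neighbours
    sx = sum(boid['pos'][0] - b['pos'][0] for b in neighbours) / n
    sy = sum(boid['pos'][1] - b['pos'][1] for b in neighbours) / n
    # Alignment: steer towards the average neighbour velocity
    ax = sum(b['vel'][0] for b in neighbours) / n - boid['vel'][0]
    ay = sum(b['vel'][1] for b in neighbours) / n - boid['vel'][1]
    # Cohesion: steer towards the average neighbour position
    cx = sum(b['pos'][0] for b in neighbours) / n - boid['pos'][0]
    cy = sum(b['pos'][1] for b in neighbours) / n - boid['pos'][1]
    return (sep * sx + ali * ax + coh * cx,
            sep * sy + ali * ay + coh * cy)
```

Each boid applies this locally every frame; flocking emerges with no global controller, which is what made the model so influential.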
Animals
• Reynolds’ work led to an exploration of animal behaviour
• Tu and Terzopoulos simulated fish
• Xiaoyuan Tu and Demetri Terzopoulos, “Artificial Fishes: Physics, Locomotion, Perception, Behavior”, SIGGRAPH 1994
Fish
• Homeostatic drives
– A drive to maintain a balance of a certain behavioural feature
– Drives increase when unsatisfied, decrease when satisfied
– Hunger, libido
• Other drives
– Fear: depends on the distance to a predator
Similar work: Dogs
• Bruce Blumberg at the MIT Media Lab
• Silas: a simulated dog with homeostatic drives
• Arbitration between drives
• Multi-level control
• “Multi-Level Direction of Autonomous Creatures for Real-Time Virtual Environments”, Blumberg and Galyean, SIGGRAPH 1995
Other aspects of simulation• Perception• Path finding• Learning• Social (Pack) behaviour
The Sims
• Hugely popular game based on people simulation
• All about homeostatic drives
– Hunger, social, tiredness
– Drives go up while not satisfied
– Objects have a surrounding field that attracts Sims based on the drives they satisfy (e.g. fridges satisfy hunger)
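A homeostatic-drive sketch in the spirit of the fish and Sims examples: drives rise while unsatisfied, and objects “advertise” which drives they satisfy. The scoring rule, names and object format are illustrative assumptions, not the actual game’s logic:

```python
def tick(drives, rate=0.1):
    """Unsatisfied drives rise towards 1.0 each simulation step."""
    return {d: min(1.0, v + rate) for d, v in drives.items()}

def choose_object(drives, objects):
    """Pick the object whose advertised satisfactions best match the
    current drive levels (higher drive -> stronger attraction)."""
    def score(obj):
        return sum(drives.get(d, 0.0) * v
                   for d, v in obj['satisfies'].items())
    return max(objects, key=score)
```

A very hungry, mildly lonely character is pulled towards the fridge rather than the phone, because the hunger drive dominates the score.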
Social Intelligence
• These techniques work well as far as they go but do not model the complexities of human social interaction
• Vital if we are to have interesting interactions with autonomous characters
• Also useful for making our interactions via avatars closer to real interactions
Designing virtual humans
• GOAL: represent the person in the VE consistently
– With perceived realism, believability …
• Induce responses to the virtual human
– Inducing realistic/lifelike responses
• Enhance the collaborative experience
• Facilitate social communication and interpersonal relationships
What responses do you get?
• David
• Not very comfortable with public speaking
• Asked to speak about his favourite subject: cables
• Behaviours triggered at appropriate intervals
• Look at the virtual humans
Pertaub, D.-P., Slater, M., and Barker, C. (2002). An experiment on public speaking anxiety in response to three different types of virtual audience. Presence: Teleoperators and Virtual Environments, 11(1): 68-78
The Fear of public speaking
• The user was asked to give a presentation three times
– Positive, Negative and Mixed
• Positive - agents smiled, leaned forward, faced the user, maintained gaze, clapped hands, etc.
• Negative - agents yawned, slumped forward, put feet on the table, avoided eye contact, and finally walked out
• Mixed - agents started off with largely negative responses and gradually turned positive
Realistic responses in VE ?
• Individuals’ self-rated performance was positively correlated with the perceived good mood of the agents
• Evidence of a negative response, especially strong with the negatively inclined audience
– Sweating and stammering
– Vocal protests at the agent behaviours
• Virtual humans with minimal behavioural-visual fidelity can elicit significant user responses
• Holy grail: Virtual humans with high visual fidelity that mimic real-life context-appropriate behaviours
Designing behaviour
• Creating apparent social intelligence is challenging
• We have to present behavioural cues to depict a perceived (and plausible) psychological state
– Or the near-truth internal state of the person being represented
• Human behaviour is a very intricate phenomenon
– Dependent on many factors
• Extremely difficult to replicate, especially if the design process is approached in an ad-hoc manner
– For instance: in social interactions within a VE, the more visually realistic the virtual human, the more naturalistic users expect it to act
Inferring Behaviour: Animation imitating life
• Emotional models
– Controllers of behaviour in accordance with internal states
• Personality models– Creating unique identities
• Conversation-feedback models– Controlling behaviour
• Social models– Interpersonal relationships and attitudes
• ???
Lasseter, J. (1987). Principles of traditional animation applied to 3d computer animation. ACM SIGGRAPH Computer Graphics, 21(4):35–44.
Emotion
• Integral to expressing the self, understanding others effectively and accomplishing social goals (both mutual and personal)
• Emotions occur as a combination of the perception of environmental stimuli, neural/hormonal responses to these perceptions (feelings), and the subjective labelling of these feelings
– Researchers argue that creating emotion is essential to creating intelligence and reasoning
– Cartoonists maintain that emotional expressions are necessary substrates for producing plausible characters
• In VEs, emotions can be useful as a control mechanism for behaviour and changes in perceived states
Goleman, D. (1996). Emotional Intelligence. Bloomsbury Publishing Plc.
Minsky, M. (1988). The Society of Mind. Touchstone.
Picard, R. W. (1997). Affective Computing. MIT Press.
Emotion research: riddled with issues
• Much confusion and uncertainty about concepts and definitions
– Across a variety of disciplines, including philosophers (Plato, 1945), neuroscientists (Damasio, 1995), anthropologists (Darwin, 1872), and social psychologists (Brewer and Hewstone, 2004; Ekman and Davidson, 1994)
• General lack of agreement on what constitutes an emotion
– Anger and sadness are accepted as emotions, but there is less agreement on moods (irritability, depression), long-term states (love), dispositions (benevolence), motivational feelings (hunger), cognitive feelings (confusion, déjà vu) and calm states (satisfaction)
• Emotional states are processes that unfold over time and involve a variety of components
• The main underlying question: are emotions innate, learnt or both?
The existence of Basic/Pure emotions
• Empirical evidence exists
– Universality of verbal labels
– Facial expression patterns
– Antecedent eliciting situations
– Distinctive physiological response patterns for anger, fear, disgust
• Each model proposes its own set of basic emotions
• Six Ekman emotions associated with facial expressions:
– happiness, surprise, disgust, fear, sadness and anger
– Imbalance between +ve/–ve labels
Ekman, P. (1982). Emotion in the Human Face. Cambridge University Press, New York.
http://mambo.ucsc.edu/psl/ekman.html
Another basic emotion model
• Plutchik’s model contains four pairs of opposites
• Allows for blends and emotional intensities, but
– not an overlay of opposite emotions, nor
– does it allow for the cognitive elements
• In Plutchik’s view, all emotions are a combination of these basic emotions
Plutchik, R. (1980). A general psychoevolutionary theory of emotion, pages 3–33. Emotion: Theory, research, and experience: Vol. 1. Theories of emotion. Academic Press.
More complex models: OCC
• The OCC model is based on cognitive appraisals and provides a wider variety of containers
– 22 groups in the original version (1988); the simpler version (2003) has
– 6 positive categories: joy, hope, relief, pride, gratitude and love;
– 7 negative categories: distress, fear, disappointment, anger and hate
• The OCC model suggests the emotions people experience depend on what they focus on in a situation and how they appraise it
– The focus might be on events, people, or objects
• The OCC model is used widely to generate emotions in virtual humans
Ortony, A., Clore, G., and Collins, A. (1988). The Cognitive Structure of Emotions. Cambridge University Press.
Ortony, A. (2003). On making believable emotional agents believable. In Trappl, R. et al., eds.: Emotions in Humans and Artifacts. MIT Press, Cambridge, USA.
Personality
• Personality represents the unique characteristics of an individual
• Whereas emotions are transient, personality remains constant
• Personality is not specific to particular events
• An emotion is a brief, focused change in personality
• Personality models in virtual humans help create a sense of uniqueness and act as a long-term controller
The FIVE factor model
• Five dimensions of personality
• A normal distribution of scores along these dimensions
• Scores vary continuously, with most people falling between the extremes
• Preferences indicated by strength of score
• An emphasis on individual personality traits which are stable through life
• A model based on experience, not theory
• My personality relative to other females in the UK between the ages of 21 and 40
– Openness (L) 0
– Conscientiousness (A) 40
– Extraversion (H) 78
– Agreeableness (L) 1
– Neuroticism (L) 24
• So I am practical, reasonably reliable, very sociable, uncompromising and very composed
McCrae, R. R. and John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60:175–215.
The PAD model: emotion and personality
• Mehrabian’s PAD model has 3 inter-related dimensions
– pleasure, arousal and dominance
• Allows for links between personality and emotions
• Personality would be controlled by how prone you are to experience each dimension
• Low pleasure, high arousal and high dominance would be anger
• Low pleasure, high arousal but low dominance would be fear
Mehrabian, A. (1980). Basic dimensions for a general psychological theory: Implications for personality, social, environmental, and developmental studies. Oelgeschlager, Gunn & Hain.
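A nearest-prototype sketch of reading an emotion label off a point in PAD space. The prototype coordinates are illustrative placements of the slide’s examples (e.g. anger as low pleasure, high arousal, high dominance), not values taken from Mehrabian:

```python
def pad_to_emotion(pleasure, arousal, dominance):
    """Return the labelled emotion whose PAD prototype is nearest
    (by squared Euclidean distance) to the given PAD point."""
    prototypes = {
        'anger':   (-0.5,  0.6,  0.6),   # low P, high A, high D
        'fear':    (-0.6,  0.6, -0.6),   # low P, high A, low D
        'joy':     ( 0.7,  0.5,  0.3),   # illustrative placement
        'boredom': (-0.4, -0.6, -0.2),   # illustrative placement
    }
    point = (pleasure, arousal, dominance)
    def dist2(proto):
        return sum((a - b) ** 2 for a, b in zip(point, proto))
    return min(prototypes, key=lambda e: dist2(prototypes[e]))
```

The same mechanism can be run in reverse for personality: a character prone to high arousal and low dominance will drift towards the fear region more often.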
Relationship and Attitude
• How we express our relationship or our feelings about other people depends on a number of interpersonal attitudes
– Status (dominance/submission)
– Affiliation (liking/closeness)
Gillies and Ballin “A Model of Interpersonal Attitude and Posture Generation” Intelligent Virtual Agents 2003
Conversation feedback
• Face-to-face communication channels can be divided into two distinct but interrelated categories:
– verbal and nonverbal
• Nonverbal behavioural changes give a tone to the communication, accent it and sometimes override the verbal part
• Recreating the non-verbal aspects of communication in CVEs tends to be problematic due to their many functions
Conversation
• Body language (non-verbal communication) facilitates conversation
– Gaze gives feedback about a listener’s attention and helps decide who should talk
– Gesture accompanies and adds to speech
• Multi-modal conversation integrates speech with non-verbal communication
Cassell, J., Bickmore, T. W., Billinghurst, M., Campbell, L., Chang, K., Vilhjálmsson, H. H., and Yan, H. (1999). Embodiment in conversational interfaces: Rea. In Proceedings of SIGCHI, pages 520–527.
Cassell, J., Vilhjálmsson, H. H., and Bickmore, T. W. (2001). BEAT: the Behavior Expression Animation Toolkit. In SIGGRAPH, pages 477–486.
Communicative functions of non-verbal behaviours
• Emblems: used intentionally and consciously when verbal communication is not possible
• Illustrators: tied to speech patterns, often to aid the build-up of rapport
– or used when the individual is having trouble finding the words
– Culture dependent
• Affect displays: used with less awareness and intentionality to display emotional and psychological state
• Regulators: maintain the rhythm and flow of the conversation
• Adaptors: involuntarily provide insight into an individual’s attitude and anxiety level
Ekman, P. and Friesen, W. V. (1969). The repertoire of nonverbal behaviour: categories, origins, usage and coding. Semiotica, 1:49–98.
Categories of behavioural cues
• Vocal properties
– Tone, pitch, loudness…
• Facial expressions
– The most studied behavioural cue, due to its role in communication
• Gaze behaviour
– Probably the most intense social signaller
• Kinesics: posture and motion
– Numerous gestures, depending for instance on culture
• Proxemics
– Culture and gender dependent
Argyle, M. (1998). Bodily Communication. Methuen & Co Ltd, second edition.
Facial expression
• In reality, some 20,000 facial expressions exist
• Normally animated by blending “Morph Targets”
• Different granularities of facial expression
– Facial action parameters (the most basic units)
– Basic emotions
– Phonemes (mouth shapes for lip-sync)
– Principal component analysis
Gaze
• Normally animated procedurally: just rotating the eyes and head
• Very important in conversation and social communication
• Also shows attention and liking
Argyle, M., Ingham, R., Alkema, F., and McCallin, M. (1973). The different functions of gaze. Semiotica, 7:10–32.
Lee, S. P., Badler, N. I., and Badler, J. (2002). Eyes alive. In SIGGRAPH, pages 637–644.
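Procedural gaze can be sketched as splitting the required rotation between head and eyes: the eyes cover most of the turn while the head follows only partially, which reads as more natural than snapping the whole head. The 2D yaw-only treatment and the 0.3 head share are illustrative assumptions:

```python
import math

def gaze_angles(head_pos, target, head_weight=0.3):
    """Split the yaw needed to look at `target` between head and eyes.
    Returns (head_yaw, eye_yaw); the two always sum to the full yaw."""
    dx = target[0] - head_pos[0]
    dy = target[1] - head_pos[1]
    total = math.atan2(dy, dx)        # yaw needed to face the target
    head = head_weight * total        # head covers part of the rotation
    eyes = total - head               # eyes cover the remainder
    return head, eyes
```

In a full system these yaws would be eased in over a few frames and combined with saccade timing, as in the Eyes Alive model cited above.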
Gesture
• Normally animated by choosing from a library of gestures
• Very closely associated with speech
– Also back-channel gestures by listeners (e.g. head nod)
• Different types of gesture– E.g. beat, iconic
• Again see Cassell’s work referenced earlier
Posture
• Over 1000 stable postures have been observed
• Normally animated by choosing from (or blending between) a library of postures
• Associated with attitude and emotion
• Associated also with interpersonal attitude
Coulson, M. (2004). Attributing emotion to static body postures: Recognition accuracy, confusions, and viewpoint dependence. Journal of Nonverbal Behavior, 28(2):117–139.
Perceived and intended expression
• Sets of cues are used to express and perceive internal states
• The first issue revolves around whether a distinct set of behavioural cues exists for a specific internal state caused by a specific stimulus
• The other problem is mapping the common attributes presented in the behaviours expressed for the same internal state caused by different stimuli
Designing Virtual Humans: Appearance vs. Behaviour
Vinayagamoorthy, V., Garau, M., Steed, A., and Slater, M. (2004b). An eye gaze model for dyadic interaction in an immersive virtual environment: Practice and experience. Computer Graphics Forum, 23(1):1–11.
Designing Virtual Humans: Appearance vs. Behaviour
• Sparse environment (abandoned building)
– Minimise visual distraction
– One genderless cartoon-form character
– Two gender-matched higher-fidelity characters
• Behaviour
– Common limb animations and condition-dependent gaze animations
– Individuals listening in a conversation look at their conversational partner for longer periods of time and more often than when they are talking
• Negotiation task to avoid a scandal (10 minutes)
Designing Virtual Humans: Appearance vs. Behaviour
Experimental design (App. = appearance, Beh. = behaviour); each cell ran 3 male pairs and 3 female pairs:

Beh. \ App.      Higher-fidelity    Cartoon-form
Inferred* gaze   3 ♂ / 3 ♀ pairs    3 ♂ / 3 ♀ pairs
Random gaze      3 ♂ / 3 ♀ pairs    3 ♂ / 3 ♀ pairs
Garau, M., Slater, M., Vinayagamoorthy, V., Brogni, A., Steed, A., and Sasse, A. M. (2003). The impact of avatar realism and eye gaze control on the perceived quality of communication in a shared immersive virtual environment. In Proceedings of SIGCHI, pages 529–536.
Responses (App. = appearance, Beh. = behaviour):

Beh. \ App.      Higher-fidelity    Cartoon-form
Inferred* gaze   High               Low
Random gaze      Low                High
Designing Virtual Humans: Appearance vs. Behaviour
• In each of the responses, the higher-fidelity avatar received a higher response with the inferred-gaze model
• And a low response with the random-gaze model
– Important to note that the differences between the two gaze models were very subtle
• Saccadic velocity and inter-saccadic intervals (means)
• Analysis demonstrated a very strong interaction effect between the type of avatar and the fidelity of the gaze model
– The higher-fidelity avatar did not by itself outperform the cartoon-form avatar
– Similar hypotheses exist in the field of robotics
Measuring Success
• So the careful design of behaviour is important, but there are caveats
• Success of a VE is measured in terms of the extent to which sensory data projected within the virtual environment replaces the sensory data from the physical world
– quantified by rating the individuals’ sense of presence during the experience
• For virtual humans, success is taken as the extent to which participants act and respond to the agents as if they were real
– Subjective: questionnaires, interviews
– Objective: physiological, behavioural
Subjective means
• Traditional methods: questionnaires and interviews
– Various questionnaires exist
– http://www.presence-research.org
• Criticised due to various dependencies:
– the individual’s accurate post-hoc recall,
– processing and rationalisation of their experience in the VE, and
– varying interpretations of the word ‘presence’
Objective: Responses to stimuli
• Numerous possible objective measures
– Subconscious responses
• Threat-related facial cues provoke individuals to use different viewing strategies
– Neural responses
• Different areas of the brain are activated during +ve, –ve and neutral situations
– Psychological responses
• Stress and anxiety in response to threat
– Physiological responses
• Galvanic skin response, heart rate variability, electrocardiograms, electromyography, respiratory activity
– Behavioural responses
• Fight or flight (based on cognitive appraisal)
• Responses vary based on cognitive factors, personality, emotional state, gender etc.
– How do we interpret the data and results?
Conclusion
• Virtual human agents are necessary to represent social situations
• Social intelligence is rather difficult to capture
• Emotional, personality and interpersonal models help
• The design of behaviours should be implemented with consideration of many other factors
• Current research focuses on quantifying the successful creation of virtual humans using objective measures