Animating Virtual Humans in Intelligent Multimedia Storytelling
Minhua Eunice Ma and Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Engineering
University of Ulster, Magee, Derry, Northern Ireland
PGNet 2005
Liverpool, 28 June 2005
Outline
State-of-the-art virtual human animation standards
  VRML/X3D & MPEG-4 for object modelling
  H-Anim & MPEG-4 SNHC for humanoid modelling
  VHML & STEP for human animation modelling
  Natural language to 3D animation
Language visualisation (animation) in intelligent multimodal storytelling system, CONFUCIUS
  Humanoid animation in CONFUCIUS
  Multiple animation channels
  Space sites of virtual humans
  Virtual object manipulation
Conclusion & future work
Four levels of virtual human representation
[Figure: the four levels, running from low level animation to high level animation]
  Level 1, 3D object modelling: VRML (X3D); MPEG-4
  Level 2, 3D human modelling: H-Anim; MPEG-4 SNHC
  Level 3, human animation modelling: VHML (BAML), XML-based; STEP, script-based
  Level 4, natural language to animation: CONFUCIUS; AnimNL
Current virtual human representation languages can be classified into four groups according to their level of abstraction, ranging from 3D geometry modelling to language animation.
Level 1: 3D object modelling
VRML (Virtual Reality Modelling Language) is a hierarchical scene description language that defines the geometry and behaviour of a 3D scene. X3D is the successor to VRML. MPEG-4 uses BIFS (Binary Format for Scenes) for real-time streaming. BIFS borrows many concepts from VRML. BIFS and VRML can be seen as different representations of the same data.
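To make "hierarchical scene description" concrete, here is a minimal Python sketch that serialises a small scene graph into VRML97 text using nested Transform nodes. The `Node` class, the object names and the sizes are hypothetical and chosen only for illustration; only the VRML97 node syntax (`Transform`, `Shape`, `Box`) comes from the standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """Hypothetical scene-graph node: a named Transform with optional box geometry."""
    name: str
    translation: Vec3 = (0.0, 0.0, 0.0)
    box_size: Optional[Vec3] = None
    children: List["Node"] = field(default_factory=list)


def to_vrml(node: Node, indent: int = 0) -> List[str]:
    """Emit one Transform per node; child nodes are nested inside 'children'."""
    pad = "  " * indent
    x, y, z = node.translation
    lines = [
        f"{pad}DEF {node.name} Transform {{",
        f"{pad}  translation {x} {y} {z}",
        f"{pad}  children [",
    ]
    if node.box_size is not None:
        sx, sy, sz = node.box_size
        lines.append(f"{pad}    Shape {{ geometry Box {{ size {sx} {sy} {sz} }} }}")
    for child in node.children:
        lines.extend(to_vrml(child, indent + 2))
    lines += [f"{pad}  ]", f"{pad}}}"]
    return lines


def scene_to_vrml(root: Node) -> str:
    """Prefix the VRML97 file header and join the node lines."""
    return "\n".join(["#VRML V2.0 utf8"] + to_vrml(root))
```

Because the cup is a child of the table, moving the table's Transform would move the cup with it, which is the practical benefit of the hierarchy.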
Level 2: 3D human modelling
H-Anim is a standard VRML97 representation for humanoids. It defines standard human joint articulation, segment dimensions, and sites for "end effectors" and attachment points for clothing. MPEG-4 SNHC (Synthetic/Natural Hybrid Coding) incorporates H-Anim and provides an efficient way to animate virtual humans, together with tools for efficient compression of the animation parameters associated with the H-Anim human model.
H-Anim joint-segment hierarchy
An H-Anim file contains a joint-segment hierarchy.
Each joint node may contain other joint nodes and a segment node that describes the body part associated with the joint.
Each segment is a normal VRML transform node describing the body part's geometry and texture.
H-Anim humanoids can be animated using keyframing, inverse kinematics, & other animation techniques.
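The joint-segment hierarchy described above can be sketched as a small tree in Python. This is an illustration under stated assumptions, not an H-Anim parser: the `Joint` and `Segment` classes and the `build_fragment` helper are hypothetical, and only a fragment of a skeleton is built. The joint names sacroiliac, l_hip and r_hip appear elsewhere in this deck; the remaining joint and segment names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Segment:
    """Body part attached to a joint; geometry and texture would live here."""
    name: str


@dataclass
class Joint:
    """A joint may hold one segment and any number of child joints."""
    name: str
    segment: Optional[Segment] = None
    children: List["Joint"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Joint"]]:
        """Depth-first traversal, parent before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def find_joint(root: Joint, name: str) -> Optional[Joint]:
    """Return the first joint with the given name, or None."""
    for _, joint in root.walk():
        if joint.name == name:
            return joint
    return None


def build_fragment() -> Joint:
    """Build a small, illustrative fragment of a lower-body skeleton."""
    l_knee = Joint("l_knee", Segment("l_calf"))
    l_hip = Joint("l_hip", Segment("l_thigh"), [l_knee])
    r_hip = Joint("r_hip", Segment("r_thigh"))
    return Joint("sacroiliac", Segment("pelvis"), [l_hip, r_hip])
```

Rotating a joint in such a tree implicitly moves every joint and segment below it, which is why keyframing and inverse kinematics both operate on joints rather than on segments.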
H-Anim models on the Web
Virtual human models    Authors             URL
Nancy                   Cindy Ballreich     [1]
Baxter, Nana            Christian Babski    [2]
Y.T., Hiro              Matt Beitler        [3]
Dilbert                 Matt Beitler        [3]
Max                     Matt Beitler        [3]
Jake                    Matt Beitler        [3]
Dork                    Michael Miller      [4]
URLs:
[1] http://www.ballreich.net/vrml/h-anim/nancy_h-anim.wrl
[2] http://ligwww.epfl.ch/~babski/StandardBody
[3] http://www.cis.upenn.edu/~beitler/H-Anim/Models/H-Anim1.1/
[4] http://students.cs.tamu.edu/mmiller/hanim/proto/dork-proto.wrl
Level 3: Human animation modelling
VHML (Virtual Human Mark-up Language) is an XML-based language which provides an intuitive way to define virtual human animation. It is composed of several sub-languages: DMML, FAML, BAML, SML, and EML.
STEP is a scripting language for human actions. It has a Prolog-like syntax, which makes it compatible with most standard logic programming languages.
VHML & STEP examples
<left-calf-flex amount="medium"><right-calf-flex amount="medium">
<left-arm-front amount="medium"><right-arm-front amount="medium">
Standing on my knees I beg you pardon</right-arm-front></left-arm-front>
</right-calf-flex></left-calf-flex>
A. A VHML example
script(walk_forward_step(Agent), ActionList) :-
    ActionList = [parallel([script_action(walk_pose(Agent)),
                            move(Agent, front, fast)])].
B. A STEP example
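As a rough illustration of how the VHML example could be consumed, the sketch below parses it with Python's standard `xml.etree.ElementTree` and separates the body-animation tags from the text to be spoken. The function names are hypothetical. Reading the nesting as "all four movements apply while the enclosed text is spoken" is an interpretation made for this sketch, not something the slide states.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# The VHML example from the slide, with plain ASCII quotes.
VHML_SNIPPET = """
<left-calf-flex amount="medium"><right-calf-flex amount="medium">
<left-arm-front amount="medium"><right-arm-front amount="medium">
Standing on my knees I beg you pardon</right-arm-front></left-arm-front>
</right-calf-flex></left-calf-flex>
"""


def body_actions(vhml: str) -> List[Tuple[str, Optional[str]]]:
    """List (tag, amount) for every element, outermost first."""
    root = ET.fromstring(vhml.strip())
    return [(element.tag, element.get("amount")) for element in root.iter()]


def spoken_text(vhml: str) -> str:
    """Collect the enclosed text and normalise whitespace."""
    root = ET.fromstring(vhml.strip())
    return " ".join("".join(root.itertext()).split())
```

An animation back end could then map each tag to joint rotations and hand the text to a speech synthesiser.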
Level 4: Natural language to animation
High level animation applications convert natural language to virtual human animation. Little research on virtual human animation focuses on this level.
The AnimNL project aims to enable people to use natural language instructions to tell virtual humans what to do.
CONFUCIUS also deals with language animation.
Research on this level will lead to powerful web-based applications.
Architecture of CONFUCIUS
[Figure: CONFUCIUS architecture diagram. Labels recoverable from the diagram:]
  Natural language sentences
  Knowledge base
  Language knowledge (WordNet, LCS database, FDG parser)
  Visual/audio knowledge (3D models & animations, audio encapsulated in graphic models)
  3D authoring tools, existing 3D models & virtual human models
  Surface transformer
  Media allocator
  Natural Language Processing
  Text-to-Speech
  Animation engine (with nonspeech audio)
  Synchronizing
  Presentation agent (Merlin the Narrator)
  Narration integration
  3D virtual world with speech in VRML
  Multimodal presentation
  Link labels: mapping; semantic representation
Humanoid animation in CONFUCIUS
[Figure: flowchart. Boxes and annotations recoverable from the diagram:]
  Semantic Representation
  Decision (Y/N): match basic motions in library? That is, whether the event predicate matches basic human motions in the animation library
  Motion instantiation: either loading a precreated keyframe animation or providing an animation specification for animation generation
  Environment placement: apply spatial info & place OBJ/HUMAN into a specified environment
  Camera controller: automatic camera placement & apply cinematic rules
  Animation controller
  User interaction
  VRML file of the virtual story world
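The flow on this slide can be sketched in Python as follows. This is a loose illustration, not the CONFUCIUS implementation: the `Event` class, the library contents, the file names and the dictionary layout are all hypothetical. It also assumes that the "Y" branch loads a precreated keyframe animation and the "N" branch falls back to an animation specification, which the slide suggests but does not state outright.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    """Hypothetical semantic representation of one story event."""
    predicate: str
    agent: str
    location: Optional[str] = None


# Hypothetical animation library: event predicate -> precreated keyframe file.
ANIMATION_LIBRARY: Dict[str, str] = {
    "walk": "walk.wrl",
    "jump": "jump.wrl",
    "sit": "sit.wrl",
}


def instantiate_motion(event: Event) -> Dict[str, str]:
    """Motion instantiation: reuse a keyframe animation or emit a specification."""
    if event.predicate in ANIMATION_LIBRARY:  # assumed "Y" branch
        return {"kind": "keyframe", "source": ANIMATION_LIBRARY[event.predicate]}
    return {"kind": "specification", "predicate": event.predicate}  # assumed "N" branch


def place_in_environment(event: Event, motion: Dict[str, str]) -> Dict[str, str]:
    """Environment placement: attach the agent and its location to the motion."""
    placed = dict(motion)
    placed["agent"] = event.agent
    placed["location"] = event.location or "default"
    return placed


def build_story_world(event: Event) -> Dict[str, str]:
    """Run the stages in order; camera control is reduced to a placeholder."""
    placed = place_in_environment(event, instantiate_motion(event))
    placed["camera"] = "auto"  # stands in for automatic camera placement
    return placed
```

In a fuller system the returned dictionary would be replaced by the VRML file of the virtual story world.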
Multiple animation channels
Level 3 human animation modelling languages (VHML, STEP) provide facilities for specifying both sequential and parallel temporal relations
Simultaneous animations raise a resource-contention problem, similar to the Dining Philosophers problem, for higher level animation that uses predefined animation data: multiple animations may request access to the same body parts at the same time
Multiple animation channels allow characters to run multiple animations at the same time, e.g. walking with the lower body while waving with the upper body
To avoid conflicts, a channel is often disabled while a specific animation is playing on another channel
Involved joints per animation:
Animations      sacroiliac  l_hip  r_hip  …  r_shoulder
walk            2           2      2      …  1
jump            2           2      2      …  1
wave            0           0      0      …  2
run             2           2      2      …  1
scratch head    0           0      0      …  2
sit             2           2      2      …  1
…               …           …      …      …  …
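The table can drive a simple conflict check, sketched below in Python. The slide does not define the values 0, 1 and 2. This sketch assumes 0 means the joint is not involved, 1 means it is involved but may be overridden, and 2 means it is required. That reading, the `assign_joints` function and its conflict rule are assumptions for illustration, and only the four joints shown in the table are covered.

```python
from typing import Dict, List, Tuple

JOINTS: Tuple[str, ...] = ("sacroiliac", "l_hip", "r_hip", "r_shoulder")

# Values copied from the table; their meaning (0/1/2) is an assumption.
INVOLVEMENT: Dict[str, Tuple[int, ...]] = {
    "walk": (2, 2, 2, 1),
    "jump": (2, 2, 2, 1),
    "wave": (0, 0, 0, 2),
    "run": (2, 2, 2, 1),
    "scratch head": (0, 0, 0, 2),
    "sit": (2, 2, 2, 1),
}


def assign_joints(animations: List[str]) -> Dict[str, str]:
    """Give each joint to the animation that needs it most.

    Raises ValueError when two animations both require (value 2) the same joint.
    """
    owners: Dict[str, str] = {}
    for index, joint in enumerate(JOINTS):
        claims = [
            (INVOLVEMENT[name][index], name)
            for name in animations
            if INVOLVEMENT[name][index] > 0
        ]
        if not claims:
            continue
        top = max(level for level, _ in claims)
        winners = [name for level, name in claims if level == top]
        if top == 2 and len(winners) > 1:
            raise ValueError(f"{winners} both require joint {joint}")
        owners[joint] = winners[0]
    return owners
```

Under these assumptions, walking and waving can run together because waving overrides the walk's optional use of r_shoulder, while walking and running cannot because both require the sacroiliac joint.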
Space sites of virtual humans
Types of virtual objects
  Small props, manipulated by hands or feet, e.g. cup, hat, ball
  Big props, sources or targets of actions, e.g. table, chair, tree
  Stage props, which have internal structure, e.g. house, restaurant, chapel
Site tags of virtual humans
  For manipulating small props: 6 sites on the hands (three sites for each hand), one site on the head (skull_tip), and one site on each foot tip
  For big prop placement: 5 sites indicating five directions around the human body: x_front, x_back, x_left, x_right, x_bottom. Big props such as a table or chairs are usually placed at these positions.
  For stage prop setting: 5 more space tags indicating more distant places: far_front, far_back, far_left, far_right, far_top. Stage props (e.g. a house) are usually located at these far sites.
[Figure: hand sites, labelled grip / pincer grip, pushing, pointing]
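The site scheme on this slide can be written down as a small lookup table, sketched here in Python. The tags skull_tip, x_front through x_bottom and far_front through far_top come from the slide. The hand and foot site names (such as `l_grip` or `r_foot_tip`) and the `PropType` enum are hypothetical, since the slide gives only their number and purpose.

```python
from enum import Enum
from typing import Dict, List


class PropType(Enum):
    SMALL = "small"
    BIG = "big"
    STAGE = "stage"


# Three sites per hand; these names are placeholders, not from the slide.
HAND_SITES: List[str] = [
    f"{side}_{use}" for side in ("l", "r") for use in ("grip", "push", "point")
]

SITES: Dict[PropType, List[str]] = {
    PropType.SMALL: HAND_SITES + ["skull_tip", "l_foot_tip", "r_foot_tip"],
    PropType.BIG: ["x_front", "x_back", "x_left", "x_right", "x_bottom"],
    PropType.STAGE: ["far_front", "far_back", "far_left", "far_right", "far_top"],
}


def candidate_sites(prop_type: PropType) -> List[str]:
    """Return the sites at which a prop of this type may be placed."""
    return list(SITES[prop_type])


def is_valid_placement(prop_type: PropType, site: str) -> bool:
    """Check that a site tag is appropriate for the given prop type."""
    return site in SITES[prop_type]
```

For instance, a hat (small prop) could be attached at skull_tip, while a table (big prop) would be placed at x_front rather than at a hand site.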
Virtual object manipulation
Two approaches to organizing the knowledge required for successful grasping
1. Store applicable objects in the animation file of an action, and use lexical knowledge of nouns to infer hypernymy relations between objects.
2. Include the manipulation hand postures and movements within the object description, alongside its intrinsic object properties. Such objects can describe in detail their functionality and their possible interactions with virtual humans.
4 stored hand postures for interacting with 3D objects
  index pointing (press a button)
  grip (hold a cup handle, a knob, a bottle)
  pincer grip (use thumb and index finger to pick up small objects)
  palm push (push a piece of furniture)
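The first approach, storing applicable objects with an action and generalising through hypernymy, can be sketched in Python as below. The hypernym table is a tiny hand-written stand-in for a lexical resource such as WordNet, and the action names, object classes and function names are all hypothetical. Only the four hand posture names come from the slide.

```python
from typing import Dict, Iterator, Optional

# Toy hypernym links (child -> parent); a stand-in for a lexical database.
HYPERNYM: Dict[str, str] = {
    "mug": "cup",
    "cup": "container",
    "bottle": "container",
    "container": "object",
    "button": "control",
    "control": "object",
    "sofa": "furniture",
    "furniture": "object",
}

# Applicable object classes stored with each action, mapped to a hand posture.
ACTION_POSTURES: Dict[str, Dict[str, str]] = {
    "hold": {"container": "grip"},
    "press": {"control": "index pointing"},
    "push": {"furniture": "palm push"},
}


def hypernym_chain(noun: str) -> Iterator[str]:
    """Yield the noun, then each of its ancestors; stops on cycles."""
    seen = set()
    current: Optional[str] = noun
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        current = HYPERNYM.get(current)


def posture_for(action: str, noun: str) -> Optional[str]:
    """Find a hand posture by climbing the noun's hypernyms."""
    applicable = ACTION_POSTURES.get(action, {})
    for concept in hypernym_chain(noun):
        if concept in applicable:
            return applicable[concept]
    return None
```

The action file never mentions "mug", yet `posture_for("hold", "mug")` still resolves because a mug is a kind of cup, which is a kind of container.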
Conclusion
Classified virtual human representation languages into four levels of abstraction
CONFUCIUS is an overall framework of intelligent multimedia storytelling, using 3D modelling/animation techniques with natural language understanding technologies to achieve higher level virtual human animation
A number of current projects are based on virtual human animation and work in various application domains. Few of them take the modern NLP approach on which a high level human animation system should be based.
The value of CONFUCIUS lies in generation of 3D animation from natural language by automating the processes of language parsing, semantic representation and animation production.
Potential application areas: computer games, animation production and direction, multimedia presentation, shared virtual worlds
Future work: coordination & synchronization of multiple virtual humans