Agents and Avatars
Ruth Aylett
Overview
Agents and Avatars
Believability v naturalism
Building the body
H-anim
Moving and sensing
IVAs
Intelligent Virtual Agents (IVAs)
  – Also:
    • Synthetic characters
    • Embodied conversational characters (ECAs)
    • Virtual humans
    • BUT do not have to be humanoid…
  – Embodied and autonomous
    • Require a control architecture - or ‘agent mind’
    • Insides v outsides: combining AI with graphics
  – ‘Inhabit’ a virtual environment
Why IVAs?
Adding life to a VE
  – Animals, birds, insects
  – Crowds
    • E.g. students in the virtual campus
  – Increase sense of presence
As a guide or teacher
  – Front-end to embedded knowledge in a VE
As a character in a story
  – Computer games: Creatures, the Sims
Interface agents
  – On web pages to make them more human
  – Personal representative
  – Sales rep
Why IVAs? - 2
As virtual actors
  – Instead of extras
  – Immersive Education Media Stage
As part of a simulation
  – Hostage release training
  – Battlefield medical training
Scientific investigation
  – Buildings evacuation
  – Testing for disability friendliness
  – Ecology and animal behaviour
Avatars
Hindu Sanskrit term - “Representation of a deity in visible form”
  – Snow Crash (Neal Stephenson, 1992) - “A graphical representation of yourself”
VE representation of the user
  – Embodiment need not be humanoid
Driven by the user
  – So NOT autonomous
  – A mapping rather than a control problem
Video avatars
Creating a video avatar (ALIVE, MIT)
“Magic video mirror” with back-projection
Video image mixed with computer-generated overlay
• Unencumbered interaction between human visitor and virtual character based on position, postures, and gestures
• Main focus on “virtual presence”
http://www.ai.univie.ac.at/oefai/agents/
Believability
Term introduced by Joe Bates of the Oz group at CMU in the 1990s
  – Combined art and technology
Very hard to define
  – A willing suspension of disbelief?
  – Seem like ‘real’ characters?
  – The ‘illusion of life’?
  – Willingness to attribute an internal state
    • Ascribe intentionality
Believability and Naturalism
Are they the same thing?
Graphics people seem to think so
  – Is Mickey a real mouse?
  – Is he a believable character?
The uncanny valley
The ‘uncanny valley’
  – Work by a Japanese researcher, Mori
  – Acceptability v naturalism
Acceptability goes negative as a character becomes ‘nearly human’
SEE: http://en.wikipedia.org/wiki/Uncanny_Valley
The problem of expectations
Humans have hard-wired expectations
  – Used to interpret inter-personal behaviour
  – Fundamental social skill
Need to invoke this very carefully
  – Acceptability of movement
    • Lip sync a key problem
  – Interactive responsiveness
    • Memory of interaction a key issue
    • Games create a real problem with instant rewind
Believability issues
Consistency of character
  – Movement, language, appearance
  – Consistency with VE also an issue
Plausible skills
  – Physical, social, knowledge
Responsiveness
  – Notice and repair errors
Building a body
Use of 3D modelling package
  – 3ds Character Studio; Poser
Creation of skeleton
  – Required for animation later
Polygonal body
  – But overall count must be low for real-time interaction - 100 polygon body
Texture to cover body
  – One body, many textures?
Capturing the user
The Dome TalkZone Booth
1. Checking the facial feature extraction
2. Enter visitor name
3. Generates avatar card
User interface
Skate Boarding Scene; Disco Dancing Scene
Avatar example
Taking the model home
Texture images x 4
Silhouette images x 4
Stage 1: Image Capture
• Generic avatar
• Polygonal structure
• 1500 polygons
Change data: deforms the mesh
Stage 2: Deform the Mesh
Apply textures to deformed mesh
Stage 3: Apply the Textures
The touch up process
What sort of motion?
Walking around
  – Jumping, running, swimming
  – Human and non-human
Picking things up
  – Agent-agent interaction
Gesture
Talking heads
  – Facial muscles, lip synch
Group movement
  – Crowds, flocks, herds
Moving the body
Animation
  – Drawing on cartoon animation
Motion capture
  – Drawing on gait analysis and then film
PBM (physically-based modelling)
  – Drawing on robotics
    • Analytically calculated
    • Learned
Animation
Time consuming for good results
  – Requires artistic skills
The main character, Woody, in Toy Story had:
  – 700 degrees of freedom (200 for face and fifty for the mouth)
  – 150 people (at Pixar) would generate 3 minutes of animation a week
Can it be standardised?
Self-animation
Poses new problems:
  – Parametrisable animation very desirable
    • E.g. a walk that could be reused with different stride length or foot height
  – Melding and combining of animations ‘on the fly’
    • Not just a morphing problem
    • ‘Starting position’?
  – Extra actions inserted by character
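
To make the idea of a parametrisable walk concrete, here is a minimal sketch of one foot's path through a walk cycle, with stride length and foot height exposed as parameters. The function name, the default values and the two-phase (stance/swing) shape are illustrative assumptions, not something taken from the slides.

```python
import math

def foot_position(phase, stride_length=0.6, foot_height=0.1):
    """Return the (forward, up) offset of one foot for a phase in [0, 1).

    Assumed cycle shape (hypothetical):
      first half  = stance: foot on the ground, sliding backwards
      second half = swing:  foot lifted and carried forwards
    """
    phase = phase % 1.0
    half = stride_length / 2.0
    if phase < 0.5:
        t = phase / 0.5                      # 0..1 through the stance phase
        return (half - stride_length * t, 0.0)
    t = (phase - 0.5) / 0.5                  # 0..1 through the swing phase
    lift = foot_height * math.sin(math.pi * t)
    return (-half + stride_length * t, lift)
```

The same function gives a short shuffle or a long high-stepping walk just by changing `stride_length` and `foot_height`, which is the kind of reuse the slide asks for.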
Standardising animation?
Animation depends on the structure of the skeleton that is being animated
Can we standardise a humanoid skeleton?
  – H-anim just such an attempt
  – Originally in the context of VRML
  – See www.h-anim.org
H-anim
A set of standard components
  – Humanoid: root of a figure
  – Joint: attached using transform specifying current state of articulation plus geometry associated with attached body part
  – Segment: specifies attributes of physical links between joints
  – Site: where semantics can be added
  – Displacer: range of movement allowed for object in which embedded
H-anim
And in VRML..
DEF hanim_l_hip Joint {
  name "l_hip"
  rotation 0 0 1 0
  center 0.122 0.888271 -0.0693267
  children [
    DEF hanim_l_thigh Segment {
      name "l_thigh"
      children Shape {
        appearance Appearance { material USE Pants_Color }
        geometry IndexedFaceSet {
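
The nesting in the VRML fragment above (a joint holding a segment and further joints) can be sketched as a small tree structure. This is a simplified illustration only: the class and field names are assumptions, not the H-anim API, and the knee centre is a made-up value. Only the `l_hip` name, centre and rotation and the `l_thigh` segment name come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Segment:
    name: str  # the body-part geometry would hang off this

@dataclass
class Joint:
    name: str
    center: Tuple[float, float, float]
    # axis (x, y, z) plus angle, mirroring the VRML "rotation" field
    rotation: Tuple[float, float, float, float] = (0.0, 0.0, 1.0, 0.0)
    segment: Optional[Segment] = None
    children: List["Joint"] = field(default_factory=list)

def joint_names(joint):
    """Depth-first list of joint names, starting at the given joint."""
    names = [joint.name]
    for child in joint.children:
        names.extend(joint_names(child))
    return names

# Illustrative child joint; its centre is NOT from the source.
l_knee = Joint("l_knee", center=(0.1, 0.5, -0.07))
l_hip = Joint("l_hip", center=(0.122, 0.888271, -0.0693267),
              segment=Segment("l_thigh"), children=[l_knee])
```

Because every animation addresses joints by standard names, a walk written against `l_hip` and `l_knee` could in principle drive any figure built on the same skeleton.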
Motion capture
Electro-magnetic or by camera
Produces the most ‘natural’ results
  – Extensively used in film for graphical extras
  – Can be used on avatars very successfully to transmit user movement to their graphical representation
Even more problems for self-animation
  – Harder to parametrise
  – Worse combining/melding problems
http://www.televirtual.com/
SIMON The Signer
Performs deaf-signing in real-time, over-laid on normal TV image
Uses Teletext subtitle signal
  – Accompanies many broadcast programmes
Linguistic translation of subtitle content into a popular sign language
Presented through on-screen, animated virtual human character
  – Runs at client end
SIMON The Signer - 2
Dictionary of signed words for look up of accompanying physical movement, facial expressions and body positions, stored as motion-capture data
Physical moves can be called in any order with interpolation to create (reasonably) smooth, natural looking signing sequences
However a big acceptance problem
A professional signer prepared for a motion capture session
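
One simple way to join stored moves "in any order" is to generate in-between poses from the end of one clip to the start of the next. The sketch below assumes a pose is a dictionary of joint name to angle in degrees and uses plain linear interpolation. Both are simplifying assumptions for illustration, and nothing here is claimed to be how SIMON actually does it.

```python
def blend_pose(pose_a, pose_b, t):
    """Linear interpolation between two poses (joint name -> angle in degrees)."""
    return {joint: (1.0 - t) * pose_a[joint] + t * pose_b[joint]
            for joint in pose_a}

def transition(clip_a, clip_b, steps=4):
    """Return clip_a, then `steps` in-between poses, then clip_b."""
    last, first = clip_a[-1], clip_b[0]
    bridge = [blend_pose(last, first, (i + 1) / (steps + 1))
              for i in range(steps)]
    return clip_a + bridge + clip_b
```

Interpolating raw angles like this can look mechanical, which fits the slide's "(reasonably) smooth" caveat.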
Physically-based modelling
Extending robot motor control
Forward kinematics
  – If you move joints so much
  – Where does the end effector go?
Inverse kinematics
  – If you want the end effector at x,y,z…
  – How much should which joints move?
Computationally demanding and often not very naturalistic
  – But the most flexible option
  – Animation blending often added…
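
The two questions above can be worked through on the smallest useful case: a two-link arm moving in a plane. This is a sketch under stated assumptions (2D rather than x,y,z, link lengths of 1.0 by default, and only one of the two possible elbow solutions returned). The function names are illustrative.

```python
import math

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Joint angles (radians) -> end-effector position (x, y)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Target position (x, y) -> joint angles (theta1, theta2) in radians.

    Uses the law of cosines; returns the solution with theta2 >= 0.
    """
    cos_theta2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if cos_theta2 < -1.0 or cos_theta2 > 1.0:
        raise ValueError("target is out of reach")
    theta2 = math.acos(cos_theta2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2
```

Even here the inverse problem has two answers (elbow up or elbow down) and can have none. With more joints there is usually no closed form at all, which is one reason the slide calls this approach computationally demanding.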
Fish example - spring-mass model
Learning to move
Use of AI learning algorithms
  – Move
  – Evaluate and score
  – Keep movements which work well
Karl Sims ‘Blockies’
  – Early 1990s
Terzopoulos - Fish Learning
http://www.csri.utoronto.ca/~dt/
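
The move / evaluate / keep loop can be sketched as simple hill climbing over a pair of movement parameters. The `score` function is a hypothetical stand-in for "run the creature and measure how well it moved", with an arbitrary optimum chosen for the example. This is not the algorithm used by Sims or Terzopoulos, just the smallest loop with the same three steps.

```python
import random

def score(params):
    """Stand-in for simulating the creature and measuring performance.

    Hypothetical: the best possible score is 0, reached at
    amplitude 0.5 and frequency 2.0.
    """
    amplitude, frequency = params
    return -((amplitude - 0.5) ** 2 + (frequency - 2.0) ** 2)

def learn(start, iterations=2000, step=0.1, seed=0):
    """Hill climbing: perturb, score, keep the change only if it helps."""
    rng = random.Random(seed)
    best, best_score = list(start), score(start)
    for _ in range(iterations):
        candidate = [p + rng.uniform(-step, step) for p in best]  # move
        candidate_score = score(candidate)                        # evaluate and score
        if candidate_score > best_score:                          # keep what works
            best, best_score = candidate, candidate_score
    return best, best_score
```

In a real system the expensive part is the scoring step, since each evaluation means running a physical simulation of the body.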
Virtual sensors
Local v global interaction
  – Global - read from world data-structures
  – Local - ‘sense’ the environment
Global approach very common
  – In most computer games
  – Efficient, easy to test
Advantages of local sensing
Scales well
  – Not affected by global size of environment
Agent has independence of environment
  – Up to a point
Makes for believability
  – Agent perceives what it should
  – Can’t see you round corners
  – Emergent complexity
TeleTubby Sensors
Forward ray-tracing sensor is seven meters long and sweeps 45°, five times/sec
One vector sensor directed vertically downwards. Intersection with the ground is continually being detected
All these sensors are attached to the geometry of the agent
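
A forward sensor like the one described can be approximated in 2D as a range-and-arc test relative to the agent's heading. The sketch assumes the 45° sweep is a total arc centred on the heading (the slide does not say) and ignores occlusion by walls. The function and parameter names are illustrative.

```python
import math

def in_sensor_arc(agent_xy, heading_deg, target_xy,
                  max_range=7.0, sweep_deg=45.0):
    """True if the target lies within range and inside the swept arc."""
    dx = target_xy[0] - agent_xy[0]
    dy = target_xy[1] - agent_xy[1]
    if math.hypot(dx, dy) > max_range:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angle between heading and bearing, wrapped to [-180, 180)
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= sweep_deg / 2.0
```

Calling this five times a second for nearby objects gives the agent only what lies in front of it, which is the point of local sensing: what it has not sensed, it cannot react to.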
Virtual Vision
How it’s done
Take description of 3D scene
  – Plus camera position
Produce 2D pixel array
  – Colour and intensity
Agent reconstitutes scene
  – Using eye separation for depth
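
Recovering depth from eye separation usually comes down to the standard stereo relation: depth = focal length × eye separation / disparity, where disparity is the horizontal shift of a feature between the left and right images. The sketch assumes two identical, parallel pinhole cameras. The function name and units are illustrative.

```python
def depth_from_disparity(x_left, x_right, eye_separation, focal_length):
    """Depth of a feature seen at x_left and x_right (pixels) in the two eyes.

    focal_length is in pixels; depth comes back in the units of eye_separation.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("feature must be in front of both cameras")
    return focal_length * eye_separation / disparity
```

Nearby objects shift a lot between the two eyes and distant ones barely at all, so depth estimates become less precise with distance.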
Processing
Attentional focus
  – Cuts down processing
  – Directable eyes
  – Look for ‘interesting’ objects: orange pixels
  – Known as ‘active vision’
Variable resolution
  – High-res at centre of field
Sensing by message passing
Requires an architecture that distributes events as messages
  – Concept of locale
    • What is local for local sensing?
  – Semantically determined: e.g. a room
Scenegraphs not set up for this style of message passing
  – Think of routing in VRML for example
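
A minimal sketch of sensing by message passing: events are published to a named locale (for example a room), and only agents currently in that locale receive them. The class and method names are hypothetical, and a real architecture would also need to handle agents moving between locales and events that cross locale boundaries.

```python
from collections import defaultdict

class LocaleBus:
    """Delivers events only to agents registered in the same locale."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # locale name -> callbacks

    def enter(self, locale, callback):
        """Register an agent's callback when it enters a locale."""
        self._subscribers[locale].append(callback)

    def leave(self, locale, callback):
        """Unregister the callback when the agent leaves the locale."""
        self._subscribers[locale].remove(callback)

    def publish(self, locale, event):
        """Send an event to every agent currently in the locale."""
        for callback in list(self._subscribers[locale]):
            callback(event)
```

An agent in the kitchen hears "door opened" published to the kitchen but not footsteps published to the hall, so the locale boundary does the work a line-of-sight test would do for a ray sensor.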
Pedagogical Agents - STEVE
http://www.isi.edu/isd/VET/steve-demo.html
VET Architecture
Runs the simulation
Does the 3D rendering & detects interactions
TTS engine
Steve's Architecture
Modules specific to Steve
General cognitive architecture that can model expert performance as well as novice behaviour