overview agents and avatarsruth/year4ves/slides08/l15.pdf · 2012. 1. 8. · ad inglfetoave...

Agents and Avatars

Ruth Aylett

Overview

Agents and Avatars
Believability v naturalism
Building the body
H-anim
Moving and sensing

IVAs

Intelligent Virtual Agents (IVAs)
– Also:
  • Synthetic characters
  • Embodied conversational agents (ECAs)
  • Virtual humans
  • BUT do not have to be humanoid…
– Embodied and autonomous
  • Require a control architecture - or ‘agent mind’
  • Insides v outsides: combining AI with graphics
– ‘Inhabit’ a virtual environment

Why IVAs?

Adding life to a VE
– Animals, birds, insects
– Crowds
  • E.g. students in the virtual campus
– Increase sense of presence

As a guide or teacher
– Front-end to embedded knowledge in a VE

As a character in a story
– Computer games: Creatures, the Sims

Interface agents
– On web pages to make them more human
– Personal representative
– Sales rep

Why IVAs? - 2

As virtual actors
– Instead of extras
– Immersive Education Media Stage

As part of a simulation
– Hostage release training
– Battlefield medical training

Scientific investigation
– Buildings evacuation
– Testing for disability friendliness
– Ecology and animal behaviour

Avatars

Hindu Sanskrit term - “Representation of a deity in visible form”
– Snow Crash (Neal Stephenson 1992) - “A graphical representation of yourself”

VE representation of the user
– Embodiment need not be humanoid

Driven by the user
– So NOT autonomous
– A mapping rather than a control problem

Video avatars

“magic video mirror” with back-projection

video image mixed with computer generated overlay

Creating a video avatar (ALIVE, MIT)

• unencumbered interaction between human visitor and virtual character based on position, postures, and gestures

• main focus on “virtual presence”

http://www.ai.univie.ac.at/oefai/agents/

Believability

Term introduced by Joe Bates of the Oz group at CMU in the 1990s
– Combined art and technology

Very hard to define
– A willing suspension of disbelief?
– Seem like ‘real’ characters?
– The ‘illusion of life’?
– Willingness to attribute an internal state
  • Ascribe intentionality

Believability and Naturalism

Are they the same thing?

Graphics people seem to think so
– Is Mickey a real mouse?
– Is he a believable character?

The uncanny valley

The ‘uncanny valley’
– Work by a Japanese researcher, Mori
– Acceptability v naturalism

Acceptability goes negative as the character becomes ‘nearly human’

SEE: http://en.wikipedia.org/wiki/Uncanny_Valley

The problem of expectations

Humans have hard-wired expectations
– Used to interpret inter-personal behaviour
– Fundamental social skill

Need to invoke this very carefully
– Acceptability of movement
  • Lip sync a key problem
– Interactive responsiveness
  • Memory of interaction a key issue
  • Games create a real problem with instant rewind

Believability issues

Consistency of character
– Movement, language, appearance
– Consistency with VE also an issue

Plausible skills
– Physical, social, knowledge

Responsiveness
– Notice and repair errors

Building a body

Use of 3D modelling package
– 3ds Character Studio; Poser

Creation of skeleton
– Required for animation later

Polygonal body
– But overall count must be low for real-time interaction - 100 polygon body

Texture to cover body
– One body, many textures?

Capturing the user

The Dome TalkZone Booth

1. Checking the facial feature extraction
2. Enter visitor name
3. Generates avatar card

User interface

Skate Boarding Scene; Disco Dancing Scene

Avatar Example

Taking the model home

Stage 1: Image Capture
– Texture images x 4
– Silhouette images x 4

Stage 2: Deform the Mesh
– Generic avatar
– Polygonal structure
– 1500 polygons
– Change data: deforms the mesh

Stage 3: Apply the Textures
– Apply textures to deformed mesh

The touch up process

What sort of motion?

Walking around
– Jumping, running, swimming
– Human and non-human

Picking things up
– Agent-agent interaction

Gesture

Talking heads
– Facial muscles, lip synch

Group Movement
– Crowds, flocks, herds

Moving the body

Animation
– Drawing on cartoon animation

Motion capture
– Drawing on gait analysis and then film

PBM (physically-based modelling)
– Drawing on robotics
  • Analytically calculated
  • Learned

Animation

Time consuming for good results
– Requires artistic skills

The main character, Woody, in Toy Story had:
– 700 degrees of freedom (200 for face and fifty for the mouth)
– 150 people (at Pixar) would generate 3 minutes of animation a week

Can it be standardised?

Self-animation

Poses new problems:
– Parametrisable animation very desirable
  • E.g. a walk that could be reused with different stride length or foot height
– Melding and combining of animations ‘on the fly’
  • Not just a morphing problem
  • ‘starting position’?
– Extra actions inserted by character
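
One way to picture a parametrisable walk is a foot trajectory computed from a phase value, with stride length and foot height passed in as arguments. The sketch below is only an illustration under that assumption: the function name `foot_position`, the default values and the sinusoidal lift profile are hypothetical and do not come from any animation package.

```python
import math

def foot_position(phase, stride_length=0.6, foot_height=0.1):
    """Return the (forward, up) offset of one foot for a phase in [0, 1).

    First half of the cycle is the swing (foot lifts and moves forward);
    second half is the stance (foot on the ground, sliding back relative
    to the body).
    """
    phase = phase % 1.0
    if phase < 0.5:
        t = phase / 0.5                       # 0..1 through the swing
        forward = (t - 0.5) * stride_length   # moves from back to front
        up = foot_height * math.sin(math.pi * t)
    else:
        t = (phase - 0.5) / 0.5               # 0..1 through the stance
        forward = (0.5 - t) * stride_length   # moves from front to back
        up = 0.0
    return forward, up
```

The same function gives a short shuffle or a long high-stepping stride purely by changing the two parameters, which is the kind of reuse that hand-made keyframe animation does not offer directly.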

Standardising animation?

Animation depends on the structure of skeleton that is being animated

Can we standardise a humanoid skeleton?
– H-anim just such an attempt
– Originally in the context of VRML
– See www.h-anim.org

H-anim

A set of standard components
– Humanoid: root of a figure
– Joint: attached using transform specifying current state of articulation plus geometry associated with attached body part
– Segment: specifies attributes of physical links between joints
– Site: where can add semantics
– Displacer: range of movement allowed for object in which embedded

H-anim

And in VRML..

DEF hanim_l_hip Joint {
  name "l_hip"
  rotation 0 0 1 0
  center 0.122 0.888271 -0.0693267
  children [
    DEF hanim_l_thigh Segment {
      name "l_thigh"
      children Shape {
        appearance Appearance { material USE Pants_Color }
        geometry IndexedFaceSet {

Motion capture

Electro-magnetic or by camera

Produces the most ‘natural’ results
– Extensively used in film for graphical extras
– Can be used on avatars very successfully to transmit user movement to their graphical representation

Even more problems for self-animation
– Harder to parametrise
– Worse combining/melding problems

http://www.televirtual.com/

SIMON The Signer

Performs deaf-signing in real-time, over-laid on normal TV image

Uses Teletext subtitle signal
– accompanies many broadcast programmes

Linguistic translation of subtitle content into a popular sign language

Presented through on-screen, animated virtual human character
– Runs at client end

SIMON The Signer - 2

Dictionary of signed words for look up of accompanying physical movement, facial expressions and body positions, stored as motion-capture data

Physical moves can be called in any order with interpolation to create (reasonably) smooth, natural looking signing sequences
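
The interpolation step can be pictured as generating in-between poses that link the end of one stored sign to the start of the next. The sketch below is an assumption-laden illustration, not SIMON's actual method: poses are hypothetical dictionaries of joint angles in degrees, and plain linear blending is used where a production system would more likely interpolate rotations (for example with quaternions).

```python
def interpolate_pose(pose_a, pose_b, t):
    """Linearly blend two poses (dicts of joint name -> angle in degrees)."""
    return {joint: (1.0 - t) * pose_a[joint] + t * pose_b[joint]
            for joint in pose_a}

def transition(pose_a, pose_b, steps):
    """Return `steps` in-between poses, excluding the two end poses."""
    return [interpolate_pose(pose_a, pose_b, i / (steps + 1))
            for i in range(1, steps + 1)]
```

For example, `transition(end_of_sign_1, start_of_sign_2, 5)` would yield five filler frames to play between two motion-capture clips (both pose names are hypothetical).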

However a big acceptance problem

A professional signer prepared for a motion capture session

Physically-based modelling

Extending robot motor control

Forward kinematics
– If you move joints so much
– Where does the end effector go?

Inverse kinematics
– If you want the end effector at x,y,z…
– How much should which joints move?

Computationally demanding and often not very naturalistic
– But the most flexible option
– Animation blending often added…
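
The two questions can be made concrete with the simplest textbook case, a planar arm with two links. This is a sketch under that assumption only (2D rather than x,y,z, unit link lengths by default, hypothetical function names); real characters have many more joints and usually need numerical solvers.

```python
import math

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar two-link arm; joint angles in radians."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """One of the two joint-angle solutions reaching (x, y), or None if out of reach."""
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        return None  # target lies outside the arm's workspace
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2
```

Feeding the result of `inverse_kinematics` back into `forward_kinematics` should reproduce the target. The fact that a second, mirrored solution also exists hints at why IK output can look unnatural: nothing in the geometry says which pose a human would choose.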

Fish example - spring-mass model

Learning to move

Use of AI learning algorithms
– Move
– Evaluate and score
– Keep movements which work well

Karl Sims ‘Blockies’
– Early 1990s
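
The move / evaluate / keep loop can be sketched as random-mutation hill climbing over a vector of movement parameters. This is a deliberately minimal stand-in, not the method used in the work cited on this slide (which evolved whole populations of creatures); `learn_movement`, `toy_score` and all parameter values are hypothetical.

```python
import random

def learn_movement(score, n_params=4, iterations=200, step=0.1, seed=0):
    """Random-mutation hill climbing over a list of movement parameters."""
    rng = random.Random(seed)
    best = [0.0] * n_params
    best_score = score(best)
    for _ in range(iterations):
        # Move: try a small random variation of the current best movement.
        candidate = [p + rng.uniform(-step, step) for p in best]
        # Evaluate and score.
        candidate_score = score(candidate)
        # Keep movements which work well.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

def toy_score(params):
    """Hypothetical stand-in for 'distance travelled'; peaks when every parameter is 0.5."""
    return -sum((p - 0.5) ** 2 for p in params)
```

In a real system the score would come from running the movement in a physical simulation, for instance measuring how far the creature swam or walked in a fixed time.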

Terzopoulos - Fish Learning

http://www.csri.utoronto.ca/~dt/

Virtual sensors

Local v global interaction
– Global - read from world data-structures
– Local - ‘sense’ the environment

Global approach very common
– In most computer games
– Efficient, easy to test

Advantages of local sensing

Scales well
– Not affected by global size of environment

Agent has independence of environment
– Up to a point

Makes for believability
– Agent perceives what it should
– Can’t see you round corners
– Emergent complexity

TeleTubby Sensors

Forward ray-tracing sensor is seven meters and sweeps 45°, five times/sec

One vector sensor directed vertically downwards. Intersection with the ground is continually being detected

All these sensors are attached to the geometry of the agent
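
A forward sweep sensor of this kind can be sketched in 2D by casting a fan of rays from the agent and reporting the nearest hit on each. The code below is an illustration under several assumptions: obstacles are circles given as `((x, y), radius)`, the sweep is approximated by nine fixed rays, and the names `ray_hits_circle` and `sweep` are hypothetical. Only the 7 m range and 45 degree arc come from the slide.

```python
import math

def ray_hits_circle(origin, angle, max_range, centre, radius):
    """Distance along the ray to a circular obstacle, or None if it is missed."""
    dx, dy = math.cos(angle), math.sin(angle)
    cx, cy = centre[0] - origin[0], centre[1] - origin[1]
    along = cx * dx + cy * dy                    # projection of centre onto the ray
    perp2 = cx * cx + cy * cy - along * along    # squared ray-to-centre distance
    if perp2 > radius * radius:
        return None
    hit = along - math.sqrt(radius * radius - perp2)
    return hit if 0.0 <= hit <= max_range else None

def sweep(origin, heading, obstacles, max_range=7.0, arc_deg=45.0, n_rays=9):
    """Cast n_rays across the arc centred on heading; nearest hit (or None) per ray."""
    readings = []
    for i in range(n_rays):
        offset = math.radians(-arc_deg / 2 + arc_deg * i / (n_rays - 1))
        hits = [ray_hits_circle(origin, heading + offset, max_range, c, r)
                for c, r in obstacles]
        hits = [h for h in hits if h is not None]
        readings.append(min(hits) if hits else None)
    return readings
```

Calling `sweep` five times a second from the agent's update loop, with `origin` and `heading` read from the agent's own geometry, would match the rate quoted above. Anything beyond the range or outside the arc simply never reaches the agent.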

Virtual Vision: How it’s done

Take description of 3D scene
– Plus camera position

Produce 2D pixel array
– Colour and intensity

Agent reconstitutes scene
– Using eye separation for depth
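
Recovering depth from eye separation is usually done by triangulation: a point appears at slightly different horizontal positions in the two eye images, and the size of that shift (the disparity) shrinks with distance. The sketch below assumes two parallel pinhole cameras; the function name and parameter names are hypothetical, and the slides do not say which method any particular agent used.

```python
def depth_from_disparity(x_left, x_right, focal_length_px, eye_separation):
    """Depth of a point seen at horizontal pixel positions x_left and x_right.

    Assumes two parallel pinhole cameras. The result is in the same
    units as eye_separation.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        return float("inf")  # no measurable shift: treat as infinitely far
    return focal_length_px * eye_separation / disparity
```

With a focal length of 500 pixels and eyes 0.06 m apart, a 10 pixel disparity corresponds to a depth of about 3 m, and halving the disparity doubles the estimated depth.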

Processing

Attentional focus
– Cuts down processing
– Directable eyes
– Look for ‘interesting’ objects: orange pixels
– Known as ‘active vision’

Variable resolution
– High-res at centre of field

Sensing by message passing

Requires an architecture that distributes events as messages
– Concept of locale
  • What is local for local sensing?
– Semantically determined: e.g. a room

Scenegraphs not set up for this style of message passing
– Think of routing in VRML for example
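
A locale-based event distributor can be sketched as a small publish/subscribe bus keyed by locale, so that an event raised in one room only reaches agents registered in that room. The classes `LocaleBus` and `Agent` and their methods are hypothetical names for illustration and do not describe any specific VE toolkit.

```python
from collections import defaultdict

class LocaleBus:
    """Delivers events only to agents registered in the same locale (e.g. a room)."""

    def __init__(self):
        self._members = defaultdict(list)

    def enter(self, locale, agent):
        """Register an agent as present in a locale."""
        self._members[locale].append(agent)

    def publish(self, locale, event):
        """Send an event to every agent currently in that locale."""
        for agent in self._members[locale]:
            agent.sense(event)

class Agent:
    """Minimal agent that records whatever it senses."""

    def __init__(self, name):
        self.name = name
        self.percepts = []

    def sense(self, event):
        self.percepts.append(event)
```

An agent in the hall never hears about a door opening in the kitchen. That gives the "can't see round corners" behaviour without any geometric test, at the cost of having to define locales by hand.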

Pedagogical Agents - STEVE

http://www.isi.edu/isd/VET/steve-demo.html

VET Architecture

Runs the simulation

Does the 3D rendering & detects interactions

TTS Engine

Steve's Architecture

Modules specific to Steve

General cognitive architecture that can model expert performance as well as novice behaviour.
