near-optimal character animation with continuous control

Near-optimal Character Animation with Continuous Control

Adrien Treuille, Yongjoon Lee, Zoran Popović

2008.10.14 HA SE HOON

◦ Prerequisite
◦ Introduction
◦ Related Works
◦ Motion Model
◦ Control
◦ Policies
◦ Results
◦ Conclusion

Outline

Motion Graph
◦ A directed graph used to synthesize a new motion
◦ Node = Pose
◦ Edge = Motion Clip

We already discussed this in the 4th presentation
◦ “Group Motion Graph”, Yu-Chi Lai, Stephen Chenney, ShaoHua Fan
◦ Presented by Heo Jae Pil

We will use a similar but different method!

Prerequisite #1: Motion Graph

[Figure: example motion graph with nodes S, A, B, F, N, connected by short motion clips (Jumping, Walking, Running). Drawn by Heo Jae Pil.]

A sub-area of machine learning

How should an agent take actions?
◦ Goal: maximize reward

Model of reinforcement learning
◦ A set of environment states S
◦ A set of actions A
◦ A set of scalar rewards in ℝ

We will use reinforcement learning to find a near-optimal action

Prerequisite #2: Reinforcement Learning

Finding a “near-optimal” character animation
◦ With a real-time continuous controller
◦ Tasks: Navigation (Walking), Spinning Navigation, Fixed Obstacle Avoidance (FOA), Moving Obstacle Avoidance (MOA)

Introduction: The goal of the paper

[Figure: real-time continuous controller (the arrows under the character).]

You should know…
◦ How to represent motions, states, and policies
◦ How to blend motion clips
◦ How to define the cost function
◦ How to find the near-optimal policy

Introduction

Motion Graph
◦ KOVAR, L., GLEICHER, M., AND PIGHIN, F. 2002

Motion synthesis from annotations
◦ ARIKAN, O., FORSYTH, D. A., AND O’BRIEN, J. F. 2003

Precomputing avatar behavior from human motion data
◦ LEE, J., AND LEE, K. H. 2004

Related Works

A set of motion clips 𝒞

Each clip C ∈ 𝒞 is a sequence of poses, C = (p1, p2, …, pm)

Each pose p ∈ ℝ^n
◦ A vector specifying all joint positions

Motion Model: Definition of terms

Each clip is divided into two subsequences
◦ Cin and Cout
◦ Each subsequence covers one foot-plant frame
◦ This frame becomes a ‘constraint’ frame

Motion Model: Motion Clips

Assumption: a clip C represents one walk cycle

[Figure: a walk cycle split into Cin and Cout. Constraint frames: (b) for Cin, (d) for Cout.]

Allow any transition between two clips
◦ Unlike a “motion graph”!

Algorithm
◦ Step 1: C is mirrored if necessary
◦ Step 2: Overlap the constraint frames of Cout and C’in
◦ Step 3: Rotate C’ to match its foot (C’ is reoriented so that its ground-contact foot coincides with that of C)
◦ Step 4: Blend Cout into C’in (with the ground-contact foot as the root of the kinematic skeleton)

Motion Model: Blending motion clips
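Step 4 of the algorithm above can be illustrated with a minimal sketch. It assumes the two subsequences have already been mirrored, overlapped, and reoriented (Steps 1 to 3), that they have the same number of frames, and that a simple linear crossfade of pose vectors is acceptable. The slides do not state the blend weights, so the linear ramp and the names `blend_poses` and `blend_clips` are illustrative assumptions only.

```python
def blend_poses(pose_a, pose_b, w):
    """Interpolate two pose vectors; w = 0 gives pose_a, w = 1 gives pose_b."""
    return [(1.0 - w) * a + w * b for a, b in zip(pose_a, pose_b)]


def blend_clips(c_out, c_in_next):
    """Crossfade C_out into C'_in frame by frame (assumed: linear ramp).

    Both arguments are lists of pose vectors that are already aligned
    on their constraint frames and have equal length.
    """
    if len(c_out) != len(c_in_next):
        raise ValueError("subsequences must have the same number of frames")
    m = len(c_out)
    blended = []
    for i, (pose_a, pose_b) in enumerate(zip(c_out, c_in_next)):
        # Weight ramps from 0 on the first frame to 1 on the last frame.
        w = i / (m - 1) if m > 1 else 1.0
        blended.append(blend_poses(pose_a, pose_b, w))
    return blended


# Toy usage: three frames of a two-dimensional "pose".
result = blend_clips([[0, 0], [0, 0], [0, 0]], [[2, 4], [2, 4], [2, 4]])
```

The blend starts exactly on the outgoing clip and ends exactly on the incoming one, which is the property a transition between two clips needs.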

Motion Model: Example of motion blending

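Constraint frames (foot-planted frames)

They prevent foot-skating! Why? The planted foot is the root of the kinematic skeleton during the blend, so it stays fixed on the ground.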

Control: Goal tasks

Navigation: the user controls gait, path, and torso orientation (examples of gait: walking, running, …)

Spinning Navigation: the user controls the motion direction as the character spins

Fixed Obstacle Avoidance: the character follows a line, avoiding a fixed planar object

Moving Obstacle Avoidance: the character follows a line, avoiding a planar object that moves with linear motion

State X
◦ C: current clip
◦ x, z, θ: position and orientation
◦ u, v, u’, v’: relative position and speed of the obstacle
◦ G, T, W: desired gait, torso orientation, spin

Control: Definition of the state

(G, T, W express the intention of the user.)

Transition function f: S × C → S
◦ X = (C, …) → X’ = (C’, …)

Control: Transition function
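A minimal sketch of a state and a transition function f(X, C’) → X’ may help make the definitions concrete. It keeps only the clip, the planar position and orientation, and the desired gait; the obstacle terms (u, v, u’, v’) and T, W are omitted for brevity. The `Clip` fields `dx`, `dz`, `dtheta` (displacement of the character over one clip, in its local frame) and the rotation convention are assumptions made for illustration, not something the slides specify.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clip:
    name: str
    gait: str
    dx: float      # assumed: forward displacement in the local frame
    dz: float      # assumed: sideways displacement in the local frame
    dtheta: float  # assumed: change of heading over the clip


@dataclass(frozen=True)
class State:
    clip: Clip          # C: current clip
    x: float            # position
    z: float
    theta: float        # orientation
    desired_gait: str   # G: user intention


def transition(state, next_clip):
    """f(X, C') -> X': play next_clip starting from the current state."""
    c, s = math.cos(state.theta), math.sin(state.theta)
    return replace(
        state,
        clip=next_clip,
        # Rotate the clip's local displacement into the world frame.
        x=state.x + c * next_clip.dx - s * next_clip.dz,
        z=state.z + s * next_clip.dx + c * next_clip.dz,
        theta=state.theta + next_clip.dtheta,
    )


walk = Clip("walk_forward", "walk", dx=1.0, dz=0.0, dtheta=0.5)
start = State(clip=walk, x=0.0, z=0.0, theta=0.0, desired_gait="walk")
nxt = transition(start, walk)
```

The user-controlled component (`desired_gait`) is carried over unchanged here; in an interactive setting it would be overwritten by the controller input.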

State Cost Cs: S → ℝ
◦ If the current clip is not in the desired gait, then Cs(X) = ∞
◦ (If X = (C, …) but C ∉ G, then Cs(X) = ∞)

Transition Cost Ct: S × S → ℝ

Control: Costs

(Each term of the cost has its own weight.)
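The two costs can be sketched as follows. The state cost follows the rule on the slide (infinite when the clip is not in the desired gait); returning 0 otherwise is an assumption. The transition cost formula did not survive in these notes, so the weighted sum below, and the term names `path_deviation` and `turn`, are purely hypothetical stand-ins suggested by the remark about per-term weights.

```python
import math


def state_cost(clip_gait, desired_gait):
    """Cs(X): infinite if the current clip is not in the desired gait."""
    # Assumed: zero cost when the gait matches.
    return 0.0 if clip_gait == desired_gait else math.inf


def transition_cost(terms, weights):
    """Ct(X, X'): assumed to be a weighted sum of penalty terms."""
    return sum(weights[name] * value for name, value in terms.items())


# Hypothetical penalty terms measured for one transition X -> X'.
terms = {"path_deviation": 2.0, "turn": 1.0}
weights = {"path_deviation": 0.5, "turn": 3.0}
total = state_cost("walk", "walk") + transition_cost(terms, weights)
```

An infinite state cost acts as a hard constraint: any policy that minimizes cost will never select a clip from the wrong gait as long as a valid alternative exists.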

Policy Π: S → C
◦ Decides the next clip C from the given state X
◦ Then we can move to the next state X’ with the transition function f(X, C) → X’ (X/X’ = current/next state)

Policies: Definition

(Recall the definition of state and transition function.)

Greedy Policy Πgreedy
◦ Find the clip that minimizes the cost!
◦ Looks only one state ahead

But we should consider the following case:

Policies: Greedy Policy
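The greedy policy can be sketched in a few lines: among all clips, pick the one whose immediate next state is cheapest. The function below is generic; the toy `transition` and `cost` passed in the usage lines are made-up stand-ins, not the ones from the paper.

```python
def greedy_policy(state, clips, transition, cost):
    """Return the clip whose one-step cost cost(X, f(X, C)) is smallest."""
    return min(clips, key=lambda clip: cost(state, transition(state, clip)))


# Toy usage: states are numbers, "clips" are step sizes, and the cost is
# the distance of the next state from a target at 5.
chosen = greedy_policy(
    0,
    [-1, 1, 2],
    transition=lambda x, step: x + step,
    cost=lambda x, x_next: abs(x_next - 5),
)
```

Because it evaluates only the next state, this policy cannot tell a cheap step that leads into an expensive region from one that does not.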

We will minimize the entire cost
◦ Taking the future into account

Redefine the cost function
◦ For a given policy Π that produces the state sequence (X1, X2, …, Xn)
◦ α ∈ [0, 1): discount factor that models uncertainty about the future

Policies: Optimal Policy
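The formula on this slide did not survive in these notes. A standard discounted sum, which is what the factor α ∈ [0, 1) suggests, would be C_Π(X1) = Σ_t α^t · c(X_t, X_{t+1}), where c is the one-step cost. The sketch below computes that sum; starting the exponent at 0 is an assumption.

```python
def discounted_cost(step_costs, alpha):
    """Sum of one-step costs, the t-th one weighted by alpha ** t."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return sum((alpha ** t) * c for t, c in enumerate(step_costs))


# Three steps that each cost 1, discounted with alpha = 0.5.
long_term = discounted_cost([1.0, 1.0, 1.0], 0.5)
```

With α close to 0 only the immediate cost matters (the greedy case); with α close to 1 distant costs count almost as much as immediate ones.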

Value function V: S → ℝ
◦ Long-term cost under the optimal policy Π*
◦ V(X) = CΠ*(X)

Def: Optimal policy is

Thus, if we can calculate the value function V(X), we can find the next state that is optimal in the long term!

Policies: Value function
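The definition on this slide is missing from these notes. A common form in reinforcement learning is a one-step lookahead: choose the clip C that minimizes c(X, X’) + α·V(X’) with X’ = f(X, C). The sketch below assumes that form; the toy costs and the tabulated `value` are invented to show a case where the lookahead and the greedy choice differ.

```python
def lookahead_policy(state, clips, transition, cost, value, alpha):
    """Pick the clip minimizing immediate cost plus discounted future cost."""
    def total(clip):
        nxt = transition(state, clip)
        return cost(state, nxt) + alpha * value(nxt)
    return min(clips, key=total)


def step(x, clip):
    return x + clip          # toy transition: clips are step sizes


def effort(x, x_next):
    return abs(x_next - x)   # toy one-step cost: longer steps cost more


future = {1: 10.0, 2: 0.0}   # toy value function: state 1 is a bad place

greedy_choice = min([1, 2], key=lambda c: effort(0, step(0, c)))
optimal_choice = lookahead_policy(0, [1, 2], step, effort, future.get, 0.5)
```

Here the greedy choice takes the short step into the expensive state, while the lookahead pays slightly more now to avoid it.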

Now the issue is the approximation of V(X)

Linear combination of basis functions
◦ Basis functions Ф = (Ф1, Ф2, …, Фn)
◦ Each basis function Фi: S → ℝ (polynomials or Gaussians)
◦ Then, the approximation of V is V(X) ≈ r1·Ф1(X) + r2·Ф2(X) + … + rn·Фn(X)
◦ Basis functions are pre-selected by a human!

So, how can we calculate the weights r1, r2, …, rn?

Policies: Near optimal policy
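The linear approximation can be sketched for a one-dimensional state. The helper names `polynomial_basis` and `gaussian_basis`, and the particular centers and widths, are illustrative assumptions; the slides only say that the basis functions are polynomials or Gaussians chosen by hand.

```python
import math


def polynomial_basis(degree):
    """Basis function x -> x ** degree."""
    return lambda x: x ** degree


def gaussian_basis(center, width):
    """Basis function x -> exp(-(x - center)^2 / (2 * width^2))."""
    return lambda x: math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def approx_value(x, weights, bases):
    """V(x) ~ sum_i r_i * Phi_i(x)."""
    return sum(r * phi(x) for r, phi in zip(weights, bases))


# Hand-picked bases 1, x, x^2 with weights r = (1, 2, 3).
bases = [polynomial_basis(0), polynomial_basis(1), polynomial_basis(2)]
v = approx_value(2.0, [1.0, 2.0, 3.0], bases)
bump = gaussian_basis(center=0.0, width=1.0)
```

Only the weights are learned; the shape of each basis function stays fixed, which keeps the fitting problem linear in the unknowns.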

Draw a set of sample clips

Initialize the state-transition pairs and r
◦ Def: r = a vector of weights (r1, r2, …, rn)

Step 1
◦ X’ is the optimal next state from X under the current V

Step 2
◦ Recalculate r by solving the linear program

Policies: Near optimal policy (Algorithm)
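The alternation between Step 1 and Step 2 can be illustrated on a toy problem. This is not the method from the slides: Step 2 there solves a linear program, whereas the sketch below swaps in an ordinary least-squares fit because it has a closed form for two weights. The line world, the goal position, the step set, and the one-step cost are all invented for the example.

```python
ALPHA = 0.9                       # discount factor, alpha in [0, 1)
GOAL = 3                          # hypothetical target position on a line
STEPS = (-1, 0, 1)                # stand-ins for the available clips
SAMPLES = [0, 1, 2, 3, 4, 5, 6]   # sampled states


def feature(x):
    """Hand-picked non-constant basis function: distance to the goal."""
    return abs(x - GOAL)


def value(x, r):
    """Linear approximation V(x) = r[0] * 1 + r[1] * feature(x)."""
    return r[0] + r[1] * feature(x)


def step_cost(x, x_next):
    """Hypothetical one-step cost: distance of the next state from the goal."""
    return abs(x_next - GOAL)


def best_step(x, r):
    """Step 1: choose the next state that is optimal under the current V."""
    return min(STEPS, key=lambda a: step_cost(x, x + a) + ALPHA * value(x + a, r))


def refit(targets):
    """Step 2 (swapped in): least-squares fit of the targets on the feature."""
    n = len(SAMPLES)
    fs = [feature(x) for x in SAMPLES]
    mean_f = sum(fs) / n
    mean_t = sum(targets) / n
    var = sum((f - mean_f) ** 2 for f in fs)
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(fs, targets))
    slope = cov / var
    return (mean_t - slope * mean_f, slope)


def learn_weights(iterations=50):
    r = (0.0, 0.0)
    for _ in range(iterations):
        targets = []
        for x in SAMPLES:
            x_next = x + best_step(x, r)
            targets.append(step_cost(x, x_next) + ALPHA * value(x_next, r))
        r = refit(targets)
    return r


weights = learn_weights()
```

After fitting, `best_step(x, weights)` moves toward the goal from either side and stays put once there, which is the behaviour the learned value function is meant to encode.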

Policies: Near optimal policy

Switchability
◦ Switching between tabulated value functions
◦ When transitioning from walking to running, the algorithm still makes a near-optimal choice

Separability
◦ Learn a separate value function for each task
◦ Ex) MOA and FOA

Policies: Dimensionality

Let’s see the video

Results

Presents a new control model
◦ High dimensional
◦ Continuous
◦ Real time
◦ Near optimal

But…
◦ Needs a large motion database

Conclusion

Any questions?
