Interactive Control of Avatars Animated with Human Motion Data

By: Jehee Lee, Jinxiang Chai, Paul S. A. Reitsma, Jessica K. Hodgins, Nancy S. Pollard

Presented by: Nathan Hoobler

Why do we use motion capture?

Get realistic behavior “for free”

An easy interface for generating control for high DOF models

Can capture behavior far too complicated to model by hand

  Kung Fu, Acrobatics, other stylized motion

What is the problem with motion capture?

Motion capture data is inherently complicated

  Usually far more degrees of freedom than can be easily controlled by hand

Not trivial to synthesize new behaviors

Transitions between different types of motion are hard

Often there are redundant behaviors

What does this paper do?

Identify distinct behaviors in the motion capture data

Allow intuitive control of high DOF data with a small DOF interface

Allow seamless transitions between different behaviors

System Overview

Loosely-patterned data comes in

A probabilistic transition matrix is built

Simplified transition graph is used to determine motion

System Overview

Various datasets come in

What kind of data can we use?

Long, consistent motion recordings are required for good transition generation

Does not handle sensor noise well

System Overview

Low-Level transitions are generated

Low-Level Representation

At this level, the system is very similar to the Video Textures technique

  For each frame, find any other frames in the dataset that are similar

  Calculate the probability of a transition from frame j to frame k based on how closely the two frames match

Low-Level: Building the Matrix

The probability of transitioning from frame i to frame j is computed from D(i, j), the weighted “distance” from frame i to frame j

D(i, j) is in turn built from d(p_i, p_j), the distance between the poses p_i and p_j of the two frames
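A minimal sketch of how a distance matrix can be turned into transition probabilities. It assumes an exponential mapping in the style of Video Textures, P(i, j) proportional to exp(-D(i, j) / sigma), with each row normalized to sum to 1. The function name and the `sigma` parameter are illustrative, not taken from the slides.

```python
import math

def transition_probabilities(D, sigma=1.0):
    """Map a distance matrix D (list of rows) to row-normalized probabilities.

    Assumption: probability falls off exponentially with distance;
    sigma controls how sharply (smaller sigma favors only the best matches).
    """
    P = []
    for row in D:
        weights = [math.exp(-d / sigma) for d in row]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P

# Toy example: three frames, where frames 0 and 1 are close and frame 2 is far.
D = [
    [0.0, 0.5, 4.0],
    [0.5, 0.0, 4.0],
    [4.0, 4.0, 0.0],
]
P = transition_probabilities(D, sigma=1.0)
```

Smaller distances give larger probabilities, so in this toy matrix frame 0 is much more likely to move to frame 1 than to frame 2.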

So, how efficient is this?

Since the matrix is just a 2D mapping from any one frame to any other, the number of transitions is O(n^2)…

… for 4000-12000 frames per dataset (!), which is 16 million to 144 million candidate transitions

We need to reduce the number of transitions

Low-Level: Pruning

We can take advantage of a few useful features of the Motion Capture data

  Contact with the world should be similar between transitioning frames

  Any interesting data is going to have mostly low-probability transitions

  There are many frames that are very similar to others

  We want to avoid going down dead-end routes

Low-Level: Pruning (Contact)

Criterion 1: Contact

  Even if frames are very similar, do not transition if the contact states are different

  (Strict interpretation) Only allow transitions during contact states
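The contact criterion can be sketched as a filter over the transition matrix. The representation of a contact state here (a tuple of booleans such as left foot and right foot on the ground) is an assumption made for illustration.

```python
def prune_by_contact(P, contact):
    """Zero out transitions between frames whose contact states differ.

    P is a square matrix (list of rows) of transition probabilities.
    contact[i] is a hypothetical per-frame contact descriptor, for example
    (left_foot_down, right_foot_down).
    """
    n = len(P)
    return [
        [P[i][j] if contact[i] == contact[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

P_example = [[0.5, 0.5], [0.5, 0.5]]
contact_example = [(True, False), (False, True)]
pruned_contact = prune_by_contact(P_example, contact_example)
```

In the example, the two frames have different feet on the ground, so only the self-transitions survive.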

Low-Level: Pruning (Likelihood)

Criterion 2: Likelihood

  Throw away transitions whose probability is less than some threshold value
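As a small illustration, thresholding can be written in a few lines. The threshold value is a free parameter; the slides do not give one.

```python
def prune_by_likelihood(P, threshold):
    """Keep only transitions whose probability is at least `threshold`."""
    return [[p if p >= threshold else 0.0 for p in row] for row in P]

P_likelihood = [[0.70, 0.25, 0.05], [0.02, 0.90, 0.08]]
pruned_likelihood = prune_by_likelihood(P_likelihood, threshold=0.1)
```

After pruning, the rows no longer sum to 1, so a real system would either renormalize them or treat the surviving entries simply as allowed edges.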

Low-Level: Pruning (Similarity)

Criterion 3: Similarity

  If a frame has many transitions to states that are all very similar to each other as well, throw away all but the best-fitting transition
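One way to sketch this criterion is to assume that "very similar" target frames are neighbors in time, so each run of consecutive candidate targets is collapsed to its single best entry. That adjacency assumption is mine, not something the slides state.

```python
def prune_by_similarity(P):
    """For each source frame, keep only the best target in every run of
    consecutive target frames that still have non-zero probability."""
    pruned = []
    for row in P:
        out = [0.0] * len(row)
        j = 0
        while j < len(row):
            if row[j] <= 0.0:
                j += 1
                continue
            # Extend the run of consecutive candidate targets.
            end = j
            while end + 1 < len(row) and row[end + 1] > 0.0:
                end += 1
            # Keep the most probable target in the run (first one on ties).
            best = max(range(j, end + 1), key=lambda k: row[k])
            out[best] = row[best]
            j = end + 1
        pruned.append(out)
    return pruned

pruned_similarity = prune_by_similarity([[0.0, 0.2, 0.5, 0.3, 0.0, 0.1]])
```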

Low-Level: Pruning (SCC)

Criterion 4: Connectedness

  In theory, we want to avoid transitions that don’t lead to well-connected nodes

  Only add transitions that remain within the largest Strongly Connected Component of the graph

“A maximal subgraph of a directed graph such that for every pair of vertices u, v in the subgraph, there is a directed path from u to v and a directed path from v to u.” (Mathworld)
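To make the definition concrete, here is a sketch that finds the largest strongly connected component with Kosaraju's two-pass algorithm and then drops every transition that leaves it. The choice of Kosaraju's algorithm is mine; the slides do not say which SCC algorithm was used.

```python
def largest_scc(n, edges):
    """Return the set of nodes in the largest strongly connected component
    of a directed graph with nodes 0..n-1."""
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, idx = stack.pop()
            if idx < len(fwd[u]):
                stack.append((u, idx + 1))
                v = fwd[u][idx]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)

    # Pass 2: sweep the reversed graph in reverse completion order;
    # each sweep collects exactly one component.
    best, assigned = set(), [False] * n
    for s in reversed(order):
        if assigned[s]:
            continue
        assigned[s] = True
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in rev[u]:
                if not assigned[v]:
                    assigned[v] = True
                    comp.add(v)
                    stack.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def prune_to_scc(n, edges):
    """Keep only transitions whose endpoints both lie in the largest SCC."""
    keep = largest_scc(n, edges)
    return [(u, v) for u, v in edges if u in keep and v in keep]

# Frames 0, 1, 2 form a cycle; frames 3 and 4 are a dead-end branch.
scc_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]
```

In the example graph, the branch through frames 3 and 4 can be entered but never left, so it is removed.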

Low-Level: Blending

Need interpolation to avoid discontinuities

Problem: sharp changes are allowed at contact points

Solution: use a non-linear blend function centered on the contact point and a moving average
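A rough sketch of a non-linear blend across a transition window. It uses a smoothstep curve as a stand-in for whatever blend function the paper uses, and it treats each pose as a flat list of numbers; real joint rotations would need a proper rotational interpolation. The names `ease` and `blend_window` are illustrative.

```python
def ease(t):
    """Smoothstep weight: 0 at t=0, 1 at t=1, with zero slope at both ends."""
    t = min(1.0, max(0.0, t))
    return t * t * (3.0 - 2.0 * t)

def blend_window(source_clip, target_clip):
    """Blend two equal-length lists of poses, fading from source to target.

    Each pose is a list of floats (an assumption made for illustration).
    """
    n = len(source_clip)
    blended = []
    for f in range(n):
        w = ease(f / (n - 1)) if n > 1 else 1.0
        blended.append(
            [(1.0 - w) * a + w * b for a, b in zip(source_clip[f], target_clip[f])]
        )
    return blended

source = [[0.0] for _ in range(5)]
target = [[1.0] for _ in range(5)]
blended = blend_window(source, target)
```

Because the weight curve is flat at both ends, the blended motion leaves the source clip and joins the target clip without a visible jump in velocity.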

Low-Level: Blending

Case 1: Follow the incoming frame

Case 2: Follow the outgoing frame

Case 3: Choose the side closest to the contact point

Case 4: Just let the foot slide; it’ll look bad no matter what

Low-Level: Coordinate System

Fixed/Global versus Relative

  Each has an advantage, depending on the situation

  The paper uses both, depending on the example

Fixed/Global Coordinates

Advantages

  Good for spatial data (the recording environment corresponds strongly with the simulated environment)

Disadvantages

  Not good for synthesizing motion in new environments

Relative Coordinates

Advantages

  Much easier to synthesize motions from anywhere in the environment into new behaviors

Disadvantages

  Ignores orientation and position in three-space, which may be important for some actions
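To illustrate the difference, the sketch below re-expresses root positions and headings relative to a chosen reference frame, which is one simple way to discard global position and orientation. The 2D ground-plane representation and the name `to_relative` are assumptions for illustration.

```python
import math

def to_relative(positions, headings, ref):
    """Express ground-plane root positions and headings relative to frame `ref`.

    positions: list of (x, z) tuples in world coordinates.
    headings:  list of yaw angles in radians.
    Returns a list of ((x, z), heading) pairs in the reference frame's
    coordinates, so frame `ref` sits at the origin with heading 0.
    """
    rx, rz = positions[ref]
    c, s = math.cos(-headings[ref]), math.sin(-headings[ref])
    relative = []
    for (x, z), h in zip(positions, headings):
        dx, dz = x - rx, z - rz
        relative.append(((c * dx - s * dz, s * dx + c * dz), h - headings[ref]))
    return relative

rel = to_relative([(1.0, 1.0), (2.0, 1.0)], [0.0, 0.0], ref=0)
```

Two clips recorded in different corners of the capture space look identical after this transform, which is what makes relative coordinates convenient for stitching motions together.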

High-Level Representation

Low-level representation is far too complicated to interact with

Simplify the data by grouping like frames into clusters

For each frame, find the possible clusters that can be transitioned to in the near term

Frames are grouped into clusters

Building Clusters

We want a simplified data set

  Weight important joints (arms, legs, pelvis, etc.) high

  Weight less important joints (neck, etc.) low

Using weighted values, find similar frames and group them into clusters
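A sketch of weighted grouping. It uses a simple greedy scheme, in which a frame joins the first cluster whose representative is within a radius, as a stand-in for whatever clustering method the paper uses. The weights, the radius, and the flat-list pose format are all assumptions.

```python
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two poses (lists of floats)."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def cluster_frames(frames, weights, radius):
    """Greedy clustering: returns (labels, representatives).

    A frame joins the first cluster whose representative lies within
    `radius`; otherwise it becomes the representative of a new cluster.
    """
    reps, labels = [], []
    for pose in frames:
        for c, rep in enumerate(reps):
            if weighted_distance(pose, rep, weights) <= radius:
                labels.append(c)
                break
        else:
            reps.append(pose)
            labels.append(len(reps) - 1)
    return labels, reps

# Column 0 stands for an important joint, column 1 for an unimportant one.
frames_demo = [[0.0, 0.0], [0.1, 5.0], [3.0, 0.0]]
labels_demo, reps_demo = cluster_frames(frames_demo, weights=[1.0, 0.0], radius=0.5)
```

With the second joint weighted at zero, the first two frames fall into the same cluster even though they differ a lot in that joint.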

Frames are grouped into clusters

A transition tree is built for each frame

Building the Cluster Forest

Each frame has a tree of clusters representing its valid transitions

  Find the most probable transition from the current frame to another cluster

  If the number of frames required to reach that cluster is within a time threshold, add it to the forest

  Repeat
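The time-threshold test can be sketched as a breadth-first search over the frame graph that records which other clusters are reachable within a frame budget. This is a flattened, one-level simplification and not the paper's full tree construction; `successors` and `labels` are hypothetical inputs.

```python
from collections import deque

def reachable_clusters(start, successors, labels, max_frames):
    """Return {cluster: fewest frames needed} for every cluster other than
    the start frame's own that can be reached within `max_frames` steps.

    successors[i] lists the frames that may follow frame i (the next frame
    in the clip plus any transitions that survived pruning).
    labels[i] is the cluster id of frame i.
    """
    dist = {start: 0}
    found = {}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == max_frames:
            continue  # Frame budget exhausted along this route.
        for v in successors[u]:
            if v in dist:
                continue
            dist[v] = dist[u] + 1
            c = labels[v]
            if c != labels[start] and c not in found:
                found[c] = dist[v]
            queue.append(v)
    return found

succ_demo = [[1], [2, 4], [3], [], [5], []]
cluster_of = [0, 0, 1, 1, 2, 2]
```

In the toy graph, clusters 1 and 2 are both two frames away from frame 0, so they appear with a budget of 2 and disappear with a budget of 1.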

Caveats about Clustering

Clustering is not always extremely useful

  Mostly a user interface issue

Useful for directly selecting the next motion (Direct Choice)

Not as useful for procedurally determining behavior (Path Sketching, Mimic)

Control Methods

Several interface methods were used, depending on how well they suited the example

  Direct Choice

  Sketching

  Video-Capture

Direct Choice

Display valid states for the avatar, and let the user choose

Path Sketching

Allow the user to specify a path to follow

  Find motions that will put the avatar in the right place
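A minimal sketch of the selection step, assuming each candidate motion is summarized by the ground-plane position it would leave the avatar in, and that the best candidate is the one ending closest to the sketched polyline. A real system would likely score whole trajectories; all names here are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_path(p, path):
    """Distance from p to a polyline given as a list of at least two points."""
    return min(point_segment_distance(p, a, b) for a, b in zip(path, path[1:]))

def pick_next(candidates, path):
    """candidates maps a candidate id to the (x, y) position it leads to;
    returns the id whose end position is closest to the path."""
    return min(candidates, key=lambda c: distance_to_path(candidates[c], path))

sketch_path = [(0.0, 0.0), (10.0, 0.0)]
choice = pick_next({"veer_left": (5.0, 3.0), "straight": (5.0, 1.0)}, sketch_path)
```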

Video Mimic

Determine limb and body orientation from video input

Find closest matching frame(s), and imitate the user
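The matching step can be sketched as a weighted nearest-neighbor search over the stored frames. The feature vectors extracted from video and the weights are assumed inputs; how they are computed is outside this sketch.

```python
def closest_frame(query, frames, weights):
    """Return the index of the stored frame closest to `query` under a
    weighted squared-difference cost (first index wins on ties)."""
    def cost(i):
        return sum(w * (q - x) ** 2 for w, q, x in zip(weights, query, frames[i]))
    return min(range(len(frames)), key=cost)

stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
match = closest_frame([0.9, 1.2], stored, weights=[1.0, 1.0])
```

A linear scan like this is fine for illustration; with thousands of frames, a spatial index or a search restricted to frames reachable from the current one would be more practical.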

Results

Terrain

  Path Sketching

Step Stool

  Path Sketching

  Direct Choice

Playground

  Direct Choice

Any Questions?
