IEEE 2011 Conference for Visual Media Production (CVMP), London, United Kingdom
SPACE-TIME EDITING OF 3D VIDEO SEQUENCES
Margara Tejera and Adrian Hilton
University of Surrey, Guildford, United Kingdom
Abstract
A shape constrained Laplacian mesh deformation approach is introduced for interactive editing of mesh sequences. This allows low-level constraints, such as foot or hand contact, to be imposed while preserving the natural dynamics of the captured surface. The approach also allows artistic manipulation of motion style to achieve effects such as squash-and-stretch. Interactive editing of key-frames is followed by automatic temporal propagation over a window of frames. User edits are seamlessly integrated into the captured mesh sequence. Three spatio-temporal interpolation methods are evaluated. Results on a variety of real and synthetic sequences demonstrate that the approach enables flexible manipulation of captured 3D video sequences.
Keywords: 3D video, 3D animation, performance capture,
mesh editing, stylisation
1 Introduction
Capturing human motion has been of interest in computer
vision and graphics research for over 30 years. Technology
has evolved from sparse marker based motion capture (MoCap)
systems, to dense reconstruction of non-rigid surfaces as 3D
video mesh sequences [20, 24]. Practical reuse of 3D video
sequences for animation requires editing techniques which
provide the level of control available with conventional skeletal
animation while preserving the captured non-rigid surface
dynamics.
Space-time skeletal motion editing techniques have been
developed for manipulation of captured sequences via
interactive changes at key-frames which are propagated across
the sequence [6, 13]. This provides low-level animation control
and allows constraints such as character-object interaction
or foot-floor contact to be imposed to modify a character’s
motion. A similar level of interactive editing control is
desirable for captured 3D video mesh sequences.
In this work we build on previous research in Laplacian mesh
editing [19, 3] to introduce techniques for space-time editing
of mesh sequences. Xu et al. [25] and Kircher and Garland
[10] presented general approaches for key-frame editing of
mesh sequences as a set of transformations on individual mesh
elements which are weighted over a window either side of the
key-frame. We present an analogous approach to key-frame
editing which also constrains the mesh sequence deformation
to a learnt space of motions. This ensures preservation of
the captured motion characteristics and underlying anatomical
structure of the actor performance. In order to propagate
key-frame edits across the sequence, we present two novel
non-linear interpolation methods and evaluate their advantages
and limitations with respect to the traditional linear methods.
Results on real and synthetic 3D video sequences demonstrate
that the proposed space-time editing approach provides a
flexible tool for interactive manipulation allowing both low-
level constraints and artistic stylisation.
2 Related work
Editing and stylisation of skeletal MoCap data: Rose
et al. [17] introduced methods for high-level parametric
control of skeletal motion by interpolation between captured
motions, which led to more sophisticated techniques where
a parametrised space of motions is created [11, 16]. The idea
of concatenating motions from different parametric motion
spaces inspired the so-called motion graphs [12, 1], or move
trees. This added the possibility to transition seamlessly
between captured motion sequences. Heck and Gleicher [7]
combined parametrised motions with motion graphs to allow
high-level control and transition between multiple motions.
Other research has followed the process traditionally done by
the animators: first, edit a set of key-frames of the sequence,
creating a set of poses that satisfies the constraints set by the
user; and second, create in-between poses that preserve the
naturalness of the original motion. The work of Gleicher [6]
and Lee and Shin [13] are examples of space-time editing
approaches. Gleicher [6] solves for both space and time
constraints simultaneously. Lee and Shin [13] modify the
poses of the skeleton in the key-frames by means of an inverse
kinematics solver (IK), and then apply a multilevel B-Spline
approximation for the interpolation of poses.
As well as depicting an action, characters communicate
feelings to the viewer and can perform the same action
in different styles. Brand and Hertzmann [5] presented
the first method to automatically separate the “style” from
the “structure” by an unsupervised learning process based
on Hidden Markov Models, capable of capturing the data
essential structure and discarding its accidental properties.
Following the same learning approach, Hsu et al. [8] train style
translation models that describe how to transition from one
motion to another, and Shapiro et al. [18] apply Independent
Component Analysis to decompose the motion. Min et al.
[15] construct a generative human motion model using
2011 Conference for Visual Media Production
978-0-7695-4621-6/11 $26.00 © 2011 IEEE
DOI 10.1109/CVMP.2011.23
multi-linear data analysis techniques. This model is driven by
two parameters and their adjustment produces personalised
stylistic human motion.
Editing and stylisation of 3D video data: 3D video
data has an inherent complex nature. Multiple-view video
reconstruction generates independent meshes at each frame,
resulting in a lack of temporal consistency required to
manipulate the mesh sequence. To overcome this difficulty,
techniques for mesh sequence processing have relied on
either the use of synthetic data, or the application of
shape similarity measures. Huang et al. [9] concatenated
clips of captured sequences by determining transition links
using similarity matrices based on shape histograms. This
work extended the concept of “motion graphs” [12, 1] from
skeletons to 3D data.
Analogous to the IK methods for skeletal data, several mesh
editing techniques have been developed. They generally
consist of a global optimisation that tries to preserve the
local differential properties of the mesh while satisfying user
constraints. A comprehensive comparison between these
methods is provided in [3]. Sumner et al. [23] formulate the
problem as a least-squares minimisation that manipulates the
deformation gradients of the triangles, which describe their
transformation with respect to a reference pose. A non-linear
feature space is constructed using the deformation gradients
as feature vectors and applying polar decomposition and
exponential maps to avoid naive linear blending of poses,
which would lead to unnatural results. Laplacian-based
approaches [19] define a linear operator according to the
connectivity and the area of the triangles of the meshes.
The application of this operator yields a set of differential
coordinates whose direction approximates the direction of
the local normals of the triangles, and whose magnitude is
proportional to the local mean curvature. The main drawback
of these methods is having to deal with rotations explicitly.
Lipman et al. [14] introduced a mesh representation based
on rotation-invariant linear coordinates that addresses this
problem: linear shape interpolation of meshes using this
representation handles rotations correctly.
Following the key-frame editing scheme, the mesh editing
problem can be extended to sequences. Xu et al. [25]
introduced an alternating least-square method based on
rotation-invariant linear coordinates [14] demonstrating
natural deformation. Constraints at key-frames are propagated
by a handle trajectory editing algorithm, obtaining an overall
natural-looking motion. Kircher and Garland [10] presented
a new differential surface representation which encodes first
and second order differences of each vertex with respect to its
neighbours giving rotation and translation invariance. These
differences are stored in “connection maps”, one per triangle,
which allow the development of motion processing operations
with better results than vertex-based approaches.
Sumner and Popovic [22] addressed the problem of transferring
poses between characters. Deformation transfer is achieved
by applying the affine transformation that each triangle of a
character’s mesh undergoes to transform from a reference pose
to a desired pose, to the triangles of a different character. This
work was generalised by Baran et al. [2], who presented a
patch-based mesh representation derived from [14] that allows
the semantic transfer of poses, e.g. transferring the motion of
arms to legs.
3 Learning a space of deformation for mesh sequences
Skeletal motion sequences explicitly represent the anatomical
structure which is preserved during editing. For mesh
sequences the underlying physical structure is implicit
requiring editing to be constrained to reproduce anatomically
correct deformations. To preserve the implicit structure we
learn a mesh motion space from the temporally aligned 3D
performance capture data and constrain the Laplacian mesh
editing to lie in this space. A learnt deformation gradient
feature space to constrain the editing of a single mesh using a
sparse set of examples was previously introduced by Sumner et al.
[23]. In this work we extend editing to mesh sequences and
directly learn the space in differential coordinates to constrain
subsequent deformation. This effectively combines previous
free-form mesh sequence editing [25, 10] with learnt spaces
of mesh deformation [23] within a Laplacian mesh editing
framework [19, 3].
3.1 Laplacian mesh editing framework
Laplacian mesh editing is based on a differential representation
of the mesh which allows local mesh properties to be encoded.
The gradient of the triangles' basis functions $\phi_i$ yields a $3 \times 4$ matrix $G_j$ for each of the triangles [21]:

$$G_j = (\nabla\phi_1, \nabla\phi_2, \nabla\phi_3, \nabla\phi_4) \quad (1)$$

$$= \begin{pmatrix} (\mathbf{p}_1 - \mathbf{p}_4)^\top \\ (\mathbf{p}_2 - \mathbf{p}_4)^\top \\ (\mathbf{p}_3 - \mathbf{p}_4)^\top \end{pmatrix}^{-1} \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{pmatrix} \quad (2)$$
where $\mathbf{p}_1$, $\mathbf{p}_2$ and $\mathbf{p}_3$ are the positions of the vertices of the $j$th
triangle and $\mathbf{p}_4$ is a fourth vertex added along the unit normal
[22]. Applying this gradient to every triangle of the mesh, we
can construct a matrix $G$ of size $4m \times n$, where $n$ is the number
of vertices and $m$ the number of triangles [4]:
$$\begin{pmatrix} G_1^\top \\ \vdots \\ G_m^\top \end{pmatrix} = G \begin{pmatrix} \mathbf{p}_1^\top \\ \vdots \\ \mathbf{p}_n^\top \end{pmatrix} \quad (3)$$
Let $A$ be a diagonal weighting matrix containing the areas
of the triangles. The matrix $G^\top A$ then represents the discrete
divergence operator, and the discrete Laplace-Beltrami
operator $L$ can be constructed by performing the following
multiplication: $L = G^\top A G$ [3]. Given a mesh, its differential
coordinates can be obtained by multiplying the Laplacian
operator by its absolute coordinates: $\delta(x) = Lx$, $\delta(y) = Ly$ and $\delta(z) = Lz$.
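As an illustrative sketch of this construction (not the authors' code), the per-triangle gradient of equation 2 can be assembled with NumPy; `triangle_gradient` is a hypothetical helper name, and in practice $G$ would be assembled as a sparse $4m \times n$ matrix:

```python
import numpy as np

def triangle_gradient(p1, p2, p3):
    """Per-triangle gradient G_j of equation 2. A fourth vertex p4 is
    added along the unit normal of the triangle, as in [22]."""
    n = np.cross(p2 - p1, p3 - p1)
    p4 = p1 + n / np.linalg.norm(n)
    D = np.stack([p1 - p4, p2 - p4, p3 - p4])   # rows (p_i - p_4)^T
    B = np.array([[1., 0., 0., -1.],
                  [0., 1., 0., -1.],
                  [0., 0., 1., -1.]])
    return np.linalg.inv(D) @ B                 # the 3 x 4 matrix G_j
```

Applied to an affine function sampled at the four vertices, $G_j$ recovers its gradient exactly: for $f(\mathbf{p}) = x$ the result is $(1, 0, 0)^\top$, which is a convenient sanity check.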
If we assume the addition of positional soft constraints $x_c$, the
$x$ absolute coordinates of the reconstructed mesh (the same
applies for $y$ and $z$) can be computed in the least-squares sense
[19]:

$$x = \arg\min_x \left( \|Lx - \delta(x_o)\|^2 + \|W_c(x - x_c)\|^2 \right) \quad (4)$$

where $x_o$ are the coordinates of the original mesh and $x_c$ are
the soft constraints on vertex locations given by the feature
correspondence, with a diagonal weight matrix $W_c$.
This equation allows the reconstruction of a mesh by means
of the Laplacian operator L that, due to its linear nature,
does not account for changes in rotation. To allow non-linear
interpolation of rotation, an iterative approach is taken [21]: in
each step of the minimisation the changes in rotation of each
triangle are computed and the Laplacian operator is updated
accordingly. The non-rotational part of the transformations is
discarded in order to help preserve the original shape
of the triangles.
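A minimal numerical sketch of the least-squares solve in equation 4, on a toy path-graph Laplacian rather than the cotangent/area-weighted operator, and without the iterative rotation update; the constraint weight of 10 is an arbitrary choice for illustration:

```python
import numpy as np

def laplacian_solve(L, delta, Wc, xc):
    """Single linear solve of equation 4 via the normal equations
    (L^T L + Wc^T Wc) x = L^T delta + Wc^T Wc xc. The full method
    re-estimates per-triangle rotations and updates L each step [21]."""
    A = L.T @ L + Wc.T @ Wc
    b = L.T @ delta + Wc.T @ Wc @ xc
    return np.linalg.solve(A, b)

# Toy example: 5 vertices on a path; preserving the differential
# coordinates of positions 0..4 while pulling vertex 4 to 8 simply
# translates the whole chain.
n = 5
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0                 # graph Laplacian of a path
x0 = np.arange(n, dtype=float)
Wc = np.zeros((n, n)); Wc[4, 4] = 10.0    # soft constraint on vertex 4
xc = np.zeros(n); xc[4] = 8.0
x = laplacian_solve(L, L @ x0, Wc, xc)    # -> [4, 5, 6, 7, 8]
```

Because both residual terms can be made exactly zero here (the Laplacian of a path is translation-invariant), the solve recovers the translated chain regardless of the constraint weight.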
3.2 Laplacian editing with a learnt deformation space
In this work, we introduce a novel mesh editing framework
based on the Laplacian deformation scheme presented in
section 3.1. The novelty resides in incorporating into the
algorithm the previously observed deformations of the
character. This constrains the possible solutions of the
deformation solver, ensuring the preservation of the captured
motion characteristics and underlying anatomical structure of
the actor performance.
For a sequence of meshes $\{M(t_i)\}_{i=0}^{F}$, where $F$ is the number
of frames, the mesh motion deformation space is built by
taking each mesh represented in differential coordinates as a
deformation example. Our data matrix M is built by placing
the concatenated δ(x), δ(y) and δ(z) differential coordinates
of each example in its rows:
$$M = \begin{pmatrix} \delta_1(x)^\top & \delta_1(y)^\top & \delta_1(z)^\top \\ \delta_2(x)^\top & \delta_2(y)^\top & \delta_2(z)^\top \\ \vdots & \vdots & \vdots \\ \delta_F(x)^\top & \delta_F(y)^\top & \delta_F(z)^\top \end{pmatrix} \quad (5)$$
The data matrix is centred, obtaining $M_c = M - \bar{M}$, where
$\bar{M}$ is an $F \times 3n$ matrix whose rows are the mean of the rows
of the data matrix $M$. In order to obtain a basis representing
the space of deformations, an SVD decomposition is performed
on the matrix $M_c$: $M_c = UDV^\top$, where $V$ is a $3n \times F$ matrix
with a basis vector in each of its columns. The first
$l$ eigenvectors $e_k$ representing 95% of the variance are kept,
which gives a linear basis of the form:
$$\delta(r) = \bar{\delta} + \sum_{k=1}^{l} r_k e_k = \bar{\delta} + Er, \quad (6)$$

where $r_k$ are scalar weights for each eigenvector, $r$ is an $l$-dimensional
weight vector and $E$ is a $3n \times l$ matrix whose
columns are the first $l$ eigenvectors of length $3n$.
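The basis construction of equations 5-6 is a PCA of the differential coordinates and can be sketched directly with an SVD; `learn_deformation_basis` is a hypothetical name for illustration:

```python
import numpy as np

def learn_deformation_basis(M, var_keep=0.95):
    """M is the F x 3n data matrix of equation 5, one deformation
    example (concatenated differential coordinates) per row.
    Returns the mean and a 3n x l basis E so delta(r) = mean + E r."""
    mean = M.mean(axis=0)
    Mc = M - mean                                  # centred data matrix
    U, S, Vt = np.linalg.svd(Mc, full_matrices=False)
    var = np.cumsum(S**2) / np.sum(S**2)
    l = int(np.searchsorted(var, var_keep)) + 1    # keep 95% of variance
    return mean, Vt[:l].T                          # columns = eigenvectors
```

A weight vector $r$ then parametrises deformations as $\delta(r) = \bar{\delta} + Er$, and projecting a training example back through the basis reconstructs it exactly when it lies in the retained subspace.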
(a) Left: original mesh with user-specified constraints
coloured in orange and red. Centre: Laplacian editing. Right:
Laplacian editing with learnt deformation space.
(b) Detail of the tail. Using Laplacian editing (left) causes the
tail to fold over itself. Using the learnt deformation space
(right) preserves the tail shape.
Figure 1: Effect of incorporating a learnt space of deformations into the Laplacian framework. The neck of the horse has been lengthened dramatically. (Dataset courtesy of [22])
Space-time editing of key-frames in the mesh sequences is
performed using constrained Laplacian mesh editing within
the space of deformations $\delta(r)$. From equation 4 we have:

$$(r, x) = \arg\min_{r,x} \left( \|Lx - \delta(r)\|^2 + \|W_c(x - x_c)\|^2 \right) \quad (7)$$
Equation 7 allows interactive editing of a key-frame mesh
M(tk) to satisfy a set of user-defined constraints xc resulting
in a modified mesh M ′(tk) with vertices x′(tk). To construct
a basis with respect to the mesh M(ti) for each frame in
the mesh sequence the Laplacian Li, defined according to the
discrete gradient operator matrix Gi, is used as a reference in
the construction of the data matrix Mi such that δi(x) = Lix.
Constructing a local basis defines changes in shape in the
learnt motion space taking the reference frame as the origin.
The use of a local basis gives improved speed of convergence
and control in shape deformation within the shape constrained
Laplacian mesh editing.
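Equation 7 couples the basis weights $r$ with the vertex positions $x$. One simple way to solve it (a sketch under our own assumptions, not necessarily the authors' solver) is to alternate a linear solve for $x$ with $\delta(r)$ fixed and a projection of $Lx$ onto the learnt basis to update $r$:

```python
import numpy as np

def constrained_edit(L, mean, E, Wc, xc, iters=5):
    """Alternating minimisation of equation 7: the x-step is the
    linear solve of equation 4 with delta(r) as target; the r-step
    refits the basis weights (E is assumed to have orthonormal
    columns, as returned by an SVD)."""
    r = np.zeros(E.shape[1])
    A = L.T @ L + Wc.T @ Wc
    for _ in range(iters):
        delta_r = mean + E @ r                # delta(r), equation 6
        b = L.T @ delta_r + Wc.T @ Wc @ xc
        x = np.linalg.solve(A, b)             # x-step
        r = E.T @ (L @ x - mean)              # r-step (least squares)
    return x, r
```

On convergence $Lx \approx \bar{\delta} + Er$, i.e. the edited differential coordinates stay inside the learnt deformation space while the soft constraints $x_c$ are satisfied.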
(a) Left: original mesh with user-specified constraints
coloured in orange and red. Centre: Laplacian editing. Right:
Laplacian editing with learnt deformation space.
(b) Detail of the legs. Using the learnt deformation space
(right) preserves the leg shape avoiding mesh collapse which
occurs with Laplacian editing (left).
Figure 2: Effect of incorporating a learnt space of deformations into the Laplacian framework. The left leg of the character has been straightened.
Examples of the effect of the basis are depicted in figures 1,
2 and 3. Deformations applying the learnt deformation space
within the Laplacian framework preserve the surface details
and the underlying structure of the character. This avoids
artefacts such as thinning, unnatural bending of limbs and
collapsing of the mesh which occur if the Laplacian is not
constrained to a learnt deformation space.
4 Editing in a learnt deformation space
The space-time editing pipeline consists of deforming a set
of key-frames and subsequently propagating these changes
over a temporal window with the objective of seamlessly
incorporating the edited frames into the sequence. User
input is necessary both to choose the key-frames and to select
the constrained vertices. Our space-time interface allows
selection of any vertex on the mesh as a constraint. This is
flexible compared to previous mesh editing approaches which
require a set of handles to be predefined.
4.1 Key-frame editing
Key-frame editing is performed within the Laplacian
framework described in section 3.2. During an off-line
process, each frame of the sequences of a given character
is used as the reference frame for computing a space of
deformations. In our implementation, all available frames for
the character are considered as deformation examples for the
(a) Left: original mesh with user-specified constraints
coloured in orange and red. Centre: Laplacian editing. Right:
Laplacian editing with learnt deformation space.
(b) Detail of the legs. Using the learnt deformation space
(right) preserves the leg shape, avoiding the mesh thinning which
occurs with Laplacian editing (left).
Figure 3: Effect of incorporating a learnt space of deformations into the Laplacian framework. The right leg of the character has been bent.
construction of the deformation space.
The user interactively selects two sets of vertices: the vertices
whose position must stay unchanged during the deformation,
and the vertices that will be dragged to a desired position.
These positional constraints, together with the space of deformations
associated with the given frame, are incorporated in equation
7. Figure 4 shows an example of a key-frame edit.
Figure 4: Key-frame editing. Left: original horse. Centre: original horse showing the constrained vertices; the red group will stay fixed and the orange group will be moved during the editing. Right: edited horse. (Dataset courtesy of [22])
4.2 Space-time propagation
Changes to the key-frames must be propagated over time in
order to obtain a natural-looking motion. Three propagation
methods are evaluated: linear interpolation, non-linear
interpolation and constraint interpolation. A discussion and
(a) original sequence
(b) two edited key-frames
(c) space-time editing with $T_k$ = 3 frames
(d) space-time editing with $T_k$ = 6 frames
Figure 5: Space-time editing of 3D video for a walking sequence with multiple key-frames to modify character height
comparison between these methods is included at the end of
the section.
Figure 5 illustrates the process of space-time editing for a walk
sequence: a key-frame is selected and modified, and changes are
then propagated across a temporal window with weights shown
by the mesh colour. In this example the character's height is
modified in an unnatural way on two key-frames to give an
example of the space-time propagation which is easily visible.
More subtle physically realistic editing examples are included
in the results.
4.2.1 Linear interpolation
Given an edited key-frame mesh $M'(t_k)$ with vertices $x'(t_k)$,
edits are propagated temporally to other frames of the mesh
sequence $M(t_i)$ with vertices $x(t_i)$ using a spline to define the
interpolation weights $\lambda_i$ for the difference in mesh shape
$\Delta_k = (x'(t_k) - x(t_k))$:

$$x'(t_i) = x(t_i) + \lambda_i \Delta_k \quad (8)$$
Multiple key-frame edits can be combined as a linear sum of
edits:
$$x'(t_i) = x(t_i) + \sum_{k=1}^{K_f} \lambda_{ik} \Delta_k \quad (9)$$
where Kf is the number of key-frames. This linear sum allows
compositing of changes from multiple frames in the sequence
with weighted influence on the shape at a particular frame
providing intuitive control over mesh sequence deformation.
In practice, weights are interpolated over a temporal window of
influence around each key-frame, $t_k \pm T_k$, which can be selected
by the user.
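A sketch of the weighted propagation of equations 8-9, using a smoothstep falloff as a stand-in for the spline weights (the paper does not specify the exact spline, so this choice is an assumption); `window_weights` and `propagate_linear` are hypothetical names:

```python
import numpy as np

def window_weights(t_i, t_k, T_k):
    """Weight lambda_i: 1 at the key-frame t_k, 0 outside t_k +/- T_k,
    with a C1-continuous smoothstep falloff in between."""
    u = np.clip(1.0 - abs(t_i - t_k) / float(T_k), 0.0, 1.0)
    return u * u * (3.0 - 2.0 * u)

def propagate_linear(X, edits):
    """Equation 9. X: (F, n, 3) vertex positions for F frames;
    edits: list of (t_k, X_edit, T_k) key-frame edits."""
    Xp = X.copy()
    for t_k, X_edit, T_k in edits:
        d = X_edit - X[t_k]                              # Delta_k
        for t in range(len(X)):
            Xp[t] = Xp[t] + window_weights(t, t_k, T_k) * d
    return Xp
```

Multiple key-frame edits simply accumulate, matching the linear sum of equation 9.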
Linear interpolation is computationally efficient but may result
in unrealistic deformation such as shortening of limbs. We
therefore propose a non-linear and a constraint interpolation
approach which aim to preserve the mesh structure. A
comparative evaluation is presented in section 4.3.
4.2.2 Non-linear interpolation
We propose a non-linear interpolation method based on
the propagation of the triangle transformations between the
edited and the original key-frame. Given a key-frame mesh
$M'(t_k)$ and its original version $M(t_k)$, the transformation
that the $j$th triangle of $M(t_k)$ undergoes to transform into
the corresponding triangle in $M'(t_k)$ is computed and polar
decomposed into its rotational, $R$, and non-rotational, $S$,
components: $T'^j_k = R'^j_k S'^j_k$. Let $q'^j_k$ be the quaternion
associated with $R'^j_k$ and $q^j_k$ the quaternion identity for all
$j$, where the superscript refers to the $j$th triangle. The
interpolated rotation $q'^j_i$ is computed as:

$$q'^j_i = \mathrm{slerp}(q^j_k, q'^j_k, \lambda_i) \quad (10)$$

Letting $S^j_k = I$ for all $j$, the non-rotational scale/shear part
$S'^j_i$ is linearly interpolated:

$$S'^j_i = S^j_k + \lambda_i (S'^j_k - S^j_k) \quad (11)$$
Multiple key-frame edits can be combined analogously to
equation 9:

$$q'^j_i = \prod_{k=1}^{K_f} \mathrm{slerp}(q^j_k, q'^j_k, \lambda_{ik}) \quad (12)$$

$$S'^j_i = \sum_{k=1}^{K_f} S^j_k + \lambda_{ik}(S'^j_k - S^j_k) \quad (13)$$

where $\prod$ represents quaternion multiplication.

Converting $q'^j_i$ to $R'^j_i$, a set of transformations $T'^j_i = R'^j_i S'^j_i$
can be computed. Applying these transformations directly
to the triangles of $M(t_i)$ would result in an unconnected
Figure 6: Illustration of the constraint interpolation method. First row: original sequence. Second row: a key-frame has been edited and the constraints (in red and orange) have been interpolated. Third row: for each frame within the window of propagation, the Laplacian deformer of equation 7 is run to deform the meshes subject to the interpolated constraints.
mesh. The Laplacian deformation framework of equation 4
is applied to link the triangles back together. In this case the
non-rotational part of the transformations is kept in order to
correctly apply the $S'^j_i$ components.
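The per-triangle interpolation of equations 10-11 can be sketched in NumPy using a polar decomposition via the SVD, with the slerp from the identity quaternion expressed equivalently as an axis-angle scaling of the rotation. This sketch assumes transformations without reflection ($\det T > 0$) and rotation angles away from $\pi$:

```python
import numpy as np

def rotvec_from_matrix(R):
    """Axis-angle vector (log map) of a rotation matrix; valid for
    rotation angles away from pi."""
    cos_t = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta * w / (2.0 * np.sin(theta))

def matrix_from_rotvec(v):
    """Rodrigues' formula (exp map)."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3)
    k = v / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def partial_transform(T_edit, lam):
    """Equations 10-11 for one triangle: polar-decompose T' = R'S',
    slerp the rotation from the identity, lerp the scale/shear from I.
    Assumes det(T_edit) > 0 (no reflection)."""
    U, s, Vt = np.linalg.svd(T_edit)
    R = U @ Vt                                  # rotational part
    S = Vt.T @ np.diag(s) @ Vt                  # symmetric scale/shear
    R_lam = matrix_from_rotvec(lam * rotvec_from_matrix(R))
    S_lam = np.eye(3) + lam * (S - np.eye(3))   # equation 11
    return R_lam @ S_lam
```

Setting $\lambda = 0$ returns the identity and $\lambda = 1$ recovers the full edit transformation, matching the end-points of equations 10-11.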
4.2.3 Constraint interpolation
The linear and non-linear propagation methods discussed
in the previous paragraphs find the edited meshes $M'(t_i)$
by processing information from the original meshes $M(t_i)$
and the key-frame edits. An alternative method consists
of propagating the positions of the constraints over the
temporal window, and subsequently performing a Laplacian
deformation according to equation 7 to obtain $M'(t_i)$ subject
to the interpolated constraints. This offers the advantage of
controlling the positions of the constrained vertices along the
window, at the expense of a higher computational cost.
Directly interpolating the constraint coordinates does not
guarantee the preservation of the shape of the submesh
comprised by the selected vertices. Therefore, the non-linear
interpolation method presented in section 4.2.2 is applied
to compute the position of the constrained vertices over
the propagation window. This approach differs from more
simplistic approaches where these positions are found by
averaging the rotations for each of the triangles [25]. An
illustration of the method can be found in figure 6.
Although computationally more expensive, the constraint
interpolation provides full control on the positions of the
constrained vertices along the window of propagation. This
allows fixed constraints to be enforced over a temporal
window, for example on hand or foot location during contact.
4.3 Discussion on interpolation methods
A comparison between the three interpolation methods
discussed in section 4.2 is presented in figure 7. This shows the
propagation of the edited key-frame of figure 4, from the horse
gallop sequence. Applying the linear interpolation causes the
front legs to shorten, while the non-linear and constraint
interpolation methods achieve natural-looking results.
The non-linear interpolation incorporates the transformation
of the mesh triangle by triangle taking into account both the
rotation and scale/shear components of the transformations.
This avoids artefacts related to linear interpolations, such as
shortening of the limbs or distortion of the original shape of
the mesh.
Since applying the constraint interpolation method means
deforming each of the meshes within the propagation window
subject to a set of constraints, it provides more control over
the position of the constrained vertices along the temporal
window. However, it is computationally the most expensive
method. Computation times for the propagation of the
space-time editing example of figure 8(e) were 1.601, 5.567
and 13.926 seconds for the linear interpolation, the non-
linear interpolation and the constraint interpolation methods,
respectively.
5 Space-time editing results
Space-time editing is demonstrated on both synthetic and
captured mesh sequences. A variety of editing operations
are illustrated to demonstrate the flexibility of the proposed
Figure 7: Comparison of the propagation of an edit using three different interpolation methods. Above, the original sequence; below, the propagation window for each of the methods. Top row: linear interpolation. Middle row: non-linear interpolation. Bottom row: constraint interpolation. The edited mesh is shown at the left of the figure, and the frame subsequent to the propagation window is shown at the right. (Dataset courtesy of [22])
approach. Space-time editing of a walk sequence to modify
feet positions, avoid obstacles and step up onto a platform is
shown in figure 8(a,b). This illustrates a common application
of space-time editing of captured sequences to modify contact
positions according to scene constraints.
In figure 8(a)(right) the space-time editing approach has been
used to repair reconstruction errors. The original sequence (see
accompanying video) shows a twirl where there is a loss of
contact between the hand and the skirt. In the edited sequence
the hand has been moved to grasp the skirt correctly.
Figure 8(c) shows a more complex space-time edit to
modify the arm and leg movements of the street dancer while
preserving both the anatomical structure and surface dynamics.
Space-time editing also allows artistic stylisation of the
motion to create common animation effects such as movement
emphasis, exaggeration and cartoon effects of squash-stretch
as well as re-timing of the sequence for anticipation. Figure
8(d) presents examples of motion stylisation to exaggerate the
walking of a character with a loose dress and to produce a
cartoon style squash-stretch effect for a jump.
Finally, figure 8(e) shows the editing of a synthetic horse
galloping sequence where the torso of the horse has been lifted.
This example illustrates the effect of applying large changes
to a mesh sequence. Constraining the deformation to a learnt
deformation space preserves the mesh structure, ensuring a natural
motion sequence.
Video sequences corresponding to the results presented
in figure 8 and demonstration of the interactive interface
are included in the supplementary video. Results of
space-time editing demonstrate that the approach allows
flexible interactive editing of captured sequences to satisfy
user-specified constraints while preserving the natural spatio-
temporal dynamics of the captured motion.
The linear interpolation approach has been used to generate
the resulting sequences of figure 8(a,b,c,d). Since the edits
performed in these examples are small deformations, this has
not introduced visual artefacts. For the horse sequence of figure
8(e), where the key-frame undergoes a large deformation, the
non-linear interpolation was preferred to generate the final
sequence. As shown in figure 7, in this case the linear method
introduces significant errors if applied.
Computation times for a selection of space-time editing results
can be found in table 1. Timings show that for meshes of 3000-
6000 vertices the computation time for key-frame editing takes
0.5-1s allowing interactive editing with rapid feedback. These
timings are for a CPU implementation of the approach, real-
time performance could potentially be achieved with transfer
of the Laplacian solver to a GPU.
Typical values of Tk are in the range 4-8 frames.
Supplementary video: http://www.vimeo.com/25663553
Type of data               Sequence  Edit #  # vertices  # constrained vertices  Deform. time (s)
Real data                  Cones     1       2886        236                     0.636
                                     2       2886        242                     0.635
                                     3       2886        268                     0.644
                                     4       2886        247                     0.631
                                     5       2886        255                     0.643
                           Dancer    1       5580        1585                    1.187
                                     2       5580        1345                    1.446
                                     3       5580        1270                    1.430
                                     4       5580        1494                    1.171
                                     5       5580        1508                    0.880
                                     6       5580        1497                    1.042
                                     7       5580        1560                    1.082
                                     8       5580        1147                    1.057
                                     9       5580        704                     0.987
                                     10      5580        1432                    1.051
                                     11      5580        1258                    1.031
Real data for stylisation  Skirt     1       2854        691                     0.829
                                     2       2854        722                     0.396
                                     3       2854        613                     0.673
                                     4       2854        595                     0.820
                                     5       2854        550                     0.658
                                     6       2854        588                     0.536
Synthetic data             Horse     1       8431        6753                    1.601
Table 1: Computation times for a selection of space-time editing results. The sequence "Cones" corresponds to figure 8(a)(middle), sequence "Dancer" to figure 8(c), sequence "Skirt" to figure 8(d)(left) and sequence "Horse" to figure 8(e). Edit numbers refer to different key-frame edits performed on the sequences.
5.1 Discussion
Some of the results included in the video show small artefacts
due to one or more of the following reasons:
• Errors in surface reconstruction which are present in both
the original and edited sequences.
• The walking and running sequences shown are the result
of concatenating shorter sequences and small jumps may
be visible at the end of each cycle.
• If large deformations are applied in a small window of
frames, such as in the example where the feet positions of
the running sequence are modified, the resulting motion
may lack smoothness. Timing could be better controlled
by adding extra frames to the sequence. This remains as
future work.
6 Conclusions
Space-time editing of 3D sequences with a learnt motion
model gives a flexible interactive approach to mesh sequence
editing with a similar level of control to conventional skeletal
animation. This allows constraints such as foot or hand
position to be imposed or modification of the captured
movement to interact with objects while maintaining the
movement characteristics and anatomical structure of the
captured performance.
Three interpolation methods to propagate the changes on the
key-frames have been evaluated. While the linear interpolation
approach provides the fastest solution, it introduces artefacts such
as mesh shrinking. The non-linear and constraint interpolation
methods provide more accurate and natural-looking results at
the expense of longer computation times.
This paper focuses on the editing of dynamic surface geometry;
editing the dynamic surface appearance captured in 3D video
remains an open problem for future research.
References
[1] O. Arikan and D. A. Forsyth. Synthesizing constrained motions from examples. ACM Transactions on Graphics, 2002.
[2] I. Baran, D. Vlasic, E. Grinspun, and J. Popovic. Semantic deformation transfer. ACM Transactions on Graphics, 28, 2009.
[3] M. Botsch and O. Sorkine. On linear variational surface deformation methods. IEEE Transactions on Visualization and Computer Graphics, 14(1):213–230, 2008.
[4] M. Botsch, R. W. Sumner, M. Pauly, and M. Gross. Deformation transfer for detail-preserving surface editing. In Proc. Vision, Modeling, and Visualization, pages 357–364, 2006.
[5] M. Brand and A. Hertzmann. Style machines. In SIGGRAPH '00: Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pages 183–192, New York, NY, USA, 2000. ACM Press/Addison-Wesley Publishing Co.
[6] M. Gleicher. Motion editing with spacetime constraints. In SI3D '97: Proceedings of the 1997 symposium on Interactive 3D graphics, New York, NY, USA, 1997. ACM.
[7] R. Heck and M. Gleicher. Parametric motion graphs. In ACM Symposium on Interactive 3D Graphics, pages 129–136, 2007.
[8] E. Hsu, K. Pulli, and J. Popovic. Style translation for human motion. ACM Trans. Graph., 24(3):1082–1089, 2005.
[9] P. Huang, A. Hilton, and J. Starck. Human motion synthesis from 3D video. In CVPR, 2009.
[10] S. Kircher and M. Garland. Free-form motion processing. ACM Trans. Graph., 27:12:1–12:13, May 2008.
[11] L. Kovar and M. Gleicher. Automated extraction and parameterization of motions in large data sets. ACM Trans. Graph., 23:559–568, August 2004.
[12] L. Kovar, M. Gleicher, and F. Pighin. Motion graphs. In SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques, volume 21, pages 473–482, New York, NY, USA, July 2002. ACM.
[13] J. Lee and S. Y. Shin. A hierarchical approach to interactive motion editing for human-like figures. In SIGGRAPH '99: Proceedings of the 26th annual conference on Computer graphics and interactive techniques, pages 39–48, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.
[14] Y. Lipman, O. Sorkine, D. Levin, and D. Cohen-Or. Linear rotation-invariant coordinates for meshes. ACM Trans. Graph., 24:479–487, July 2005.
[15] J. Min, H. Liu, and J. Chai. Synthesis and editing of personalized stylistic human motion. In Proceedings of the 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games, I3D '10, pages 39–46, New York, NY, USA, 2010. ACM.
[16] T. Mukai and S. Kuriyama. Geostatistical motion interpolation. ACM Trans. Graph., 24:1062–1070, July 2005.
[17] C. Rose, B. Bodenheimer, and M. F. Cohen. Verbs and adverbs: Multidimensional motion interpolation using radial basis functions. IEEE Computer Graphics and Applications, 18:32–40, 1998.
[18] A. Shapiro, Y. Cao, and P. Faloutsos. Style components. In Proc. of Graphics Interface, 2006.
[19] O. Sorkine. Differential representations for mesh processing. Computer Graphics Forum, 25(4):789–807, December 2006.
[20] J. Starck and A. Hilton. Surface capture for performance-based animation. IEEE Computer Graphics and Applications, 27(3):21–31, 2007.
[21] C. Stoll, E. de Aguiar, C. Theobalt, and H.-P. Seidel. A volumetric approach to interactive shape editing. Technical report, Max-Planck-Institut für Informatik, June 2007.
[22] R. W. Sumner and J. Popovic. Deformation transfer for triangle meshes. In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pages 399–405, New York, NY, USA, 2004. ACM.
[23] R. W. Sumner, M. Zwicker, C. Gotsman, and J. Popovic. Mesh-based inverse kinematics. In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pages 488–495, New York, NY, USA, 2005. ACM.
[24] D. Vlasic, I. Baran, W. Matusik, and J. Popovic. Articulated mesh animation from multi-view silhouettes. ACM Trans. Graph., 27(3):1–9, 2008.
[25] W. Xu, K. Zhou, Y. Yu, Q. Tan, Q. Peng, and B. Guo. Gradient domain editing of deforming mesh sequences. In ACM SIGGRAPH 2007 Papers, SIGGRAPH '07, New York, NY, USA, 2007. ACM.
(a) Space-time editing of walk sequence for changing feet positions, collision avoidance and repairing reconstruction errors (the
hand has been moved to grasp the skirt correctly): original (blue); edited (green).
(b) Space-time editing of walk sequence for stepping onto a platform.
(c) Space-time editing of arm and leg movement for a street dancer sequence: original (blue); edited (green)
(d) Stylised sequences: walk with raised knees and jump with squash and stretch effects: original (blue); edited (green)
(e) Space-time editing of a synthetic horse galloping sequence: top row (original), bottom row (edited). (Dataset courtesy of
[22])
Figure 8: Interactive animation and space-time editing of synthetic and 3D video sequences.