hierarchical motion planning in topological representations · 2019. 8. 31. · hierarchical motion...

Hierarchical Motion Planning in TopologicalRepresentations

Dmitry Zarubin†, Vladimir Ivan∗, Marc Toussaint†, Taku Komura∗, Sethu Vijayakumar∗∗ School of Informatics, University of Edinburgh, UK† Department of Computer Science, FU Berlin, Germany

Abstract—Motion can be described in alternative represen-tations, including joint configuration or end-effector spaces,but also more complex topological representations that implya change of Voronoi bias, metric or topology of the motionspace. Certain types of robot interaction problems, e.g. wrappingaround an object, can suitably be described by so-called writheand interaction mesh representations. However, considering mo-tion synthesis solely in topological spaces is insufficient sinceit does not cater for additional tasks and constraints in otherrepresentations. In this paper we propose methods to combineand exploit different representations for motion synthesis, withspecific emphasis on generalization of motion to novel situations.Our approach is formulated in the framework of optimal con-trol as an approximate inference problem, which allows for adirect extension of the graphical model to incorporate multiplerepresentations. Motion generalization is similarly performed byprojecting motion from topological to joint configuration space.We demonstrate the benefits of our methods on problems wheredirect path finding in joint configuration space is extremelyhard whereas local optimal control exploiting a representationwith different topology can efficiently find optimal trajectories.Further, we illustrate the successful online motion generalizationto dynamic environments on challenging, real world problems.

I. INTRODUCTION

Many relevant robotic tasks concern close interactions withcomplex objects. While standard motion synthesis methodsdescribe motion in configuration space, tasks that concernthe interaction with objects can often more appropriately bedescribed in representations that reflect the interaction moredirectly. For instance, consider the wrapping of arms aroundan object, e.g. embracing a human. Described in joint space,such a motion is complex and varies greatly depending on theembraced object. When describing the motion more directlyin terms of the interaction of arm segments with object parts(e.g. using the interaction mesh representation that we willintroduce below) we gain better generalization to objects ofdifferent shape or position and online adaptation to dynamicobjects. Similar arguments apply to other scenarios, e.g. multi-link articulated robots reaching through small openings andcomplex structures, surfaces wrapping around objects andfingers grasping and manipulating objects. In such cases,the alternate representations greatly simplify the problems ofmotion generalization as well as planning.

There are several formal views on the implication of analternate abstract representation: 1) In the context of ran-domized search, such representations alter the Voronoi biasand therefore, the efficiency of RRT or randomized roadmaps. [9] demonstrate this effect in the case of RRTs. 2)

Fig. 1. KUKA LWR 4 robotic arm reaching through a hollow box with taskbeing defined in combined Writhe and interaction mesh space, showing anexample of planning and dynamic remapping using topological representationsas described in Section V-B.

An alternate representation may imply a different metric, suchthat a trajectory that is a complex path in one space becomes asimple geodesic in another. 3) An alternate representation maychange the topology of the space such that local optimizationin one space is sufficient for finding a solution whereas globaloptimization (randomized search) would be needed in theother. 4) Finally, different representations may allow us toexpress different motion priors, for instance, a prior preferring“wrapping-type motions” can be expressed as a simple Brow-nian motion or Gaussian process prior in one space, whereasthe same Brownian motion prior in configuration space renderswrapping motions extremely unlikely.

As opposed to the computer animation domain, wheretopological representation have recently been used [5], syn-thesizing motion in such abstract spaces for planning andcontrol of robotic systems come with additional challenges.Typically, control tasks are specified in world (or end effector)coordinates, the obstacles may be observed in visual (orcamera) coordinates, and the joint limits of the actuators aretypically described in joint coordinates. Therefore, the generalchallenge is to devise motion synthesis methods that combinethe benefits of reasoning in topological coordinates whilepreserving consistency across the control coordinates andmanaging to incorporate dynamic constraints from alternaterepresentations seamlessly.

In this paper, we propose methods that can combine thedifferent representations at the abstract and lower level andplan simultaneously at these different levels to efficientlygenerate optimal, collision free motion that satisfy constraints.

Each representation has their own strength and weaknesses andcoupling them can help solve a wider range of problems. Morespecifically, our contributions are:• The introduction of topological representations tuned to

the domain of robot motion synthesis and manipulation,with a strong focus on the interaction of manipulationand the environment.

• An extension of a stochastic optimal control frameworkthat combines various representations for motion synthe-sis. This is expressed in a graphical model that couplesmotion priors at different levels of representations.

• A method for motion generalization (remapping) to novelsituations via a topological representation.

• Experiments that validate the benefit of Bayesian infer-ence planning with topological representations in prob-lems involving complex interactions.

In the rest of this section, we will first review previouswork on the use of topological representations for characteranimation and abstract spaces used for robot motion synthesis.Section II then introduces two specific types of alternaterepresentations, one of which has the origin in topologicalproperties of strings—the writhe—and the other capturingthe relative distances between interacting parts. Section IIIpresents our approach to combining topological and configura-tion space representations in an optimal control setting throughthe Approximate Inference Control (AICO) [17, 13, 12])framework. This naturally leads to an extension that includesrandom variables for both the topological and configurationspace representations, with their specific motion priors coupledvia the graphical model. Section IV addresses using topolog-ical representation to generalize motion to novel situationsby “remapping” it using this more abstract space. Finally,in Section V we present experiments on using the proposedmethods to solve motion synthesis problems like unwrappingthat are infeasible (e.g. for RRT methods) without exploitingalternate representations and demonstrate generalizability tonovel, dynamic situations.

A. Previous Work

There is strong interest in reducing the dimensionality of thestate space of robots to simplify the control strategy. In [14],the dimensionality is reduced by projecting the states into thetask space and biasing the exploration of random trees towardsthe goal in the task space. Although such an approach is validif the work space is an open area with low number of obstacles,it is difficult to use when the task involves close interactionsbetween the robot and objects in the scene.

Machine learning techniques have been used to reduce thedimensionality of the task space. In [2], a latent manifoldin joint space is computed using Gaussian process fromsample configurations produced by an expert. This manifoldis, however, defined by samples from a valid trajectory in jointspace and it does not capture state of the environment directly.

In order to cope with problems of close interactions, itis necessary to abstract the state space based on the spatialrelations between the body parts and objects. Several robotics

researchers have developed knotting robots that abstract thestatus of the strands and plan the maneuvers by probabilisticroad maps [15, 18]. These approaches represent the ropestate based on how it is overlapping with itself when viewingit from a specific direction [4]. Then, the state transition isachieved by moving the end points toward a specific direction.Such a representation is not very convenient due to the view-dependence and the difficulty of manipulating the rope.

Alternate topological representations that describes the re-lationships between 1D curves using their original configura-tions are proposed for motion synthesis [5, 16] and classifyingpaths into homotopy groups [1]. In [5], a representation calledTopology Coordinates that is based on the Gauss LinkingIntegral is proposed to synthesize tangling movements. Bodiesare mapped from the new representation to the joint angles bythe least squares principle. In [16], the same representationis applied for controlling the movement of a robot that puts ashirt on a human. The mapping from the new representation tothe low level representation is learned through demonstrationsby humans. Such an approach is applicable only when thedesired movement is consistent and simple, but not when theplanning needs to be done between arbitrary sample points inthe state space. An idea to abstract the paths connecting a startpoint and the end point using homotopy classes in 2D based oncomplex numbers and in 3D based on the Ampere’s law [1] arealso proposed. Here the paths are only classified into homotopygroups and there is no discussion about the mapping from thetopological representation to the low level control coordinates.These representations are only applicable for 1D curves andis not applicable for describing the relationship between 2Dsurfaces, which is often needed for controlling robots.

Another representation called interaction mesh that abstractsthe spatial relationship of the body parts is introduced in [6].The relationship is quantified by the Laplacian coordinates ofthe volumetric mesh whose points are sampled on the bodyparts composing the scene. The advantage of this approach isthat it is not restricted to describe the relationship between 1Dcurves but is extendable to those between 2D surfaces. On thecontrary, it is a discrete representation which is only valid inthe neighborhood of the posture that the volumetric mesh iscomputed.

In summary, using a representation based on the spatialrelations is a promising direction to reduce the dimensionalityof the control, although little work has been done for pathplanning or optimal control in such state spaces.

II. TOPOLOGICAL REPRESENTATIONS

The topological spaces we will introduce significantly mod-ify the metric and topology of the search space. For instance,points that are near in the topology space might be far inconfiguration space, or a linear interpolation in topology spacemay translate to complex non-linear motion in configurationspace. The motion synthesis methods proposed in the next sec-tion are independent of the specific choice of representation.Here we will introduce two specific examples of topologicalrepresentations which we will also use in the experiments.

Fig. 2. Illustration of the definition of writhe for two segments. One - ab -belongs to the manipulator and another - cd - is a part of the obstacle.

These representations have previously been used in the contextof computer animation, namely the writhe representation [5]that captures the “windedness” of one object around another,and the interaction mesh representation [6] that captures therelative positioning of keypoints between interacting objects.

Both types of representations can be formalized by a map-ping φ : q 7→ y from configuration space to the topologicalspace, where q ∈ Rn is the configuration state with n degreesof freedom. To be applicable within our motion synthesissystem, we need to compute φ and its Jacobian J for anyq.

A. Writhe matrix and scalarThe writhe [7] is a property of the configuration of two

kinematic chains (or in the continuous limit, of two strings).Intuitively the writhe describes to what degree (and how andwhere) the two chains are wrapped around each other.

Let us describe two kinematic chains by positions p1,21:K oftheir joints, where pik ∈ R3 is the kth point of the ith chain.Using standard kinematics, we know how these points dependon the configuration q ∈ Rn, that is, we have the JacobianJ ik :=

∂pik

∂q for each point. The writhe matrix is a function ofthe link positions p1,21:K .

More precisely, the writhe matrix Wij describes the rel-ative configuration of two points (p1i , p

1i+1) on the first chain

and two points (p2j , p2j+1) on the second where i, j are indexes

of points along the first and the second chain respectively. Forbrevity, let us denote these points by (a, b) = (p1i , p

1i+1) and

(c, d) = (p2j , p2j+1), respectively (see Fig. 2). Then

Wij=sin-1 n>and

|na||nd|+sin-1 n

>bnc

|nb||nc|+sin-1 n

>cna

|nc||na|+sin-1 n

>dnb

|nd||nb|(1)

where na, nb, nc, nd are normals at the points a, b, c, d withrespect to the opposing segment (c.f. Fig. 2),

na=ac×ad , nb=bd×bc , nc=bc×ac , nd=ad×bd . (2)

The above equations for computing the Writhe are an analyt-ical expression for Gauss integral along two segments whichis the solid angle formed by all view directions in whichsegments (a, b) and (c, d) intersect [7].

Since the writhe matrix is a function of the link positionsp1,21:K we can compute its Jacobian using the chain rule

∂Wij

∂q=∂Wij

∂p1iJ1i +

∂Wij

∂p1i+1J1i+1+

∂Wij

∂p2jJ2j +

∂Wij

∂p2j+1J2j+1 . (3)

Fig. 3. Interaction mesh created from the obstacles (red), the goal (green)and the KUKA LWR 4 arm. Edges of the interaction mesh are shown as redlines. Edges between the obstacles have been removed.

The writhe matrix W is a detailed description of the relativeconfiguration of two chains. Fig. 5 illustrates 2 configurationstogether with their writhe matrix representation. Roughly, theamplitude of the writhe (shading) along the diagonal illustrateswhich segments are wrapped around each other.

From the full writhe matrix we can derive simpler metrics,usually by summing over writhe matrix elements. For instance,the Gauss linking integral, which counts the mean number ofintersections of two chains when projecting from all directions,is the sum of all elements of the writhe matrix. In ourexperiments, we will also use the vector wj =

∑iWij as a

representation of the current configuration. Writhe, however,does not provide a unique mapping to joint angles which iswhy we require additional constraints and cost terms especiallyin scenarios where wrapping motion is not dominant.

B. Interaction mesh

An interaction mesh describes the spatial interaction of therobot with its environment [6]. The interaction mesh is afunction of a graph connecting a set of landmark points on therobot and in the environment. More precisely, assume we havea set of points P = {pi}i including landmarks on a kinematicrobot configuration and on objects in the environment. Let Gbe a (fully or partially connected) graph on P . The selection oflandmarks and the graph connectivity should reflect the desiredinteraction of the environment and the robot. To each vertexp ∈ G in the graph, we associate the Laplace coordinate

LG(p) = p−∑

r∈∂Gp

r

|r − p|W(4)

where ∂Gp is the neighborhood of p in the graph G andW =

∑s∈∂Gp |s − p|−1 is the normalization constant for

the weighting inversely proportional to |r − p|. This constantensures that the weights sum up to one and therefore makethe representation invariant to scale. The collection of Laplacecoordinates of all points,

M = (LG(p))p∈P , (5)

is a 3|P |-dimensional vector which we denote as interactionmesh. As with the writhe matrix, we assume the Jacobian ofall robot landmarks in P is given and the Jacobian of otherenvironmental landmarks is zero. The Jacobian ∂M

∂q of theinteraction mesh is given via the chain rule.

We would like to point out that the squared metric in M -space has a deformation energy interpretation [6]. To seethis, consider a change of position of a single vertex p toa new position p′. The deformation energy associated to sucha change in position is defined based on the neighborhood ina tetrahedronisation T of the point set:

ET (p′) =

1

2‖LT (p

′)− LT (p)‖2 (6)

where LT (p) are the Laplace coordinates of p w.r.t. thetetrahedronisation T . The difference to our definition of theinteraction mesh is that we consider Laplace coordinates LG

w.r.t. the fully connected graph G instead of only T , which isa subgraph of the fully connected graph G. Since differentconfigurations lead to topologically different T , using LG

has the benefit of more continuous measures (deformationenergies as well as Jacobian). Neglecting this difference,minimizing squared distances in M -space (as is implicit,e.g., in inverse kinematics approaches as well as the optimalcontrol approaches detailed below) therefore corresponds tominimizing deformation energies.

III. OPTIMAL CONTROL COMBINING TOPOLOGICAL ANDCONFIGURATION SPACE REPRESENTATIONS

The motivation for introducing topological spaces is thatthey may provide better metrics or topology for motionsynthesis, ideally such that local optimization methods withintopological space can solve problems that would require moreexpensive global search in configuration space. In this section,we describe our method for exploiting topological spaces formotion synthesis in an optimal control context. We formulatethe approach within the framework of Approximate Infer-ence Control[13, 17], which is closely related to differentialdynamic programming [10] or iLQG [8, 11, 3] (see detailsbelow), and will allow us to use a graphical model to describethe coupling of motion estimation on both representations. Themain difference between iLQG and AICO is that iLQG (andother classical methods) do not have have an counterpart ofthe forward messages (see [17] for more details). AICO alsoallows for easy definition of the problem by extending thegraphical model. In the following, we first briefly introducethis framework before we explain how to couple additionalrepresentations in the probabilistic inference framework.

A. Approximate Inference Control

Approximate Inference Control (AICO) frames the problemof optimal control as a problem of inference in a dynamicBayesian network. Let xt be the state of the system—wewill always consider the dynamic case where xt = (qt, q̇t).Consider the problem of minimizing (the expectation of) thecost

C(x0:T , u0:T ) =

T∑t=0

[cx(xt) + cu(ut)] (7)

where cu describes costs for the control and cx describes taskcosts depending on the state (usually a quadratic error in some

task space). The robot dynamics are described by the transitionprobabilities P (xt+1 |ut, xt). The AICO framework translatesthis to the graphical model

p(x0:T , u0:T ) ∝ P (x0)T∏

t=0

P (ut)

T∏t=1

P (xt |ut-1, xt-1) (8)

·T∏

t=0

exp{−cx(xt)} .

The control prior P (ut) = exp{−cu(ut)} reflects the controlcosts, whereas the last term exp{−cx(xt)} reflects the taskcosts and can be interpreted as “conditioning on the tasks” inthe following sense: We may introduce an auxiliary randomvariable zt with P (zt = 1 |xt) ∝ exp{−cx(xt)}, that is,z = 1 if the task costs cx(xt) are low in time slice t. Theabove defined distribution is then the posterior p(x0:T , u0:T ) =P (x0:T , u0:T | z0:T = 1). AICO in general tries to estimatep, in particular the posterior trajectory and controls. In [17],this is done using Gaussian message passing (comparable toKalman smoothing) based on local Gaussian approximationsaround the current belief model. In [13], theory on the generalequivalence of this framework with stochastic optimal controlis detailed. Generally, the approach is very similar to differ-ential dynamic programming [10] or iLQG [8] methods withthe difference that not only backward messages or cost-to-gofunctions are propagated but also forward messages (“cost-to-reach functions”), which allows AICO to compute a localGaussian belief estimate b(xt) ∝ α(xt)β(xt) as the product offorward and backward message and utilize it to iterate messageoptimization within each time slice.

B. Expressing motion priors in topological spaces and cou-pling spaces

To estimate the posterior, the controls ut can be marginal-ized, implying the following motion prior:

P (xt+1 |xt) =∫u

P (xt |ut-1, xt-1) P (ut) du . (9)

This motion prior arises as the combination of the systemdynamics and our choice of control costs cu(ut) in x-space;for LQ systems, it is a linear Gaussian.

This motion prior is a unique view on our motivation fortopological representations. In the introduction, we mentionedthe impact of representations on the Voronoi bias, the metric,or the topology. In other words, successful trajectories arelikely to be “simpler” (easier to find, shorter, local) in anappropriate space. In Machine Learning terms, this is ex-pressed in terms of a prior. In this view, topological spacesare essentially a means to express priors about potentiallysuccessful trajectories—in our case we employ the linearGaussian prior in a topological space to express the beliefthat trajectories may appear “simpler” in a suitable topologicalspace.

However, using AICO with a linear Gaussian motion priorin topology space is not sufficient to solve general motionsynthesis problems: 1) The computed posterior in topology

configuration space

control

topological space

tasks

y0 y1 y2 yT

u2u1 uTu0

z0 z1 z2 zT

x0 x1 x2 xT

b̂(xT )

Fig. 4. AICO in configuration and topological space. The blue arcs representthe approximation used in the end-state posterior estimation.

does not directly specify an actual state trajectory or controllaw on the joint level. 2) We neglect the problem of mini-mization of control and task costs originally defined on thejoint level. To address these issues, we need mechanisms tocouple inference in topological and state space. We do so bycoupling topological and joint state representations in AICO’sgraphical model framework.

Fig. 4 displays a corresponding graphical model. The bot-tom layer corresponds to the standard AICO setup, with themotion prior P (xt+1 |xt) =

∫uP (xt |ut-1, xt-1) P (ut) du im-

plied by the system dynamics and control costs. Additionallyit includes the task costs represented by P (zt = 1 |xt) =exp{−cx(xt)}. The top layer represents a process in topolog-ical space with an a priori given linear Gaussian motion priorP (yt+1|yt). Both layers are coupled by introducing additionalfactors f(xt, yt) = exp{− 1

2ρ||φ(qt)−yt||2}, which essentially

aim to minimize the squared distance between the topologicalstate yt and the one computed from the joint configurationφ(qt), weighted by a precision constant ρ. Note that forGaussian message passing between levels using a local lin-earization of φ (having the Jacobian of the topological space)is sufficient. These factors essentially treat the topological stateyt as an additional task variable for the lower level inference,analogous to other potential task variables like end-effectorposition or orientation.

C. End-state posterior estimation

Probabilistic inference in the full factor graph given in Fig. 4would require joint messages over configuration and topolog-ical variables or loopy message passing. We approximate thisinference problem by a stage-wise inference process:

1) We first focus on approximating directly an end-stateposterior b̂(xT ) for the end-state xT which fulfils thetask defined in configuration space. We explain themethod used for this below.

2) Accounting for the coupling of this end-state posterior tothe topological representation, we compute a trajectoryposterior in topological space.

3) We then project this down to the joint level, using AICOin configuration space coupled to the topological spacevia factors introduced above.

Clearly this scheme is limited in that the initial inference intopology space only accounts for the task at the final timestep. To overcome this limitation, we would have to iterateinference between levels. For the problems investigated in ourexperiments, the approximation scheme above is sufficient.

End-state posterior estimation computes an approximatebelief b̂(xT ) ≈ P (xT | x0, z0:T = 1) about the finalstate given the start state and conditioned on the task. Thisapproximation neglects all intermediate task costs and assumeslinear Gaussian system dynamics of the form

P (xt | xt-1) = N(xt | Atxt-1 + at,Wt) . (10)

We integrate the system dynamics,

P (xT | x0) =∑x1:T -1

T∏t=1

P (xt |xt-1) , (11)

which corresponds to the blue arc in Fig. 4. For stationarylinear Gaussian dynamics, we have

P (xT |x0) = N(xT |ATx0+

T -1∑i=0

Aia,

T -1∑i=0

AiWA′i) , (12)

where superscript on A states for a power of matrix, definediteratively Ai = A ∗ Ai−1. To estimate b̂(xT ), we conditionon the task,

P (xT | x0, z = 1) =P (zT =1 | xT ) P (xT | x0)

P (zT =1 | x0). (13)

Since we assume P (xT | x0) to be Gaussian, using a localGaussian approximation of the task P (zT = 1 | xT ) aroundthe current mode of b̂(xT ), P (xT | x0, z = 1) can beapproximated with a Gaussian as well. We iterate this byalternating between updating the end-state estimate b̂(xT ) andre-computing the local Gaussian approximation of the taskvariable. A Levenberg-Marquardt type damping (dependingon monotonous increase of likelihood) ensures robust and fastconvergence.

IV. GENERALIZATION AND REMAPPING USING ATOPOLOGICAL SPACE

The optimal control approach presented in the previoussection exploits the additional topological space to generatean optimal trajectory (and controller). However, the computedoptimal trajectories may no longer be valid when there is achange in the environment, e.g. the obstacles have moved.In order to cope with this issue, we propose a per-frame re-mapping approach in which the optimal trajectory in the topo-logical representation y∗0:T is inverse-mapped to the configura-tion space according to the novel condition of the environment.

Technically, this is done by computing the configurationof the system per-frame such that the remapping error fromthe original optimal topological trajectory y∗0:T is minimized.However, topological representations such as the writhe matrixor the interaction mesh are very high-dimensional—oftenhigher dimensional than the configuration space itself. This isin strong contrast to thinking of y∗0:T as a lower dimensionaltask space like an endeffector space. Therefore, following y∗0:Texactly is generally infeasible and requires a regularisation

procedure that minimizes the 1-step cost function:

f(qt+1) = ||qt+1 − qt − h||2 + ||φ(qt+1)− y∗||2C , (14)

argminqt+1

f(qt+1) = qt + J](y∗ − yt) + (I − J]J)h (15)

with J] = J>(JJ>+ C-1)-1

where C describes a cost metric in y-space.For the case of the interaction mesh, we mentioned the

relation of a squared metric C in M -space to the deformationenergy. Therefore, using the per-frame remapping to followan interaction mesh reference trajectory M∗0:T essentially triesto minimize the deformation energy between the referenceM∗t and the actual φ(qt) at each time step. This impliesgeneralizing to new situations by approximately preservingrelative distances between interacting objects instead of di-rectly transferring joint angles. In conjunction with the use offeedback gains, the methodology proposed here is able to copewith dynamic environments (see Section V-C) and boundedunpredictable changes.

V. EXPERIMENTS

A. Rope unwrapping and reaching using writhe space

As an example of a possible application of the writhespace abstraction we simulated a manipulator, consisting of20 segments and a hand with three fingers, making in total 29DoF. Initially this rope-like manipulator is twisted two and ahalf times around a green pole, giving us approximately 900◦

of writhe density (See Fig. 5(a)).

(a) initial configuration (b) final configuration

(c) initial writhe matrix (d) final writhe matrix

Fig. 5. The experimental task is to grasp the object without collisions.Corresponding writhe matrices are depicted on lower plots - the darknesscorresponds to the amplitude of the writhe value.

0

5

10

15

0 10 20 30 40

'./rss_trajectory' matrix

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

Fig. 6. Each column represents a state in writhe space - values show howmuch each segment of the manipulator is twisted around the stick at eachtime step.

0 100 200 300 400 500 600 700 800 9000

1

2

3

4

5

6

7

8x 10

4 Comparison of planning algorithms

Nu

mb

er

of

co

llisio

n d

ete

ctio

n c

alls

Angle to unwrap (degrees)

Unidirectional RRT

Bi−directional RRT

AICO with WritheTV

Fig. 7. Performance of planning algorithms for unwrapping task. Computa-tional time is proportional to the number of collision detection calls.

The task is to plan a trajectory which should grasp thered cylinder without colliding with the stick (Fig. 5(a),5(b)and video available at http://goo.gl/fwSxG). Clearly, a localfeedback approach using Inverse Kinematics will experiencefailure in this task. On the other hand, a successful trajectorycan be well captured as a linear interpolation in writhespace and projected back to the configuration space usingthe coupling described in Section II-A. Fig. 6 illustrates anexample of a unwrapping trajectory in topological space whenall rows of writhe matrix are summed up into one column,representing the current state.

Hierarchical AICO conditioned on the end-state in writhespace yT was able to generate locally optimal trajectories,consisting of 50 time steps, in only few iterations, requiringa relatively small number of expensive collision checks (lessthan 1000). Comparison with Rapidly-exploring Random Tree(RRT) planning for this reaching task revealed a dependence ofthe performance on distance between end-effector and objectposition. Moreover, total costs of obtained trajectories were onaverage 100 times higher than those generated with the localoptimizer. Here, the end-state in configuration space qT wasgiven as a target for RRTs.

For more systematic evaluation of our planning platformwe have designed a sequence of unwrapping trajectories -gradually increasing the relative angle to be unwrapped. Thissequence of final states was given as goals to uni- and bi-directional RRT planners. The results demonstrate that forsimple trajectories (e.g. in case of nearby lying objects) allmethods have no difficulties, whereas starting with one and ahalf of full twist, unidirectional search fails and bi-directional

http://goo.gl/fwSxG

significantly slows down. (See Fig. 7.)In this comparison, the RRTs solved a somewhat simpler

problem than our system: For the RRTs we assumed to knowthe final state qT in configuration space – we take our finalpose estimate from Section III-C as the target qT for RRTs.This is in contrast to our planning platform, where we use thefinal pose estimate only to estimate a final topological stateyT and then use the hierarchical AICO to compute an optimaltrajectory (including an optimal qT ) conditioned on this finaltopological state. Therefore, the RRT’s problem is reducedto growing to a specific end state. We applied the standardmethod of biasing RRT search towards qT by growing thetree 10% of the time towards qT instead of a random sampleof the configuration space. Knowing qT also allowed us totest bi-directional RRTs, each with 10% bias to grow towardsa random node of the other tree. Further, RRTs output non-smooth paths whereas AICO produces (locally) optimal dy-namic trajectories since it minimizes dynamic control costs1.

B. Dynamic reaching through a loop using writhe and inter-action mesh

Writhe space is a suitable representation for tasks thatinvolve interactions with chains—or loops—of obstacles. As asecond demonstration, we have altered the task from Experi-ment V-A to reaching through a hollow box, where the rim ofthe box forms a loop of segments, see Fig. 8(b). The interac-tion of the manipulator with the box can be described by thewrithe representation between this loop of box segments andthe manipulator. Classically this problem would be addressedusing only an end-effector task variable and collision checks injoint configuration space. The advantage of using writhe as adescription of the interaction is in defining the task as a relativeconfiguration of the robot and the loop—this relative descrip-tion remains effective also when the situation changes or thebox is moved dynamically. The writhe matrix correspondingto the final configuration contains a peak around the last linkwhich passes through the box (see Fig. 1 and 8). In addition tothis, target in writhe space does not uniquely define the task forall arbitrary positions of the box (unlike the unwrapping taskin Experiment V-A). Fig. 8(b) bottom shows configuration thatdoes not reach through the loop but reaches the writhe spacetarget. The writhe is computed as absolute value of Gausslinking integral which means that direction of wrapping isnot clearly defined. This is the reason why one of the armsegments wraps around the rim of the box in wrong direction.In this demonstration we use an additional interaction meshrepresentation which is well suited to represent movement ina way that keeps appropriate distance between robot links andthe obstacle and is mostly invariant w.r.t. the concrete positionof the box. Explicit collision avoidance is then superfluous. Wecoupled all three representations—writhe, interaction mesh,and joint configuration space—using AICO by extending thegraphical model to efficiently generate motions for varying

1AICO can also be applied on the kinematic level, where xt = qt and withthe control ut = qt+1− qt being just the joint angle step. This would reducethe computational cost further.

0

1

2

3

4

0 1 2 3 4

'./FinalWrithe.txt' matrix

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

(a) (b)

Fig. 8. (a) writhe space target for passing the last link of a robotic armthrough a loop. (b) the configuration corresponding to the target in Writhespace. The loop was built by connecting the colored cylinders attached to therim of the hollow box.

positions of the obstacles. For this the position of a hollow boxdefining the loop was tracked using magnetic motion trackingsystem in our experiment on the real robot. The computation oflocally optimal trajectories using this methods required 3 to 6AICO iterations. We compared to generating trajectories usingonly classical end-effector position and collision avoidancemethods. This required 20 to 30 iterations which shows thatcombined writhe and interaction mesh space is a better choiceof representation than end-effector position with collisionavoidance.

We also tested online remapping as described in Section IVusing both, the writhe and interaction mesh space, to testbehavior of the system when the box position is changeddynamically on the fly. A video of this experiment can befound at http://goo.gl/fwSxG.

C. Dynamic obstacle avoidance using interaction mesh space

Finally we present an experiment in which the robot ma-nipulator reaches a target object in an environment wherespherical obstacles dynamically move around. We show thatthe cost for the movement stayed close to the optimal valuewhen the movement is adjusted through the local re-mappingscheme described in Section IV in response to the unpre-dictable random displacement of the obstacles.

The experiment proceeded as follows. We used AICO withwork-space target for the end-effector and collision avoidancetask variables to compute the optimal trajectory under a givenobstacle configuration. The obstacles are then randomly movedin the workspace and the trajectories were recomputed by(1) re-mapping using Interaction mesh and (2) re-planningusing AICO with complete knowledge of the obstacle motion.The costs of the two methods with respect to the amplitudeof the obstacle displacement are compared in Fig. 9. Whilere-mapping approach (which does not have the benefit ofknowledge of the updated obstacle locations in remaining timesteps) results in a higher cost, it stays within a comparablerange of the full AICO re-planning. If we simply rely on ageneric collision detection / response scheme based on inversekinematics to amend the movements, close to an imminentcollision, the cost increases rapidly as shown in the black plotin Fig. 9. This illustrates that the topological representationgeneralizes the state space well such that the local re-mapping

http://goo.gl/fwSxG

0 0.1 0.2 0.3 0.4 0.50

500

1000

1500

2000

2500

3000Replanning cost

Obstacle displacement (m)

Ave

rag

e t

ota

l co

st

AICO re−plan

Interaction mesh

Inverse kinematics

Fig. 9. The average total cost of (1) AICO re-planning, (2) re-mapping thetrajectory using interaction mesh and (3) reaching using inverse kinematicsto control the end-effector position. The running cost consists of end-effectorposition term, a collision term and a control cost term.

scheme can produce an adequate response in real-time withlow computational effort while achieving the overall task aims.

We demonstrate the results on a real time manipulation us-ing the KUKA LWR 4 robotic manipulator and a vision track-ing system. The movement of the obstacles was individuallycontrolled by three uncorrelated individuals, showing that thismethod can deal with unpredictable movement of obstacles inreal time (Video is available at http://goo.gl/fwSxG).

VI. CONCLUSIONS

Different motion representations have different strengthsand weaknesses depending on the problem. For certain inter-action problems there exist suitable topological representationsin which the interaction can be described in a way that general-izes well to novel or dynamic situations (as with the interactionmesh), or where local optimization methods can find solutionsthat would otherwise require inefficient global search (aswith the writhe representations). However, considering motionplanning only in a topological representation is insufficientfor additionally accounting for tasks and constraints in otherrepresentations.

Previous work with such representations has only testedbasic approaches for inverse mapping of fixed topologicaltrajectories to the joint configuration [4, 5, 16]. In contrast,in this paper we presented methods that combine the differentrepresentations at the abstract and lower level for motionsynthesis. For instance, the reaching task in an endeffectorspace is coupled with a writhe space that allows a localoptimization method to generate an unwrapping-and-reachingmotion. Considering such a problem only in joint configurationand endeffector space leads to “deep local minima” that arepractically infeasible to solve—as our comparison to RRTs inSection V-A demonstrated. Considering such a problem onlyin writhe space would not address the actual reaching task.

We chose to formulate our approach in the framework ofoptimal control as an approximate inference problem sincethis allows for a direct extension of the graphical model toincorporate multiple representations. Alternative formulationsare possible, for instance as a structured constraint opti-mization problem (MAP inference in our graphical model)that could be solved by methods such as SNOPT. Whatwe coined as a motion prior in topological spaces would

here correspond to pseudo control costs for transitions intopological space. Which formulation will eventually lead tocomputationally most efficient algorithms is a matter of futureresearch. As an outlook, we aim to apply the proposed methodsfor dexterous robot manipulation (including grasping) of morecomplex, articulated or flexible objects, where we believe thatmultiple parallel representations will enable more robust andgeneralizing motion synthesis strategies.

Acknowledgements: This work was funded by the EU FP7project TOMSY and EPSRC project EP/H012338/1.

REFERENCES

[1] S Bhattacharya, M Likhachev, and V Kumar. Identification andrepresentation of homotopy classes of trajectories for search-based path planning in 3D. Proc.of RSS, 2011.

[2] S Bitzer and S Vijayakumar. Latent spaces for dynamicmovement primitives. In Proc. of 9th IEEE-RAS InternationalConference on Humanoid Robots, pages 574–581. IEEE, 2009.

[3] D J Braun, M Howard, and S Vijayakumar. Optimal vari-able stiffness control: Formulation and application to explosivemovement tasks. Autonomous Robots (in press), 2012.

[4] C H Dowker and B T Morwen. Classification of knot projec-tions. Topology and its Applications, 16(1):19–31, 1983.

[5] S L Ho Edmond and T Komura. Character motion synthesisby topology coordinates. In Computer Graphics Forum (Proc.Eurographics 2009), volume 28, Munich, Germany, 2009.

[6] Edmond S L Ho, T Komura, and Chiew-Lan Tai. Spatialrelationship preserving character motion adaptation. ACMTransactions on Graphics, 29(4):1–8, 2010.

[7] K Klenin and J Langowski. Computation of writhe in modelingof supercoiled DNA. Biopolymers, 54(5):307–317, 2000.

[8] Weiwei Li and E Todorov. An iterative optimal control andestimation design for nonlinear stochastic system. In Proc. ofthe 45th IEEE Conference on Decision and Control, 2006.

[9] S R Lindemann and S M LaValle. Incrementally reducingdispersion by increasing voronoi bias in RRTs. In Proc. ofInternational Conference on Robotics and Automation, 2004.

[10] D M Murray and S J Yakowitz. Differential dynamic pro-gramming and newtons method for discrete optimal controlproblems. Journal of Optimization Theory and Applications,43(3):395–414, 1984.

[11] J. Nakanishi, K. Rawlik, and S. Vijayakumar. Stiffness and tem-poral optimization in periodic movements: An optimal controlapproach. In Proc. of IROS, pages 718 –724, 2011.

[12] K Rawlik, M Toussaint, and S Vijayakumar. An approximateinference approach to temporal optimization in optimal control.In Proc. of Neural Information Processing Systems, 2010.

[13] K Rawlik, M Toussaint, and S Vijayakumar. On stochasticoptimal control and reinforcement learning by approximateinference. In Proc. of RSS, 2012.

[14] A Shkolnik and R Tedrake. Path planning in 1000+ dimensionsusing a task-space Voronoi bias. In Proc.of ICRA, pages 2892–2898. IEEE, 2009.

[15] J Takamatsu and et.al. Representation for knot-tying tasks. IEEETransactions on Robotics, 22(1):65–78, 2006.

[16] T Tamei, T Matsubara, A Rai, and T Shibata. Reinforcementlearning of clothing assistance with a dual-arm robot. Proceed-ings of Humanoids, pages 733–738, 2011.

[17] M Toussaint. Robot trajectory optimization using approximateinference. In Proc. of ICML, pages 1049–1056. ACM, 2009.

[18] H Wakamatsu, E Arai, and S Hirai. Knotting/unknotting ma-nipulation of deformable linear objects. International Journalof Robotics Research, 25(4):371–395, 2006.

http://goo.gl/fwSxG

hierarchical motion planning in topological representations · 2019. 8. 31. · hierarchical motion...

Documents