
Multibody Syst Dyn
DOI 10.1007/s11044-016-9501-3

Explicit formulation of multibody dynamics based on principle of dynamical balance and its parallelization

Seho Shin1 · Jonghoon Park2 · Jaeheung Park1,3

Received: 19 March 2014 / Accepted: 13 January 2016
© Springer Science+Business Media Dordrecht 2016

Abstract Efficient computation of dynamics parameters is an important issue in the simulation and control of multibody systems as these systems become more complex. Recent advances in computer architecture favor multiple-core systems rather than high-speed single-core systems. Therefore, parallel computation algorithms for dynamics parameters should be designed to improve performance on these multicore architectures. In this paper, a new dynamics computation algorithm is derived using the principle of dynamical balance, which provides explicit computation of the dynamics parameters. The structure of this algorithm lends itself to parallel computation. Parallel computation methods based on task and data parallelism are then designed to exploit this structure. The performance of the proposed algorithm is verified on robots with various topologies, and the experiments demonstrate the improved speed of the parallel computation.

Keywords Multibody system dynamics · Principle of dynamical balance · Parallel computing

B J. [email protected]

S. [email protected]

J. [email protected]

1 Graduate School of Convergence Science and Technology, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea

2 Neuromeka, 406 Seongsu IT Center, 37 Seongsui-ro 22-gil, Seongdong-gu, Seoul, Republic of Korea

3 Advanced Institutes of Convergence Technology, Seoul National University, Seoul, Republic of Korea




S. Shin et al.

1 Introduction

Multibody dynamics plays an important role in the simulation and control of robotic and vehicle systems. As dynamic systems become more complex, the computation time of dynamics algorithms increases, which can degrade system performance and reliability. For this reason, efficient low-order dynamics algorithms have been developed. These algorithms can be classified into two groups: recursive algorithms and explicit algorithms. The Recursive Newton–Euler Algorithm (RNEA) [1–3] and the Articulated-Body Algorithm (ABA) [4, 5] are recursive algorithms for inverse dynamics and forward dynamics, respectively. The Composite-Rigid-Body Algorithm (CRBA) [5, 6] is one of the explicit algorithms for calculating the system inertia matrix. With the recursive algorithms, computations can be completed within O(n) time. However, these algorithms cannot be used for system and controller analysis because dynamics parameters, such as the bias matrix, are derived only as matrix–vector products [7].

Recently, parallel computing architectures have been developed to overcome the limitation of Moore’s law. There are three types of approaches to parallel computing: the architecture-oriented approach, the data-oriented approach, and the process-oriented approach. In [8], a parallel/pipeline algorithm is reported that employs an architecture-oriented approach. This algorithm is optimized for implementation on VLSI (Very Large Scale Integration) and WSI (Wafer Scale Integration). The data-oriented approach (also called data parallelism) distributes data across different parallel computing nodes. Single Instruction Multiple Data (SIMD) is a form of data parallelism at the level of instruction-level parallel processing [13]. In [9], an algorithm is designed to exploit SIMD instructions. With the advent of Multiple Instruction Multiple Data (MIMD) architectures, the process-oriented approach (also called task parallelism) has been used extensively. This approach divides an algorithm into segments of the same form and assigns them to multiple processors simultaneously with a message passing model or a fork-join model [10, 11]. In the message passing model, distributed processes communicate and synchronize by transferring messages to each other. The fork-join model, on the other hand, achieves parallel processing with threads and shared memory. In [12], the equations of motion for a complex multibody system are obtained by distributed computing using the Message Passing Interface (MPI).

In this paper, we first derive a new computation algorithm for multibody dynamics using the principle of dynamical balance, which was proposed to compute the dynamics equations of two known systems connected under specific conditions [14]. This principle is extended in this paper to derive a formulation that can explicitly compute the system inertia and bias terms of multibody systems. The new formulation is derived such that parallel computation is possible. Accordingly, parallel computation methods such as fork-join-based task parallelism and loop-based data parallelism are implemented. Finally, a numerical analysis demonstrates the improved performance of the dynamics computation when the parallel computation methods are applied. The main contributions of the paper are two-fold: first, the explicit formulation of the dynamics parameters is derived by generalizing the principle of dynamical balance; second, the implementation of the parallel computation algorithm is successfully demonstrated.

The paper is organized as follows. Section 2 introduces preliminaries and notations. Section 3 derives the system parameters for a composite n-subsystem. In Sect. 4, a general expression of the system inertia and bias matrices is derived for a typical multibody system and then restructured into an efficient algorithm. A parallelism design for the reorganized explicit dynamics computation algorithm is proposed in Sect. 5. In Sect. 6, the effectiveness of the proposed parallelism is demonstrated by the reduced computation time on articulated multibody systems with various topology models. The paper is concluded in Sect. 7.

2 Preliminaries and notations

2.1 Rigid body kinematics

In this subsection, rigid body kinematics is briefly discussed as presented in [15]. The configuration of a rigid body system can be described by three types of frames [14]: a reference frame ({ref}), body frames ({1}, {2}) and joint frames ({1J2}, {2J1}). The reference frame is used as an inertial frame. To represent the position and attitude of the bodies, each body has a frame called a body frame, which is fixed to the body and moves together with it. Joint frames are used to describe the relative motion between bodies. The coordinate system transformation can be performed using a homogeneous transformation matrix as follows:

$$ {}^{ref}T_1 = \begin{bmatrix} {}^{ref}R_1 & {}^{ref}r_1 \\ 0_{1\times 3} & 1 \end{bmatrix}, \tag{1} $$

where $R$ is an $SO(3)$ rotation matrix and $r$ is an $\mathbb{R}^3$ displacement vector from the frame {ref} to the frame {1}. This transformation matrix belongs to the special Euclidean group $SE(3)$, which is 6-dimensional.

A body twist ($V \in \mathbb{R}^6$) is defined from the translational velocity ($v \in \mathbb{R}^3$) and the rotational velocity ($w \in \mathbb{R}^3$) to represent the velocity of a body:

$$ V_1 = \begin{bmatrix} v_1 \\ w_1 \end{bmatrix}. \tag{2} $$

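Its 4 × 4 matrix representation is defined by

$$ \lceil V_1 \rceil = \begin{bmatrix} \lceil w_1 \rceil & v_1 \\ 0_{1\times 3} & 0 \end{bmatrix}, \tag{3} $$

where

$$ \lceil w \rceil = \begin{bmatrix} 0 & -w_\gamma & w_\beta \\ w_\gamma & 0 & -w_\alpha \\ -w_\beta & w_\alpha & 0 \end{bmatrix} \quad \text{for} \quad w = \begin{bmatrix} w_\alpha \\ w_\beta \\ w_\gamma \end{bmatrix}. $$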

There is a relationship between a body twist and the homogeneous transformation matrix, namely

$$ \lceil V_1 \rceil = T_1^{-1} \dot{T}_1. \tag{4} $$

It can be equivalently written as

$$ v_1 = R_1^{T} \dot{r}_1, \tag{5} $$

$$ \lceil w_1 \rceil = R_1^{T} \dot{R}_1. \tag{6} $$

Note that the translational and rotational velocities are represented with respect to the body frame {1}, not the reference frame.
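As a small illustration of the bracket operator used in (3) and (6), the following C++ sketch builds the skew-symmetric matrix of a 3-vector and checks that applying it to a vector reproduces the cross product. The type aliases and the function names `skew`, `apply` and `checkSkew` are illustrative choices, not part of the paper.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Skew-symmetric matrix of w = (w_alpha, w_beta, w_gamma), laid out as in the text.
Mat3 skew(const Vec3& w) {
    Mat3 S{};  // value-initialized: all entries start at zero
    S[0][1] = -w[2];  S[0][2] =  w[1];
    S[1][0] =  w[2];  S[1][2] = -w[0];
    S[2][0] = -w[1];  S[2][1] =  w[0];
    return S;
}

// Plain matrix-vector product S * v.
Vec3 apply(const Mat3& S, const Vec3& v) {
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[i] += S[i][j] * v[j];
    return out;
}

// Self-check: skew(w) * v must equal the cross product w x v.
void checkSkew() {
    const Vec3 w = {{1.0, 2.0, 3.0}};
    const Vec3 v = {{4.0, 5.0, 6.0}};
    const Vec3 a = apply(skew(w), v);
    assert(a[0] == -3.0 && a[1] == 6.0 && a[2] == -3.0);  // w x v = (-3, 6, -3)
}
```

Calling `checkSkew()` from a test or a `main` function exercises the definition on one concrete pair of vectors.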


Fig. 1 Rigid body system

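The transformation of body twists can be done as

$$ V_2 = {}^{2}\mathrm{Ad}_{1} V_1 + {}^{1}V_2. \tag{7} $$

The operator ${}^{1}\mathrm{Ad}_{2}$ is an adjoint transformation from the frame {1} to the frame {2}. It is defined by

$$ {}^{1}\mathrm{Ad}_{2} = \begin{bmatrix} {}^{1}R_2 & \lceil {}^{1}r_2 \rceil\, {}^{1}R_2 \\ 0_{3\times 3} & {}^{1}R_2 \end{bmatrix}. \tag{8} $$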
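The block layout of (8) can be made concrete with a short C++ sketch that assembles the 6 × 6 adjoint from a rotation matrix and a displacement vector. It assumes twists ordered as (v, w), as in (2); the names `adjoint`, `multiply` and `checkAdjoint` are hypothetical helpers introduced only for this illustration.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;
using Mat6 = std::array<std::array<double, 6>, 6>;

// Skew-symmetric matrix of r, so that skew(r) * x equals r x x.
Mat3 skew(const Vec3& r) {
    Mat3 S{};
    S[0][1] = -r[2];  S[0][2] =  r[1];
    S[1][0] =  r[2];  S[1][2] = -r[0];
    S[2][0] = -r[1];  S[2][1] =  r[0];
    return S;
}

// 3x3 matrix product a * b.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Adjoint for rotation R and displacement r, laid out as in Eq. (8):
// [ R  skew(r) R ; 0  R ].
Mat6 adjoint(const Mat3& R, const Vec3& r) {
    const Mat3 SR = multiply(skew(r), R);
    Mat6 Ad{};  // lower-left 3x3 block stays zero
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            Ad[i][j] = R[i][j];
            Ad[i][j + 3] = SR[i][j];
            Ad[i + 3][j + 3] = R[i][j];
        }
    }
    return Ad;
}

// Self-check with a pure translation along x (R = identity).
void checkAdjoint() {
    Mat3 R{};
    R[0][0] = R[1][1] = R[2][2] = 1.0;
    const Vec3 r = {{1.0, 0.0, 0.0}};
    const Mat6 Ad = adjoint(R, r);
    assert(Ad[0][0] == 1.0 && Ad[3][3] == 1.0);
    assert(Ad[1][5] == -1.0 && Ad[2][4] == 1.0);  // entries of skew(r)
    assert(Ad[3][0] == 0.0);
}
```

With R equal to the identity, the adjoint leaves the rotational part of a twist unchanged and adds r × w to the translational part, which is what the off-diagonal block encodes.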

2.2 Loading constraint

In Fig. 1, the kinematical constraint between two subsystems can be expressed as the relative twist

$$ {}^{1J2}V_{2J1} = \left({}^{2}\mathrm{Ad}_{2J1}^{-1}\right) V_2 - \left({}^{1J2}\mathrm{Ad}_{2J1}^{-1}\right)\left({}^{1}\mathrm{Ad}_{1J2}^{-1}\right) V_1. \tag{9} $$

If the constraint represents an interaction between two subsystems, we call it a loading constraint. This loading constraint can also be expressed as a linear transformation of the joint velocity vector $\dot{q}_1$. In other words, there is always a matrix $E_1$ satisfying

$$ {}^{1J2}V_{2J1} = E_1 \dot{q}_1, \tag{10} $$

where $E_1$ is defined as the joint matrix. The joint matrix of a joint having $n$ DOF is of size $6 \times n$.

2.3 Newton–Euler equation of motion

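The Newton–Euler equation describes the combined translational and rotational dynamics of a rigid body. When a body wrench $F_1$ is applied to a body moving with body twist $V_1$, the motion is governed by

$$ F_1 = A_1 \dot{V}_1 + B_1 V_1, \tag{11} $$

where $A_1$ and $B_1$ are the body inertia and bias term, respectively. These are defined in terms of the body mass ($m_1$), the position difference between the body frame and the center of mass ($\rho_1$) and the rotational inertia at the center of mass ($I_1$); $I$ without a subscript denotes the $3 \times 3$ identity matrix:

$$ A_1 = \begin{bmatrix} m_1 I & -m_1 \lceil \rho_1 \rceil \\ m_1 \lceil \rho_1 \rceil & I_1 + m_1 \lceil \rho_1 \rceil \lceil \rho_1 \rceil^{T} \end{bmatrix}, \tag{12} $$

$$ B_1 = \begin{bmatrix} m_1 \lceil w_1 \rceil & -m_1 \lceil w_1 \rceil \lceil \rho_1 \rceil \\ m_1 \lceil \rho_1 \rceil \lceil w_1 \rceil & -\lceil I_1 w_1 \rceil + m_1 \lceil \rho_1 \rceil \lceil w_1 \rceil \lceil \rho_1 \rceil^{T} \end{bmatrix}. \tag{13} $$

The body wrench is arranged into the body force and moment as

$$ F_1 = \begin{bmatrix} f_1 \\ \tau_1 \end{bmatrix}. \tag{14} $$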
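To make the block structure of the body inertia (12) concrete, the following C++ sketch assembles the 6 × 6 matrix from a mass, a center-of-mass offset and a rotational inertia. It is a minimal sketch under the assumption that the rotational inertia passed in is symmetric; the function names `bodyInertia` and `skew` are illustrative, not taken from the paper.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;
using Mat6 = std::array<std::array<double, 6>, 6>;

// Skew-symmetric matrix of rho.
Mat3 skew(const Vec3& rho) {
    Mat3 S{};
    S[0][1] = -rho[2];  S[0][2] =  rho[1];
    S[1][0] =  rho[2];  S[1][2] = -rho[0];
    S[2][0] = -rho[1];  S[2][1] =  rho[0];
    return S;
}

// Body inertia laid out as in Eq. (12):
// [ m I           -m skew(rho)                       ]
// [ m skew(rho)    Ic + m skew(rho) skew(rho)^T      ]
Mat6 bodyInertia(double m, const Vec3& rho, const Mat3& Ic) {
    assert(m > 0.0);  // a physical body has positive mass
    const Mat3 S = skew(rho);
    Mat6 A{};
    for (int i = 0; i < 3; ++i) {
        A[i][i] = m;  // upper-left block: m times the identity
        for (int j = 0; j < 3; ++j) {
            A[i][j + 3] = -m * S[i][j];  // upper-right block
            A[i + 3][j] = m * S[i][j];   // lower-left block
            double sst = 0.0;            // entry (i, j) of skew(rho) skew(rho)^T
            for (int k = 0; k < 3; ++k) sst += S[i][k] * S[j][k];
            A[i + 3][j + 3] = Ic[i][j] + m * sst;  // lower-right block
        }
    }
    return A;
}
```

Because the skew matrix is antisymmetric, the two off-diagonal blocks are transposes of each other, so the assembled matrix is symmetric whenever the rotational inertia is.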

2.4 Principle of dynamical balance

The d’Alembertian wrench ($F^{*}$) is defined to encapsulate the details of the dynamics of each body [14] as

$$ F^{*} = F - A \dot{V} - B V. \tag{15} $$

Using the d’Alembertian wrench, the dynamical balance condition of the body can be expressed as

$$ F^{*} = 0. \tag{16} $$

In [14], it is shown that the d’Alembertian wrench and torque satisfy the dynamical balance condition if the system is composed of two known subsystems under a joint constraint. This facilitates the systematic formulation of composite system dynamics from kinematic constraints. The equations of motion derived by using the principle of dynamical balance can be written in terms of the base body twist $V$ and the joint velocity vector $\dot{q}$ as

$$ W F + L \tau = M \begin{bmatrix} \dot{V} \\ \ddot{q} \end{bmatrix} + C \begin{bmatrix} V \\ \dot{q} \end{bmatrix}, \tag{17} $$

where $F$ is the body wrench, including the gravitational wrench, and $\tau$ is the joint torque vector. The matrices $W$ and $L$ are called the wrench and torque influence matrices, respectively. The matrices $M$ and $C$ are the system inertia and bias matrices, respectively. When the system is connected by $N$-dimensional actuated joints, as in Fig. 2, the system inertia and bias matrices are of dimension $(N + 6) \times (N + 6)$.

3 Composition of n-subsystem

The principle of dynamical balance deals with a composition of two subsystems. The equation of motion for a sequential structure can be derived using this concept. In this section, we extend the principle of dynamical balance to derive an equation of motion of a composite system consisting of a base and an n-subsystem, as shown in Fig. 2. A joint-loading constraint of this composite system is written as

$$ {}^{0Jk}V_{kJ0} = {}^{k}\mathrm{Ad}_{kJ0}^{-1} V_k - \left({}^{0Jk}\mathrm{Ad}_{kJ0}^{-1}\right)\left({}^{0}\mathrm{Ad}_{0Jk}^{-1}\right) V_0 = E_k \dot{q}_k \quad (k = 1, \ldots, n) \tag{18} $$

Fig. 2 Composition of a base system and an n-subsystem

where $E$ is a joint matrix and $\dot{q}$ is a joint velocity. Equation (18) can be rearranged for $V_k$ as below:

$$ V_k = \left({}^{0}\mathrm{Ad}_{k}^{-1}\right) V_0 + \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \dot{q}_k. \tag{19} $$

This equation means that each subsystem velocity is composed of the velocity of the base body and the corresponding joint velocity. By using the dynamical balance condition of the composite system, the joint-loading constraint (19) can be expressed by the dual expression

$$ F_0^{*} = \sum_{i=1}^{n} -\left({}^{0}\mathrm{Ad}_{i}^{-T}\right) F_i^{*}, \tag{20} $$

$$ \tau_k = -E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) F_k^{*} \quad (k = 1, \ldots, n) \tag{21} $$

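where $F^{*}$ is a d’Alembertian wrench. Substituting the definition (15) of the d’Alembertian wrench gives

$$ F_0 + \sum_{i=1}^{n} \left({}^{0}\mathrm{Ad}_{i}^{-T}\right) F_i = \left\{ A_0 \dot{V}_0 + \sum_{i=1}^{n} \left({}^{0}\mathrm{Ad}_{i}^{-T}\right) A_i \dot{V}_i \right\} + \left\{ B_0 V_0 + \sum_{i=1}^{n} \left({}^{0}\mathrm{Ad}_{i}^{-T}\right) B_i V_i \right\}, \tag{22} $$

$$ \tau_k + E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) F_k = E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) \left( A_k \dot{V}_k + B_k V_k \right) \quad (k = 1, \ldots, n). \tag{23} $$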

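In order to express the system equation of motion in terms of the base body twist $V_0$, $\dot{V}_k$ is derived by using the time derivative of the loading constraint (19):

$$ \dot{V}_k = \left({}^{0}\mathrm{Ad}_{k}^{-1}\right) \dot{V}_0 + \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \ddot{q}_k - \left({}^{0}\mathrm{ad}_{k}\right)\left({}^{0}\mathrm{Ad}_{k}^{-1}\right) V_0 + \left({}^{k}\mathrm{Ad}_{kJ0}\right) \dot{E}_k \dot{q}_k \tag{24} $$

where ${}^{0}\mathrm{ad}_{k}$ is the adjoint operator. It is defined by

$$ {}^{0}\mathrm{ad}_{k} = \left({}^{0}\mathrm{Ad}_{k}^{-1}\right) \frac{d}{dt}\left({}^{0}\mathrm{Ad}_{k}\right). \tag{25} $$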
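The third term of (24) comes from differentiating the inverse adjoint in (19). A short worked derivation is sketched below; it assumes, as the form of (24) suggests, that the joint-frame adjoint attached to body k is constant in time, so that only the base-to-body adjoint and the joint matrix contribute derivatives.

```latex
\begin{align*}
{}^{0}\mathrm{Ad}_{k}\,{}^{0}\mathrm{Ad}_{k}^{-1} &= I \\
\Rightarrow\quad
\frac{d}{dt}\big({}^{0}\mathrm{Ad}_{k}\big)\,{}^{0}\mathrm{Ad}_{k}^{-1}
 + {}^{0}\mathrm{Ad}_{k}\,\frac{d}{dt}\big({}^{0}\mathrm{Ad}_{k}^{-1}\big) &= 0 \\
\Rightarrow\quad
\frac{d}{dt}\big({}^{0}\mathrm{Ad}_{k}^{-1}\big)
 &= -\,{}^{0}\mathrm{Ad}_{k}^{-1}\,\frac{d}{dt}\big({}^{0}\mathrm{Ad}_{k}\big)\,{}^{0}\mathrm{Ad}_{k}^{-1}
 = -\big({}^{0}\mathrm{ad}_{k}\big)\big({}^{0}\mathrm{Ad}_{k}^{-1}\big)
\end{align*}
```

Multiplying the last line by the base twist gives exactly the velocity-dependent term that later enters the bias blocks together with the body bias terms.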


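Finally, substituting (19), (24) into (22), (23), the system equation of motion is simplified as

$$
\begin{bmatrix}
F_0 + \sum_{k=1}^{n} \left({}^{0}\mathrm{Ad}_{k}^{-T}\right) F_k \\
E_1^{T} \left({}^{1}\mathrm{Ad}_{1J0}^{T}\right) F_1 \\
\vdots \\
E_n^{T} \left({}^{n}\mathrm{Ad}_{nJ0}^{T}\right) F_n
\end{bmatrix}
+
\begin{bmatrix} 0 \\ \tau_1 \\ \vdots \\ \tau_n \end{bmatrix}
=
\begin{bmatrix}
M_{00} & M_{01} & \cdots & M_{0n} \\
M_{10} & M_{11} & \cdots & M_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
M_{n0} & M_{n1} & \cdots & M_{nn}
\end{bmatrix}
\begin{bmatrix} \dot{V}_0 \\ \ddot{q}_1 \\ \vdots \\ \ddot{q}_n \end{bmatrix}
+
\begin{bmatrix}
C_{00} & C_{01} & \cdots & C_{0n} \\
C_{10} & C_{11} & \cdots & C_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
C_{n0} & C_{n1} & \cdots & C_{nn}
\end{bmatrix}
\begin{bmatrix} V_0 \\ \dot{q}_1 \\ \vdots \\ \dot{q}_n \end{bmatrix}
\tag{26}
$$

where

$$
\left.
\begin{aligned}
M_{00} &= A_0 + \sum_{i=1}^{n} \left({}^{0}\mathrm{Ad}_{i}^{-T}\right) A_i \left({}^{0}\mathrm{Ad}_{i}^{-1}\right) \\
M_{0k} &= \left({}^{0}\mathrm{Ad}_{k}^{-T}\right) A_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \\
M_{k0} &= M_{0k}^{T} \\
M_{kk} &= E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) A_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \\
M_{ki} &= M_{ik} = 0 \quad (i \neq k)
\end{aligned}
\right\} \quad (1 \le k \le n,\ 1 \le i \le n)
$$

and

$$
\left.
\begin{aligned}
C_{00} &= B_0 + \sum_{i=1}^{n} \left({}^{0}\mathrm{Ad}_{i}^{-T}\right) \left( B_i - A_i \left({}^{0}\mathrm{ad}_{i}\right) \right) \left({}^{0}\mathrm{Ad}_{i}^{-1}\right) \\
C_{0k} &= \left({}^{0}\mathrm{Ad}_{k}^{-T}\right) \left\{ A_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) \dot{E}_k + B_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \right\} \\
C_{k0} &= E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) \left\{ -A_k \left({}^{0}\mathrm{ad}_{k}\right) + B_k \right\} \left({}^{0}\mathrm{Ad}_{k}^{-1}\right) \\
C_{kk} &= E_k^{T} \left({}^{k}\mathrm{Ad}_{kJ0}^{T}\right) \left\{ A_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) \dot{E}_k + B_k \left({}^{k}\mathrm{Ad}_{kJ0}\right) E_k \right\} \\
C_{ki} &= C_{ik} = 0 \quad (i \neq k)
\end{aligned}
\right\} \quad (1 \le k \le n,\ 1 \le i \le n).
$$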

Through these explicit forms of the system inertia and bias, the relationships between a base system and its subsystems are described. Using this relation recursively, we derive the general expression for a tree-topology model in the next section.

4 General expression of explicit system parameters

Most dynamics algorithms deal with an abstract formulation that is not a straightforward derivation. Moreover, explicit formulations are focused on the composition of single rigid bodies. In order to apply them to a controller or a simulator, the explicit parameters should be considered in terms of the composition of subsystems. For this reason, the system inertia and bias matrices are explicitly derived using the principle of dynamical balance in this section.

Fig. 3 Index numbering scheme of a multibody system structure

In multibody systems with a tree topology, the parent body is uniquely defined for every child node (see Fig. 3). The indices $k^{+}$ and $k^{-}$ indicate the child body and the parent body of body $k$, respectively. When there are multiple child bodies, the index of the $i$th child of body $k$ is denoted as $k^{+}[i]$. $p(k)$ and $c(k)$ are the index sets of the parents and children of body $k$, and the number of children of body $k$ is denoted as $n(k)$. The general expressions of the explicit system inertia and bias terms are written using these notations. For a systematic explanation, the system inertia and bias matrices are decomposed by separating the base ($B$) and the joint ($J$) parts as

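$$ M = \begin{bmatrix} M_{BB} & M_{BJ} \\ M_{JB} & M_{JJ} \end{bmatrix}, \tag{27} $$

$$ C = \begin{bmatrix} C_{BB} & C_{BJ} \\ C_{JB} & C_{JJ} \end{bmatrix}. \tag{28} $$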

4.1 Accumulated body inertia and bias matrix

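First, we define the notion of accumulated inertia and bias terms to reduce repeated calculations. The system inertia matrix $M$ and the system bias matrix $C$ can be computed by accumulating the body inertia and bias matrices from the leaf nodes to the root node. The accumulated inertia ($A^{*}$) and bias ($B^{*}$) are defined on the body with index $k$ as

$$ A_k^{*} = \begin{cases} A_k + \sum_{j=1}^{n(k)} {}^{k}\mathrm{Ad}_{k^{+}[j]}^{-T}\, A_{k^{+}[j]}^{*}\, {}^{k}\mathrm{Ad}_{k^{+}[j]}^{-1} & \text{if } n(k) > 0, \\ A_k & \text{if } n(k) = 0, \end{cases} \tag{29} $$

$$ B_k^{*} = \begin{cases} B_k + \sum_{j=1}^{n(k)} \left\{ {}^{k}\mathrm{Ad}_{k^{+}[j]}^{-T} \left( B_{k^{+}[j]}^{*} - A_{k^{+}[j]}^{*}\, {}^{k}\mathrm{ad}_{k^{+}[j]} \right) {}^{k}\mathrm{Ad}_{k^{+}[j]}^{-1} \right\} & \text{if } n(k) > 0, \\ B_k & \text{if } n(k) = 0 \end{cases} \tag{30} $$

where $A_k$ and $B_k$ indicate the $k$th body inertia and bias term, respectively.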
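The leaf-to-root accumulation in (29) can be sketched in C++ as a single backward sweep over the bodies. This is a minimal sketch under stated assumptions: bodies are numbered so that every parent index is smaller than its child's index, body 0 is the base, and `Xinv[k]` stands for the inverse adjoint of body k relative to its parent. The names `accumulateInertia`, `congruence` and `scaledIdentity` are hypothetical helpers, not the paper's implementation.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Mat6 = std::array<std::array<double, 6>, 6>;

// Helper for building simple test inertias: s times the 6x6 identity.
Mat6 scaledIdentity(double s) {
    Mat6 m{};
    for (int i = 0; i < 6; ++i) m[i][i] = s;
    return m;
}

// Congruence transform X^T * A * X, the pattern Ad^{-T} A* Ad^{-1} of Eq. (29).
Mat6 congruence(const Mat6& X, const Mat6& A) {
    Mat6 AX{};
    for (int i = 0; i < 6; ++i)
        for (int j = 0; j < 6; ++j)
            for (int k = 0; k < 6; ++k)
                AX[i][j] += A[i][k] * X[k][j];
    Mat6 out{};
    for (int i = 0; i < 6; ++i)
        for (int j = 0; j < 6; ++j)
            for (int k = 0; k < 6; ++k)
                out[i][j] += X[k][i] * AX[k][j];
    return out;
}

// parent[k]: parent index of body k (parent[0] is unused, body 0 is the base).
// A[k]:      body inertia of body k.
// Xinv[k]:   inverse adjoint of body k relative to its parent (Xinv[0] unused).
std::vector<Mat6> accumulateInertia(const std::vector<int>& parent,
                                    const std::vector<Mat6>& A,
                                    const std::vector<Mat6>& Xinv) {
    assert(parent.size() == A.size() && Xinv.size() == A.size());
    std::vector<Mat6> Astar = A;  // leaves keep A*_k = A_k
    // Sweep from the highest index down, so children are finished before parents.
    for (int k = static_cast<int>(A.size()) - 1; k >= 1; --k) {
        assert(parent[k] >= 0 && parent[k] < k);
        const Mat6 term = congruence(Xinv[k], Astar[k]);
        Mat6& target = Astar[parent[k]];
        for (int i = 0; i < 6; ++i)
            for (int j = 0; j < 6; ++j)
                target[i][j] += term[i][j];
    }
    return Astar;
}
```

The accumulated bias of (30) follows the same sweep with one extra term per child; it is left out here to keep the sketch short.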

4.2 System inertia matrix

The elements $M_{BB}$, $M_{BJ}$, $M_{JB}$, and $M_{JJ}$ of the system inertia matrix in (27) are expressed using the accumulated body inertia $A^{*}$. The matrix $M_{BB}$ is the accumulated inertia with respect to the base body:

$$ M_{BB} = A_0^{*}. \tag{31} $$


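The $k$th column of $M_{BJ}$ and the $k$th row of $M_{JB}$, which correspond to the $k$th body, are calculated by

$$
\begin{aligned}
M_{BJ,k} &= {}^{0}\mathrm{Ad}_{k}^{-T} A_k^{*} \mathcal{E}_k, \\
M_{JB,k} &= \mathcal{E}_k^{T} A_k^{*}\, {}^{0}\mathrm{Ad}_{k}^{-1} \quad (k = 1, \ldots, n)
\end{aligned}
\tag{32}
$$

where $\mathcal{E}_k$ is the adjacent Jacobian of the $k$th joint. It is defined by

$$ {}^{k^{-}}V_k = \left({}^{k}\mathrm{Ad}_{kJk^{-}}\right) E_k \dot{q}_k = \mathcal{E}_k(q_k)\, \dot{q}_k. \tag{33} $$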

The matrix $M_{JJ}$ shows the dynamic correlation of the joints. $M_{JJ,kk}$ are the diagonal elements of $M_{JJ}$, and $M_{JJ,ik}$ and $M_{JJ,ki}$ are its off-diagonal elements. The off-diagonal elements represent the relationship between the $i$th and $k$th bodies. In a tree-topology model having branch nodes and leaf nodes, $M_{JJ}$ may have a sparse structure with zero entries. The diagonal elements can be calculated as follows:

$$ M_{JJ,kk} = \mathcal{E}_k^{T} \left( A_k^{*} \mathcal{E}_k \right). \tag{34} $$

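For the off-diagonal elements, for each $i < k$ with $i \in p(k)$ the elements are given by (35); otherwise they are zero:

$$
\begin{aligned}
M_{JJ,ik} &= \mathcal{E}_i^{T}\, {}^{i}\mathrm{Ad}_{k}^{-T} A_k^{*} \mathcal{E}_k
 = \mathcal{E}_i^{T}\, {}^{0}\mathrm{Ad}_{i}^{T}\, {}^{0}\mathrm{Ad}_{k}^{-T} A_k^{*} \mathcal{E}_k
 = \left({}^{0}\mathrm{Ad}_{i}\, \mathcal{E}_i\right)^{T} M_{BJ,k}, \\
M_{JJ,ki} &= \mathcal{E}_k^{T} A_k^{*}\, {}^{i}\mathrm{Ad}_{k}^{-1} \mathcal{E}_i
 = \mathcal{E}_k^{T} A_k^{*}\, {}^{0}\mathrm{Ad}_{k}^{-1}\, {}^{0}\mathrm{Ad}_{i}\, \mathcal{E}_i
 = M_{JB,k} \left({}^{0}\mathrm{Ad}_{i}\, \mathcal{E}_i\right).
\end{aligned}
\tag{35}
$$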
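The sparsity rule stated with (35), namely that a joint block is nonzero only when one body is an ancestor of the other, can be illustrated with a short C++ sketch that marks the nonzero blocks from a parent array. It assumes that body 0 is the base and that every parent index is smaller than its child's index; the function name `jointCouplingPattern` is a hypothetical choice for this illustration.

```cpp
#include <cassert>
#include <vector>

// parent[k]: parent index of body k (body 0 is the base, parent[0] is unused).
// Returns an (n+1) x (n+1) table where entry [i][k] is true if the joint block
// for bodies i and k can be nonzero. Row and column 0 are unused.
std::vector<std::vector<bool>> jointCouplingPattern(const std::vector<int>& parent) {
    const int n = static_cast<int>(parent.size()) - 1;  // joints are 1..n
    std::vector<std::vector<bool>> nonzero(parent.size(),
                                           std::vector<bool>(parent.size(), false));
    for (int k = 1; k <= n; ++k) {
        assert(parent[k] >= 0 && parent[k] < k);  // guarantees the walk terminates
        nonzero[k][k] = true;  // diagonal block
        // Walk up the tree: every ancestor j of k couples with k.
        for (int j = parent[k]; j != 0; j = parent[j]) {
            nonzero[j][k] = true;
            nonzero[k][j] = true;
        }
    }
    return nonzero;
}
```

For a two-branch tree with `parent = {0, 0, 1, 0, 3}`, bodies 1 and 2 form one branch and bodies 3 and 4 the other, so the blocks (1,2) and (3,4) are marked while the cross-branch blocks such as (2,3) stay empty.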

4.3 System bias matrix

The system bias matrix $C$ can be computed similarly to the system inertia matrix $M$. However, it should be noted that the bias matrix is not symmetric.

First, the top-left matrix $C_{BB}$ is equal to the accumulated bias matrix of the base body. That is,

$$ C_{BB} = B_0^{*}. \tag{36} $$

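The $k$th column of $C_{BJ}$ and the $k$th row of $C_{JB}$, which correspond to the $k$th body, are calculated by

$$
\begin{aligned}
C_{BJ,k} &= {}^{0}\mathrm{Ad}_{k}^{-T} \left( A_k^{*} \dot{\mathcal{E}}_k + B_k^{*} \mathcal{E}_k \right), \\
C_{JB,k} &= \mathcal{E}_k^{T} \left( B_k^{*} - A_k^{*}\, {}^{0}\mathrm{ad}_{k} \right) {}^{0}\mathrm{Ad}_{k}^{-1}
\end{aligned}
\tag{37}
$$

where $\dot{\mathcal{E}}$ is the derivative of $\mathcal{E}$. $C_{JJ,kk}$ is the $k$th diagonal element, and $C_{JJ,ik}$ and $C_{JJ,ki}$ are off-diagonal elements. In a tree-topology model having branch nodes and leaf nodes, $C_{JJ}$ may have a sparse structure. The diagonal elements can be calculated as follows:

$$ C_{JJ,kk} = \mathcal{E}_k^{T} \left( A_k^{*} \dot{\mathcal{E}}_k + B_k^{*} \mathcal{E}_k \right). \tag{38} $$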


    // Phase I: Updating kinematic parameters
    for i ← 1 to N do
        Update_Kinematics(q_i, q̇_i);
    end
    // Phase II: Accumulating body inertia and bias
    A*_N ← A_N;
    B*_N ← B_N;
    for i ← N to 1 do
        A*_{i−1} ← Accumulate_Inertia(A_{i−1}, A*_i);
        B*_{i−1} ← Accumulate_Bias(B_{i−1}, B*_i);
    end
    // Phase III: Computing elements of matrices
    M_BB = A*_0;
    C_BB = B*_0;
    for i ← N to 1 do
        compute M_BJ,i;
        M_JB,i ← Transpose(M_BJ,i);
        compute C_BJ,i & C_JB,i;
        compute M_JJ,ii;
        compute C_JJ,ii;
        j ← parent(i);
        while j ≠ 0 do
            compute M_JJ,ij;
            M_JJ,ji ← Transpose(M_JJ,ij);
            compute C_JJ,ij & C_JJ,ji;
            j ← parent(j);
        end
    end

Algorithm 1: Sequential algorithm of computing system inertia and bias matrices

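For the off-diagonal elements, for each $i < k$ with $i \in p(k)$ the elements are given by (39); otherwise they are zero:

$$
\begin{aligned}
C_{JJ,ik} &= \mathcal{E}_i^{T}\, {}^{i}\mathrm{Ad}_{k}^{-T} \left( A_k^{*} \dot{\mathcal{E}}_k + B_k^{*} \mathcal{E}_k \right)
 = \mathcal{E}_i^{T}\, {}^{0}\mathrm{Ad}_{i}^{T}\, {}^{0}\mathrm{Ad}_{k}^{-T} \left( A_k^{*} \dot{\mathcal{E}}_k + B_k^{*} \mathcal{E}_k \right)
 = \left({}^{0}\mathrm{Ad}_{i}\, \mathcal{E}_i\right)^{T} \left( C_{BJ,k} \right), \\
C_{JJ,ki} &= \mathcal{E}_k^{T} \left( \left( B_k^{*} - A_k^{*}\, {}^{i}\mathrm{ad}_{k} \right) {}^{i}\mathrm{Ad}_{k}^{-1} \mathcal{E}_i + A_k^{*}\, {}^{i}\mathrm{Ad}_{k}^{-1} \dot{\mathcal{E}}_i \right)
 = C_{JB,k} \left({}^{0}\mathrm{Ad}_{i}\, \mathcal{E}_i\right) + M_{JB,k}\, {}^{0}\mathrm{Ad}_{i} \left( \dot{\mathcal{E}}_i + {}^{0}\mathrm{ad}_{i}\, \mathcal{E}_i \right).
\end{aligned}
\tag{39}
$$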

5 Parallel algorithm of computing system inertia and bias matrices

5.1 Algorithm analysis

The sequential algorithm can be implemented using the general expressions of the system inertia and bias, as shown in Algorithm 1. This algorithm has three main phases: updating the kinematic parameters, accumulating the body inertia and bias terms, and computing the elements of the matrices. The first phase updates the kinematic parameters from the base body to the subsystem bodies. The data flow of the second phase is in the opposite direction. The third phase computes the elements of the system inertia and bias independently. The parallelization efficiency of each phase depends differently on the structure of the system. The forward computation from a parent node to its child nodes, which updates the kinematic parameters, offers more parallelism the more nodes there are on the same level, because these nodes have no dependency on each other. In contrast, when the calculation proceeds from the child nodes to the parent node, as in the second phase, the operations of each level can be carried out only after the operations of the child nodes have finished. This results in a different efficiency depending on the system structure. Figure 4 describes a load distribution example for systems of different structures with the same degree of freedom when the brute-force parallelization approach is applied with four processors. Figure 4(a) shows the load distribution in a structure with a star topology with four branches. The update of the kinematic parameters (K) can be computed at the same time for all branches because the child nodes have no dependencies. In contrast, the accumulation of the system inertia and bias (A) must avoid data races, that is, simultaneous write operations to the same memory. This causes the parent node to wait until the execution of its child nodes has ended. Thus, in this topology, the accumulation procedure is performed sequentially. The final procedure for computing each element of the inertia and bias matrices (E) can be carried out in parallel.

Fig. 4 Load distribution applying the loop-level parallelization approach

Figure 4(b) shows a structure with two branches. The update of the kinematic parameters (K) can be performed in parallel for each branch on two processors. In the accumulation procedure, the accumulated inertia and bias parameters (A) for the leaf nodes of each branch can be computed in parallel. However, waiting due to the critical section occurs where the branches merge. These two examples show that the kinematics can be computed efficiently in a system with more branches. The accumulation process, on the other hand, can be delayed by the critical section depending on the topology, which leads to a difference in performance. Accordingly, in order to improve the performance of parallelization, it is important to design a parallel algorithm that is robust to a variety of structures. In the following subsections, we introduce our proposed parallel algorithm in detail.

Fig. 5 Load distribution applying the proposed parallelization approach
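The observation that nodes on the same level have no dependency on each other during the kinematics update can be illustrated by grouping bodies by their depth in the tree. The C++ sketch below is an illustration only, assuming body 0 is the base and every parent index is smaller than its child's index; the function name `groupByDepth` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// parent[k]: parent index of body k (body 0 is the base, parent[0] is unused).
// Returns levels[d-1] = list of bodies at depth d below the base. Bodies in the
// same list do not depend on each other in the forward (kinematics) pass.
std::vector<std::vector<int>> groupByDepth(const std::vector<int>& parent) {
    const int n = static_cast<int>(parent.size());
    std::vector<int> depth(parent.size(), 0);
    std::vector<std::vector<int>> levels;
    for (int k = 1; k < n; ++k) {
        assert(parent[k] >= 0 && parent[k] < k);
        depth[k] = depth[parent[k]] + 1;
        if (static_cast<int>(levels.size()) < depth[k]) levels.resize(depth[k]);
        levels[depth[k] - 1].push_back(k);
    }
    return levels;
}
```

For a star topology with four branches of two bodies each, every level contains four bodies, so four processors can be kept busy in the forward pass. For a serial chain each level contains a single body, and this kind of level-wise parallelism offers no gain.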

5.2 Parallelism design

The algorithm for the explicit dynamics parameters consists of both forward-propagation and back-propagation. As illustrated in Fig. 4, the waiting time in each phase and the idle time due to the critical section during the accumulation operations degrade the efficiency of the parallel algorithm. In order to improve the efficiency, the waiting time must be reduced to prevent idle resources in each process.

The choice of parallelism is important for minimizing the latency of the algorithm. To allocate computation procedures dynamically to the available resources, we exploit both task and data parallelism. Task parallelism is applied to divide the whole system into subsystems by functionality. Additionally, data parallelism is exploited for computations on the same level. Figure 5 illustrates the parallel model for computing the system inertia and bias. Figure 6 describes the load distribution of systems of different structures with the same degree of freedom when the proposed parallelization approach is applied with four processors. During the back-propagation that accumulates the system inertia, the spawn and wait operations allow the processor to be used for other work while a task is waiting for other tasks. Therefore, the steps that compute the matrix elements can be completed before the accumulation steps have finished. With this parallelism, the performance degradation at nodes with many branches can be resolved.

Fig. 6 Parallel model for computing system inertia and bias

To perform the forward and back-propagation efficiently, two task classes, Task 1 and Task 2, are derived from a common task class. Task 1 is designed to compute the kinematic parameters and the accumulated inertia and bias terms. For task parallelism, a child Task 1 is spawned by the spawn-and-wait instruction of its parent Task 1. This instruction propagates body information to the child bodies by constructing a list of subtasks, and then waits for all the spawned subtasks to finish. Task 2, on the other hand, is designed to obtain the diagonal and off-diagonal elements for each element of the subsystem. Since these computations have no dependencies on each other, Task 2 uses the data parallelism model. After the accumulation process, Task 1 spawns a list of Task 2 subtasks. Because Task 1 manages its Task 2 subtasks directly, the computation time can be reduced without the latency of waiting for Task 1. Algorithm 2 lists the code of the task programming models for the parallel computation of the dynamics parameters.
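The spawn-and-wait pattern described for Task 1 can be sketched with the C++ standard library alone. The example below is a simplified stand-in, not the paper's implementation: it uses `std::async` instead of the TBB task scheduler, and a scalar per body instead of the 6 × 6 accumulated inertia. The `Body` struct and the function name `accumulateSubtree` are assumptions made for this illustration.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Simplified body: a scalar stands in for the 6x6 body inertia A_k.
struct Body {
    double inertia;
    std::vector<int> children;  // indices of the child bodies
};

// Fork: launch one asynchronous task per child subtree.
// Join: wait for all children, then accumulate into the parent.
double accumulateSubtree(const std::vector<Body>& bodies, int k) {
    assert(k >= 0 && k < static_cast<int>(bodies.size()));
    std::vector<std::future<double>> pending;
    for (int child : bodies[k].children) {
        pending.push_back(std::async(std::launch::async, [&bodies, child] {
            return accumulateSubtree(bodies, child);
        }));
    }
    double total = bodies[k].inertia;
    // Each child returns its own result, so no two tasks write the same memory.
    for (std::future<double>& f : pending) total += f.get();
    return total;
}
```

Unlike a work-stealing scheduler, this sketch starts one thread per child and blocks the parent while it waits, so it shows the dependency structure of the accumulation rather than an efficient way to schedule it.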

6 Experiments

The proposed parallelization algorithms are applied to compute the system inertia and bias matrices of articulated multibody systems with different topologies (Fig. 7). Each body is connected by a revolute joint. The experiments were performed on a PC with the 64-bit Windows 7 OS and an Intel i7-3930K 3.20 GHz processor with six cores, on which up to 12 execution threads were available.

    #include "tbb/task.h"  // legacy tbb::task interface

    // Task 1: updates the kinematics of a body, then accumulates inertia and bias.
    // Body is the application's rigid-body class; its definition is omitted here.
    class Task1 : public tbb::task {
        Body* _body;
    public:
        Task1(Body* body) : _body(body) {}

        void Update_Kinematics_Parameters() {...}
        void Accumulate_Inertia_And_Bias() {...}

        // Task parallelism: one child Task1 per child body, then wait for all.
        void Task_Parallelism() {
            if (_body->_childNum == 0) return;
            tbb::task_list list;
            for (int i = 0; i < _body->_childNum; i++) {
                list.push_back(*new (allocate_child()) Task1(_body->getChildBody(i)));
            }
            set_ref_count(_body->_childNum + 1);
            spawn_and_wait_for_all(list);
            Accumulate_Inertia_And_Bias();
        }

        // Data parallelism: one Task2 for the diagonal element and one per ancestor.
        void Data_Parallelism() {
            if (_body->_parentNum == 0) return;
            tbb::task_list list;
            list.push_back(*new (allocate_child()) Task2(_body, true));
            for (int i = 0; i < _body->_ancestorNum; i++) {
                list.push_back(*new (allocate_child()) Task2(_body->_ancestor(i)));
            }
            set_ref_count(_body->_parentNum);
            spawn(list);
        }

        tbb::task* execute() {
            Update_Kinematics_Parameters();
            Task_Parallelism();
            Data_Parallelism();
            return NULL;  // no continuation task
        }
    };

    // Task 2: computes one diagonal or off-diagonal matrix element.
    class Task2 : public tbb::task {
        Body* _body;
        bool _bDiagonal;  // true: diagonal element, false: off-diagonal element
    public:
        Task2(Body* body, bool bDiagonal = false) : _body(body), _bDiagonal(bDiagonal) {}

        void Update_Off_Diagonal_Elements() {...}
        void Update_Diagonal_Elements() {...}

        tbb::task* execute() {
            if (_bDiagonal) Update_Diagonal_Elements();
            else Update_Off_Diagonal_Elements();
            return NULL;  // no continuation task
        }
    };

Algorithm 2 Source code of task programming models

6.1 Serial type model

Figure 8 shows the computation time of the system inertia matrix and bias matrix using the proposed parallel algorithm with respect to the degrees of freedom (DOF) of the serial type robots. The results reflect the improvement obtained only by updating the elements of the system inertia/bias matrices in parallel, because the influence of the accumulated inertia/bias algorithm is negligible for serial type robots. The computation time of the sequential algorithm grew as O(n²), where n is the number of degrees of freedom of the system. The parallel algorithm, on the other hand, improved the time bound to O(n²/p) using p processors.

Fig. 7 Tree topology models for experiments

Fig. 8 Comparison of system inertia and bias algorithms in serial type models: (a) system inertia matrix; (b) system bias matrix
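The quadratic growth for serial chains can be made plausible by counting joint blocks. In a serial chain every body i < k is an ancestor of body k, so every pair of joints is coupled. The count below is a rough sketch that assumes each block costs about the same and that the blocks are spread evenly over p processors; it ignores the accumulation phase and any scheduling overhead.

```latex
\[
\underbrace{n}_{\text{diagonal blocks}}
+ \underbrace{\frac{n(n-1)}{2}}_{\text{ancestor pairs } (i,k),\ i<k}
= \frac{n(n+1)}{2} = O(n^{2}),
\qquad
\frac{1}{p}\cdot\frac{n(n+1)}{2} = O\!\left(\frac{n^{2}}{p}\right).
\]
```

For example, a 30-DOF chain has 30 · 31 / 2 = 465 blocks per matrix, or about 78 blocks per processor when they are divided evenly among six processors.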

6.2 Tree topology model

The proposed parallelization algorithm is applied to multi-degree-of-freedom models with different tree-like topologies. The proposed algorithm is compared not only with a sequential algorithm but also with a loop-level parallel algorithm in order to verify the parallelism performance. The loop-level parallel algorithm was implemented with data parallelism only. As shown in Figs. 9 and 10, the computation times of the proposed method are lower than those of the sequential algorithm and the loop-level parallel algorithm. For models with the same degree of freedom, the computation time decreased as the number of branch nodes of the dynamic system increased. Moreover, compared to the other algorithms, the computational efficiency of the proposed algorithm improved for systems with a high degree of freedom. In this case, the number of tasks generated for the computation of the system bias matrix is greater than that for the system inertia matrix because the bias matrix is not symmetric. Therefore, the computation of the system bias matrix improves much more than that of the inertia matrix. Finally, as an example, the proposed algorithm was applied to a 25-DOF humanoid robot model called MAHRU, which was developed by KIST [16]. The proposed algorithm reduced the processing time from 2.2 to 0.5 ms. Accordingly, these parameters can be computed for modern torque-controlled robots, which require control rates over 1 kHz. This result is consistent with the previous analysis in this section.

Fig. 9 Comparison of system inertia algorithms in different tree-like topology models: (a) 2-branch model; (b) 5-branch model; (c) 10-branch model

Fig. 10 Comparison of system bias algorithms in different tree-like topologies: (d) 2-branch model; (e) 5-branch model; (f) 10-branch model
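The timing figures quoted for the 25-DOF model can be put in perspective with a short calculation. It assumes that the dynamics parameters must be recomputed once per control cycle and that a 1 kHz controller therefore has a period of 1 ms.

```latex
\[
\text{speedup} = \frac{2.2\ \text{ms}}{0.5\ \text{ms}} = 4.4,
\qquad
T_{1\,\text{kHz}} = \frac{1}{1000\ \text{Hz}} = 1\ \text{ms},
\qquad
0.5\ \text{ms} < 1\ \text{ms} < 2.2\ \text{ms}.
\]
```

Under these assumptions the sequential computation alone exceeds one control period, whereas the parallel computation takes about half of it.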


7 Conclusions

A new dynamics computation algorithm is proposed based on the principle of dynamical balance. This algorithm can explicitly compute the inertia matrix and bias terms of the multibody system dynamics. Also, the dynamics formulation is derived so that the dynamics parameters can be computed in parallel. The proposed algorithm is verified on various tree structures of a robot. The experimental results show that the proposed algorithm can significantly reduce the computation time for the system inertia and bias matrices when parallel computation methods are applied. In the future, we will investigate the performance of the proposed dynamics computation algorithm compared to other algorithms. The algorithm will also be extended to systems that have closed loops.

Acknowledgement This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2015R1A2A1A10055798) and the Technology Innovation Program (10060081) funded by the Ministry of Trade, Industry & Energy (MI, Korea).

References

1. Stepanenko, Y., Vukobratovic, M.: Dynamics of articulated open-chain active mechanisms. Math. Biosci. 28, 137–170 (1976)

2. Orin, D.E., McGhee, R.B., Vukobratovic, M., Hartoch, G.: Kinematic and kinetic analysis of open-chain linkages utilizing Newton–Euler methods. Math. Biosci. 43, 107–130 (1979)

3. Luh, J.Y.S., Walker, M.W., Paul, R.P.C.: On-line computational scheme for mechanical manipulators. J. Dyn. Syst. Meas. Control 102(2), 69–76 (1980)

4. Featherstone, R.: The calculation of robot dynamics using articulated-body inertias. Int. J. Robot. Res. 2(1), 13–30 (1983)

5. Featherstone, R.: Robot Dynamics Algorithms. Kluwer Academic, Boston/Dordrecht/Lancaster (1987)

6. Lilly, K.W., Orin, D.E.: Alternate formulations for the manipulator inertia matrix. Int. J. Robot. Res. 10, 64–74 (1991)

7. From, J.: An explicit formulation of singularity-free dynamic equations of mechanical systems in Lagrangian form—part one: single rigid bodies. Model. Identif. Control 33(2), 45–60 (2012)

8. Fijany, A., Bejczy, A.K.: A class of parallel algorithms for computation of the manipulator inertia matrix. IEEE Trans. Robot. Autom. 5(5), 600–615 (1989)

9. Lee, C.S.G., Chang, P.R.: Efficient parallel algorithms for robot forward dynamics computation. IEEE Trans. Syst. Man Cybern. 18(2), 238–251 (1988)

10. Bondhugula, U., Baskaran, M.M., Hartono, A., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Towards effective automatic parallelization for multicore systems. In: IPDPS. IEEE, pp. 1–5 (2008)

11. Agathos, S.N., Hadjidoukas, P.E., Dimakopoulos, V.V.: Design and implementation of openmp tasks in the ompi compiler. In: Angelidis, P., Michalas, A. (eds.) Panhellenic Conference on Informatics, pp. 265–269. IEEE Press, New York (2011)

12. Duan, S., Anderson, K.S.: Parallel implementation of a low order algorithm for dynamics of multibody systems on a distributed memory computing system. Eng. Comput. 16(2), 96–108 (2000)

13. Chhugani, J., Nguyen, A.D., Lee, V.W., Macy, W., Hagog, M., Chen, Y.-K., Baransi, A., Kumar, S., Dubey, P.: Efficient implementation of sorting on multi-core SIMD CPU architecture. In: Proceedings of the VLDB Endowment, vol. 1, pp. 1313–1324 (2008)

14. Park, J.: Principle of dynamical balance for multibody systems. Multibody Syst. Dyn. 14(3), 269–299 (2005)

15. Murray, R.M., Li, Z., Sastry, S.S.: A Mathematical Introduction to Robotic Manipulation. CRC Press, Boca Raton (1994)

16. You, B., Choi, Y., Jeong, M., Kim, D., Oh, Y., Kim, C., Cho, J., Park, M., Oh, S.: Network-based humanoids ‘Marhu’ and ‘Ahra’. In: Proc. Int. Conf. on Ubi. Robots and Ambient Intelli, pp. 376–379 (2005)
