

Learning object-centric trajectories of dexterous hand manipulation tasks from demonstration

Gokhan Solak1 and Lorenzo Jamone1

Dexterous hand manipulation tasks are challenging due to the complexities of encoding the task, coordinating the fingers, and tracking the object. Since some daily tasks require the manipulated object to follow intricate trajectories, as in writing and sewing, designing the motion manually is not viable. Even when the desired motion is known, the controller has to handle the task execution and the object grasping simultaneously. Moreover, hand-object occlusions limit visual object tracking, so tactile sensing is necessary. We propose a combination of dynamical movement primitives (DMP), the virtual spring framework (VSF) and a force-feedback-based online grasp adaptation method to overcome these challenges. The DMP is used to learn complex trajectories from demonstration, the VSF is used to coordinate the fingers to keep the object in grasp during manipulation, and the force feedback is used to continuously adjust the grasping forces to increase contact stability.

The VSF is an impedance-based, object-agnostic grasping method that assumes virtual springs connecting the fingertips to a virtual object frame (Fig. 1.a). The virtual frame is approximated using only the fingertip positions, without knowing the exact model or pose of the object. Stiffness parameters of the springs are determined heuristically or learned from demonstration as in [4]. The VSF also adds compliance to the system, which is helpful against uncertainties [2]. It allows designing simple object-centric dexterous manipulation actions such as translation and rotation, as presented in [7] and [4]. However, some tasks depend not only on the final state but also on the specific trajectory, as discussed above.
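The core VSF computation can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the virtual frame origin is simply the centroid of the fingertip positions and that all springs share one isotropic scalar stiffness; the function names and the rest-offset parameterization are our own.

```python
import numpy as np

def virtual_frame_origin(fingertips):
    """Approximate the virtual object frame origin as the centroid of the
    fingertip positions (no object model or object pose is needed)."""
    return np.mean(fingertips, axis=0)

def spring_forces(fingertips, frame_origin, rest_offsets, stiffness):
    """Force that each virtual spring exerts on its fingertip.
    rest_offsets: desired fingertip positions relative to the frame origin.
    stiffness:    scalar spring constant shared by all fingers (simplified)."""
    anchors = frame_origin + rest_offsets      # spring anchor points on the frame
    return stiffness * (anchors - fingertips)  # Hooke's law, one row per finger
```

Commanding a new pose for the virtual frame shifts every anchor at once, so the fingers are coordinated implicitly through the springs rather than by explicit per-finger planning.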

Learning policies that achieve complex trajectories is possible using reinforcement learning [1], learning from demonstration [3], or both [5]. We follow the demonstration approach since it requires fewer resources. The DMP makes it possible to learn complex trajectories from a single demonstration.

The DMP is a framework that represents the motion as a dynamical system [3]. The dynamical system is designed to ensure convergence to a goal state. Custom trajectories are learned as an additive forcing term that perturbs the system. It is possible to change the initial and goal conditions of a task without losing the shape of the trajectory. The dynamical system

*This work was partially supported by the EPSRC UK (with projects MAN3, EP/S00453X/1, and NCNR, EP/R02572X/1)

1Gokhan Solak and Lorenzo Jamone are with the School of Electronic Engineering and Computer Science, Queen Mary University of London, 10 Godward Square, Mile End Road, London E1 4FZ {g.solak,l.jamone}@qmul.ac.uk

Fig. 1. (a) Virtual spring framework and object trajectory control. Robot fingertips are connected to the object frame O with virtual springs. The object pose is controlled in order to minimize the error between O and the reference frame Ht. The reference frame follows a trajectory that is learned from demonstration. (b) The experiment setup: the initial state (top image), the final state after translation (middle image) and the final state after rotation (bottom image).

also provides an attractor landscape which drives the system towards the goal in case of perturbations.
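As a concrete sketch, a one-dimensional discrete DMP in the style of [3] can be integrated as below. This is an illustrative implementation under standard gain choices (a critically damped transformation system and an exponentially decaying canonical system), not the exact formulation used in this work; all names and default gains are our own.

```python
import numpy as np

def dmp_rollout(y0, g, weights, centers, widths, tau=1.0, dt=0.001, alpha=25.0):
    """Integrate a 1-D discrete DMP: the second-order system converges to the
    goal g, while the learned forcing term f shapes the path in between."""
    beta = alpha / 4.0              # critical damping of the transformation system
    ax = alpha / 3.0                # decay rate of the canonical system
    y, z, x = y0, 0.0, 1.0          # position, scaled velocity, phase
    path = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)           # RBF activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)
        z += dt / tau * (alpha * (beta * (g - y) - z) + f)   # transformation system
        y += dt / tau * z
        x += dt / tau * (-ax * x)                            # canonical system
        path.append(y)
    return np.array(path)
```

With zero weights the rollout is a plain goal-directed attractor; the weights learned from a demonstration add the task-specific trajectory shape, and changing y0 or g rescales that shape without relearning.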

We use the DMP to learn object-centric trajectories for the virtual object frame of the VSF. Thus, the trajectory is unaware of the finger positions and kinematics. The DMP defines the motion of the object; the fingers follow the object passively under the influence of the virtual springs. The desired finger motion is transformed to desired joint motion using inverse kinematics, and the joint torques are calculated using PID control. This provides a base controller that both achieves the task and keeps the object in hand. However, grasping an unknown object is prone to occasional contact losses.
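One step of the joint-space tracking in this base controller might look like the following sketch. The inverse kinematics that produces the desired joint positions q_des is omitted (it depends on the hand's kinematic model); only the PID step is spelled out, and the gain values are illustrative.

```python
import numpy as np

def pid_step(q, dq, q_des, err_int, dt, kp=20.0, kd=1.0, ki=0.5):
    """One joint-space PID step. q_des is assumed to come from inverse
    kinematics applied to the desired fingertip positions. Returns the
    commanded joint torques and the updated integral of the error."""
    err = q_des - q
    err_int = err_int + err * dt
    tau = kp * err - kd * dq + ki * err_int   # velocity term damps the joints
    return tau, err_int
```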

We use the feedback from fingertip force sensors to prevent contact losses. This method adjusts the stiffness of the virtual springs during the manipulation in order to keep the forces around a desired value. This is done by a loose proportional controller that tolerates a reasonable steady-state error, so as not to disturb the manipulation, but responds to larger errors.
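Such a loose proportional controller can be sketched as a deadband update of each spring's stiffness. This is our reading of the described behavior; the exact update law is not specified in the text, and the parameter names and default values are illustrative.

```python
def adapt_stiffness(k, force, f_des, gain=10.0, tol=0.3, k_min=10.0, k_max=500.0):
    """Loose proportional grasp adaptation: contact-force errors inside the
    tolerance band are ignored (a steady-state error is accepted so the
    manipulation is not disturbed); larger errors change the spring stiffness
    to pull the measured force back toward f_des."""
    err = f_des - force
    if abs(err) <= tol:
        return k                                    # inside the deadband: no change
    excess = err - tol if err > 0 else err + tol    # error beyond the band edge
    return min(max(k + gain * excess, k_min), k_max)
```

A weak contact (force below f_des by more than the tolerance) stiffens the spring and restores grasp pressure; an excessive force softens it, with the clamp keeping the stiffness in a safe range.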

We evaluate the method with experiments on a real robot hand [6]. The robot learns the rotation and translation tasks (Fig. 1.b) from a single example, demonstrated kinesthetically by a human (Fig. 2.a). It then reproduces the actions autonomously under different initial state, final state and object size conditions, as shown in Figure 2. We also observe


Fig. 2. (a) Kinesthetic demonstration of the translation action. (b) Reproduction with different initial and goal positions. (c) Reproduction with different object sizes. Blue arrows show desired translation amounts.

that the force feedback improves the grasp stability during manipulation.

REFERENCES

[1] M. Andrychowicz, B. Baker, M. Chociej, R. Jozefowicz, B. McGrew, J. Pachocki, A. Petron, M. Plappert, G. Powell, A. Ray, et al. Learning dexterous in-hand manipulation. arXiv preprint arXiv:1808.00177, 2018.

[2] Z. Chen, C. Ott, and N. Y. Lii. A compliant multi-finger grasp approach control strategy based on the virtual spring framework. In International Conference on Intelligent Robotics and Applications, pages 381–395. Springer, 2015.

[3] A. J. Ijspeert, J. Nakanishi, H. Hoffmann, P. Pastor, and S. Schaal. Dynamical movement primitives: learning attractor models for motor behaviors. Neural Computation, 25(2):328–373, 2013.

[4] M. Li, H. Yin, K. Tahara, and A. Billard. Learning object-level impedance control for robust grasping and dexterous manipulation. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pages 6784–6791. IEEE, 2014.

[5] A. Rajeswaran, V. Kumar, A. Gupta, J. Schulman, E. Todorov, and S. Levine. Learning complex dexterous manipulation with deep reinforcement learning and demonstrations. arXiv preprint arXiv:1709.10087, 2017.

[6] G. Solak and L. Jamone. Learning by demonstration and robust control of dexterous in-hand robotic manipulation skills. In IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2019.

[7] T. Wimbock, C. Ott, A. Albu-Schaffer, and G. Hirzinger. Comparison of object-level grasp controllers for dynamic dexterous manipulation. The International Journal of Robotics Research, 31(1):3–23, 2012.