
TOWARDS A COMPREHENSIVE GRASPING SYSTEM FOR ARMAR-III

Nicolas Gorges, Heinz Wörn

Institute for Process Control and Robotics, University of Karlsruhe (TH)

Karlsruhe, Germany

Alexander Bierbaum, Rüdiger Dillmann

Institute of Computer Science and Engineering, University of Karlsruhe (TH)

Karlsruhe, Germany

ABSTRACT

In this paper, we present our approach of merging two existing grasping systems into one system that comprises vision and touch. The first system is a low-level control system which deals with coordination issues and is closely linked to the hand, the arm and the tactile sensor system. This system is responsible for the grasp execution. The second one starts from the perception level and deals with object detection and grasp planning. The missing link is a path planner which transfers a chosen grasp plan to the executive control system. The proposed grasping system will be evaluated in a test scenario and be integrated into the new demonstrator robot of our collaborative research center, ARMAR-III.

1. INTRODUCTION

The special research area “Humanoid Robots” (formal name SFB 588) aims at the development of a robot system which interacts and cooperates with human beings. The ability to grasp and manipulate common objects is essential for supporting the human with everyday tasks. The improved dexterity of the new SFB 588 demonstrator ARMAR-III allows the realization of new evaluation scenarios but also implies the need for the integration of a new comprehensive grasping system. The humanoid robot ARMAR-III is described in detail in [1].

The grasping procedure requires different levels of planning, control and coordination. In order to grasp and manipulate objects, the robot needs, for example, to drive its actuators to the intended position. To get there it needs an adequate sensor system and processing of the sensor data. The sensors, in turn, need to be embedded into the control system.

Grasping is a major issue in robotics. In this context, visual servoing and visually guided grasping are essential research topics. In recent years, grasping with tactile feedback has also become a popular research topic again. Still, most grasping systems focus on one sensor modality and do not combine several modalities into one comprehensive system.

We will present an approach to merge the concepts of two already existing grasping systems to fulfill the given requirements and to establish a basis for ARMAR-III. On the one hand, we have a bottom-up grasping system developed at the IPR (Institute for Process Control and Robotics) which focuses on the control level and is closely connected to the arm, the hand and the tactile sensor system. It performs the grasp execution, which includes coordination and synchronization processes such as hand-arm coordination.

On the other hand, we have a top-down grasping system developed at the CSE/IAIM (Computer Science and Engineering/Industrial Applications of Computer Science and Micro Systems) which starts from the visual perception level. The perception incorporates an object detection component which feeds a grasp planning system based on 3D object models. This system is described in detail in [2]. The missing link is a path planning system which connects both systems: it extends the high-level grasping system and forms an interface to the low-level coordination framework.

Regarding grasping systems, it is important to mention the approach of Kragic et al. [3], which also comprises a vision system, a grasp planner and a robot hand as executive unit. That system requires intervention by a human operator to plan the grasp for an identified and localized object. Further, its vision-based object recognition is limited to planar-faced objects. The system presented here comprises an integrated grasp planning module which is directly linked to the object detector, and a vision system that can identify and localize arbitrarily complex objects.

2. SYSTEM

At first, we show the mechanical design of the arm and the hand, which is essential for dexterity: it determines the fundamental skills of the robot. The design of the hand determines what kind of objects can be grasped and what kind of manipulations can be performed. Subsequently, we describe the proposed grasping system, which will be tested on the evaluation platform and then be ported to the central demonstrator robot of our collaborative research center shown in Fig. 3.


Fig. 1. The current robotic evaluation platform.

2.1. Evaluation platform

The current evaluation platform consists of a 7 DOF arm of amtec PowerCubes, a pneumatic hand, and a pan-tilt unit carrying a stereo camera. The hand has been developed by our project partners at the IAI in the Karlsruhe Research Center FZK (reference?). The arm and the hand are equipped with tactile sensors on the upper arm, elbow, forearm, hand shafts and finger tips. A 6 DOF force-torque sensor at the wrist is used to measure 3 DOF forces and 3 DOF torques. The robot is pictured in Fig. 1, and a close-up of the hand equipped with tactile sensors is shown in Fig. 2.

2.2. Grasp strategies

The underlying grasping approach is based on the idea that certain objects are linked with a corresponding grasp. The grasps to be performed in most of the scenarios can be restricted to a base of relatively simple geometric shapes. An offline grasp planner can compute an adequate grasp for each of these basis shapes and store it in a grasp library. For performing a grasp online, the object to be grasped is approximated by a set of basis shapes. Then, an adequate grasp skill is selected from the database and transferred to the current shape, as sketched in the example below.
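As a minimal sketch of this idea (the shape names, grasp fields and lookup logic are illustrative assumptions, not the actual layout of our grasp library), an offline-computed library can be keyed by shape primitive and queried online:

from dataclasses import dataclass

@dataclass
class GraspPrototype:
    """Offline-planned grasp for one basis shape (illustrative fields)."""
    shape: str              # e.g. "cylinder", "box", "sphere"
    grasp_type: str         # e.g. "cylindrical", "pinch"
    approach_vector: tuple  # approach direction in the shape frame
    preshape: dict          # joint name -> pre-grasp angle

# Grasp library: filled offline by the grasp planner (values are placeholders).
GRASP_LIBRARY = {
    "cylinder": GraspPrototype("cylinder", "cylindrical", (0.0, 0.0, -1.0),
                               {"thumb": 0.2, "fingers": 0.3}),
    "box": GraspPrototype("box", "pinch", (0.0, -1.0, 0.0),
                          {"thumb": 0.1, "fingers": 0.1}),
}

def select_grasp(object_shapes):
    """Online step: pick library grasps for the basis shapes an object was
    approximated with; shapes without a stored grasp are skipped."""
    return [GRASP_LIBRARY[s] for s in object_shapes if s in GRASP_LIBRARY]

# Example: a cup approximated by a cylinder yields the cylindrical prototype.
print(select_grasp(["cylinder"]))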

The execution of a grasp is decomposed into four phases. Each of these phases is characterized by a different hand configuration, arm trajectory, arm movement velocity, and sensitivity of the hand's tactile sensors.

1. Coarse approach: The vision system detects the object and gives a position estimate. The arm moves with comparatively high velocity and the hand moves to the pre-grasp configuration.

Fig. 2. Hand prototype with tactile sensors

2. Fine approach: The arm moves with slower, fine movements and the hand waits for contact.

3. Grasp object: The arm position is adjusted with small movements while the fingers move towards the object. Contact forces are increased to the desired values while the grasp configuration is checked.

4. Depart: Slow, fine movements are performed with a fixed Cartesian orientation of the hand.

These four phases follow from the concept of synchronization and coordination of arm-hand movements (cf. Section 2.3.4), which requires a discrete representation. Each phase is represented by a different hand configuration, arm trajectory and arm movement velocity, and each of them reacts differently to the input of the sensor system. As soon as the hand or the arm reaches a certain state, the next step of the synchronization process is triggered. This behavior can be adapted according to the current grasp task. A minimal sketch of such a phase sequence is given below.
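As a rough illustration (phase names follow the list above; the trigger conditions and interfaces are simplified assumptions, not the actual controller API), the phase sequencing can be modeled as a small state machine that advances whenever both hand and arm report that their target state for the current phase has been reached:

from enum import Enum, auto

class Phase(Enum):
    COARSE_APPROACH = auto()
    FINE_APPROACH = auto()
    GRASP_OBJECT = auto()
    DEPART = auto()
    DONE = auto()

PHASE_ORDER = [Phase.COARSE_APPROACH, Phase.FINE_APPROACH,
               Phase.GRASP_OBJECT, Phase.DEPART, Phase.DONE]

def next_phase(current, arm_reached_sync_point, hand_reached_target):
    """Advance to the next phase only when both arm and hand have reached
    the state required by the current phase."""
    if current is Phase.DONE:
        return current
    if arm_reached_sync_point and hand_reached_target:
        return PHASE_ORDER[PHASE_ORDER.index(current) + 1]
    return current

# Example: during the fine approach the hand is still waiting for contact,
# so the phase does not change yet.
phase = Phase.FINE_APPROACH
phase = next_phase(phase, arm_reached_sync_point=True, hand_reached_target=False)
print(phase)  # Phase.FINE_APPROACH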

2.3. High-level grasp control

The control architecture describes the basic building blocks that make up the robot grasping system and the relations between them. It is depicted in Fig. 4. The high-level robot control represents the interface to a superior control structure. This can be a task planner of the robot or an interface to the human. In any case, it determines the grasping task according to the current situation, which involves the command to grasp a specific object and the selection of a grasp type. The object detector identifies objects and provides the high-level robot control with information about the environment. The modules involved in this grasping system are described in the following sections.


Fig. 3. The demonstrator robot of our collaborative research center.

2.3.1. Object detector

The vision system has to perform object recognition and localization, as the grasp planner must be provided with information about the identity and location of the object that is to be grasped. To achieve this, the system has to cope with several problems. As it is intended for use on a mobile robot, it must operate properly on a moving platform; thus object segmentation cannot be performed by simple background subtraction, and the recognition must be invariant to different perspectives. As the objects to be recognized might be scattered arbitrarily in the scene, recognition must also be invariant to 3D rotation and translation. The objects must be fully localized in 6D space, i.e. including their orientation. All of these tasks should be accomplished in real time, which means that computing the information described above should take on the order of one frame period.

We have combined visual recognition and localization systems for two classes of objects: objects that can be segmented by a single color and objects that can be recognized by their texture features. The focus of the vision system design was laid on appropriate performance for grasp planning and grasp execution by a humanoid mobile robot. For the unicolor, global appearance-based object recognition we deploy the algorithm from [4], which is suitable for uniformly colored objects. This approach uses a non-adaptive color model, which is sufficient for constant lighting conditions as in our test environment. During a learning phase, the dataset for different views of an object is generated automatically using a 3D model. For that reason this method can be regarded as a combination of an appearance-based and a model-based visual recognition system.

Fig. 4. High-level grasp control

Fig. 5. Object segmentation by color

These datasets are provided with orientation information from the generated model views. For recognition, candidate regions are segmented from the camera image and matched with the formerly acquired datasets. The matching process is real-time capable: with a database of five objects, analyzing one potential region takes approximately 5 ms. An example is shown in Fig. 5. Localization of an object via stereo triangulation of the centroids of the matched regions in the left and right image is supported; the location of a matched object can be determined with an accuracy of ±5 mm within the robot demonstrator's workspace. The example database used for unicolored objects includes different types of dishes such as cups and plates.
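As a small illustration of the localization step (a textbook rectified pinhole stereo model with made-up focal length and baseline, not the calibration actually used on the robot), the 3D position of a matched region can be estimated from the disparity of its centroids in the left and right image:

def triangulate_centroid(u_left, u_right, v, fx, fy, cx, cy, baseline):
    """Triangulate a point from rectified stereo centroids (pinhole model).
    u_left/u_right: centroid x-coordinates in left/right image (pixels),
    v: centroid y-coordinate (pixels), fx/fy/cx/cy: camera intrinsics,
    baseline: camera separation in meters. Returns (X, Y, Z) in meters."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: mismatched or distant point")
    Z = fx * baseline / disparity
    X = (u_left - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return X, Y, Z

# Example with assumed intrinsics: a 20-pixel disparity at a 6 cm baseline
# places the object roughly 1.5 m in front of the camera.
print(triangulate_centroid(340.0, 320.0, 250.0,
                           fx=500.0, fy=500.0, cx=320.0, cy=240.0,
                           baseline=0.06))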

For recognition of objects with textured surface appearance, a different approach is used [5] which is capable of identifying and discriminating textured objects such as printed Tetra Pak boxes. The method is appearance-based, as it only deploys trained images but no models of the objects. For feature extraction, patches from the intensity image with extreme gradient appearance or extreme gradient curvature are considered, as shown in Fig. 6. In the learning phase, these patches are stored as originals and as warped transformations, which creates translation- and orientation-invariant datasets. To compensate for illumination conditions the patches are normalized.


Fig. 6. Object segmentation by texture

The database representations are clustered to increase computational efficiency and to provide more accurate results through the tree search in the recognition phase. The algorithm is tolerant to occlusion if enough features can be extracted from the image. However, the algorithm naturally cannot compensate for reflections on the surface of objects. The recognition takes about 350 ms for a 20-object database on a Pentium 4 PC with 3 GHz. In the near future the system will be extended to provide orientation and localization information, which is implicitly already available as a parameter of the correspondence search.

2.3.2. Grasp planner

The grasp planner module can access a global database with CAD models of objects. The current database includes cups, plates, bottles and primitive bodies such as cubes. The grasp planning process is currently executed offline using the program GraspIt! [6]. GraspIt! is a robotic grasping simulator which uses geometric models of the robot hand and the objects in a virtual workspace to determine the feasibility and quality of a grasp. It provides collision detection for rigid bodies, considers material friction in the contact model and calculates a quality measure for contacts between hand and object. New robot hands and object models can be added to the simulator. As described in [7], the program was extended to support the SFB 588 humanoid robot hand for planning. This robot hand has eight joints which are partially coupled, leading to four degrees of freedom for actuation. Despite this restriction the planner takes into account all eight degrees of freedom, as a hand with full control over all joints will soon be available and simulation of the coupled joints is complex. With this model we can evaluate three power grasp types (hook, cylindrical, spherical) and two precision grasp types (pinch, tripod) in the simulator, which represent a subset of the grasp taxonomy of Cutkosky [8].

The output of the planner module is a set of parametrized grasps comprising the following information (a data structure sketch is given after the list):

• Grasp starting point (GSP)

• Grasp approaching vector

• Hand orientation, given as a rotation around the axis of the approaching vector. This is specific to the robot hand and grasp type.

• Preshape posture and joint closing velocities. These are specific to the grasp type.
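As a minimal sketch (the field names and types are our own illustrative choices, not the planner's actual output format), such a parametrized grasp could be represented as follows:

from dataclasses import dataclass
from typing import Tuple, Dict

@dataclass
class ParametrizedGrasp:
    """One planner output entry; starting conditions only (see text)."""
    grasp_starting_point: Tuple[float, float, float]  # GSP in the object frame
    approach_vector: Tuple[float, float, float]       # direction towards object
    hand_roll: float                      # rotation (rad) around the approach
                                          # axis, specific to hand and grasp type
    preshape: Dict[str, float]            # joint name -> pre-grasp angle (rad)
    closing_velocities: Dict[str, float]  # joint name -> closing speed (rad/s)
    quality: float                        # simulator quality measure

# Example entry for a hypothetical cylindrical grasp of a cup.
grasp = ParametrizedGrasp((0.0, 0.05, 0.12), (0.0, 0.0, -1.0), 1.57,
                          {"thumb": 0.2, "fingers": 0.3},
                          {"thumb": 0.5, "fingers": 0.5}, 0.42)
print(grasp.quality)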

The description created by the planner only covers starting conditions, not final conditions, which makes it suitable for robot hands without tactile sensors and force control. Consequently, the planner does not cover reactive or dynamic grasping scenarios. The dataset is generated automatically with primitive models of objects following [9]: for a given object the planner generates suitable GSPs and approaching vectors and starts testing the available grasp types for the robot hand. For power grasps it simulates the hand approaching the object with full aperture until contact is detected and then closes it. If the quality of the contact is below a certain threshold, the planner repeats the test with the hand closing at an increased distance until the quality measure indicates maximum stability for the grasp. For precision grasps the test sequence is different: starting from the GSP, the hand is closed from the preshape posture and tested for contact at regular distance intervals along the approaching vector. All grasps for which a quality measure above a certain threshold is calculated are stored in the grasp dataset of the particular object. The sketch below illustrates the back-off loop for power grasps.
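This is only an illustration of the back-off idea described above: simulate_grasp is a hypothetical wrapper around the simulator (approach with full aperture, detect contact, close, score), not a GraspIt! API call, and the step sizes and threshold are made-up values.

def plan_power_grasp(gsp, approach_vector, simulate_grasp,
                     step=0.01, max_backoff=0.10, threshold=0.3):
    """Back the hand off along the approach vector until the simulated grasp
    quality stops improving; return the best grasp found, or None."""
    best = None
    distance = 0.0
    while distance <= max_backoff:
        quality = simulate_grasp(gsp, approach_vector, distance)
        if best is None or quality > best[1]:
            best = (distance, quality)
        elif best[1] >= threshold:
            # Quality started to decrease after passing the threshold:
            # keep the previous distance as the most stable grasp.
            break
        distance += step
    if best and best[1] >= threshold:
        return {"gsp": gsp, "approach": approach_vector,
                "closing_distance": best[0], "quality": best[1]}
    return None

# Example with a toy quality function that peaks at a 3 cm back-off.
toy = lambda gsp, a, d: max(0.0, 1.0 - abs(d - 0.03) * 20)
print(plan_power_grasp((0.0, 0.0, 0.1), (0.0, 0.0, -1.0), toy))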

2.3.3. Path planner

The adapted path planner [10] is based on a static scene graph of the environment and takes the shapes and kinematics of the robot hand and arm into account. The framework has to be enhanced to allow a dynamic scene in which the human and the robot interact with the environment. After a firm grasp of an object is established, the path planner must also consider that the object is now part of the kinematic chain.

For synchronization of hand and arm movements, the execution of a grasp is divided into four phases (cf. Section 2.2). For each of these phases, the path planner specifies a 6 DOF synchronization point for the hand root at which the next step of the synchronization process is triggered. The first synchronization point is given by the starting position. The second one must be chosen within a safety distance to the object and to the environment, as the arm is moving with higher velocity. The third one is given by the pre-grasp position computed by the grasp planner, and the fourth point is given by the depart position.

As the grasp planner does not fully consider the kinematics of the arm, a desired grasp position might not be reachable. Therefore, the path planner gets a set of potential grasps and processes these grasps until a valid configuration is planned. One iteration of this process loop is shown in the following (a code sketch follows the list):


Fig. 7. Overview of the low-level grasp control and its interface to the upper level (high-level grasp control, high-level and local arm and hand control, coordination objects, and tactile sensors).

1. Take a grasp candidate. The grasp candidate contains the pre-grasp finger positions and the grasp configuration.

2. Compute a collision-free path for the arm from the starting position to the pre-grasp position. Compute a collision-free path for the fingers from the pre-grasp position to the grasp position.

3. Choose the first three synchronization points for the hand-arm coordination process.

4. Compute a collision-free path from the grasp position to the desired depart position.
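A rough sketch of this loop is given below, under the assumption that the path planning framework is exposed through three callbacks; plan_arm_path, plan_finger_path and choose_sync_points are hypothetical names, and the grasp candidate attributes are illustrative rather than the actual data layout.

def plan_grasp_execution(grasp_candidates, start_pose, depart_pose,
                         plan_arm_path, plan_finger_path, choose_sync_points):
    """Try grasp candidates until one yields a fully planned, collision-free
    execution; return the plan or None. Each planner callback is assumed to
    return None when no collision-free solution exists."""
    for candidate in grasp_candidates:
        # 1. Candidate carries pre-grasp finger positions and grasp configuration.
        arm_path = plan_arm_path(start_pose, candidate.pre_grasp_pose)
        finger_path = plan_finger_path(candidate.pre_grasp_fingers,
                                       candidate.grasp_fingers)
        if arm_path is None or finger_path is None:
            continue  # 2. planning failed: try the next candidate
        # 3. First three synchronization points for hand-arm coordination.
        sync_points = choose_sync_points(arm_path, candidate)
        # 4. Collision-free depart path from the grasp position.
        depart_path = plan_arm_path(candidate.grasp_pose, depart_pose)
        if depart_path is None:
            continue
        return {"arm_path": arm_path, "finger_path": finger_path,
                "sync_points": sync_points, "depart_path": depart_path}
    return None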

2.3.4. Grasp execution

The grasp execution is responsible for setting the parameters of the low-level components and for observing their correct execution. Figure 7 illustrates the control framework as proposed in [11], [12] and shows the different components which are involved in the low-level control of such a grasping process.

At the top is the high-level robot control, which sets the parameters of the low-level components and supervises the whole grasping procedure. On the next lower level are the high-level controllers. These are able to communicate with their neighboring components via the coordination object in order to synchronize and coordinate their activities. This communication is independent of the high-level robot control and therefore increases the efficiency of the control system. Grasping needs a coordination and synchronization of hand and arm movements: the arm places the hand in the right grasping position before the grasp, and after performing the grasp the hand has to depart. The coordination object thereby determines the behavior of the hand-arm coordination.

For parametrization, the high-level arm control component gets the trajectory of the arm movement and the synchronization points for hand-arm coordination. The high-level hand control needs the pre-grasp and grasp finger configurations as well as the finger trajectories. A minimal sketch of this parametrization step is given below.
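The following is only a minimal sketch of this hand-over; the controller classes and their parametrize methods are illustrative assumptions, not the interfaces of the movement-object framework of [11], [12], and the numeric values are placeholders.

class ArmController:
    """High-level arm control: parametrized with the arm trajectory and the
    synchronization points for hand-arm coordination."""
    def parametrize(self, trajectory, sync_points):
        self.trajectory = trajectory
        self.sync_points = sync_points

class HandController:
    """High-level hand control: parametrized with pre-grasp and grasp finger
    configurations and the finger trajectories."""
    def parametrize(self, pre_grasp_fingers, grasp_fingers, finger_trajectories):
        self.pre_grasp_fingers = pre_grasp_fingers
        self.grasp_fingers = grasp_fingers
        self.finger_trajectories = finger_trajectories

# Example parametrization with placeholder values.
arm, hand = ArmController(), HandController()
arm.parametrize(trajectory=[(0.3, 0.0, 0.5), (0.4, 0.1, 0.4)],
                sync_points=[(0.3, 0.0, 0.5), (0.38, 0.08, 0.42)])
hand.parametrize(pre_grasp_fingers={"thumb": 0.1, "fingers": 0.2},
                 grasp_fingers={"thumb": 0.6, "fingers": 0.8},
                 finger_trajectories={"thumb": [0.1, 0.3, 0.6],
                                      "fingers": [0.2, 0.5, 0.8]})
print(arm.sync_points, hand.grasp_fingers)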

3. EVALUATION

Finally, the grasping system has to cope with certain benchmark scenarios. The benchmarks are currently restricted by the design and the integration level of the mechanical components of the robot. Therefore, we only consider the use of one hand at the moment and focus on the scenarios listed in the following:

1. Grasping a known object from a known position

2. Giving objects to and taking objects from a human

3. Grasping known objects from unknown positions

Results will be included in the final paper.

4. CONCLUSION

To enable the robot to perform dextrous fine manipulation in general, an adequate mechanical and control system is needed. We have shown that basic skills can already be performed with the first version of our hand and with the proposed grasping framework. The robot hand is capable of grasping a certain range of objects using vision and touch. The new hand to be built will have significantly increased functionality and will allow tasks such as fine manipulation and tactile exploration.

Currently, the grasp planner requires geometric models of the objects to be investigated for grasp planning. As these object models need to be created a priori, the planning process is executed offline. There is no module which creates a geometric model from the perception of the vision system. The vision system recognizes an object and makes information about its location, orientation and identity available to the grasp selector. For the future, we plan to implement a more versatile but possibly less exact recognition and estimation of geometric components, which could be fed directly to the grasp planner in an online grasp planning scenario. The capability of estimating unknown geometries in 3D space by vision is a key feature for our planned approach to online grasping of a priori unknown objects. Finally, future work will also include visual servoing in an extended coordination framework to increase the accuracy of the hand positioning.

5. REFERENCES

[1] T. Asfour, K. Regenstein, P. Azad, J. Schroeder, A. Bierbaum, N. Vahrenkamp, and R. Dillmann, “ARMAR-III: An


integrated humanoid platform for sensory-motor control,” in Submitted to: IEEE-RAS International Conference on Humanoid Robots, Genova, Italy, 2006.

[2] A. Morales, T. Asfour, P. Azad, S. Knoop, and R. Dillmann, “Integrated grasp planning and visual object localization for a humanoid robot with five-fingered hands,” in To appear in: International Conference on Intelligent Robots and Systems (IROS), Beijing, China, 2006.

[3] D. Kragic, A. T. Miller, and P. K. Allen, “Real-time tracking meets online grasp planning,” in IEEE International Conference on Robotics and Automation, Seoul, Republic of Korea, 2001.

[4] P. Azad, T. Asfour, and R. Dillmann, “Combining appearance-based and model-based methods for real-time object recognition and 6D localization,” in To appear in: Proc. International Conference on Intelligent Robots and Systems (IROS), Beijing, China, 2006.

[5] K. Welke, P. Azad, and R. Dillmann, “Fast and robust feature-based recognition of multiple objects,” in Submitted to: IEEE-RAS International Conference on Humanoid Robots, Genova, Italy, 2006.

[6] A. T. Miller and P. K. Allen, “A versatile simulator for robotic grasping,” in IEEE Robotics & Automation Magazine, 2004, vol. 11.

[7] A. Morales, P. Azad, and A. Kargov, “An anthropomorphic grasping approach for an assistant humanoid robot,” in Proceedings of the 37th International Symposium on Robotics, Munich, Germany, 2006.

[8] M. Cutkosky, “On grasp choice, grasp models and the design of hands for manufacturing tasks,” in IEEE Transactions on Robotics and Automation, 1989, vol. 5.

[9] A. T. Miller and S. Knoop, “Automatic grasp planning using shape primitives,” in Proceedings of the IEEE International Conference on Robotics and Automation, 2003.

[10] D. Mages, B. Hein, and H. Wörn, “Introduction of additional information to heuristic path planning algorithms,” in IEEE Conference on Mechatronics and Robotics, P. Drews, Ed., Aachen, Germany, 2004, pp. 1023–1028.

[11] D. Osswald, S. Yigit, O. Kerpa, C. Burghart, and H. Wörn, “Arm-Hand-Koordination eines anthropomorphen Roboterarms zur Mensch-Roboter Kooperation” (arm-hand coordination of an anthropomorphic robot arm for human-robot cooperation), in 18. Fachgespräch “Autonome Mobile Systeme” (AMS), Karlsruhe, Germany, 2003.

[12] D. Osswald, S. Yigit, C. Burghart, and H. Wörn, “Programming of humanoid robots with movement-objects for interactive tasks,” in Proceedings of the Int. Conf. on Computer, Communication and Control Technologies (CCCT), Orlando, FL, 2003.