GRASPA 1.0: GRASPA is a Robot Arm graSping Performance benchmArk

The use of benchmarks is a widespread and scientifically meaningful practice to validate the performance of different approaches to the same task. In the context of robot grasping, the use of common object sets has emerged in recent years; however, no dominant protocols and metrics for testing grasping pipelines have taken root yet. In this paper, we present version 1.0 of GRASPA, a benchmark to test the effectiveness of grasping pipelines on physical robot setups. It tackles the complexity of such pipelines by proposing different metrics that account for the features and limits of the test platform. As an example application, we deploy GRASPA on the iCub humanoid robot and use it to benchmark our grasping pipeline. Finally, we discuss how the obtained GRASPA indicators provide insight into how the different steps of the pipeline affect the overall grasping performance.


I Introduction

In recent years, many robotic grasping pipelines have been proposed in the literature, featuring substantial differences in hypotheses, methodology and experimental evaluation, in particular with respect to the objects and robotic platforms used [bohg_data-driven_2014]. Given such variability, reproducible test conditions, a standardized set of objects, a benchmarking protocol and a suite of metrics are fundamental to make fair performance comparisons. Although a subset of the manipulation research community has already converged on a standard set of objects (i.e. the YCB object and model set [calli_ycb_2015]), a widespread protocol and a system of metrics for properly comparing different pipelines are still missing.

Validating candidate grasps in simulation alone with force closure quality measures [roa2015grasp] has proven unreliable [kim2013physically]. This limitation, together with the lack of a dominant metric, led to the common practice of empirically testing grasping pipelines with a simple success rate over a given number of trials and objects [levine2018learning, mahler2017dex]. However, this kind of binary metric is somewhat limited, since it has no means of decoupling the limitations of the algorithm itself from those of the test platform.

Fig. 1: An example of the benchmark setting deployed on the iCub humanoid robot.

In this paper we propose GRASPA 1.0 (GRASPA is a Robot Arm graSping Performance benchmArk), a benchmarking protocol and a set of metrics to evaluate the performance of grasping pipelines. It aims to fairly compare methodologies tested on different robots by measuring and accounting for platform limitations that might hinder the overall performance. The proposed benchmark features:

  • Printable layouts of predefined grasping scenarios (populated with YCB object subsets) equipped with localization markers to enhance test reproducibility.

  • A protocol to assess the robot reachability and the calibration of the vision system within the defined grasping setup area.

  • A widely-used grasp quality metric to evaluate candidate grasping poses before their physical execution.

  • A score to assess grasp stability during the physical execution on the robot.

  • The possibility to benchmark the pipeline either in isolation or in clutter, with a further metric to evaluate obstacle avoidance in the latter case.

  • A composite score to quantify the overall performance of the pipeline.

We published the code for computing the benchmark scores, together with instructions on how to collect the required data, on GitHub (https://github.com/robotology/GRASPA-benchmark). Additionally, we made available a Docker container to ease installation and a cloud-hosted environment to test the code without requiring any installation.

We employed GRASPA to assess the performance of the grasping pipeline proposed in [nguyen2018merging] using the iCub humanoid robot. The code we used to collect the data on the iCub is also available online (https://github.com/robotology-playground/GRASPA-test) and can be used as an example procedure for collecting the required data.

The paper is organized as follows. Section II reviews relevant work concerning benchmarks available for grasping applications, including object sets and metrics. In Section III we outline the proposed benchmark, and in Section IV we explain how its scores are reported and combined. In Section V, we provide an example by using GRASPA to benchmark a grasping pipeline on the iCub robot. Section VI concludes the paper with some closing remarks and perspectives for future extensions of the benchmark. As part of this work, we attach to the submission a benchmark and a protocol document compiled according to the YCB benchmark templates (http://www.ycbbenchmarks.com/protocols-and-benchmarks).

II Related Work

In recent years, the success of data-driven methods has brought new ideas and advancements to the field of robotic manipulation [bohg_data-driven_2014, mahler2017dex, kopickioneshot2016, levine2018learning, ten2017grasp], at the same time pushing the community towards testing applications on common sets of both real objects [matheus_benchmarking_2010, kasper2012kit, calli_ycb_2015] and meshes [singh_bigbird_2014, ChangFGHHLSSSSX15], and towards developing benchmarking protocols. Despite the complexity of grasping pipelines and the variability in test setup design, however, most of the available benchmarks meant to be deployed on real robots are based on simple success/failure binary evaluation metrics.

Challenges such as the Amazon Picking Challenge [bell2015apc] and RoboCup@Home [stuckler_robocuphome_2012] proved to be quite effective in benchmarking entire autonomous pipelines by defining strict rules and tasks. However, in these contexts the tasks themselves are often difficult to reproduce and the number of accepted teams is typically small.

The VisGraB benchmark [kootstra_visgrab_2012] presents a toolbox to evaluate vision-based grasp planners in simulation. VisGraB provides real stereo images of objects in various conditions and a software environment to analyze the quality of user-planned grasps in a simulated environment. However, it does not account for any real execution of the task, nor for the type and performance of the manipulator and end-effector.

The ACRV benchmark [leitner_acrv] and the one published by Triantafyllou et al. [grocery_benchmark] tackle the issue of reproducibility by proposing a set of objects and layouts for industrial shelving and pick-and-place applications. Both argue that physical execution of the task is essential in evaluating the performance of pick-and-place pipelines, although their protocols do not account for test platform limitations and their score metrics do not provide insight into the performance of individual pipeline steps.

III Benchmarking Protocol

In this Section we outline the proposed benchmarking protocol, focusing on the design of the grasping layouts and on the metrics used to evaluate the individual pipeline steps.

III-A Benchmark Layouts

GRASPA is designed to evaluate grasping pipelines on an area located in front of the robot with dimensions 594x420 mm (A2 standard paper size), resulting in the setup shown in Fig. 1. GRASPA uses a subset of the YCB object set (see Fig. 2), selected in order to include a range of shapes, dimensions and challenges for the grasping task. We propose 3 scenarios of increasing complexity in terms of number, shape and pose of the included objects (see Fig. 2(a), 2(b), 2(c)). Moreover, GRASPA can evaluate pipelines that work both in isolation (i.e. one object at a time in the layout) and in clutter (i.e. all objects at the same time); in this work, we refer to clutter as a situation where the objects are visually occluded (as long as a top-down view is not used) and where their presence limits the task space of the robot while planning grasps and avoiding collisions. In the latter case, the added challenge is accounted for in the final score.

The 6D object poses are expressed with respect to the layout reference frame shown in Fig. 2(a), 2(b), 2(c). To this end, an ArUco marker board [garrido2014automatic] is embedded in the printable layouts to enable the experimenter (referred to as "the user" from this point onwards) to estimate the layout reference frame pose in a robust way. Users need to express all the information collected during the benchmark procedure with respect to the layout reference frame, so as to be independent from the position of the physical board. Finally, we provide printable layouts of dimensions 594x420 mm (i.e. A2 format) that include markers and object footprints (e.g. Fig. 2(d)).
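Since every quantity collected during the benchmark must be reported in the layout reference frame, the conversion is a simple change of coordinates once the board pose has been estimated. Below is a minimal sketch of this step, assuming 4x4 homogeneous transforms stored as numpy arrays; the function name and the placeholder transforms are illustrative and not part of the GRASPA code.

```python
import numpy as np

def express_in_layout_frame(T_robot_board, T_robot_pose):
    """Re-express a pose, given in the robot root frame, in the layout (board) frame.

    T_robot_board : 4x4 pose of the marker board w.r.t. the robot root frame
    T_robot_pose  : 4x4 pose of interest (e.g. a reached end-effector pose) w.r.t. the robot
    """
    # T_board_pose = inv(T_robot_board) @ T_robot_pose
    return np.linalg.inv(T_robot_board) @ T_robot_pose

# Placeholder example: both transforms would come from marker detection
# and forward kinematics, respectively.
T_robot_board = np.eye(4)
T_robot_ee = np.eye(4)
T_board_ee = express_in_layout_frame(T_robot_board, T_robot_ee)
```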

(a) Benchmark Layout 0
(b) Benchmark Layout 1
(c) Benchmark Layout 2
(d) Printable Layout 0
Fig. 2: From (a) to (c): the 3D renders of the three layouts defined within the benchmark. (d) shows one of the provided printable boards that allow for reproducible object placement on a physical setup.

III-B Reachability within the Layout

Depending on the testing platform, the robot arm size, mechanical structure or joint range limits may impair the capability of the end-effector to reach some layout regions with accuracy. Therefore, grasps in these regions might fail regardless of the performance of the planner. To avoid penalizing planners for the limits of the test platform, an index of reachability over the layout area must be included in the benchmark. In GRASPA, we adopt an empirical approximation of such a measure by dividing the layout area into 6 regions, each with a reachability score $s^{R}_{j}$ for $j = 1, \dots, 6$ (Fig. 3). The reachability score of each region is computed over a set of poses uniformly distributed over the layout area with different orientations (Fig. 4). The user makes the robot reach (or attempt to reach) these pre-defined poses and then acquires the poses actually reached by querying the forward kinematics. Poses placed on the boundary of contiguous regions are considered to belong to both regions.

Fig. 3: Regions used to determine the robot reachability and the calibration of the vision system within the layout.
(a) Set no. 0
(b) Set no. 1
(c) Set no. 2
Fig. 4: Poses defined for evaluating the robot reachability within the layout. Set of poses no. 1 (Fig. 4(b)) is also used for testing the calibration of the vision system.

The score for the $j$-th region is given by:

$$ s^{R}_{j} = \frac{n^{reached}_{j}}{n^{tot}_{j}} \qquad (1) $$

where $n^{reached}_{j}$ is the number of poses in region $j$ actually reached by the robot with a given accuracy and $n^{tot}_{j}$ is the number of poses belonging to region $j$.

A pose $i$ is considered to be reached if the position and orientation errors $e_{p,i}$, $e_{o,i}$ are smaller than the thresholds $th^{pos}_{reach}$, $th^{or}_{reach}$ defined by the user. In Section IV-B we elaborate more on such thresholds. The errors are computed as follows:

$$ e_{p,i} = \lVert p^{des}_{i} - p^{reached}_{i} \rVert \qquad (2) $$
$$ e_{o,i} = \lvert \theta_{i} \rvert \qquad (3) $$

where $\theta_{i}$ is the angle of the equivalent axis-angle representation of the matrix:

$$ R^{err}_{i} = R^{des}_{i} \left( R^{reached}_{i} \right)^{T} \qquad (4) $$

with $R^{des}_{i}$ and $R^{reached}_{i}$ respectively the desired and reached orientation matrices relative to pose $i$ [sciavicco2012modelling].

For each benchmark layout $l$, we associate to each object $i$ (with $i = 1, \dots, n_l$, where $n_l$ is the number of objects included in layout $l$) the reachability score of the region where the object is located. For simplicity, an object belongs to the region its center of mass falls into. Thus, for each object $i$ in each layout $l$ we obtain the reachability score $S_{0}^{i,l}$:

$$ S_{0}^{i,l} = s^{R}_{j^{*}(i,l)} \qquad (5) $$

where $j^{*}(i,l)$ is the region containing the center of mass of object $i$ in layout $l$.
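As an illustration of the whole reachability evaluation, the following sketch computes the pose errors of Eq. (2)-(4) and the per-region score of Eq. (1); the pose representation and the threshold values are illustrative assumptions, not values prescribed by the benchmark.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def pose_errors(p_des, R_des, p_reached, R_reached):
    """Position error (Eq. 2) and axis-angle orientation error (Eq. 3-4)."""
    e_p = np.linalg.norm(np.asarray(p_des) - np.asarray(p_reached))
    R_err = np.asarray(R_des) @ np.asarray(R_reached).T
    e_o = np.linalg.norm(R.from_matrix(R_err).as_rotvec())  # |theta| of the axis-angle form
    return e_p, e_o

def region_reachability(poses, th_pos=0.02, th_or=0.5):
    """Reachability score of one region (Eq. 1).

    poses : list of tuples (p_des, R_des, p_reached, R_reached) belonging to the region
    th_pos, th_or : user-defined thresholds (example values, in meters and radians)
    """
    reached = 0
    for p_des, R_des, p_reached, R_reached in poses:
        e_p, e_o = pose_errors(p_des, R_des, p_reached, R_reached)
        if e_p < th_pos and e_o < th_or:
            reached += 1
    return reached / len(poses)
```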

III-C Camera Calibration within the Layout

Drawing a parallel to the reachability problem, GRASPA aims to assess the precision of the manipulator when reaching for poses acquired by the visual system in the camera reference frame. Hence, our benchmark defines a camera calibration score $s^{C}_{j}$ for each $j$-th region introduced in Section III-B. In order to evaluate the scores $s^{C}_{j}$, the robot is asked to reach a subset of the poses defined for the reachability evaluation (Fig. 4(b)). The 6D pose reached by the end-effector should be acquired through the vision system (e.g. by affixing a marker to the end-effector, in a known position and orientation with respect to the kinematic chain).

The score is then computed as:

$$ s^{C}_{j} = \frac{n^{reached,vis}_{j}}{n^{tot}_{j}} \qquad (6) $$

where $n^{reached,vis}_{j}$ is the number of poses in region $j$ actually reached by the robot with a given accuracy, as measured by the vision system, and $n^{tot}_{j}$ is the number of poses belonging to region $j$.

A pose is considered to be reached if the position and orientation errors $e_{p,i}$, $e_{o,i}$ (computed according to Eq. (2)-(4)) are smaller than the respective thresholds $th^{pos}_{cam}$, $th^{or}_{cam}$ defined by the user. The only difference with respect to the scores $s^{R}_{j}$ is that the poses actually reached by the robot are acquired through the robot vision system and not from the forward kinematics.

Also in this case, for each benchmark layout $l$, we associate to each object the camera-calibration score of the region where the object is located. Thus, for each object $i$ we obtain the camera calibration score $S_{1}^{i,l}$:

$$ S_{1}^{i,l} = s^{C}_{j^{*}(i,l)} \qquad (7) $$

Since GRASPA layouts are defined with respect to the board reference frame, the benchmark protocol can also be applied to grasping pipelines that do not process visual input (provided the user can reliably define a transform between the robot and the board reference frames). In such a case, the benchmark does not take into account the scores $S_{1}^{i,l}$.

III-D Graspability

Different robots might have diverse grasping capabilities due to the arm maximum payload and the end-effector design and size. Grasping pipelines should not be benchmarked on objects the robot cannot grasp or lift because of hardware limitations. GRASPA encodes this information in the graspability score $S_{2}^{i,l}$, defined for each object $i$ in layout $l$ as the product of two binary flags, $g^{w}_{i}$ and $g^{d}_{i}$. $g^{w}_{i}$ is 1 if the weight of the object is compatible with the robot payload and 0 otherwise. $g^{d}_{i}$ is 1 if the end-effector aperture is larger than the smallest dimension of the object and 0 otherwise. For simple objects such as a box, this dimension is the shortest edge of the enclosing 3D bounding box, while for complex objects (e.g. the power drill) it can be the diameter of the grip. Objects can also be declared un-graspable according to other criteria, provided that sufficient motivation is given.
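Since the graspability check reduces to two boolean conditions, it can be sketched in a few lines; the object properties and robot limits below are placeholder values, not figures prescribed by GRASPA.

```python
def graspability_score(object_weight_kg, object_min_dim_m,
                       max_payload_kg, max_aperture_m):
    """Binary graspability score S2: 1 if the robot can both lift and enclose the object."""
    weight_ok = object_weight_kg <= max_payload_kg   # payload check
    size_ok = object_min_dim_m < max_aperture_m      # end-effector aperture check
    return int(weight_ok and size_ok)

# Example with placeholder values (the 0.5 kg payload is the iCub limit mentioned in
# Section V-B3, the aperture is a hypothetical figure):
print(graspability_score(object_weight_kg=0.3, object_min_dim_m=0.06,
                         max_payload_kg=0.5, max_aperture_m=0.09))
```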

III-E Grasp Quality

This index evaluates grasps planned by the pipeline before execution, regardless of reachability. GRASPA uses a metric that relies on computation of the Grasp Wrench Space (GWS) and Object Wrench Space (OWS) [borst_grasp_2004]. This metric, while not being the most robust to uncertainty [kim2013physically], is still widely used in many grasping toolboxes such as Simox, OpenRAVE and GraspIt! [simox, diankov2008openrave, graspit].

The user is required to provide the kinematic structure and the collision mesh model of their end-effector. Grasps have to be parametrized in terms of end effector pose and pregrasp configuration of the joints, making GRASPA compatible with both grippers and multifingered hands. Grasps are tested by first moving the end effector model to the desired pose with the desired pregrasp configuration, and then simulating the finger closure motion (in case of multifingered hands, joints are moved with equal velocity). When contact points are detected (via collisions between the object and end effector meshes), joints attached to the links that have collided are stopped. While this approach is straightforward for power grasps, pipelines that plan the contact locations need to be tested by setting the final hand configuration as a pregrasp.

We assume a hard point contact with friction model, with a fixed friction coefficient. Non-graspable objects (according to Subsection III-D) do not receive any score. The grasp quality for each graspable object $i$ in layout $l$ can be expressed as

$$ S_{3}^{i,l} = \frac{r^{GWS}_{i,l}}{r^{OWS}_{i}} \qquad (8) $$

where $r^{GWS}_{i,l}$ and $r^{OWS}_{i}$ are the radii of the largest spheres contained, respectively:

  • in the GWS defined by the grasp planned for the $i$-th object. $r^{GWS}_{i,l}$ is obtained by perturbing the grasping pose (before closing the fingers) in both position and orientation to ensure robustness, and then averaging the results;

  • in the OWS of the $i$-th object; $r^{OWS}_{i}$ is computed regardless of the grasp.

GRASPA v1.0 uses the implementation of the aforementioned metric included in GraspStudio [simox].
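To make the robustness-averaging step of Eq. (8) concrete, the sketch below averages the GWS/OWS ratio over random perturbations of the planned pose; the gws_ball_radius callable is a hypothetical stand-in for the GraspStudio computation, and the perturbation magnitudes are illustrative.

```python
import numpy as np

def perturbed_quality(grasp_pose, gws_ball_radius, ows_ball_radius,
                      n_samples=20, pos_sigma=0.005, rot_sigma_deg=2.0, seed=0):
    """Average the GWS/OWS largest-ball ratio over perturbations of the grasp pose.

    grasp_pose      : (position, orientation) of the planned end-effector pose
    gws_ball_radius : callable returning the GWS largest-ball radius for a perturbed pose
    ows_ball_radius : OWS largest-ball radius, independent of the grasp
    """
    rng = np.random.default_rng(seed)
    position, orientation = grasp_pose
    qualities = []
    for _ in range(n_samples):
        # Perturb position (meters) and orientation (degrees) before closing the fingers
        dp = rng.normal(0.0, pos_sigma, size=3)
        dtheta = rng.normal(0.0, rot_sigma_deg, size=3)
        qualities.append(gws_ball_radius(position + dp, orientation, dtheta) / ows_ball_radius)
    return float(np.mean(qualities))
```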

III-F Grasp Execution and Stability

GRASPA combines all the previously defined scores with grasp executions on the physical robot. A binary success score is evaluated for each object $i$ in layout $l$ over $N_{trials}$ grasp executions:

$$ S_{4}^{i,l} = \frac{1}{N_{trials}} \sum_{t=1}^{N_{trials}} S_{4}^{i,l,t} \qquad (9) $$

where $S_{4}^{i,l,t}$ is 1 if the object was grasped at trial $t$ and 0 otherwise. The object is considered grasped if it can be lifted by a given height and held without falling for at least five seconds. Contact slip is acceptable as long as it does not ultimately cause the object to fall. The score can be evaluated by executing grasps either in isolation or in the cluttered scene, provided that the same modality is kept for each object and layout.

Finally, the benchmark evaluates the stability of the grasp during the execution of a fixed trajectory. This trajectory simply consists of rotations around the end-effector approach axis and within the vertical plane that such axis passes through. Given the grasping pose, with position $p$ and orientation expressed by the rotation matrix $R$, the trajectory consists of 5 waypoints (Eq. (10)-(15)), obtained by applying to the grasping pose rotations of a fixed angle around the approach axis of the end-effector and a rotation of 30 degrees (towards the table surface) in the vertical plane that contains this axis. The reference duration for each rotation is two seconds. We define the grasp stability score for each object $i$ in layout $l$ over $N_{trials}$ trials as:

$$ S_{5}^{i,l} = \frac{1}{N_{trials}} \sum_{t=1}^{N_{trials}} \frac{w^{i,l,t}}{W} \qquad (16) $$

where $w^{i,l,t}$ is the number of trajectory waypoints reached without dropping the object at trial $t$ and $W$ is the total number of trajectory waypoints. Again, contact slip is acceptable if it does not lead to a fall.
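As an illustration of how such a trajectory can be generated, the sketch below rotates the grasp orientation around the end-effector approach axis and towards the table. The rotation angles are placeholders (GRASPA prescribes the actual values and waypoint sequence), and the approach axis is assumed to be the z-axis of the hand frame and not vertical.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def stability_waypoints(p, R_grasp, axis_angle_deg=45.0, table_angle_deg=30.0):
    """Generate example waypoint orientations for the stability trajectory.

    p              : 3D grasp position (kept fixed here)
    R_grasp        : 3x3 grasp orientation matrix
    axis_angle_deg : placeholder rotation around the approach axis (hand z-axis)
    table_angle_deg: rotation towards the table in the vertical plane containing that axis
    """
    approach = R_grasp[:, 2]                       # assumed approach axis
    horizontal = np.cross(approach, [0.0, 0.0, 1.0])
    horizontal /= np.linalg.norm(horizontal)       # axis of the "towards the table" rotation
    waypoints = []
    for sign in (+1, -1):                          # rotate one way, then the other
        Rw = R.from_rotvec(np.deg2rad(sign * axis_angle_deg) * approach).as_matrix() @ R_grasp
        waypoints.append((p, Rw))
    Rt = R.from_rotvec(np.deg2rad(table_angle_deg) * horizontal).as_matrix() @ R_grasp
    waypoints.append((p, Rt))
    return waypoints
```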

If the pipeline under test allows for it, GRASPA can measure its ability to grasp the target while avoiding the other objects. We define the obstacle avoidance score for each object $i$ in layout $l$ over $N_{trials}$ trials:

$$ S_{6}^{i,l} = \frac{1}{N_{trials}} \sum_{t=1}^{N_{trials}} \left( 1 - \frac{h^{i,l,t}}{n_l - 1} \right) \qquad (17) $$

where $h^{i,l,t}$ is the number of objects hit by the robot while approaching the target object at trial $t$ and $n_l - 1$ is the number of obstacles in the layout. The score is 1 if the robot is able to avoid all the objects and 0 if it collides with every object. If no obstacle avoidance is accounted for, tests must be performed with single objects and $S_6$ is not computed.
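For clarity, the three execution scores can be computed from simple per-trial logs, as in the following sketch; the trial-log structure is an illustrative assumption, not a format defined by GRASPA.

```python
def execution_scores(trials, total_waypoints, n_obstacles):
    """Compute the binary success (S4), stability (S5) and obstacle avoidance (S6) scores.

    trials : list of dicts, one per trial, e.g.
             {"grasped": True, "waypoints_reached": 4, "objects_hit": 0}
    """
    n = len(trials)
    s4 = sum(t["grasped"] for t in trials) / n
    s5 = sum(t["waypoints_reached"] / total_waypoints for t in trials) / n
    s6 = (sum(1.0 - t["objects_hit"] / n_obstacles for t in trials) / n
          if n_obstacles else None)
    return s4, s5, s6

# Example with five hypothetical trials:
log = [{"grasped": True, "waypoints_reached": 5, "objects_hit": 0}] * 4 + \
      [{"grasped": False, "waypoints_reached": 0, "objects_hit": 1}]
print(execution_scores(log, total_waypoints=5, n_obstacles=4))
```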

IV Reporting Benchmark Scores

In this Section, we explain how the individual metrics are combined into a single composite score. We outline how the benchmark scores are reported, giving guidelines on how to interpret the outcome and how to choose the required user-defined thresholds.

IV-A Final composite score and summary table

All the scores proposed thus far contribute to the computation of a composite score that evaluates the grasping pipeline performance in each layout $l$, accounting for the limits of the testing platform. To this aim, the final score is computed considering only objects $i$ such that:

  • the object is graspable by the robot, i.e. $S_{2}^{i,l} = 1$;

  • the object is in a reachable region, i.e. $S_{0}^{i,l} \geq 0.5$. A region is not considered reachable if less than half of its test poses were reached with the required precision;

  • the object is in a region with a good calibration of the vision system, i.e. $S_{1}^{i,l} \geq 0.5$, meaning that at least half of the calibration poses were reached with acceptable precision.

The expression of the final score for layout $l$ is reported in Eq. (18). If benchmarking with objects in isolation, each successfully grasped trial contributes a term combining the grasp quality score $S_{3}$ and the grasp stability score $S_{5}$; if benchmarking in clutter, the obstacle avoidance score $S_{6}$ is also accounted for. Here $l$ indicates the layout, $i$ the object and $t$ the trial; a trial contributes only if the object has been successfully grasped at that trial ($S_{4}^{i,l,t} = 1$). The scores computed by the benchmark are summarized in Table I.
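The eligibility filtering that precedes the composite score can be expressed compactly; the sketch below is only an illustration (the per-object contribution itself is given by Eq. (18) and is left out here).

```python
def eligible_objects(scores, reach_min=0.5, calib_min=0.5):
    """Keep only objects that are graspable, reachable and well calibrated.

    scores : dict mapping object name -> dict with keys "S0", "S1", "S2"
    """
    return [name for name, s in scores.items()
            if s["S2"] == 1 and s["S0"] >= reach_min and s["S1"] >= calib_min]

# Example (values taken from the layout 1 rows of Table III):
layout1 = {"banana": {"S0": 0.25, "S1": 0.0, "S2": 1},
           "hammer": {"S0": 0.75, "S1": 0.25, "S2": 0},
           "chips can": {"S0": 0.5, "S1": 0.5, "S2": 1}}
print(eligible_objects(layout1))  # ['chips can']
```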

$S_0$ - Reachability score: accounts for whether the object is located in a region characterized by a good reachability of the robot.
$S_1$ - Camera-calibration score: accounts for whether the object is located in a region characterized by a good calibration of the vision system.
$S_2$ - Graspability score: accounts for whether the object can be physically grasped and lifted by the robot, considering its shape and weight.
$S_3$ - Grasp quality score: accounts for how contacts are distributed on the object, by simulating the hand closure and computing the grasp wrench space.
$S_4$ - Binary success score: accounts for whether the robot actually managed to grasp the object in the real tests.
$S_5$ - Grasp stability score: evaluates the stability of the grasp during the execution of a fixed trajectory.
$S_6$ - Obstacle avoidance score (only in cluttered mode): accounts for how many objects the robot has hit while executing the grasp.
Final per-object score (computed in isolation or in clutter): combines all the scores in order to evaluate the grasping pipeline performance, taking into account any limitation of the robotic platform used in the real-world tests.
TABLE I: Summary of the benchmark scores.

The final output of the benchmark consists of a summary table (see Table III for an example). In the second column, the value of the final score for each layout is reported. In the rest of the table, each row collects all the scores computed for each object. Analyzing such scores can give insight into the performance of different parts of the grasping pipeline, down to the hardware. For instance, if the grasp quality score $S_{3}$ is high but the robot could not grasp the object ($S_{4} = 0$), the reachability score $S_{0}$ and the camera-calibration score $S_{1}$ can outline whether the calibration of the vision system or the robot reachability are to blame for the failure in the execution of the grasp. On the other hand, if $S_{3}$ is low but $S_{4}$ and $S_{5}$ are large, this may indicate that the physical execution is able to compensate for the poor grasp quality (e.g. the gripper is compliant and can conform to the object, or the object pose changes during the grasp execution).

IV-B Defining reachability and camera calibration thresholds

As previously mentioned, GRASPA requires position and orientation thresholds for the reaching test ($th^{pos}_{reach}$, $th^{or}_{reach}$, see Paragraph III-B) and for the camera calibration test ($th^{pos}_{cam}$, $th^{or}_{cam}$, see Paragraph III-C). Since GRASPA is meant to adapt to different robot platforms, these thresholds cannot be fixed a priori by the benchmark and have to be chosen by the user according to the robot platform and vision system. $th^{pos}_{reach}$ and $th^{or}_{reach}$ express how precise the robot kinematics is over the GRASPA layout space: for dexterous and precise arms (e.g. industrial manipulators), small values of the reachability thresholds are advisable, while for less precise robots (e.g. research-oriented platforms such as iCub, PR2, Baxter) higher values are needed. On the other hand, $th^{pos}_{cam}$ and $th^{or}_{cam}$ depend on the camera resolution and on the method used to visually infer the end-effector pose; as a reference, the values we found borderline acceptable for a 320x240 resolution camera are reported in Table II.

Note that the aforementioned thresholds are mostly useful in the presence of limits in the hardware, in the inverse kinematics solver, or in the calibration. In this scenario, low thresholds will likely mark some regions as unreachable or not well calibrated, and will restrict grasps to regions where their execution can be more precise. With high thresholds, grasps will also be executed and scored in regions where the lack of precision might lead to unstable grasps and unfair scoring.

V Example of Application

In this Section, we show an example application of the GRASPA protocol. We evaluated the grasping pipeline proposed in [nguyen2018merging] by using the iCub humanoid robot [icub] as the testing platform. We evaluated right-handed grasps performed in isolation, although GRASPA is extendable to multi-armed planning approaches.

V-A Cardinal Point Grasps

Our grasping pipeline can be briefly summarized as follows.

  • 2D segmentation. Using the monocular image stream coming from iCub, we adapt an off-the-shelf TensorFlow implementation [matterport_maskrcnn_2017] of Mask R-CNN [he2017mask] in order to obtain segmentation masks of the objects. We use a ResNet-50 backbone pre-trained on MS COCO, further training it on a subset of YCB-Video [xiang_posecnn_2017] and then fine-tuning it on a custom synthetic dataset. The latter was obtained by augmenting real images with YCB object crops following the Cut, Paste and Learn approach [dwibedi_cut_2017], enhanced with segmentation masks. The dataset features the 16 YCB objects used in GRASPA as classes, and ArUco marker crops as distractors.

  • Object modeling. Partial object point clouds are obtained from segmentation masks through the robot stereo vision. As described in [nguyen2018merging], we approximate the object with the smallest superquadric fitting the point cloud. The superquadric and its 6D pose are estimated by solving a constrained optimization problem, imposing one of the axes of the superquadric to be perpendicular to the table surface.

  • Grasp planning. We generate grasping pose candidates from the cardinal points of the superquadric (i.e. where its axes intersect the surface; see the sketch below). The candidates are then ranked according to the superquadric and hand size, and to the capability of the robot to reach them with sufficient accuracy [nguyen2018merging].
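As a rough illustration of the cardinal-point idea (not the actual implementation of [nguyen2018merging]), the following sketch computes the six points where the superquadric axes intersect its surface, given the superquadric pose and semi-axis lengths.

```python
import numpy as np

def cardinal_points(center, R_sq, semi_axes):
    """Return the six cardinal points of a superquadric.

    center    : 3D position of the superquadric center
    R_sq      : 3x3 rotation matrix whose columns are the superquadric axes
    semi_axes : lengths (a1, a2, a3) of the three semi-axes
    """
    points = []
    for k in range(3):
        axis = R_sq[:, k]
        for sign in (+1.0, -1.0):
            # Point where the k-th axis pierces the superquadric surface
            points.append(center + sign * semi_axes[k] * axis)
    return np.array(points)

# Example with a placeholder superquadric aligned with the world axes:
print(cardinal_points(np.zeros(3), np.eye(3), (0.05, 0.03, 0.08)))
```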

V-B Data collection

Hereafter, we briefly explain the procedure we followed to collect the data required by the benchmark on the physical robot. More information, together with sample code, the reachability and calibration poses, and the object poses, is available online (https://github.com/robotology-playground/GRASPA-test).

V-B1 Reachability score

Data for the computation of the reachability scores has been acquired by having iCub reach the poses defined within the benchmark with the right hand, querying the forward kinematics to obtain the poses actually reached. We used OpenCV to estimate the pose of the layout marker boards (Fig. 2) with respect to the robot. We used this information to express the target poses in the robot reference frame and save the reached poses in the layout reference frame. Fig. 5 shows some samples of the outcome.
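A minimal sketch of the board pose estimation step is given below; it assumes the legacy cv2.aruco API shipped with opencv-contrib-python and uses placeholder board parameters (the actual GRASPA boards define their own marker dictionary and geometry).

```python
import cv2
import numpy as np

def estimate_board_pose(image, camera_matrix, dist_coeffs):
    """Estimate the 4x4 pose of an ArUco marker board in the camera frame."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    # Placeholder board geometry (5x7 markers, 4 cm markers, 1 cm separation)
    board = cv2.aruco.GridBoard_create(5, 7, 0.04, 0.01, dictionary)
    corners, ids, _ = cv2.aruco.detectMarkers(image, dictionary)
    if ids is None:
        return None
    n_used, rvec, tvec = cv2.aruco.estimatePoseBoard(
        corners, ids, board, camera_matrix, dist_coeffs,
        np.zeros((3, 1)), np.zeros((3, 1)))
    if n_used == 0:
        return None
    T = np.eye(4)
    T[:3, :3], _ = cv2.Rodrigues(rvec)  # rotation of the board w.r.t. the camera
    T[:3, 3] = tvec.ravel()             # translation of the board w.r.t. the camera
    return T
```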

(a) Desired poses (set no. 1)
(b) Reached poses (set no. 1)
Fig. 5: Reachability test results: comparison between the desired poses and those actually reached by iCub.

V-B2 Camera-calibration score

We followed the same procedure just outlined for the reachability score. Instead of acquiring the reached pose through the forward kinematics, we resorted to visual detection of two ArUco markers located on the back and the side of the hand.
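The end-effector pose can then be recovered by composing the detected marker pose with the known marker-to-hand offset; a minimal sketch (with illustrative transform names) is shown below.

```python
import numpy as np

def hand_pose_from_marker(T_cam_marker, T_hand_marker):
    """Recover the end-effector pose in the camera frame from a marker detection.

    T_cam_marker  : 4x4 pose of the marker as seen by the camera
    T_hand_marker : 4x4 known (calibrated) pose of the marker w.r.t. the hand frame
    """
    # T_cam_hand = T_cam_marker @ inv(T_hand_marker)
    return T_cam_marker @ np.linalg.inv(T_hand_marker)
```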

V-B3 Graspability

We considered an object to be graspable by the iCub if at least one of its dimensions was smaller than the iCub hand aperture and its weight was compatible with the maximum arm payload (0.5 kg). We considered objects that have a very low profile when laid flat on the table (i.e. the scissors and the clamp) to be un-graspable by iCub.

V-B4 Grasp Quality

For each object visible in each layout, we planned 6D grasping poses according to Section V-A, expressing them in the layout reference frame by using the estimated pose of the ArUco marker board. We used the iCub hand model packaged with the GraspStudio suite [simox]. A graphical rendering of some of the planned poses can be seen in Figure 6.

Fig. 6: Rendering of the grasp poses planned for layout 0 with the tested algorithm. For visual clarity, only one pose is rendered for each object.

V-B5 Binary Success and Stability Scores

We executed in isolation the grasps computed by the algorithm for each object. Whenever the robot managed to grasp the object, we also had it execute the trajectory defined in Section III-F. We added a layer of rubber to the robot fingertips to provide enough friction at the contact points. In these tests, we did not perform the last waypoint of the stability trajectory.

V-C Results and Discussion

Table II collects the user-defined parameters we used for data acquisition and score computation. The threshold values have been chosen considering the reachability and the visual calibration limits of the iCub. The pipeline under test plans for power grasps (i.e. it computes a pose for the hand palm and not for each fingertip) and is, therefore, able to deal with reachability errors in position of 0.02 m and in orientation of 0.5 rad. In order to deal with the higher errors in the calibration of the vision system (0.045 m and 0.8 rad), we made use of a calibration map obtained by kinesthetically teaching the robot the correction to be applied over a set of poses.

Computed scores are reported in Table III. We highlight: 1) the values of the reachability score $S_{0}$ and of the camera calibration score $S_{1}$ when $S_{0} < 0.5$ or $S_{1} < 0.5$ (the object is in a region unreachable by the robot or affected by an unacceptable visual calibration error); 2) the value of the graspability score when the object is not graspable, i.e. $S_{2} = 0$. In these cases, the final score is not computed by the benchmark and is replaced with the placeholder N/A. Moreover, because of the proximity of some objects to the robot torso, the stereo vision could not reliably acquire partial point clouds; in these cases, no further score is reported.

Table III shows how our benchmark properly evaluates the grasping pipeline without penalizing its performance wherever the test platform proved its limits. A meaningful example is the foam brick in layout 0. The grasp quality score is good, meaning that the algorithm computes proper grasping poses for the object. However, in practice, the robot could grasp the object only once over the 5 trials ($S_{4} = 0.2$). Such failures can be attributed to the poor calibration of the vision system in the region of the object ($S_{1} = 0.25$). Therefore, the foam brick scores do not contribute to the computation of the final composite score. On the other hand, other objects (e.g. potted meat can, cracker box and tennis ball) have a low grasp quality score $S_{3}$ in layouts 0 and 1, but show higher values for $S_{4}$ and $S_{5}$. We observed this to be caused by the mechanical underactuation of the iCub hand (not modeled in the GraspStudio [simox] environment), which allows the fingers to conform to the object.

Robot: iCub  |  End-effector: right hand  |  Modality: in isolation
Reachability thresholds: 0.02 m (position), 0.5 rad (orientation)
Camera-calibration thresholds: 0.045 m (position), 0.8 rad (orientation)
TABLE II: User-defined parameters used during the benchmarking procedure.
Per-object scores (columns: S0, S1, S2, S3, S4, S5, Final)

Layout 0 (final score: 0.60)
banana           1.0   0.75  1.0   0.32  0.8   0.2   0.36
foam brick       0.75  0.25  1.0   0.27  0.2   0.2   N/A
gelatin box      0.75  0.25  1.0   0.07  0.2   0.0   N/A
mustard bottle   1.0   1.0   1.0   0.15  1.0   0.8   0.95
potted meat can  1.0   1.0   1.0   0.01  0.8   0.45  0.46

Layout 1 (final score: 0.70)
banana           0.25  0.0   1.0   0.19  0.0   0.0   N/A
hammer           0.75  0.25  0.0   N/A   N/A   N/A   N/A
chips can        0.5   0.5   1.0   0.25  1.0   1.0   1.25
tennis ball      1.0   1.0   1.0   0.23  0.2   0.2   0.29
cracker box      1.0   1.0   1.0   0.04  0.8   0.5   0.54
mustard bottle   0.75  0.25  1.0   0.23  0.8   0.15  N/A
potted meat can  1.0   0.75  1.0   0.01  0.8   0.7   0.71

Layout 2 (final score: 0.77)
pear             1.0   0.75  1.0   0.0   0.0   0.0   0.0
scissors         0.75  0.25  0.0   N/A   N/A   N/A   N/A
chips can        0.5   0.5   1.0   0.48  1.0   1.0   1.48
strawberry       1.0   1.0   1.0   0.13  0.6   0.55  0.51
tennis ball      1.0   0.75  1.0   0.07  0.4   0.4   0.43
power drill      0.25  0.0   0.0   N/A   N/A   N/A   N/A
mustard bottle   0.5   0.5   1.0   0.25  1.0   1.0   1.25
medium clamp     0.75  0.25  0.0   N/A   N/A   N/A   N/A
master chef can  1.0   1.0   0.0   N/A   N/A   N/A   N/A
potted meat can  0.75  0.25  1.0   N/A   N/A   N/A   N/A
tomato soup can  0.75  0.25  1.0   N/A   N/A   N/A   N/A

TABLE III: Results obtained when testing Cardinal Point Grasps [nguyen2018merging] on the iCub humanoid robot.

VI Conclusions

In this paper we proposed version 1.0 of GRASPA, a benchmarking protocol and a set of metrics to fairly evaluate grasping pipelines tested on diverse robotic platforms. As shown by a practical application (Section V), the metrics and the final grasping score we designed allow distinguishing between failures caused by the testing platform and those induced by the limitations of the pipeline itself.

Future directions for successive releases include improving the computation of the grasp quality score, by allowing users to specify custom finger joint trajectories during the simulated hand closure and to set specific fingertip-object friction coefficients. As outlined in Section III-E, version 1.0 of GRASPA employs a grasp quality metric based on the analysis of the GWS. This indicator has been shown to be brittle with respect to uncertainty [kim2013physically]; therefore, future developments of GRASPA will include a measure of grasp quality that accounts for object dynamics. We also plan to use GRASPA to evaluate the pipeline outlined in Section V-A, together with others drawn from the state of the art, on a setup equipped with a Franka Panda arm and RGBD cameras in order to compare results. Finally, new objects and layouts can easily be added to the ones presented in this paper to meet the community needs.

References