Post

Grasping Theory

Grasping Theory

Introduction

In this blog post, we will explore the fundamental concepts of robotic grasping. First, we will summarize the theory and mathematical models underlying robotic grasping, focusing on the contacts and forces that define the immobilization of an object. Next, we will use these mathematical models to formalize some of the most commonly used evaluation metrics in the field.

The objective of this post is to provide a clear understanding of grasping evaluation metrics and how to implement them effectively.

Sources

[1] SIRSLab Siena-Robotics-Systems-Lab - “The Art of Grasping and Manipulation in Robotics”

[2] Modern Robotics, Chapter 12: Grasping and Manipulation

[3] Roa, Máximo A., and Raúl Suárez. “Grasp Quality Measures: Review and Performance.” Autonomous Robots 38, no. 1 (January 1, 2015): 65–88. https://doi.org/10.1007/s10514-014-9402-3.

[4] Ferrari, C., and J. Canny. “Planning Optimal Grasps.” In Proceedings 1992 IEEE International Conference on Robotics and Automation, 2290–95 vol.3, 1992. https://doi.org/10.1109/ROBOT.1992.219918.

[5] Miller, Andrew T., and Peter K. Allen. “GraspIt!: A Versatile Simulator for Grasp Analysis.” In Dynamic Systems and Control: Volume 2, 1251–58. Orlando, Florida, USA: American Society of Mechanical Engineers, 2000. https://doi.org/10.1115/IMECE2000-2439. (2004 version)

Grasping Theory

There appear to be two main approaches to conceptualizing grasping. Both utilize Screw Theory to model the forces and velocities at the contact points.

  1. The first approach examines the individual effects of each contact by considering the first-order geometric derivative of the individual contacts [1]. The results are then superimposed to draw conclusions about the overall grasp. This method is simpler to understand and provides a foundational view of grasping.
  2. The second approach combines the contact equations into matrices and derives conclusions based on the properties of these matrices [2].

At first glance, the second approach appears more prevalent in the field of anthropomorphic grasping. However, the simpler method still addresses several critical aspects and, in a sense, the matrix-based approach can be viewed as an extension of the single-contact analysis.

Background

Before diving into the theory of grasping, we need to establish some basic definitions from Linear Algebra. These definitions will be essential for determining the forces that arbitrary contact points can produce to resist an object’s movement.

In these definitions, let us consider $A$ as the set containing three vectors:

\[\begin{equation} A=\{v_1,v_2,v_3\} \text{ where } v_j \in \mathbb{R}^n \end{equation}\]
  • Linear Span: The set of all vectors that can be obtained by a linear combination.
\[\begin{equation} \text{span}(A) = \{ \sum_{i=1}^j k_iv_i | k_i \in \mathbb{R}\} \end{equation}\]

Desktop View Fig 1: Span of the set $A$

  • Positive Span: The set of all vectors that can be obtained by positive linear combination.
\[\begin{equation} \text{pos}(A) = \{ \sum_{i=1}^j k_iv_i | k_i \ge 0 \} \end{equation}\]

Desktop View Fig 2: Positive span of the set $A$

  • Convex Span: The set of all vector that can be obtained by positive linear combination, where the linear coefficients sum to 1.
\[\begin{equation} \text{conv}(A) = \{ \sum_{i=1}^j k_iv_i | k_i \ge 0 \text{ and } \sum k_i = 1 \} \end{equation}\]

Desktop View Fig 3: Convex span of the set $A$

It is important to understand some implications about these concepts:

  • $\text{conv}(A) \subseteq \text{pos}(A) \subseteq \text{span}(A)$
  • For the space $\mathbb{R}^n$ to be fully defined, it needs to be linearly spanned by at least $n$ vectors.
  • For the space $\mathbb{R}^n$ to be fully defined, it needs to be positively spanned by at least $n+1$ vectors.

  • Null Space (kernel): Given a transformation $A: X \rightarrow Y$, where $A$ is matrix and the transformation is defined as $Ax=y$, the null space of $A$ is the set of vectors $x$ where $Ax=0$.

  • Generalized Inverse Decomposition (Moore-Penrose pseudo-inverse): The generalized inverse decomposition is a formulation that can be applied non-squared matrices and singular matrices. Equation $\ref{eq:inv}$ defines this formulation, where $A$ is a generic matrix, $A^\#$ is the pseudo-inverse and $z$ is an arbitrary vector in the free parameter space. Note that if $\mathcal{N}(A) \ne 0$ then $x$ has infinitely many possible values in a subset defined by the null space.
\[\begin{equation} \begin{split} Ax =& y\\ x =& A^\#y + \mathcal{N}(A)z\\ \end{split} \label{eq:inv} \end{equation}\]

Notation

In this subsection, we will define notation commonly utilized in the field of multi-fingered robotic grasping.

Desktop View Fig 4: Diagram of robotic hand in contact with an object

  • $\{\text{N}\}$ : Inertial frame, usually located on the palm
  • $\{\text{B}\}$ : Object frame, usually located at the center of mass
  • $\{\text{C}\}$ : Contact frames, with axis $\{\hat{n}_i,\hat{o}_i,\hat{t}_i\}$ in the $i$-th contact point.
  • $b \in \mathbb{R}^3$ : Origin of $\{\text{B}\}$ expressed in $\{\text{N}\}$
  • $c_i \in \mathbb{R}^3$ : Origin of $\{\text{C_i}\}$ expressed in $\{\text{N}\}$
  • $q = [q_1,q_2,\dots,q_{n_q}] \in \mathbb{R}^{n_q}$ : Vector of joint displacements.
  • $\dot{q} \in \mathbb{R}^{n_q}$ : Vector of joint velocities.
  • $\tau_c = [\tau_1,\tau_2,\dots,\tau_{n_q}] \in \mathbb{R}^{n_q}$ : Vector of joint loads (forces for prismatic and torques for revolute joints).
  • $u = \begin{bmatrix} p \\ \theta \end{bmatrix} \in \mathbb{R}^{n_u}$ : Describes the position and orientation of $\{\text{B}\}$ with respect to $\{\text{N}\}$.
  • $\upsilon = \begin{bmatrix} \text{v} \\ \omega \end{bmatrix} = \begin{bmatrix} \dot{p} \\ \omega \end{bmatrix} \in \mathbb{R}^{n_\upsilon}$ : Describes the linear and angular velocities. This is commonly referred as Twist.
  • $\mathcal{W} = \begin{bmatrix} f \\ m \end{bmatrix} \in \mathbb{R}^{n_{\upsilon}}$ : Object load. This is commonly referred as Wrench.
  • $f \in \mathbb{R}^3$ : Force applied to the object at $p$.
  • $m \in \mathbb{R}^3$ : Moment applied to the object.

Note that the axis $\hat{n}$ is perpendicular to the tangent plane at the contact point and points inward toward the object. The axes $\hat{t}$ and $\hat{o}$ are two arbitrary orthogonal vectors defined within the tangent plane.

Additionally, it is important to address why we do not define $\dot{u} = \upsilon$. This is because the orientation of the object is defined using Euler angles, which represent a sequence of rotational transformations. However, there exists a matrix $V$ such that:

\[\begin{equation} \dot{u} = V\upsilon \end{equation}\]

where $V^TV=I$. Deriving this matrix is outside the scope of this discussion, but it is worth noting that $V$ depends on both the rotation matrix and its derivative.

First-Order Analysis of a Single Contact

In this subsection, we will discuss the first-order analysis of contact points. Our focus will be on defining this concept, understanding what information can be extracted from it, and identifying its limitations.

First, we define the following variables:

  • $q_1,q_2$: The rigid body configurations of body 1 and 2, respectively.
  • $q = (q_1,q_2)$: The set containing the configurations of both bodies.
  • $p(q)$: The smallest distance between the two bodies, which intuitively depends only on $q$.

Utilizing these definitions, we can construct a table of contacts:

$p$$\frac{dp}{dt}$$\frac{d^2p}{dt^2}$Contact State
$> 0$  no contact
$< 0$  infeasible, overlapping bodies
$= 0$$>0$ in contact, but breaking
$= 0$$<0$ infeasible, about to overlap
$= 0$$=0$$>0$in contact, but breaking
$= 0$$=0$$<0$infeasible, about to overlap

As we can see, this analysis can be extended to any order of derivatives, but the most common case involves the first two derivatives:

\[\begin{equation} \begin{split} & \dot{d} = \frac{\partial d}{\partial q} \\ & \ddot{d} = \frac{\partial d}{\partial q} \ddot{q} + \dot{q}^T \frac{\partial^2 d}{\partial q^2}q^2 \\ \end{split} \label{eq:analysis} \end{equation}\]

In Equation $\ref{eq:analysis}$, the first term represents the first-order contact geometry, which corresponds to the contact normal. The second term accounts for the second-order contact geometry, providing information about curvature. In first-order contact kinematics, we assume no second-order derivative.

This implies that any analysis considering only the contact normal is equivalent to a first-order contact analysis, as will be the case in future sections.

This simplification imposes some limitations which are expressed in figure 4.

Desktop View Fig 5: Two cases that appear identical in a first-order analysis but differ in a second-order analysis.

As we can see in the previous figure, the first-order analysis fails to distinguish between the two situations, which can be important if the object starts moving. Intuitively, the first case becomes unstable when the object starts moving, while the second case remains stable.

Contact Types: Rolling, Sliding and Breaking

Given the diagram of figure 5, we can perform a geometric analysis of the two bodies in contact. $p_a$ and $p_b$ represent the same contact point on different bodies, but they are drawn slightly apart for clarity of visualization.

Desktop View Fig 6: Diagram of two bodies in contact

Firstly, we can impose an Impenetrability Constraint, which requires that the distance along the contact normal must increase. This can be formulated using Equation $\ref{eq:impen}$:

\[\begin{equation} \hat{n}^T(\dot{p}_A - \dot{p}_B) \ge 0 \label{eq:impen} \end{equation}\]

In this case, $\dot{p}_i$ represents the instantaneous velocity of contact $i$ expressed as:

\[\begin{equation} \dot{p}_i = \text{v}_i + \omega_i (c_i - b) \end{equation}\]

However, contact can still be broken while satisfying this constraint. To prevent this, we can impose additional restrictions:

\[\begin{equation} \hat{n}^T(\dot{p}_A - \dot{p}_B) = 0 \label{eq:sliding} \end{equation}\]

This restriction ensures that the distance between the contacts does not increase in the normal direction. However, movement in the tangential direction (sliding) can still occur while satisfying this condition. Despite this, the restriction guarantees that contact is maintained.

The only condition that ensures a fixed position between the contacts is:

\[\begin{equation} \dot{p}_A - \dot{p}_B = 0 \label{eq:rolling} \end{equation}\]

Using the same restrictions, but considering wrenches instead of the normal, we can formalize three different types of contacts:

  • Breaking (B): $\mathcal{F}_n^T(\upsilon_A - \upsilon_B) > 0$
  • Sliding (S): $\mathcal{F}_n^T(\upsilon_A - \upsilon_B) = 0$
  • Rolling (R): $\dot{p}_A - \dot{p}_B = 0$

As an example to visualize these conditions, we refer to Figure 7. This interaction represents a single-point contact between two planar bodies, where only translation is allowed.

Therefore, the space of twists of body $A$ consists of only horizontal and vertical velocities, which can be represented in a 2D chart. Body $B$ is assumed to be fixed.

Desktop View Fig 7: Analysis of a 2D contact between $A$ and $B$, and the classification of each type of contact given the twist of $A$

The diagram illustrates that the contact force restricts the possible twists of $A$ to half of the plane. However, a single contact allows for breaking, sliding, or rolling interactions.

Multiple Contacts

We can extend the analysis from the previous subsection to multiple contacts. If multiple contacts constrain the same body, the allowed twist space is given by the union of the constraints imposed by each contact. A graphical interpretation of this concept is presented in Figure 8.

Desktop View Fig 8: Analysis of multiple 2D contacts on $A$ and the classification of the contact modes given the twist of $A$

When evaluating multiple contacts, each contact can belong to different classifications, leading to multiple possible combinations of contact classifications. The concatenation of the contact labels corresponding to a given twist of AA is referred to as the contact mode.

The set of possible twists that determine these contact modes is defined as follows:

If $A$ is a movable body with ii contacts, then the set of feasible twists for $A$ forms the polyhedral convex set $V$: $V = \{\upsilon_A | \mathcal{F}^T_i (\upsilon_A - \upsilon_i) \ge 0 \text{ for all } i \}$

Form Closure

Based on the previous analysis of multiple contacts, we can define form closure as follows:

A body is in first-order form closure if the only solution to $W_i^T \upsilon \ge 0 \text{ for all contacts } i$ is $\upsilon = 0$

A direct corollary of this condition is given by Equation $\ref{eq:form_c}$

\[\begin{equation} \text{pos}(\\{\mathcal{F}_1, \dots,\mathcal{F}_i \\}) = \mathbb{R}^n \label{eq:form_c} \end{equation}\]

In an $n$-dimensional space, at least $n+1$ vectors are required to positively span the space. This implies that for first-order form closure (grasping) of a 3D object, a minimum of 7 contact points is necessary.

Note: Form closure does not take friction into account.

We can verify first-order form closure by following these steps:

  1. Construct the wrench matrix $F = [F_1, \dots, F_j] \in \mathbb{R}^{n \times j}$ where $n$ is 3 (planar) or 6 (spatial)
  2. Verify the following conditions:
    1. $\text{rank}(F) = n$
    2. $Fk=0 \text{ for some } k \in \mathbb{R}^j, k_i \ge \epsilon > 0 \text{ for all } i $

The limitations of first-order form closure are expressed in the following diagram:

Desktop View Fig 9: Illustration of three contact modes that fail planar first-order form closure. (a) is not constrained because it can rotate, whereas (b) and (c) achieve form closure

The three diagrams fail to achieve first-order form closure because they have fewer than four contact points. Specifically, the first object is not constrained as it can still rotate. However, we can intuitively deduce that objects (b) and (c) are fully constrained. Form closure for these objects only becomes apparent in higher-order analyses, where geometric properties are taken into account.

Therefore, we can conclude that first-order form closure implies second-order form closure. However, if the first-order analysis predicts rolling or sliding contacts, form closure might still be achieved in higher-order analyses.

Friction

In robotic grasping, friction is typically modeled using the Coulomb Friction Model. The friction force is expressed as follows:

\[\begin{equation} \begin{split} & \text{v} = 0 \rightarrow ||f_t|| \le \mu f_n \\ & \text{v} \ne 0 \rightarrow f_t = - \mu f_n \frac{\text{v}}{||\text{v}||} \\ \end{split} \label{eq:friction_force} \end{equation}\]

where $\mu$ is the coefficient of friction which depends on the materials of both bodies in contact.

Using this model, we can define the Friction Cone (FC). The FC is the set of all forces that can be transmitted through a Coulomb friction contact. Mathematically, it is defined as:

\[\begin{equation} FC = \\\{f | f_z \ge 0, \sqrt{f_x^2 + f_y^2} \le \mu f_z \\\} \end{equation}\]

Desktop View Fig 10: Diagram of a generic Friction Cone

Figure 10 illustrates the friction cone, where the z-axis will always be aligned to the contact normal. Note that $\alpha = \arctan{\mu}$.

Defining the Friction Cone (FC) for computation is challenging because it is an infinite set. This issue is typically addressed by discretizing the FC into $M$ forces along its boundary and then defining it as the positive span of these discretized forces.

Desktop View Fig 11: Example of Friction Cone discretized with 4 boundary forces

This approximation always results in a slightly different set that is more restricted than the true Friction Cone. The accuracy of the approximation can be improved by increasing the number of boundary forces, though this comes at the cost of higher computational complexity.

Contact Models

In the real world, different fingers should be modeled using different contact models. The three main contact models are:

  • Simple Contact - A single-point contact without friction. It can only apply force in the direction normal to the contact surface.
  • Hard Finger - A single-point contact with friction. It can apply forces within the friction cone, having both normal and tangential components relative to the contact plane.
  • Soft Finger - A contact model for deformable materials. When contact is made, the deformation creates a contact patch capable of applying torques normal to the contact plane, in addition to forces. This model also incorporates friction.

Force Closure

Force closure is similar to form closure, as both define the immobilization of an object. However, force closure has a distinct definition:

A set of frictional contacts achieves force closure if the positive span of the wrench cones is the entire wrench space.

Similar to form closure, we can verify the existence of force closure by following these steps:

  1. Construct the wrench matrix with friction cone boundary edges $F = [F_1, \dots, F_j] \in \mathbb{R}^{n \times j}$ where $n$ is 3 (planar) or 6 (spatial)
  2. Verify the following conditions:
    1. $\text{rank}(F) = n$
    2. $Fk=0 \text{ for some } k \in \mathbb{R}^j, k_i \ge \epsilon > 0 \text{ for all } i $

Insights on Force Closure:

  • There is no guarantee that the contacts can truly resist all external wrenches, as we have not proven whether the motor joints can generate the required torques.
  • If we assume frictionless contacts, force closure reduces to first-order form closure.
  • Force closure with friction (hard fingers) can be achieved with three contacts in space.
  • Soft fingers require only two contacts to achieve force closure in space.
  • In robotic grasping, force closure is generally preferred over form closure because it imposes fewer restrictions. Additionally, friction forces are almost always present in real-world scenarios.

Grasping Types

There are two main types of grasping:

  • Precision Grasping: Involves only the fingertips. This method is typically used when the object needs to be manipulated after grasping. It is also easier to control using traditional methods due to the reduced number of contact points.
  • Power Grasping: Involves the entire hand, often incorporating the palm. This method is commonly used to securely hold an object for relocation. In this case, the object becomes an extension of the hand and does not move relative to it.

Precision grasping requires a high number of degrees of freedom to ensure that the fingertips can generate the necessary forces for force closure.

In contrast, power grasping increases the number of contact points, which adds complexity to control. Some traditional control strategies simplify power grasping by reducing the number of actuated degrees of freedom (e.g., using compliant fingers).

Grasp Matrix

The Grasp Matrix is the cornerstone of grasping. Its properties provide critical insights into the feasibility of a given grasp and play a significant role in evaluation metrics.

Given the vector of contact forces $F^T = [F_1, \dots, F_n] \in \mathbb{R}^{6n \times 1}$, and the external wrench applied to the object $\mathcal{W} \in \mathbb{R}^6$, the Grasp Matrix $G$ is defined as:

\[\begin{equation} \mathcal{W} = GF \end{equation}\]

Basically, the Grasp Matrix transforms the forces at the contacts into the total wrench applied to the object. The previous equation comes from a static evaluation of the forces, and it is dependent on the reference frame. The normal choice of reference frame is the objects center of mass. If $\mathcal{W} = 0$ and the previous equation has a solution, then we say that $F \in \mathcal{N}(G)$ and $F$ is composed by the internal forces.

We will derive the Grasp Matrix based on its definition. We will begin with the support of the following diagram.

Desktop View Fig 12: Diagram of generic body in 3D with a generic contact

For a rigid body, the velocity of the contact can be expressed in function of the object’s twist: $\text{v}_{i,obj}^N = \dot{p} + \omega (c_i - p)$. This velocity is expressed in the inertial frame ${N}$.

We can rewrite this equation using the skew matrix representation:

\[\begin{equation} \text{v}_{i,obj}^N = \dot{p} - S(c_i - p) \omega \label{eq:vel_skew} \end{equation}\]

In this case, $(c_i - p) \in \mathbb{R}^3$, the kew matrix is defined as:

\[\begin{equation} S(a) = \begin{bmatrix} 0 & -a_z & a_y \\ a_z & 0 & -a_x \\ -a_y & a_x & 0\end{bmatrix} \end{equation}\]

Note: $S^T(a) = -S(a) = S(-a)$

We now express Equation $\ref{eq:vel_skew}$ in matrix form:

\[\begin{equation} \begin{split} & \text{v}_{i,obj}^N = \begin{bmatrix} \mathcal{I} & -S(c_i - p) \end{bmatrix} \begin{bmatrix} \dot{p} \\ \omega \end{bmatrix} \\ & \text{v}_{i,obj}^N = \begin{bmatrix} \mathcal{I} & -S(c_i - p) \end{bmatrix} \upsilon \end{split} \label{eq:vel_twist} \end{equation}\]

We can perform a similar analysis for the rotational velocity of the contact. In rigid bodies, every point on the body shares the same angular velocity $\omega$. Therefore:

\[\begin{equation} \omega_{i,obj}^N = \begin{bmatrix} 0 & \mathcal{I}\end{bmatrix} \upsilon \label{eq:omega_twist} \end{equation}\]

We can join Equations $\ref{eq:vel_twist}$ and $\ref{eq:omega_twist}$, we can define the twist at the contact point as a function of the twist of the body.

\[\begin{equation} \upsilon_{i,obj}^N = P_i^T \upsilon \label{eq:contact_twist} \end{equation}\]

where $P_i$ is defined as

\[\begin{equation} P_i = \begin{bmatrix} \mathcal{I} & 0 \\ S(c_i - p) & \mathcal{I} \end{bmatrix} \label{eq:p} \end{equation}\]

In Equation $\ref{eq:contact_twist}$, we’ve defined the transformation matrix as the transpose of $P_i$ beacuse it will be useful in later defining the Grasp Matrix $G$.

For the next step, we need to express the contact twist in the coordinates of the contact frame. The conversion between frames is equivalent to a pure rotation of both the linear and angular velocity components.

\[\begin{equation} \begin{split} & \upsilon_{i,obj} = \tilde{R}_i^T \upsilon_{i,obj}^N \\ & \tilde{R}_i^T = \begin{bmatrix} R_i^T & 0 \\ 0 & R_i^T \end{bmatrix} \end{split} \label{eq:transform} \end{equation}\]

By combining Equations $\ref{eq:contact_twist}$ and $\ref{eq:transform}$, we define the contact twist in the contact frame as:

\[\begin{equation} \upsilon_{i,obj} = \tilde{R}_i^T P_i^T \upsilon = \tilde{G}^T_i \upsilon \label{eq:twist} \end{equation}\]

If we concatenate all the contact twist in a single matrix $\upsilon_{c,obj} \in \mathbb{R}^{6n_c}$, then we can define:

\[\begin{equation} \upsilon_{c,obj} = \tilde{G}^T \upsilon \label{eq:grasp_m} \end{equation}\]

where $\tilde{G}^T \in \mathbb{R}^{6n_c \times 6}$ is the complete grasp matrix, constructed by vertically concatenating all individual matrices $\tilde{G}_i$. It is important to note that the complete grasp matrix does not incorporate contact models.

Hand Jacobian

An important aspect of robotic grasping is the relationship between the joint velocities and the velocity of a given point on the hand. We will now analyze the following diagram.

Desktop View Fig 13: Robot with a generic kinematics in contact with an object in a generic point

The relationship between the twist of the contact (on the hand) and the joint velocities is given by:

\[\begin{equation} \upsilon^N_{i,hand} = Z_i \dot{q} \label{eq:simple_jacob} \end{equation}\]

where $Z_i \in \mathbb{R}^{6 \times n_q}$ is a matrix composed $n_q$ column vectors,

\[\begin{equation} Z_{i,j} = \begin{bmatrix} d_{i,j} \\\ e_{i,j} \end{bmatrix}. \label{eq:Z} \end{equation}\]

The $d_{i,j}$ and $e_{i,j}$ are both column vectors in $\mathbb{R}^3$ representing the translational and rotational influences, respectively, of each joint on the hand’s twist. We can think of $Z_i$ as the influence that a unit of joint velocity has on the twist of the hand.

Note: This matrix is equivalent to the derivative of the hand velocities with respect to the joints, but the current interpretation is a little more intuitive.

To determine the vectors $d_{i,j}$ and $e_{i,j}$, we use the following rules:

  • $d_{i,j} = 0$ and $e_{i,j} = 0$ if the joint does not influence the contact. For example, in the Figure 13, the third joint does not affect the contact position.
  • $d_{i,j} = \hat{z}_j$ if joint $j$ is prismatic, where $\hat{z}_j$ is a unit vector in the direction of the joint’s extension.
  • $d_{i,j} = -S(c_i - \xi_i) \hat{z}_j$ if joint $j$ is revolute, where $\hat{z}_j$ is a unitary vector in the direction of the joint’s rotation. $\xi_i$ is an arbitrary point in the rotation axis.
  • $e_{i,j} = 0$ if joint $j$ is prismatic.
  • $e_{i,j} = \hat{z}_j$ if joint $j$ is revolute.

As in previous derivations, performing the analysis in the contact frame simplifies calculations. Thus, we must convert the expressions by applying a pure rotation.

\[\begin{equation} \upsilon_{i,hand} = \tilde{R}^T Z_i \dot{q} = \tilde{J}_i \dot{q} \label{eq:i_complete_jacob} \end{equation}\]

If we concatenate the twists for all contact points, we can express the equations as a single matrix equation:

\[\begin{equation} \upsilon_{c,hand} = \tilde{J} \dot{q} \label{eq:complete_jacob} \end{equation}\]

In the previous equation, we defined the complete Hand Jacobian $\tilde{J} \in \mathcal{R}^{6n_c \times n_q}$, constructed by vertically concatenating all individual matrices $\tilde{J}_i$. It is important to note that the complete Hand Jacobian does not incorporate contact models.

Contact Models restrictions

To incorporate contact model information, we introduce a matrix $H_i \in \mathbb{R}^{l_i \times 6}$, which sets to zero dimensions that are not influenced by the specific contact model. In this case, $l_i$ depends on the contact model and represents the number of constraints imposed by the contact on the physical system.

If we define the complete twist applied by a contact as $\tilde{\lambda}_i \in \mathbb{R}^6$, the actual twist applied by a given contact model is $\lambda_i = H_i \tilde{\lambda}$, where $\lambda \in \mathbb{R}^{l_i}$.

  • Single Point without friction: This contact model applies only one restriction along the normal direction of the contact. Therefore, $l_i = 1$ and $H_i$ is defined as:
\[\begin{equation} H_i = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \label{eq:simple} \end{equation}\]
  • Single Point friction (Hard Finger): This contact model applies a restriction along the normal direction of the contact, as well as two additional constraints along the two orthogonal directions. Therefore, $l_i = 3$ and $H_i$ is defined in equation $\ref{eq:hard}$. Additionally, this type of contact must also satisfy the friction cone constraint. \(\begin{equation} H_i = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix} \label{eq:hard} \end{equation}\)

  • Soft Finger: This contact model can apply forces along all three axes and also exert a torque in the normal direction. Therefore, $l_i = 4$ and $H_i$ is defined in $\ref{eq:soft}$. Additionally, this type of contact must also satisfy the friction cone constraint.

\[\begin{equation} H_i = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix} \label{eq:soft} \end{equation}\]

For a multi-contact interaction, we can define matrix $H \in \mathbb{R}^{l_i \times 6n_c}$ as the horizontal concatenation of the individual $H_i$ matrices for all contacts. Using this, we can define the Grasp Matrix $G$ and the Hand Jacobian &J& as:

\[\begin{equation} \begin{split} & G = \tilde{G} H^T \\ & J = H \tilde{J} \end{split} \label{eq:grasp_matrix} \end{equation}\]

With this knowledge, we can now formulate a geometric condition to maintain contact by ensuring that the relative twist between the hand and the object at the contact point is zero.

In this formulation, we continue to use $H_i$ to account for the contact type.

\[\begin{equation} H_i ( \upsilon_{i,hand} - \upsilon_{i, obj}) = 0 \label{eq:rel_twist} \end{equation}\]

By using the formulation that concatenates all contact points into a single matrix and applying the derivations for $\upsilon_{i, hand}$ and $\upsilon_{i, obj}$ from previous sections, we obtain the following equation:

\[\begin{equation} \begin{split} & H(\tilde{J}\dot{q} - \tilde{G}^T\upsilon) = 0 \\ & J\dot{q} - G^T\upsilon = 0 \end{split} \label{eq:conctact_const} \end{equation}\]

Quasi-static Grasp Model

Quasi-static evaluation assumes that while the system remains motionless, infinitesimal variations in position are permitted to analyze certain physical relationships.

In this case, we apply the Principle of Virtual Work, expressing energy equilibrium as the total work produced by all forces and torques over an infinitesimal displacement and setting it to zero (static condition).

We begin by formulating the equilibrium condition for the object in contact with the robotic hand.

\[\begin{equation} W^T \delta u + \tilde{\lambda} ^T\delta c_i = 0 \label{eq:vw} \end{equation}\]

In Equation $\ref{eq:vw}$, $W$ represents the external wrench applied to the body, and $\tilde{\lambda}$ represents the contact forces.

Previously, we expressed the twist at the contact points as: $\upsilon_{c,obj} = \tilde{G}^T \upsilon$

Since $\upsilon_{c,obj}$ describes the displacement at the contacts and $\upsilon$ describes the displacement of the body, this implies that: $\delta c_i = \tilde{G}^T \delta u$

We now substitute this equality into Equation $\ref{eq:vw}$.

\[\begin{equation} \begin{split} & W^T \delta u + \tilde{\lambda}^T\tilde{G}^T \delta u = 0 \\ & (W^T + \tilde{\lambda}^T\tilde{G}^T) \delta u = 0 \end{split} \label{eq:vw_middle} \end{equation}\]

For Equation $\ref{eq:vw_middle}$ to hold for any $\delta u$, the expression inside the parentheses must be zero.

\[\begin{equation} \begin{split} & W^T + \tilde{\lambda}^T\tilde{G}^T = 0 \\ & (W^T + \tilde{\lambda}^T\tilde{G}^T)^T = 0 \\ & W + \tilde{G}\tilde{\lambda} = 0 \\ & W + G H H^T \lambda = 0 \\ & W + G \lambda = 0 \end{split} \label{eq:vw_final} \end{equation}\]

This final equation enforces the static constraint on the object. It states that the contact forces must generate a wrench that counteracts the external wrench applied to the object.

Similarly, if we express the equilibrium of virtual work for the hand, which is influenced by $-\lambda$ and the joint torques $\tau$, we obtain a new constraint. This derivation follows the same principles as the previous one and will be presented continuously for conciseness.

Keep in mind that the relationship between the contact points and the hand joints is given by the Hand Jacobian.

\[\begin{equation} \begin{split} \tau^T \delta q &- \tilde{\lambda}^T\delta c_i = 0 \\ \tau^T \delta q &- \tilde{\lambda}^T\tilde{J}\delta q = 0 \\ (\tau^T &- \tilde{\lambda}^T\tilde{J}) \delta q = 0 \\ \tau^T &- \tilde{\lambda}^T\tilde{J} = 0 \\ \tau^T &- \lambda^T H H^T J = 0 \\ \tau^T &- \lambda^T J = 0 \\ (\tau^T &- \lambda^T J)^T = 0 \\ \tau &- \lambda J^T = 0 \\ \tau &= J^T \lambda \end{split} \label{eq:hand_equi} \end{equation}\]

This equation defines the hand equilibrium constraint, which establishes the relationship between the contact forces and the joint motor torques.

Grasp Properties

The derivations in the previous sections provided a set of useful equations describing the grasping process. We can further analyze these equations to understand how the properties of the Hand Jacobian $J$ and the Grasp Matrix $G$ influence the quality of a grasp.

Redundancy

If we isolate the first part of the contact constraint and attempt to find a solution:

\[\begin{equation} \begin{split} & \upsilon_{c,hand} = J \dot{q} \\ & \dot{q} = J^\# \upsilon_{c,hand} + \mathcal{N}(J)\alpha \end{split} \label{eq:cc1} \end{equation}\]

Given that $\alpha$ is an arbitrary vector in the parameter space, if $\mathcal{N}(J) \ne \emptyset$, it implies that the hand is redundant with respect to the contacts.

In other words, multiple joint configurations can achieve the required twist at the contact points. While this redundancy can enable more flexible grasps, the indirect mapping between contact twists and joint velocities may complicate the planning process.

Indeterminacy

If we isolate the second part of the contact constraint and attempt to find a solution:

\[\begin{equation} \begin{split} & \upsilon_{c,obj} = G^T \upsilon \\ & \upsilon = (G^T)^\# \upsilon_{c,obj} + \mathcal{N}(G^T)\beta \end{split} \label{eq:cc2} \end{equation}\]

In this case, if $\mathcal{N}(G) \ne \emptyset$ we say that the grasp is indeterminate. his occurs because it is impossible to define a twist at the contact points that fully determines the twist of the object.

In other words, the contact constraints are insufficient to fully constrain the object’s motion.

Graspability

Graspability is arguably the most important property to analyze. If we take Equation $\ref{eq:vw_final}$, and attempt to solve for the contact wrenches:

\[\begin{equation} \begin{split} & \mathcal{W} = - G\lambda \\ & \lambda = -G^\# \mathcal{W} + \mathcal{N}(G)\gamma \end{split} \label{eq:graspability} \end{equation}\]

In this case, the object is graspable if and only if $\mathcal{N}(G) \ne \emptyset$.

We can build an intuition for this by considering the opposite case. If the null space of $G$ is empty, then there is only one solution for the contact forces $\lambda$ that can resist the external wrench $\mathcal{W}$. However, this unique solution does not necessarily satisfy the Friction Cone constraint, meaning that contact slippage may still occur.

To better understand this, we can decompose the contact forces into two separate components:

  • $\lambda_{ext}$ - the component capable of producing movement, which opposes the external wrench.
  • $\lambda_{int}$ - the internal force component, which does not produce movement but affects the internal forces within the object.

With this decomposition, we can rewrite the previous equation as:

\[\begin{equation} \lambda_{ext} + \lambda_{int} = -G^\# \mathcal{W} + \mathcal{N}(G)\gamma \rightarrow \biggl\{ \begin{array}{ll} \lambda_{ext} = -G^\# \mathcal{W} \\ \lambda_{int} = \mathcal{N}(G)\gamma \end{array} \label{eq:force_duality} \end{equation}\]

This implies that if $\mathcal{N}(G) \ne \emptyset$, w we can always resist the external wrench while also adjusting the internal forces (effectively “squeezing” the object) to ensure that the total contact wrench remains within the Friction Cone.

Defectiveness

Finally, by analyzing Equation $\ref{eq:hand_equi}$ and solving it with respect to the contact forces:

\[\begin{equation} \begin{split} & \tau = J^T\lambda \\ & \lambda = (J^T)^\# \tau + \mathcal{N}(J^T)\delta \end{split} \label{eq:defectiveness} \end{equation}\]

In this case, if $\mathcal{N}(J^T) \ne \emptyset$, we say that the grasp is defective. This occurs because it is impossible to set a joint torque $\tau$ that correctly maps to the desired contact forces.

This equation indicates that a given $\tau$ maps to a set of contacts forces shaped by the null space of $J^T$. In a physical system, this means that factors other than joint torques influence the contact forces.

Evaluation Metrics

In this section, we will apply the knowledge gained in the previous sections to summarize some of the most promising grasp evaluation metrics. This discussion is based on the review article [3], which provides a comprehensive summary of grasp evaluation methods up to 2015. We did not find any subsequent work proposing new evaluation metrics beyond those covered in this article.

Largest-minimum resisted wrench ($\epsilon$-metric)

This metric, first proposed in [4], is presented first because it is arguably the most widely used and comprehensive grasp evaluation metric. It is even the metric of choice for GraspIt!, the most widely used grasp-specific simulator. Although initially introduced in 2000 [5], this simulator is still widely used today due to its robust metric calculation.

We believe that, at least for this metric, a full derivation is essential to understand its assumptions and properties. However, before diving into the derivation, we should first explain its purpose.

This metric essentially determines the Grasp Wrench Space (GWS), which encloses all the wrenches that can be applied to an object given a set of contact points. The metric is defined as the minimum distance between the origin and the boundary of the GWS. This means that the metric is directly proportional to the maximum wrench that can be resisted in any direction by the contact points.

Recall that the set of contact forces that satisfy the friction cone constraint can be approximated by:

\[\begin{equation} f_i \approx \sum_{j=1}^m \alpha_j f_{i,j} = \text{pos}(\{f_{i,1}, f_{i,2}, \dots, f_{i,m}\}) \label{eq:aprox} \end{equation}\]

where $a_j \ge 0$, $\sum_{j=1}^m a_j = 1$ and $f_{i,j}$ represents a force applied at the ii-th contact point on the boundary of the friction cone.

For the following derivation, we assume that the applied forces are limited and normalized to 1. The original paper and review article provide a more detailed justification for this assumption. However, in summary, it simplifies the derivation without any loss of generality. Additionally, it introduces a practical constraint—limiting the power source of each finger, which aligns with real-world physical limitations.

If we assume that each force at the edge of the friction cone generates a wrench on the object, we can define the total wrench acting on the object as:

\[\begin{equation} \mathcal{W}_{obj} = \sum_{i=1}^n \mathcal{W}_i = \sum_{i=1}^n \alpha_{i,j}\mathcal{W}_{i,j} \label{eq:aprox_wrench} \end{equation}\]

where $a_j \ge 0$ and $\sum_{j=1}^m a_j = 1$. The limitation of the power source for each finger arises from the constraint that the sum of $\alpha$ values equals one for each contact.

This allows us to define the set of all possible wrenches that can be produced by the contacts, denoted as $\mathcal{P}$, using the Convex Hull of the Minkowski Sum.

\[\begin{equation} \mathcal{P} = CH(\bigoplus^n_{i=1}\{\mathcal{W}_{i,1}, \mathcal{W}_{i,2}, \dots, \mathcal{W}_{i,m}\}) \label{eq:wrench_set} \end{equation}\]

Equation $\ref{eq:wrench_set}$ applies the Minkowski sum to the sets containing the wrench extremas. Essentially, this sum selects one element from each set, sums them, and repeats this process for every possible combination, resulting in $m^n$ possibilities. This generates all possible extremal wrenches that can be applied to the object.

By applying a convex hull to this resultant set, we define the entire set of wrenches that lie within these extremas. Thus, $\mathcal{P}$ represents the set of all wrenches that can be exerted on the object.

We can now use this set to define the metric, or introduce an additional constraint that assumes all fingers share a common power source. In this case, the constraint changes from $\sum_i^n |f_i|\le 1$ instead of the former $|f_i|\le 1$. With this new constraint, we obtain a more restricted definition of $\mathcal{P}$, where $\sum_{i=1}^n\sum_{j=1}^m \alpha_{i,j} \le 1$, instead of the former $\sum_{j=1}^m \alpha_{i,j} \le 1$.

Now, instead of performing the Minkowski sum, we simply unify all the sets and compute their convex hull:

\[\begin{equation} \mathcal{P} = CH(\bigcup^n_{i=1}\{\mathcal{W}_{i,1}, \mathcal{W}_{i,2}, \dots, \mathcal{W}_{i,m}\}) \label{eq:gws} \end{equation}\]

This approach is more computationally efficient, as it results in $m^n$. Additionally, it defines a more restrictive set. The set $\mathcal{P}$ is commonly referred to as the Grasp Wrench Space (GWS).

If we define $\partial \mathcal{P}$ as the boundary of GWS, we can express the $\epsilon$-metric as:

\[\begin{equation} Q_{LRW} = \min_{\mathcal{W} \in \partial \mathcal{P}} |\mathcal{W}| \label{eq:e-metric} \end{equation}\]

If we imagine a 6D sphere in the wrench space, $Q_{LRW}$ represents the maximum radius of a 6D sphere centered at the origin that is fully contained within $\mathcal{P}$.

This metric is widely used because it accounts for the friction cone constraint as well as real-world limitations. Additionally, it ensures a robust grasp that can resist generic wrenches from all directions.

The only limitation of this metric is that, due to the presence of moments, it is frame-dependent.

Volume of Grasp Wrench Space

Given the formulation in the previous section, one approach to defining a frame-agnostic metric is to measure the volume of the convex hull $\mathcal{P}$.

\[\begin{equation} Q_{VOP} = Volume(\mathcal{P}) \label{eq:vol} \end{equation}\]

A larger volume of the GWS generally indicates a greater ability to resist external wrenches. However, this metric does not account for the minimum resisted wrench. As a result, a grasp might perform poorly in a specific direction despite having good overall performance.

Minimum Singular Value of G

This metric depends solely on the algebraic properties of the grasp matrix $G \in \mathbb{R}^{6\times n}$. This matrix has six singular values, and when at least one of these values approaches zero, the grasp loses the ability to resist wrenches in a specific direction, leading to a singularity.

The metric is then defined as:

\[\begin{equation} Q_{MSV} = \sigma_{min}(G) \label{eq:min_sing} \end{equation}\]

Even if there are no zero singular values, the smallest singular value provides insight into how close a given grasp is to singularity, making it a useful metric for evaluating grasp robustness.

One issue I find with this metric (as well as all other metrics based on the singular values of $G$), is a potential contradiction with the theoretical graspability condition. From the theory sections, we learned that a grasp is considered graspable if and only $\mathcal{N}(G) \ne \emptyset$. However, the existence of a null space inherently implies that there is at least one zero singular value.

This suggests that the metric is trying to move away from the necessary grasp condition established in theory, which I find highly counterintuitive. I am unable to determine which perspective is incorrect, as both seem to be based on solid arguments. Additionally, I do not fully understand whether it is possible that there is no contradiction at all.

Volume of the Ellipsoid in the Wrench Space

If we assume a unit-radius sphere in the force domain, the grasp matrix $G$ transforms this sphere into an ellipsoid in the wrench space.

To evaluate the global contribution of all contact forces in all directions, we can compute the volume of this wrench ellipsoid:

\[\begin{equation} Q_{VEW} = \sqrt{\det(GG^T)} = \sigma_1 \sigma_2 \sigma_3 \sigma_4 \sigma_5 \sigma_6 \label{eq:VEW} \end{equation}\]

Once again, zero singular values result in a poor metric, even though grasp theory requires a defined null space for $G$.

Grasp Isotropy Index

This evaluation metric assesses whether all contact forces contribute equally to resisting applied wrenches. It is computed as the ratio of the minimum singular value to the maximum singular value:

\[\begin{equation} Q_{G11} = \frac{\sigma_{min}(G)}{\sigma_{max}(G)} \label{eq:g11} \end{equation}\]

This metric is defined within the range $[0, 1]$ where zero represents the singularity and one is full isotropy. However, like the previous singular value-based metrics, this metric also contradicts the graspability condition established in theory.

Distance between Centroid and CM

The effects of gravity and inertia typically act at the center of mass (CM) of the object. To minimize their influence, the centroid of the contact points should be as close as possible to the center of mass.

Thus, this metric is defined as:

\[\begin{equation} Q_{DCC} = \text{Dist}(CM, C) \label{eq:centroid} \end{equation}\]

Orthogonality

Some studies suggest that humans tend to prefer grasping positions that align the hand with the principal axis of inertia of an object. Based on this observation, the following metric is defined:

\[\begin{equation} Q_0 = \begin{cases} \delta \text{ if } \delta < \frac{\pi}{4} \\ \frac{\pi}{2} - \delta \text{ if } \frac{\pi}{4} < \delta < \frac{\pi}{2} \\ \delta - \frac{\pi}{2} \text{ if } \frac{\pi}{2} < \delta < \frac{3\pi}{4} \\ \pi - \delta \text{ if } \delta > \frac{3\pi}{4} \end{cases} \label{eq:orth} \end{equation}\]

where $\delta = \arccos(z.u)$, $z$ is the vector perpendicular to the palm of the hand and $u$ is the main inertial axis. This metric is optimal when it is close to zero, meaning that the hand is either parallel or perpendicular to the principal axis.

Positions of the finger joints (Human-like)

This criterion evaluates how far the robot’s joints are from their joint limits, encouraging the robot to maintain a comfortable and stable configuration. A metric to assess this can be defined as:

\[\begin{equation} Q_{PFJ} = \sum_{i=1}^l \left( \frac{\theta_i - \theta_{0,i}}{\theta_{max,i} - \theta_{min,i}} \right)^2 \label{eq:hum} \end{equation}\]

Studies suggest that humans prefer to have similar flexion values across their finger joints during grasping. One way to evaluate this human-likeness is through the following metric:

\[\begin{equation} Q_{PFJ} = \sum_{i=1}^l \max_j \frac{|\theta_{i,j} - \theta_{i,j}|}{\theta_{max,j} - \theta_{min,j}} \label{eq:hum2} \end{equation}\]

Extras: Jacobian-based

Although these metrics are more relevant for manipulation than for grasping, they are still useful to define because they share the same formulation and have a similar interpretation to the $G$-based metrics.

We begin by defining the Hand-Object Jacobian:

\[\begin{equation} J_{ho} = (G^T)^\# J \label{eq:hand_jacob} \end{equation}\]

The metrics based on the Hand-Object Jacobian are defined as follows:

\[\begin{equation} Q_{DSC} = \sigma_{min}(J_{ho}) \label{eq:dsc} \end{equation}\] \[\begin{equation} Q_{VME} = \sqrt{\det{J_{ho}J_{ho}^T}} = \sigma_1 \sigma_2 \dots \sigma_r \label{eq:vme} \end{equation}\] \[\begin{equation} Q_{UOT} = \frac{\sigma_{min}}{\sigma_{max}} \label{eq:uot} \end{equation}\]

Conclusions

This blog post presents a comprehensive study in the field of robotic grasping, with the ultimate goal of defining useful evaluation metrics for assessing grasp quality.

To achieve this, we first established the algebraic foundations necessary to understand these methodologies. We introduced multiple grasp evaluation metrics, with the $\epsilon$-metric emerging as the most significant. This metric quantifies the largest fully enclosed ball within the Grasp Wrench Space (GWS), providing a robust measure of grasp stability.

The next step is to explore implementation strategies for these metrics in MuJoCo and/or investigate whether we can integrate the GraspIt! functions that already perform these calculations.

This post is licensed under CC BY 4.0 by the author.