If you look at your own hand, you’ll realize it is a marvel of engineering. You can wrap your fingers around a heavy hammer to swing it (a power grasp), but you can also delicately hold a key between your thumb and index finger to unlock a door (a pinch grasp), or use three fingers to manipulate a pen (a precision grasp).
For robots, replicating this versatility is a massive challenge. While we have built sophisticated multi-fingered robotic hands (like the Shadow Hand or the Allegro Hand), teaching them how to use that dexterity remains difficult. The bottleneck is often data. To train a robot to grasp anything, we need massive datasets containing millions of examples of stable, physically realistic grasps.
Current methods for generating this data tend to be “lazy.” They often converge on the simplest solution: wrapping the whole hand around the object. This leaves us with robots that hold everything like a club, missing out on the refined manipulation required for real-world tasks.
In this post, we will dive into GraspQP, a research paper that proposes a new method to synthesize diverse, physically grounded grasps. By enforcing strict physical constraints through a Differentiable Quadratic Program (QP) and using a clever “population-aware” optimizer, the authors generate a dataset that goes far beyond the standard power grasp.

The Problem: Why is Grasp Synthesis Hard?
To understand the contribution of GraspQP, we first need to look at how robots currently learn to grasp.
Traditional methods fall into two buckets:
- Sampling-based: The computer randomly guesses a hand position and checks if it works. This is slow and inefficient for complex hands with many joints (degrees of freedom).
- Analytical methods: These use geometry and physics to calculate stability. While accurate, they are computationally heavy and hard to optimize using modern deep learning techniques.
Recent advancements have moved toward gradient-based optimization. Imagine the “quality” of a grasp is a score. If we can calculate the gradient (the slope) of that score with respect to the finger positions, we can use gradient descent to slide the fingers into a better position automatically.
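To make the idea concrete, here is a minimal sketch of gradient descent on a toy "grasp score". It assumes a single fingertip and a unit sphere as the object, with the energy defined as the squared gap between the fingertip and the surface. The function names and the energy itself are illustrative assumptions, not the paper's formulation.

```python
import math

def distance_energy(p, radius=1.0):
    """Squared gap between fingertip p and the surface of a sphere at the origin."""
    return (math.sqrt(sum(c * c for c in p)) - radius) ** 2

def distance_energy_grad(p, radius=1.0):
    """Analytic gradient of distance_energy with respect to p."""
    norm = math.sqrt(sum(c * c for c in p))
    scale = 2.0 * (norm - radius) / norm
    return [scale * c for c in p]

def descend(p, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly step against the gradient."""
    for _ in range(steps):
        g = distance_energy_grad(p)
        p = [c - lr * gc for c, gc in zip(p, g)]
    return p

# Start away from the object and let the gradient pull the fingertip to the surface.
fingertip = descend([2.0, 1.0, 0.5])
print(distance_energy(fingertip))
```

A real hand has dozens of joint angles instead of three coordinates, but the mechanism is the same: as long as the score is differentiable, the optimizer knows which way to move.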
However, to make the math work (i.e., to make it differentiable), researchers often cut corners. They might ignore friction or simplify the definition of “force closure” (the physical condition that ensures an object won’t slip). The result? The optimization finds “fake” grasps that look good to the algorithm but fail in the real world, or it finds the same boring power grasp over and over again.
The Solution: GraspQP
The GraspQP paper introduces a framework that doesn’t compromise on physics. It combines a rigorous definition of grasp stability with a modified optimizer designed to hunt for diversity.
Here is the high-level pipeline:

As shown in Figure 2, the process starts with a coarse initialization (a rough guess). The system then enters an optimization loop. It evaluates the grasp using a composite energy function—a mathematical way of scoring how “bad” the grasp is (lower is better).
The total energy function looks like this:

Where:
- \(E_{FC}\): Force Closure Energy. This is the core innovation (more on this below).
- \(E_{dis}\): Distance Energy. Pulls the fingers closer to the object surface.
- \(E_{reg}\): Regularization. Prevents the hand from moving into impossible positions or colliding with itself.
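One plausible way to read these three terms together is as a weighted sum. The weights \(w_{dis}\) and \(w_{reg}\) below are hypothetical symbols introduced for illustration; the paper may weight or group the terms differently.

```latex
E = E_{FC} + w_{dis}\, E_{dis} + w_{reg}\, E_{reg}
```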
Let’s break down the two main contributions: the rigorous Force Closure metric and the MALA* Optimizer.
1. Differentiable Force Closure via Quadratic Programming
The most critical part of grasping is Force Closure. In simple terms, a grasp has force closure if the fingers can apply forces to resist any external push or twist on the object without it slipping.
Mathematically, this relates to the Wrench Space. A “wrench” combines force and torque. If your fingers can produce a set of wrenches that “positively span” the entire 6-dimensional wrench space (3 forces + 3 torques), you have a stable grasp.
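As a rough illustration of what a wrench is, the sketch below builds the 6-D wrench of each contact (force stacked on top of torque, with torque computed as position cross force) and stacks the wrenches as the columns of a matrix. The helper names are hypothetical, and friction is ignored for brevity.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def contact_wrench(position, force):
    """6-D wrench [force, torque] of a force applied at `position` (object frame)."""
    return list(force) + cross(position, force)

def grasp_matrix(positions, forces):
    """Stack one wrench per contact as the columns of a 6 x n matrix."""
    wrenches = [contact_wrench(p, f) for p, f in zip(positions, forces)]
    return [[w[row] for w in wrenches] for row in range(6)]

# Two fingers pinching an object along the x axis with equal, opposite forces.
positions = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
forces = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
G = grasp_matrix(positions, forces)

# The wrenches cancel, but that alone is not force closure: these two
# frictionless contacts cannot resist a push along y, for example.
net_wrench = [sum(row) for row in G]
print(net_wrench)
```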
The “Lazy” Approach vs. The GraspQP Approach
Previous differentiable methods often used a simplified condition: “The sum of contact forces must equal zero.” While necessary, this isn’t sufficient: it can be met by contacts that exert almost no force, so the optimizer has no incentive to squeeze firmly, which often leads to unstable contacts.
GraspQP formulates this condition strictly. It asks: Can we find a set of force coefficients (\(\gamma\)) such that the fingers actively squeeze the object?
The authors propose this energy formulation:

Notice the constraint: \(u \ge \hat{\gamma}_i \ge 1\).
- \(\hat{\gamma}_i \ge 1\): This forces the fingers to apply a minimum non-zero force. It prevents “ghost” contacts where the math says there is a touch, but the force is practically zero.
- \(u \ge \hat{\gamma}_i\): This sets an upper bound, acknowledging that real motors have torque limits.
Solving with a QP
Here lies the challenge: This formulation involves hard inequality constraints (greater than 1, less than \(u\)). Standard gradient descent struggles with hard walls.
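The researchers solve this by formulating the energy calculation as a Quadratic Program (QP). A QP minimizes a quadratic objective subject to linear constraints; when the objective is convex, as it is here, the problem is well understood and can be solved efficiently and reliably.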
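To give a feel for what such a QP looks like, here is a small sketch that minimizes the net wrench `0.5 * ||G @ gamma||^2` subject to the box constraint `1 <= gamma_i <= u`, using projected gradient descent. The objective, the solver, and the planar three-contact example are simplifying assumptions for illustration; the paper's exact QP and solver are not reproduced here.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def solve_box_qp(G, upper, steps=500, lr=0.1):
    """Minimise 0.5 * ||G @ gamma||^2 subject to 1 <= gamma_i <= upper.

    Projected gradient descent: take a gradient step, then clip into the box.
    """
    Gt = transpose(G)
    gamma = [1.0] * len(G[0])
    for _ in range(steps):
        grad = matvec(Gt, matvec(G, gamma))
        gamma = [min(upper, max(1.0, g - lr * d)) for g, d in zip(gamma, grad)]
    residual = matvec(G, gamma)
    return gamma, 0.5 * sum(r * r for r in residual)

# Planar toy: three unit contact normals as columns (forces only, no torques).
G = [[1.0, -0.6, -0.6],
     [0.0,  0.8, -0.8]]
gamma, energy = solve_box_qp(G, upper=3.0)
print(gamma, energy)
```

Without the lower bound, `gamma = 0` would trivially give zero energy. With it, the solver has to find forces of at least unit magnitude that still balance, which here means the first finger pushes harder than the other two.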

Because this optimization problem is convex, we can use the Karush-Kuhn-Tucker (KKT) conditions, the optimality conditions that any solution of the QP must satisfy, to compute gradients. Differentiating those conditions implicitly tells us how the optimal forces change as the contact points move. This means we can solve the physics problem strictly inside the QP, get the optimal force distribution, and then pass the gradient of that result back to the hand’s joints to update the pose.
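The smallest example of differentiating through a QP is projection onto an interval. The sketch below assumes a one-variable QP with a parameter `theta` and bounds 1 and 3 (all hypothetical), reads the derivative of the solution off the KKT conditions, and checks it against finite differences.

```python
def solve_qp(theta, lower=1.0, upper=3.0):
    """argmin_x 0.5 * (x - theta)^2  s.t.  lower <= x <= upper (closed form: clip)."""
    return min(upper, max(lower, theta))

def dsolution_dtheta(theta, lower=1.0, upper=3.0):
    """Derivative of the QP solution with respect to theta, from the KKT conditions.

    No bound active: stationarity gives x* = theta, so the derivative is 1.
    A bound active:  x* is pinned to that bound, so the derivative is 0.
    """
    return 1.0 if lower < theta < upper else 0.0

def finite_difference(theta, eps=1e-6):
    """Central finite difference of the solution, used only as a check."""
    return (solve_qp(theta + eps) - solve_qp(theta - eps)) / (2 * eps)

for theta in (0.5, 2.0, 3.5):
    print(theta, solve_qp(theta), dsolution_dtheta(theta), finite_difference(theta))
```

The same pattern carries over to larger QPs: constraints that are active at the solution block the gradient in their direction, while the remaining variables respond smoothly to changes in the problem data.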
Finally, to ensure the grasp is robust against disturbances in all directions (not just one), the authors add a term involving the singular values (\(\sigma\)) of the wrench matrix:

By maximizing the product of the singular values (\(\prod \sigma_i\)), they maximize the volume of the wrench space, ensuring the grasp is strong in every direction.
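The quantity being maximized can be sketched in a few lines. For a wide matrix with full row rank, the product of its singular values equals `sqrt(det(G @ G.T))`, so the example below uses that identity in plain Python to stay dependency-free (`numpy.linalg.svd` would be the usual choice). The 2 x 3 matrices are toy planar examples, not real grasps.

```python
import math

def gram(G):
    """G @ G.T for a matrix stored as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in G] for r1 in G]

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det

def singular_value_product(G):
    """Product of singular values of a wide matrix G via sqrt(det(G @ G.T))."""
    return math.sqrt(max(determinant(gram(G)), 0.0))

balanced = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]    # covers both directions well
lopsided = [[1.0, 1.0, 1.0], [0.0, 0.1, -0.1]]    # weak along the second axis
degenerate = [[1.0, -1.0, 1.0], [0.0, 0.0, 0.0]]  # cannot act along the second axis
for name, G in [("balanced", balanced), ("lopsided", lopsided), ("degenerate", degenerate)]:
    print(name, singular_value_product(G))
```

A single weak direction drags the whole product down, and a missing direction sends it to zero, which is why this term penalizes grasps that are strong in only some directions.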
2. The MALA* Optimizer
Even with a perfect energy function, optimization can fail. A common issue is mode collapse. If you ask an algorithm to find a “good” grasp, it will often find the single easiest good grasp (usually a power grasp) every time, regardless of how you initialize it.
To generate a diverse dataset (pinches, tripod grasps, etc.), the optimizer needs to explore.
The authors use the Metropolis-Adjusted Langevin Algorithm (MALA), a method that adds noise to each gradient step and then accepts or rejects the resulting move with a Metropolis test, which helps it explore. They modify it into MALA* (MALA-Star) by making the optimizer “population-aware.”
Instead of optimizing one grasp at a time, they optimize a batch of grasps simultaneously. This allows them to use the statistics of the whole group to fix individual failures:
- Dynamic Resetting: If a specific grasp in the batch has an energy score significantly worse than the rest of the group (it’s stuck in a bad local minimum), the system kills it and resets it to a new random configuration.
- Adaptive Temperature Scaling: In physics simulations, “temperature” controls how much randomness (noise) is added. If a grasp is performing poorly compared to the group, the algorithm turns up the heat (\(T_i\)). This increases the randomness, helping the grasp “jump” out of the bad spot.
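The two mechanisms above can be sketched on a toy problem. The code below runs a standard MALA step on a 1-D double-well energy, then uses batch statistics to reset outliers and rescale per-sample temperatures. The z-score reset threshold, the exponential temperature rule, and every constant are assumptions made for illustration; the paper's actual rules may differ.

```python
import math
import random

def energy(x):
    """Toy double-well energy with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def grad(x):
    return 4.0 * x * (x * x - 1.0)

def log_q(x_to, x_from, step, temp):
    """Log density (up to a constant) of the Langevin proposal x_from -> x_to."""
    mean = x_from - step * grad(x_from)
    return -((x_to - mean) ** 2) / (4.0 * step * temp)

def mala_step(x, step, temp, rng):
    """One Metropolis-adjusted Langevin step targeting exp(-energy / temp)."""
    proposal = x - step * grad(x) + math.sqrt(2.0 * step * temp) * rng.gauss(0.0, 1.0)
    log_alpha = (-(energy(proposal) - energy(x)) / temp
                 + log_q(x, proposal, step, temp) - log_q(proposal, x, step, temp))
    return proposal if rng.random() < math.exp(min(0.0, log_alpha)) else x

def population_update(xs, temps, rng, reset_k=2.0, t_min=0.5, t_max=4.0):
    """Use batch statistics to reset outliers and rescale temperatures."""
    energies = [energy(x) for x in xs]
    mean = sum(energies) / len(energies)
    std = math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
    for i, e in enumerate(energies):
        z = (e - mean) / (std + 1e-9)
        if z > reset_k:
            xs[i] = rng.uniform(-3.0, 3.0)  # dynamic reset (assumed rule)
            temps[i] = 1.0
        else:
            # hotter when worse than the batch, cooler when better (assumed rule)
            temps[i] = min(t_max, max(t_min, math.exp(0.5 * z)))
    return xs, temps

def run(n=16, iters=300, step=0.01, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    temps = [1.0] * n
    for it in range(iters):
        xs = [mala_step(x, step, t, rng) for x, t in zip(xs, temps)]
        if it % 10 == 9:
            xs, temps = population_update(xs, temps, rng)
    return xs, temps

xs, temps = run()
print(sorted(round(x, 2) for x in xs))
```

Because the batch is optimized together, samples typically end up spread across both wells instead of all collapsing into one, which is the 1-D analogue of finding pinch and precision grasps alongside power grasps.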
Experimental Setup
To test this, the researchers used Isaac Lab for simulation. They tested on a dataset of 50 objects using five different robotic hands, ranging from simple two-finger grippers to the highly complex Shadow Hand.

They measured success using two key metrics:
- Unique Grasp Rate (UGR): The percentage of generated grasps that are both stable (pass a simulation shake test) and geometrically distinct from one another.
- Entropy (H): A measure of statistical diversity in the joint angles and wrist poses. Higher entropy means the hand is using a wider variety of configurations.
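Both metrics can be approximated in a few lines. The sketch below assumes each grasp is a flat vector of pose parameters, treats two grasps as distinct when their Euclidean distance exceeds a threshold, and measures entropy with a 1-D histogram. These definitions, the threshold, and the bin count are simplifications chosen for illustration, not the paper's exact evaluation protocol.

```python
import math

def unique_grasp_rate(grasps, stable, min_dist):
    """Fraction of grasps that are stable and at least min_dist from every kept grasp."""
    kept = []
    for g, ok in zip(grasps, stable):
        if ok and all(math.dist(g, k) >= min_dist for k in kept):
            kept.append(g)
    return len(kept) / len(grasps)

def histogram_entropy(values, lo, hi, bins=8):
    """Shannon entropy (in nats) of a 1-D histogram over [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[max(0, min(bins - 1, idx))] += 1
    total = len(values)
    return -sum(c / total * math.log(c / total) for c in counts if c)

grasps = [[0.0, 0.0], [0.01, 0.0], [1.0, 1.0], [2.0, 0.0]]
stable = [True, True, True, False]  # e.g. the outcome of a simulated shake test
ugr = unique_grasp_rate(grasps, stable, min_dist=0.1)

collapsed = histogram_entropy([0.5] * 8, lo=0.0, hi=8.0)
spread = histogram_entropy([0.5 + i for i in range(8)], lo=0.0, hi=8.0)
print(ugr, collapsed, spread)
```

The second grasp is stable but nearly identical to the first, and the fourth is distinct but unstable, so only two of the four count. Likewise, a joint angle that always takes the same value has zero entropy, while one spread evenly across all bins reaches the maximum.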
Results: Quality and Diversity
The results show that adding rigorous physics and smart optimization pays off.
Grasp Quality and Diversity
Table 1 below compares GraspQP against state-of-the-art baselines like DexGraspNet and GenDexGrasp.

Key Takeaways from the Data:
- Shadow Hand Performance: Look at the “Shadow UGR” column. GraspQP achieves a 49% unique grasp rate with 4 contact points, compared to 36% for DexGraspNet (MALA*). That is a gain of 13 percentage points in the ability to find complex, valid grasps for a high-DoF hand.
- Diversity (Entropy): The Entropy (\(H\)) scores are consistently higher for GraspQP (bolded values), indicating that the method finds a wider variety of hand poses.
- Penetration Depth: GraspQP generally has slightly higher penetration depth (fingers clipping into objects) because of the aggressive force requirements, but it remains within realistic limits (\(<3\) mm).
Ablation Study: Does MALA* Matter?
Is the improvement coming from the QP or the Optimizer? Table 2 shows it’s a combination of both.

When the authors took a baseline method (GenDexGrasp) and swapped only its standard optimizer for MALA*, performance jumped significantly (+7.0% UGR). Conversely, relaxing the strict force closure constraints (moving from formulation iii to ii) caused performance to drop. This indicates that both the strict physics constraints and the population-aware optimization contribute to the result.
Scaling vs. Time
One might argue that because solving a QP is slower than a simple calculation, the method is inefficient. However, the goal is dataset generation, where quality matters more than real-time speed.

Figure 4 illustrates this perfectly. The blue line (DexGraspNet) saturates. Even if you give it 512 attempts (seeds), it hits a ceiling of about 60 unique grasps. It simply runs out of ideas.
GraspQP (the orange line) keeps climbing. With 128 seeds, it reaches nearly 80 unique grasps. It is far more sample-efficient, meaning a smaller number of optimization attempts yields a richer dataset.
Visualizing the Dexterity
Numbers are great, but in robotics, seeing is believing. The “Contact Heatmaps” generated by the researchers vividly show the diversity of the grasps.
For the Shadow Hand (a hand very similar to a human hand), we can see distinct contact patterns for different grasp types:

- Default/Power: The contacts are spread over the palm and proximal phalanges (the base of the fingers).
- Precision: The contacts shift almost entirely to the fingertips.
- Pinch: The heat is concentrated strictly on the thumb and index finger.
We see similar patterns for the Allegro Hand:

And even for simpler grippers like the Robotiq 3F:

These heatmaps indicate that GraspQP isn’t just finding “variations on a theme”; it is isolating distinct mechanical strategies for holding objects.
Conclusion
GraspQP represents a significant step forward in robotic grasp synthesis. By refusing to compromise on the physics of Force Closure and implementing it via a differentiable Quadratic Program, the researchers ensured that generated grasps are physically robust. By pairing this with the MALA* optimizer, they ensured that the system explores the full landscape of possibilities, rather than getting stuck in the “power grasp” trap.
For students and researchers in robotics, this work highlights a crucial lesson: Differentiability is powerful, but physics is non-negotiable. Simply relaxing constraints to make math easier can lead to data that lacks the richness required for real-world tasks.
The result of this work is a new, large-scale dataset for 5,700 objects, providing a rich training ground for the next generation of dexterous robots that might finally be able to handle our world as skillfully as we do.