Introduction
Imagine you are carrying a large, heavy box through a doorway. To get through, you might need to use your hip to nudge the door open while balancing the box with your arms, or perhaps you use a foot to kick a doorstop out of the way. As humans, we perform this kind of “loco-manipulation”—coordinating locomotion and manipulation simultaneously—effortlessly. We treat our limbs as versatile tools; a leg is usually for walking, but it can momentarily become a manipulator if the task demands it.
For robots, however, this fluid coordination is a massive computational headache. Most robotic systems use legs strictly for transport and arms strictly for manipulation. Breaking this rigid assignment requires a control system that can dynamically reassign limb roles on the fly without falling over.
In this post, we are diving deep into ReLIC (Reinforcement Learning for Interlimb Coordination), a new framework presented by researchers at the RAI Institute, UC Berkeley, and Cornell. ReLIC allows a quadruped robot (specifically a Boston Dynamics Spot with an arm) to perform complex tasks like carrying yoga balls, closing drawers with its feet, and manipulating large boxes by dynamically mixing model-based control with reinforcement learning.

As shown in Figure 1 above, the core innovation here is flexibility. The robot isn’t just “walking” or “grasping”; it is coordinating an arm (green) and a selected leg (red) to handle an object, while the remaining three legs (purple) handle the complex physics of balancing and moving.
The Challenge: Why is Loco-Manipulation So Hard?
To understand why ReLIC is a breakthrough, we first need to look at why this problem is difficult.
In traditional robotics, we often see a “divide and conquer” approach. If a mobile manipulator needs to pick up an object, it usually:
- Drives to the location (Locomotion).
- Stops.
- Picks up the object (Manipulation).
- Drives away.
This is safe, but slow and limited. True loco-manipulation—doing both at once—introduces dynamic coupling. The forces exerted by the arm affect the robot’s balance. Conversely, the gait of the legs introduces vibrations and movements that the arm must compensate for.
If you add the requirement that a leg should stop walking and start manipulating (e.g., closing a drawer), the problem explodes in complexity. The robot effectively changes its topology from a stable four-legged crawler to a precarious three-legged balancer. Previous methods mostly relied on pre-defined heuristics (hard-coded rules for specific tasks) or model-based trajectory optimization, which requires precise knowledge of the environment and often struggles with the messy reality of unstructured worlds.
The ReLIC Architecture
The researchers propose a hierarchical framework that splits the problem into two levels: the Task Level (what do I want to do?) and the Command Level (how do I move my motors to do it?).
The heart of the system is the ReLIC Controller, which resides at the command level. This is not a single end-to-end neural network that takes camera pixels and outputs motor torques. Instead, it is a hybrid architecture designed to get the best of both worlds: the precision of classical control and the robustness of Reinforcement Learning (RL).

The Adaptive Controller: A Tale of Two Modules
As illustrated in Figure 2, the controller is composed of two interacting modules:
- The Model-Based (MB) Controller: This module prioritizes task success. It calculates the necessary movements for the limbs assigned to manipulation, typically using Inverse Kinematics (IK) to figure out exactly how to angle the joints to reach a specific target coordinate (a minimal IK sketch follows this list).
- The RL Controller: This module prioritizes locomotion stability. It is a neural network trained to keep the robot upright and walking, regardless of what weird contortions the manipulation limbs are performing.
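The paper does not spell out which IK solver the MB module uses, so here is a minimal damped-least-squares sketch on a toy planar two-link leg. The link lengths, the `fk`/`jac` helpers, and the target are illustrative assumptions, not the actual Spot kinematics:

```python
import numpy as np

def dls_ik_step(q, target, fk, jac, damping=0.05, step=1.0):
    """One damped least-squares IK update: nudge joints q so fk(q) approaches target."""
    err = target - fk(q)                     # task-space position error
    J = jac(q)                               # task Jacobian at the current configuration
    JJt = J @ J.T
    dq = J.T @ np.linalg.solve(JJt + damping**2 * np.eye(JJt.shape[0]), err)
    return q + step * dq

# Toy planar 2-link leg (0.3 m links) used only to make the sketch runnable.
L1, L2 = 0.3, 0.3
fk = lambda q: np.array([L1*np.cos(q[0]) + L2*np.cos(q[0]+q[1]),
                         L1*np.sin(q[0]) + L2*np.sin(q[0]+q[1])])
jac = lambda q: np.array([[-L1*np.sin(q[0]) - L2*np.sin(q[0]+q[1]), -L2*np.sin(q[0]+q[1])],
                          [ L1*np.cos(q[0]) + L2*np.cos(q[0]+q[1]),  L2*np.cos(q[0]+q[1])]])

q = np.array([0.3, 0.6])
for _ in range(50):
    q = dls_ik_step(q, np.array([0.35, 0.25]), fk, jac)  # iterate toward the foot target
```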
Dynamic Limb Assignment
The “magic” happens in how these two combine. The system uses a binary mask, denoted as \(m\).
- If \(m=1\) for a specific limb, that limb is in Manipulation Mode.
- If \(m=0\), that limb is in Locomotion Mode.
The final action sent to the robot’s motors is a blended composition. The manipulation limbs follow the precise Model-Based controller, while the locomotion limbs (and the overall body balance) are governed by the RL policy. This decoupling allows the robot to seamlessly switch roles. A leg can be a walker one second and a pusher the next.
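The paper's exact composition rule isn't reproduced here, but conceptually the per-limb mask \(m\) selects, joint by joint, which controller's target reaches the motors. A minimal sketch of that idea, where the array shapes, leg ordering, and the restriction to leg joints are assumptions for illustration:

```python
import numpy as np

NUM_LEGS, JOINTS_PER_LEG = 4, 3

def compose_action(mask, a_mb, a_rl):
    """Blend per-leg joint targets: mask[i] = 1 means limb i follows the model-based
    controller (manipulation); mask[i] = 0 means it follows the RL policy (locomotion)."""
    per_joint = np.repeat(mask.astype(float), JOINTS_PER_LEG)  # expand per-leg mask to per-joint
    return per_joint * a_mb + (1.0 - per_joint) * a_rl

# Example: the front-right leg (index 1) is reassigned to manipulation.
m = np.array([0, 1, 0, 0])
a_mb = np.zeros(NUM_LEGS * JOINTS_PER_LEG)   # joint targets from the model-based module (e.g. IK)
a_rl = np.ones(NUM_LEGS * JOINTS_PER_LEG)    # joint targets from the RL locomotion policy
action = compose_action(m, a_mb, a_rl)       # sent to the leg motors; the arm stays model-based
```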
Learning to Walk on Three Legs
The most challenging part of this system is training the RL policy. The robot needs to learn how to walk not just with four legs (trotting), but also with three legs (bouncing) while a heavy arm and a fourth leg are waving around doing something else.
Simulation and Training
The researchers trained the policy in a physics simulator (IsaacLab). The robot is subjected to a variety of randomized conditions—different friction levels, robot masses, and external pushes—to ensure the policy is robust.
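To make the randomization concrete, here is a rough sketch of what sampling one episode's conditions could look like. The parameter names and ranges are illustrative assumptions, not the paper's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_episode_randomization():
    """Draw one set of domain-randomization parameters for a training episode."""
    return {
        "friction": rng.uniform(0.4, 1.2),          # ground friction coefficient
        "added_base_mass": rng.uniform(-1.0, 3.0),  # kg added to (or removed from) the body
        "push_velocity": rng.uniform(0.0, 1.0),     # magnitude of random external pushes, m/s
        "push_interval": rng.uniform(5.0, 10.0),    # seconds between pushes
    }
```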
The RL agent receives a rich stream of observations (one possible layout is sketched in code after this list):
- Proprioception: Joint positions, velocities, and gravity vectors.
- Commands: Where the robot should be going.
- History: What action it took previously.
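A minimal sketch of how such an observation vector might be assembled. The dimensions (12 leg joints plus 7 arm joints, a 3-dimensional velocity command) are assumptions made for illustration:

```python
import numpy as np

def build_observation(joint_pos, joint_vel, gravity_vec, command, last_action):
    """Concatenate proprioception, the command, and the action history into one policy input."""
    return np.concatenate([joint_pos, joint_vel, gravity_vec, command, last_action])

obs = build_observation(
    joint_pos=np.zeros(19),                    # 12 leg joints + 7 arm joints (assumed layout)
    joint_vel=np.zeros(19),
    gravity_vec=np.array([0.0, 0.0, -1.0]),    # gravity direction expressed in the body frame
    command=np.zeros(3),                       # e.g. desired body velocity (vx, vy, yaw rate)
    last_action=np.zeros(19),                  # previous joint targets
)
```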
Gait Regularization
If you simply tell an RL agent "don't fall over," it will often learn weird, jittery behaviors that look unnatural and might damage the hardware. To prevent this, the researchers introduced specific Gait Regularization rewards.

As shown in Figure 10, the system enforces specific contact timings.
- Four-Legged: It encourages a symmetric trotting gait.
- Three-Legged: It enforces a “cyclic bouncing gait.” When one leg is lifted for manipulation, the other three legs must cycle through a specific staggered pattern to maintain dynamic stability.
This structured approach to learning ensures that when the robot switches from four legs to three, it doesn’t just scramble; it transitions into a stable, rhythmic bounce.
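One plausible way to express such a contact-timing reward is to compare the measured foot contacts against a phase-dependent target schedule. The schedules below (diagonal trot pairs, one free leg swinging at a time on three legs) are an assumption about the general idea, not the paper's exact reward terms:

```python
import numpy as np

def desired_contacts(phase, manip_mask):
    """Desired foot-contact pattern at gait phase in [0, 1).
    Four legs: alternating diagonal trot pairs. Three legs: the free legs cycle a
    staggered 'bouncing' schedule while the manipulation leg is excluded from the gait."""
    if manip_mask.sum() == 0:
        return np.array([1, 0, 0, 1]) if phase < 0.5 else np.array([0, 1, 1, 0])
    sched = np.zeros(4, dtype=int)
    free = np.flatnonzero(manip_mask == 0)                 # the three locomotion legs
    swing_leg = free[int(phase * len(free)) % len(free)]   # one free leg swings at a time
    sched[free] = 1
    sched[swing_leg] = 0
    return sched

def gait_reward(actual_contacts, phase, manip_mask):
    """Reward agreement between measured contacts and the desired schedule (locomotion legs only)."""
    target = desired_contacts(phase, manip_mask)
    match = (actual_contacts == target)[manip_mask == 0]
    return match.mean()
```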
Bridging the Reality Gap: Sim-to-Real
One of the biggest hurdles in robotics is “Sim-to-Real” transfer. A policy that runs perfectly in a clean simulation often fails in the real world because real motors have friction, latency, and torque limits that simulators don’t perfectly model.
The ReLIC team found that standard domain randomization wasn’t enough, particularly for the high-stress scenario of three-legged walking. The solution was Motor Calibration.
They deployed an initial policy on the real robot, collected data on how the actual motors responded (Torque vs. Velocity), and compared it to the simulation.

In Figure 11, the red lines represent the ideal torque limits in the simulator. The blue dots are real data. Notice how the real robot (blue dots) often operates outside the naive red box or behaves differently near the limits. By feeding this calibrated data back into the simulation and retraining, the RL policy learned to respect the actual physical limits of the hardware, leading to much smoother and more successful real-world deployment.
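A rough sketch of the calibration idea, assuming we simply fit a linear torque-speed envelope to the logged data and clamp the simulated motors to it. The binning, quantile, and linear form are assumptions; the paper's actual calibration procedure may differ:

```python
import numpy as np

def fit_torque_envelope(velocity, torque, quantile=0.99):
    """Estimate an effective envelope tau_max(v) ~= intercept + slope*|v| from logged
    motor data, using a high quantile of |torque| within each speed bin as the boundary."""
    speed = np.abs(velocity)
    bins = np.linspace(0.0, speed.max(), 20)
    centers, limits = [], []
    for lo, hi in zip(bins[:-1], bins[1:]):
        sel = (speed >= lo) & (speed < hi)
        if sel.sum() > 10:                               # skip sparsely populated speed bins
            centers.append(0.5 * (lo + hi))
            limits.append(np.quantile(np.abs(torque[sel]), quantile))
    slope, intercept = np.polyfit(centers, limits, 1)    # slope is expected to be negative
    return intercept, slope

def clamp_torque(tau_cmd, velocity, intercept, slope):
    """Clamp a commanded torque to the calibrated envelope before applying it in simulation."""
    limit = np.maximum(intercept + slope * np.abs(velocity), 0.0)
    return np.clip(tau_cmd, -limit, limit)
```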
Talking to the Robot: Task Interfaces
A powerful controller is useless if you can’t tell the robot what to do. ReLIC supports three levels of user interaction, ranging from low-level control to high-level AI reasoning.
1. Direct Targets
This is the most straightforward method. An operator uses a joystick or pre-defined trajectory to tell the arm and leg exactly where to go. This is useful for precise, repetitive motions.
2. Contact Points
In this mode, the user points to a spot on an object in a 3D point cloud and says, “Put your foot here” or “Put your hand there.” The system then generates the trajectory to make that contact happen.
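A minimal sketch of how a selected contact point could be turned into an end-effector trajectory: back off along the surface normal to a standoff pose, then approach in a straight line. The standoff distance, step count, and example numbers are illustrative assumptions:

```python
import numpy as np

def contact_trajectory(contact_point, surface_normal, standoff=0.10, steps=20):
    """Turn a user-selected contact point into a short approach trajectory:
    start at a standoff pose along the surface normal, then move straight to the contact."""
    n = surface_normal / np.linalg.norm(surface_normal)
    start = contact_point + standoff * n
    return [start + t * (contact_point - start) for t in np.linspace(0.0, 1.0, steps)]

# Example: push a drawer front whose outward normal points toward the robot (+x).
waypoints = contact_trajectory(np.array([0.6, 0.1, 0.4]), np.array([1.0, 0.0, 0.0]))
```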

3. Language Instructions
This is the most futuristic interface. The user gives a natural language command like “Use the arm and leg to close the two open drawers.”
To achieve this, the system uses a pipeline of Vision-Language Models (VLMs), sketched schematically after the list below:
- Segment: The robot takes a picture. A model called SAM2 (Segment Anything Model) outlines all the objects.
- Reason: GPT-4o analyzes the image and the prompt. It decides which object is the “drawer” and infers where a hand or foot should push to close it.
- Execute: These inferred points are fed into the ReLIC controller as targets.
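Putting the three steps together, the pipeline might look roughly like the sketch below. The `segment_with_sam2` and `query_gpt4o` wrappers are hypothetical stand-ins for the real model calls, and the plan format is an assumption:

```python
import numpy as np

def plan_from_instruction(rgb_image, point_cloud, instruction,
                          segment_with_sam2, query_gpt4o):
    """Schematic language interface: segment the scene, ask the VLM which segments to
    contact and with which limb, then look up the corresponding 3D targets."""
    masks = segment_with_sam2(rgb_image)                # one mask per candidate object
    plan = query_gpt4o(rgb_image, masks, instruction)   # e.g. [{"limb": "arm", "pixel": (u, v)}, ...]
    targets = []
    for step in plan:
        u, v = step["pixel"]
        targets.append((step["limb"], point_cloud[v, u]))  # pixel -> 3D point in an organized cloud
    return targets                                         # fed to ReLIC as contact targets
```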

Experimental Results
The researchers pushed Spot to its limits with 12 diverse tasks designed to test different aspects of coordination.

The tasks were categorized into:
- Mobile Interlimb Coordination: Carrying big things while moving (e.g., Yoga Ball, Shipping Box).
- Stationary Interlimb Coordination: Standing on three legs while manipulating (e.g., Trash Bin, Tire Pump).
- Foot-Assisted Manipulation: Tasks where the leg helps the arm (e.g., Tool Chest, Chair).
Success Rates
The results were impressive. The graphs below compare ReLIC against two baselines: an End-to-End RL policy (trying to learn everything at once) and a Model Predictive Control (MPC) baseline.

ReLIC achieved an average success rate of 78.9%.
- ReLIC-Direct (Dark Purple) performed best, as human operators provide the most precise targets.
- ReLIC-Contact and ReLIC-Language (Lighter Purples) performed slightly lower but still demonstrated that the robot could autonomously figure out how to act.
- The Baselines (Orange and Tan) failed almost completely. The MPC baseline couldn’t handle the complex three-legged dynamics, and the End-to-End RL baseline failed to learn precise manipulation alongside locomotion.
Visualization of Flexible Coordination
One of the most visually interesting results is seeing when the robot decides to use its legs for what purpose.

In Figure 7, we see the “timeline” of limb usage.
- Green: Arm Manipulation.
- Red: Leg Manipulation.
- Purple: Interlimb Coordination (Both).
- Gray: Balancing.
Look at the Deck Box (B) task. The robot uses its arm, then switches to using its leg to prop open the lid, then coordinates both. This seamless switching—without the robot needing to reboot or stop to change “modes”—is the hallmark of the ReLIC system.
Conclusion
ReLIC represents a significant step forward in robotic autonomy. By acknowledging that locomotion and manipulation are distinct but deeply interconnected problems, the researchers designed a system that is both robust (thanks to RL) and precise (thanks to Model-Based control).
The implications extend beyond just opening drawers or carrying boxes. This kind of “whole-body intelligence” is essential for robots that will eventually work in our homes and on construction sites—unstructured environments where a robot might need to use an elbow to open a door or a foot to brace a collapsing shelf.
While limitations remain—such as the reliance on an external vision system for the language interface and the open-loop nature of the high-level planner—ReLIC proves that flexible interlimb coordination is not only possible but practical on current hardware.
Note: This blog post explains the research paper “Versatile Loco-Manipulation through Flexible Interlimb Coordination” by Zhu et al. (2025).