The promise of Autonomous Driving (AD) is built on trust—trust that the vehicle can perceive its environment, predict what others will do, and plan a safe route. But what if a few strategically placed cardboard boxes could shatter that trust?
In the world of adversarial machine learning, researchers are constantly probing for weaknesses to build safer systems. A recent paper, *Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework*, uncovers a significant vulnerability in how self-driving cars predict the movement of other vehicles. The authors introduce a new method, the OMP-Attack, which uses simple physical objects to trick an autonomous vehicle (AV) into slamming on its brakes to avoid a phantom collision.
This blog post breaks down the mechanics of this attack, exploring how it overcomes the limitations of previous methods to become endurance-capable, computationally efficient, and robust in real-world scenarios.
The Vulnerability: Perception and Prediction
To understand the attack, we first need to understand the victim. A typical autonomous driving system operates in a pipeline:
- Perception: Sensors (like LiDAR) scan the world to detect objects and track them.
- Prediction: The system analyzes the historical trajectory of tracked objects to forecast where they will go next.
- Planning: The AV plans its own route to avoid collisions based on those predictions.
The vulnerability lies in the Prediction module. If an attacker can manipulate the historical data fed into this module, they can force the AV to predict a collision that isn’t happening.
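To make that data flow concrete, here is a minimal Python sketch of the perception → prediction → planning pipeline. Everything in it is a deliberately crude stand-in of my own (a tracked-history dataclass, a constant-velocity forecast, a distance-threshold planner), not the paper's models; the point is simply that the planner only ever sees what the predictor infers from the tracked history.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position in metres, ego frame

@dataclass
class TrackedObject:
    object_id: int
    history: List[Point]  # past centers produced by the perception/tracking stage

def predict(obj: TrackedObject, horizon: int = 12) -> List[Point]:
    """Forecast future centers from the tracked history.
    A constant-velocity stand-in for the learned trajectory predictor."""
    (x0, y0), (x1, y1) = obj.history[-2], obj.history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]

def plan(ego_route: List[Point], predictions: List[List[Point]]) -> str:
    """Brake if any predicted trajectory comes within 2 m of the ego route
    (toy time-aligned comparison, waypoint by waypoint)."""
    for pred in predictions:
        for (px, py), (ex, ey) in zip(pred, ego_route):
            if (px - ex) ** 2 + (py - ey) ** 2 < 2.0 ** 2:
                return "BRAKE"
    return "KEEP_LANE"

# Perception (stubbed): a parked car whose tracked history has been nudged so it
# appears to drift toward the ego lane.
parked_car = TrackedObject(1, history=[(10.0, 3.5), (10.0, 3.2)])
ego_route = [(float(k), 0.0) for k in range(1, 13)]  # ego drives straight ahead
print(plan(ego_route, [predict(parked_car)]))        # -> "BRAKE"
```

Note how a tiny, consistent nudge in the tracked history is enough to flip the planner's decision, even though nothing in the physical scene has actually moved.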
The Attack Scenario
Imagine an AV (the “victim”) driving down a lane. Parked on the side of the road is another car (the “adversarial vehicle”). The attacker wants the victim AV to believe the parked car is about to pull out into traffic, causing the victim to emergency brake.
To do this, the attacker places adversarial objects—simple items like cardboard boxes—near the parked car. These objects reflect laser pulses from the victim’s LiDAR sensor. The perception system mistakenly groups these points with the parked car, shifting its perceived center and heading. This corrupted “history” is fed into the prediction model, which then forecasts a dangerous trajectory.

As shown in Figure 1 above, the normal scenario (a) allows the victim to pass safely. In the attack scenario (b), the adversarial objects (triangles) distort the perception of the red car, causing the prediction system (orange line) to forecast a collision, forcing the victim to brake.
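A toy numpy illustration (my own, not the paper's detector) of why a handful of extra LiDAR returns matter: if the clustering stage merges a small cluster of box points into the parked car's point set, both the fitted center and the fitted heading move.

```python
import numpy as np

def fit_center_and_heading(points: np.ndarray):
    """Estimate an object's center (centroid) and heading (principal-axis angle)
    from its clustered LiDAR points; a crude stand-in for a detector's box fit."""
    center = points.mean(axis=0)
    cov = np.cov((points - center).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]   # direction of largest spread
    if major[0] < 0:                          # fix the eigenvector's sign ambiguity
        major = -major
    heading = np.degrees(np.arctan2(major[1], major[0]))
    return center, heading

rng = np.random.default_rng(0)
# Points on a parked car roughly 4.5 m long x 1.8 m wide, aligned with the lane.
car_points = rng.uniform([-2.25, -0.9], [2.25, 0.9], size=(200, 2))
# A small adversarial cluster (e.g. cardboard boxes) placed just off one corner.
boxes = rng.uniform([2.4, 1.0], [2.9, 1.6], size=(25, 2))

c0, h0 = fit_center_and_heading(car_points)
c1, h1 = fit_center_and_heading(np.vstack([car_points, boxes]))
print(f"center shift: {np.linalg.norm(c1 - c0):.2f} m, heading shift: {h1 - h0:.1f} deg")
```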
The Problem with Previous Attacks
Prior research, specifically the SinglePoint Attack (SP-Attack), attempted this by optimizing object placement for a single frame of data. However, this approach faced three major hurdles:
- Lack of Endurance: Prediction models consume a window of history (e.g., the last 2 seconds). If you only spoof the position in the current frame, the tracker's filters (such as Kalman filters) tend to smooth the anomaly away, so the attack is fleeting (see the smoothing sketch after this list).
- Inefficiency: Finding the exact 3D location to place a box to cause a specific algorithmic error is a massive search problem.
- Fragility: Previous attacks required precise alignment. If the cardboard was slightly the wrong size or rotated 5 degrees, the attack failed.
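Here is a tiny illustration of the endurance problem using exponential smoothing as a stand-in for a tracking filter (the paper's stack would use something more sophisticated; the numbers are made up):

```python
def smooth(measurements, alpha=0.4):
    """Exponential smoothing as a toy stand-in for a tracking filter."""
    est = measurements[0]
    for z in measurements[1:]:
        est = alpha * z + (1 - alpha) * est
    return est

# Lateral position of a parked car whose true value is 3.5 m from the ego lane.
single = [3.5, 3.5, 3.5, 3.5, 2.0]   # spoof only the latest frame
multi  = [3.5, 3.2, 2.9, 2.5, 2.0]   # a consistent multi-frame drift

print(smooth(single))  # ~2.9: most of the 1.5 m single-frame spoof is filtered away
print(smooth(multi))   # ~2.55: the filtered track itself now drifts toward the lane
```

A one-frame jump is largely absorbed by the filter; a consistent multi-frame drift survives it, which is precisely what the enduring attack exploits.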
The OMP-Attack (Optimization-driven Multi-frame Perturbation) framework was designed to solve these three specific problems.
The Core Method: OMP-Attack Framework
The OMP-Attack is a sophisticated pipeline that automates the generation of these physical attacks. It operates in two distinct phases: determining what trajectory errors to inject, and determining where to put the physical objects to cause them.

As illustrated in Figure 2, the framework consists of three main innovations:
- Enduring Multi-frame Attack: Generating perturbations across a sequence of time steps, not just one.
- Efficient Location Optimization: Using swarm intelligence to find object locations.
- Robust Attack Strategy: A “Precise Attack, Vague Optimization” technique to handle physical variations.
Let’s break these down in detail.
1. Enduring Multi-frame Attack
The primary goal is to minimize the Average Trajectory Distance (ATD) between the adversarial vehicle's predicted trajectory and the victim's planned trajectory. In simpler terms, the attacker wants the parked car's predicted path to land on top of the victim's planned path, so that the victim foresees a collision.
The mathematical objective is to minimize \(D_{avg}\):
\[
D_{avg} \;=\; \frac{1}{T}\sum_{t=1}^{T} \left\lVert Y_t^{v} - Y_t^{a} \right\rVert_2
\]
Here, \(Y_t^v\) is the victim’s planned path, and \(Y_t^a\) is the predicted path of the adversarial vehicle.
The OMP-Attack doesn’t just attack the current moment \(t\). It steps back in time. It generates a “chain” of perturbations for frames \(t, t-1, t-2\), and so on. By feeding the optimization loop a history of corrupted states, the attack ensures that the trajectory prediction model receives a consistent, misleading story over several seconds. This prevents the model from filtering out the attack as “noise,” making the manipulation enduring.
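A small numpy sketch of the two ideas in this subsection: \(D_{avg}\) as a mean per-timestep distance, and a chain of per-frame perturbations applied to the tracked history. The numbers are purely illustrative, and the helper names are my own, not the paper's.

```python
import numpy as np

def average_trajectory_distance(victim_plan: np.ndarray, adv_pred: np.ndarray) -> float:
    """D_avg: mean Euclidean distance between the victim's planned waypoints (Y^v)
    and the adversarial vehicle's predicted waypoints (Y^a) over the horizon."""
    return float(np.linalg.norm(victim_plan - adv_pred, axis=1).mean())

def corrupted_history(clean_history: np.ndarray, deltas: np.ndarray) -> np.ndarray:
    """Apply one perturbation per past frame (t-k, ..., t). Because every frame in
    the window is nudged consistently, a tracking filter sees a smooth drift
    rather than a one-frame outlier it could smooth away."""
    return clean_history + deltas

# Clean history of the parked car's center over frames t-3 ... t, plus a chain of
# perturbations that grows frame by frame.
history = np.array([[10.0, 3.5]] * 4)
deltas  = np.array([[0.0, 0.0], [0.0, -0.3], [0.0, -0.6], [0.0, -0.9]])
print(corrupted_history(history, deltas))

# Target generation searches for the corrupted history whose resulting prediction
# minimizes D_avg against the victim's planned path.
victim_plan = np.array([[float(k), 0.0] for k in range(1, 7)])
adv_pred    = np.array([[10.0, 2.6 - 0.5 * k] for k in range(1, 7)])
print(average_trajectory_distance(victim_plan, adv_pred))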
2. Efficient Location Optimization
Once the system knows how the vehicle’s state needs to be perturbed (e.g., “shift the perceived center 0.5 meters left and rotate 2 degrees”), it must find the physical location for the cardboard box that causes this sensor error.
The relationship between a physical object’s location and the final output of a deep learning perception network is complex and non-differentiable. You can’t just use simple gradient descent.
The researchers formulated this as a similarity search: find the object placement whose induced perception error most closely matches the target perturbation derived in the previous phase.

To solve this, they employ Particle Swarm Optimization (PSO). Imagine a swarm of virtual bees flying through the 3D search space around the parked car. Each “bee” represents a candidate location for the cardboard box; a minimal sketch of the search loop follows the list below.
- The bees check their location (simulate the LiDAR scan).
- They measure how close the resulting error is to the target error derived in step 1.
- They communicate with the swarm to find the best spots and move toward them.
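Here is a compact, generic PSO loop in numpy. It is my own stand-in, not the authors' implementation: in the real attack the black-box loss would rerun the LiDAR simulation and perception model for each candidate location, whereas here it is just any callable.

```python
import numpy as np

def pso_search(loss_fn, bounds, n_particles=30, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal Particle Swarm Optimization over candidate object locations.
    loss_fn: black-box function mapping a candidate location to a scalar loss.
    bounds:  (low, high) arrays delimiting the search region around the car."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds[0]), np.asarray(bounds[1])
    x = rng.uniform(low, high, size=(n_particles, low.size))   # particle positions
    v = np.zeros_like(x)                                        # particle velocities
    pbest, pbest_loss = x.copy(), np.array([loss_fn(p) for p in x])
    g = pbest[np.argmin(pbest_loss)].copy()                     # global best location
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # pull toward best spots
        x = np.clip(x + v, low, high)
        losses = np.array([loss_fn(p) for p in x])
        improved = losses < pbest_loss
        pbest[improved], pbest_loss[improved] = x[improved], losses[improved]
        g = pbest[np.argmin(pbest_loss)].copy()
    return g, float(pbest_loss.min())

# Stand-in loss: distance of the induced perception error to the target error.
# (A real evaluation would simulate the LiDAR scan with a box placed at `loc`.)
target = np.array([2.6, 1.3])
best_loc, best_loss = pso_search(lambda loc: np.linalg.norm(loc - target),
                                 bounds=(np.array([0.0, -2.0]), np.array([4.0, 2.0])))
print(best_loc, best_loss)
```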
To guide this swarm, the authors designed a specialized loss function comprising three parts:
\[
\mathcal{L} \;=\; \mathcal{L}_{pose} + \mathcal{L}_{heading} + \mathcal{L}_{shape}
\]
- \(\mathcal{L}_{pose}\) (Position): Ensures the perceived object is shifted to the correct coordinates.
- \(\mathcal{L}_{heading}\) (Heading): Ensures the perceived object is facing the desired (wrong) direction.
- \(\mathcal{L}_{shape}\) (Trajectory Shape): Uses Dynamic Time Warping (DTW) to ensure the shape of the resulting trajectory matches the target attack path.
This multi-loss approach allows the swarm to converge on effective locations much faster than brute-force methods or single-objective searches.
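For intuition, here is a sketch of how the three guidance terms could be combined, with a textbook dynamic-programming DTW for the shape term. The weights and exact term definitions are illustrative assumptions, not the paper's values.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(len(a)*len(b)) Dynamic Time Warping between two 2D trajectories."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def attack_loss(perceived_center, perceived_heading, pred_traj,
                target_center, target_heading, target_traj,
                w_pose=1.0, w_heading=1.0, w_shape=1.0):
    """Three-part guidance loss (weights are illustrative, not the paper's)."""
    l_pose = np.linalg.norm(perceived_center - target_center)      # L_pose
    l_heading = abs(perceived_heading - target_heading)            # L_heading
    l_shape = dtw_distance(pred_traj, target_traj)                 # L_shape (DTW)
    return w_pose * l_pose + w_heading * l_heading + w_shape * l_shape

# Example: score a candidate's induced prediction against the target attack path.
cand = np.array([[10.0, 3.0 - 0.2 * k] for k in range(6)])
tgt  = np.array([[10.0, 3.0 - 0.5 * k] for k in range(6)])
print(attack_loss(np.array([10.0, 2.9]), 95.0, cand,
                  np.array([10.0, 2.7]), 100.0, tgt))
```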
3. Robust Attack Strategy: “Precise Attack, Vague Optimization”
The biggest challenge in physical adversarial attacks is the real world. A cardboard box in a simulation is a perfect geometric shape. In reality, it might be slightly rotated, or the victim car might arrive 0.5 meters off-center.
The OMP-Attack tackles this with a clever decoupling strategy:
- Precise Attack: When calculating the target perturbation (the error we want to cause), the system models the cardboard box accurately (exact size, orientation, 4 corners). This ensures the target is potent.
- Vague Optimization: When searching for the location to place the box, the system treats the box as a single point (the center). It ignores size and orientation during the search.
Why does this work? By optimizing for the center point, the system finds a “sweet spot” where the presence of any object overlapping that point triggers the error. This means the attack succeeds even if the actual box used is larger, smaller, or rotated differently than expected. It dramatically lowers the precision required by the attacker.
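A tiny sketch of the decoupling, with function names and dimensions of my own: the precise box model (an oriented rectangle with four corners) is used when computing the target perturbation, while the search only ever reasons about the center point.

```python
import numpy as np

def box_corners(center, length, width, yaw_deg):
    """'Precise attack' view: the full oriented rectangle used when computing the
    target perturbation (exact size, orientation, four corners)."""
    yaw = np.radians(yaw_deg)
    R = np.array([[np.cos(yaw), -np.sin(yaw)], [np.sin(yaw), np.cos(yaw)]])
    half = np.array([[ length,  width], [ length, -width],
                     [-length, -width], [-length,  width]]) / 2.0
    return np.asarray(center) + half @ R.T

def search_representation(center):
    """'Vague optimization' view: during the PSO search the same box is reduced to
    its center point, so size and rotation drop out of the search space."""
    return np.asarray(center, dtype=float)

c = [2.6, 1.3]
print(box_corners(c, length=0.4, width=0.3, yaw_deg=30.0))  # what the target model sees
print(search_representation(c))                             # what the optimizer searches over
```

Because only the center matters during the search, any real-world box that covers that center, whatever its exact dimensions or rotation, still triggers the intended perception error.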
Experiments and Results
The researchers evaluated OMP-Attack using the nuScenes dataset, a standard benchmark for autonomous driving. They compared their method against the state-of-the-art SP-Attack and a Brute-Force baseline.
1. Attack Effectiveness
The table below summarizes the core performance. Key metrics are:
- ATD (Average Trajectory Distance): Lower is better for the attacker. A low ATD means the adversarial vehicle's predicted trajectory nearly overlaps the victim's planned path, which is exactly the "imminent collision" the attacker wants the victim to perceive.
- PRE (Planning-Response Error): Higher is better for the attacker. It measures how far the victim AV has to deviate from its original plan (a toy sketch of these metrics follows this list).
- CR (Collision Rate): The percentage of scenarios where the AV predicts a collision.
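For readers who want the metrics pinned down, here is a minimal sketch of PRE and CR (ATD is the same \(D_{avg}\) quantity defined earlier). The helper names and numbers are illustrative; the paper's exact metric implementations may differ.

```python
import numpy as np

def planning_response_error(original_plan: np.ndarray, replanned: np.ndarray) -> float:
    """PRE: average distance between the plan the victim would have executed and
    the plan it actually executes under attack (higher = stronger forced reaction)."""
    return float(np.linalg.norm(original_plan - replanned, axis=1).mean())

def collision_rate(predicted_collision_flags) -> float:
    """CR: fraction of attacked scenarios in which the victim predicts a collision."""
    return float(np.mean(np.asarray(predicted_collision_flags, dtype=float)))

original = np.array([[float(k), 0.0] for k in range(1, 7)])
swerved  = np.array([[float(k), 0.4] for k in range(1, 7)])   # victim shifts 0.4 m laterally
print(planning_response_error(original, swerved))             # 0.4
print(collision_rate([True, False, True, True]))              # 0.75
```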

As shown in Table 1, OMP-Attack achieves a 64% Collision Rate, significantly higher than SP-Attack’s 42%. It also forces the victim AV to deviate more drastically (PRE of 2.393m vs 1.191m). The “Varying Particle Numbers” section validates that the swarm optimization works; using more particles generally leads to stronger attacks.
2. Endurance: The Test of Time
The defining feature of OMP-Attack is its ability to sustain the deception. The researchers tested the attack over a sequence of time steps (\(t-3\) to \(t\)).

In Figure 3, look at the PRE (middle graph) and CR (right graph). The Orange line (OMP-Attack) consistently outperforms the Blue line (SP-Attack). Even 3 frames before the collision point (\(t-3\)), OMP-Attack is already generating significant errors, whereas SP-Attack often requires the vehicle to be at the exact instant of the attack to work.
Visualizing the trajectories makes this difference stark:

In Figure 4, the top row (a-d) shows OMP-Attack. Notice the orange dashed line (Adversarial Predict). It consistently cuts across the victim’s path (green line), causing a predicted collision in every frame. In the bottom row (e-h), SP-Attack struggles. In frames (e) and (f), the predicted trajectory barely deviates, meaning the victim AV likely wouldn’t react until the very last second.
3. Robustness: Handling Real-World Imperfections
The researchers tested if the attack holds up when the physical conditions aren’t perfect.
Object Size: Does the size of the cardboard box matter?
Figure 5 shows that performance is remarkably stable across diameters from 0.1m to 0.5m. The Collision Rate (CR) line fluctuates but stays high. This validates the “Vague Optimization” strategy—the center point is what matters, not the edges.
Object Orientation: Does it matter how the box is rotated?
Table 2 confirms that rotating the box (0° to 135°) has minimal impact on the success of the attack.
Deviation Distance: What if the victim AV drives slightly to the left or right of the target point?
This is perhaps the most critical result for real-world application. Figure 6 shows the ATD (y-axis) as the victim deviates (x-axis). Ideally, the attacker wants the ATD to stay low (indicating a predicted collision path).
The Orange line (OMP-Attack) stays flat and low even as the car deviates by up to 1 meter. The Blue line (SP-Attack) skyrockets immediately—meaning if the victim is just 20cm off, the SP-Attack fails completely.
4. Transferability (Black-Box Attack)
Finally, can this attack work on a model it hasn’t seen? The researchers generated adversarial locations using one system (white-box) and tested them on a completely different prediction model, AgentFormer (black-box).

Table 3 shows that while performance drops slightly in the black-box setting (as expected), OMP-Attack still maintains a significant impact, causing a 1.7m deviation (PRE) compared to SP-Attack’s 1.4m. This suggests the attack exploits fundamental vulnerabilities in LiDAR perception rather than overfitting to a specific model.
Conclusion
The OMP-Attack represents a step change in the sophistication of physical adversarial attacks on autonomous vehicles. By moving from single-point, precise attacks to multi-frame, robust optimization, the researchers have demonstrated that AVs are more vulnerable to “optical illusions” than previously thought.
The implications are significant:
- Safety Criticality: Current trajectory prediction modules heavily trust historical data. If that history can be consistently spoofed, the prediction fails.
- Low Barrier to Entry: This attack doesn’t require hacking the car’s computer. It only requires placing cheap physical objects on the roadside.
- Need for Defense: This research highlights the urgent need for “robust perception”—systems that can identify and ignore illogical sensor data (like a parked car suddenly appearing to drift sideways because of a cardboard box) before it reaches the prediction stage.
As autonomous systems become more common, the cat-and-mouse game between attackers and defenders will continue. Work like OMP-Attack is essential to finding these holes before they can be exploited on the open road.