Introduction

In the world of safety-critical AI—think autonomous driving, medical diagnosis, or flight control systems—“pretty good” isn’t good enough. We need guarantees. We need to know for a fact that if a stop sign is slightly rotated or has a sticker on it, the car’s neural network won’t misclassify it as a speed limit sign.

This is the domain of Neural Network Verification. The goal is to mathematically prove that for a given set of inputs (a “perturbation set”), the network will always produce the correct output.

Historically, researchers have been stuck between a rock and a hard place:

  1. Linear Bound Propagation (LBP): Methods like CROWN or \(\alpha\)-CROWN are incredibly fast and scale to massive networks. However, they can be “loose,” meaning they might fail to verify a safe network because their mathematical approximations are too conservative.
  2. Semidefinite Programming (SDP): These methods offer much tighter, more precise bounds. The catch? They are excruciatingly slow: their computational cost scales cubically (\(O(n^3)\)) in the number of neurons, making them impractical for modern, large-scale deep learning models.

Enter SDP-CROWN.

In a new paper from the University of Illinois at Urbana-Champaign, researchers have proposed a hybrid framework that bridges this gap. By cleverly using the principles of Semidefinite Programming to derive a new type of linear bound, they achieve the tightness of SDP with the speed of bound propagation.

Figure 1: Verification of the ConvLarge network on the CIFAR-10 dataset. A comparison of verified accuracy shows SDP-CROWN significantly outperforming state-of-the-art methods.

As shown in Figure 1, the difference is not marginal. On large convolutional networks where traditional fast verifiers achieve only 2.5% to 6.0% verified accuracy against \(\ell_2\) attacks, SDP-CROWN achieves a striking 63.5%, approaching the theoretical upper bound.

In this post, we will break down the mathematics of this paper, explain why traditional fast methods struggle with \(\ell_2\) perturbations, and show how SDP-CROWN fixes the problem by adding just a single extra parameter per layer.

The Background: What Are We Verifying?

Before diving into the solution, let’s define the problem. We treat a neural network \(f(x)\) as a sequence of linear transformations (weights \(W\)) and non-linear activations (ReLU).

\[
x^{(0)} = x, \qquad
x^{(k)} = \text{ReLU}\!\big(W^{(k)} x^{(k-1)}\big), \quad k = 1, \dots, L-1, \qquad
f(x) = W^{(L)} x^{(L-1)} .
\]
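To make the recursion concrete, here is a minimal numpy sketch of such a network (a plain feed-forward ReLU network; biases are omitted to match the recursion above, and all names are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(x, weights):
    """Evaluate f(x) for a feed-forward ReLU network.

    `weights` is a list [W1, ..., WL]; every layer except the last is
    followed by a ReLU, and the last layer outputs the class scores.
    """
    for W in weights[:-1]:
        x = relu(W @ x)
    return weights[-1] @ x

# toy example: 4 inputs -> 8 hidden units -> 3 class scores
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
print(forward(rng.standard_normal(4), weights))
```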

We want to ensure that for an input \(\hat{x}\) (e.g., an image of a panda), any adversarial example \(x\) within a certain distance remains classified correctly. If the correct class is \(i\), the score for class \(i\) must be greater than the score for any incorrect class \(j\). Mathematically, we want to find the minimum “margin” of safety:

\[
d_j \;=\; \min_{x \,\in\, \mathcal{S}} \; f_i(x) - f_j(x),
\]

where \(\mathcal{S}\) is the perturbation set around the clean input \(\hat{x}\).

If \(d_j\) is positive, we are safe. If it’s negative, an attack exists.
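As a point of reference, here is what the margin looks like when evaluated at a single input. Verification is the much harder task of lower-bounding this quantity over every \(x\) in the perturbation set; the snippet below is only an illustrative sketch, not part of any verifier:

```python
import numpy as np

def margins(scores, true_class):
    """d_j = f_i(x) - f_j(x) for every incorrect class j, at one input x.

    A verifier must certify that the *minimum* of each d_j over the whole
    perturbation set stays positive; evaluating at a single point, as here,
    only checks that this particular x is classified correctly.
    """
    diffs = scores[true_class] - scores
    return np.delete(diffs, true_class)

scores = np.array([3.1, 0.4, 2.8])          # toy class scores f(x)
print(margins(scores, true_class=0))        # [2.7, 0.3] -> both positive
```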

The Geometry of Attacks: Boxes vs. Balls

The shape of the “danger zone” around the input matters immensely.

  1. \(\ell_\infty\)-norm (The Box): This assumes each pixel can change independently by a small amount \(\epsilon\). Geometrically, this is a hyper-rectangle (or box).
  2. \(\ell_2\)-norm (The Ball): This assumes the total Euclidean distance of the change is bounded by \(\rho\). Geometrically, this is a hypersphere (or ball).

\[
\|x - \hat{x}\|_2 \;\le\; \rho .
\]

While bound propagation methods (like \(\alpha\)-CROWN) are excellent at handling “Box” perturbations, they are notoriously bad at handling “Ball” (\(\ell_2\)) perturbations.

The Problem: Why is Bound Propagation Loose for \(\ell_2\)?

Linear Bound Propagation (LBP) works by pushing linear constraints through the network layer by layer, reasoning about each input coordinate more or less independently. When you ask an LBP verifier to certify an \(\ell_2\) ball, it has no natural way to exploit the curvature of the sphere.

To make the math work, standard LBP methods essentially wrap the sphere inside the tightest possible box (an \(\ell_\infty\) box) and verify that box instead.

\[
\{\, x : \|x - \hat{x}\|_2 \le \rho \,\} \;\subseteq\; \{\, x : \|x - \hat{x}\|_\infty \le \rho \,\} .
\]

This seems like a harmless approximation, but in high-dimensional spaces, it is disastrous.

  • A sphere of radius \(\rho\) is contained in a box of side length \(2\rho\).
  • However, the corners of that box are actually distance \(\sqrt{n}\rho\) away from the center.

In a high-dimensional network (where \(n\) is large), the box is vastly larger than the sphere, and its corners lie \(\sqrt{n}\) times farther from the input than any point the attacker can actually reach. You are asking the verifier to guarantee safety over this inflated region, which leads to overly conservative results: the verifier says "I can't promise this is safe" even when the network actually is.
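A quick numerical check makes the gap concrete. For a CIFAR-10-sized input (\(n = 3 \times 32 \times 32 = 3072\)), the corner of the tightest enclosing box sits about 55 times farther from the center than any point of the actual \(\ell_2\) ball:

```python
import numpy as np

n, rho = 3 * 32 * 32, 1.0          # CIFAR-10 input dimension, l2 radius
corner = np.full(n, rho)           # a corner of the tightest enclosing box
                                   # (every coordinate shifted by the full rho)
print(np.linalg.norm(corner))      # ~55.4, i.e. sqrt(n) * rho
print(np.sqrt(n) * rho)            # same value: sqrt(3072) ~ 55.4
```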

The Solution: SDP-CROWN

The researchers propose a method that captures the curvature of the \(\ell_2\) constraint without incurring the massive computational cost of full SDP solvers.

The Core Concept

Standard bound propagation finds linear lower and upper bounds for the network output:

\[ c^T f(x) \geq g^T x + h \]

Here, \(g\) is the slope and \(h\) is the offset (bias).

SDP-CROWN keeps the efficient structure of propagating these linear bounds but calculates the offset \(h\) in a smarter way. Instead of using a box to calculate \(h\), it solves a specialized optimization problem derived from Semidefinite Programming.
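Before looking at how the SDP machinery computes this offset through the ReLU layers, it helps to see the offset in the simplest possible case: a purely linear bound concretized over the input set. The sketch below covers only that special case (it ignores the ReLU term the paper's offset actually has to handle), and the helper names are ours, but it already shows the \(\ell_\infty\)-vs-\(\ell_2\) gap: the box offset uses \(\|g\|_1\) while the ball offset uses \(\|g\|_2\).

```python
import numpy as np

def offset_box(g, x_hat, eps):
    """Tightest h with g^T x >= h for all ||x - x_hat||_inf <= eps."""
    return g @ x_hat - eps * np.abs(g).sum()        # uses the l1 norm of g

def offset_ball(g, x_hat, rho):
    """Tightest h with g^T x >= h for all ||x - x_hat||_2 <= rho."""
    return g @ x_hat - rho * np.linalg.norm(g)      # uses the l2 norm of g

g = np.ones(3072)                   # equal-magnitude slope: worst case for the box
x_hat = np.zeros(3072)
print(offset_box(g, x_hat, 1.0))    # -3072.0
print(offset_ball(g, x_hat, 1.0))   # ~-55.4, i.e. sqrt(n) times tighter
```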

Step 1: The Primal Problem

To find the tightest possible offset \(h\) for a given slope \(g\) inside an \(\ell_2\) ball, we theoretically want to solve this minimization:

Minimization problem of linear function over ReLU network with L2 constraint.

This looks simple, but the \(\text{ReLU}(x)\) term makes it non-convex and hard to solve exactly.

Step 2: The SDP Relaxation

To solve this, the authors employ a classic trick from robust control theory: SDP Relaxation. They treat the ReLU constraint \(x = \text{ReLU}(z)\) not as a hard non-linearity, but as a quadratic constraint.

If \(x = \text{ReLU}(z)\), we can split the pre-activation \(z\) into its positive and negative parts, \(u\) and \(v\), such that \(u \geq 0\), \(v \geq 0\), and their elementwise product is zero (\(u \odot v = 0\)).

\[
x = u, \qquad z = u - v, \qquad u \geq 0, \quad v \geq 0, \qquad u \odot v = 0 .
\]
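A quick numerical check of these identities (purely illustrative):

```python
import numpy as np

z = np.array([1.5, -0.7, 0.0, 2.2])
u = np.maximum(z, 0.0)     # positive part: u = ReLU(z)
v = np.maximum(-z, 0.0)    # negative part: v = ReLU(-z)

assert np.allclose(u - v, z)       # z is recovered as u - v
assert np.allclose(u * v, 0.0)     # complementarity: u ⊙ v = 0
print(u)                           # [1.5 0.  0.  2.2] == ReLU(z)
```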

By relaxing the constraint \(u \odot v = 0\) (which is hard) into a positive semidefinite matrix constraint, they obtain a convex problem. This is where standard SDP methods stop—they throw this massive matrix into a solver, which takes \(O(n^3)\) time.
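For readers who want to see what "relaxing \(u \odot v = 0\) into a positive semidefinite constraint" typically looks like, here is one standard lifting (a generic sketch of the idea, not necessarily the paper's exact parameterization): stack \(w = (u, v) \in \mathbb{R}^{2n}\), introduce a matrix variable \(M\) intended to represent \(w w^T\), replace that non-convex equality with a convex semidefinite condition via a Schur complement, and turn the complementarity into a linear constraint on the entries of \(M\) that stand for the products \(u_i v_i\):

\[
\begin{bmatrix} 1 & w^T \\ w & M \end{bmatrix} \succeq 0, \qquad
M_{i,\, n+i} = 0 \quad \text{for } i = 1, \dots, n .
\]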

Step 3: The Efficient Dual

Here is the main contribution of the paper. The researchers didn’t stop at the matrix formulation. They derived the Lagrangian dual of this specific SDP relaxation.

By doing the math analytically, they realized that for this specific structure, the massive matrix optimization collapses into a problem that requires optimizing only one scalar variable per layer, denoted as \(\lambda\).

The resulting bound is stated in Theorem 4.1:

Equation showing the linear lower bound derived by SDP-CROWN.

Where the offset \(h(g, \lambda)\) is defined as:

Equation for the offset function h(g, lambda) in SDP-CROWN.

And \(\phi\) is a specific vector function:

Definition of the phi function used in the offset calculation.

Why is this huge?

  1. Complexity: We are no longer solving a matrix problem. We are maximizing a function of a single scalar \(\lambda \geq 0\) per layer, which is cheap and fast (see the sketch after this list).
  2. Compatibility: This calculation for \(h(g, \lambda)\) can be swapped directly into existing bound propagation tools like \(\alpha\)-CROWN. You simply replace the “Box-based” offset calculation with this “SDP-based” calculation.
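To make point 1 concrete, here is a minimal sketch of what "optimizing one scalar per layer" amounts to. The closed-form offset \(h(g, \lambda)\) from the paper is represented by a placeholder callable (it is not reproduced here), and the bounded one-dimensional search is just one simple way to do the maximization; in practice such parameters are typically optimized by gradient ascent alongside \(\alpha\)-CROWN's existing \(\alpha\) variables.

```python
from scipy.optimize import minimize_scalar

def best_offset(h_of_lambda, lam_max=1e3):
    """Maximize the per-layer offset over the single scalar lambda >= 0.

    `h_of_lambda` stands in for the paper's closed-form h(g, lambda);
    we negate it because minimize_scalar minimizes.
    """
    res = minimize_scalar(lambda lam: -h_of_lambda(lam),
                          bounds=(0.0, lam_max), method="bounded")
    return res.x, -res.fun

# toy stand-in offset that peaks at lambda = 2 (purely illustrative)
toy_h = lambda lam: 1.0 - (lam - 2.0) ** 2
lam_star, h_star = best_offset(toy_h)
print(round(lam_star, 3), round(h_star, 3))   # ~2.0  ~1.0
```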

Visualizing the Improvement

To understand why this math matters, look at the geometry of the relaxation. The paper provides a visualization of the bounds produced by standard methods versus SDP-CROWN for a simple 2-neuron case.

Figure 2: Comparison of linear relaxations. Left: bound propagation uses the enclosing box, producing a loose, angular relaxation. Right: SDP-CROWN works with the ball directly, producing a smooth, tight hull.

  • Left (Standard LBP): The relaxation is sharp and angular. It’s optimized for a box, so when applied to a ball, it leaves large gaps (looseness) between the bound and the true function.
  • Right (SDP-CROWN): The relaxation is smooth and hugs the function surface tightly. This shape comes from implicitly handling the \(\ell_2\) constraint via the \(\lambda\) parameter.

The authors prove theoretically that when the perturbation is centered at zero, SDP-CROWN is exactly \(\sqrt{n}\) times tighter than standard bound propagation.
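A quick sanity check on a purely linear function (an illustrative special case, not the theorem itself) shows where the \(\sqrt{n}\) factor comes from: concretizing over the ball uses \(\|g\|_2\), concretizing over the enclosing box uses \(\|g\|_1\), and when the slope has equal-magnitude entries the two differ by exactly \(\sqrt{n}\):

\[
\min_{\|x\|_2 \le \rho} g^T x = -\rho \|g\|_2, \qquad
\min_{\|x\|_\infty \le \rho} g^T x = -\rho \|g\|_1, \qquad
\frac{\|g\|_1}{\|g\|_2} = \sqrt{n} \ \ \text{when } |g_1| = \cdots = |g_n| .
\]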

Experiments and Results

The theory sounds great, but does it work on real networks? The authors tested SDP-CROWN on MNIST and CIFAR-10 models, comparing it against:

  • \(\alpha\)-CROWN / \(\beta\)-CROWN: State-of-the-art bound propagation.
  • LipNaive / LipSDP: Methods based on Lipschitz constants.
  • LP-All / BM-Full: Pure convex relaxation methods (very slow).

Verification Accuracy

Referencing Table 1, the results are consistent. On the ConvLarge model (CIFAR-10), which has ~2.47 million parameters and 65k neurons:

Method               Verified Accuracy   Time
\(\alpha\)-CROWN     2.5%                47s
LipNaive             47.5%               1.2s
SDP-CROWN (Ours)     63.5%               73s

Traditional bound propagation falls apart completely (2.5%). The naive Lipschitz method is fast but loose. SDP-CROWN achieves a high certification rate while taking only modestly longer than \(\alpha\)-CROWN (73s vs. 47s).

Table 1: Verified accuracy under \(\ell_2\)-norm perturbations, showing SDP-CROWN's superior performance.

Tightness of Lower Bounds

It’s also instructive to look at the raw bounds. In verification, we want the lower bound of the correct class score to stay above 0.

Figure 3: Lower bounds on the robustness margin. SDP-CROWN (red line) stays positive for much larger perturbations than other methods.

In Figure 3, the x-axis represents the size of the attack (\(\ell_2\) radius). As the attack gets stronger, the certified lower bound drops.

  • Blue Line (\(\alpha\)-CROWN): Drops below zero almost immediately for larger models.
  • Red Line (SDP-CROWN): Hugs the upper bound (black line) much more closely and stays positive for much larger perturbation radii.

Conclusion

SDP-CROWN represents a significant step forward in the formal verification of neural networks. It pinpoints the root cause of the looseness in current fast methods: the mismatch between box-based verification tools and ball-based (\(\ell_2\)) adversarial attacks.

By going back to the mathematical drawing board and leveraging the dual of the Semidefinite Programming relaxation, the authors derived a formulation that is:

  1. Tight: capturing the true geometry of the attack.
  2. Efficient: adding only one parameter per layer rather than an \(n \times n\) matrix.
  3. Scalable: capable of verifying networks with millions of parameters.

For students and practitioners in robust AI, this paper highlights a valuable lesson: sometimes the best way to speed up an algorithm isn’t better code, but better math. By analytically solving the expensive part of the optimization before writing a single line of code, the researchers turned an intractable problem into a tractable one.