[Scaling Trends in Language Model Robustness 🔗](https://openreview.net/pdf?id=tNGdLEL4R0)

The Arms Race of AI: Does Scale Automatically Fix Robustness?

The rapid ascent of Large Language Models (LLMs) has been defined by a single, powerful concept: scaling laws. We have learned, quite empirically, that adding more parameters, more data, and more compute consistently unlocks new capabilities. From writing code to passing the bar exam, “bigger is better” has been the golden rule of the AI boom. But there is a shadow side to this growth. While models become more capable, they remain stubbornly vulnerable to adversarial attacks. “Jailbreaks”—prompts designed to trick models into generating harmful content—plague even the most advanced systems (like GPT-4 or Claude). As models are integrated into critical systems, from email filtering to autonomous agents, these vulnerabilities transform from curiosities into security risks. ...

9 min · 1837 words
[From Language Models over Tokens to Language Models over Characters 🔗](https://arxiv.org/abs/2412.03719)

The Token-Character Gap: Why LLMs Struggle with Trailing Spaces and How to Fix It

If you have ever built an application on top of a Large Language Model (LLM), you have likely encountered behavior that feels inexplicably brittle. You construct a carefully worded prompt, get a great result, and then—perhaps accidentally—you add a single trailing space to the end of your prompt. Suddenly, the model’s output changes completely. Why does a system capable of passing the bar exam stumble over a space bar? The answer lies in a fundamental disconnect between how humans read text and how modern LLMs process it. Humans see characters; models see tokens. This disconnect creates what researchers call the Prompt Boundary Problem. ...
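
To see the disconnect concretely, here is a minimal sketch using the open-source `tiktoken` BPE tokenizer (`cl100k_base` is just one example vocabulary; the effect is generic to BPE tokenizers, and this illustration is mine, not the paper's setup):

```python
# The same human-visible prompt, with and without a trailing space, maps to
# different token sequences, so the model conditions on a different prefix.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

a = enc.encode("The capital of France is")
b = enc.encode("The capital of France is ")  # one trailing space added

print(a)
print(b)
print(a == b)  # False: the token boundary moved, so the model "sees" a new prompt
```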

2024-12 · 9 min · 1767 words
[PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling 🔗](https://arxiv.org/abs/2502.01925)

Breaking the Guardrails: How PANDAS Exploits Long-Context LLMs

The capabilities of Large Language Models (LLMs) have exploded in recent years. One of the most significant technical leaps has been the expansion of the context window—the amount of text a model can process at once. We’ve gone from models that could barely remember a few paragraphs to systems like Llama-3 and Gemini that can process entire books or massive codebases in a single prompt. This “long-context” capability enables powerful new applications, such as autonomous agents and deep document analysis. However, it also opens a massive security hole. ...

2025-02 · 9 min · 1790 words
[FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields 🔗](https://arxiv.org/abs/2507.08285)

Fixing the Melting Problem: How FlowDrag Uses 3D Meshes for Precise Image Editing

Imagine you have a photo of a person looking to the left, and you want them to look to the right. With modern Generative AI, specifically “drag-based” editing, this should be simple: you click the nose (the handle point) and drag it to the right (the target point). In theory, the AI should understand the geometry of a face. It should know that when the nose moves, the cheek, the ear, and the hat should rotate along with it. In practice, however, current methods often fail to grasp this structural integrity. Instead of rotating the head, the AI might simply stretch the nose like taffy, distorting the face into a surrealist nightmare. This is known as the geometric inconsistency problem. ...

2025-07 · 10 min · 1949 words
[Decision Making under the Exponential Family: Distributionally Robust Optimisation with Bayesian Ambiguity Sets 🔗](https://arxiv.org/abs/2411.16829)

Hedging Your Bets: How Bayesian Ambiguity Sets Cure the Optimizer's Curse

In the world of decision-making, data is king. But data is also messy, finite, and noisy. Whether you are managing a stock portfolio, stocking inventory for a store, or training a machine learning model, you rarely know the true mechanism generating your data. Instead, you have to estimate it. The standard approach is to gather data, fit a probability distribution (your model), and make the decision that minimizes your expected risk based on that model. In a Bayesian framework, you go a step further: you combine your data with prior beliefs to get a posterior distribution, giving you a better sense of parameter uncertainty. ...
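
In symbols, the contrast being set up is roughly the following (notation mine, not necessarily the paper's):

$$
a^{*}_{\text{plug-in}} \;=\; \arg\min_{a}\; \mathbb{E}_{x \sim p_{\hat{\theta}}}\big[\ell(a,x)\big]
\qquad\text{vs.}\qquad
a^{*}_{\text{robust}} \;=\; \arg\min_{a}\; \sup_{Q \in \mathcal{A}}\; \mathbb{E}_{x \sim Q}\big[\ell(a,x)\big],
$$

where $\mathcal{A}$ is an ambiguity set of distributions still considered plausible given the data. The paper's angle, per its title, is to construct $\mathcal{A}$ from the Bayesian posterior in the exponential-family setting, so the worst-case hedge reflects genuine parameter uncertainty rather than an arbitrary ball of distributions.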

2024-11 · 9 min · 1759 words
[TIMING: Temporality-Aware Integrated Gradients for Time Series Explanation 🔗](https://arxiv.org/abs/2506.05035)

Why Time Series XAI is Broken and How TIMING Fixes It

In the rapidly evolving landscape of Artificial Intelligence, time series data is the lifeblood of critical industries. From monitoring a patient’s vitals in an ICU (healthcare) to predicting power grid fluctuations (energy) or detecting traffic anomalies (transportation), deep learning models are making decisions that affect human safety. However, these deep neural networks are often “black boxes.” We feed them data, and they spit out a prediction. In high-stakes environments, “it works” isn’t enough; we need to know why it works. This is the domain of Explainable AI (XAI). ...

2025-06 · 9 min · 1840 words
[Policy-labeled Preference Learning: Is Preference Enough for RLHF? 🔗](https://openreview.net/pdf?id=qLfo1sef50)

Beyond Preferences: Why Knowing 'Who Acted' Matters in RLHF

Reinforcement Learning from Human Feedback (RLHF) has undeniably changed the landscape of Artificial Intelligence. It is the engine under the hood of modern Large Language Models (LLMs) like GPT-4 and Llama 2, allowing them to align with human intent. The standard recipe for RLHF usually involves training a reward model to mimic human preferences and then optimizing a policy to maximize that reward. However, a new wave of research, spearheaded by methods like Direct Preference Optimization (DPO), has simplified this process. DPO skips the explicit reward modeling step entirely, optimizing the policy directly from preference data. It’s elegant, stable, and effective—at least when the data behaves nicely. ...
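
For reference, the DPO objective alluded to here can be sketched in a few lines (this is the standard loss from the original DPO work by Rafailov et al., not this paper's proposed method; the tensor names are mine):

```python
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected,          # log pi_theta(y|x), summed over tokens
             ref_logp_chosen, ref_logp_rejected,  # same quantities under the frozen reference
             beta: float = 0.1):
    # Implicit rewards: how far the policy has drifted from the reference policy.
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    # Push the preferred completion's implicit reward above the rejected one's.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```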

9 min · 1859 words
[G-Adaptivity: optimised graph-based mesh relocation for finite element methods 🔗](https://arxiv.org/abs/2407.04516)

G-Adaptivity: Revolutionizing Finite Element Analysis with Graph Neural Networks

In the world of computational science, simulating reality is a balancing act. Whether predicting the weather, designing aerodynamic cars, or modeling structural stress, scientists rely on Finite Element Methods (FEM). These methods break down complex physical shapes into a grid of small, simple shapes—triangles or tetrahedra—called a mesh. The golden rule of FEM is simple: the more points (or nodes) you have in your mesh, the more accurate your simulation. However, more points mean significantly higher computational costs. A simulation that takes minutes on a coarse mesh might take weeks on a dense one. ...
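
To make the trade-off concrete: for standard piecewise-linear elements on a mesh of width $h$, classical FEM error analysis gives roughly

$$
\lVert u - u_h \rVert_{L^{2}} \;\le\; C\, h^{2}\, \lvert u \rvert_{H^{2}},
$$

so halving $h$ cuts the error by about a factor of four but, in 2D, roughly quadruples the node count. That is why relocating a fixed budget of nodes (r-adaptivity, this paper's subject) is attractive compared with simply refining everywhere.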

2024-07 · 8 min · 1598 words
[Prediction Models That Learn to Avoid Missing Values 🔗](https://arxiv.org/abs/2505.03393)

Learning to See Without Looking—How AI Can Avoid Missing Data

If you have ever worked with real-world datasets, particularly in healthcare or finance, you know the pain of missing values. You design a perfect model, train it on cleaned data, and prepare it for deployment. But then comes “test time”—the moment your model faces a real user. The user skips a question on a form, or a specific medical test hasn’t been ordered yet. Suddenly, your model is blind in one eye. ...

2025-05 · 9 min · 1715 words
[Bridging Layout and RTL: Knowledge Distillation based Timing Prediction 🔗](https://openreview.net/pdf?id=pWs925fKyK)

Can We Teach RTL Models Physics? Inside the RTLDistil Framework

In the world of modern chip design, speed is everything—not just the clock speed of the final processor, but the speed at which engineers can design it. This creates a fundamental tension in Electronic Design Automation (EDA). On one hand, you want to know if your design meets timing constraints as early as possible (at the Register-Transfer Level, or RTL). On the other hand, you can’t really know the timing until you’ve done the physical layout, which includes placing components and routing wires. ...

10 min · 2029 words
[Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective 🔗](https://arxiv.org/abs/2502.14770)

Stop Pruning Uniformly: How a Simple Arithmetic Progression Solves LLM Error Explosion

Large Language Models (LLMs) like LLaMA and GPT have revolutionized natural language processing, but they come with a massive cost: their size. With billions of parameters, deploying these models on standard hardware is a logistical nightmare due to their high memory footprint and computational latency. This has led to a surge in Network Sparsity research—techniques that aim to remove “unimportant” parameters (weights) from the model to make it smaller and faster without sacrificing intelligence. ...
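
The title telegraphs the flavor of the fix. Here is a minimal sketch of what an arithmetic-progression sparsity schedule looks like (the paper's actual derivation, and even the direction of the progression, may differ; `delta` and the function itself are my illustration):

```python
# Non-uniform layer-wise sparsity via an arithmetic progression: each layer's
# pruning ratio steps up by a fixed `delta`, while the mean stays at `target`.
def layerwise_sparsity(num_layers: int, target: float, delta: float):
    start = target - delta * (num_layers - 1) / 2  # keeps the average at `target`
    ratios = [start + i * delta for i in range(num_layers)]
    assert 0.0 <= min(ratios) and max(ratios) <= 1.0, "delta too large for target"
    return ratios

print(layerwise_sparsity(num_layers=8, target=0.5, delta=0.02))
# approximately [0.43, 0.45, 0.47, 0.49, 0.51, 0.53, 0.55, 0.57]
```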

2025-02 · 9 min · 1798 words
[When Every Millisecond Counts: Real-Time Anomaly Detection via the Multimodal Asynchronous Hybrid Network 🔗](https://arxiv.org/abs/2506.17457)

Milliseconds Matter: Fusing Event Streams and RGB for High-Speed Autonomous Safety

Imagine you are driving down a suburban street. It’s a sunny day, the music is playing, and you are relaxed. Suddenly, from behind a parked truck, a child chases a ball into the middle of the road. Your brain processes this visual information instantly—your foot slams on the brake, and the car screeches to a halt just inches from the child. The difference between a close call and a tragedy was a fraction of a second. ...

2025-06 · 9 min · 1849 words
[Rethink GraphODE Generalization within Coupled Dynamical System 🔗](https://openreview.net/pdf?id=nVD7KoU09V)

How to Teach AI Physics: Disentangling Static and Dynamic Worlds with GREAT

Imagine trying to predict the motion of a complex system, like a set of pendulums connected by springs, or charged particles bouncing around in a box. In physics and engineering, these are known as Coupled Dynamical Systems. To model them, we don’t just look at one object in isolation; we have to account for how every component interacts with every other component over time. For years, scientists used handcrafted differential equations to solve these problems. But recently, Deep Learning has entered the chat. Specifically, a framework called Graph Ordinary Differential Equations (GraphODE) has shown immense promise. By combining Graph Neural Networks (GNNs) to model interactions and ODE solvers to model time, these networks can theoretically learn the “laws of physics” directly from data. ...
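
The core GraphODE recipe described here fits in a few lines (illustrative only; the paper's GREAT model adds its static/dynamic disentanglement on top, and a real implementation would use an adaptive ODE solver rather than fixed-step Euler):

```python
import torch
import torch.nn as nn

class GNNDynamics(nn.Module):
    """A GNN that outputs dz/dt: messages along edges, then a node update."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.upd = nn.Linear(2 * dim, dim)

    def forward(self, z, edges):          # z: [N, dim]; edges: LongTensor [E, 2]
        src, dst = edges[:, 0], edges[:, 1]
        m = self.msg(torch.cat([z[src], z[dst]], dim=-1))  # pairwise messages
        agg = torch.zeros_like(z).index_add_(0, dst, m)    # sum messages per node
        return self.upd(torch.cat([z, agg], dim=-1))       # time derivative dz/dt

def rollout(f, z0, edges, steps: int = 100, dt: float = 0.01):
    """Explicit-Euler integration standing in for the ODE solver."""
    traj, z = [z0], z0
    for _ in range(steps):
        z = z + dt * f(z, edges)
        traj.append(z)
    return torch.stack(traj)              # [steps + 1, N, dim]
```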

9 min · 1781 words
[STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization 🔗](https://arxiv.org/abs/2506.03863)

Breaking the Codebook Collapse: How STAR Teaches Robots Diverse Skills via Geometric Rotation

Imagine trying to teach a robot to cook a meal. You don’t tell the robot every single millisecond of muscle movement required to crack an egg. Instead, you think in terms of “skills”: grasp the egg, hit the edge of the pan, pull the shells apart. This hierarchical approach—breaking complex long-horizon tasks into discrete, reusable skills—is the holy grail of robotic manipulation. However, translating continuous robot actions into these discrete “words” or “tokens” is notoriously difficult. Current methods often suffer from codebook collapse, where the robot ignores most of the skills it could learn, relying on just a tiny subset of repetitive actions. Furthermore, even if the robot learns the skills, stringing them together smoothly (composition) is a separate headache. ...
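
To see what codebook collapse looks like numerically, here is a bare-bones vector quantizer (not STAR's actual quantizer; shapes and names are mine) with a usage metric attached:

```python
import torch

def quantize(z, codebook):
    """Snap each embedding to its nearest code. z: [B, D]; codebook: [K, D]."""
    idx = torch.cdist(z, codebook).argmin(dim=1)  # nearest-neighbor assignment
    return codebook[idx], idx

z = torch.randn(1024, 16)          # continuous action embeddings (toy data)
codebook = torch.randn(64, 16)     # 64 candidate "skill" codes
_, idx = quantize(z, codebook)
usage = idx.unique().numel() / codebook.shape[0]
print(f"codebook usage: {usage:.0%}")  # persistently low usage during training => collapse
```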

2025-06 · 8 min · 1609 words
[Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures 🔗](https://arxiv.org/abs/2505.19521)

When Geometry Meets Uncertainty: A New Framework for Safe Robot Learning

Imagine you are trying to walk through a crowded room in the dark. You can’t see perfectly; perhaps you only have a dim flashlight that flickers. You know roughly how your legs work (your dynamics), but your perception of where the furniture is (the environment) is noisy and uncertain. If you assume you know exactly where everything is, you will likely stub your toe. If you are too paralyzed by fear, you won’t move at all. ...

2025-05 · 14 min · 2830 words
[Invariant Deep Uplift Modeling for Incentive Assignment in Online Marketing via Probability of Necessity and Sufficiency 🔗](https://openreview.net/pdf?id=mruyFvKDKq)

Beyond Correlation—How Invariant Deep Uplift Modeling (IDUM) Solves the Out-of-Distribution Crisis in Online Marketing

Imagine you run a massive online platform—perhaps a short-video app or an e-commerce giant. You have a budget to distribute coupons or high-quality video streams to keep users engaged. The central question for your marketing team is simple: “If we give User X a coupon, will they buy something they wouldn’t have bought otherwise?” This is not a prediction of purchase; it is a prediction of influence. This field is called Uplift Modeling. ...
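
The question in that quoted sentence is exactly what the classic two-model ("T-learner") baseline estimates; a sketch follows (data and model choices here are illustrative, and IDUM itself goes further with invariance and the probability of necessity and sufficiency):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def t_learner_uplift(X, treated, y, X_new):
    """uplift(x) ~ P(buy | x, coupon) - P(buy | x, no coupon)."""
    m_t = LogisticRegression(max_iter=1000).fit(X[treated == 1], y[treated == 1])
    m_c = LogisticRegression(max_iter=1000).fit(X[treated == 0], y[treated == 0])
    return m_t.predict_proba(X_new)[:, 1] - m_c.predict_proba(X_new)[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))                 # user features (toy)
treated = rng.integers(0, 2, size=5000)        # got a coupon?
p_buy = 0.2 + 0.1 * treated * (X[:, 0] > 0)    # coupon helps only some users
y = (rng.random(5000) < p_buy).astype(int)
print(t_learner_uplift(X, treated, y, X[:5]).round(3))
```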

9 min · 1816 words
[Continual Reinforcement Learning by Planning with Online World Models 🔗](https://arxiv.org/abs/2507.09177)

How to Build Robots That Never Forget: Planning with Online World Models

Imagine you are teaching a robot to make coffee. After weeks of training, it finally masters the art of grinding beans and pouring water. Next, you teach it to load the dishwasher. It learns quickly, but when you ask it to make coffee again, it stares blankly at the machine. It has completely overwritten its “coffee-making” neurons with “dishwasher-loading” neurons. This phenomenon is known as catastrophic forgetting, and it is the Achilles’ heel of Artificial Intelligence. ...

2025-07 · 8 min · 1598 words
[Towards Practical Defect-Focused Automated Code Review 🔗](https://arxiv.org/abs/2505.17928)

From Nitpicks to Key Bugs: How to Build a Practical Automated Code Reviewer

Code review is the gatekeeper of software quality. In a perfect world, a senior engineer meticulously checks every line of code you write, catching subtle logic errors, security vulnerabilities, and potential performance bottlenecks before they merge. In the real world, code review is often a bottleneck. Reviewers are busy, context is hard to gather, and “LGTM” (Looks Good To Me) is sometimes typed a bit too quickly. This has driven a massive surge in research into Automated Code Review. If an AI can write code, surely it can review it? However, most existing tools fall into a trap: they treat code review as a simple translation task. They look at a small snippet of code and try to generate a sentence that “sounds” like a review. The result? A flood of “nitpicks”—comments about variable naming or formatting—while critical bugs (like null pointer dereferences or logic errors) slip through. ...

2025-05 · 10 min · 2062 words
[Fishers for Free? Approximating the Fisher Information Matrix by Recycling the Squared Gradient Accumulator 🔗](https://openreview.net/pdf?id=m3zrHhiCCj)

Fishers for Free: Recycling Your Optimizer State to Estimate Parameter Importance

In the world of deep learning, we often treat model parameters as a means to an end. We train them, save them, and run inference. But not all parameters are created equal. Some weights in your neural network are critical load-bearing columns; others are decorative trim that can be removed or altered without collapsing the structure. Determining which parameters matter most is the domain of parameter sensitivity, and the “gold standard” tool for measuring this is the Fisher Information Matrix (FIM). The Fisher diagonal tells us how much the model’s output distribution would change if we perturbed a specific parameter. It is crucial for advanced techniques like Model Merging, Network Pruning, and Continual Learning. ...
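
The "recycling" in the title can be sketched directly against PyTorch's Adam, which already stores an exponential moving average of squared gradients under the state key `exp_avg_sq`; reading it off costs nothing extra (this is my sketch of the idea, not necessarily the paper's exact estimator or scaling):

```python
import torch

def fisher_diag_from_adam(model: torch.nn.Module, optimizer: torch.optim.Adam):
    """Reuse Adam's squared-gradient accumulator as a diagonal-Fisher proxy."""
    fisher = {}
    for name, param in model.named_parameters():
        state = optimizer.state.get(param, {})
        if "exp_avg_sq" in state:
            # EMA of g^2 approximates E[g^2], the diagonal of the empirical Fisher.
            fisher[name] = state["exp_avg_sq"].detach().clone()
    return fisher
```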

9 min · 1916 words
[Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions 🔗](https://arxiv.org/abs/2503.23896)

Why Deep Networks Learn Gabor Filters: Unpacking ICA, High Dimensions, and Sample Complexity

Have you ever wondered why the first layer of almost every Convolutional Neural Network (CNN) looks the same? Whether you train a network to classify dogs, recognize cars, or detect tumors, the filters in the very first layer almost invariably converge to specific patterns: oriented edges and oscillating textures known as Gabor filters. This phenomenon is one of the most robust empirical facts in deep learning. It mirrors the biology of the mammalian visual cortex, which also processes visual information using similar edge detectors. But why does this happen? And more importantly, what are the mathematical mechanics driving the learning of these features from raw pixels? ...
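
For concreteness, a Gabor filter is a Gaussian-windowed sinusoid. In the standard parameterization (generic, not specific to this paper),

$$
g(x, y) = \exp\!\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right)\cos\!\left(\frac{2\pi x'}{\lambda} + \psi\right),
\qquad
x' = x\cos\theta + y\sin\theta,\quad y' = -x\sin\theta + y\cos\theta,
$$

where $\theta$ sets the edge orientation, $\lambda$ the wavelength of the oscillation, $\sigma$ the envelope width, and $\gamma$ the aspect ratio: precisely the oriented, oscillating structure described above.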

2025-03 · 9 min · 1889 words