[Efficient Neural Architecture Search via Parameter Sharing 🔗](https://arxiv.org/abs/1802.03268)

ENAS: Making Neural Architecture Search 1000x Faster

Designing a high-performing neural network is often described as a dark art. It requires deep expertise, intuition, and a whole lot of trial and error. What if we could automate this process? This is the promise of Neural Architecture Search (NAS), a field that aims to automatically discover the best network architecture for a given task. The original NAS paper by Zoph & Le (2017) was a landmark achievement. It used reinforcement learning to discover state-of-the-art architectures for image classification and language modeling, surpassing designs created by human experts. But it came with a colossal price tag: the search process required hundreds of GPUs running for several days. For example, NASNet (Zoph et al., 2018) used 450 GPUs for 3–4 days. This level of computational resources is simply out of reach for most researchers, students, and companies. ...
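
The core trick hinted at in the title is parameter sharing: every candidate architecture is a subgraph of one big shared supergraph, so child models reuse weights instead of training from scratch. Here is a minimal sketch of that idea in PyTorch (my own illustration, with a random sampler standing in for the paper's RNN controller):

```python
# ENAS-style parameter sharing, heavily simplified: candidate architectures
# are paths through one shared set of operations, so training any sampled
# child updates weights that every other child reuses.
import random
import torch
import torch.nn as nn

class SharedLayer(nn.Module):
    """One layer of the supergraph: a pool of candidate ops with shared weights."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate op 0
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate op 1
            nn.MaxPool2d(3, stride=1, padding=1),         # candidate op 2
        ])

    def forward(self, x, choice):
        return self.ops[choice](x)

layers = nn.ModuleList([SharedLayer(16) for _ in range(4)])

def sample_child():
    """Stand-in for the RNN controller: pick one op per layer at random."""
    return [random.randrange(3) for _ in range(4)]

x = torch.randn(2, 16, 8, 8)
arch = sample_child()
out = x
for layer, choice in zip(layers, arch):
    out = layer(out, choice)
print(arch, out.shape)  # different children share the same Conv2d weights
```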

2018-02
[G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection 🔗](https://arxiv.org/abs/2402.04672)

Beyond the Sunny Day: How G-NAS Teaches Object Detectors to See in the Dark

Imagine an autonomous car, its AI trained on thousands of hours of footage from bright, sunny California days. It can spot pedestrians, cars, and cyclists with incredible accuracy. Now, transport that same car to a foggy London morning, a rainy dusk in Seattle, or a dimly lit street in Tokyo at midnight. Will it still perform flawlessly? This is the crux of one of the biggest challenges in modern computer vision: domain generalization. Models trained in one specific environment (a “domain”) often fail dramatically when deployed in a new, unseen one. The problem is even harder when you only have data from a single source domain to learn from. This demanding, highly realistic setting is called Single Domain Generalization Object Detection (S-DGOD). ...

2024-02
[EvoPrompting: Language Models for Code-Level Neural Architecture Search 🔗](https://arxiv.org/abs/2302.14838)

EvoPrompting: How to Evolve Language Models into Expert AI Architects

Large Language Models (LLMs) like GPT-4 and PaLM have become astonishingly good at writing code. Give them a description, and they can generate a functional script, a web component, or even a complex algorithm. But writing code from a clear specification is one thing—designing something truly novel and high-performing from scratch is another. Can an LLM invent a new, state-of-the-art neural network architecture? If you simply ask an LLM to “design a better neural network,” the results are often underwhelming. The task is too complex, the search space of possible architectures is astronomically vast, and the model lacks a structured way to iterate and improve. This is the challenge that a fascinating new paper, EvoPrompting: Language Models for Code-Level Neural Architecture Search, tackles head-on. ...
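
The paper's answer is to embed the LLM in an evolutionary loop: strong parent programs become few-shot prompts, and the LLM plays the role of crossover and mutation. The sketch below shows that loop under my own assumptions; `llm_generate` and `evaluate` are hypothetical stand-ins, not APIs from the paper:

```python
# An EvoPrompting-style evolutionary loop where an LLM mutates code.
import random

def llm_generate(prompt: str) -> str:
    """Placeholder: call your code LLM of choice and return a program string."""
    raise NotImplementedError

def evaluate(program: str) -> float:
    """Placeholder: train the architecture the program defines, return fitness."""
    raise NotImplementedError

def evolve(seed_programs, generations=5, children_per_gen=8, k_parents=2):
    population = [(p, evaluate(p)) for p in seed_programs]
    for _ in range(generations):
        children = []
        for _ in range(children_per_gen):
            # Few-shot prompt built from strong parents ("crossover"):
            # the LLM sees good programs and writes a new variant.
            best = sorted(population, key=lambda t: -t[1])[:4]
            parents = random.sample(best, k_parents)
            prompt = "\n\n".join(code for code, _ in parents) + "\n\n# Improved variant:\n"
            child = llm_generate(prompt)
            children.append((child, evaluate(child)))
        # Keep only the fittest individuals for the next generation.
        population = sorted(population + children, key=lambda t: -t[1])[:len(seed_programs)]
    return population[0]
```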

2023-02
[Neural Architecture Search with Reinforcement Learning 🔗](https://arxiv.org/abs/1611.01578)

How to Train an AI to Design Other AIs: A Deep Dive into Neural Architecture Search

Designing a state-of-the-art neural network has often been described as a “dark art.” It requires deep expertise, countless hours of experimentation, and a healthy dose of intuition. From AlexNet and VGGNet to ResNet and DenseNet, each breakthrough architecture has been the product of painstaking human design. But what if we could automate this process? What if, instead of manually designing architectures, we could design an algorithm that learns to design architectures for us? ...
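
At a high level, the paper's answer is a controller trained with REINFORCE: it samples an architecture description, a child network is trained, and the child's validation accuracy becomes the reward. Here is a deliberately tiny sketch of that loop; a learned logit table stands in for the paper's RNN controller, and a stubbed reward stands in for actually training child networks:

```python
# A compact sketch of the NAS-with-RL training loop (toy 3-choice-per-layer
# search space; train_and_eval is a stub so the sketch runs end to end).
import torch
import torch.nn as nn

n_layers, n_choices = 4, 3
logits = nn.Parameter(torch.zeros(n_layers, n_choices))  # simplest possible "controller"
opt = torch.optim.Adam([logits], lr=0.1)

def train_and_eval(arch):
    """Placeholder: build the child network described by `arch`, train it,
    and return validation accuracy as the reward."""
    return sum(arch) / (len(arch) * (n_choices - 1))  # toy reward

baseline = 0.0
for step in range(100):
    dist = torch.distributions.Categorical(logits=logits)
    arch = dist.sample()                      # one decision per layer
    reward = train_and_eval(arch.tolist())
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    # REINFORCE: raise the log-prob of architectures with above-baseline reward.
    loss = -(reward - baseline) * dist.log_prob(arch).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```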

2016-11
[Less is More: Recursive Reasoning with Tiny Networks 🔗](https://arxiv.org/abs/2510.04871)

Less is More: How Tiny Recursive Networks Outsmart Giant AI Models on Complex Puzzles

Large Language Models (LLMs) like GPT-4 and Gemini are computational powerhouses, capable of writing code, composing poetry, and answering a vast range of questions. But for all their might, they have an Achilles’ heel: complex, multi-step reasoning puzzles. Tasks like solving a tricky Sudoku or deciphering the abstract patterns in the ARC-AGI benchmark can cause even the most advanced LLMs to stumble. Their auto-regressive, token-by-token generation process means a single mistake can derail the entire solution, with no easy way to backtrack and correct course. ...
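
The paper's alternative is recursion rather than left-to-right generation: a tiny network repeatedly revises a complete draft answer, so early mistakes remain correctable. Below is a minimal sketch of that refinement loop, simplified from the paper's actual architecture:

```python
# Recursive refinement in miniature (my simplification): the same small
# network is applied repeatedly, each pass revising the whole draft at once
# instead of committing to tokens one by one.
import torch
import torch.nn as nn

class TinyRefiner(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.step = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, question, draft, n_steps=8):
        for _ in range(n_steps):
            # Residual update: refine the draft in light of the question.
            draft = draft + self.step(torch.cat([question, draft], dim=-1))
        return draft

model = TinyRefiner()
q = torch.randn(1, 64)    # encoded puzzle
a = torch.zeros(1, 64)    # blank initial answer
print(model(q, a).shape)  # refined answer embedding, torch.Size([1, 64])
```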

2025-10
[Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks 🔗](https://arxiv.org/abs/1511.06434)

DCGANs Explained: Unlocking the Power of Unsupervised Learning with Generative AI

In the world of computer vision, Convolutional Neural Networks (CNNs) have been the undisputed champions for years. Give a CNN enough labeled images of cats and dogs, and it will learn to tell them apart with superhuman accuracy. This is supervised learning, and it has powered modern AI applications from photo tagging to medical imaging. But what happens when you don’t have labels? The internet is overflowing with billions of images, but only a tiny fraction are neatly categorized. This is the challenge of unsupervised learning: can a model learn meaningful, reusable knowledge about the visual world from a massive, messy pile of unlabeled data? ...
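
The DCGAN recipe is a short list of architectural guidelines: replace pooling with strided and fractionally-strided convolutions, use batch normalization, drop fully connected hidden layers, and use ReLU in the generator with a Tanh output. Here is a condensed generator following those guidelines; the exact channel sizes are my own choices for brevity:

```python
# A DCGAN-style generator: fractionally-strided convs upsample a noise
# vector into an image, with batch norm and ReLU on hidden layers.
import torch
import torch.nn as nn

G = nn.Sequential(
    # project a 100-d noise vector to a 4x4 feature map
    nn.ConvTranspose2d(100, 256, kernel_size=4, stride=1, padding=0),
    nn.BatchNorm2d(256), nn.ReLU(),
    nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),  # 4x4 -> 8x8
    nn.BatchNorm2d(128), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # 8x8 -> 16x16
    nn.BatchNorm2d(64), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),     # 16x16 -> 32x32
    nn.Tanh(),  # outputs in [-1, 1], matching normalized training images
)

z = torch.randn(8, 100, 1, 1)  # batch of noise vectors
print(G(z).shape)              # torch.Size([8, 3, 32, 32])
```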

2015-11
[Denoising Diffusion Probabilistic Models 🔗](https://arxiv.org/abs/2006.11239)

From Noise to High-Fidelity Images — A Deep Dive into Denoising Diffusion Models

In the last decade, AI has dazzled the world with deep generative models capable of producing realistic images, audio, and text from scratch. We’ve seen Generative Adversarial Networks (GANs) generate lifelike portraits and Variational Autoencoders (VAEs) learn rich latent representations. But in 2020, a paper titled Denoising Diffusion Probabilistic Models from researchers at UC Berkeley reshaped the conversation. This work introduced a class of models, built on ideas from nonequilibrium thermodynamics first explored in 2015, and showed for the first time that they could produce exceptionally high-quality images, rivaling — and in some cases surpassing — the best GANs. ...
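
The mechanics are surprisingly compact: a fixed forward process gradually destroys an image with Gaussian noise, and a network learns to reverse it. Because the forward process has a closed form, any noise level can be sampled in a single step, as this short sketch shows (the linear schedule follows the paper; the toy image is mine):

```python
# The forward (noising) process that DDPMs learn to invert: with a variance
# schedule beta_t, a clean image x0 can be noised to any step t directly,
# x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear schedule from the paper
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # alpha_bar_t = prod_{s<=t} (1 - beta_s)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form, without t sequential steps."""
    eps = torch.randn_like(x0)
    a = alphas_bar[t]
    return a.sqrt() * x0 + (1 - a).sqrt() * eps, eps

x0 = torch.randn(1, 3, 32, 32)  # stand-in for a normalized training image
x_t, eps = q_sample(x0, t=500)
# A denoising network is trained to predict eps from (x_t, t); sampling then
# runs the learned reverse process from pure noise back to an image.
```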

2020-06
[Reflexion: Language Agents with Verbal Reinforcement Learning 🔗](https://arxiv.org/abs/2303.11366)

Beyond Trial and Error: How LLM Agents Can Learn by Talking to Themselves

Large Language Models (LLMs) are breaking out of the chatbot box. We’re increasingly seeing them power autonomous agents that can interact with software, play games, and browse the web to accomplish complex goals. But there’s a catch: when these agents make a mistake, how do they learn not to repeat it? Traditionally, the answer in AI has been Reinforcement Learning (RL)—a process of trial and error where an agent is rewarded for good actions and penalized for bad ones. However, applying traditional RL to massive LLMs is incredibly slow and computationally expensive, often requiring months of training and enormous GPU resources to fine-tune billions of parameters. As a result, most LLM agents today learn only from a handful of carefully designed examples in their prompt. ...
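
Reflexion's answer is to keep the weights frozen and store the learning as language: after a failed attempt, the agent writes itself a short lesson and carries it into the next try. Here is a minimal sketch of that loop, with `llm`, `run_episode`, and `is_success` as hypothetical stand-ins for a real agent stack:

```python
# The Reflexion loop in outline: learning lives in a text memory,
# not in gradient updates to the model's weights.
def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder: call your LLM of choice

def run_episode(task: str, reflections: list[str]) -> str:
    """Act in the environment, conditioning the agent on past reflections."""
    return llm(f"Task: {task}\nLessons from past attempts:\n" + "\n".join(reflections))

def is_success(trajectory: str) -> bool:
    raise NotImplementedError  # placeholder: heuristic or test-based evaluator

def reflexion(task: str, max_trials: int = 5):
    memory: list[str] = []
    for _ in range(max_trials):
        trajectory = run_episode(task, memory)
        if is_success(trajectory):
            return trajectory
        # "Verbal reinforcement": turn the failure into a natural-language
        # lesson and keep it in memory for the next attempt.
        memory.append(llm(f"This attempt failed:\n{trajectory}\n"
                          "In one sentence, what should be done differently?"))
    return None
```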

2023-03
[CURL: Contrastive Unsupervised Representations for Reinforcement Learning 🔗](https://arxiv.org/abs/2004.04136)

Learning from Pixels Just Got a Lot Faster: A Deep Dive into CURL

Reinforcement Learning (RL) has given us agents that can master complex video games, control simulated robots, and even grasp real-world objects. However, there’s a catch that has long plagued the field: RL is notoriously data-hungry. An agent often needs millions of interactions with its environment to learn a task. In a fast simulation, that’s fine—but in the real world, where a robot arm might take seconds to perform a single action, this can translate to months or years of training. ...
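
CURL's fix is to bolt a contrastive objective onto the RL agent: two random crops of the same observation should embed close together relative to the rest of the batch. Below is a bare-bones sketch of that objective (simplified; CURL uses a momentum-averaged key encoder, which I approximate here with a stop-gradient):

```python
# An InfoNCE-style contrastive loss over image crops, in the spirit of CURL.
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_crop(obs, size=64):
    """Crop the same random window out of every observation in the batch."""
    _, _, h, w = obs.shape
    top = torch.randint(0, h - size + 1, (1,)).item()
    left = torch.randint(0, w - size + 1, (1,)).item()
    return obs[:, :, top:top + size, left:left + size]

def curl_loss(encoder, obs, W):
    """Matching crops are positives; the rest of the batch are negatives."""
    q = encoder(random_crop(obs))           # query crop (anchor)
    k = encoder(random_crop(obs)).detach()  # key crop; CURL uses a momentum encoder
    logits = q @ W @ k.t()                  # bilinear similarities, B x B
    labels = torch.arange(q.shape[0])       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 50))  # toy encoder
W = torch.randn(50, 50, requires_grad=True)  # learned bilinear similarity matrix
obs = torch.randn(8, 3, 76, 76)              # batch of raw pixel observations
print(curl_loss(encoder, obs, W))            # scalar contrastive loss
```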

2020-04
[Decision Transformer: Reinforcement Learning via Sequence Modeling 🔗](https://arxiv.org/abs/2106.01345)

Decision Transformer: When Language Models Learn to Play Games

What if you could tackle a complex reinforcement learning problem the same way you’d complete a sentence? This is the radical and powerful idea behind the Decision Transformer—a paper that reframes sequential decision-making as a sequence modeling problem. For decades, Reinforcement Learning (RL) has been dominated by algorithms that learn value functions and policy gradients, often wrestling with complex issues like temporal credit assignment, bootstrapping instability, and discounting. But what if we could sidestep all of that? ...
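
Concretely, the trick is to serialize trajectories as (return-to-go, state, action) tokens and train a causal transformer to predict actions, so acting well becomes completing a sequence conditioned on a high target return. A schematic of that input layout, with illustrative dimensions rather than the paper's:

```python
# Decision Transformer input layout: interleaved (return-to-go, state, action)
# tokens fed to a causally masked transformer that predicts the next action.
import torch
import torch.nn as nn

d, state_dim, act_dim, T = 128, 17, 6, 20
embed_R = nn.Linear(1, d)          # return-to-go token
embed_s = nn.Linear(state_dim, d)  # state token
embed_a = nn.Linear(act_dim, d)    # action token
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2,
)
predict_action = nn.Linear(d, act_dim)

R = torch.randn(1, T, 1)           # returns-to-go: sum of future rewards
s = torch.randn(1, T, state_dim)
a = torch.randn(1, T, act_dim)

# Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...)
tokens = torch.stack([embed_R(R), embed_s(s), embed_a(a)], dim=2).reshape(1, 3 * T, d)
causal = nn.Transformer.generate_square_subsequent_mask(3 * T)
h = backbone(tokens, mask=causal)
# Actions are read off the state-token positions (indices 1, 4, 7, ...)
actions = predict_action(h[:, 1::3])
print(actions.shape)  # torch.Size([1, 20, 6])
```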

2021-06