The Power of Noise: How Denoising Autoencoders Learn Robust Features

Deep neural networks have become the cornerstone of modern artificial intelligence, achieving remarkable feats in areas like image recognition, natural language processing, and beyond. But before they became so dominant, there was a major hurdle: training them was incredibly difficult. The deeper the network, the harder it was to get it to learn anything useful. A key breakthrough came in the mid-2000s with the idea of unsupervised pre-training, a method of initializing a deep network layer by layer before fine-tuning it on a specific task. ...
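
To make the idea concrete, here is a minimal sketch of a denoising autoencoder, the building block named in the title: the network sees a corrupted input and is trained to reconstruct the clean original, which pushes it toward features that survive the noise. This is only an illustrative PyTorch sketch; the layer sizes, Gaussian noise level, and training step are assumptions, not the setup from any specific paper.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """One denoising autoencoder layer: corrupt the input, then
    learn to reconstruct the clean version of it."""
    def __init__(self, in_dim=784, hidden_dim=256, noise_std=0.3):
        super().__init__()
        self.noise_std = noise_std
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, in_dim)

    def forward(self, x):
        # Corrupt the input with Gaussian noise (masking noise is another common choice).
        noisy = x + self.noise_std * torch.randn_like(x)
        code = self.encoder(noisy)
        return self.decoder(code)

# Illustrative training step: the target is the *clean* input, not the corrupted one.
model = DenoisingAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                      # stand-in for a batch of flattened images
loss = nn.functional.mse_loss(model(x), x)   # reconstruction error against the clean input
optimizer.zero_grad()
loss.backward()
optimizer.step()
```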

Unlocking Deep Learning: How a 2006 Breakthrough Revolutionized Neural Networks

High-dimensional data—like images with millions of pixels, documents with thousands of words, or genomes with countless features—can be incredibly complex to understand and analyze. This is often referred to as the curse of dimensionality: with so many variables, it becomes harder to spot meaningful patterns and relationships, making tasks like classification, visualization, or storage challenging. For decades, the preferred technique to tackle this problem was Principal Component Analysis (PCA). PCA is a linear method that finds the directions of greatest variance in a dataset and projects it into a lower-dimensional space. It’s effective and simple, but inherently limited—especially when the patterns in the data are non-linear, curving through high-dimensional space in complex ways. In such cases, PCA can fail to capture important structure. ...
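
For reference, the linear projection PCA performs can be written in a few lines of NumPy, and a toy example makes the limitation above concrete: points on a circle have an obviously non-linear structure that a one-dimensional linear projection cannot preserve. The helper name and the example data below are purely illustrative.

```python
import numpy as np

def pca_project(X, k):
    """Project data onto the k directions of greatest variance (classical PCA)."""
    X_centered = X - X.mean(axis=0)                 # PCA assumes centered data
    # Rows of Vt are the principal directions, ordered by variance explained.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:k].T                    # k-dimensional linear projection

# Toy non-linear dataset: points on a circle. A 1-D linear projection collapses
# the circle onto a line, losing the structure described above.
theta = np.linspace(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Z = pca_project(X, k=1)
```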

[NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search 🔗](https://arxiv.org/abs/2001.10422)

Cracking the Code of One-Shot NAS: A Deep Dive into the NAS-Bench-1Shot1 Benchmark

Introduction: The Promise and Peril of Automated AI

Neural Architecture Search (NAS) is one of the most exciting frontiers in machine learning. Imagine an algorithm that can automatically design the perfect neural network for your specific task, potentially outperforming architectures crafted by world-class human experts. This is the promise of NAS. Early successes proved that NAS could discover state-of-the-art models for image classification and other tasks, but at a staggering cost: the search often required thousands of GPU-days of computation, making it a luxury accessible only to a few large tech companies. ...

2020-01
[Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective 🔗](https://arxiv.org/abs/2102.11535)

Find Top Neural Networks in Hours, Not Days: A Deep Dive into Training-Free NAS

Neural Architecture Search (NAS) is one of the most exciting frontiers in deep learning. Its promise is simple yet profound: to automatically design the best possible neural network for a given task, freeing humans from the tedious and often intuition-driven process of manual architecture design. But this promise has always come with a hefty price tag—traditional NAS methods can consume thousands of GPU-hours, scouring vast search spaces by training and evaluating countless candidate architectures. This immense computational cost has limited NAS to a handful of well-funded research labs. ...

2021-02
[Hierarchical Neural Architecture Search for Deep Stereo Matching 🔗](https://arxiv.org/abs/2010.13501)

LEAStereo – How AI Learned to Design State-of-the-Art 3D Vision Models

For decades, getting computers to see the world in 3D like humans do has been a central goal of computer vision. This capability—stereo vision—powers self-driving cars navigating complex streets, robots grasping objects with precision, and augmented reality systems blending virtual objects seamlessly into our surroundings. At its core, stereo vision solves a seemingly simple problem: given two images of the same scene taken from slightly different angles (like our two eyes), can we calculate the depth of everything in the scene? ...
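
For intuition, that "seemingly simple problem" reduces to triangulation once the two images are rectified: a pixel's horizontal shift between the left and right views (its disparity) determines its depth through the camera's focal length and the baseline between the lenses, Z = f * B / d. The sketch below only illustrates that standard relationship; the function name and the numbers are made up for the example.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Classical rectified-stereo triangulation: Z = f * B / d.
    disparity_px: per-pixel horizontal shift between the left and right images."""
    # Mark pixels with no match (zero disparity) as unknown instead of dividing by zero.
    disparity = np.where(disparity_px > 0, disparity_px, np.nan)
    return focal_length_px * baseline_m / disparity

# Illustrative values only: a 700-pixel focal length, a 54 cm baseline,
# and a tiny toy disparity map.
disparity_map = np.array([[40.0, 20.0], [10.0, 0.0]])
depth_m = depth_from_disparity(disparity_map, focal_length_px=700.0, baseline_m=0.54)
```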

2020-10
[BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models 🔗](https://arxiv.org/abs/2003.11142)

BigNAS: Train Once, Deploy Anywhere with Single-Stage Neural Architecture Search

Deploying machine learning models in the real world is a messy business. The model that is perfect for a high-end cloud GPU may be a terrible fit for a smartphone, and a smartphone-sized model is in turn overkill for a tiny microcontroller. Each device has its own constraints on latency, memory, and power, and this diversity has sparked rapid growth in Neural Architecture Search (NAS), a field dedicated to automatically designing neural networks tailored to specific hardware. ...

2020-03
[Neural Architecture Search without Training 🔗](https://arxiv.org/abs/2006.04647)

Finding Top Neural Networks in Seconds—Without a Single Training Step

Designing a high-performing neural network has long been part art, part science, and a whole lot of trial and error. For years, the best deep learning models were forged through immense human effort, intuition, and countless hours of GPU-powered experimentation. This manual design process is a significant bottleneck—one that sparked the rise of an exciting field: Neural Architecture Search (NAS). The goal of NAS is straightforward: automate the design of neural networks. Instead of a human painstakingly choosing layers, connections, and operations, a NAS algorithm explores a vast space of possible architectures to find the best one for a given task. Early NAS methods were revolutionary, discovering state-of-the-art models like NASNet. But they came with staggering computational costs. The original NAS paper required 800 GPUs running for 28 days straight—over 60 GPU-years—for a single search. ...

2020-06
[NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search 🔗](https://arxiv.org/abs/2001.00326)

A Fair Playground for Neural Networks: A Deep Dive into NAS-Bench-201

Neural Architecture Search (NAS) has transformed the way we design deep learning models. Instead of relying solely on human intuition and years of experience, NAS algorithms can automatically discover powerful and efficient network architectures — often surpassing their hand-crafted predecessors. This paradigm shift has sparked an explosion of new NAS methods, spanning reinforcement learning, evolutionary strategies, and differentiable optimization. But this rapid progress comes with a hidden cost: a crisis of comparability. ...

2020-01
[Progressive Neural Architecture Search 🔗](https://arxiv.org/abs/1712.00559)

PNAS: How to Find Top-Performing Neural Networks Without Breaking the Bank

Designing the architecture of a neural network has long been considered a dark art — a blend of intuition, experience, and trial-and-error. But what if we could automate this process? What if an AI could design an even better AI? This is the promise of Neural Architecture Search (NAS), a field that has produced some of the best-performing models in computer vision. However, this power has historically come at a staggering cost. Early state-of-the-art methods like Google’s NASNet required enormous computational resources — training and evaluating 20,000 different architectures on 500 high-end GPUs over four days. Such requirements put NAS far beyond the reach of most researchers or organizations without access to a massive data center. ...

2017-12
[ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware 🔗](https://arxiv.org/abs/1812.00332)

ProxylessNAS: Searching for Optimal Neural Networks Directly on Your Target Hardware

Neural Architecture Search (NAS) is one of the most exciting frontiers in deep learning. Imagine an algorithm that can automatically design a state-of-the-art neural network for you—perfectly tailored to your specific task. The promise of NAS is to replace the tedious, intuition-driven process of manual network design with a principled, automated search. For years, however, this promise came with a colossal price tag. Early NAS methods required tens of thousands of GPU hours to discover a single architecture—a cost so prohibitive that it was out of reach for most researchers and engineers. To make NAS feasible, the community developed a clever workaround: instead of searching directly on the massive target task (like ImageNet), researchers would search on a smaller, more manageable proxy task—such as using CIFAR-10 instead of ImageNet, training for fewer epochs, or searching for a single reusable block rather than an entire network. ...

2018-12