Introduction

In the fast-evolving world of computer graphics and vision, few techniques have made as big a splash as 3D Gaussian Splatting (3DGS). Since its introduction in 2023, it has impressed both researchers and developers by combining photorealistic novel view synthesis with real-time rendering speeds. For many, it felt like the practical, high-speed successor to Neural Radiance Fields (NeRFs) we had been waiting for.

However, as people began pushing 3DGS to its limits, cracks started to show. While it produced stunning results for camera views similar to those in the training data, it struggled severely when the viewing scale changed. Zooming in could make objects look overly thin and noisy; zooming out often caused fine details to blur into glowing artifacts.

This is exactly the problem that the paper “Mip-Splatting: Alias-free 3D Gaussian Splatting” sets out to solve. The authors identify the root cause of these scaling artifacts and propose an elegant, principled solution. Their method—Mip-Splatting—modifies the original 3DGS pipeline to make it robust against changes in camera distance and focal length, enabling crisp, artifact-free images across a wide range of scales.

Let’s visualize the problem:

A four-panel figure showing effects of zooming with 3DGS.

Figure 1: Standard 3DGS works well at the training scale (a), but zooming out causes spokes to thicken (c), and zooming in makes them too thin and noisy (d).

In this article, we’ll explore the Mip-Splatting paper step-by-step. We’ll first explain how 3D Gaussian Splatting works, then discuss why it fails at different scales, and finally examine Mip-Splatting’s two-part solution: a 3D smoothing filter to handle zoom-in issues and a 2D Mip filter to perfect the zoom-out.


Background: How 3D Gaussian Splatting Works

Unlike mesh-based representations or the neural networks behind NeRFs, 3D Gaussian Splatting represents a scene as an enormous set of semi-transparent, anisotropic blobs called Gaussians.

Each Gaussian is parameterized by:

  • Position (\(\mathbf{p}_k\)): where it is in 3D space.
  • Covariance (\(\boldsymbol{\Sigma}_k\)): a 3×3 matrix defining its shape and size.
  • Color (\(c_k\)): possibly view-dependent, modeled with spherical harmonics.
  • Opacity (\(\alpha_k\)): how opaque it is.

Mathematically, a Gaussian is:

\[
\mathcal{G}_k(\mathbf{x}) = e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^\top \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}-\mathbf{p}_k)}
\]
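
To make this concrete, here is a minimal NumPy sketch of the primitive and its falloff; the class and function names are illustrative, not from any official codebase:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    p: np.ndarray      # (3,) center position p_k
    cov: np.ndarray    # (3, 3) covariance Sigma_k, symmetric positive definite
    color: np.ndarray  # (3,) RGB color c_k (view dependence omitted for brevity)
    alpha: float       # opacity alpha_k in [0, 1]

def evaluate(g: Gaussian3D, x: np.ndarray) -> float:
    """Unnormalized falloff G_k(x) = exp(-0.5 (x - p)^T Sigma^{-1} (x - p))."""
    d = x - g.p
    return float(np.exp(-0.5 * d @ np.linalg.inv(g.cov) @ d))
```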

From 3D to 2D

Rendering in 3DGS is fast because it uses rasterization (as in game engines) instead of the costly per-ray volume sampling used by NeRFs. It proceeds in three steps, sketched in code right after the list:

  1. Transform to Camera Space
    All Gaussians are transformed from world coordinates into the frame of the target camera:

    \[
    \mathbf{p}_k' = \mathbf{W}\mathbf{p}_k + \mathbf{t}, \qquad \boldsymbol{\Sigma}_k' = \mathbf{W}\boldsymbol{\Sigma}_k\mathbf{W}^\top
    \]

    where \(\mathbf{W}\) and \(\mathbf{t}\) are the world-to-camera rotation and translation.

  2. Project to 2D
    Each 3D Gaussian becomes a 2D Gaussian on the image plane:

    \[
    \boldsymbol{\Sigma}_k^{2D} = \mathbf{J}\,\boldsymbol{\Sigma}_k'\,\mathbf{J}^\top
    \]

    where \(\mathbf{J}\) is the Jacobian of the perspective projection evaluated at \(\mathbf{p}_k'\), i.e. a local affine approximation of the projection.

  3. Splat and Alpha-Blend
    These 2D Gaussians are drawn (“splatted”) onto the screen, blending front-to-back:

    \[
    \mathbf{c}(\mathbf{x}) = \sum_{k=1}^{K} \mathbf{c}_k\,\alpha_k\,\mathcal{G}_k^{2D}(\mathbf{x}) \prod_{j=1}^{k-1}\left(1 - \alpha_j\,\mathcal{G}_j^{2D}(\mathbf{x})\right)
    \]

    with Gaussians sorted by increasing depth.
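
Steps 1-3 can be condensed into a short NumPy sketch. This is a simplified, unoptimized illustration (real implementations tile the screen and run in CUDA), and `project_gaussian` / `alpha_blend` are names of my own choosing:

```python
import numpy as np

def project_gaussian(p, cov, W, t, f):
    """Transform a 3D Gaussian to camera space and project it to 2D.

    W, t: world-to-camera rotation and translation; f: focal length in pixels.
    Uses the local affine (Jacobian) approximation of perspective projection
    and assumes the camera looks down +z, so z > 0.
    """
    p_cam = W @ p + t                      # camera-space mean
    cov_cam = W @ cov @ W.T                # camera-space covariance
    x, y, z = p_cam
    # Jacobian of (x, y, z) -> (f*x/z, f*y/z), evaluated at the mean.
    J = np.array([[f / z, 0.0, -f * x / z**2],
                  [0.0, f / z, -f * y / z**2]])
    mean2d = np.array([f * x / z, f * y / z])
    cov2d = J @ cov_cam @ J.T              # 2x2 screen-space covariance
    return mean2d, cov2d

def alpha_blend(pixel, splats):
    """Front-to-back blending of (mean2d, cov2d, color, alpha) splats,
    assumed pre-sorted by depth."""
    color, transmittance = np.zeros(3), 1.0
    for mean2d, cov2d, c, a in splats:
        d = pixel - mean2d
        g = np.exp(-0.5 * d @ np.linalg.inv(cov2d) @ d)
        w = a * g                          # this splat's contribution
        color += transmittance * w * c
        transmittance *= (1.0 - w)
    return color
```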


The Problem: Screen-Space Dilation

If a projected Gaussian is smaller than one pixel, it can fall between sample points and leave holes. To avoid this, the original 3DGS applies a fixed screen-space blur known as 2D dilation, adding \(s\mathbf{I}\) to every projected covariance:

\[
\mathcal{G}_k^{\text{dil}}(\mathbf{x}) = e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^\top \left(\boldsymbol{\Sigma}_k^{2D} + s\mathbf{I}\right)^{-1} (\mathbf{x}-\mathbf{p}_k)}
\]

(with \(\mathbf{p}_k\) now the projected 2D mean). While this stabilizes training, the blur is the same at every rendering resolution and, unlike a true low-pass filter, comes with no energy normalization, so it introduces scale-specific artifacts.
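
In code, dilation is a single covariance offset, identical at every rendering resolution; a sketch (the value of \(s\) is illustrative):

```python
import numpy as np

def dilate_2d(cov2d: np.ndarray, s: float = 0.3) -> np.ndarray:
    """Original 3DGS screen-space dilation: add a fixed s*I (pixel^2 units)
    to the projected covariance. Note that s does not depend on the
    rendering resolution, which is the root of the scale artifacts."""
    return cov2d + s * np.eye(2)
```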


The Core Issue: Zoom-In vs. Zoom-Out

Because of the fixed dilation, a well-sized Gaussian and a degenerate, ultra-thin one can render almost identically at the training scale: the dilation hides the difference. The optimization therefore has little reason to keep Gaussians appropriately sized, and this shrinkage bias means training often produces many ultra-small Gaussians.

Zoom-In: Erosion & High-Frequency Noise

When zooming in, projected footprints grow while the dilation stays fixed, so the dilation becomes negligible and the degenerate, ultra-thin Gaussians are exposed. Thin gaps open between them, causing erosion artifacts and noise:

  • Thin objects look unnaturally sparse.
  • High-frequency speckling emerges.

Zoom-Out: Dilation, Brightness & Aliasing

Zooming out shrinks the projected Gaussians, while the fixed dilation keeps the same size in screen space and comes to dominate their footprints:

  • Dilation artifacts: thin details appear bloated.
  • Energy spread: dilation enlarges each splat without renormalizing it, so its total energy grows and the image becomes artificially bright.
  • Aliasing: high-frequency details clash with pixel sampling, causing jaggedness.

Mip-Splatting: The Two-Part Solution

1. The 3D Smoothing Filter — Zoom-In Remedy

The authors ground this filter in the Nyquist-Shannon sampling theorem: no training camera can observe detail finer than its own sampling rate, so each Gaussian's minimum size should be limited by the best rate at which the training data sampled it.

Finding Sampling Limits
For each Gaussian, they compute the world-space sampling interval \(\hat{T}\):

\[
\hat{T} = \frac{1}{\hat{\nu}} = \frac{d}{f}
\]

where \(d\) is the Gaussian's depth in the camera frame and \(f\) is the focal length in pixels.
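
For example, a camera with focal length \(f = 1000\) pixels viewing a Gaussian at depth \(d = 2\,\text{m}\) samples the world at intervals of \(\hat{T} = 2\,\text{m} / 1000 = 2\,\text{mm}\) per pixel; detail finer than that is invisible to this camera.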

From all training cameras that can see it, they find the highest sampling frequency \(\hat{\nu}_k\):

\[
\hat{\nu}_k = \max\left(\left\{\, \mathbb{1}_n(\mathbf{p}_k) \cdot \frac{f_n}{d_n} \,\right\}_{n=1}^{N}\right)
\]

where \(\mathbb{1}_n(\mathbf{p}_k)\) indicates whether the Gaussian is visible from camera \(n\).
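
A sketch of how this maximum might be gathered across training cameras; the camera attributes (`W`, `t`, `f`, `is_visible`) are assumptions of this illustration:

```python
import numpy as np

def max_sampling_rate(p_k: np.ndarray, cameras) -> float:
    """Highest sampling frequency nu_k = max_n f_n / d_n over all training
    cameras that see the Gaussian centered at p_k."""
    nu_k = 0.0
    for cam in cameras:
        if not cam.is_visible(p_k):        # indicator term 1_n(p_k)
            continue
        depth = (cam.W @ p_k + cam.t)[2]   # depth d_n in camera n's frame
        nu_k = max(nu_k, cam.f / depth)    # f_n in pixels
    return nu_k                            # sampling interval: T_hat = 1 / nu_k
```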

Applying the Filter
They convolve each Gaussian \(\mathcal{G}_k\) with a low-pass Gaussian \(\mathcal{G}_{\text{low}}\):

\[
\mathcal{G}_k^{\text{reg}}(\mathbf{x}) = \left(\mathcal{G}_k \otimes \mathcal{G}_{\text{low}}\right)(\mathbf{x})
\]

where \(\mathcal{G}_{\text{low}}\) has covariance \(\frac{s}{\hat{\nu}_k^2}\mathbf{I}\) for a scale hyperparameter \(s\).

Because convolving two Gaussians yields another Gaussian whose covariance is the sum of theirs, the filter reduces to simple covariance addition:

\[
\mathcal{G}_k^{\text{reg}}(\mathbf{x}) = \sqrt{\frac{|\boldsymbol{\Sigma}_k|}{\left|\boldsymbol{\Sigma}_k + \frac{s}{\hat{\nu}_k^2}\mathbf{I}\right|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^\top \left(\boldsymbol{\Sigma}_k + \frac{s}{\hat{\nu}_k^2}\mathbf{I}\right)^{-1} (\mathbf{x}-\mathbf{p}_k)}
\]

This ensures that no Gaussian is sharper than the best training view could resolve, eliminating erosion and high-frequency noise when zooming in.
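
A minimal sketch of the filter, assuming the normalization factor is folded into opacity (equivalent during blending, since opacity simply scales the Gaussian's peak); the hyperparameter value is illustrative:

```python
import numpy as np

def smooth_3d(cov: np.ndarray, alpha: float, nu_k: float, s: float = 0.2):
    """3D smoothing by covariance addition: Sigma_reg = Sigma + (s / nu_k^2) I.
    The determinant ratio preserves each Gaussian's total energy; here it
    rescales opacity, which has the same effect during alpha blending."""
    cov_reg = cov + (s / nu_k**2) * np.eye(3)
    scale = np.sqrt(np.linalg.det(cov) / np.linalg.det(cov_reg))
    return cov_reg, alpha * scale
```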


Diagram: sampling rates from multiple cameras.

Figure 3: Different cameras provide different sampling intervals. The smallest interval sets the maximum resolvable detail.


2. The 2D Mip Filter — Zoom-Out Remedy

To replace fixed dilation, the authors introduce a physically-driven anti-aliasing filter.

Inspired by mipmapping, this models how a camera pixel integrates light over its area. The ideal would be a box filter, but they use a Gaussian approximation sized exactly to a single pixel:

\[
\mathcal{G}_k^{\text{mip}}(\mathbf{x}) = \sqrt{\frac{|\boldsymbol{\Sigma}_k^{2D}|}{\left|\boldsymbol{\Sigma}_k^{2D} + s\mathbf{I}\right|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^\top \left(\boldsymbol{\Sigma}_k^{2D} + s\mathbf{I}\right)^{-1} (\mathbf{x}-\mathbf{p}_k)}
\]

with \(s\) chosen so the filter spans a single pixel at the current rendering resolution.

Unlike dilation, this filter is sized to the pixel grid and its normalization term keeps each splat's total energy constant, preventing aliasing without excessive blurring or brightness inflation.
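
Structurally, the Mip filter is dilation plus the missing normalization; a sketch in the same style as before (the default value of `s` is illustrative):

```python
import numpy as np

def mip_filter_2d(cov2d: np.ndarray, alpha: float, s: float = 0.1):
    """2D Mip filter: a pixel-sized Gaussian low-pass approximating the box
    filter of one pixel. The determinant ratio preserves splat energy, here
    folded into opacity as in smooth_3d."""
    cov_mip = cov2d + s * np.eye(2)
    scale = np.sqrt(np.linalg.det(cov2d) / np.linalg.det(cov_mip))
    return cov_mip, alpha * scale
```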


Experiments & Results

Zoom-Out Test: Blender Dataset

Models are trained at full resolution and rendered at lower resolutions to simulate zooming out:

Blender results table.

Table 1: PSNR drops sharply for 3DGS when zooming out; Mip-Splatting remains high.

Visual comparison on Blender zoom-out.

Figure 4: Mip-Splatting retains fine structure at low resolutions; others blur or distort.


Zoom-In Test: Mip-NeRF 360 Dataset

Models are trained at \(1/8\) resolution and rendered at higher resolutions to simulate zooming in:

Mip-NeRF 360 results table.

Table 2: Mip-Splatting delivers clean detail across upscales; others suffer erosion or noise.

Visual comparison on Mip-NeRF 360 zoom-in.

Figure 5: Mip-Splatting avoids artifacts and matches ground truth closely.


In-Distribution Performance

On standard same-scale benchmarks, Mip-Splatting matches 3DGS, showing that the new filters do not sacrifice quality when the scale remains unchanged.


Conclusion

Mip-Splatting exemplifies excellent research: identify a critical flaw, trace its cause, and implement a principled fix.

By replacing ad-hoc dilation with:

  • A 3D smoothing filter that constrains scene detail to the sampling limits of the training data, fixing zoom-in artifacts.
  • A 2D Mip filter that provides physically-grounded anti-aliasing, fixing zoom-out artifacts.

Mip-Splatting makes 3DGS adaptable to arbitrary scales—a necessity for VR, games, and visual effects where camera movement is unrestricted.

With Mip-Splatting, zoom-ins and zoom-outs preserve the stunning clarity of 3DGS, no matter the viewpoint.