In evolution—whether biological or computational—we use models to understand how variation and selection create complex systems. But most computational models have a key limitation: their evolutionary rules are set by the programmer. The modeler determines how often mutations happen, what kinds of changes are allowed, and how they are distributed. That’s like studying a forest by only observing trees you planted yourself.
What if mutation—the very engine of evolution—could emerge from within the system itself?
A new study by Boaz Shvartzman and Yoav Ram presents a model that does exactly that. Their Self-Replicating Artificial Neural Network (SeRANN) is a neural network that not only learns to perform a task, but also learns to copy its own “genetic” code. Both reproductive success and mutation are endogenous, meaning they emerge naturally from the system’s dynamics rather than being imposed externally.
When the researchers evolved a population of 1,000 SeRANNs for 6,000 generations, they observed a rich array of evolutionary phenomena—adaptation, clonal interference, epistasis, and even evolution of the mutation rate itself. The SeRANN framework offers a fresh and open-ended way to explore what it means for evolution to arise spontaneously.
Background: The Limits of Explicit Mutation
Artificial-life platforms such as Avida and aevol have long been used to study digital evolution. These systems allow small computer programs—digital organisms—to compete and reproduce, with their fitness emerging from performance on computational tasks.
However, mutation in these models is still explicit: an external algorithm determines how to copy the genotype and injects random changes using predefined distributions. This pre-scripted randomness constrains what evolution can discover, since the variability itself is not part of the evolving organism.
In contrast, biological evolution is messier. Mutation rates and patterns are shaped by the organism’s own genes, metabolism, and environment. If replication and mutation could be learned and internalized, we could model a form of evolution that is truly open-ended.
How SeRANN Works
The authors designed SeRANN to perform two simultaneous tasks:
- A fertility task—a standard image classification problem.
- A replication task—copying its own genetic information.
Fig 1. The SeRANN evolutionary framework. A juvenile SeRANN learns to classify images and copy genotypes (A). As an adult, it is evaluated on these tasks to determine fertility and produce offspring (B). Fertility and drift shape the next generation (C). Offspring genotypes are decoded to new source code (D, E), continuing the cycle.
The Genotype and Phenotype
- Genotype: A 100-bit string representing heritable information—similar to biological DNA.
- Phenotype: Python source code defining the neural network’s architecture, including layer types, connections, and hyperparameters.
Changes in the genotype are mapped to changes in the phenotype, meaning a single bit mutation could alter the architecture (e.g., layer size or type). The phenotype itself then performs the replication—an elegant loop where the product of evolution becomes its own mechanism of inheritance.
Two Tasks, One Network
Each SeRANN takes an image \( X \) and its genotype \( g \), and outputs:
\[ (\hat{y}, g') = F(X, g) \]

where \( \hat{y} \) is the predicted label for the image (fertility task), and \( g' \) is the copied genotype (replication task).
Errors in copying—where \( g' \neq g \)—represent spontaneous mutations.
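This input–output contract can be sketched with toy stand-ins. Everything below (the linear readout, the noise level, the array shapes) is an illustrative assumption, not the paper's architecture; the point is only the interface: one network, two outputs, and mutations defined as copying errors.

```python
import numpy as np

rng = np.random.default_rng(0)

def serann_forward(X, g, W):
    """Toy stand-in for F(X, g).

    Returns (y_hat, g_prime): a predicted class label for the image
    (fertility task) and a possibly imperfect copy of the genotype
    bits (replication task).
    """
    y_hat = int(np.argmax(W @ X))  # fertility: classify the image
    # Noisy reconstruction of the genotype: most bits copy correctly,
    # but occasionally the noise flips one -- a spontaneous mutation.
    logits = g.astype(float) - 0.5 + rng.normal(0.0, 0.3, size=g.shape)
    g_prime = (logits > 0).astype(int)
    return y_hat, g_prime

def mutation_count(g, g_prime):
    """Copying errors (positions where g' != g) are the mutations."""
    return int(np.sum(g != g_prime))

g = rng.integers(0, 2, size=100)   # 100-bit genotype
X = rng.normal(size=64)            # flattened toy "image"
W = rng.normal(size=(10, 64))      # untrained readout weights
y_hat, g_prime = serann_forward(X, g, W)
print(y_hat, mutation_count(g, g_prime))
```

Because the copy is produced by the network itself, the mutation rate is a property of the evolving phenotype rather than an external parameter.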
The Dual Loss Function
SeRANNs are trained using gradient descent on a composite loss:
\[ \ell(y, \hat{y}, g, g') = \alpha \cdot \ell_X(y, \hat{y}) + (1 - \alpha) \cdot \ell_g(g, g') \]

Here, \( \ell_X \) measures classification error, \( \ell_g \) measures replication accuracy, and \( \alpha \) (the loss_weight) controls their relative importance.
Crucially, the loss_weight itself is encoded in the genotype. Through mutation and selection, SeRANNs evolve their own balance between learning performance and replication fidelity—allowing even the mutation rate to change over time.
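A minimal sketch of the composite loss, assuming cross-entropy for both terms (the paper's exact loss terms may differ). Here `alpha` is passed in as an argument; in SeRANN it is decoded from the genotype, so selection can tune it.

```python
import numpy as np

def composite_loss(y, y_hat_probs, g, g_prime_probs, alpha):
    """ℓ = α·ℓ_X + (1 − α)·ℓ_g, with cross-entropy for both terms.

    y            : true class index
    y_hat_probs  : predicted class probabilities
    g            : true genotype bits (0/1)
    g_prime_probs: predicted per-bit copy probabilities
    alpha        : the loss_weight trading classification vs. fidelity
    """
    eps = 1e-12  # avoid log(0)
    loss_x = -np.log(y_hat_probs[y] + eps)  # classification error
    loss_g = -np.mean(
        g * np.log(g_prime_probs + eps)
        + (1 - g) * np.log(1 - g_prime_probs + eps)
    )  # per-bit replication error
    return alpha * loss_x + (1 - alpha) * loss_g
```

With `alpha` near 1 the gradient pushes almost entirely on classification, so replication is sloppy and mutations are frequent; with `alpha` near 0 the pressure reverses.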
Translating Genotypes: The RiboAE
Mapping a 100-bit genotype to syntactically valid Python code is difficult. Most random bit-flips would create broken code. To bridge the gap, the authors built a ribosomal autoencoder (RiboAE)—a separate neural network that acts as the translation machinery of SeRANN evolution.
- The RiboAE was trained on one million examples of valid SeRANN source code.
- It learned to encode code into bit-strings and decode those bit-strings back into runnable Python scripts.
- By design, small changes in genotype usually cause small or non-lethal changes in phenotype, mirroring biological “translational robustness.”
After training, RiboAE remains fixed, providing a stable genetic “language” for all SeRANNs.
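The encode/decode interface can be illustrated with a toy "ribosome" that decodes by nearest neighbour over a tiny hand-made codebook. The real RiboAE is a trained autoencoder over a million source-code examples, so everything here (the codebook, the snippets, the distance rule) is a hypothetical illustration of translational robustness only.

```python
import numpy as np

# Hypothetical codebook: bit patterns -> valid phenotype snippets.
CODEBOOK = {
    (0, 0, 0, 0): "Dense(units=32)",
    (1, 1, 1, 1): "Dense(units=64)",
    (1, 1, 0, 0): "Conv2D(filters=8)",
}

def decode(bits):
    """Return the snippet whose code word is nearest in Hamming distance.

    Because decoding snaps to the nearest valid entry, a single bit
    flip usually yields the same (or a similar) runnable program --
    the toy analogue of translational robustness.
    """
    arr = np.array(bits)
    keys = list(CODEBOOK)
    dists = [int(np.sum(arr != np.array(k))) for k in keys]
    return CODEBOOK[keys[int(np.argmin(dists))]]
```

In the actual system the decoder's smooth latent space plays this role: nearby genotypes tend to decode to syntactically valid, functionally similar scripts.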
The Evolutionary Cycle
Each generation unfolds like a biological life cycle:
- Training: Juvenile SeRANNs learn through gradient descent on image and replication tasks.
- Evaluation: Adults are tested. Classification accuracy determines fertility, while replication fidelity shapes mutation and offspring survival.
- Selection and Drift: Offspring are sampled in proportion to parental fertility, simulating natural selection and genetic drift.
- Gene Expression: Offspring genotypes are decoded through RiboAE. Invalid code leads to “death,” leaving only viable individuals.
Fitness thus equals fertility × survival—a complete evolutionary feedback loop.
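The selection, drift, and gene-expression steps above can be sketched as fertility-proportional sampling followed by survival culling. The population size, fertility values, and survival probabilities below are made-up numbers for illustration, not the paper's parameters.

```python
import numpy as np

def next_generation(fertilities, survival, pop_size, rng):
    """One generation: sample parents in proportion to fertility
    (selection + drift), then cull offspring whose decoded source
    code is invalid (the "gene expression" deaths)."""
    p = np.asarray(fertilities, dtype=float)
    p = p / p.sum()
    # Multinomial sampling: selection on average, drift in the noise.
    parents = rng.choice(len(p), size=pop_size, p=p)
    # Each offspring survives with its parent genotype's survival rate.
    alive = rng.random(pop_size) < survival[parents]
    return parents[alive]

rng = np.random.default_rng(42)
fert = np.array([0.2, 0.9, 0.5])
surv = np.array([1.0, 1.0, 0.0])  # genotype 2 always decodes to broken code
kids = next_generation(fert, surv, 1000, rng)
```

Genotype 1 is both fertile and viable, so it dominates the next generation; genotype 2, however fertile, contributes nothing, which is exactly the fitness = fertility × survival logic.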
Results: Evolution Emerges
Running this system for 6,000 generations revealed spontaneous evolutionary dynamics similar to those seen in real microbial populations.
Adaptive Evolution and Mutation Rate Decline
The population’s mean fitness steadily increased as the mutation rate plummeted.
Fig 2. Adaptive evolution of SeRANNs. Fitness rises while mutation rate drops sharply. Mutational robustness also improves. Color indicates generation across 6,000 cycles.
Initially, the ancestor had a high mutation rate (2–7 mutations per replication) because its high loss_weight favored classification over replication. Selection soon favored anti-mutator alleles: genotypes with a reduced loss_weight and therefore improved replication accuracy. By the end of the run, the mutation rate had fallen 175-fold, a clear case of “second-order selection” acting on the replication mechanism itself.
Allele Dynamics and Clonal Interference
Evolution advanced in bursts when beneficial mutations spread or fixed.
Fig 3. Allele frequency dynamics. Some mutants fixed rapidly, while others stalled. Highly advantageous alleles fixed faster, following real evolutionary patterns.
Complex phenomena emerged, such as clonal interference—multiple beneficial mutations competing simultaneously—and soft sweeps, where different lineages carrying the same mutation rise together.
A standout case involved the mutant allele at site 62.
Fig 4. Dynamics of the site-62 allele. The allele became common across genotypes (black line), but no single variant reached fixation. A hallmark of clonal interference in large asexual populations.
This allele appeared repeatedly in different genetic backgrounds, each variant rising to high frequency but never dominating—a direct analogue of microbial evolution under high mutation pressure.
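The signature described here, a single allele rising in frequency across several coexisting genetic backgrounds, can be computed directly from a 0/1 population matrix. The toy population below is hypothetical, not the paper's data.

```python
import numpy as np

def allele_frequency(population, site):
    """Frequency of the mutant (1) allele at a site, given the
    population as an (N individuals, L sites) 0/1 array."""
    return float(np.mean(population[:, site]))

def mutant_backgrounds(population, site):
    """Distinct whole genotypes carrying the site's mutant allele.
    Several coexisting backgrounds, none fixed, is the soft-sweep /
    clonal-interference signature seen at site 62."""
    mask = population[:, site] == 1
    return {tuple(row) for row in population[mask]}

pop = np.array([
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
])
```

Here the site-0 allele is at 75% frequency but is spread over two different genotypes, so no single variant is sweeping alone.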
Lineage Trade-offs: Fertility vs. Fidelity
The paper highlights a vivid three-generation story showing the tension between replication fidelity and reproductive success.
Fig 5. A parent’s mutation boosted fertility but caused catastrophic mutation rate increase. Its surviving offspring reverted the architecture and stabilized the lineage, showing how evolution navigates trade-offs between short-term gain and long-term stability.
Here, a mutation switched a layer from replication to classification—raising fertility but collapsing survival. Subsequent mutations repaired the architecture, reducing mutation rate and restoring evolutionary viability. This balancing act recapitulates natural trade-offs in mutator lineages.
Shaping the Landscape: Distribution of Fitness Effects
The distribution of fitness effects (DFE) describes how mutations impact fitness. SeRANN’s DFE was bimodal, with clusters of lethal and near-neutral mutations—matching yeast and viral data.
Fig 6. The DFE evolved over time: lethal mutations declined while neutral ones rose, a hallmark of “survival of the flattest.” SeRANNs occupy flatter, more robust regions of the fitness landscape as mutation rates drop.
Over generations, the proportion of lethal mutations decreased, while neutral mutations increased. This reflects survival of the flattest: under high mutation load, robustness trumps maximal fitness. Selection favored genotypes located in “flat” regions of the landscape where mutations do less harm.
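A DFE of this kind can be estimated by flipping each genotype bit once and measuring the relative fitness effect \( s = w_\text{mut}/w_\text{wt} - 1 \). The toy fitness function below (one lethal site plus several mildly beneficial ones) is an assumption chosen to produce a bimodal histogram, not the paper's measured landscape.

```python
import numpy as np

def fitness_effects(fitness_fn, g):
    """Fitness effect s = w_mut / w_wt - 1 for every single-bit
    mutant of genotype g. The histogram of s is the DFE."""
    w_wt = fitness_fn(g)
    effects = []
    for i in range(len(g)):
        mutant = g.copy()
        mutant[i] ^= 1  # flip one bit
        effects.append(fitness_fn(mutant) / w_wt - 1)
    return np.array(effects)

def toy_fitness(g):
    """Hypothetical landscape: bit 0 is lethal when mutated,
    the other bits are each mildly beneficial."""
    if g[0] == 1:
        return 0.0
    return 1.0 + 0.1 * g[1:].sum()

g = np.zeros(5, dtype=int)  # wild-type genotype
dfe = fitness_effects(toy_fitness, g)
```

The resulting effects cluster at \( s = -1 \) (lethal) and near zero (nearly neutral), the same bimodal shape reported for SeRANN, yeast, and viruses.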
Gene Interactions: Epistasis
Mutations rarely act alone. At sites 37 and 83, two individually beneficial mutations were disastrous when combined—their interaction halved offspring survival while multiplying mutation rate.
This phenomenon, known as sign epistasis, parallels biological constraints that shape real evolutionary trajectories. Such interactions were widespread, with positive and negative epistasis distributed across the genome, echoing patterns seen in viruses and bacteria.
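Interactions like the site-37/83 pair are conventionally quantified with a multiplicative epistasis measure, \( \epsilon = w_{ab} \cdot w_{wt} - w_a \cdot w_b \). The fitness values in the example below are hypothetical, not the measured values from the paper.

```python
def epistasis(w_ab, w_a, w_b, w_wt=1.0):
    """Multiplicative epistasis: eps = w_ab * w_wt - w_a * w_b.

    eps = 0  -> the mutations act independently (multiplicatively);
    eps < 0 with w_a, w_b > w_wt but w_ab < w_wt -> sign epistasis:
    each mutation helps alone but the pair is deleterious together,
    as at sites 37 and 83.
    """
    return w_ab * w_wt - w_a * w_b

# Hypothetical example: each mutation is beneficial alone (fitness 1.2
# and 1.3 vs. wild-type 1.0), but the double mutant collapses to 0.5.
eps = epistasis(w_ab=0.5, w_a=1.2, w_b=1.3)
```

A genome-wide scan of such pairwise values gives the mix of positive and negative epistasis the authors report.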
Phenotypic Variation and Learning Noise
Even identical genotypes produced variable phenotypes due to random initialization during training—an analogue for biological developmental noise. Fertility variation correlated positively with fitness, while high survival variability constrained evolution. SeRANN thus exhibits meaningful phenotypic variation arising purely from stochastic learning processes.
Why SeRANN Matters
Traditional computational evolution relies on fixed rules for generating variation. SeRANN dissolves that boundary: replication errors, mutation rate, and even learning trade-offs evolve. This leads to self-organized phenomena indistinguishable from those in living populations—adaptive evolution, clonal interference, epistasis, hitchhiking, and mutational robustness.
Because both replication and selection are internalized, SeRANN serves as a genuinely open-ended evolutionary system. Its genotype–phenotype map is complex and stochastic, making outcomes impossible to predict yet remarkably organic.
Looking Ahead
The authors suggest rich future directions:
- Evolution of division of labor: networks specializing into replication (“germ”) and task-performance (“soma”), paralleling multicellularity.
- Social vs. individual learning: evolving reinforcement learners that influence peers.
- Alternative tasks and architectures: exploring whether different learning problems yield different evolutionary patterns.
By embedding mutation generation directly into the learning process, SeRANN blurs the line between artificial intelligence and artificial life. It hints that the principles driving biological evolution—heritable variation and differential success—might apply to any system capable of learning and self-replication, even one built from Python code and pixels.