Large Language Models (LLMs) like Llama 3 or GPT-4 seem to have an encyclopedic knowledge of the world. Through pretraining on massive text datasets, they learn that “apple” is a fruit, “Monday” precedes “Tuesday,” and a “car” is a vehicle. These relationships form a sprawling semantic map inside the model—a representation space that encodes how words relate to each other.
But what happens when we challenge that map? What if, just for a single prompt, we tell the model that apple is now next to car, and bird is linked to milk? Can the model temporarily rewire its internal understanding and adopt a completely new semantic reality based purely on context?
That question sits at the core of the recent research paper “In-Context Learning of Representations.” The authors explore whether LLMs can fundamentally reorganize their internal representations based only on the information contained in the prompt. Their answer: yes—and the shift happens abruptly, like a switch flipping once enough evidence accumulates.
This discovery suggests LLMs may run an implicit optimization process that dynamically restructures how they represent meaning. Let’s dive into how the researchers uncovered this mechanism and what it teaches us about the evolving “brain” of artificial intelligence.
Setting the Stage: Worlds Within Words
Before we unpack the experiments, let’s clarify two key ideas.
1. Representations. Inside an LLM, every word or concept is represented as a high-dimensional vector—a list of thousands of numbers. Distances and directions between these vectors encode meaning. A classic example: king − man + woman ≈ queen.
This “vector arithmetic” shows that the model has learned geometric relationships between concepts. During pretraining, these structures mirror the relationships in natural language.
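If you want to see this geometry for yourself, static word embeddings (a much simpler cousin of LLM representations) make the analogy easy to check. A minimal sketch using gensim's pretrained GloVe vectors, which are downloaded on first use:

```python
# Minimal sketch of the classic analogy test using static GloVe word vectors
# (a simplification of LLM representations, but the geometric idea is the same).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # word -> 50-dim vector; downloads on first run

# king - man + woman should land near "queen" in the embedding space.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # expected: [('queen', <similarity score>)]
```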
2. In-Context Learning (ICL). Modern LLMs can learn new tasks within a prompt, without changing their weights. For instance, given examples like sea otter → loutre de mer, cheese → fromage, the model can correctly translate the next word. This ability reveals the model’s capacity to generalize rules from examples provided in context.
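As an illustration of the format (not the paper's exact prompt), a few-shot translation prompt might look like the string below; note that the task is specified entirely by the examples in the context, with no weight update.

```python
# An illustrative few-shot prompt: the "task" is defined purely by the
# in-context examples, not by any change to the model's weights.
prompt = (
    "sea otter -> loutre de mer\n"
    "cheese -> fromage\n"
    "peppermint -> "
)
# A pretrained LLM will typically complete this with "menthe poivrée",
# inferring the translation pattern from the two examples above.
```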
The paper merges these ideas to ask: when an LLM engages in in-context learning, does it simply reuse old representations? Or does it actually form a new semantic map—one adapted to the context?
The Core Method: Teaching an LLM a New Game
To answer this, the researchers designed an elegant experiment called “in-context graph tracing.” It creates a miniature world with brand-new rules and asks the model to learn its structure purely from examples.
Here’s the setup:
- Define a Structure. They start with a simple graph—like a 4×4 grid or a 10-node ring—where each node connects to others in a defined way (in a grid, each node has neighbors above, below, and to the sides).
- Assign Familiar Words. Common words such as apple, bird, car, and math are randomly placed on the graph. These spatial relationships are arbitrary—the position of apple next to bird says nothing about their usual meanings.
- Generate Examples. A random walk through the graph produces sequences like “apple, bird, milk, sand, sun, plane, opera, …”. These word sequences become the context fed to the model.
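To make the recipe concrete, here is a minimal sketch of how such contexts could be generated. This is a reconstruction of the setup described above, not the authors' code; the word list and grid size are placeholders.

```python
# Sketch of the in-context graph-tracing setup: words are assigned to the
# nodes of a 4x4 grid, and a random walk over the grid produces the word
# sequence that becomes the model's context.
import random

WORDS = ["apple", "bird", "car", "math", "milk", "sand", "sun", "plane",
         "opera", "box", "ring", "door", "fish", "lamp", "rope", "king"]  # placeholder list

def grid_neighbors(rows: int, cols: int) -> dict[int, list[int]]:
    """Adjacency list of a rows x cols grid, nodes numbered row-major."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            nbrs = []
            if r > 0:        nbrs.append(node - cols)
            if r < rows - 1: nbrs.append(node + cols)
            if c > 0:        nbrs.append(node - 1)
            if c < cols - 1: nbrs.append(node + 1)
            adj[node] = nbrs
    return adj

def random_walk_context(adj, node_words, length=200, seed=0):
    """Generate one context: a random walk over the graph, rendered as words."""
    rng = random.Random(seed)
    node = rng.choice(list(adj))
    walk = [node]
    for _ in range(length - 1):
        node = rng.choice(adj[node])
        walk.append(node)
    return [node_words[n] for n in walk]

adj = grid_neighbors(4, 4)
node_words = dict(zip(adj, random.Random(0).sample(WORDS, len(adj))))
context = random_walk_context(adj, node_words, length=50)
print(", ".join(context))
```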
Figure 1: The setup for the in-context graph tracing task.
After ingesting hundreds of these examples, the model receives the last word—say “opera”—and is asked to predict the next one. Success means predicting plane or box, i.e., a valid neighbor in the new grid, not a semantically related word like music.
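Evaluation is equally simple to sketch: a prediction counts as correct only if it is a graph neighbor of the last context word. The helper below is a hypothetical addition that reuses `adj` and `node_words` from the sketch above.

```python
# Evaluation sketch: the predicted word must be a graph neighbor of the
# last word in the context, under the random word-to-node assignment.
def is_valid_next(last_word: str, predicted: str, adj, node_words) -> bool:
    word_to_node = {w: n for n, w in node_words.items()}
    neighbors = {node_words[n] for n in adj[word_to_node[last_word]]}
    return predicted in neighbors

# e.g. is_valid_next("opera", "plane", adj, node_words) is True only if
# "plane" happens to sit next to "opera" in this particular assignment.
```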
The task tests whether the model can infer and internalize a context-defined structure—essentially, can it rebuild its conceptual map on the fly?
Peeking Inside the Model’s Mind: Visualizing the Shift
To understand how the model adapts, the researchers used Principal Component Analysis (PCA) to visualize internal activations. PCA projects high-dimensional data onto the few directions that capture the most variance, letting us plot each concept's representation as a point in two dimensions and see how the concepts are arranged relative to one another.
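In practice, this kind of probe can be assembled from standard tooling. The sketch below uses GPT-2 via Hugging Face `transformers` as a small stand-in (the paper works with larger Llama-class models, and its exact layer and token-selection choices differ): feed in one generated context, take an intermediate layer's hidden states, average them per concept word, and project with PCA. It reuses `adj`, `node_words`, and `random_walk_context` from the earlier sketch.

```python
# Probing sketch: extract hidden states for one graph-tracing context,
# pool them per concept word, and project to 2D with PCA.
import torch
from sklearn.decomposition import PCA
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

context = ", ".join(random_walk_context(adj, node_words, length=200))
inputs = tok(context, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).hidden_states[8]   # one intermediate layer, shape (1, seq, dim)

# Average the hidden states at positions where each concept word occurs.
# Matching only the first sub-token of each word is a simplification.
word_vecs, kept_words = [], []
for word in node_words.values():
    first_id = tok(" " + word, add_special_tokens=False)["input_ids"][0]
    positions = [i for i, t in enumerate(inputs["input_ids"][0].tolist()) if t == first_id]
    if positions:  # skip words whose tokenization we failed to match
        word_vecs.append(hidden[0, positions].mean(dim=0))
        kept_words.append(word)

coords = PCA(n_components=2).fit_transform(torch.stack(word_vecs).numpy())
# coords[i] is the 2D position of kept_words[i]; scatter-plotting these points
# is what reveals the ring or grid structure described below.
```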
The results were eye-opening. As the model consumed more context, its internal representations began to reorganize. Words positioned randomly on a grid or ring started forming geometric patterns that mirrored those layouts inside the model’s activation space.
Figure 2: Emergent ring-shaped representations in the model’s activations.
Initially, the representations looked like random scatter. But after enough examples—and particularly in deeper layers—distinct ring and grid structures emerged. This provided direct evidence that the model wasn’t simply memorizing associations; it was building a coherent internal geometry aligned with the in-context graph structure.
When New Rules Clash with Old Knowledge
Next, the researchers tested what happens when context-specific rules fight against strong pretrained semantic priors.
For example, previous work shows that LLMs represent the days of the week in a circular pattern—Monday next to Tuesday, Tuesday next to Wednesday, and so on. The team randomly shuffled this familiar sequence to create a new ring structure (e.g., connecting “Monday” to “Friday” or “Wednesday” to “Sunday”) and provided in-context examples from this scrambled ring.
Could the model override its deeply embedded weekday structure?
It could—but only partially.
Figure 3: When semantic priors conflict with in-context structures.
In the first two principal components (the dominant directions in the representation space), the model still exhibited the original “semantic ring.” But in the third and fourth components, the new ring structure emerged clearly. This implies the model reserves new representational dimensions for context-specific information while preserving its pretrained map—a fascinating form of cognitive flexibility.
Quantifying the Shift: Dirichlet Energy and Critical Context
Visualizations are insightful, but measurement matters. To quantify the reorganization, the authors borrowed a concept from mathematics and physics: Dirichlet energy.
Dirichlet energy measures how “smooth” a function is over a graph. Here, it captures how similar neighboring nodes are in representation space. Formally:
\[ E_{\mathcal{G}}(\boldsymbol{X}) = \sum_{i,j} \boldsymbol{A}_{i,j} \|\boldsymbol{x}_i - \boldsymbol{x}_j\|^2, \]

where \( \boldsymbol{A}_{i,j} = 1 \) if nodes \( i \) and \( j \) are connected (and 0 otherwise), and \( \boldsymbol{x}_i \) is the representation of node \( i \). Lower energy means that linked nodes have similar representations—indicating the model has learned the graph’s structure.
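Translated into code, the quantity is a direct transcription of the formula, computed from the adjacency matrix and a matrix of node representations. A minimal NumPy version might look like this:

```python
# Dirichlet energy of node representations X over a graph with adjacency A.
import numpy as np

def dirichlet_energy(X: np.ndarray, A: np.ndarray) -> float:
    """X: (n_nodes, d) representation matrix; A: (n_nodes, n_nodes) 0/1 adjacency."""
    diffs = X[:, None, :] - X[None, :, :]      # pairwise differences x_i - x_j
    sq_dists = (diffs ** 2).sum(axis=-1)       # squared distances ||x_i - x_j||^2
    return float((A * sq_dists).sum())         # sum weighted by the adjacency
```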
When the researchers plotted Dirichlet energy and task accuracy versus context length, a clear pattern appeared.
Figure 4: Representations reorganize as context grows—energy falls, accuracy jumps.
As context length increases, Dirichlet energy falls—signaling that neighboring words align more closely in representation space. Then, almost immediately, task accuracy spikes. The sharp change implies that once the model internalizes the graph’s geometry, it can suddenly perform the new task correctly.
This one-two punch—first energy minimization, then behavioral improvement—suggests the model performs an internal optimization to organize meaning efficiently from context.
Beyond Memorization: Implicit Optimization in Action
Could this behavior simply reflect memorization? Perhaps the model just copies which words appear near each other. The researchers tested baseline “memorization” strategies that simulate this and found these could not reproduce the observed accuracy curve.
Figure 5: LLM performance curves compared to memorization baselines.
Instead, the data showed a distinct two-phase ascent: first a slow learning period, then a rapid leap in accuracy—an emergent property unexplained by rote recall. The authors proposed the energy minimization hypothesis: LLMs implicitly search for the lowest-energy configuration of their representations under the constraints given by the context.
To test this hypothesis, they compared the model’s internal geometries to theoretical spectral embeddings, which mathematically minimize Dirichlet energy for a graph. Astonishingly, the model’s PCA visualizations matched these embeddings almost exactly.
Figure 6–7: Spectral embeddings—mathematically optimal low-energy representations—mirror the structures learned by the model.
This alignment suggests the model isn’t just memorizing connections; it’s running an implicit optimization algorithm that finds the most coherent representation of the in-context world.
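For intuition, the spectral embedding the authors compare against can be computed directly from the graph Laplacian. A small NumPy sketch (my illustration, not the paper's code):

```python
# Spectral embedding sketch: the lowest-frequency non-constant eigenvectors of
# the graph Laplacian give minimum-Dirichlet-energy coordinates for the nodes.
import numpy as np

def spectral_embedding(A: np.ndarray, dims: int = 2) -> np.ndarray:
    """Return `dims` coordinates per node from the Laplacian of adjacency A."""
    D = np.diag(A.sum(axis=1))
    L = D - A                                # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    # Skip the constant eigenvector (eigenvalue 0); take the next `dims`.
    return eigvecs[:, 1:1 + dims]
```

For a ring graph these two coordinates trace out a circle, and for a grid they recover lattice-like positions, matching the shapes seen in the PCA plots.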
In-Context Emergence: A Phase Transition in Learning
The authors went further—exploring how this sudden learning shift behaves across different graph sizes. Larger graphs require more context before the model achieves high accuracy, but in every case, the performance jump occurs sharply after a critical threshold.
Figure 8: In-context emergence and power-law scaling of the critical context point.
Remarkably, the critical context size scales as a power law with graph size—a hallmark of phase transitions seen in physics. The authors draw an analogy to percolation theory, where connections in a network suddenly form a large “connected component” once a density threshold is crossed. Similarly, the model seems to reach a tipping point where scattered contextual clues coalesce into a unified internal map.
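If you wanted to check such a scaling relationship yourself, the standard approach is a linear fit in log-log space. A small helper might look like this (illustrative only; the graph sizes and measured critical context lengths are supplied by you):

```python
# Power-law check: if t_c ~ n^alpha, then log(t_c) is linear in log(n).
import numpy as np

def fit_power_law(graph_sizes, critical_lengths):
    """Least-squares fit of log(t_c) = alpha * log(n) + c; returns (alpha, c)."""
    log_n = np.log(np.asarray(graph_sizes, dtype=float))
    log_t = np.log(np.asarray(critical_lengths, dtype=float))
    alpha, c = np.polyfit(log_n, log_t, deg=1)
    return alpha, c
```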
This perspective implies that in-context learning involves discrete, emergent transitions rather than gradual accumulation of knowledge—the moment when structure “clicks” into place.
The Bigger Picture: Why This Matters
This research reveals a startling adaptability in large language models. Far from static encyclopedias of language, they can flexibly reshape their internal semantics based purely on contextual input.
Key takeaways:
- Dynamic Representations: LLMs can reorganize their concept geometry in response to new, context-specified structures.
- Emergent Behavior: This reorganization occurs abruptly after a critical amount of context—resembling physical phase transitions.
- Implicit Optimization: The model’s behavior follows principles of energy minimization, constructing the most coherent internal map for the task.
These insights challenge how we think about “learning” in AI. Scaling context length alone may unlock new abilities—without retraining or fine-tuning. It suggests that prompting could act as a kind of temporary fine-tuning, capable of endowing models with fresh world structures within a single interaction.
Beyond technical implications, this work bridges AI research with cognitive science. Humans, too, build abstract “cognitive maps” from experience—represented in neural circuits like the hippocampus. The parallels between how models and brains form structured representations point to exciting opportunities for understanding general intelligence, both artificial and biological.
By discovering how LLMs remap their internal worlds on-the-fly, this research opens a new lens on the flexible, dynamic nature of artificial cognition—hinting that the true frontier of AI learning might not lie in larger models, but in richer contexts.