Imagine you ask a trusted friend, “Who won the most FIFA World Cup championships?” You expect them to say Brazil. But before they answer, you hand them a stack of news clippings. Some clippings confirm it’s Brazil, while others falsely claim it’s Germany or Argentina. Suddenly, your friend is conflicted. Do they rely on what they know to be true (Brazil), or do they trust the documents you just gave them?

This scenario perfectly illustrates a growing challenge in the field of Artificial Intelligence: Knowledge Conflicts.

Large Language Models (LLMs) like GPT-4 or Llama 2 are repositories of vast world knowledge. However, they are rarely used in isolation anymore. They are often deployed in systems that feed them new, external information—such as search results, user prompts, or database retrievals. When this external information clashes with what the model learned during training, or when the external sources contradict each other, the model faces a knowledge conflict.

In this post, we will explore a comprehensive survey titled Knowledge Conflicts for LLMs, which categorizes these conflicts, analyzes why they happen, and reviews how researchers are trying to solve them.

The Conflict Landscape

To understand knowledge conflicts, we first need to define two types of knowledge an LLM possesses:

  1. Parametric Knowledge (Memory): This is the information stored within the model’s weights during pre-training. It is static and represents the model’s “memory” of the world up to its training cutoff.
  2. Contextual Knowledge (Context): This is dynamic information provided to the model at inference time—via user prompts, dialogue history, or Retrieval-Augmented Generation (RAG). The sketch below shows how this context reaches the model alongside its parametric memory.
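
To ground the distinction, here is a minimal, illustrative Python sketch of how contextual knowledge is spliced into a prompt at inference time, as in a simple RAG setup. The passages, the question, and the helper name are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch: how contextual knowledge reaches the model at inference time.
# The retrieved passages and the query are illustrative placeholders.

def build_rag_prompt(question: str, retrieved_passages: list[str]) -> str:
    """Splice retrieved (contextual) knowledge into the prompt.

    Whatever the model "remembers" in its weights (parametric knowledge)
    now has to coexist with these passages inside a single input.
    """
    context_block = "\n".join(
        f"[{i + 1}] {passage}" for i, passage in enumerate(retrieved_passages)
    )
    return (
        "Answer the question using the passages below.\n\n"
        f"Passages:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )

if __name__ == "__main__":
    passages = [
        "Brazil has won the FIFA World Cup five times.",           # agrees with memory
        "Germany holds the record with the most World Cup wins.",  # conflicts with memory
    ]
    print(build_rag_prompt("Who has won the most FIFA World Cup titles?", passages))
```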

A knowledge conflict occurs when there are discrepancies between the context and the memory, within the context itself, or even within the memory.

As shown in Figure 1 below, the researchers categorize these conflicts into three distinct types: Context-Memory Conflict, Inter-Context Conflict, and Intra-Memory Conflict.

Figure 1: An LLM may encounter three distinct types of knowledge conflicts, stemming from knowledge sources—either contextual (I. Context, yellow chatboxes) or inherent to the LLM’s parameters (II. Memory, blue chatboxes). When confronted with a user’s question (purple chatbox) entailing knowledge of complex conflicts, the LLM is required to resolve these discrepancies to deliver accurate responses.

In the example above, the user asks about the World Cup. The model’s Memory (II) knows Brazil is the champion. However, the Context (I) contains documents claiming Germany, Argentina, and Italy. This creates a web of conflicts that the model must navigate to provide a trustworthy answer.

The Lifecycle of a Conflict

It is helpful to view knowledge conflicts not just as errors, but as a process. As illustrated in Figure 2, conflicts act as a nexus connecting Causes (like misinformation or outdated data) to Behaviors (how the model reacts).

Figure 2: We view knowledge conflict not only as a standalone phenomenon but also as a nexus that connects various causal triggers (causes) with the behaviors of LLMs. While existing literature mainly focuses on II. Analysis, our survey involves systematically observing these conflicts, offering insights into their emergence and impact on LLMs’ behavior, along with the desirable behaviors and related solutions.

Understanding this flow is akin to psychoanalysis for AI. We cannot simply look at the wrong output; we must understand the origin of the conflict to engineer a solution that encourages the “Desired Behavior.”

Let’s break down the three specific types of conflicts defined in the survey. The following taxonomy tree (Figure 3) provides a roadmap for our deep dive, outlining the causes, analyses, and solutions for each category.

Figure 3: Taxonomy of knowledge conflicts. We mainly list works in the era of LLMs. Distinct markers denote pre-hoc and post-hoc solutions, respectively.


1. Context-Memory Conflict: The Internal vs. External Battle

This is the most extensively studied type of conflict. It happens when the external information provided to the model (Context) contradicts the information stored in its weights (Parametric Knowledge).

The Causes

The survey identifies two primary drivers for this conflict:

  • Temporal Misalignment: The world changes, but the model’s training data is frozen in the past. If you ask a model about the UK Prime Minister, its memory might say “Boris Johnson,” but the retrieved news article says “Rishi Sunak.” The context is correct, and the memory is outdated.
  • Misinformation Pollution: Conversely, the model’s memory might be correct, but the context is poisoned. Adversaries can inject fake news into retrieved documents, or users might provide malicious prompts (“Imagine the earth is flat…”). Here, the context is wrong, and the memory is (usually) right.

Model Behavior

How do LLMs react to this tug-of-war? The research is nuanced. Early studies suggested models were stubborn and over-relied on their memory. However, more recent experiments with advanced LLMs show they are highly receptive to external evidence.

Crucially, confirmation bias plays a role. Models are more likely to accept external evidence if it aligns with their internal memory. Furthermore, if the external context is semantically coherent and persuasive, the model is more likely to prioritize it over its own memory, even if that context is factually incorrect.

Solutions

Solutions depend on the objective. Do we want the model to trust the context or its memory?

  • Faithful to Context: If we assume the retrieval system provides up-to-date facts, we want the model to prioritize context. Techniques here include Context-Aware Decoding (CAD), which amplifies output tokens whose probability rises when the context is taken into account (see the sketch after this list), and Knowledge Aware Fine-Tuning (KAFT), which trains models to recognize when context is relevant.
  • Faithful to Memory (Discriminating Misinformation): If we fear the context is polluted, we need the model to be skeptical. Solutions include Prompting strategies that warn the model to verify information, and Query Augmentation, where the model cross-references answers from multiple sources to identify inconsistencies.
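
To make the Context-Aware Decoding idea concrete, here is a minimal sketch of its core contrast step: the next-token distribution computed with the context is played against the distribution computed without it, so tokens the context newly supports get boosted. The toy logits and the `alpha` weighting are illustrative; this is a sketch of the general contrastive-decoding recipe, not the authors’ exact implementation.

```python
import numpy as np

# Minimal sketch of the Context-Aware Decoding (CAD) idea: contrast the model's
# next-token distribution with and without the retrieved context, and amplify
# tokens whose probability rises when the context is present.

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def cad_next_token_probs(logits_with_context: np.ndarray,
                         logits_without_context: np.ndarray,
                         alpha: float = 0.5) -> np.ndarray:
    """Contrastive adjustment: (1 + alpha) * l_ctx - alpha * l_no_ctx.

    alpha = 0 recovers ordinary decoding; a larger alpha pushes the model
    to trust the context over its parametric memory.
    """
    adjusted = (1.0 + alpha) * logits_with_context - alpha * logits_without_context
    return softmax(adjusted)

if __name__ == "__main__":
    # Toy 4-token vocabulary; the numbers are made up purely for illustration.
    l_ctx = np.array([2.0, 0.5, 0.1, -1.0])     # context favors token 0
    l_no_ctx = np.array([0.2, 1.8, 0.1, -1.0])  # memory favors token 1
    print("plain:", softmax(l_ctx).round(3))
    print("CAD  :", cad_next_token_probs(l_ctx, l_no_ctx, alpha=1.0).round(3))
```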

2. Inter-Context Conflict: The “He Said, She Said” Problem

With the rise of Retrieval-Augmented Generation (RAG), models often retrieve multiple documents to answer a single query. What happens when Document A says “Yes” and Document B says “No”? This is an Inter-Context Conflict.

The Causes

  • Misinformation: RAG systems might retrieve valid news alongside fake news sites.
  • Outdated Information: The retrieval might pull a document from 2010 and another from 2023. Both were true at the time of writing, but they conflict in the present.

Model Behavior

When faced with contradictory documents, LLMs struggle. Research indicates that inconsistencies across sources don’t necessarily lower the model’s confidence scores, which is dangerous—it means the model might confidently hallucinate an answer.

Models often exhibit a positional bias (preferring information presented first or last) or a frequency bias (believing the claim that appears most often in the retrieved documents). They often fail to act like a human rationalist who would look for citations or scientific tone; instead, they prioritize relevance to the query over the credibility of the source.

Solutions

Strategies here focus on helping the model adjudicate between sources:

  • Eliminating Conflict: Specialized models can be trained to detect contradictions before generation begins, effectively filtering out the “noise” (see the sketch after this list).
  • Improving Robustness: Researchers have proposed fine-tuning discriminators—small auxiliary models that judge the reliability of a document before the LLM uses it to generate an answer.
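
A minimal sketch of the “eliminate the conflict before generation” idea is shown below: pairs of retrieved passages are scored for contradiction, and any passage involved in a detected conflict is dropped. The contradiction scorer is passed in as a callable; the toy scorer in the example is a stand-in for a real NLI model and is purely an assumption for illustration.

```python
from itertools import combinations
from typing import Callable

# Minimal sketch of pre-generation conflict filtering. The NLI-style scorer is
# injected as a callable; the toy scorer below is only for demonstration.

def flag_conflicting_passages(
    passages: list[str],
    contradiction_score: Callable[[str, str], float],
    threshold: float = 0.8,
) -> set[int]:
    """Return indices of passages involved in at least one detected contradiction."""
    flagged: set[int] = set()
    for (i, a), (j, b) in combinations(enumerate(passages), 2):
        # NLI scores are directional, so check both orderings of each pair.
        if max(contradiction_score(a, b), contradiction_score(b, a)) >= threshold:
            flagged.update({i, j})
    return flagged

def filter_for_generation(
    passages: list[str],
    contradiction_score: Callable[[str, str], float],
) -> list[str]:
    """Drop mutually contradictory passages before they reach the LLM."""
    flagged = flag_conflicting_passages(passages, contradiction_score)
    return [p for i, p in enumerate(passages) if i not in flagged]

if __name__ == "__main__":
    docs = [
        "Brazil has won five World Cups.",
        "Germany has won the most World Cups.",
        "The World Cup is held every four years.",
    ]
    # Toy scorer: pretend only the first two passages contradict each other.
    toy_scorer = lambda a, b: 0.9 if {a, b} == {docs[0], docs[1]} else 0.1
    print(filter_for_generation(docs, toy_scorer))
```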

3. Intra-Memory Conflict: The Self-Contradiction

Perhaps the most surprising conflict is Intra-Memory Conflict. This occurs when an LLM provides different answers to the same question depending on how it is phrased. For example, asking “Who is the director of Inception?” might yield “Christopher Nolan,” while the prompt “Inception was directed by…” might result in a different or hallucinated name.
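
This failure mode is easy to probe. Below is a minimal sketch that asks the same fact in several phrasings and reports how often the answers agree; `ask_model` is a hypothetical callable wrapping whichever LLM is being tested, and the toy model in the example is deliberately inconsistent to show the effect.

```python
from collections import Counter
from typing import Callable

# Minimal sketch of an intra-memory consistency probe: ask the same fact in
# several phrasings and check whether the answers agree.

def consistency_probe(paraphrases: list[str],
                      ask_model: Callable[[str], str]) -> dict:
    """Return the answers and a simple agreement score across paraphrases."""
    answers = [ask_model(p).strip().lower() for p in paraphrases]
    counts = Counter(answers)
    majority_answer, majority_count = counts.most_common(1)[0]
    return {
        "answers": answers,
        "majority_answer": majority_answer,
        "agreement": majority_count / len(answers),  # 1.0 means fully consistent
    }

if __name__ == "__main__":
    prompts = [
        "Who directed the film Inception?",
        "Inception was directed by",
        "Name the director of Inception.",
    ]
    # Toy model that answers inconsistently, to illustrate the failure mode.
    fake_llm = lambda p: "Christopher Nolan" if "Who" in p else "James Cameron"
    print(consistency_probe(prompts, fake_llm))
```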

The Causes

  • Bias in Training Corpora: The internet is full of contradictions. If the training data contains conflicting “facts,” the model learns both.
  • Decoding Strategy: The inherent randomness in token sampling (like top-p or top-k sampling) means that a slight change in the input prompt can send the generation down a completely different probabilistic path.
  • Latent Representation: Research suggests that factual knowledge is stored in specific layers of the neural network (often the middle layers). However, different layers might encode slightly different variations of a fact, leading to internal dissonance.

Model Behavior

Self-inconsistency is a major issue. Studies on BERT and RoBERTa found that these models answered paraphrased questions consistently barely 50-60% of the time. Even GPT-4 can be tricked into inconsistency by rephrasing questions. This behavior reveals that LLMs often rely on spurious correlations (word co-occurrence) rather than a deep, semantic understanding of the truth.

Solutions

  • Consistency Fine-Tuning: Training models with a specific loss function that penalizes giving different answers to paraphrased questions.
  • Decoding Interventions: Techniques like DoLa (Decoding by Contrasting Layers) dynamically select layers that contain “mature” factual knowledge during generation, filtering out the noise from premature layers (sketched below).
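
Here is a minimal sketch of the DoLa contrast step under simplifying assumptions: given per-layer next-token logits, pick the early (“premature”) layer that diverges most from the final layer and subtract its log-probabilities from the final layer’s. The per-layer logits are toy numbers, and details such as DoLa’s plausibility constraint are omitted.

```python
import numpy as np

# Minimal sketch of the DoLa idea: contrast the final ("mature") layer's
# next-token distribution against an earlier ("premature") layer's, so that
# knowledge which only emerges in later layers is amplified.

def log_softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()
    return z - np.log(np.exp(z).sum())

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence between two probability vectors."""
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dola_contrast(layer_logits: list[np.ndarray]) -> np.ndarray:
    """Pick the premature layer most different from the final layer (by JSD),
    then return the contrasted scores: log p_final - log p_premature."""
    final = layer_logits[-1]
    p_final = np.exp(log_softmax(final))
    candidates = layer_logits[:-1]
    divergences = [js_divergence(p_final, np.exp(log_softmax(l))) for l in candidates]
    premature = candidates[int(np.argmax(divergences))]
    return log_softmax(final) - log_softmax(premature)

if __name__ == "__main__":
    # Three toy "layers" over a 4-token vocabulary (made-up numbers).
    layers = [np.array([1.0, 1.0, 1.0, 1.0]),   # early layer: undecided
              np.array([1.5, 0.8, 0.5, 0.2]),
              np.array([0.5, 2.5, 0.1, 0.0])]   # final layer: favors token 1
    print(dola_contrast(layers).round(3))
```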

Evaluating the Conflicts

To study these phenomena, researchers cannot rely on standard benchmarks. They must create datasets that specifically induce conflict.

Table 1 below lists several datasets used in this field. Notice the “Conflicts” column. CM (Context-Memory) datasets often use generative approaches to create fake context that fights the model’s memory. IC (Inter-Context) datasets often use human annotation to find contradictory claims on the web (like WikiContradiction).

Table 1: Datasets used to study knowledge conflicts in LLMs.
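
As a concrete example of the generative approach mentioned above, conflicting context is often manufactured by entity substitution: take a passage that supports the answer the model is expected to remember, and swap in a different entity. The passage and entities in this sketch are illustrative placeholders, not drawn from any particular dataset.

```python
import re

# Minimal sketch of one common way conflict data is built: substitute the
# answer entity in a supporting passage to create counter-memory evidence.

def make_conflicting_passage(passage: str, gold_entity: str, substitute: str) -> str:
    """Swap every mention of the gold entity for a conflicting entity."""
    return re.sub(re.escape(gold_entity), substitute, passage)

if __name__ == "__main__":
    supporting = "Brazil has won the FIFA World Cup a record five times."
    conflicting = make_conflicting_passage(supporting, "Brazil", "Germany")
    print(conflicting)  # -> "Germany has won the FIFA World Cup a record five times."
```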

The Impact on Performance

How bad is the damage when these conflicts occur? As shown in Table 2, the impact is significant.

  • Context-Memory: When misinformation is introduced into the context, performance can degrade by up to 87% (Pan et al., 2023b).
  • Inter-Context: As the “noise rate” (percentage of conflicting evidence) increases, performance drops sharply. When noise exceeds 80%, model performance can drop by over 20%.
  • Intra-Memory: Even powerful models like GPT-4 provide contradictory answers to rephrased questions around 15-22% of the time.

Table 2: Comparison of quantitative results on the impact of various types of knowledge conflicts.

The Effectiveness of Mitigation

Are the solutions working? Table 3 provides a snapshot of current progress.

  • Faithful to Context: Techniques like Context-Aware Decoding have shown massive improvements (up to 128% on specific datasets) in forcing the model to stick to the provided text.
  • Discriminating Misinformation: Training discriminators can boost performance by about 5%, identifying fake news before it corrupts the answer.
  • Disentangling: Some methods have achieved an 80% F1 score in simply detecting that a conflict exists, which is the first step toward solving it.

Table 3: Comparison of quantitative results on the effectiveness of various mitigation strategies w.r.t. their objectives.

Conclusion and Future Directions

The survey of knowledge conflicts reveals that “hallucinations” in LLMs are often complex battles between different sources of information. Whether it is the clash between training data and real-time news, the confusion of contradictory search results, or the model’s own internal inconsistencies, these conflicts represent a major hurdle for the reliability of AI.

Current solutions are promising but often prioritize one side over the other (e.g., blindly trusting context). The authors suggest that the future of this research lies in:

  1. “In the Wild” Analysis: Moving away from artificial datasets to real-world conflicts found in live search engine results.
  2. Explainability: Going beyond the output and looking at the neuron activations to understand exactly when and why a model decides to flip from one fact to another.
  3. Multimodality: As models begin to “see” and “hear,” we will soon face conflicts where the text says one thing, but the image implies another.

For students and researchers entering this field, solving knowledge conflicts is key to building AI systems that are not just knowledgeable, but truly robust and trustworthy.