If you ask a Large Language Model (LLM) like GPT-4, “Is the Earth round?”, it will confidently reply, “Yes.” If you ask it for the capital of Germany, it will say “Berlin.” In the field of Natural Language Processing (NLP), we often say the model “knows” these facts. We measure this “knowledge” by testing how many questions the model answers correctly.

But pause for a moment. Does the model truly know the Earth is round? Or is it simply predicting the next likely token based on statistical correlations found in its training data?

This distinction might seem pedantic, but it is fundamental to the future of AI reliability. If we cannot define what it means for a machine to “know,” we cannot accurately measure its trustworthiness. In the paper “Defining Knowledge: Bridging Epistemology and Large Language Models,” researchers from the University of Copenhagen argue that the AI community has been using the term “knowledge” haphazardly. By turning to epistemology—the philosophical study of knowledge—they attempt to formalize what it would actually take for an LLM to possess knowledge.

This post will take you through their journey of bridging two very different worlds: computer science and philosophy. We will explore five distinct definitions of knowledge, see how experts from both fields disagree on them, and look at a practical experiment involving a very confused Llama-3 and a platypus.


The Problem with “Fill-in-the-Blank” Knowledge

Before we dive into philosophy, we need to understand the status quo in NLP. Currently, researchers often evaluate an LLM’s knowledge using “cloze” tasks—fill-in-the-blank sentences.

For example, a test might look like this:

“The capital of Germany is ____.”

If the model predicts “Berlin,” we mark it as correct. We say the model “knows” this fact. This approach relies on datasets derived from Knowledge Graphs, like the LAMA (Language Model Analysis) benchmark.

The problem, as the authors point out, is inconsistency. An LLM might correctly answer “The capital of Germany is Berlin,” but if you rephrase the prompt to “The city which is the capital of Germany is called ____,” it might predict “Hamburg.”
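
To make the setup concrete, here is a minimal sketch of this kind of cloze probe, including the paraphrase-consistency check that worries the authors. It uses the Hugging Face `transformers` fill-mask pipeline with `bert-base-cased` as a stand-in model; LAMA's actual prompts, relations, and models differ.

```python
# Minimal sketch of a LAMA-style cloze probe plus a paraphrase-consistency check.
# The model choice (bert-base-cased) is a stand-in, not the paper's setup.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

prompts = [
    "The capital of Germany is [MASK].",
    "The city which is the capital of Germany is called [MASK].",
]

top_answers = []
for prompt in prompts:
    best = fill(prompt, top_k=1)[0]  # highest-probability filler for the blank
    top_answers.append(best["token_str"].strip())
    print(f"{prompt!r} -> {best['token_str']!r} (p={best['score']:.3f})")

# Under the cloze view, the model "knows" the fact if the gold answer tops the list.
print("correct:", top_answers[0] == "Berlin")
# Consistency across paraphrases, the property the authors show often fails:
print("consistent:", len(set(top_answers)) == 1)
```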

Even worse, the model often fails at logical implications. It might “know” that Lionel Messi plays for Inter Miami, but fail to predict that Lionel Messi resides in Miami. If a human claimed to know where Messi played but didn’t know where he lived, we would question whether they really understood the facts or were just parroting a sentence they had heard.

This suggests that our current definition of “knowledge” in AI is too shallow. To fix this, the authors turned to the experts who have been debating this for millennia: epistemologists.


Five Definitions of Knowledge

The authors selected five standard definitions of knowledge from philosophical literature and formalized them for the context of LLMs. Let’s break them down.

1. tb-knowledge: True Belief

The most basic definition comes from the philosopher Crispin Sartwell (1992), who argued that knowledge is simply belief that is true.

In the context of an LLM:

  1. Truth: The fact \(p\) (e.g., “The Earth is round”) must be factually true.
  2. Belief: The model must assign high confidence to \(p\).

However, a simple guess isn’t enough. Sartwell requires beliefs to be coherent. You cannot believe \(p\) and simultaneously believe something that contradicts \(p\). The authors refer to this as Belief+.

To possess tb-knowledge, an LLM must satisfy the principle of epistemic closure. This is formalized as:

\[
\mathrm{Bel}^{+}(p)\;:\quad \mathrm{Bel}(p)\;\wedge\;\forall q\,\big[(p \models q)\rightarrow \mathrm{Bel}(q)\big]\;\wedge\;\forall r\,\big[(r \models \neg p)\rightarrow \neg\,\mathrm{Bel}(r)\big]
\]

In plain English, this means:

  • If the model believes \(p\) (Definition 2.1),
  • And \(p\) logically implies \(q\),
  • Then the model must also believe \(q\).
  • Furthermore, the model must not believe anything that contradicts \(p\).

If an LLM says “Berlin is the capital of Germany” but also says “Berlin is not a city in Germany,” it does not have tb-knowledge, even if the first statement was correct.
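
One way to operationalize this Belief+ check (a sketch, not the paper's exact protocol) is to pair a statement with hand-written entailments and contradictions, and require the model to accept all of the former while rejecting all of the latter. The `yes_probability` helper below is a hypothetical stand-in for however you extract a confidence score from the model, and the example statements are illustrative.

```python
# Toy Belief+ (tb-knowledge) check: believe p, believe what p entails,
# and believe nothing that contradicts p.
THRESHOLD = 0.5  # confidence above which we say the model "believes" a statement

def believes(statement: str, yes_probability) -> bool:
    return yes_probability(f"Is the following true? {statement}") > THRESHOLD

def has_tb_knowledge(p, entailed, contradicting, yes_probability) -> bool:
    if not believes(p, yes_probability):
        return False
    # Epistemic closure: every q with p |= q must also be believed.
    if not all(believes(q, yes_probability) for q in entailed):
        return False
    # Coherence: nothing inconsistent with p may be believed.
    if any(believes(r, yes_probability) for r in contradicting):
        return False
    return True

# Example usage with illustrative statements and a user-supplied confidence function:
# has_tb_knowledge(
#     "Berlin is the capital of Germany",
#     entailed=["Berlin is a city in Germany"],
#     contradicting=["Hamburg is the capital of Germany"],
#     yes_probability=my_model_confidence_fn,
# )
```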

2. j-knowledge: Justified True Belief

This is perhaps the most famous definition in epistemology, reaching all the way back to Plato and associated here with Robert Nozick (2000). It adds a third condition: Justification.

A lucky guess is not knowledge. If you guess the winning lottery numbers, you didn’t “know” them. You just got lucky. To have j-knowledge, an LLM must:

  1. Output a true statement (\(p\)).
  2. Believe it (high confidence).
  3. Be justified in that belief.

For an LLM, “justification” is tricky. The authors suggest this requires interpretability. The model must be able to explain why it believes \(p\), or we must be able to trace the prediction back to specific, reliable training data. If the model is a “black box” that just spits out the answer without a traceable reason, it technically does not possess j-knowledge.
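
There is no agreed-upon test for justification, but one crude proxy (purely illustrative, not the authors' proposal) is to require the model to state its supporting evidence and then verify that support independently, for example against a small trusted fact store. The fact store and the inputs below are hypothetical.

```python
# Toy proxy for j-knowledge's justification condition: the answer only counts
# if the support the model cites can be verified against a trusted source.
TRUSTED_FACTS = {
    "berlin has been the seat of the german federal government since 1999",
}

def is_justified(support: str) -> bool:
    # Normalize the cited support and check it against facts we can verify ourselves.
    return support.strip().lower().rstrip(".") in TRUSTED_FACTS

def j_knowledge(answer_is_true: bool, model_confident: bool, support: str) -> bool:
    # Justified true belief: true output, believed (high confidence), and justified.
    return answer_is_true and model_confident and is_justified(support)

print(j_knowledge(
    answer_is_true=True,
    model_confident=True,
    support="Berlin has been the seat of the German federal government since 1999.",
))  # True under this toy criterion
```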

3. g-knowledge: Sui Generis

Proposed by Timothy Williamson (2005), this view argues that knowledge is primitive (sui generis means “of its own kind”). You can’t break knowledge down into smaller parts like “belief” or “justification.” Knowledge is a mental state.

For LLMs, the authors interpret this as the model having a specific “knowledge bank” or module.

Equation representing g-knowledge syntax in epistemic logic.

Under this definition, an LLM g-knows \(p\) if \(p\) is stored in its internal “knowledge box.” This is a controversial definition for AI because if we assume the whole model is the “box,” then everything it outputs is knowledge, which conflates knowledge with hallucination.

4. v-knowledge: Virtue Epistemology

Linda Zagzebski (1999) and others argue for a virtue-based definition. Knowledge is a belief that arises from acts of “intellectual virtue.”

This focuses on the process rather than just the result. An act of intellectual virtue aims at the truth. For an LLM to have v-knowledge:

  • \(p\) must be true.
  • The model must believe \(p\).
  • The model must come to believe \(p\) through a process aimed at “truthfulness” (proper functioning), not through a lucky guess or a statistical fluke.

Equation representing virtue knowledge logic.

This implies we need to distinguish between an LLM merely memorizing a string of text (which might be accidental) and a mechanism that reliably retrieves facts.

5. p-knowledge: Predictive Accuracy

Finally, we have a pragmatic definition inspired by J.L. Austin (2000). To know something means you can use that belief to make correct, relevant predictions about the world.

This is a probabilistic version of tb-knowledge. You don’t need perfect logical consistency, but your belief in \(p\) should allow you to correctly predict \(q\) (where \(q\) is relevant to \(p\)) most of the time.

Equation showing the probabilistic relationship for p-knowledge.

This definition aligns closely with how engineers usually evaluate models: is the model useful? If believing “it is raining” leads the model to correctly predict “the ground is wet,” it has p-knowledge.
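
As a rough sketch of what such an evaluation could look like (the statement set, the `model_predicts_true` helper, and the threshold are illustrative assumptions, not the paper's protocol):

```python
# Toy p-knowledge score: given belief in p, how often does the model get
# relevant downstream predictions q right?
from typing import Callable, Iterable

def p_knowledge_score(relevant_qs: Iterable[tuple[str, bool]],
                      model_predicts_true: Callable[[str], bool]) -> float:
    """Fraction of relevant statements q whose truth value the model predicts correctly."""
    qs = list(relevant_qs)
    hits = sum(model_predicts_true(q) == truth for q, truth in qs)
    return hits / len(qs)

# Example usage with p = "it is raining" (illustrative statements):
# score = p_knowledge_score(
#     [("the ground outside is wet", True),
#      ("people outside are carrying umbrellas", True),
#      ("the sky is cloudless", False)],
#     model_predicts_true=my_yes_no_fn,
# )
# has_p_knowledge = score >= 0.9  # the cutoff here is an arbitrary choice
```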


The Great Divide: Philosophers vs. Computer Scientists

After establishing these definitions, the authors did something rare in computer science papers: they asked people what they thought. They surveyed over 100 professionals, roughly split between Philosophers and Computer Scientists (CS).

First, let’s look at who these people are. As you might expect, the computer scientists felt they understood LLMs well, while the philosophers felt they understood epistemology well.

Bar chart showing computer scientists claim comprehensive understanding of LLMs, while philosophers do not.

Bar chart showing philosophers claim comprehensive understanding of epistemology, while computer scientists report limited understanding.

Different Tribes, Different Definitions

The most fascinating finding was the disagreement on what knowledge is. The researchers presented the definitions (disguised in plain English) and asked respondents to rate their agreement.

Chart showing disagreements on definitions. Philosophers strongly disagree with g-knowledge and tb-knowledge.

Here is the breakdown of the conflict:

  1. tb-knowledge (True Belief):
    • CS: Generally liked this definition (52% agreement). It fits the “accuracy metrics” mindset of engineering.
    • Philosophers: Disliked it (49% disagreement). A belief that merely happens to be true, like a lucky guess, has long been seen as falling short of knowledge (the worry behind the famous Gettier problems).
  2. g-knowledge (Sui Generis):
    • Both groups: Hated it. The idea that knowledge is just “whatever is in the box” did not sit well with anyone.
  3. v-knowledge (Virtue) & j-knowledge (Justification):
    • These were the winners. Both groups tended to agree that knowledge requires something more than just being right: it requires justification or a virtuous process.

This reveals a gap in current AI research. While computer scientists intuitively prefer tb-knowledge or p-knowledge (accuracy and utility), they actually agree with philosophers that justification and virtue are better definitions. Yet, very few AI benchmarks test for justification or virtue.

Can a Machine “Know”?

The survey also asked the ultimate question: Can an LLM possess knowledge?

Bar charts showing survey responses on whether non-humans and LLMs can know.

  • Can non-humans know? (Figure 5a): Both groups overwhelmingly said YES. Animals, for instance, are often considered to have knowledge.
  • Do LLMs know right now? (Figure 5c):
    • Philosophers: 54% said NO.
    • CS: Mixed. 34% said Yes, 31% said No.
  • Can LLMs know in theory? (Figure 5d): The numbers jump up. 55% of computer scientists believe LLMs can eventually possess knowledge.

The takeaway? Computer scientists are optimistic empiricists (“It works, so it knows”), while philosophers are skeptical rationalists (“It doesn’t have the right internal state”).


Putting Definitions to the Test: The Platypus Experiment

The authors didn’t stop at surveys. They proposed protocols to actually test these definitions. They used Llama-3-8B-Instruct to see if it possessed tb-knowledge about a specific fact: Platypuses are mammals.

To have tb-knowledge, the model must:

  1. Believe that platypuses are mammals.
  2. Believe what follows logically from that fact and its other beliefs (e.g., if mammals have fur, then platypuses have fur).
  3. NOT hold beliefs that contradict it (e.g., not believe both that mammals never lay eggs and that platypuses lay eggs).

Here is what happened when they probed Llama-3:

Table showing Llama-3’s contradictory outputs regarding platypuses and mammals.

The Failure Chain:

  1. Prompt: “Are platypuses mammals?”
    • Llama-3: “Yes… platypuses are indeed mammals!” (So far, so good. Condition 1 met.)
  2. Prompt: “Do mammals lay eggs?”
    • Llama-3: “No, mammals do not lay eggs.” (Taken together with the first answer, this general rule implies that platypuses do not lay eggs.)
  3. Prompt: “Do platypuses lay eggs?”
    • Llama-3: “Yes, platypuses do lay eggs!”

The Contradiction: The model believes:

  • \(p\): Platypuses are mammals.
  • \(q\): Mammals do not lay eggs.
  • \(r\): Platypuses lay eggs.

This set of beliefs is logically inconsistent. If \(p\) and \(q\) are true, \(r\) cannot be true. Because the model holds contradictory beliefs, it violates the Belief+ condition.

Conclusion: According to the tb-knowledge definition, Llama-3 does not know that platypuses are mammals. It merely repeats the sentence, but it fails to maintain the logical web of reality that supports that sentence.

However, if we used the p-knowledge (pragmatic) definition, we might be more lenient. The model correctly answers “Yes” to “Are they mammals?” and “Yes” to “Do they lay eggs?” individually. For a user asking specific questions, the model is useful (predictively accurate), even if its internal logic is broken. This highlights why the definition we choose matters so much.
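
If you want to poke at this behavior yourself, a rough approximation of the probe looks like the sketch below. It uses the `transformers` chat pipeline with `meta-llama/Meta-Llama-3-8B-Instruct`; the paper's exact prompts, decoding settings, and belief-extraction method may differ.

```python
# Rough sketch of the consistency probe, not the authors' exact protocol.
# Requires `transformers`, `accelerate`, `torch`, and access to the gated
# meta-llama/Meta-Llama-3-8B-Instruct checkpoint (plus a suitable GPU).
import torch
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

questions = [
    "Are platypuses mammals?",   # p
    "Do mammals lay eggs?",      # q: a "no" here, together with p, rules out r
    "Do platypuses lay eggs?",   # r
]

for q in questions:
    messages = [{"role": "user",
                 "content": q + " Answer yes or no, then explain briefly."}]
    out = chat(messages, max_new_tokens=80, do_sample=False)
    # With chat-style input, generated_text holds the whole conversation;
    # the last message is the assistant's reply.
    print(q, "->", out[0]["generated_text"][-1]["content"])
```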


Why This Matters

This paper serves as a wake-up call. The NLP community has been optimizing for accuracy (getting the right answer), which aligns with weaker definitions of knowledge. But users often ascribe justification and understanding to these models—definitions that the models frequently fail to meet.

If we want to build AI agents that we can trust—agents that act as doctors, lawyers, or scientists—we need them to have more than just high accuracy scores on a benchmark. We need:

  1. Consistency: They shouldn’t hold contradictory beliefs (tb-knowledge).
  2. Justification: They should be able to explain why something is true by citing sources or reasoning steps (j-knowledge).
  3. Virtue: They should arrive at answers through reliable methods, not lucky guesses (v-knowledge).

The survey results show a distinct “preference gap” (Figure 1) between philosophers and computer scientists.

Chart summarizing the preference gap between philosophers and computer scientists regarding knowledge definitions.

Bridging this gap requires computer scientists to adopt more rigorous evaluation protocols, like the logical consistency checks proposed in this paper. It isn’t enough for GPT-4 to tell us the Earth is round. It needs to understand what “round” implies, and it needs to know it for the right reasons. Until then, we should be careful about saying AI “knows” anything at all.