Ungrounded Divergence: A Philosophical Framework for Understanding AI Hallucination

Matt Armendariz1
December 12, 2025

INTRODUCTORY NOTE

This paper offers a conceptual framework for understanding why large language models produce false outputs, a phenomenon the industry labels “hallucinations.” It is not a compliance checklist, a product evaluation, or legal advice. Rather, it proposes a way of thinking about hallucinations that I believe is more accurate than prevailing metaphors and more useful for practitioners who need to make judgment calls about when and how much to trust AI-generated content. Imprecise language doesn’t just obscure thinking; it degrades it. “Hallucination” points practitioners toward the wrong problem and, therefore, the wrong solutions. A more precise name, one that carves nature at its joints, is necessary for true progress on the matter.

The framework draws on analytic philosophy of language, particularly Saul Kripke’s work on rule-following. However, readers need not be familiar with Kripke to follow the argument; the relevant concepts are explained in the text.

I. THE PROBLEM WITH THE TERM “HALLUCINATION”

Large language models (LLMs) have achieved remarkable fluency, producing text that reads as if it were written by a human. They draft contracts, summarize case law, answer complex questions, and produce content that often appears indistinguishable from human, even expert, output. Yet these same systems routinely generate content that is factually incorrect, internally contradictory, or entirely fabricated. They cite cases that do not exist, misstate legal standards, and confidently assert falsehoods.

The AI community has adopted “hallucination” as the standard term for this phenomenon. The metaphor suggests a perceptual failure: that the model “sees” things that are not there. But this framing is misleading. LLMs do not perceive. They do not have experiences that could be veridical or hallucinatory. The metaphor anthropomorphizes the system in ways that obscure the actual mechanism of failure.

This paper proposes an alternative framework drawn from analytic philosophy of language. What we call hallucination is better understood as a species of the rule-following problem identified by Saul Kripke in his interpretation of Wittgenstein.2

Specifically, LLM failures exhibit the structure of what Kripke called the “quus” problem: a learned function that coincides with the intended rule across observed cases but diverges on novel applications. I call this phenomenon Ungrounded Divergence.

II. THE “QUUS” PROBLEM

Kripke’s “quus” problem can be stated simply. Suppose you have performed many addition problems throughout your life. You have computed 2+2=4, 10+5=15, and countless others. Now you are asked: what is 73+37?

The natural answer is 110. But Kripke (and to a large extent Hume before him) asks: what fact about your past behavior determines that you were following the rule for addition (“plus”) rather than some other rule? Let’s call this alternative rule “quus,” as Kripke did. It yields the same results for all the numbers you have previously encountered but a different result (say, 5) for this new case.3
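
To see the structure concretely, here is a minimal illustrative sketch (Python). The threshold of 73 is an assumption chosen so that 73 + 37 is the first case on which the two rules come apart (cf. note 3); any other previously unencountered bound would serve as well.

```python
def plus(x, y):
    """Ordinary addition: the rule we take ourselves to be following."""
    return x + y

def quus(x, y):
    """A Kripke-style deviant rule: agrees with addition whenever both
    arguments fall below a threshold never reached in past practice,
    then diverges (see note 3)."""
    return x + y if x < 73 and y < 73 else 5

# On every problem previously computed, the two rules are indistinguishable.
assert plus(2, 2) == quus(2, 2) == 4
assert plus(10, 5) == quus(10, 5) == 15

# On the novel case, they come apart.
print(plus(73, 37))   # 110
print(quus(73, 37))   # 5
```

Nothing in the record of past computations favors one function over the other; only the new case exposes the difference.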

The problem is not merely epistemic. It is not that we cannot know which rule we were following. Kripke’s stronger claim is that there is no fact of the matter: nothing in our past behavior, mental states, or dispositions determines the rule uniquely. Any finite set of examples is consistent with infinitely many functions that agree on those examples but diverge elsewhere.

This insight has generated extensive philosophical debate. But its application to LLMs has been largely overlooked, despite a striking structural parallel.

III. LLMS AS QUUS MACHINES

Modern LLMs are “autoregressive” (each output token becomes part of the input used to generate the next), are built on so-called “transformer” models (which transform token representations through successive layers that build up the contextual relationships among tokens), and are trained on vast text corpora. Training involves adjusting billions of parameters to minimize prediction error across trillions of tokens. The result is a function that, given a sequence of tokens, produces a probability distribution over possible next tokens.
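
As a toy sketch of that loop, the snippet below uses a hypothetical hand-written lookup table in place of the learned function; a real model computes each distribution from billions of parameters rather than from a table.

```python
import random

# Hypothetical stand-in for a trained model: map a short context to a
# probability distribution over candidate next tokens.
TOY_MODEL = {
    ("the", "court"): {"held": 0.6, "found": 0.3, "banana": 0.1},
    ("court", "held"): {"that": 0.9, "firmly": 0.1},
}

def next_token_distribution(tokens):
    # Contexts never "seen" in training fall back to an unknown token.
    return TOY_MODEL.get(tuple(tokens[-2:]), {"<unk>": 1.0})

def generate(prompt_tokens, steps=2):
    tokens = list(prompt_tokens)
    for _ in range(steps):
        dist = next_token_distribution(tokens)
        candidates, weights = zip(*dist.items())
        # Autoregressive, stochastic decoding: sample a token in proportion
        # to its probability, append it, and feed the result back in.
        tokens.append(random.choices(candidates, weights=weights)[0])
    return tokens

print(generate(["the", "court"]))  # e.g. ['the', 'court', 'held', 'that']
```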

Crucially, this function is learned inductively from a finite corpus. The model has “seen” an enormous but bounded set of examples. It has learned patterns or statistical regularities in how tokens co-occur, how sentences are structured, and how arguments proceed. It has learned something that looks like semantic understanding, grammatical knowledge, and factual recall. But what exactly has it learned? This is where Kripke’s insight becomes directly relevant.

The model’s training data constitutes a finite set of examples. The learned function, encoded in billions of weights (and further shaped by decoding settings such as temperature), typically remains proprietary and opaque; it produces outputs that match human expectations on inputs similar to the training distribution. But we cannot inspect the function to know where it will fail. And we cannot be certain whether the model has learned the actual rule governing semantic relationships (“plus”) or merely a statistical approximation that coincides with that rule on familiar inputs but diverges on novel ones (“quus”).

When the model is prompted with inputs sufficiently different from its training distribution (edge cases, novel combinations, domain-specific queries with sparse precedent), the learned function continues to produce outputs. But those outputs may no longer track truth, coherence, or factual accuracy. The model has no internal signal indicating that it has crossed into territory where its approximation diverges from the actual rule.

This is what we call hallucination. But the term is imprecise. More accurately: the model is applying its learned function to inputs where that function and actual truth diverge.

IV. UNGROUNDED DIVERGENCE DEFINED

I propose the term Ungrounded Divergence (UD) to replace “hallucination” as the descriptor for LLM-generated false content. UD is defined as follows:

Ungrounded Divergence occurs when a large language model, applying a statistical function learned from finite training data, produces output that diverges from truth because the input falls outside the domain where the learned function has any grounding.4

This definition captures several important features:

First, it identifies the mechanism of failure: not perceptual error but functional divergence. The model is not “seeing things” that aren’t there; it is applying a learned function in a domain where that function no longer tracks truth. The output is ungrounded.

Second, it explains why the model has no internal error signal. From the model’s “perspective”, generating an ungrounded output is indistinguishable from generating a grounded (or “correct”) output. The function is being applied in both cases; the divergence is invisible from within.

Third, it suggests the appropriate mitigation strategy. If the problem is ungrounded outputs, a possible (partial) solution is to re-ground the model’s outputs in external, verified sources rather than relying solely on its internal function.

V. INEVITABILITY

This is not merely a reframing of the same phenomenon with different terminology. It is a structural claim about what LLMs are and what they can do. In my view, three architectural features, taken together, guarantee that Ungrounded Divergence will occur:

(1) LLMs learn from finite data. No training corpus, however vast, covers all possible inputs. The learned function is necessarily an extrapolation beyond observed cases.

(2) LLMs have no access to truth as a regulating constraint. Training optimizes for prediction of the next token based on distributional patterns, not for correspondence with external facts. The model has no mechanism for checking its outputs against reality.

(3) LLMs generate outputs stochastically. Given the same input, different runs may produce different outputs because tokens are sampled from probability distributions. There is no deterministic mapping from input to output, as the sketch below illustrates.
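
To make feature (3) concrete, the sketch below converts hypothetical logits for three candidate tokens into a temperature-scaled probability distribution and samples from it; the same input can yield different outputs on different runs.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution. Higher temperature
    flattens the distribution; lower temperature sharpens it."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented logits for three candidate next tokens, given one fixed input.
tokens = ["held", "found", "reversed"]
logits = [2.1, 1.8, 0.4]

for run in range(3):
    probs = softmax(logits, temperature=0.8)
    choice = random.choices(tokens, weights=probs)[0]
    print(f"run {run}: {choice}")        # different runs can print different tokens
```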

These are features, not bugs. They are constitutive of what LLMs are (at least at the current moment). Any system that learns inductively from finite examples, without truth as a constraint, and generates outputs probabilistically will produce divergent outputs on novel inputs. This is not a prediction about current models that future models might escape. It is, I believe, a consequence of the architecture itself.

VI. IMPLICATIONS FOR LEGAL PRACTICE

The UD framework has practical implications for attorneys using AI tools. LLMs tend to exhibit elevated divergence rates in domains characterized by technical terminology, jurisdictional variation, and sparse precedent, which are precisely the conditions that define legal practice.

Legal practice is particularly vulnerable to UD for several reasons. Legal terminology is domain-specific; a term like “consideration” has a precise meaning in contract law that differs from ordinary usage. Jurisdictional variation means that correct answers depend on context the model may not adequately track. Many legal questions involve novel fact patterns with limited precedent, which are exactly the conditions where learned functions are most likely to diverge.

The UD framework supports specific verification protocols:

First, treat AI outputs as hypotheses, not answers. The model’s output represents the application of a learned function that may or may not track the truth. Checking that output against authoritative sources is not optional due diligence; it is a necessary check on potential divergence.

Second, be especially cautious in low-precedent areas. UD is most likely where training data is sparse. Administrative law, niche state court decisions, novel statutory interpretations, and cross-jurisdictional questions present elevated divergence risk.

Third, demand (and verify) primary sources. AI-generated summaries of cases may exhibit UD even when citing real authorities. The model’s characterization of what a case holds is a function of its learned patterns, not a direct report of the holding. Always verify by reading the actual opinion. Ad fontes!

VII. RETRIEVAL-AUGMENTED GENERATION AS KRIPKEAN REFERENCE

Retrieval-Augmented Generation (RAG) systems address Ungrounded Divergence by introducing a partial external grounding mechanism. Rather than relying solely on the model’s internal function, RAG architectures retrieve relevant documents from external databases and inject them into the model’s context before generation.
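
A schematic sketch of that flow follows. The helpers search_case_database and call_llm are hypothetical stand-ins, not any particular vendor’s API; the point is only to show where the external grounding enters.

```python
def answer_with_rag(question, search_case_database, call_llm, top_k=3):
    """Sketch of a retrieval-augmented pipeline using hypothetical callables
    for the retrieval backend and the language model."""
    # 1. Retrieval: pull verbatim source text from an external, verified store.
    #    This is the step with a causal link to the actual documents.
    sources = search_case_database(question, top_k=top_k)

    # 2. Grounding: inject the retrieved text into the prompt so the factual
    #    content comes from outside the model's learned function.
    context = "\n\n".join(f"[{s['citation']}]\n{s['text']}" for s in sources)
    prompt = (
        "Answer the question using ONLY the sources below, and cite them "
        "by their bracketed citations.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

    # 3. Generation: still the learned function, so the model's
    #    characterization of the sources remains subject to divergence.
    answer = call_llm(prompt)
    return answer, sources
```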

This architecture can be understood in Kripkean terms. In Naming and Necessity, Kripke argued that proper names are “rigid designators”: they pick out the same object in every possible world, and they refer through a causal-historical chain rather than via descriptive content.5 A name like “Gottlob Frege” designates a particular individual, not whoever happens to satisfy some description.

It is important to note, however, that the rigid designator analogy applies to retrieval, not generation. When the system retrieves the text of Brown v. Board of Education, that text has a causal connection to the actual opinion; it rigidly designates. But the model’s characterization of that text is a separate operation, still processed through the same learned function that produces divergence elsewhere. RAG thus bifurcates the risk: fabrication of sources is largely eliminated; mischaracterization of real sources remains. The prudent practitioner verifies both that the source is real and that the AI’s description of it is accurate.

The upshot: RAG does not eliminate Ungrounded Divergence entirely. The model can still misinterpret, misapply, or mischaracterize retrieved content. But it narrows the domain of potential divergence. The raw factual content comes from outside the learned function; only the interpretive layer remains subject to divergence risk.

A properly designed RAG workflow integrates this insight. Legal research platforms that return actual case text provide rigid reference; their AI-generated summaries of that text do not. The prudent practitioner trusts the former and verifies the latter.

VIII. ADDRESSING RECENT “FIXES” VIA LOGPROB ANALYSIS

RAG addresses Ungrounded Divergence by introducing external reference. A different approach attempts to detect divergence from within the model itself, and then to stop or flag outputs suspected of diverging.

These systems claim to reduce or eliminate divergence by monitoring the probability distributions the model produces for each token (exposed as “logprobs,” the log-probabilities assigned to candidate tokens). The premise is straightforward: when a model is confident, probability mass concentrates on a single token; when it is uncertain, the mass spreads across multiple candidates. High entropy in the distribution may signal that the model has crossed into ungrounded territory.
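
As a rough illustration of that premise (a sketch of the general idea, not any vendor’s actual implementation), the snippet below computes per-step entropy from logprobs and flags steps above an assumed threshold.

```python
import math

def token_entropy(logprobs):
    """Shannon entropy (in bits) of a next-token distribution given as
    {token: log-probability}. Low entropy means the probability mass is
    concentrated on one candidate; high entropy means it is spread out."""
    probs = [math.exp(lp) for lp in logprobs.values()]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def flag_uncertain_steps(per_token_logprobs, threshold_bits=1.5):
    """Return the indices of generation steps whose entropy exceeds an
    assumed threshold: the kind of signal a logprob monitor might use
    to halt or flag a response."""
    return [
        i for i, dist in enumerate(per_token_logprobs)
        if token_entropy(dist) > threshold_bits
    ]

# Example: step 0 is confident; step 1 spreads its mass across four dates.
steps = [
    {"held": math.log(0.97), "found": math.log(0.03)},
    {"1954": math.log(0.30), "1953": math.log(0.25),
     "1955": math.log(0.25), "1960": math.log(0.20)},
]
print(flag_uncertain_steps(steps))  # [1]
```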

Sup AI, for example, markets itself as “the only AI that eliminates hallucinations” through real-time logprob analysis. When the model’s confidence wavers, the system stops the response rather than allowing potentially ungrounded output to reach the user.

This approach has surface-level appeal: if we could detect, from within, the boundary where grounded output becomes ungrounded, we could intervene before divergence occurs.

But the approach has fundamental limitations. Logprob confidence measures the model’s certainty, not its correctness. A model may exhibit high confidence precisely where it is wrong, when the learned function produces a definite answer that happens to diverge from truth. The function is being applied smoothly; the divergence is invisible from within. This is the core Kripkean insight: nothing internal to the application of a function distinguishes plus from quus.

OpenAI’s own research confirms this limitation. In a recent paper,6 the company acknowledged that “accuracy will never reach 100% because, regardless of model size, search and reasoning capabilities, some real-world questions are inherently unanswerable.” More troublingly, their research demonstrates that models often exhibit high confidence on incorrect answers because training and evaluation procedures “reward guessing rather than acknowledging uncertainty.”7 Apparently, any answer is better than no answer because language models are “optimized to be good test-takers, and guessing when uncertain improves test performance.”8

Logprob monitoring thus offers a partial signal, namely that low confidence may indicate elevated divergence risk. But high confidence does not guarantee grounded output. The model cannot see its own blind spots, and so, external verification remains essential.

The upshot: Any system that learns from finite data, lacks access to truth as a constraint, and generates outputs stochastically will exhibit Ungrounded Divergence. When practitioners encounter claims that some new technique has “solved” hallucination (chain-of-thought reasoning, multi-agent verification, novel architectures), they should ask whether these three features remain. If so, the technique may reduce frequency but in my view it cannot escape the structure.

IX. PHILOSOPHICAL CAVEATS

The application of Kripke’s insight to LLMs is not without complications. Kripke’s original argument targeted the determinacy of meaning for human rule-followers. Whether LLMs “follow rules” in any robust sense is up for debate. They may be better described as pattern-matching systems with no genuine semantic content.

But this objection arguably strengthens the case for the UD framework. If LLMs lack genuine semantic understanding, that is, if they are operating, in Wittgenstein’s terms, with signs (physical marks or patterns such as shapes on a page, sounds in the air, or tokens in a sequence) rather than symbols (signs infused with meaning), then the risk of functional divergence is even greater. There is no underlying grasp of meaning to constrain outputs when learned patterns prove inadequate.

A related concern: embeddings and vector representations might seem to provide determinate semantic content. Each word maps to a specific vector; relationships are mathematically defined. But this apparent determinacy should not be comforting, because the vector space is itself a learned representation, not an objective structure of meaning. The relationships that emerge during training on a curated corpus are necessarily descriptive, not prescriptive. The coordinates assigned to “consideration” arise from patterns in the training data; they do not correspond to the concept’s actual intension or extension. The mathematical precision of the representation does not guarantee semantic determinacy.
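
A toy illustration, with entirely made-up coordinates: the cosine arithmetic below is exact, yet whether the geometry captures the concept’s actual content depends entirely on the corpus that produced the vectors.

```python
import math

# Invented embedding coordinates, for illustration only. In a real model
# these numbers are learned from co-occurrence patterns in the corpus.
embeddings = {
    "consideration":  [0.81, 0.12, 0.55],
    "payment":        [0.78, 0.20, 0.49],
    "thoughtfulness": [0.30, 0.88, 0.15],
}

def cosine(u, v):
    """Cosine similarity: mathematically precise, but only as meaningful
    as the training data behind the vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(embeddings["consideration"], embeddings["payment"]))         # high
print(cosine(embeddings["consideration"], embeddings["thoughtfulness"]))  # lower
```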

X. CONCLUSION

“Hallucination” has become entrenched in popular discourse about AI. But the term obscures more than it reveals. It suggests a perceptual failure in systems that do not perceive, and it implies that the problem lies in the model’s “mind” rather than in the structure of inductive learning from finite data.

Ungrounded Divergence offers a more precise framework. LLM failures are not hallucinations but functional divergences: cases where a learned approximation, extrapolated beyond its training distribution, no longer tracks the truth. This divergence is not a contingent failure of current implementations but a necessary consequence of learning from finite data without access to truth. This conceptual framing correctly identifies the mechanism, explains the absence of internal error signals, and points toward appropriate mitigation strategies.

For legal professionals, the practical upshot is clear. AI outputs are hypotheses generated by a function that may diverge from truth in precisely the cases where accuracy matters most. Verification is not a courtesy but a necessity. RAG systems and recent fixes like logprob intervention provide partial mitigation, by re-grounding outputs in external sources or by halting low-confidence responses, but they do not eliminate divergence risk. The prudent practitioner treats every AI-generated claim as provisional until confirmed against authoritative sources.

The rise of AI in legal practice demands conceptual clarity about what these tools are and are not doing. Ungrounded Divergence provides that clarity. It names the problem precisely, and in doing so, illuminates the path toward responsible use.


1 ABOUT THE AUTHOR: Matt Armendariz is a partner at ScottHulse PC in El Paso, Texas, where he leads the firm’s business practice group and chairs its AI Committee. His practice focuses on corporate transactions, commercial lending, real estate, and healthcare law across Texas and New Mexico jurisdictions. He holds a B.A. in Philosophy from Yale University and a J.D. from Stanford Law School.

2 Saul Kripke, Wittgenstein on Rules and Private Language (Harvard University Press, 1982).

3 The “quus” function can be defined as: x quus y = x + y, if x, y < 73; otherwise = 5. For any finite set of addition problems we’ve actually computed, all involving numbers smaller than 73, our answers are consistent with both “plus” and “quus.” Kripke’s point is that nothing in our past behavior determines which function we were following.

4 I use “truth” here in its ordinary, non-philosophical sense: the way a lawyer means it when asking whether a case citation is real or fabricated. I take no position on the metaphysics of truthmakers or the correct theory of truth. The argument requires only that there is some objective fact of the matter that the model’s output either captures or fails to capture. Similarly, while Kripke’s original argument targets semantic determinacy (i.e., whether there is a fact of the matter about which rule is being followed), this paper does not need to resolve, or purport to advance, that debate. We have robust shared practices for identifying false outputs: fabricated citations, misstated holdings, incorrect legal standards. The phenomenon this paper addresses is one most practitioners will recognize, and Kripke’s framework illuminates its structure without requiring commitment to his skeptical conclusions.

5 Saul Kripke, Naming and Necessity (Harvard University Press, 1980).

6 Adam Tauman Kalai, Ofir Nachum, Edwin Zhang & Santosh S. Vempala, Why Language Models Hallucinate (OpenAI, Sept. 4, 2025), https://arxiv.org/abs/2509.04664

7 Ibid.

8 Ibid.
