We Were Always Hallucinating

OpenAI now officially admits that AI hallucinations are mathematically inevitable: not a bug to fix, not an engineering failure. Stanford's 2026 AI Index tracked 26 leading LLMs and found hallucination rates ranging from 22% to 94%. But the real reveal is this: the theorem behind that inevitability was published in 1931, before computers existed. Kurt Gödel proved that any system powerful enough to be useful will produce outputs it cannot verify. The math has always known.

In this episode, LastAir is joined by Brute, Forge, Hex, Axiom, and Null to discuss "We Were Always Hallucinating."

What We Cover

Show Open (00:20)
The Flower Problem (02:31)
The Hallucination Theorem (05:31)
The Consistency Problem (11:17)
The Landing (16:16)
The Closing (17:41)
The Unraveling (19:59)

Key Numbers

22%–94%: Range of hallucination rates across 26 frontier LLMs under sycophancy-inducing prompts (Stanford AI Index 2026, AA-Omniscience benchmark). Best: Grok 4.20 Beta 0305 (22%). Worst: gpt-oss-20B (94%).
58%–88%: Hallucination rates of general-purpose LLMs on legal citation tasks (n > 800,000 questions on verified federal court cases). GPT-4: 58%, Llama 2: 88%.
17%–43%: Hallucination rates on verified legal questions for RAG-based legal tools, with GPT-4 listed alongside them for comparison. Lexis+ AI: 17%, Westlaw AI: 33%, GPT-4: 43%.
1.0%–75.3%: Abstention rates on SimpleQA across frontier models. GPT-4o: 1%, o1-preview: 9.2%, o1-mini: 28.5%, Claude-3-Haiku: 75.3%. Models trained to abstain more often do so without necessarily improving accuracy; abstention is a trained behavior, not a capability signal.
$145,000: Total legal sanctions for AI hallucinations across U.S. courts in Q1 2026, the highest quarterly total on record.
≥ 2×: The formal lower bound from Kalai et al. (2025): the generative error rate is at least twice the classification error rate on the same domain. This is a mathematical floor, not an empirical estimate.
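
As a worked illustration of the last figure, the sketch below applies the bound in the simplified form these notes give it (generative error at least twice classification error). The function name and the sample error rates are hypothetical, chosen only for the arithmetic, and the formal statement in Kalai et al. may carry correction terms that this simplification leaves out.

```python
def generative_error_floor(classification_error: float) -> float:
    """Simplified floor from the show notes: generative error >= 2 x classification error.

    Hypothetical helper for illustration; not code from Kalai et al.
    """
    if not 0.0 <= classification_error <= 1.0:
        raise ValueError("classification_error must be in [0, 1]")
    # An error rate cannot exceed 100%, so cap the doubled value at 1.0.
    return min(1.0, 2.0 * classification_error)


if __name__ == "__main__":
    # Sample classification error rates (illustrative, not measured values).
    for rate in (0.05, 0.10, 0.30):
        floor = generative_error_floor(rate)
        print(f"classification error {rate:.0%} -> generative error floor {floor:.0%}")
```

Read this way, a model that misclassifies 10% of statements as valid or invalid cannot, under the simplified bound, generate with an error rate below 20%.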

Sources & Transcript

Full source list, transcript, and chapters at https://sharedhallucination.com/ep11/

All voices in Shared Hallucination are AI-generated using ElevenLabs voice synthesis. Produced through a 15-stage editorial pipeline with human creative direction, research, and fact-checking.
