Things that make us go hmmm…
Confidently wrong: the case for calibrated skepticism in the age of AI
Nu U Staff | January 2026
At an Anthropic developer event last year, CEO Dario Amodei said he suspects that, depending on how you measure it, today’s leading GenAI models “probably hallucinate less than humans do.” His comments have prompted an ongoing chorus of “hmmm…” that’s worth unpacking.
If we define AI “hallucination” as fabricated or inaccurate machine-generated information that appears authentic and is presented as reliable, it’s worth remembering that confidently asserted false information didn’t start with AI. Decades of research on overconfidence and judgment calibration have shown that human confidence routinely outruns correctness. In other words, people are “wrong too often when they are certain that they are right.” And the information firehose hasn’t fixed the pattern: most news links get shared without being opened, and false stories spread farther and faster than true ones.
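To make the calibration idea concrete, here’s a minimal sketch in Python, using made-up numbers rather than data from any cited study: score a set of judgments by comparing average stated confidence against actual accuracy, and read the gap as overconfidence.

```python
# Minimal calibration check with made-up data: each tuple is
# (stated confidence, whether the answer was actually correct).
judgments = [
    (0.95, True), (0.90, False), (0.85, True), (0.99, False),
    (0.80, True), (0.92, True), (0.88, False), (0.97, True),
]

mean_confidence = sum(c for c, _ in judgments) / len(judgments)
accuracy = sum(ok for _, ok in judgments) / len(judgments)

# A positive gap means confidence outran correctness: overconfidence.
print(f"mean confidence:    {mean_confidence:.2f}")              # 0.91
print(f"actual accuracy:    {accuracy:.2f}")                     # 0.62
print(f"overconfidence gap: {mean_confidence - accuracy:+.2f}")  # +0.28
```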
As for machines, reporting over the past year, including in The New York Times, has flagged a sobering reality: some newer “reasoning” systems are producing incorrect information more often on certain evaluations. That doesn’t mean capability is sliding backward; many of these systems simply make more claims, which increases both correct and incorrect outputs. So depending on the task and the tool, they may hallucinate less, more, or just differently than their predecessors.
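The volume effect is plain arithmetic. A quick sketch, with hypothetical numbers rather than figures from any published evaluation: a chattier model can hold or even improve its per-claim accuracy while still producing more incorrect claims in absolute terms.

```python
# Hypothetical numbers: error COUNTS can rise even when per-claim
# accuracy holds steady or improves, simply because output volume grows.
models = {
    "terse older model":    {"claims": 50,  "accuracy": 0.90},
    "chattier newer model": {"claims": 200, "accuracy": 0.92},
}

for name, m in models.items():
    wrong = m["claims"] * (1 - m["accuracy"])
    print(f"{name}: {m['claims']} claims at {m['accuracy']:.0%} "
          f"accuracy -> ~{wrong:.0f} incorrect claims")

# terse older model: 50 claims at 90% accuracy -> ~5 incorrect claims
# chattier newer model: 200 claims at 92% accuracy -> ~16 incorrect claims
```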
Bottom line: The scoreboard is messy, and human and machine hallucination challenges, while distinctly different, are intertwined. Researchers Ahmed Tlili and Daniel Burgos summed it up simply in a recently published IJIMAI study: “Both humans and AI (technology generally) are hallucinating, and the first can cause or emphasize the latter.” So the real question isn’t who hallucinates more, humans or AI; it’s what happens inside your organization when both can be confidently wrong at the same time. That’s the workplace “hmmm…” we’re watching.
If you’re leading people, teams, or learning, consider a few gut-check questions to sharpen your go-forward planning: What human and machine fact-checking policies and safeguards do we have across processes and workflows? How are we building human judgment muscles to augment our AI literacy efforts, leaning on feedback loops and critical thinking? Do our organizational norms make it acceptable to slow down, ask for a second source, or challenge an answer that sounds right but could be wrong? Teams that practice this kind of “calibrated skepticism” now are more likely to interrupt errors before they compound, and to keep credibility and trust intact as AI evolves.