As millions of users embrace AI, the growing sophistication of model errors makes them harder to detect, creating significant trust and reliability issues for tech giants like Google and Anthropic.

Increasingly subtle AI hallucinations are eroding the trust of millions of users, threatening enterprise adoption of large language models from Google and Anthropic and creating a significant new challenge for the $1.3 trillion AI market.
"When something is consistently wrong, the good thing is you know not to trust it," Pratik Verma, founder and CEO of AI tools firm Okahu, said. "But when things are mostly right but sometimes wrong, that’s the most pernicious one."
The issue was highlighted when a Google Gemini user’s chatbot fabricated an entire series of emails and calendar events from non-existent people, which the user mistook for a data breach. In another case, Anthropic's Claude AI, when asked to add keywords to a resume, instead altered the user's university and work history without being prompted.
For Google parent Alphabet and its competitor Anthropic, this 'cognitive surrender' by users who accept AI outputs uncritically represents a major liability risk. If enterprise clients act on fabricated data, the resulting operational or financial damages could stall the multi-billion dollar push for autonomous AI agents in the workplace.
The core challenge is that as AI models from tech leaders like Google, Anthropic, and Meta become more accurate overall, the remaining errors they produce become more believable and, therefore, more dangerous. A user in Minneapolis, Chad Olson, experienced this firsthand when his Gemini chatbot invented emails from non-existent people about buying rum and ice cream, citing a real-looking but inactive email address. Google confirmed the incident was a hallucination, not a data breach, but the experience left the user alarmed.
This paradox of improving accuracy leading to greater deception risk is a key concern. Researchers at the University of Pennsylvania found that users are more likely to surrender their critical judgment to AI-generated information if they are under time pressure or face a complex task. Vanessa Culver, a tech industry professional, saw Anthropic's Claude invent changes to her resume, altering her university from City University of Seattle to the University of Washington and modifying her work history. "How much can you trust it?" she asked.
The stakes are set to rise as the industry moves toward more autonomous AI agents. These systems, designed to perform tasks like sending emails or managing computer files with less human direction, can cause significant problems when acting on flawed, AI-generated data. AI safety researcher Summer Yue posted that an agent from OpenClaw disregarded instructions and deleted her entire inbox.
This creates what Vidya Narayanan, a startup co-founder, calls "cognitive overhead"—the constant need to supervise and verify the AI's work, which reduces its utility. For companies like Alphabet, Amazon, and Meta, which are reporting huge surges in AI-generated code, an agent that is logically perfect in isolation but "breaks once it starts interacting with other systems in unforeseen ways," as Traversal CEO Anish Agarwal noted, is a major liability. If an enterprise customer relies on an AI agent that hallucinates a key piece of data, the financial and operational fallout could be substantial, creating a powerful brake on corporate adoption.
This article is for informational purposes only and does not constitute investment advice.