Anthropic launches 2 Mythos-class AI models with safety guardrails

Anthropic released two Mythos-class AI models on June 9, including the first widely available version, after developing safeguards to block responses in cybersecurity and biology — domains the company previously deemed too dangerous for public release.

"Fable 5 shows exceptional performance in software engineering, knowledge work, and vision, with its lead over other models growing as tasks become longer and more complex," Anthropic said in a blog post.

The models — Claude Fable 5 for general availability and Claude Mythos 5 for approved organizations — are built on the same underlying technology. Fable 5 outperforms Claude Opus 4.8 by more than 10% on several benchmarks, according to the company. Pricing is set at $10 per million input tokens and $50 per million output tokens, double the rate of Opus 4.8 but half the cost of the Mythos Preview tier.

The launch comes two months after Anthropic restricted access to its Mythos Preview model, citing risks that a highly capable AI could be misused. The company has since filed confidentially for an initial public offering, and the broader release signals confidence that its safety mechanisms — tested across internal and external red-teaming exercises — can withstand motivated adversaries.

The Self-Improvement Trajectory

Anthropic's decision to widen access follows a June 4 blog post in which researchers Marina Favaro and co-founder Jack Clark warned that AI systems are approaching "recursive self-improvement" — a stage where models can improve themselves with minimal human oversight. The company disclosed internal data showing that Claude-powered agents completed an open-ended AI safety research project in April 2026, with human researchers recovering about 23% of the performance gap over a week while Claude agents recovered 97%.

Claude Mythos Preview, the precursor to the newly released models, achieved a 52-times speedup over baseline code on optimization tasks, where a skilled human researcher would need four to eight hours to achieve a 4-times improvement. Claude now writes about 80% of Anthropic's new production code, and success rates on complex engineering problems rose to 76% in May 2026, the company said.

Task horizons that Claude can handle reliably have roughly doubled every four months, progressing from minutes-long tasks in early 2024 to 12-hour tasks today. Anthropic projects week-long autonomous tasks by 2027.

Safeguards vs. Adversaries

Anthropic said Fable 5 underwent extensive internal and external red-teaming exercises designed to identify common AI vulnerabilities, including jailbreak attempts. The testing did not uncover any known "universal" jailbreak techniques capable of consistently bypassing the model's safeguards, according to the company. In testing, 95% of Fable sessions ran entirely on Fable responses without falling back to Opus 4.8.

Still, the company acknowledged that cybersecurity researchers have historically found ways to circumvent safety mechanisms in earlier AI models. "The uplift from Mythos-level capabilities is valuable to many adversaries — for instance, those who could financially gain from cyberattacks — and we therefore expect them to be motivated to try to circumvent our safety measures," Anthropic said.

Claude Mythos 5, available to organizations already approved through Anthropic's Project Glasswing initiative, offers the same underlying model with safeguards lifted in some areas. The company said it plans to expand access over time through a more systematic trusted-access program.

Competitive Stakes and Investor Impact

The release positions Anthropic to compete more directly with OpenAI and Google in the enterprise AI market, where inference pricing and safety assurances are key differentiators. Fable 5's $50 per million output tokens places it at a premium to many publicly available models, reflecting the company's bet that enterprises will pay more for models with stronger safety guardrails.

Anthropic's confidential IPO filing, reported in recent weeks, adds pressure to demonstrate a clear path to revenue growth. One unnamed enterprise client accrued roughly $500 million in a single month on Claude due to unrestricted usage, according to a previous report from The Dallas Express, underscoring both the demand and the cost risks associated with powerful AI systems.

Ethan Mollick, a professor at the University of Pennsylvania's Wharton School, said that while some critics view Anthropic's safety messaging as publicity, many people inside the company are "true believers," according to the Wall Street Journal. His book about AI, "Co-Existence," is set to be released in the fall.

This article is for informational purposes only and does not constitute investment advice.