Inference Costs Quadruple to $8.4B, Derailing Profit Goals
The profitability models of leading artificial intelligence companies are under severe strain as the cost of running their services escalates. Both OpenAI and Anthropic missed their internal gross margin targets because of higher-than-expected inference costs, the expense of using cloud servers to generate responses for users. Last year, OpenAI's gross margin fell to 33% from 40%, well below its 46% forecast. Anthropic faces a similar shortfall: its projected 2025 gross margin of 40% sits 10 percentage points below its earlier goal.
The cost overruns are stark. OpenAI's inference costs surged fourfold last year to $8.4 billion, exceeding its $6.6 billion projection. The company attributed the increase to higher-than-anticipated service demand, forcing it to purchase more expensive on-demand server capacity. Similarly, Anthropic's inference costs are projected to more than triple to $2.7 billion in 2025. This cost inflation is particularly notable as it occurs while overall cloud computing prices are declining and both firms claim to be improving model efficiency.
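For readers who want to check the arithmetic behind the OpenAI figures, a minimal sketch follows. It uses only the numbers reported above and assumes that "fourfold" means exactly 4x, which the reporting does not confirm; the variable names are illustrative.

```python
# Figures as reported in the article, in billions of US dollars.
actual_cost = 8.4       # OpenAI inference costs last year
projected_cost = 6.6    # OpenAI's own projection for the same period
growth_multiple = 4.0   # ASSUMPTION: "fourfold" treated as exactly 4x

# Implied prior-year cost if the increase was exactly fourfold.
implied_prior_cost = actual_cost / growth_multiple

# Overrun relative to the projection, in dollars and as a percentage.
overrun = actual_cost - projected_cost
overrun_pct = overrun / projected_cost * 100

print(f"Implied prior-year cost: ${implied_prior_cost:.1f}B")
print(f"Overrun vs. projection: ${overrun:.1f}B ({overrun_pct:.0f}%)")
```

Under that assumption, the prior-year cost works out to roughly $2.1 billion, and the overrun to about $1.8 billion, or 27% above the projection.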
Free Users and Sora Video Tool Strain Finances
OpenAI's financial pressure is heavily compounded by its massive base of non-paying users. Of its approximately 910 million weekly active users, only about 5% are paying customers. Last year, these free users accounted for $3.9 billion in inference costs, nearly half of the company's total. This dynamic forces paying subscribers and enterprise clients to subsidize the vast majority of the platform's usage.
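The scale of the subsidy can be sketched with a short back-of-envelope calculation. It uses only the figures reported above; the per-user cost is a rough derived estimate, not a reported number, because it divides an annual cost by a weekly user snapshot.

```python
# Figures as reported in the article.
weekly_active_users = 910e6   # approximate weekly active users
paying_share = 0.05           # about 5% are paying customers
free_user_cost = 3.9e9        # inference cost attributed to free users, USD
total_cost = 8.4e9            # total inference cost last year, USD

paying_users = weekly_active_users * paying_share
free_users = weekly_active_users - paying_users

# Share of total inference cost spent on non-paying users.
free_cost_share = free_user_cost / total_cost

# ROUGH ESTIMATE: annual cost spread over the weekly free-user count.
cost_per_free_user = free_user_cost / free_users

print(f"Paying users: {paying_users / 1e6:.1f}M")
print(f"Free users: {free_users / 1e6:.1f}M")
print(f"Free-user share of inference cost: {free_cost_share:.1%}")
print(f"Approx. annual cost per free user: ${cost_per_free_user:.2f}")
```

On these figures, roughly 45.5 million paying users fund a service on which about 46% of inference spending goes to some 864.5 million free users, or on the order of $4.50 per free user per year.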
The product mix is also a major cost driver. Computationally intensive tools like OpenAI's video-generation model, Sora, consume far more server resources than simple text queries. The company also absorbed significant costs by offering unrestricted access to powerful features, such as the popular GPT-4o model's ability to create stylized images, which temporarily drove a massive spike in computing resource consumption.
Paid User Profitability Improves to 70%
Despite the overall margin compression, OpenAI has demonstrated significant efficiency gains with its paying customer segment. The company's margin on revenue from paid users, after deducting model running costs, improved to approximately 70% in October of last year. This marks a substantial increase from just 52% at the end of the prior year and 35% in January 2024, suggesting the core business model is viable if monetization can be expanded.
To address the imbalance, OpenAI is pursuing new revenue streams, including advertising and expanded subscription tiers. In January, the company launched an ad-supported ChatGPT subscription priced at around $5 to $8 per month. Looking ahead, OpenAI projects that it will reach a 67% gross margin by 2030, by which point it expects 94% of its inference costs, projected at $850 billion, to go toward serving paying customers. Reaching that long-term target while containing runaway near-term expenses remains the central challenge for the AI leader.