DeepSeek's new V4 model series cuts inference costs and supports a one-million-token context, a combination that could accelerate the enterprise shift from simple chatbots to complex, autonomous AI agents.
Chinese AI firm DeepSeek has released its V4 model series, challenging US rivals with a system that supports a one-million-token context window at what it claims are drastically reduced costs. The launch intensifies the AI rivalry between China and the United States, coming just after the White House accused Chinese entities of efforts to steal American AI technology.
"This addresses the long-standing issues of slower performance and higher costs associated with long context lengths, marking a genuine inflection point for the industry," Zhang Yi, founder of tech research firm iiMedia, told AFP.
The new series includes two versions: the 1.6 trillion-parameter V4-Pro for complex tasks and the more economical 284 billion-parameter V4-Flash. DeepSeek claims the V4-Pro's "world knowledge" capabilities trail only Google's latest Gemini model. The system is also optimized to run on chips from Chinese tech giant Huawei, whose Ascend SuperPoD systems support the V4 series.
The efficiency gains are expected to accelerate downstream demand for Agentic AI, according to a report from CICC. The investment bank stated it is bullish on model developers Zhipu (02513.HK) and MiniMax (00100.HK), believing they are positioned to benefit from the technology's advance and the expanding market for complex, long-horizon AI tasks.
V4 Architecture Aims to Solve the Long-Context Cost Problem
The core innovation in the V4 series is a hybrid attention mechanism designed to reduce the computational and memory costs typically associated with large context windows. By cutting per-token inference FLOPs and KV cache usage, DeepSeek aims to make million-token-scale models commercially viable for mainstream applications.
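To illustrate why KV cache usage dominates long-context costs, the sketch below compares cache memory for full attention against a hybrid scheme in which only some layers attend over the full context while the rest use a bounded sliding window. The layer counts, head dimensions, and window size are illustrative assumptions for the sake of the arithmetic, not DeepSeek's published V4 architecture.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   dtype_bytes=2, window=None):
    """Approximate KV cache size for one sequence.

    Both K and V are cached per layer (hence the factor of 2).
    A sliding-window layer only keeps the last `window` tokens.
    """
    effective_len = seq_len if window is None else min(seq_len, window)
    return 2 * n_layers * n_kv_heads * head_dim * effective_len * dtype_bytes


def hybrid_kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                          global_every=4, window=4096, dtype_bytes=2):
    """Hybrid layout: every `global_every`-th layer sees the full
    context; all other layers use a fixed sliding window."""
    total = 0
    for layer in range(n_layers):
        w = None if layer % global_every == 0 else window
        total += kv_cache_bytes(seq_len, 1, n_kv_heads, head_dim,
                                dtype_bytes, window=w)
    return total


# Hypothetical config: 60 layers, 8 KV heads, head dim 128, fp16 cache.
SEQ = 1_000_000
full = kv_cache_bytes(SEQ, 60, 8, 128)
hybrid = hybrid_kv_cache_bytes(SEQ, 60, 8, 128)
print(f"full attention : {full / 1e9:.1f} GB")
print(f"hybrid attention: {hybrid / 1e9:.1f} GB")
```

Under these assumed numbers the hybrid cache is roughly a quarter the size of the full-attention cache at one million tokens, which is the kind of saving that makes long-context serving economical.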
This focus on efficiency is reflected in its API pricing. According to published rates, the DeepSeek V4 Pro model is priced at $1.74 per million input tokens and $3.48 per million output tokens. This positions it competitively against other high-performance models. For comparison, Xiaomi's recently released MiMo-V2.5-Pro is priced at $1.00 for input and $3.00 for output, while Anthropic's powerful Claude Opus 4.7 costs significantly more at $5.00 for input and $25.00 for output.
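The published per-million-token rates above can be turned into per-request costs with a few lines of arithmetic. The helper below uses the prices quoted in this article; the 800k-input/20k-output workload is a hypothetical long-context agent call chosen for illustration.

```python
# USD per million tokens (input, output), from the rates quoted above.
PRICES = {
    "DeepSeek V4 Pro": (1.74, 3.48),
    "MiMo-V2.5-Pro":   (1.00, 3.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

def request_cost(model, input_tokens, output_tokens):
    """Cost in USD for one API call at the listed per-million rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical long-context agent call: 800k tokens in, 20k tokens out.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 800_000, 20_000):.2f}")
```

At that workload the call costs about $1.46 on DeepSeek V4 Pro, $0.86 on MiMo-V2.5-Pro, and $4.50 on Claude Opus 4.7, showing how heavily input-side pricing dominates long-context use.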
Agentic AI and Open Source Fueling Competition
DeepSeek's strategy appears focused on the growing field of Agentic AI. The company stated that V4 is optimized for popular AI agent frameworks such as OpenClaw and CodeBuddy, which allow AI to autonomously complete complex tasks on a user's behalf. This market segment has seen intense competition, with models like Xiaomi's MiMo-V2.5-Pro demonstrating high efficiency on agentic benchmarks.
Part of DeepSeek's strategy includes making its systems open source, a contrast to the proprietary models from OpenAI, Google, and Anthropic. This approach has driven adoption by Chinese municipalities and businesses but has also drawn scrutiny. The White House recently accused Chinese firms of using "industrial-scale distillation campaigns to steal American AI," a claim Beijing called "baseless." DeepSeek's open-source approach, combined with its performance claims and compatibility with domestic hardware, marks a significant milestone in China's effort to build a self-reliant AI industry.
This article is for informational purposes only and does not constitute investment advice.