Huawei leads China's chip pivot as Nvidia's market share falls to zero

China's AI chip industry has abandoned efforts to clone Nvidia's general-purpose GPUs, pivoting instead to custom ASICs that sacrifice flexibility for raw efficiency — a structural shift accelerated by sustained US export controls that block access to the most powerful American processors.

"Enterprises with robust AI engineering capabilities and a clear roadmap benefit from ASICs, while those running mixed workloads still lean toward general-purpose GPUs," Su Lian Jye, chief analyst at Omdia, said.

Huawei Technologies is projected to capture 62% of China's domestic AI accelerator market in 2026, followed by Cambricon Technologies at 14%, according to a Morgan Stanley report published May 8. Among big tech firms building proprietary chips, Baidu and Alibaba Group are each expected to take roughly 5%. Huawei expects its AI chip revenue to reach about $12 billion in 2026, up from $7.5 billion in 2025. Nvidia's share of the Chinese AI accelerator market has effectively collapsed to zero, a development Chief Executive Officer Jensen Huang has described as a "horrible outcome" for the United States because it breaks Chinese developers' dependence on Nvidia's CUDA software ecosystem.
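As a quick check on the figures in this paragraph, the short Python sketch below adds up the named vendors' projected shares and works out the revenue growth implied by Huawei's numbers. It treats the "roughly 5%" figures for Baidu and Alibaba as exactly 5, which is an assumption, so the remainder it prints is only approximate.

```python
# Figures as quoted in the article (Morgan Stanley projections for 2026).
projected_share_pct = {
    "Huawei": 62,
    "Cambricon": 14,
    "Baidu": 5,    # "roughly 5%" in the article; treated as exact here
    "Alibaba": 5,  # "roughly 5%" in the article; treated as exact here
}

named_total = sum(projected_share_pct.values())
unattributed = 100 - named_total  # share left for every vendor not named

# Huawei's expected AI chip revenue, in billions of US dollars.
revenue_2025_bn = 7.5
revenue_2026_bn = 12.0
growth_pct = (revenue_2026_bn - revenue_2025_bn) / revenue_2025_bn * 100

print(f"Named vendors: {named_total}% of the market")
print(f"Unattributed remainder: {unattributed}%")
print(f"Implied Huawei AI chip revenue growth: {growth_pct:.0f}%")
```

On these assumptions, the four named companies account for about 86% of the projected market, and Huawei's own forecast implies AI chip revenue growth of about 60% year over year.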

The divergence carries long-term consequences for investors. If China's AI industry standardizes on a mix of Huawei neural processing units, Alibaba parallel processing units, and Cambricon domain-specific chips — each running its own software stack — the result will be a fragmented but domestically self-sufficient ecosystem that operates on fundamentally different architectural assumptions from the Nvidia-dominated West. Nvidia's CUDA lock-in, built over two decades, faces its first credible challenge.

Three architectures, one direction

Chinese companies are pursuing three distinct ASIC designs. Huawei is betting on neural processing units through its Ascend series, including the widely deployed 910C and the upcoming Ascend 950. Cambricon is building domain-specific architectures with its Siyuan 590 and 690 series. Alibaba's semiconductor unit T-Head launched the Zhenwu M890 parallel processing unit at its annual cloud computing summit last week, claiming three times the performance of its predecessor.

On the GPU side, Moore Threads — founded in 2020 by Zhang Jianzhong, Nvidia's former China executive — leads the domestic effort with general-purpose chips like the MTT S5000 series. Biren Technology, Enflame, and Iluvatar CoreX are also competing, but none has achieved the scale of the ASIC leaders.

The performance gap between Chinese chips and Nvidia's export-compliant hardware has not just narrowed but, by some measures, reversed. Morgan Stanley data shows that Huawei's Ascend 950 cards and Cambricon's Siyuan 690 can outperform Nvidia's H20, the most powerful chip Nvidia is currently permitted to sell to China, by 50% to 150% as measured in tokens per second. The H20 itself is roughly one-sixth as powerful as Nvidia's H200, according to a Council on Foreign Relations report.
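The two comparisons above can be chained to get a rough sense of where the Chinese chips sit relative to the H200. The sketch below does that arithmetic. It assumes the Morgan Stanley uplift (measured in tokens per second) and the Council on Foreign Relations ratio (described only as "powerful") can be multiplied together, which may not hold since the two sources may use different metrics.

```python
from fractions import Fraction

# Ratios quoted in the article.
h20_vs_h200 = Fraction(1, 6)     # H20 is roughly one-sixth of an H200
uplift_low = Fraction(50, 100)   # Chinese chips beat the H20 by 50% ...
uplift_high = Fraction(150, 100) # ... to 150%

def relative_to_h200(uplift):
    """Scale an uplift over the H20 into a fraction of H200 performance.

    Assumes both quoted figures measure the same thing, which the
    article does not confirm.
    """
    return (1 + uplift) * h20_vs_h200

low = relative_to_h200(uplift_low)
high = relative_to_h200(uplift_high)

print(f"Implied range: {float(low):.2f} to {float(high):.2f} of an H200")
```

Under that assumption, the implied range is roughly one-quarter to five-twelfths of an H200, so the chips would clear Nvidia's export-compliant part while still trailing its flagship.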

The software stack challenge

Hardware performance is only half the equation. The deeper challenge for China's chip industry is breaking the lock-in created by Nvidia's CUDA platform, the software layer that millions of AI developers worldwide use to write code for Nvidia hardware. Virtually every AI framework, every research paper, and every pre-trained model assumes CUDA compatibility.

Huawei is building CANN as its alternative, while Moore Threads has developed MUSA. DeepSeek has spent months rewriting its core code to work with Huawei's CANN framework, moving away from the CUDA ecosystem. Semiconductor analyst Zhang Haijun notes, however, that as AI models grow more complex, the boundaries between custom ASICs and flexible GPUs are "becoming increasingly blurry," suggesting the winning architecture may eventually combine elements of both.
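To make the fragmentation concrete, the Python sketch below shows the kind of backend check application code might need once more than one software stack is in play. It is illustrative only and is not drawn from DeepSeek's or any vendor's code. The package names `torch_npu` and `torch_musa`, the device strings, and the priority order are assumptions about how the vendor plug-ins for PyTorch are typically exposed. The check confirms only that a package is installed, not that matching hardware is present.

```python
import importlib.util

# Assumed mapping from PyTorch adapter package to device string.
# torch_npu (Huawei Ascend / CANN) and torch_musa (Moore Threads / MUSA)
# are treated here as vendor plug-ins; the order is purely illustrative.
BACKENDS = [
    ("torch_npu", "npu"),
    ("torch_musa", "musa"),
    ("torch", "cuda"),
]

def is_installed(package):
    """Return True if the package can be located without importing it."""
    return importlib.util.find_spec(package) is not None

def pick_device(installed=is_installed):
    """Return the device string of the first installed backend, else 'cpu'.

    A real program would also ask the chosen backend whether usable
    hardware is actually present before committing to it.
    """
    for package, device in BACKENDS:
        if installed(package):
            return device
    return "cpu"

print(pick_device())
```

Choosing a device string is the easy part. The months of porting effort described above go into the kernels and performance tuning underneath it, which a check like this does nothing to address.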

For China's highly commercialized AI market, which focuses on deploying applications to hundreds of millions of users rather than conducting frontier research, the ASIC approach makes particular sense. Inference — the process of running a trained model at scale — rewards the kind of narrow optimization that custom silicon provides. Training new models still benefits from GPU flexibility, but the revenue is in deployment.

The long-term consequence of this divergence may be more significant than near-term performance benchmarks. If China's AI industry standardizes on domestic chips and software stacks, cross-border AI collaboration becomes harder because the underlying compute stacks would be incompatible. And because no single platform dominates, no Chinese chip maker benefits from the kind of ecosystem lock-in that made Nvidia's CUDA so powerful in the first place.

Nvidia shares, trading at about 35 times forward earnings, face a structural overhang from the China revenue loss. While the company's data center business remains dominant globally — generating $62 billion in the last fiscal year — the erosion of its China franchise removes a growth vector that analysts had previously modeled as a multiyear tailwind. The question for investors is whether the custom silicon ecosystem China is building can match the pace of innovation in the Nvidia-powered West.

This article is for informational purposes only and does not constitute investment advice.