Chinese GPU developer Moore Threads is shifting from selling chips to delivering complete AI infrastructure, aiming to capture a domestic market constrained by U.S. export controls.
Chinese GPU firm Moore Threads is rolling out a full-stack “cloud-to-edge” AI platform in a direct challenge to Nvidia Corp.’s dominance in China. The platform is an integrated system of hardware and software designed to lower the barrier for companies migrating from Nvidia’s CUDA ecosystem.
“Single card performance is the entry point, but system capability is what influences procurement and repurchase,” the company’s press materials for the May 18 launch stated, signaling a strategic pivot from components to integrated AI infrastructure delivery.
The launch includes the Kua’e exascale computing cluster, which has already been deployed and achieves up to 60% Model FLOPs Utilization (MFU), the share of the hardware’s theoretical peak throughput that goes into useful model computation, when training large models. It is complemented by the MUSA SDK 5.1.0, now compatible with Nvidia’s CUDA 12.8 and supporting all 3,194 PyTorch operators.
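For readers unfamiliar with the metric, MFU can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes the common rule of thumb that training a dense transformer costs roughly 6 floating-point operations per parameter per token, and every input value (model size, throughput, GPU count, per-GPU peak) is a hypothetical placeholder, not a published Moore Threads or Kua’e figure.

```python
def model_flops_utilization(params, tokens_per_second, num_gpus, peak_flops_per_gpu):
    """Estimate MFU as achieved model FLOPs/s divided by peak hardware FLOPs/s.

    Assumes ~6 FLOPs per parameter per token (forward plus backward pass),
    a widely used approximation for dense transformer training.
    """
    achieved_flops_per_second = 6 * params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical inputs, chosen only to illustrate the calculation:
# a 10-billion-parameter model processing 1 million tokens per second
# on 1,000 GPUs, each with a peak of 100 TFLOPS.
mfu = model_flops_utilization(
    params=10e9,
    tokens_per_second=1e6,
    num_gpus=1000,
    peak_flops_per_gpu=1e14,
)
print(f"Estimated MFU: {mfu:.0%}")
```

Under these made-up inputs the cluster would be doing about 6 × 10^16 useful operations per second against a theoretical ceiling of 10^17, which is the kind of ratio a 60% MFU claim describes.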
The move positions Moore Threads to capture a piece of China’s estimated $50 billion annual AI market, a segment where Nvidia’s access has been curtailed by U.S. export restrictions. If successful, the strategy could accelerate China’s AI self-sufficiency and challenge Nvidia’s long-term revenue prospects in the region. China accounted for $17.1 billion, or 13%, of Nvidia’s total revenue before the tightest controls were implemented.
From GPU Vendor to System Architect
Moore Threads’ announcement marks a significant strategic evolution from a hardware vendor into a system architect. The company’s new product matrix is built around a three-pronged approach: the Kua’e cluster for cloud-based AI training, products based on the new Changjiang SoC for edge and terminal devices, and the MT Lambda platform for simulation. This integrated portfolio is designed to prove to large-scale enterprise clients that the company can deliver and maintain a complex, end-to-end AI workflow, a crucial factor for customers undertaking multi-year AI projects.
At the edge, the company introduced the E300 module, based on the Changjiang SoC, which provides 50 TOPS of heterogeneous AI compute power for applications like industrial inspection, autonomous vehicles, and robotics that require low-latency, local inference. By providing a unified architecture from the cloud to the edge, Moore Threads aims to simplify deployment for developers building hybrid AI applications.
Lowering the CUDA Moat
For years, the biggest obstacle for any would-be Nvidia competitor has been CUDA, the company’s proprietary software platform that has become deeply entrenched in the AI development community. Moore Threads is tackling this challenge head-on. By open-sourcing vLLM-MUSA and achieving native support in the popular SGLang framework, the company is working to minimize the friction developers face when moving away from Nvidia’s ecosystem.
The effort addresses the “long tail” of compatibility issues—such as custom kernels and legacy dependencies—that often derail migration projects. While supporting major frameworks is a baseline requirement, ensuring that a company’s entire historical engineering effort can be ported smoothly is the real test. Moore Threads’ focus on its MUSA software stack, including an automatic migration tool, is a direct attempt to make its GPUs not just usable, but easy to adopt for a development community largely trained on Nvidia’s tools.
Targeting Embodied AI
Perhaps the most forward-looking component of the launch is the MT Lambda simulation platform, which pushes Moore Threads’ GPU narrative into the realm of physical AI. As AI moves from digital spaces to interacting with the physical world in robotics and autonomous driving, the need for high-fidelity simulation becomes paramount. Training these systems in the real world is expensive and dangerous.
Moore Threads is positioning its “full-function GPU,” which integrates graphics rendering, physics simulation, and AI computation on a single chip, as the ideal foundation for this work. By enabling the efficient generation of synthetic data and the validation of control policies in a virtual environment, the platform could become a critical piece of infrastructure for companies like Pony.ai and Zhipu AI, both of which are listed as partners. This move pits Moore Threads against not just Nvidia’s GPU hardware but also its simulation platforms, such as Omniverse.
The strategy is not without risk. By expanding its scope from chips to full systems, Moore Threads is now competing on multiple fronts—cloud stability, developer experience, and real-world application performance. However, with U.S. restrictions creating a potential opening for domestic players like Huawei and Moore Threads, the opportunity to become deeply embedded in China’s AI buildout may be worth the risk.
This article is for informational purposes only and does not constitute investment advice.