China's Z.ai Tops SWE-Bench Pro With No Nvidia Hardware — Export Controls' Biggest Test Yet

Z.ai's GLM-5.1 scored 58.4% on SWE-bench Pro, becoming the first open-weight model to top the coding benchmark — ahead of GPT-5.4 and Claude Opus 4.6. The entire training run used Huawei Ascend 910B chips, as Z.ai has been on the US Entity List since January 2025 with no access to Nvidia hardware.

Z.ai (formerly Zhipu AI) released GLM-5.1 on April 7, 2026. The 744B-parameter Mixture-of-Experts model scored 58.4% on SWE-bench Pro, placing it #1 on the benchmark's leaderboard. That edges out GPT-5.4 (57.7%) by 0.7 percentage points and Claude Opus 4.6 (57.3%) by 1.1 points, making GLM-5.1 the first open-weight model to beat proprietary systems on the industry's most rigorous software engineering evaluation.
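To put the size of those leads in context, here is a rough sketch that computes the margins from the reported scores and compares them with the sampling noise of a pass-rate benchmark. The three scores come from the article. The task count of 1,000 is a hypothetical placeholder, not the actual size of SWE-bench Pro, and the binomial standard error is a simplifying assumption that treats each task as an independent pass/fail trial.

```python
import math

# Reported SWE-bench Pro scores from the article, in percent.
SCORES = {"GLM-5.1": 58.4, "GPT-5.4": 57.7, "Claude Opus 4.6": 57.3}


def margin(leader: str, rival: str) -> float:
    """Lead of one model over another, in percentage points."""
    return round(SCORES[leader] - SCORES[rival], 1)


def standard_error_points(score_pct: float, n_tasks: int) -> float:
    """Binomial standard error of a pass rate, in percentage points.

    Assumes every task is an independent pass/fail trial (a simplification).
    """
    p = score_pct / 100
    return 100 * math.sqrt(p * (1 - p) / n_tasks)


if __name__ == "__main__":
    HYPOTHETICAL_N_TASKS = 1000  # placeholder, not the real benchmark size
    for rival in ("GPT-5.4", "Claude Opus 4.6"):
        print(f"GLM-5.1 lead over {rival}: {margin('GLM-5.1', rival)} points")
    se = standard_error_points(SCORES["GLM-5.1"], HYPOTHETICAL_N_TASKS)
    print(f"Standard error at n={HYPOTHETICAL_N_TASKS}: {se:.2f} points")
```

Under that hypothetical task count, the standard error on a single score is larger than either lead, which is one reason narrow leaderboard gaps are usually read with caution.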

The hardware story is the real news. Z.ai has been on the US Commerce Department Entity List since January 2025, cutting off access to Nvidia A100s, H100s, and H200s — the chips that power virtually every major frontier model training run. The entire GLM-5 training run was executed on approximately 100,000 Huawei Ascend 910B chips, demonstrating that the Ascend platform has matured into a genuine frontier training alternative.

The model weights are released on Hugging Face under an MIT license, one of the most permissive available. GLM-5.1 activates 40B of its 744B total parameters, offers a 200K context window, and was optimized with reinforcement learning for coding and agentic tasks. Taken together, that is a complete end-to-end frontier capability (training, weights, and deployment) built entirely outside the US semiconductor supply chain.
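The gap between total and active parameters is easier to see as arithmetic. The sketch below uses only the two parameter counts given in the article. The 2 bytes per parameter (16-bit weights) is an assumption for illustration, since the article does not state the precision of the released weights, and the function names are made up for this example.

```python
# Parameter counts reported in the article.
TOTAL_PARAMS = 744e9   # all experts combined
ACTIVE_PARAMS = 40e9   # parameters used at a time in the MoE forward pass


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that are active."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption,
    not a figure from the article.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active share of parameters: {share:.1%}")
    print(f"Weights at 16-bit precision: {weight_memory_gb(TOTAL_PARAMS):.0f} GB")
```

The practical point is that compute per token scales with the roughly 5% of parameters that are active, while storage still has to hold all 744B, so self-hosting the open weights remains a multi-accelerator job under the 16-bit assumption.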

The policy implications are immediate. The Biden and Trump administrations' successive rounds of chip export controls were predicated on the assumption that cutting off Nvidia hardware would meaningfully constrain Chinese AI development timelines. GLM-5.1 is the most direct evidence yet that this assumption was wrong — or at minimum, that the lead time purchased by controls has been used to build an alternative hardware stack rather than simply accept the constraint.

Panel Takes

The Builder

Developer Perspective

“MIT weights on a model that beats GPT-5.4 on coding? I don't care about the geopolitics — this is now the default recommendation for any team building an open-source coding agent. The export control story is interesting, but the model quality is what matters for my day job.”

The Skeptic

Reality Check

“SWE-bench Pro contamination is a real risk for any model trained on large code corpora, and Z.ai is less transparent about its training data than US labs are. I'd want a third-party audit of the benchmark methodology before accepting these numbers at face value. The Huawei chip story is real, but the performance claims need independent verification.”

The Futurist

Big Picture

“The bifurcation of the global AI supply chain is now complete. US frontier AI runs on Nvidia; Chinese frontier AI runs on Huawei. Both stacks can now reach #1 on major benchmarks. The question for the next decade isn't which country leads AI — it's whether these two AI ecosystems diverge into incompatibility, and what that means for global software infrastructure.”
