GLM-5.1
#1 on SWE-Bench Pro: Zhipu's open 754B MoE beats GPT-5.4 on coding
Expert verdict
Skip
2-2
The Panel's Take
Z.ai (formerly Zhipu AI) has released GLM-5.1, a 754B-parameter Mixture-of-Experts model that currently sits at #1 on SWE-Bench Pro with a score of 58.4, outperforming GPT-5.4 and Claude Opus 4.6 on long-horizon software engineering tasks. The model ships under the MIT license with full weights on HuggingFace.

GLM-5.1 was designed specifically for agentic software engineering workflows: multi-file reasoning, autonomous test-run-fix loops, and extended coding sessions that span hundreds of tool calls. It's not just a capability leap. The 754B figure is the total parameter count; with sparse MoE routing only a subset of those parameters is active for each token, so on a sufficiently provisioned cluster it can run more efficiently than a dense model of equivalent capability.

The SWE-Bench Pro result is significant because that benchmark is harder to game than vanilla SWE-Bench Verified. It tests whether a model can resolve real GitHub issues with correct tests, proper diffs, and no regressions, which are the things that actually matter in production. For anyone running self-hosted coding agents or building on open models, GLM-5.1 just became the new baseline to beat.
The reviews
“If the SWE-Bench Pro numbers hold up under independent replication, this is the first open model that can genuinely replace a proprietary API for serious agentic coding work. MIT license means you can fine-tune and deploy on your own infra. This is a big deal.”
“754B parameters is not something 99% of developers can run locally. You need a multi-GPU cluster or serious cloud spend. The benchmark numbers are from Z.ai's own evaluations, and Zhipu has a history of optimistic benchmarking. Wait for independent replications.”
“A Chinese lab shipping an MIT-licensed model that tops global coding benchmarks is a watershed moment for open-source AI. The geopolitical implications are real — this is the model that makes US export controls look strategically shortsighted.”
“Unless you're building coding tools or agent infrastructure, a 754B MoE model doesn't move the needle for creative applications. The energy and infra overhead for creative use cases doesn't pencil out versus smaller, cheaper models.”
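To put the "multi-GPU cluster" objection from the reviews in rough numbers, here is a back-of-the-envelope sketch of how much memory the weights alone would need. Only the 754B total parameter count comes from the review. The bytes-per-parameter values (16-bit, 8-bit, 4-bit storage) and the 80 GB device size are illustrative assumptions, not published GLM-5.1 deployment figures, and the function names are hypothetical. The estimate ignores KV cache, activations, and runtime overhead, so real requirements would be higher. It also assumes every expert must be resident in memory, which is the usual case for MoE serving even though only some experts are active per token.

```python
import math

# From the review: total parameter count of the model.
TOTAL_PARAMS = 754e9


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return total_params * bytes_per_param / 1e9


def min_devices(total_params: float, bytes_per_param: float,
                mem_per_device_gb: float = 80.0) -> int:
    """Lower bound on device count, counting weights only.

    mem_per_device_gb defaults to 80 GB, an assumed accelerator size.
    """
    needed = weight_memory_gb(total_params, bytes_per_param)
    return math.ceil(needed / mem_per_device_gb)


# Assumed storage precisions, for illustration only.
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gb = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
    n = min_devices(TOTAL_PARAMS, bytes_per_param)
    print(f"{label}: {gb:.0f} GB of weights, at least {n} x 80 GB devices")
```

Under these assumptions, even aggressive 4-bit storage leaves hundreds of gigabytes of weights, which is consistent with the reviewer's point that most developers cannot run the model locally.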