AI tool comparison

DeepSeek V4-Pro vs MiniMax M2.7

Which one should you ship with? Here is the side-by-side panel verdict, pricing read, reviewer split, and community vote comparison.

Foundation Models

DeepSeek V4-Pro

1.6T-param MoE model, 1M context, Nvidia-free, newly released under Apache 2.0

Panel verdict: Ship (75% panel ship)

DeepSeek has released V4-Pro and V4-Flash simultaneously, and it's a statement release. V4-Pro packs 1.6 trillion total parameters into a MoE architecture with only 49B active per token, a 1-million-token context window, and a hybrid attention system (Compressed Sparse Attention + Heavily Compressed Attention) that needs just 27% of the single-token inference FLOPs of V3.2. Both models are Apache 2.0.

The hardware story is arguably the bigger news: V4 was trained entirely on Huawei Ascend 950PR chips, with zero Nvidia hardware. That's a geopolitical and technical milestone, because it validates China's domestic AI compute stack at frontier scale.

The Engram Memory System gives V4 conditional context recall (94% at 128K tokens vs ~45% for V3.2), enabling genuinely long-context reasoning. V4-Flash, at 284B parameters (13B active), is the cheaper, faster sibling for production use. Pricing is expected to be around $0.30/M tokens for Pro.

The timing says a lot too: posted to Hacker News today, the release drew 99+ points within hours and sparked an immediate conversation in the developer community about whether open-weight frontier models have finally matched proprietary ones.
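To put the "only 49B active per token" claim in proportion, here is a minimal sketch that turns the parameter counts quoted above into an active-parameter share. The function name and the dictionary layout are illustrative assumptions; only the four parameter counts come from the description.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a Mixture-of-Experts model's parameters used per token."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b


# (total, active) in billions of parameters, as quoted in the description.
MODELS = {
    "V4-Pro": (1600.0, 49.0),
    "V4-Flash": (284.0, 13.0),
}

for name, (total, active) in MODELS.items():
    share = active_fraction(total, active)
    print(f"{name}: {share:.1%} of parameters active per token")
```

On the quoted figures this works out to roughly 3.1% for V4-Pro and 4.6% for V4-Flash. The share only describes per-token compute; it says nothing about how much memory is needed to hold the full set of weights.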

AI Models

MiniMax M2.7

The open-source AI that improves its own training

Panel verdict: Ship (75% panel ship)

MiniMax M2.7 is a 230B-parameter Mixture-of-Experts model (10B active) that does something no major open-source model has done before: it participates in its own development cycle. During training, M2.7 updated its own memory, built skills for RL experiments, and improved its own learning process. An internal version autonomously optimized a programming scaffold over 100+ rounds to achieve a 30% performance improvement.

On benchmarks, M2.7 scores 56.22% on SWE-Pro and 57.0% on TerminalBench 2, putting it in the same tier as GPT-5.3 for coding tasks. It achieves an Elo of 1495 on GDPval-AA (highest among open-source models) and 97% skill adherence across 40+ complex, multi-thousand-token skills. For office productivity tasks, such as generating Word, Excel, and PowerPoint files or running financial analysis, it performs at a junior-analyst level.

Released under the MIT license on April 12, 2026, M2.7 is available on Hugging Face and via the MiniMax API. The model is particularly strong at agentic workflows: tool calling, multi-step task execution, and professional productivity use cases that require sustained context and precise instruction following.

| Decision | DeepSeek V4-Pro | MiniMax M2.7 |
|---|---|---|
| Panel verdict | Ship · 3 ship / 1 skip | Ship · 3 ship / 1 skip |
| Community | No community votes yet | No community votes yet |
| Pricing | Open Source (Apache 2.0) / ~$0.30/MTok API | API pricing / Open Source (MIT) |
| Best for | 1.6T-param MoE model, 1M context, Nvidia-free, newly released under Apache 2.0 | The open-source AI that improves its own training |
| Category | Foundation Models | AI Models |
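The 75% "panel ship" figure follows directly from the 3 ship / 1 skip split shown for both tools. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def ship_rate(ship_votes: int, skip_votes: int) -> float:
    """Fraction of panel reviewers who voted 'ship'."""
    total = ship_votes + skip_votes
    if total == 0:
        raise ValueError("no votes cast")
    return ship_votes / total


# Both tools have the same split: 3 ship, 1 skip.
print(f"Panel ship: {ship_rate(3, 1):.0%}")  # Panel ship: 75%
```

With no community votes recorded for either tool, the same helper would raise an error rather than report a misleading 0%.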

Reviewer scorecard

Builder
DeepSeek V4-Pro: 80/100 · ship

Apache 2.0 with 1M context and frontier-level benchmarks changes the commercial calculus entirely. Self-host for sensitive workloads, use the API for production. The 49B active parameters mean reasonable inference costs if you have the hardware.

MiniMax M2.7: 80/100 · ship

MIT license, 10B active params, and SWE-Pro scores matching GPT-5.3? This is the open-source agentic backbone I've been waiting for. The self-improvement angle is genuinely unprecedented — watching a model optimize its own scaffold over 100 rounds is the kind of thing that used to be sci-fi.

Skeptic
DeepSeek V4-Pro: 45/100 · skip

Benchmark claims from DeepSeek have historically been hard to independently replicate at launch. The Huawei chip story is compelling but also means the Western open-source deployment story requires significant hardware work. And 1.6T parameters is not consumer hardware territory.

MiniMax M2.7: 45/100 · skip

230B total parameters is not something most people can run locally. You need serious cluster access or you're using their API, which means the 'open source' framing is mostly PR. And 'self-evolving' sounds revolutionary, but the actual mechanism is an AutoML loop, something the field has had for years.
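Both Skeptic notes turn on how much hardware the full weights need. A rough back-of-the-envelope sketch follows, assuming weight memory is simply parameter count times bytes per parameter. The precision options are assumptions for illustration; the source does not state the native precision of either release, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Bytes per parameter for a few common storage precisions (assumed options,
# not confirmed for either model).
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params_b: float, precision: str) -> float:
    """Approximate GB needed for the weights alone.

    Billions of parameters x bytes per parameter gives gigabytes directly
    (1e9 params * bytes / 1e9 bytes per GB).
    """
    return total_params_b * BYTES_PER_PARAM[precision]


# Total (not active) parameter counts quoted in the descriptions.
for name, total_b in {"DeepSeek V4-Pro": 1600.0, "MiniMax M2.7": 230.0}.items():
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(total_b, precision)
        print(f"{name} @ {precision}: ~{gb:,.0f} GB of weights")
```

The estimate uses total rather than active parameters because, in a typical MoE deployment, all experts have to be resident even though only a few run per token. Under these assumptions the 16-bit figures come to about 3,200 GB and 460 GB respectively, which supports the point that neither model is consumer-hardware territory.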

Futurist
DeepSeek V4-Pro: 80/100 · ship

V4's Nvidia-free training stack is a geopolitical inflection point as much as a technical one. It proves the export control strategy isn't containing China's AI progress — and gives the global open-source community a frontier model with no licensing restrictions.

MiniMax M2.7: 80/100 · ship

A model that improves its own training process is a meaningful step toward recursive self-improvement. Even if the current implementation is narrow, this is the architectural direction that matters. MiniMax just showed a credible open-source path to it.

Creator
DeepSeek V4-Pro: 80/100 · ship

A 1M-token context model at $0.30/MTok Apache 2.0 means long-form creative projects — novels, screenplays, brand bibles — can finally be processed holistically. The Flash variant's low cost makes it accessible even for creative side projects with tight budgets.
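For a sense of scale on the pricing point, here is a minimal cost sketch. It assumes a flat rate of $0.30 per million tokens applied to input tokens only, which is a simplification: the source gives this only as an expected price and does not break out input versus output rates. The 120,000-token manuscript is a hypothetical example.

```python
def prompt_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of sending `tokens` tokens at a flat per-million-token rate."""
    if tokens < 0:
        raise ValueError("token count cannot be negative")
    return tokens / 1_000_000 * usd_per_million_tokens


EXPECTED_RATE = 0.30  # USD per million tokens, the expected Pro price quoted above

# One pass over a full 1M-token context window.
print(f"Full context: ${prompt_cost_usd(1_000_000, EXPECTED_RATE):.2f}")

# Hypothetical 120,000-token manuscript.
print(f"Manuscript:   ${prompt_cost_usd(120_000, EXPECTED_RATE):.3f}")
```

At that rate a single full-window pass costs about 30 cents, so repeated revision passes over a long manuscript stay in the range of a few dollars, provided the expected price holds.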

MiniMax M2.7: 80/100 · ship

97% skill adherence across 40+ complex, multi-thousand-token skills means M2.7 can actually execute complex creative briefs without drifting. For long-form content workflows that need consistent style and structure, this is a real upgrade over models that forget instructions halfway through.
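The scorecard above lists four reviewer scores per tool but no aggregate. As a quick sketch, an unweighted mean is one way to summarize them. That choice is an assumption, since the page does not say how, or whether, scores are combined.

```python
from statistics import mean

# Reviewer scores from the scorecard, in order: Builder, Skeptic, Futurist, Creator.
SCORES = {
    "DeepSeek V4-Pro": [80, 45, 80, 80],
    "MiniMax M2.7": [80, 45, 80, 80],
}

for tool, scores in SCORES.items():
    print(f"{tool}: mean {mean(scores):.2f}/100 across {len(scores)} reviewers")
```

Both tools land on the same mean of 71.25, so the numbers alone do not separate them. The differences are in the reviewers' reasoning: hardware and replication concerns for V4-Pro, local-run feasibility and the novelty of the self-improvement loop for M2.7.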

DeepSeek V4-Pro vs MiniMax M2.7: Which AI Tool Should You Ship? — Ship or Skip