Question 1

Which is better: DeepSeek V4-Pro or Kimi K2.5?

Accepted Answer

Both models received a panel verdict of Ship. Our expert panel gives DeepSeek V4-Pro the stronger verdict of the two, with a 75% Ship rate.

Question 2

Is DeepSeek V4-Pro free?

Accepted Answer

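DeepSeek V4-Pro is open source under the Apache 2.0 license, so the weights themselves carry no license fee. API access is priced at roughly $0.30 per million tokens (MTok).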
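To put the API rate quoted above in concrete terms, here is a minimal sketch of a cost estimate. It assumes a single flat rate of $0.30 per million tokens with no split between input and output tokens. That is a simplification, and the function name is illustrative, not part of any DeepSeek SDK.

```python
def estimate_api_cost(total_tokens: int, usd_per_million_tokens: float = 0.30) -> float:
    """Estimate API spend in USD at a flat per-million-token rate.

    The default rate is the approximate figure quoted in the answer above;
    real pricing may differ for input and output tokens.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Example: a workload of 250 million tokens
print(f"${estimate_api_cost(250_000_000):.2f}")  # prints $75.00
```

Self-hosting the open weights avoids the per-token charge but moves the cost to your own hardware, which this estimate does not cover.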

Question 3

Is Kimi K2.5 free?

Accepted Answer

Kimi K2.5 is open source under a Modified MIT License and is also available through an API.

Question 4

What do experts say about DeepSeek V4-Pro vs Kimi K2.5?

Accepted Answer

DeepSeek V4-Pro: DeepSeek just dropped V4-Pro and V4-Flash simultaneously, and it's a statement release. V4-Pro is a Mixture-of-Experts (MoE) model with 1.6 trillion total parameters, of which only 49B are active per token. It has a 1-million-token context window and a hybrid attention system (Compressed Sparse Attention + Heavily Compressed Attention) that requires just 27% of the single-token inference FLOPs of V3.2. Both models are licensed under Apache 2.0.
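The parameter counts quoted above imply that only a small share of the model runs for each token. A quick arithmetic check, using only the 1.6 trillion total and 49B active figures from the text (the function name is illustrative):

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of a MoE model's parameters that are used for a single token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


# V4-Pro: 49B active out of 1.6 trillion total (figures from the text above)
v4_pro_share = active_fraction(49e9, 1.6e12)
print(f"{v4_pro_share:.2%}")  # prints 3.06%
```

In other words, roughly 3% of the weights take part in any one forward step, which is why a model of this size can still be comparatively cheap to run per token.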

The hardware story is arguably the bigger news: V4 was trained entirely on Huawei Ascend 950PR chips, with no NVIDIA hardware at all. That's a geopolitical and technical milestone, because it validates China's domestic AI compute stack at frontier scale. The Engram Memory System gives V4 conditional context recall (94% at 128K tokens vs ~45% for V3.2), enabling genuinely long-context reasoning.

V4-Flash, at 284B total parameters (13B active), is the cheaper, faster sibling for production use. Pricing for Pro is expected to be around $0.30 per million tokens. The release hit Hacker News (HN) today and collected 99+ points within hours, a sign that developers are already debating whether open-weight frontier models have finally matched proprietary ones.

Kimi K2.5: Kimi K2.5 is Moonshot AI's flagship open-weight model, combining multimodal vision–language understanding with frontier-level agentic capabilities. It was built by continual pretraining on approximately 15 trillion mixed visual and text tokens atop the Kimi-K2-Base architecture, with Moonshot's MoonViT-3D vision encoder added for native image understanding, and it has a 256K context window.

The standout feature is Agent Swarm mode: K2.5 can orchestrate up to 100 parallel sub-agents using a new RL training technique called Parallel Agent Reinforcement Learning (PARL). This lets it decompose complex tasks and execute them concurrently rather than serially, which is a meaningful architectural bet on where frontier AI is heading. It supports both instant and thinking modes, and both conversational and agentic paradigms.
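The concurrent-versus-serial idea can be illustrated with a generic fan-out pattern. The sketch below is not Moonshot's implementation and says nothing about how PARL training works. It only shows, with Python's standard asyncio library, how an orchestrator might run sub-tasks in parallel under a cap of 100. `run_sub_agent` and `run_swarm` are hypothetical names, and the sub-agent is a stub that sleeps instead of calling a model.

```python
import asyncio

MAX_PARALLEL_AGENTS = 100  # upper bound quoted in the text above


async def run_sub_agent(subtask: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work
    return f"done: {subtask}"


async def run_swarm(subtasks: list[str], limit: int = MAX_PARALLEL_AGENTS) -> list[str]:
    """Run all subtasks concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(subtask: str) -> str:
        async with semaphore:
            return await run_sub_agent(subtask)

    # asyncio.gather returns results in the same order as its inputs
    return await asyncio.gather(*(guarded(s) for s in subtasks))


if __name__ == "__main__":
    results = asyncio.run(run_swarm([f"subtask-{i}" for i in range(5)]))
    print(results)
```

With serial execution, total time grows with the number of sub-tasks. With this pattern it is closer to the duration of the slowest batch, which is the practical appeal of decomposing a task across many agents.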

Benchmark-wise, Moonshot claims K2.5 outperforms GPT-5.2 Pro on BrowseComp and Claude Opus 4.5 on WideSearch. Model weights are available on HuggingFace under a Modified MIT License. This is one of the most capable open-weight multimodal models available.
