Back
DeepSeek / CNBCLaunchDeepSeek / CNBC2026-04-28

DeepSeek V4 Goes Open-Weight With 1M Context — And Undercuts Every Major Competitor by 10×

DeepSeek launched V4 Preview on April 24 with two open-weight variants: the 1.6T-param V4-Pro (49B active) and 284B-param V4-Flash (13B active), both with 1M context. V4-Pro lists at $1.74/MTok input — roughly 10× cheaper than GPT-5.5 and Claude Opus 4.7.

Original source

DeepSeek launched preview versions of its V4 model on April 24, 2026, reigniting the China-US AI rivalry just months after V3.2 sparked a market rout. The release comprises two open-weight models: DeepSeek-V4-Pro (1.6 trillion total / 49 billion active parameters) and DeepSeek-V4-Flash (284 billion total / 13 billion active parameters), both supporting 1 million token context windows and dual Thinking/Non-Thinking modes.

The pricing is the headline: V4-Pro lists at $1.74 per million input tokens and $3.48 per million output tokens — roughly an order of magnitude cheaper than GPT-5.5 and Claude Opus 4.7, both of which compete in the same performance tier. Legacy aliases deepseek-chat and deepseek-reasoner are marked for deprecation on July 24, 2026.

DeepSeek's infrastructure story is also significant. The company partnered with Huawei, which provides "Supernode" clusters of Ascend 950 chips to replace the Nvidia H100s blocked by US export controls. The V4 architecture claims a 73% reduction in per-token inference FLOPs and a 90% reduction in KV cache memory versus V3.2 — meaning cheaper operation even before the pricing discount.

On benchmarks, V4-Pro claims performance "rivaling the world's top closed-source models," with particularly strong results in agent-based tasks, knowledge processing, and inference. Independent evals place it above GPT-5.4 on SWE-bench at 80.6%. The open-weight release means the model weights are available on Hugging Face for self-hosting, which closes the gap between what Chinese and Western developers can access.

The V4 Preview is available now at chat.deepseek.com in Expert and Instant modes, with the API live on the same day. The release continues a pattern of DeepSeek shipping models that compress the performance-to-cost curve in ways that force competitors to respond — and this time, they did it without Nvidia hardware.

Panel Takes

The Builder

The Builder

Developer Perspective

10× cheaper at the same performance tier is not a discount, it's a different market. I'm already migrating batch inference jobs from GPT-5.4 to V4-Flash and the numbers are absurd in the best way. The Huawei compute story is also important — this proves the export controls aren't working as designed.

The Skeptic

The Skeptic

Reality Check

DeepSeek's benchmark claims need more independent scrutiny before you bet production infrastructure on them. V3.2's 'preview' had reliability issues for weeks after launch. The 10× price advantage also evaporates quickly if you're routing through API latency from China into US production systems.

The Futurist

The Futurist

Big Picture

DeepSeek V4 Preview is proof that the compute export controls have failed as a containment strategy and may have accelerated Chinese innovation in chip-efficient architectures. The 73% FLOPs reduction is the real story — it means the next generation of frontier models needs less hardware, not more.

Bookmarks

Loading bookmarks...

No bookmarks yet

Bookmark tools to save them for later