C

Codestral 2

Mistral's 22B Apache 2.0 code model beats GPT-4o on HumanEval

PriceOpen Source (Apache 2.0) / API pricingReviewed2026-04-17
Verdict — Ship
3 Ships1 Skips
Visit mistral.ai

The Panel's Take

Codestral 2 is Mistral AI's second-generation code-specialized model, released under the Apache 2.0 license with 22 billion parameters. It ships with native fill-in-the-middle (FIM) support, context up to 256K tokens, and benchmarks that outperform GPT-4o on both HumanEval and MBPP according to Mistral's internal evals — a significant claim for an open-weight model. The model is designed for three primary use cases: inline code completion (with FIM), multi-file code generation with long context, and agentic coding tasks where the model needs to reason about large codebases. Mistral has also optimized it specifically for the most popular languages of 2026: Python, TypeScript, Go, Rust, and SQL. Integration support covers Cursor, Continue.dev, VS Code, and direct API access via the Mistral API and HuggingFace. For the open-source community, Codestral 2 arrives at the right moment. The local LLM coding space has been dominated by Qwen3-Coder variants, and Codestral 2 offers a Western-lab alternative with a permissive license, strong fill-in-the-middle performance, and a model size that fits comfortably on a single A100 or dual consumer GPUs at Q4 quantization.

Share this verdict

Codestral 2 verdict: SHIP 🚀

3 ships · 1 skip from the expert panel

Full review: shiporskip.io/tool/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026

Weekly AI Tool Verdicts

Get the next verdict in your inbox

7 critics review a new AI tool every day. Weekly digest — free.

Embed this verdict

Tool makers can add a live ShipOrSkip badge to their site. Badge loads track impressions; clicks route back to this review.

Ship · 7.5/10
HTML badge
<a href="https://shiporskip.io/api/badge-click/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026" target="_blank" rel="noopener"><img src="https://shiporskip.io/api/badge/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026" alt="Codestral 2 Ship verdict on ShipOrSkip" width="360" height="90" /></a>
Markdown badge
[![Codestral 2 Ship verdict on ShipOrSkip](https://shiporskip.io/api/badge/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026)](https://shiporskip.io/api/badge-click/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026)
Iframe widget
<iframe src="https://shiporskip.io/embed/codestral-2-mistral-22b-apache-code-completion-fill-in-middle-2026" title="Codestral 2 ShipOrSkip verdict" width="360" height="260" style="border:0;border-radius:16px;max-width:100%;" loading="lazy"></iframe>

The reviews

Apache 2.0 + fill-in-the-middle + 256K context is the trifecta I've been waiting for in a locally-runnable code model. The HumanEval numbers are believable based on my early testing — it's genuinely competitive with GPT-4o on completion tasks, which is remarkable at this size and license.

Helpful?

Mistral's benchmarks are self-reported and the comparison methodology isn't fully disclosed. I'd want independent evaluation before trusting 'beats GPT-4o' claims — especially since Mistral's previous eval comparisons have been questioned. Also, 22B at full precision still requires significant GPU memory that most indie developers don't have.

Helpful?

A truly permissive, high-quality code model changes the economics of AI-assisted development for enterprises with data privacy requirements. The real story here isn't beating GPT-4o on benchmarks — it's enabling companies that can't send code to external APIs to finally have a competitive option they can run on-premise.

Helpful?

For the growing community of creators building with AI coding tools, having a locally-runnable model with this quality means your code stays on your machine. The Cursor integration makes it plug-and-play, which lowers the barrier to trying it significantly.

Helpful?

Bookmarks

Loading bookmarks...

No bookmarks yet

Bookmark tools to save them for later