Darwin-4B-David
4.5B merged model beats Gemma-4-31B on GPQA — no training needed
Darwin-4B-David is a 4.5-billion-parameter model that achieves 85.0% on GPQA Diamond, outperforming Google's Gemma-4-31B (84.3%) at roughly one-seventh the parameter count. The kicker: it required no training whatsoever. It was built in 45 minutes on a single H100 using MRI-guided DARE-TIES model merging, a novel variant that combines DARE (drop-and-rescale) with TIES (trim, elect sign, merge).

The MRI-guided approach uses activation analysis to identify which parameters in each source model are most critical, then applies DARE-TIES merging only to those high-value weight regions. This avoids the catastrophic interference that usually degrades merged models. The result is a small model that inherits the strengths of multiple larger predecessors without any of the compute cost of fine-tuning.

For the AI community, this is a meaningful data point: model merging continues to close the gap with expensive training runs. Darwin-4B-David demonstrates that thoughtful merge strategies can extract benchmark-level performance from models a fraction of the size, making capable AI more accessible on consumer hardware.
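To make the merging recipe concrete, here is a minimal NumPy sketch of standard DARE-TIES on per-tensor task vectors (fine-tuned weights minus base weights). This is an illustration of the published DARE and TIES steps only; the "MRI-guided" activation-based region selection described above is not public, so it is not modeled here, and the function names and hyperparameters are illustrative, not from the Darwin-4B-David release.

```python
import numpy as np

def dare(delta, drop_p, rng):
    """DARE: randomly Drop a fraction of delta entries And REscale survivors."""
    mask = rng.random(delta.shape) >= drop_p
    return delta * mask / (1.0 - drop_p)

def ties_merge(deltas, keep_frac):
    """TIES: TrIm low-magnitude entries, Elect a majority Sign, merge survivors."""
    trimmed = []
    for d in deltas:
        k = max(1, int(d.size * keep_frac))          # keep top-k by magnitude
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack(trimmed)
    elected = np.sign(stacked.sum(axis=0))           # per-parameter majority sign
    agree = np.sign(stacked) == elected              # entries matching elected sign
    num = (stacked * agree).sum(axis=0)
    den = np.maximum(agree.sum(axis=0), 1)           # avoid divide-by-zero
    return num / den

def dare_ties(base, finetuned, drop_p=0.5, keep_frac=0.2, seed=0):
    """Merge several fine-tuned checkpoints back onto a shared base tensor."""
    rng = np.random.default_rng(seed)
    deltas = [dare(ft - base, drop_p, rng) for ft in finetuned]
    return base + ties_merge(deltas, keep_frac)
```

In a real merge this would run tensor-by-tensor over the full state dicts (as tools like mergekit do); the sign-election step is what suppresses the destructive interference the article refers to.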
Panel Reviews
The Builder
Developer Perspective
“45 minutes on a single H100 to beat a 31B parameter model? That's an extraordinary efficiency ratio. MRI-guided merging is a technique I'll be watching closely. If this holds up across more benchmarks, it fundamentally changes how teams should think about building capable small models.”
The Skeptic
Reality Check
“GPQA Diamond is one benchmark. One. Benchmark performance doesn't translate linearly to real-world task performance, especially for a merged model that hasn't been fine-tuned for instruction following or RLHF alignment. Impressive number, but I'd want to see this on coding, reasoning chains, and RAG tasks before getting excited.”
The Futurist
Big Picture
“Model merging is the dark horse of AI efficiency research. If MRI-guided DARE-TIES merging can reliably produce results like this, it suggests we're nowhere near the ceiling for extracting value from existing open-weight models. The future may involve less training and more intelligent composition.”
The Creator
Content & Design
“A capable model in the 4-5B range that can run on a MacBook M-series is exactly what solo creators need for on-device inference. If Darwin-4B-David's performance holds on creative tasks, it's a genuine local creative AI for people without cloud budgets.”
Community Sentiment
“Benchmark validity skepticism vs. impressive efficiency ratio”
“MRI-guided merging technique novelty”
“4.5B beating 31B is the headline — context matters”