Meta Releases Llama 4 Behemoth Weights for Research
Meta has released the full weights of Llama 4 Behemoth, a 288-billion-parameter mixture-of-experts model, under a community license for non-commercial research and evaluation. Weights are available directly on Hugging Face starting today.
Original sourceMeta has made Llama 4 Behemoth's weights publicly available for non-commercial research and evaluation, extending the Llama 4 family's open-weight strategy to its largest model. The release covers the full 288-billion-parameter mixture-of-experts architecture, downloadable through Hugging Face under Meta's community license, which restricts commercial use but permits academic research, benchmarking, and internal evaluation.
Behemoth sits at the top of the Llama 4 lineup, above the Scout and Maverick variants that were made available earlier this year. The MoE architecture activates a subset of parameters per token, meaning inference is more efficient than a dense 288B model — though running it still requires serious hardware. The community license terms follow Meta's established pattern: free for research, commercial use requires a separate agreement, and derivatives must stay within the license terms.
The practical significance of this release is access for evaluation. Researchers and institutions that want to benchmark frontier-scale models against their own datasets, study alignment properties, or probe emergent behaviors now have direct weight access rather than relying on API access alone. For teams building in regulated industries, the ability to run inference on-premise without routing data through a third-party API is itself a meaningful unlock.
Meta has been consistent about releasing weights across the Llama lineage while keeping commercial deployment of the largest models behind a licensing gate. This release continues that pattern — Behemoth joins the research commons without becoming fully open-source in the traditional sense. How broadly the research community engages with a model at this scale, where even download and storage represent a non-trivial infrastructure commitment, remains to be seen.
Panel Takes
The Builder
Developer Perspective
“The primitive here is straightforward: full weights on Hugging Face, download and run. What I actually care about is whether the inference stack is coherent — does transformers support this MoE architecture cleanly, or are you patching together custom kernels before you can run a single forward pass? The 288B MoE topology means you need documentation on which experts activate and what the hardware floor actually looks like in practice, not just the parameter count. If Meta ships this with a working example that gets you to a generation in under 20 lines and names the minimum GPU config, this is genuinely useful. If the README is a license notice and a model card, that's a skip.”
The Skeptic
Reality Check
“Let's be precise about what 'open weights' means here: non-commercial, no derivatives for competing foundation models, and you still need Meta's blessing for commercial deployment. That's not open-source, it's a research preview with a licensing moat. The real question is who actually has the infra to run 288B MoE weights — this is a release for well-funded labs and large university compute clusters, not the independent researcher crowd Meta's blog post implies. What kills this in 12 months is simpler: either Meta ships a commercial API with better latency than anyone can self-host, or a fully permissive competitor at similar scale makes the license restrictions irrelevant.”
The Futurist
Big Picture
“The thesis Meta is betting on: frontier-scale weights in research hands accelerates safety research, alignment probing, and capability evaluation faster than any closed API program could. The dependency is that the academic compute gap doesn't widen to the point where 'weights available' and 'weights runnable' become disconnected concepts for most researchers. The second-order effect that matters most here isn't the research papers — it's that enterprise security and compliance teams now have a credible on-premise story for frontier-scale inference, which starts pulling regulated industries off the OpenAI API dependency. Meta is riding the trend of compute commoditization and betting it reaches 288B-class inference within 18-24 months. That bet is early, but not crazy.”
The Founder
Business & Market
“The business logic for Meta here is distribution moat disguised as generosity: every research institution that trains workflows on Llama 4 Behemoth is a future commercial license customer. The community license isn't altruism — it's a land-and-expand strategy where the expand motion is 'contact sales when you want to deploy this in production.' What I'd stress-test is whether the commercial license terms are competitive with just running GPT-4-class models through an API at current pricing, because the infrastructure cost of self-hosting 288B MoE has to beat the API bill by enough to justify the ops overhead. If Meta prices the commercial tier aggressively, this is a real wedge into enterprise; if they price it like enterprise software, the research release just feeds competitor benchmarks.”