PrismML Exits Stealth With $16.25M and the First Commercial 1-Bit LLM Family
Caltech-founded PrismML emerged from stealth this week with Bonsai, a family of 1-bit LLMs (1.7B, 4B, 8B) and $16.25M in seed funding. The models compress to sign-only (+1/-1) weights, fitting the 8B model into 1.15 GB of RAM, roughly 14x smaller and 8x faster than FP16 equivalents. All three models carry a commercial license on HuggingFace.
Caltech-founded PrismML came out of stealth this week with a $16.25M seed round and the open release of **Bonsai** — claimed to be the first commercially licensed 1-bit LLM family. The release includes three model sizes (1.7B, 4B, and 8B parameters) available on HuggingFace under a commercial license, with an API in preview for teams that want hosted inference.
The technical core of Bonsai is aggressive weight quantization: rather than storing weights as 16-bit floating-point or 4-bit integer values, Bonsai uses a sign-only representation (+1 or -1) with per-group scaling factors to recover numerical range. The result is dramatic: the 8B model compresses to 1.15 GB in memory, achieves 8x faster inference than FP16 equivalents on the same hardware, and consumes roughly 5x less energy per token. On benchmark tasks, the 8B model is competitive with 4-bit quantized models of similar parameter counts — though critics have noted that reasoning and long-context benchmarks show a wider performance gap.
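PrismML has not published its exact quantization recipe, but the description above (sign-only weights plus per-group scales) matches the binarization used in the BitNet literature. Below is a minimal NumPy sketch of that scheme under those assumptions — the function names, group size of 128, and mean-absolute-value scale are illustrative choices, not Bonsai's actual implementation. The comments also check the article's memory arithmetic for the 8B model.

```python
import numpy as np

def quantize_sign_groups(w, group_size=128):
    """Sign-only (1-bit) quantization with per-group scale factors.

    Hypothetical sketch: each weight is stored as +1/-1 (one bit),
    and a per-group scale — here the group's mean absolute value,
    as in BitNet-style binarization — recovers numerical range
    when weights are dequantized for a matmul.
    """
    groups = w.reshape(-1, group_size)
    scales = np.abs(groups).mean(axis=1, keepdims=True)  # one FP scale per group
    signs = np.where(groups >= 0, 1.0, -1.0)             # 1 bit per weight
    return signs, scales

def dequantize(signs, scales):
    # Broadcast each group's scale back over its 128 sign bits.
    return signs * scales

# Back-of-envelope check of the article's numbers for the 8B model:
# 8e9 weights at 1 bit = 1.0 GB, plus one FP16 scale per 128-weight
# group (8e9 / 128 * 2 bytes ≈ 0.125 GB) gives ≈ 1.125 GB — consistent
# with the reported 1.15 GB footprint, and ~14x below FP16's 16 GB.

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
signs, scales = quantize_sign_groups(w)
w_hat = dequantize(signs, scales).reshape(-1)
```

The dequantized weights keep only each group's magnitude and each weight's sign, which is why short, pattern-matching tasks tolerate the scheme better than multi-step reasoning.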
PrismML's founding team includes researchers who worked on binary neural networks at Caltech, and the company had been in stealth for roughly 18 months. The $16.25M seed was led by a mix of hardware-adjacent and enterprise SaaS investors, reflecting the target market: edge AI hardware manufacturers, robotics OEMs, and regulated enterprise environments where sending inference requests to cloud APIs is restricted by policy or connectivity.
The release positions Bonsai squarely against Microsoft's BitNet research lineage, which pioneered 1-bit LLMs academically but never produced a commercially usable artifact. "We wanted to close the gap between 'this is theoretically possible' and 'you can ship a product with this today,'" PrismML's CEO told HPCwire.
The key test for Bonsai will be whether the quality-per-watt trade-offs hold up in production deployments outside the company's curated benchmark suite. Early community testing on r/LocalLLaMA suggests the models perform well for classification, structured output, and short-form generation tasks — the canonical edge AI use cases — but lag meaningfully on instruction-following complexity. For the right use case, the size and speed numbers are genuinely compelling.
Panel Takes
The Builder
Developer Perspective
“Commercial license on HuggingFace is the move that makes this real. Research-only 1-bit models have existed for two years — the problem was always 'I can't ship this in a product.' PrismML closed that gap. I'll be testing the 4B model on ARM edge hardware this week.”
The Skeptic
Reality Check
“The benchmark suite is narrow and the reasoning gap is real. Before betting infrastructure on Bonsai, teams should benchmark their specific task distribution — short classification tasks will look great, multi-step reasoning will disappoint. The $16.25M seed buys time but not solved benchmarks.”
The Futurist
Big Picture
“The cloud LLM paradigm assumes you always have connectivity and don't care about energy cost. Billions of edge devices break both assumptions. Bonsai is the first serious commercial stake in the ground for post-cloud AI inference — even if v1 isn't perfect, the trajectory matters.”