OpenAI's First Custom Chip 'Jalapeño' Is Built by Broadcom

OpenAI has unveiled Jalapeño, its first custom chip, designed in partnership with Broadcom and optimized specifically for inference workloads. The move signals OpenAI's intent to reduce its dependence on Nvidia and control more of its compute stack.


OpenAI announced Jalapeño, its first purpose-built AI chip, developed in collaboration with semiconductor giant Broadcom. The chip is designed from the ground up for inference — the process of running trained models to generate outputs — rather than training, where Nvidia's H100 and H200 GPUs have long dominated. The decision to target inference specifically reflects where OpenAI's actual cost structure lives: once models are trained, serving billions of daily queries is the continuous and compounding expense.

The Broadcom partnership gives OpenAI access to advanced chip design expertise and an established path to fabrication without needing to build a full semiconductor team from scratch. Broadcom has quietly become a major force in custom AI silicon, having also worked with Google on its TPU line. Jalapeño is reportedly manufactured on TSMC's latest process node, though OpenAI has not published detailed specs or benchmark methodology.

Custom silicon for inference is a well-worn strategy at hyperscale — Google has TPUs, Amazon has Inferentia, and Meta has MTIA. What's notable here is that OpenAI, long dependent on Microsoft's Azure infrastructure and Nvidia supply chains, is now pursuing vertical integration at the chip level. This is both a cost play and a strategic hedge against GPU supply constraints and pricing leverage.

OpenAI has not announced a timeline for Jalapeño to power customer-facing products, nor has it clarified what percentage of its inference workload the chip is expected to handle. The announcement positions Jalapeño as a long-term infrastructure investment rather than an immediate product shift.

Panel Takes

The Builder

Developer Perspective

“Custom inference silicon is a legitimate performance win — but only if OpenAI publishes real numbers. 'Designed specifically for inference' is exactly the kind of claim that needs a methodology, a token-per-second comparison against H100s, and a power-efficiency curve — none of which are in this announcement. The performance win that comes from thinking about the problem at the silicon level is real; the performance win that exists only in a press release is marketing copy with a chip render at the bottom.”
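
As a rough illustration of the comparison the Builder is asking for, the sketch below computes two normalized figures from a single measured inference run: throughput in tokens per second and energy efficiency in tokens per joule. The function names and the sample measurements are hypothetical and chosen only for the example; they are not figures from OpenAI, Broadcom, or Nvidia.

```python
def tokens_per_second(tokens_generated: int, elapsed_seconds: float) -> float:
    """Throughput of one measured run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens_generated / elapsed_seconds


def tokens_per_joule(tokens_generated: int, elapsed_seconds: float,
                     average_power_watts: float) -> float:
    """Energy efficiency: tokens produced per joule consumed.

    Energy in joules is average power (watts) multiplied by time (seconds).
    """
    if elapsed_seconds <= 0 or average_power_watts <= 0:
        raise ValueError("time and power must be positive")
    energy_joules = average_power_watts * elapsed_seconds
    return tokens_generated / energy_joules


# Hypothetical measurement: 12,000 tokens in 10 seconds at an average of 600 W.
throughput = tokens_per_second(12_000, 10.0)       # 1200.0 tokens/s
efficiency = tokens_per_joule(12_000, 10.0, 600.0)  # 2.0 tokens/J
```

A fair comparison would hold the model, batch size, sequence lengths, and numeric precision constant across both chips, which is the methodology the announcement does not supply.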

The Skeptic

Reality Check

“Google has been doing this with TPUs since 2016, Amazon ships Inferentia, Meta has MTIA — and none of them have fully displaced Nvidia for the workloads that matter most. The specific scenario where Jalapeño breaks is obvious: any model architecture change that wasn't anticipated during chip design forces a fallback to Nvidia anyway, and OpenAI changes architectures constantly. What kills this in 18 months isn't a competitor — it's OpenAI's own research roadmap outpacing the chip's design assumptions.”

The Futurist

Big Picture

“The thesis here is falsifiable: OpenAI believes that inference, not training, is the compute surface that compounds — and that owning that layer at the silicon level creates cost and latency advantages no software optimization can match. The second-order effect that nobody's talking about is leverage: a company that controls its own inference silicon can price API calls in ways that are structurally impossible for competitors running on merchant Nvidia hardware. If Jalapeño works, the real winner isn't OpenAI's margins — it's OpenAI's ability to undercut every inference API on the market and survive doing it.”

The Founder

Business & Market

“This is the right move for exactly one reason: inference is the cost of goods sold for every OpenAI product, and right now Nvidia sets the price. Custom silicon is how you turn a variable, supplier-controlled cost into a fixed, amortizable one — that's not a technology bet, that's a unit economics decision. The risk isn't technical execution; it's that Jalapeño takes 3-5 years to reach the utilization rates where the amortization math actually works, and a lot can change in OpenAI's competitive position by then.”
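
The amortization argument can be made concrete with a small sketch. It assumes, for simplicity, that an owned chip and a rented GPU deliver the same throughput, that the owned chip's up-front cost is spread evenly over its service life, and that operating cost is paid only for busy hours. Every number below is hypothetical and exists only to show the shape of the calculation.

```python
def owned_cost_per_busy_hour(upfront_cost_usd: float, lifetime_hours: float,
                             utilization: float,
                             operating_cost_per_hour_usd: float) -> float:
    """Effective hourly cost of owned silicon, per hour of useful work.

    The amortized up-front cost is divided by utilization because idle
    hours still consume the chip's service life.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    amortized_per_hour = upfront_cost_usd / lifetime_hours
    return amortized_per_hour / utilization + operating_cost_per_hour_usd


def break_even_utilization(upfront_cost_usd: float, lifetime_hours: float,
                           operating_cost_per_hour_usd: float,
                           rented_rate_per_hour_usd: float) -> float:
    """Utilization at which owned and rented hourly costs are equal."""
    margin = rented_rate_per_hour_usd - operating_cost_per_hour_usd
    if margin <= 0:
        raise ValueError("rented rate must exceed owned operating cost")
    return (upfront_cost_usd / lifetime_hours) / margin


# Hypothetical inputs: $35,040 per chip (including a share of design cost),
# a four-year life (35,040 hours), $0.50/hour to operate, $2.50/hour to rent.
LIFETIME_HOURS = 4 * 365 * 24
threshold = break_even_utilization(35_040, LIFETIME_HOURS, 0.50, 2.50)  # 0.5
```

Under these made-up inputs the owned chip only beats renting once it is busy more than half the time, which is why the time needed to reach high utilization matters so much to the Founder's argument.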
