OpenAI · Launch · 2026-04-15

OpenAI Releases GPT-5 Mini Preview — Outperforms GPT-4o on Coding at 3x Lower Cost

OpenAI has released a preview of GPT-5 Mini, a smaller and cheaper variant of GPT-5 that beats GPT-4o on SWE-bench coding evaluations while cutting API costs by roughly 3x.

OpenAI today released a preview of GPT-5 Mini, a distilled model designed to deliver GPT-5-class reasoning at a fraction of the cost. In early benchmarks, GPT-5 Mini outperforms GPT-4o on SWE-bench Verified (the coding agent benchmark) with a score of 71.4% versus GPT-4o's 68.2%, while pricing drops to approximately $0.40/M input tokens and $1.60/M output tokens — roughly 3x cheaper than GPT-4o.
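The stated rates make per-request costs easy to estimate. A quick sketch (the request size of 2,000 input / 500 output tokens is an illustrative assumption, and the GPT-4o figure is derived from the article's "roughly 3x" claim rather than a published price):

```python
# Rough per-request cost at the stated GPT-5 Mini preview rates.
# The 2,000-input / 500-output request size is an illustrative assumption.
INPUT_RATE = 0.40 / 1_000_000   # dollars per input token
OUTPUT_RATE = 1.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the preview rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

mini = request_cost(2_000, 500)   # GPT-5 Mini preview
gpt4o = mini * 3                  # approximated via the article's ~3x factor
print(f"GPT-5 Mini: ${mini:.4f}  GPT-4o (approx): ${gpt4o:.4f}")
# → GPT-5 Mini: $0.0016  GPT-4o (approx): $0.0048
```

At these rates a typical coding-assistant request costs a fraction of a cent, which is what makes the high-volume use cases discussed below plausible.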

The model is available today in preview via API and ChatGPT Plus. It supports the same 128k context window as GPT-4o and includes full function calling, structured outputs, and vision capabilities. Latency is approximately 40% lower than GPT-4o on equivalent tasks.
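Because the capability set matches GPT-4o, existing chat-completions code should carry over with only a model-id swap. A minimal sketch of a function-calling request payload follows; the model id `gpt-5-mini-preview` and the `run_tests` tool are assumed placeholders, not confirmed identifiers:

```python
# Builds a chat-completions request payload with one tool attached,
# following the OpenAI function-calling schema. The model id and the
# run_tests tool are hypothetical placeholders for illustration.
def build_request(user_prompt: str) -> dict:
    return {
        "model": "gpt-5-mini-preview",  # assumed placeholder id
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "run_tests",  # hypothetical coding-agent tool
                "description": "Run the project's test suite and return results.",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }],
    }

payload = build_request("Fix the failing test in the parser module")
# payload would then be passed to client.chat.completions.create(**payload)
```

The payload shape is the same one GPT-4o accepts today, which is the point: migration cost is limited to the model string.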

For developers, the economics flip meaningfully. Tasks that were previously only cost-effective with Claude Haiku or Gemini Flash now have a competitive OpenAI option. Agentic workflows with thousands of LLM calls — previously reserved for batch jobs — become viable for real-time applications. Early testers report strong performance on code generation, summarization, and classification tasks, with degradation mainly on complex multi-step reasoning where the full GPT-5 still leads.
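To make the "thousands of LLM calls" point concrete, here is a back-of-envelope estimate at the stated preview rates (the call count and per-call token sizes are illustrative assumptions):

```python
# Estimated total cost of a multi-call agentic workflow at the stated
# GPT-5 Mini preview rates. Call count and per-call token counts are
# illustrative assumptions, not figures from the release.
CALLS = 5_000
INPUT_TOKENS_PER_CALL = 1_500
OUTPUT_TOKENS_PER_CALL = 300

cost = CALLS * (INPUT_TOKENS_PER_CALL * 0.40
                + OUTPUT_TOKENS_PER_CALL * 1.60) / 1_000_000
print(f"~${cost:.2f} for {CALLS} calls")
# → ~$5.40 for 5000 calls
```

A few dollars for thousands of calls is the regime where real-time agentic pipelines, not just batch jobs, become defensible on cost.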

The release continues OpenAI's pattern of shipping mini variants of its flagship models to capture the high-volume, latency-sensitive developer market that Anthropic's Claude Haiku and Google's Gemini Flash have dominated in recent months.

Panel Takes

The Builder

Developer Perspective

A coding model that beats GPT-4o at 3x lower cost is a genuine workflow changer. I'm immediately evaluating this for my agentic pipelines where token costs are the main scaling constraint. If the quality holds under production load, this reshuffles the small model pecking order.

The Skeptic

Reality Check

OpenAI 'mini' previews have a history of benchmark performance that doesn't fully survive contact with real-world edge cases. The SWE-bench score is promising but the benchmark has known saturation issues — wait for independent evals and production reports before reallocating your inference budget.

The Futurist

Big Picture

The commoditization of capable coding models is accelerating. When a 'mini' model outperforms last year's flagship, AI-assisted development stops being a premium feature and becomes the baseline for any engineering team. The abstraction layer between human intent and working code just got thinner again.