Claude 4 Opus Arrives with 1M Token Context and Extended Thinking
Anthropic has released Claude 4 Opus, featuring a 1 million token context window and an upgraded extended thinking mode designed for sustained multi-step reasoning. The release also includes a new low-latency API tier targeting enterprise deployments.
Original sourceAnthropic's Claude 4 Opus represents the company's most capable model release to date, pairing a 1 million token context window with an overhauled extended thinking mode. The extended thinking upgrade is designed to let the model spend more compute on complex, multi-step problems before producing a response — a meaningful distinction from simply increasing context length, as it targets reasoning depth rather than just information retention.
The 1 million token context window puts Opus in the same league as Google's Gemini 1.5 Pro on raw context capacity, though context length and effective utilization are different problems. Anthropic has historically been cautious about overpromising retrieval accuracy at extreme context lengths, and it remains to be seen how Opus performs at the practical edges of that window on real workloads rather than synthetic benchmarks.
The new enterprise API tier is the less-discussed but potentially more commercially significant piece of this release. Lower latency access has been a consistent friction point for teams trying to run Opus-class models in production pipelines where response time matters. Anthropic is positioning this tier as the path for organizations that need both the top-end capability and the throughput to make it usable in real-time workflows.
For developers already in the Claude ecosystem, the extended thinking mode surfaces as a configurable parameter in the API rather than a separate model endpoint — a cleaner primitive than shipping it as a distinct product. Pricing for the new API tier has not been fully detailed at launch, which will be a key factor in whether enterprise teams treat this as a serious upgrade or a features announcement ahead of contracts.
Panel Takes
The Builder
Developer Perspective
“Extended thinking as a configurable API parameter rather than a separate model endpoint is the right call — that's a composable primitive, not a platform you have to opt into wholesale. The 1M context window matters less to me than whether the API surface for controlling thinking budget is clean and predictable, which I can't verify until I see the docs and run the latency numbers myself. I'll ship a verdict once I've hit the new tier with a real codebase, not a demo prompt.”
The Skeptic
Reality Check
“The 1M token context claim needs a retrieval accuracy stress test before it means anything — Google shipped 1M context in Gemini 1.5 and the drop-off in effective recall past a few hundred thousand tokens was real and documented. Extended thinking is genuinely interesting if the compute-to-quality curve is favorable, but Anthropic hasn't published methodology on when it helps versus when it's just slower. The enterprise API tier is where the real question lives: if the pricing doesn't survive contact with high-volume users, this is a flagship announcement that converts to nothing.”
The Futurist
Big Picture
“The thesis here is that reasoning depth and context breadth are converging into a single primitive — and that thesis is plausible and well-timed. If extended thinking can hold coherent state across a million tokens while running multi-step inference, you're not looking at a better chatbot, you're looking at the substrate for autonomous workflows that can ingest an entire codebase, a legal corpus, or a year of company communications as a single working context. The second-order effect is that this shifts the bottleneck from 'what can the model hold' to 'what can the human verify,' which is a fundamentally different product design problem for every tool built on top of it.”
The Founder
Business & Market
“The enterprise API tier is the only part of this announcement that moves Anthropic's revenue line, and the fact that pricing isn't fully detailed at launch is a yellow flag — enterprise buyers want to model total cost before they commit, not after. The moat here is real but narrow: Anthropic has genuine differentiation on safety benchmarks and model behavior, but a low-latency API tier is a capability OpenAI and Google can match in a quarter if this gets traction. The window to lock in enterprise contracts on the back of this release is short, and converting a model launch into multi-year deals requires a sales motion that doesn't show up in a press announcement.”