GitHub / Hacker News · Controversy · 2026-04-13

Claude Code's $200/Month Pro Max Plan Burns Out in 90 Minutes — And It's an Anthropic Bug

A detailed bug report from a Pro Max subscriber revealed that Claude Code's 1M context window can exhaust the entire monthly quota in as little as 90 minutes due to expensive cache misses — not actual usage. The issue garnered 554 points on Hacker News and a response from Anthropic's Boris Cherny confirming the problem and proposing a default context window reduction to 400k.


A GitHub issue filed against `anthropics/claude-code` has become one of the most-discussed developer grievances in the AI tools space this week. The reporter, Joey Chen, documented how his Claude Code Pro Max 5x plan — which costs $200+/month — was completely exhausted in 1.5 hours of what he described as "moderate usage." The issue now has 96 upvotes on GitHub and drove 554 points and 506 comments on Hacker News.

The technical root cause is subtle and arguably worse than a simple billing bug. When Claude Code uses a 1M-token context window and a session goes stale (more than 1 hour, which is Anthropic's current cache TTL), the entire context must be re-created from scratch as an expensive cache miss. Because the cache miss costs full tokens at full rate, a single auto-compact or session resume can generate a cache creation spike of ~966k tokens — hitting the quota ceiling before any real work begins.
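The quota arithmetic behind that spike can be sketched with a back-of-envelope comparison. The multipliers below are illustrative assumptions typical of prompt-caching price sheets (cache writes billed at a premium over base input, cache reads at a steep discount), not Anthropic's actual rates; the point is the order-of-magnitude gap between a warm resume and a cold one:

```python
# Back-of-envelope: warm cache read vs. cold cache miss at 1M-context scale.
# Multipliers are illustrative assumptions, not Anthropic's price sheet.

CONTEXT_TOKENS = 966_000          # cache-creation spike cited in the bug report

# Assumed cost multipliers relative to the base input-token rate:
CACHE_WRITE_MULT = 1.25           # typical premium for writing a prompt cache
CACHE_READ_MULT = 0.10            # typical discount for reading a warm cache

def quota_cost(tokens: int, multiplier: float) -> float:
    """Tokens weighted by the multiplier they count against quota at."""
    return tokens * multiplier

warm = quota_cost(CONTEXT_TOKENS, CACHE_READ_MULT)    # resume within cache TTL
cold = quota_cost(CONTEXT_TOKENS, CACHE_WRITE_MULT)   # stale session, full re-create

print(f"warm resume: {warm:,.0f} weighted tokens")    # 96,600
print(f"cold resume: {cold:,.0f} weighted tokens")    # 1,207,500
print(f"cold/warm ratio: {cold / warm:.1f}x")         # 12.5x
```

Under these assumed multipliers, a single stale-session resume costs about 12.5x what the same context would cost warm, which is how one auto-compact can dominate a quota window before any productive work happens.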

Community investigator Chris Nighswonger ran a telemetry analysis comparing three hypotheses about how cache_read tokens count against quota. The data showed that cache reads themselves don't significantly count — the issue is purely the cost of cache *misses* when operating at 1M context scale. Background sessions left open in other terminals compounded the problem, with one user reporting that idle sessions consumed 78% of their quota after a reset without any active prompting.

Anthropic's Boris Cherny, responding in the GitHub thread, confirmed the diagnosis and outlined planned fixes: defaulting the context window to 400k tokens (instead of 1M), adding UX nudges to run `/clear` before resuming stale sessions, and improving staleness detection. The proposed default reduction is significant — it represents a quiet rollback of one of Pro Max's most-marketed features.
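The staleness-detection piece of those fixes is conceptually simple. A hypothetical sketch of the check such a UX nudge implies, assuming the 1-hour cache TTL described in the report (the function names here are illustrative, not Claude Code internals):

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(hours=1)  # cache TTL cited in the bug report

def is_stale(last_activity: datetime, now: datetime) -> bool:
    """A session idle longer than the cache TTL will miss the prompt
    cache on resume, triggering a full context re-create."""
    return now - last_activity > CACHE_TTL

def resume_hint(last_activity: datetime, now: datetime) -> str:
    """Hypothetical nudge: suggest /clear before resuming a stale session."""
    if is_stale(last_activity, now):
        return "Cache expired; run /clear to avoid a full context re-create."
    return "Cache still warm; resuming is cheap."
```

The design question Anthropic faces is where this check runs: nudging at resume time is easy, but the idle-background-session reports suggest the client also needs to avoid re-creating context for sessions the user never returns to.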

The episode has reignited debate about AI subscription pricing transparency. Several users noted that Anthropic's quota system counts tokens in ways that don't match the marketing — what reads as "5x usage" in the plan name doesn't translate to "5x more work you can do." For a subscription tier aimed at professional developers, the gap between perceived and actual value has been a sharp community flashpoint.

Panel Takes

The Builder

Developer Perspective

This is a serious trust issue. Anthropic is charging $200+/month for a plan that gets unexpectedly exhausted by background sessions and cache misses — not actual productive usage. The workaround (run /clear manually, don't use 1M context) negates what you're paying for. Boris Cherny's response is good, but the fix needed to ship before the plan went on sale.

The Skeptic

Reality Check

The community analysis here is genuinely impressive — crowdsourced telemetry that identified a non-obvious root cause faster than any support ticket would have. But 96 upvotes on a niche GitHub issue means this affects a small slice of power users. Most Claude Code users aren't running multi-terminal sessions with 1M context windows and won't hit this at all.

The Futurist

Big Picture

This is the inevitable friction of selling large-context AI subscriptions before the tooling for managing context costs is mature. The gap between 'tokens in your context window' and 'tokens that count against quota' is going to confuse developers for years until the industry standardizes on transparent, intuitive pricing units. Anthropic is getting there, but the growing pains are real.