Anthropic Quietly Cut Prompt Cache TTL — Claude Code Users Are Paying More With No Warning
Anthropic reduced the time-to-live (TTL) for prompt cache entries in early March with no public announcement, effectively increasing inference costs for heavy Claude Code users who built their cost models around the previous cache duration. The issue has drawn 200+ comments and landed on Hacker News' front page today.
Developers are waking up to higher Claude API bills this week after discovering that Anthropic silently reduced the TTL (time-to-live) for prompt cache entries: the window during which a previously cached prompt prefix remains valid and its reuse is charged at the discounted cache-read rate rather than the full input-token rate.
The change, which appears to have taken effect around March 6th based on billing timestamps shared in the GitHub issue thread, was not announced in Anthropic's changelog, release notes, or status page. Developers who built Claude Code workflows, agent orchestration systems, and AI-heavy products around the previous cache behavior found their costs increased significantly — in some reported cases, doubling for heavy long-context workloads.
Prompt caching is one of Anthropic's key cost-reduction features for developers working with large context windows. The discounted rate (roughly 10% of standard input-token pricing) applies when the same prompt prefix is reused within the TTL window. Reducing the TTL means more requests fall outside the cache window and are billed at the full rate. For Claude Code users running multi-hour coding sessions, the effective cost per session increased substantially.
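The arithmetic above can be sketched as a toy cost model. All numbers here are illustrative placeholders, not Anthropic's actual rates, and the read/write multipliers are assumptions for the sketch; check the current pricing page for real figures.

```python
# Toy cost model for prompt caching. Prices and multipliers are
# illustrative placeholders, not Anthropic's published rates.
INPUT_PRICE_PER_MTOK = 3.00    # hypothetical $ per million input tokens
CACHE_READ_MULTIPLIER = 0.10   # cached prefix billed at ~10% of the input rate
CACHE_WRITE_MULTIPLIER = 1.25  # assumed premium for (re)writing the cache

def request_cost(prefix_tokens: int, fresh_tokens: int, cache_hit: bool) -> float:
    """Cost of one request: a large reusable prefix plus fresh tokens."""
    per_tok = INPUT_PRICE_PER_MTOK / 1_000_000
    if cache_hit:
        # Within the TTL: the prefix is billed at the cache-read discount.
        prefix_cost = prefix_tokens * per_tok * CACHE_READ_MULTIPLIER
    else:
        # TTL expired: pay to re-cache the whole prefix at the write premium.
        prefix_cost = prefix_tokens * per_tok * CACHE_WRITE_MULTIPLIER
    return prefix_cost + fresh_tokens * per_tok

# A 100k-token cached context with 2k fresh tokens per turn:
hit = request_cost(100_000, 2_000, cache_hit=True)
miss = request_cost(100_000, 2_000, cache_hit=False)
print(f"hit: ${hit:.4f}  miss: ${miss:.4f}  ratio: {miss / hit:.1f}x")
```

Under these placeholder numbers, a request whose prefix has fallen out of the cache costs roughly an order of magnitude more than a cache hit, which is why a shorter TTL lands so hard on long, intermittent agent sessions.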
The GitHub issue thread has attracted more than 200 comments: a mix of cost-impact reports, requests for official clarification, and broader discussion of Anthropic's communication practices with developers. As of this writing, Anthropic has not posted an official explanation, timeline, or migration guidance.
The incident touches on a recurring tension in the API business: providers have the right to adjust infrastructure parameters, but developers building on those parameters need predictability to price and architect their products. Anthropic's silence on the change, whether intentional or an oversight, has damaged trust among a vocal segment of its most technically sophisticated users.
Panel Takes
The Builder
Developer Perspective
“This is a serious breach of developer trust. Infrastructure pricing changes that break cost models should come with at minimum 30 days notice and a clear changelog entry — not be discovered after the bill arrives. Anthropic's developer relations team needs to address this directly, not let it fester in a GitHub issue.”
The Skeptic
Reality Check
“Providers adjust infrastructure parameters all the time, and the Terms of Service almost universally reserve the right to do so. But 'legally permissible' and 'developer-friendly' are different standards — Anthropic has built its brand on being the trustworthy AI lab, and silent pricing changes that increase costs undermine that narrative at exactly the moment competitors are eager to attract dissatisfied developers.”
The Futurist
Big Picture
“This incident is a preview of a coming governance challenge for AI APIs: as agentic workflows become more complex and expensive, the stakes of undocumented infrastructure changes rise dramatically. The industry will likely need formal SLAs for cache behavior, rate limits, and model availability — similar to what cloud providers offer for compute. Anthropic is learning this lesson the hard way.”