Prompt caching
Short answer
Prompt caching lets an API provider reuse a previously seen input prefix at a discounted rate instead of reprocessing every token. GPT-5.5 cached input is $0.50 per 1M vs $5.00 fresh — a 90% saving.
Prompt caching is the mechanism; cached input is the price. Most major providers — OpenAI, Anthropic, Google — implement prompt caching, but they differ in how they expose it on the price page.
- OpenAI exposes a single "cached input" tier. The discount is implicit (the price is just lower).
- Anthropic separates cache read from cache write, and offers 5-minute and 1-hour write windows. Writes are billed once when the prefix is first cached; reads are billed each time a request hits the cache.
- Google charges cached input plus a separate hourly storage fee per million tokens stored.
For most workloads the headline number to compare is the cached-input rate per 1M tokens, because that's what dominates the bill.