Anthropic recently rolled out prompt caching, a quietly released feature that can cut API costs by up to 90% and reduce latency by 50%, yet most AI builders don't use it.

Prompt caching lets you store and reuse context within your prompts, so you can include detailed instructions, examples, and large reference documents without incurring repeated costs.

Implementation is straightforward: mark portions of your prompt with {"cache_control": {"type": "ephemeral"}} and Anthropic will store this content server-side for subsequent calls. The cache has a minimum lifetime of 5 minutes, refreshed each time the cached content is accessed.

This is particularly valuable for conversational agents with extensive instructions, coding assistants working with large codebases, document processing scenarios, and agentic tool use cases.

You can now include those 20+ examples you wanted in your system prompt without the cost and latency penalty. You can load entire books or codebases into context while paying just 10% of the regular price after the first call.

API docs: https://lnkd.in/gngJpgNX

Join thousands of world-class researchers and engineers from Google, Stanford, OpenAI, and Meta staying ahead on AI: http://aitidbits.ai
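For reference, here is a minimal sketch of what marking a prompt block for caching looks like with the Anthropic Python SDK. The model name, placeholder document text, and file-free setup are illustrative assumptions, not values from the post, and depending on your SDK version a prompt-caching beta header or namespace may still be required.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative stand-in for a large, stable reference document. In practice the
# block must exceed the model's minimum cacheable length for caching to apply.
reference_document = "<large reference document or instruction set goes here>"

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model name; use any cache-capable model
    max_tokens=1024,
    system=[
        # Small, frequently changing instructions: left uncached.
        {
            "type": "text",
            "text": "You answer questions about the attached reference document.",
        },
        # Large, stable context: marked ephemeral so subsequent calls reuse the cache.
        {
            "type": "text",
            "text": reference_document,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize the key points of the document."}],
)

# The usage block reports cache activity: tokens written on the first call,
# tokens read back at the discounted rate on later calls within the cache lifetime.
print(response.usage.cache_creation_input_tokens, response.usage.cache_read_input_tokens)
print(response.content[0].text)
```

On the first call the cached block is written (billed slightly above the normal input rate); calls made within the cache lifetime then read it back at the discounted rate the post mentions.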
Anthropic already had prompt caching; did the method change from August's release?
Impressive
We’ve actually started using Anthropic’s prompt caching in one of our internal agentic systems that processes long-form documents and performs context-aware actions. The cost savings are significant—especially when working with large instruction sets and reusable reference materials. It’s one of those low-effort, high-impact features that every serious AI builder should be taking advantage of. Hats off to the team for rolling this out quietly but effectively!
👀
We already do this, programmed into Excel. It really gives more control and cost savings by deciding what actually needs to be sent. Data stored in Excel cells makes it very easy to organize and recall structurally.