Anthropic recently rolled out prompt caching, a quietly released feature that can cut API costs by up to 90% and reduce latency by 50%, yet most AI builders don't use it.

Prompt caching lets you store and reuse context within your prompts, so you can include detailed instructions, examples, and large reference documents without incurring repeated costs.

Implementation is straightforward: mark portions of your prompt with {"cache_control": {"type": "ephemeral"}} and Anthropic will store this content server-side for subsequent calls. The cache has a minimum lifetime of 5 minutes, refreshed each time the cached content is accessed.

This is particularly valuable for conversational agents with extensive instructions, coding assistants working with large codebases, document processing scenarios, and agentic tool use cases.

You can now include those 20+ examples you wanted in your system prompt without the cost and latency penalty. You can load entire books or codebases into context while paying just 10% of the regular price after the first call.

API docs: https://lnkd.in/gngJpgNX

Join thousands of world-class researchers and engineers from Google, Stanford, OpenAI, and Meta staying ahead on AI: http://aitidbits.ai
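For reference, here is a minimal sketch of what marking a prompt block for caching looks like with the Anthropic Python SDK. The model name, placeholder document text, and file-free setup are illustrative assumptions, not values from the post, and depending on your SDK version a prompt-caching beta header or namespace may still be required.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative stand-in for a large, stable reference document. In practice the
# block must exceed the model's minimum cacheable length for caching to apply.
reference_document = "<large reference document or instruction set goes here>"

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model name; use any cache-capable model
    max_tokens=1024,
    system=[
        # Small, frequently changing instructions: left uncached.
        {
            "type": "text",
            "text": "You answer questions about the attached reference document.",
        },
        # Large, stable context: marked ephemeral so subsequent calls reuse the cache.
        {
            "type": "text",
            "text": reference_document,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize the key points of the document."}],
)

# The usage block reports cache activity: tokens written on the first call,
# tokens read back at the discounted rate on later calls within the cache lifetime.
print(response.usage.cache_creation_input_tokens, response.usage.cache_read_input_tokens)
print(response.content[0].text)
```

On the first call the cached block is written (billed slightly above the normal input rate); calls made within the cache lifetime then read it back at the discounted rate the post mentions.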
Anthropic already had prompt caching; did the method change from August's release?
Impressive
We’ve actually started using Anthropic’s prompt caching in one of our internal agentic systems that processes long-form documents and performs context-aware actions. The cost savings are significant—especially when working with large instruction sets and reusable reference materials. It’s one of those low-effort, high-impact features that every serious AI builder should be taking advantage of. Hats off to the team for rolling this out quietly but effectively!
👀
We already do this, programmed into Excel. It really gives more control and cost savings by deciding what actually needs to be sent. Data stored in Excel cells makes it very easy to organize and recall structurally.