Every Thursday, our developer relations team at Redis runs a session called DevRel Professional Development. We study something deeply. We debate it. Then share our findings.

This week, Guy Royse, Raphael De Lio, Bhavana Anant Giri, Ashwin Hariharan, Samuel Agbede, and yours truly focused on the course: Agent Skills with Anthropic (by Elie Schoppik of Anthropic). This is a course from DeepLearning.AI. The link to the course is in the comments [1].

What are agent skills? Well, in short:

🔥 Skills are ways for you to define the business logic of agentic systems. 🔥

➡️ Here's our take.
◼️ Skills are a great packaging format.
◼️ They encode procedural knowledge.
◼️ They make workflows reusable.
◼️ They allow scoped references.
◼️ They can include executable scripts.

On that last point, Guy Royse built one skill from scratch and demoed it to us. The link to his repository is in the comments [2]. He shared that you are not restricted to #Python, as the course has shown. You can write your scripts in #JavaScript, too. (Raphael De Lio and I were itching to try #Java as well 😅)

What we liked most about agent skills was their clever structure:

Metadata → Skill definition → [References, Scripts]

This prevents you from flooding the context window with everything upfront. Agent skills load what's needed, when it's needed, starting with the metadata. The course calls this progressive disclosure.

📝 It feels less like prompt engineering.
🧑🏻‍💻 More like capability engineering.

These were some questions raised during the meeting:
◼️ If I install 10 skills into my coding agent, how do I update them all?
◼️ Is there versioning? Can an agent skill have multiple live versions?
◼️ Is there a central registry? Like Tessl, or Skills.sh from Vercel?

As agent skills grow in popularity, lifecycle management will become critical. We are still very early here. But this will matter. At scale, teams must avoid agents referring to redundant or outdated skills.
Other debates we had:

◼️ What's the right granularity? Too specific → less reusable. Too generic → vague and fragile.
◼️ Should skills pull data directly from data systems, or should they rely on MCP servers for this? What are the trade-offs of more opinionated instructions versus the open-ended queries that MCP allows?
◼️ If agents can dynamically create or update skills… who governs that?

One thing we agreed on: teams that treat skills like production code (versioned, tested, and owned) will build more reliable agent systems.

👉🏻 We wrapped up our session with this: let's create a public repository with the skills we develop. Skills that govern the way we build our content, demos, and workshops. It's a great way to apply skills in our own domain and learn the common friction points that developers may also struggle with.

What's your biggest open question: granularity, governance, or lifecycle? Would you define your business rules as agent skills given the chance?
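For context, a skill in this format is a folder whose SKILL.md opens with lightweight metadata; only that metadata is loaded upfront, and the body, references, and scripts are pulled in on demand. A minimal sketch (the skill name, description, and file paths here are hypothetical, not from the course):

```markdown
---
name: blog-post-reviewer
description: Reviews draft blog posts against our DevRel style guide.
  Use when the user asks for a content review before publishing.
---

# Blog Post Reviewer

1. Read the draft the user provides.
2. Check it against the rules in references/style-guide.md.
3. If link checking is requested, run scripts/check_links.py.
```

This layered layout is exactly what enables progressive disclosure: the agent sees only the name and description until the skill is actually invoked.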
Redis DevRel Discusses Agent Skills with Anthropic
More Relevant Posts
-
I'm damn sure we've all faced that "needle in a haystack" moment in production. You get a generic error code, a failing endpoint, and a log that tells you absolutely nothing about the "why." Without the actual payload, your RCA is dead in the water.

In a high-traffic system, you can't log every single request body—it would crush your disk space and your performance. So, when a recent production issue hit, we were flying blind. No unique IDs, just a sea of failing requests.

But here's where a balanced system design saved us. We have a pattern where failing requests are stored in MongoDB to be retried later by a scheduler. It's our safety net. We knew the payloads were in there, but there was another hurdle: thousands of records from different days and different error types.

The Savior: MongoDB Aggregation 🛠️

Yeah, you read that right—manually clicking through documents wasn't an option. We used the prowess of MongoDB aggregations to slice through the noise. Instead of a basic query, we built a pipeline:

✅ $match: Filtered by the specific errorType and the exact timestamp window.
✅ $project: Grabbed only the requestPayload so we didn't drown in metadata.
✅ $group: Identified patterns—was it the same user ID or the same product type?

Why this matters: trust me, your "retry" database isn't just for retries; it's a goldmine for debugging.

1. Audit Trail: It keeps the evidence when the logs fail you.
2. Precision: Aggregation finds the exact "bad" payload in seconds, not hours.
3. Prowess: It turns a clunky manual investigation into a data-driven RCA.

If you aren't storing your failed payloads in a queryable way, you're just one prod issue away from a massive headache.

#MongoDB #SystemDesign #BackendEngineering #DevOps #DataDriven
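A pipeline along those lines might look like the sketch below. The field names (errorType, createdAt, requestPayload, userId) and the error value are illustrative assumptions, not the author's actual schema; adapt them to your collection.

```python
from datetime import datetime, timezone

# Hypothetical incident window and error type -- adjust to the actual RCA.
window_start = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
window_end = datetime(2024, 1, 15, 11, 0, tzinfo=timezone.utc)

pipeline = [
    # 1. $match: keep only the failing requests inside the incident window.
    {"$match": {
        "errorType": "PAYMENT_GATEWAY_TIMEOUT",
        "createdAt": {"$gte": window_start, "$lt": window_end},
    }},
    # 2. $project: drop everything except the payload and the grouping field.
    {"$project": {"_id": 0, "requestPayload": 1, "userId": 1}},
    # 3. $group: look for patterns -- how many failures per user, with a sample payload.
    {"$group": {"_id": "$userId",
                "count": {"$sum": 1},
                "samplePayload": {"$first": "$requestPayload"}}},
    {"$sort": {"count": -1}},
]

# With pymongo this would run as: db.failed_requests.aggregate(pipeline)
```

The same list of stage dicts works unchanged in mongosh or Compass, which makes it easy to iterate on during an incident.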
-
The most important lessons learned from writing an MCP server for PostgreSQL? 🐘

> If I were starting this project from scratch, I'd design for token efficiency from day one rather than re-engineering it after the fact. Our initial prototype returned JSON with no pagination and no filtering, and whilst it made for impressive demos on small databases, it fell apart the moment we pointed it at anything resembling a production dataset.

> I'd also invest in better observability earlier. We added token estimation logging that records the approximate token count for every tool result, and it's been invaluable for identifying wasteful patterns. Knowing that a particular tool call consumed an estimated 2,500 tokens makes it much easier to decide whether the output format needs tightening or whether a new filtering parameter would help.

Running from concept to production typically means plenty of lessons learned along the way. Dave Page shares his insights from creating our MCP server for PostgreSQL in detail here: https://hubs.la/Q044XqsY0

✨ If you haven't read it yet, check it out to learn what's important to avoid the next time you're developing an AI application that needs to go to production.

#aidevelopment #aiengineering #agenticai #mcp #mcpserver #Postgres #PostgreSQL #OpenSource #engineering #programming #devops #aiops
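Token estimation logging of the kind described can be sketched in a few lines. The ~4 characters per token heuristic and the function names here are assumptions for illustration, not the actual implementation from the post:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mcp.tokens")

def estimate_tokens(payload) -> int:
    """Rough heuristic: ~4 characters per token for English/JSON text.
    Crude, but good enough to spot a tool result ballooning into thousands
    of tokens."""
    text = payload if isinstance(payload, str) else json.dumps(payload)
    return max(1, len(text) // 4)

def log_tool_result(tool_name: str, result) -> int:
    """Log the approximate token cost of every tool result before returning it."""
    tokens = estimate_tokens(result)
    log.info("tool=%s approx_tokens=%d", tool_name, tokens)
    return tokens

# A 10,000-character unpaginated result is roughly 2,500 tokens -- exactly
# the kind of wasteful pattern this logging is meant to surface.
```

A real tokenizer (e.g. the model vendor's counting endpoint) gives exact numbers, but a cheap heuristic at log time is often enough to decide where pagination or filtering parameters are needed.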
-
𝐉𝐮𝐬𝐭 𝐬𝐡𝐢𝐩𝐩𝐞𝐝 𝐚 𝐩𝐫𝐨𝐝𝐮𝐜𝐭𝐢𝐨𝐧-𝐠𝐫𝐚𝐝𝐞 𝐑𝐚𝐭𝐞-𝐋𝐢𝐦𝐢𝐭𝐞𝐝 𝐀𝐏𝐈 𝐟𝐫𝐨𝐦 𝐬𝐜𝐫𝐚𝐭𝐜𝐡. 𝐇𝐞𝐫𝐞'𝐬 𝐰𝐡𝐚𝐭 𝐈 𝐥𝐞𝐚𝐫𝐧𝐞𝐝 𝐚𝐛𝐨𝐮𝐭 𝐛𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐬𝐜𝐚𝐥𝐚𝐛𝐥𝐞 𝐬𝐲𝐬𝐭𝐞𝐦𝐬

I built a Developer Metrics API. But the real learning wasn't in the code – it was in the architectural decisions.

𝐓𝐡𝐞 𝐓𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐂𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞: Building an API is easy. Building one that 𝘴𝘤𝘢𝘭𝘦𝘴, 𝘱𝘦𝘳𝘧𝘰𝘳𝘮𝘴, 𝘢𝘯𝘥 𝘱𝘳𝘦𝘷𝘦𝘯𝘵𝘴 𝘢𝘣𝘶𝘴𝘦? That's the real engineering problem.

𝟑 𝐂𝐫𝐢𝐭𝐢𝐜𝐚𝐥 𝐃𝐞𝐜𝐢𝐬𝐢𝐨𝐧𝐬 𝐓𝐡𝐚𝐭 𝐌𝐚𝐝𝐞 𝐓𝐡𝐞 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐜𝐞:

• 𝗥𝗮𝘁𝗲 𝗟𝗶𝗺𝗶𝘁𝗶𝗻𝗴 𝗔𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲
I chose a sliding window algorithm with Redis Sorted Sets over the simpler fixed window approach. Why? Fixed windows have a fatal flaw: burst attacks. Example: 100 requests at 10:59 + 100 at 11:01 = 200 requests in 2 minutes. With sliding windows, I enforce limits over 𝘢𝘯𝘺 60-𝘮𝘪𝘯𝘶𝘵𝘦 𝘱𝘦𝘳𝘪𝘰𝘥, preventing this exploit entirely. The implementation uses atomic Lua scripts to avoid race conditions at scale.

• 𝗥𝗮𝘄 𝗦𝗤𝗟 𝘃𝘀 𝗢𝗥𝗠 (𝗖𝗼𝗻𝘁𝗿𝗼𝘃𝗲𝗿𝘀𝗶𝗮𝗹 𝗧𝗮𝗸𝗲)
I deliberately avoided ORMs like Prisma despite the productivity boost. For analytics-heavy APIs, this paid off:
- Direct query optimization reduced response time from 500ms to 20ms (25x improvement)
- 𝘌𝘟𝘗𝘓𝘈𝘐𝘕 𝘈𝘕𝘈𝘓𝘠𝘡𝘌 became my best friend
- No hidden N+1 queries or abstraction leaks
Is this always the right choice? No. For CRUD apps, ORMs win. But for time-series data and complex aggregations, raw SQL + proper indexing is unbeatable.

• 𝗠𝘂𝗹𝘁𝗶-𝗟𝗮𝘆𝗲𝗿 𝗖𝗮𝗰𝗵𝗶𝗻𝗴 𝗦𝘁𝗿𝗮𝘁𝗲𝗴𝘆
The system uses a tiered caching approach:
- Redis for expensive metrics (15-min TTL)
- PostgreSQL composite indexes for sub-100ms queries
- Batch operations for high-volume inserts (100 commits at a time)
Result: the API handles 10,000+ req/sec with consistent sub-50ms latency.

𝐓𝐡𝐞 𝐍𝐮𝐦𝐛𝐞𝐫𝐬:
- 80%+ test coverage (50+ test suites)
- 3-tier rate limiting (100/1K/10K req/hour)
- Automated CI/CD with GitHub Actions
- Complete RFC 7807 error handling
- SHA-256 API key hashing (security first)

𝐁𝐢𝐠𝐠𝐞𝐬𝐭 𝐋𝐞𝐬𝐬𝐨𝐧: Senior engineering isn't about writing more code – it's about making fewer, better decisions.
Every architectural choice has trade-offs. Understanding 𝘸𝘩𝘦𝘯 to use which pattern matters more than knowing how.

𝐓𝐡𝐞 𝐩𝐫𝐨𝐣𝐞𝐜𝐭 𝐝𝐞𝐦𝐨𝐧𝐬𝐭𝐫𝐚𝐭𝐞𝐬:
• Production-ready error handling
• Sliding window rate limiting (prevents burst attacks)
• Database optimization (strategic indexing)
• Multi-layer caching (25x performance gain)
• CI/CD automation (quality gates)
• Security best practices (never store raw keys)

Check out my GitHub repo: https://lnkd.in/gFqjQ2MG

#SoftwareEngineering #SystemDesign #APIs #RateLimiting #PostgreSQL #Redis #TypeScript #BackendDevelopment #SoftwareArchitecture
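The sliding-window idea can be shown with a minimal in-memory stand-in. The production approach the post describes keeps the timestamps in a Redis Sorted Set and mutates it atomically from a Lua script; this sketch just demonstrates why the 10:59/11:01 burst gets rejected:

```python
import time
from collections import defaultdict

class SlidingWindowLimiter:
    """In-memory stand-in for the Redis Sorted Set approach: each request is
    recorded with its timestamp, and a new request is allowed only if fewer
    than `limit` requests fall inside the trailing `window_s` seconds. In
    Redis this is one atomic Lua script (ZREMRANGEBYSCORE + ZCARD + ZADD)."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.hits = defaultdict(list)  # key -> request timestamps

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict everything older than the trailing window (ZREMRANGEBYSCORE).
        window = [t for t in self.hits[key] if t > now - self.window_s]
        self.hits[key] = window
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True

limiter = SlidingWindowLimiter(limit=100, window_s=3600)
# The fixed-window exploit: 100 requests at t=3540s ("10:59") then 100 more
# at t=3660s ("11:01"). A fixed hourly window would allow all 200; here the
# second burst is rejected because 100 requests already sit in the trailing hour.
```

A fixed window only needs INCR + EXPIRE, which is why it is the default choice; the sorted-set variant trades a little memory per client for closing exactly this burst gap.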
-
I have a few boxes in my loft labeled "sort out later." Some of them date back even to my London era. Don't we all do it? We create a space for things we cannot deal with right now and then forget they exist.

I was looking at an architecture diagram a couple of weeks ago and thought the Dead Letter Queue (DLQ) is often that exact box. It satisfies error handling, but in reality it is often just a place where data goes to die. I don't know Kafka in detail, but I've been asking myself how I would resolve this "ignored box" problem using Pub/Sub in GCP.

A couple of principles first:
1. Never let DLQ messages accumulate to an unmanageable amount.
2. Never let messages just sit in the box, virtually ignored.

Then what you can do in GCP:
- Use a low max_delivery_attempts (e.g., 5) and pair it with a Cloud Monitoring alert. If the dead letter message count is > 0, get someone's attention immediately. This stops it from becoming a "silent" problem.
- Design the DLQ to sink into a BigQuery table using a BigQuery subscription. You can then query the CloudPubSubDeadLetterReason attribute with native SQL to group errors by type. This mirrors the Kafka "group by error class" logic.
- Set topic retention to 31 days for critical pipelines. Yes, the boxes will pile up, but it buys you a full month of "forensic time" to fix a bug without losing data.
- Dataflow has a side-output DLQ. While Pub/Sub just tells you that something failed, a side-output can capture the actual Java/Python exception and attach it to the message.
- Note on a detail: do not forget that "retain_acked_messages" has to be on, otherwise your rewind button has no power :-) - this can be set at subscription or topic level.

Cannot guarantee 100% correctness, just trying to apply what I have learned, and I'm happy to be corrected. https://lnkd.in/gbyRd7pc
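The "group by error class" step can be prototyped in plain Python before writing the BigQuery SQL. The attribute name below follows the post, and the sample rows are illustrative; verify both against the attributes your dead-letter messages actually carry:

```python
from collections import Counter

# Each row stands in for what a BigQuery subscription would land: message
# data plus the forwarded attributes (rows here are made up for illustration).
dead_letters = [
    {"attributes": {"CloudPubSubDeadLetterReason": "DELIVERY_FAILURE"}},
    {"attributes": {"CloudPubSubDeadLetterReason": "SCHEMA_MISMATCH"}},
    {"attributes": {"CloudPubSubDeadLetterReason": "DELIVERY_FAILURE"}},
]

# Equivalent in spirit to: SELECT reason, COUNT(*) ... GROUP BY reason ORDER BY 2 DESC
by_reason = Counter(
    row["attributes"].get("CloudPubSubDeadLetterReason", "UNKNOWN")
    for row in dead_letters
)
print(by_reason.most_common())
```

Once the grouping looks right, the same shape translates directly into a scheduled BigQuery query feeding a dashboard, so the box gets looked at on a cadence instead of never.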
-
Pragmatism: Handling Distributed Consistency without the Overhead 🚀

Last week, I faced a dilemma: I needed to update a record in my local DB and then notify a second microservice via a REST call. The "textbook" approach would be a Transactional Outbox or a Saga pattern. But to be honest, setting up dedicated outbox tables, message relayers, and complex compensation logic for a non-critical feature is often like using a sledgehammer to crack a nut.

🛠️ The Problem:
Simple REST in a separate thread: fast, but risky. If the instance restarts, the in-memory retry queue is wiped out. Data is lost.
Transactional Outbox: rock-solid, but high development overhead for this specific use case.

The Middle Ground: Redis-Backed Reliability 💡
Since the data wasn't mission-critical, I opted for a Redis-backed background worker instead of a simple in-memory thread.

Why this works:
✅ Statelessness: By moving the "pending tasks" from the JVM heap to Redis, my service remains stateless. A deployment or a crash doesn't kill the task.
✅ Persistence: With RDB/AOF enabled, Redis ensures the retry queue survives even if the cache server itself blinks.
✅ Simplicity: It took me 20% of the time a full Outbox pattern would have, while providing 99.9% of the required reliability.

The Tech Stack:
- Spring Boot (custom @Async or ThreadPoolTaskExecutor)
- Redis (as a lightweight persistent queue)
- Exponential backoff retry logic

The Lesson: As engineers, our job isn't to build the most complex system possible, but to find the right balance between reliability, speed of delivery, and system complexity.

Have you ever "downgraded" a pattern to save time without sacrificing too much stability? Let's discuss in the comments! 👇

#Java #Backend #SystemDesign #Microservices #Redis #CleanCode #SoftwareEngineering
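The retry bookkeeping behind this pattern can be sketched as follows. A plain dict stands in for Redis here, and the key names, 2-second base delay, and 5-attempt cap are assumptions for illustration, not the author's actual Spring code:

```python
import json

BASE_DELAY_S = 2.0
MAX_ATTEMPTS = 5

def next_delay(attempt: int) -> float:
    """Exponential backoff: delay doubles with each failed attempt."""
    return BASE_DELAY_S * (2 ** attempt)

# A dict stands in for Redis; in the real setup these entries live in a
# Redis structure (with RDB/AOF persistence), so a JVM restart cannot lose them.
redis_stub = {}

def enqueue_task(task_id: str, payload: dict) -> None:
    """Persist the pending notification before attempting delivery."""
    redis_stub[task_id] = json.dumps({"payload": payload, "attempt": 0})

def on_failure(task_id: str):
    """Bump the attempt counter; return the next retry delay in seconds,
    or None once the attempt budget is exhausted."""
    task = json.loads(redis_stub[task_id])
    task["attempt"] += 1
    if task["attempt"] >= MAX_ATTEMPTS:
        del redis_stub[task_id]  # in production: dead-letter it and alert
        return None
    redis_stub[task_id] = json.dumps(task)
    return next_delay(task["attempt"])
```

The key property is that the task state lives outside the process: the worker that picks the task back up after a deploy does not need to be the one that enqueued it.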
-
🚨 STOP using a single DB node and praying it won't crash. Your app deserves a Rust‑powered proxy that pools, balances AND shards – meet pgdog. 3‑minute read, 100% game changer.

Built in Rust, pgdog talks straight to the PostgreSQL wire protocol – no driver changes, zero code rewrites. It pools connections (session & transaction), cuts latency, and can slash required DB processes by up to 70%.

Three load‑balancing strategies plus health checks automatically route reads to replicas and writes to primaries. Automatic sharding with hash, list, or range – schema‑based multi‑tenant isolation without a single line of app code. Two‑phase commit guarantees atomic cross‑shard transactions, and a built‑in unique_id() serves millions of IDs per second.

Metrics? An OpenMetrics endpoint feeds Prometheus/Grafana, so you see every hit, fail, or success. Since 2023 it's gathered 3.4k GitHub stars, TLS support, and a thriving community of 30 PRs/month.

Curious? Dive into the full article and the repo 👉 https://lnkd.in/d4_sSrK8

What's the biggest bottleneck you're hitting with PostgreSQL today? Drop a comment.

#PostgreSQL #Rust #pgdog #DatabasePerformance #DevOps
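Hash-based sharding at a proxy boils down to a deterministic key-to-shard mapping. A toy illustration of the routing idea (pgdog's real hash function is compatible with PostgreSQL's partitioning hash; the SHA-256-based stand-in and shard names here are assumptions for demonstration only):

```python
import hashlib

SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]

def route(sharding_key: str) -> str:
    """Map a sharding key to a shard: hash the key, take it modulo the
    shard count. Deterministic, so every query for the same key lands on
    the same backend without any app-side bookkeeping."""
    digest = hashlib.sha256(sharding_key.encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

# The same tenant always lands on the same shard:
assert route("tenant_42") == route("tenant_42")
```

The reason this belongs in the proxy rather than the app is exactly the post's point: the wire protocol already carries the queries, so the routing decision needs zero code changes on the client side.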
-
A year ago, managing a large cluster of interdependent states pushed our Redis instance to its limits — frequent reads, writes, and recomputation cascades compounding until we had a real problem. We bought time by switching to Valkey. Lua scripts were the logical next step, but we never prioritized them, and eventually the pressure eased enough that we moved on.

At home, I kept thinking about the underlying problem: dependent state. Every time a value changes, downstream values need to recompute. Most systems handle this at read time — you query, you compute. I wanted to try the opposite: propagate changes at write time using triggers, so reads stay simple.

That became map-cache — an in-memory cache built around that idea:
- Any JSON-representable structure — values aren't limited to strings and integers; store and query arbitrarily nested objects, arrays, and mixed types, accessed by path
- Multiple independent caches — run isolated namespaces simultaneously, useful for per-tenant or per-service separation
- Cascading trigger updates — changes propagate automatically through dependency chains
- Pattern-based key matching — wildcards for flexible key routing
- Conditional workflows — if/for/return logic inside batch commands
- RESP protocol support — compatible with standard Redis clients, no JSON overhead

The project never saw production. The problem that inspired it was already solved — which, in hindsight, was probably the right outcome. This is experimental, not Redis-complete, and hasn't faced real traffic. But building it taught me a lot about API design, protocol implementation, and where the interesting tradeoffs live between simplicity and capability.

If you're curious about write-time state propagation or want to explore an opinionated cache design, it's open source: https://lnkd.in/gj-tBNeg
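Write-time propagation can be illustrated in a few lines. This is a toy sketch of the idea, not map-cache's actual API: a trigger registered on a key recomputes the derived key at write time, and cascades through chains, so reads are plain lookups.

```python
class TriggerCache:
    """Toy write-time propagation: writing a source key immediately
    recomputes every derived key registered on it, so reads never compute."""

    def __init__(self):
        self.data = {}
        self.triggers = {}  # source key -> list of (derived_key, fn)

    def on_write(self, source: str, derived: str, fn):
        """Register: when `source` changes, set `derived` to fn(new_value)."""
        self.triggers.setdefault(source, []).append((derived, fn))

    def set(self, key: str, value):
        self.data[key] = value
        # Cascade: derived writes fire their own triggers too.
        for derived, fn in self.triggers.get(key, []):
            self.set(derived, fn(value))

    def get(self, key: str):
        return self.data[key]

cache = TriggerCache()
cache.on_write("price", "price_with_tax", lambda p: round(p * 1.2, 2))
cache.on_write("price_with_tax", "display", lambda p: f"${p:.2f}")
cache.set("price", 10.0)
print(cache.get("display"))  # "$12.00" -- computed at write time, two hops away
```

The trade-off is visible even here: writes get more expensive (and cycles in the trigger graph must be forbidden), which is the price paid for constant-time reads.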
-
Your MCP Server Works Locally. Then Kubernetes Kills the Session.

We just published a new blog post about a sneaky issue we hit running a Spring AI MCP server in production on GKE.

The symptom: intermittent "Session not found" errors from Claude Code — even though the auth token was valid and the server was up.

The root cause: MCP's Streamable HTTP transport stores sessions in an in-memory ConcurrentHashMap. Rolling deployments, pod restarts, and horizontal scaling all wipe them out. And no, you can't persist them to Redis — they contain live Reactor FluxSink objects tied to the JVM.

The fix: one config change — switch to STATELESS transport. No sessions, no sticky routing, any pod can serve any request.

In the post we break down:
- How MCP's two auth layers (your JWT tokens vs. the SDK's transport sessions) interact
- Why the SDK's session store isn't designed for Kubernetes yet
- What you actually lose going stateless (probably nothing)
- Why Redis session persistence isn't viable (we checked the SDK source)

If you're one of the early teams pushing MCP servers beyond localhost, this might save you some debugging.

Read the full post: https://lnkd.in/dT29prgR

#MCP #Kubernetes #SpringAI #Java #AI #CloudNative #GKE
-
Lately I've been reflecting on something that separates "working APIs" from production-ready systems: response time.

In backend systems (especially with Django), API response time is not just a performance metric, it's a user experience decision. Every extra second spent waiting is friction, and most times the problem isn't Django itself, but what we ask Django to do synchronously.

One lesson I've learned is this: not everything belongs in the request–response cycle. Tasks like sending emails, verifying payments, generating reports, assigning jobs, or logging analytics should not block an API response. This is where asynchronous processing comes in. Using tools like Celery and cron jobs allows the API to respond fast while heavy or non-critical work runs safely in the background. The difference this makes in real-world systems is huge.

Another key piece is caching. Hitting the database for the same data repeatedly is expensive. Introducing a cache layer like Redis for frequently accessed data, session data, or temporary state can dramatically reduce response times and database load. In many cases, the fastest query is the one you never run.

Then there's the database itself. I've seen "slow APIs" magically become fast just by adding the right indexes. Indexes are not optional in production systems, they're essential. A well-indexed database can turn expensive table scans into efficient lookups, especially as data grows.

What ties all of this together is intentional backend design:
- Keep API responses lean and fast
- Push heavy work to background workers (Celery)
- Cache smartly with Redis
- Index your database based on real query patterns

Django gives us solid tools out of the box, but performance comes from how we use them, not just from using them.

Curious to hear how others approach API performance and background processing in their systems.

#BackendEngineering #Django #APIs #Celery #Redis #DatabaseOptimization #SoftwareEngineering
-
Ever wondered what happens when you hit "Submit" on LeetCode? 🤔 Spoiler: It's WAY more complex than it looks.

I spent the past few weeks building CodeXFlow - a production-grade distributed system that mimics how platforms like LeetCode and Codeforces process millions of code submissions.

Here's what happens in under a second when you submit code:
1️⃣ Client sends submission → API Gateway
2️⃣ Gateway validates auth & applies tier-based rate limiting (FREE: 10/s, PREMIUM: 50/s, ENTERPRISE: 100/s)
3️⃣ Submission Service validates the problem (HTTP → Problem Service)
4️⃣ Submission persisted to MongoDB
5️⃣ Event published to Evaluation Queue (RabbitMQ)
6️⃣ Evaluation Service (async consumer) picks up the job
7️⃣ Code executes in an isolated Docker container
8️⃣ Results published back to Submission Result Queue
9️⃣ Submission Service consumes result & updates DB
🔟 Real-time WebSocket push to client → "Accepted ✅"

All of this without blocking a single request 🚀

🛠️ Tech Stack:
- TypeScript + Node.js (type-safe backend)
- Docker (isolated code execution)
- RabbitMQ (async event-driven messaging)
- MongoDB (persistent data storage)
- WebSockets (real-time client updates)
- Redis (API Gateway rate limiting)

🧠 Key Engineering Patterns:
✅ Event-Driven Architecture - loose coupling via message queues
✅ Async Processing - heavy work off the critical path
✅ Service Isolation - each service owns its domain
✅ Containerized Execution - secure sandboxed runtime
✅ Real-Time Updates - WebSocket push notifications
✅ Rate Limiting - Redis-backed tiered throttling

Why This Architecture?
⚡ Non-Blocking: Users get instant feedback; execution happens in the background
🔒 Secure: Docker isolation prevents malicious code from escaping
📈 Scalable: Horizontal scaling by adding more consumers
🔄 Reliable: Message queues ensure zero data loss
⚡ Real-Time: WebSockets deliver results the moment they're ready

This project taught me how world-class platforms handle async processing, containerized execution, and real-time communication at scale.

Architecture diagram attached 👇

Open source & ready for feedback!
🔗 GitHub: https://lnkd.in/ggcDMtUt

What would you build differently? Drop your thoughts! 💬

#BackendEngineering #SystemDesign #DistributedSystems #Microservices #NodeJS #Docker #RabbitMQ #WebSockets #EventDriven #SoftwareArchitecture
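The tier-based rate limiting in step 2️⃣ can be sketched as a per-second counter keyed by user. This is a simplified stand-in for the Redis-backed gateway limiter (in Redis it would be INCR + EXPIRE on a per-user-per-second key); the class and method names are assumptions, not CodeXFlow's actual code:

```python
TIER_LIMITS = {"FREE": 10, "PREMIUM": 50, "ENTERPRISE": 100}  # requests/second

class TieredLimiter:
    """Fixed one-second windows per user: count requests in the current
    second and compare against the caller's tier limit."""

    def __init__(self):
        self.counts = {}  # (user, second) -> request count

    def allow(self, user: str, tier: str, now_s: int) -> bool:
        key = (user, now_s)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= TIER_LIMITS[tier]

limiter = TieredLimiter()
results = [limiter.allow("alice", "FREE", now_s=0) for _ in range(11)]
print(results.count(True))  # 10 allowed on the FREE tier; the 11th is rejected
```

Keeping the counter in Redis rather than in-process is what lets multiple gateway instances enforce one shared limit per user.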
-
[1] https://learn.deeplearning.ai/courses/agent-skills-with-anthropic [2] https://github.com/guyroyse/ttrpg-campaign-generator-skill