SQLite shipped an AGENTS.md file this week: an explicit contract for how agents should interact with the database, not just an API reference. A document that says: here is how to use this if you're a model, not a human.

Infrastructure is starting to acknowledge that the agent is a first-class consumer of your data and needs different affordances than a human does.

We ran into this building Podium. The enrichment layer powers both Sage and Familiar, two products with completely different access patterns. Sage needs deep, personalized queries against a user's skin diary and product history. Familiar needs fast, broad queries across creator storefronts. Both are agents consuming product data. Neither pattern looked like anything we had built for human-facing interfaces.

We rebuilt the data model before we rebuilt the agents. Every team I see struggling with agent reliability built the agent first and retrofitted the memory layer. When the consumer is an agent, the data model is the primary design constraint, not an implementation detail you get to later.

If you're scoping an agent project right now, the question I'd start with is: what does the agent need to know, and how fast can it find out? That answer determines your schema, your memory layer, and which model you even need to call.

#agentinfra #aiengineering #founders
SQLite Agents Contract for AI Data Consumption
More Relevant Posts
-
🚨 We as engineers are wasting tokens in Claude Code… without realizing it.

We write a prompt → Claude stops halfway → then we re-prompt → Claude stops again → repeat 40 times. Right??

I recently came across Anthropic's /goal command documentation, and honestly, this changes how you should interact with AI coding agents.

Here's what it does in one line: You define the finish line. Claude runs until it crosses it.

Instead of vague prompts like:
❌ “Fix the app”
❌ “Improve the code quality”

We define a clear success condition once, and Claude keeps iterating until that condition is met.

Under the hood, two models run in a loop:
• Sonnet/Opus does the actual coding work
• Haiku (fast + cheap) evaluates after EVERY turn: "Is the condition met?"
• If not → Claude auto-starts the next turn. No prompting from you.

The real power is in how you define the goal. A strong /goal should include:
• Goal
• Context
• Constraints
• Plan
• Done When
• Verify
• Stop Rules

Example:

❌ Weak Prompt
“Fix the retrieval pipeline.”

✅ Strong /goal
Goal: Fix failing retrieval tests
Done When: pytest tests/retrieval exits 0
Verify: Run full test suite
Stop Rule: Stop after 25 turns

This converts AI workflows from:
🔁 Endless expensive loops
into
🎯 Bounded execution with a verifiable finish line

Set the goal. Walk away. Come back to done.

📌 Official docs: https://lnkd.in/gpny_ziC

#AI #GenAI #Claude #Anthropic #LLM #PlatformEngineering #SolutionArchitecture #DevOps #DeveloperExperience #PromptEngineering #SoftwareEngineering #AIAgents #CloudArchitecture #Productivity #EngineeringLeadership
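To make the "Done When" and "Stop Rule" fields concrete, here is a minimal Python sketch of a work-then-check loop. It is not how Claude Code implements /goal: the post describes a model (Haiku) acting as the evaluator, while this sketch swaps in a shell command's exit code as the check. `Goal`, `condition_met`, `run_until_done` and the `work_turn` callback are all hypothetical names invented for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable


@dataclass
class Goal:
    """Hypothetical goal spec: a description, a check command, a turn cap."""
    description: str
    done_when: list[str]   # command that exits 0 once the goal is met
    max_turns: int = 25    # stop rule: give up after this many turns


def condition_met(goal: Goal) -> bool:
    # The check runs outside the worker, so "done" is never self-reported.
    result = subprocess.run(goal.done_when, capture_output=True)
    return result.returncode == 0


def run_until_done(goal: Goal, work_turn: Callable[[int], None]) -> tuple[bool, int]:
    """Alternate one unit of work with one check until done or out of turns.

    Returns (goal_met, turns_used).
    """
    for turn in range(1, goal.max_turns + 1):
        work_turn(turn)            # stand-in for one agent turn
        if condition_met(goal):
            return True, turn
    return False, goal.max_turns
```

With the example from the post, `done_when` would be something like `[sys.executable, "-m", "pytest", "tests/retrieval"]`. The turn cap matters as much as the condition: without it, an unreachable goal keeps spending tokens indefinitely.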
-
Claude Code just told me it was “Lollygagging.” 😂 Then “Gusting.” Then “Boondoggling.”

It was refactoring my API. The whole thing took 9 seconds.

There are apparently 187 of these absurd gerunds buried in Claude Code’s source. Someone at Anthropic sat down and decided that “Processing…” wasn’t good enough. That a developer staring at a terminal at 11pm deserved a tiny moment of amusement.

This is the part most enterprise software still doesn’t understand. For 20 years, we built tools for the buyer:
• the CIO signing the contract
• the procurement checklist
• the RFP scorecard

Nobody optimized the 3 seconds between clicking a button and seeing a result. Because nobody on the buying committee ever experiences those 3 seconds.

But AI changes the relationship completely. You are no longer interacting with software occasionally. You are interacting with simulated cognition continuously. And once that happens, tiny behavioral details suddenly matter:
• the loading state
• the pacing
• the tone
• the error message
• the empty-state copy
• the feeling that something thoughtful is happening on the other side

That’s why Claude Code feels different from older developer tools. Not necessarily because the model is smarter. Because the interaction feels more human.

I’m building a 0-to-1 product right now, and every time I’m tempted to ship “Loading…” I think about Lollygagging 😄

Small moments compound. They’re often the difference between software people use… and software people love enough to talk about.
-
A tension I keep running into building agent systems for finance:

LangGraph and Google ADK are excellent for deterministic workflows. You hardcode trace mechanisms, log every step, and know exactly what the agent did and why. Auditability is essentially free.

The newer free-flowing patterns (OpenClaw-style agents with computer use, managing themselves through skill folders and dynamic tools) are dramatically more capable. They actually use the model’s full intelligence and improvisation. But in regulated sectors (finance, law, medical), they can be an audit nightmare.

A few hacks that have helped me:
• Thread external logs from Python tools into GCS or another cloud bucket. Every call, every param. This is gold for debugging wrong decisions after the fact.
• For non-code skills, develop an explicit trust policy with the agent (Agents.md, Soul.md, that kind of artifact).
• Monitor the VM itself. Bash logs and system actions on the host give you another layer of ground truth when the ambient model goes off-script.

The place I keep landing: a hybrid architecture. Creative, exploratory work runs on capable free-flowing agents in a sandboxed environment that records everything. The strict regulated paths (executing trades, touching accounts, anything with a compliance footprint) run on LangGraph. Capability where you want it, determinism where you need it.

Curious how others are drawing this line.
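The first hack (log every tool call and every parameter outside the agent) can be sketched as a small Python decorator. This is an illustration under assumptions, not the author's setup: `audited` and `lookup_price` are hypothetical names, and the sink here is any writable text stream producing JSON Lines. Shipping that stream to GCS or another bucket is left out on purpose.

```python
import functools
import json
import time
from typing import Any, Callable, TextIO


def audited(sink: TextIO) -> Callable:
    """Wrap a tool so every call appends one JSON line to `sink`.

    The record is written in `finally`, so failed calls are logged too.
    """
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            record = {
                "ts": time.time(),
                "tool": fn.__name__,
                "args": repr(args),      # repr keeps arbitrary values serializable
                "kwargs": repr(kwargs),
            }
            try:
                result = fn(*args, **kwargs)
                record["status"] = "ok"
                record["result"] = repr(result)
                return result
            except Exception as exc:
                record["status"] = "error"
                record["error"] = repr(exc)
                raise                    # the agent still sees the failure
            finally:
                sink.write(json.dumps(record) + "\n")
                sink.flush()
        return wrapper
    return decorator
```

Usage would look like `@audited(open("tool_calls.jsonl", "a"))` above each tool function. Because the log is written by the tool wrapper rather than by the agent, it stays trustworthy even when the agent's own account of what it did is wrong.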
-
Most observability tools were built assuming one primary user: a human looking at dashboards. That is changing.

Developers are now spending more time inside coding agents like Claude Code, Cursor, Codex, and Gemini CLI. But if these agents have to help debug production issues, code context is not enough. They need production context too: logs, traces, metrics, alerts, service topology, deployment history.

We are calling this Agent-Native Observability.

Our view is not that agents will replace humans in observability. Agents can collect context, correlate signals, and do the first layer of investigation. Humans still decide what matters: which customer journey is critical, which alerts deserve attention, which SLOs matter, which fix is safe.

This is also why we think open standards and open source matter even more in this new workflow. If agents are going to reason over production systems, the underlying telemetry layer should be open, understandable, and portable.

We wrote more about what we’re building in the blog below.
-
Your team uses Claude Code. Inside Domino, it doesn't stop at code. It runs jobs, tracks experiments, registers models and writes to the audit trail. Same agent. Whole lifecycle. 30 seconds: https://gag.gl/5e0NWF Blog: https://gag.gl/Nz8toB
-
🤫 "Anybody can code now" is a seductive lie.

If you have never seriously coded in your life, vibe-coding your way to production-grade software isn't democratization. It's a delayed disaster.

Here's what actually happens 3 months in.
- Slow API calls nobody can diagnose.
- The same logic copy-pasted in four places.
- Database queries that collapse the moment real users show up.
- A codebase so tangled even the AI that wrote it can't help you fix it.

Vibe-coding is a powerful tool. In the hands of engineers who understand systems, it's a real multiplier. In the hands of someone who has never thought about memory, latency, or data integrity, it's just software that looks finished and isn't.

The problem isn't the tools. It's optimising for the demo moment instead of the maintenance moment. Getting something to run is not the same as knowing why it runs. Or what to do when it doesn't.
-
Claude Code just got a lot more autonomous. 🎯

New feature: /goal

Instead of babysitting your AI coding assistant turn by turn, you set a completion condition and Claude keeps working until it's met.

Real examples:
✅ "All tests in test/auth pass and lint is clean"
✅ "Every PR from this week has a CHANGELOG entry"
✅ "Split this large file until each module is under the size budget"

How it works:
→ You type /goal [your condition]
→ Claude starts working immediately
→ After each turn, a fast model checks whether the condition is met
→ If not, Claude keeps going automatically
→ When done, the goal clears itself

The clever part? The evaluator is a separate model from the one doing the work. So completion isn't decided by the agent hoping it's done; it's verified independently.

Pair it with auto mode and you've got fully unattended coding sessions that stop exactly when the job is done. This is what "agentic" actually looks like in practice.

🔗 Claude Code docs: https://lnkd.in/dCnGBVPi

#AI #DeveloperTools #ClaudeCode #Anthropic #AIEngineering #Productivity
-
🚀🍔 Context engineering is the real secret sauce
🧠 Every layer shapes how the agent thinks, reasons, and acts
⚡ Claude Code’s burger analogy makes the architecture surprisingly intuitive

If Claude Code is a burger...

Before each model call, Claude Code assembles a context window from 9 distinct sources. Think of it as a burger: each layer adds something different.

👉 System Prompt: Defines Claude's role, behavior, and tone. This sets the foundation.
👉 Environment Info: Git status, branch info, and current date. Pulled in via getSystemContext().
👉 CLAUDE.md: A four-level instruction hierarchy: managed → user → project → local. Plain-text Markdown, so users can read, edit, and version-control everything the model sees.
👉 Auto Memory: Contextually relevant memory entries prefetched asynchronously. An LLM scans memory-file headers and surfaces up to 5 relevant files on demand.
👉 Path-scoped Rules: Conditional rules that load lazily when the agent reads files.
👉 Tool Metadata: Skill descriptions, MCP tool names, and deferred tool definitions.
👉 Conversation History: Carried forward across iterations.
👉 Tool Results: File reads, command outputs, and subagent summaries.
👉 Compact Summaries: When history grows too long, older segments are replaced by model-generated summaries.

The whole design treats context as a scarce resource.

Personally, I find the biggest leverage comes from tuning the instruction layers and memory systems well because that’s where consistency and long-term agent behavior really emerge.

Over to you: Which of these 9 layers do you tune the most when working with Claude Code?

#ClaudeCode #AIEngineering #ContextEngineering #LLM #AIAgents
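The "context as a scarce resource" idea can be illustrated with a small Python sketch that stacks layers in order and trims when the budget is exceeded. This is not Claude Code's implementation. `Layer`, `rough_tokens` and `assemble_context` are hypothetical names, the token count is a crude word count, and the sketch drops optional layers where the post describes replacing older history with model-generated summaries.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One slice of the context window (hypothetical structure)."""
    name: str
    text: str
    required: bool = True   # optional layers are dropped first when over budget


def rough_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def assemble_context(layers: list[Layer], budget: int) -> str:
    """Join layers in order, dropping optional ones from the end until the
    total (layer text only, headers excluded) fits within `budget`."""
    kept = list(layers)
    while sum(rough_tokens(layer.text) for layer in kept) > budget:
        for i in range(len(kept) - 1, -1, -1):
            if not kept[i].required:
                del kept[i]          # drop the last optional layer first
                break
        else:
            raise ValueError("required layers alone exceed the context budget")
    return "\n\n".join(f"## {layer.name}\n{layer.text}" for layer in kept)
```

Marking the system prompt and instruction files as required while leaving memory entries and tool results optional mirrors the post's point: the instruction layers are the ones worth protecting when space runs out.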
-
~ A few updates on Keen Code

+ There are now 9 different providers supported: https://lnkd.in/ekQxNfEw
Try Keen Code with your subscription (Codex or OpenCode Go) or with BYOK and check whether it helps with your limits. My expectation is that you will burn fewer tokens with Keen in a multi-turn task. If not, I will be curious why.

+ Agent Skills are now a first-class citizen in Keen. If you add a skill in the specified directories, Keen will scan and load it. Later, you can either trigger it with `slash` commands or just let the agents decide: https://lnkd.in/eTc4uX24

+ Keen now has bundled skills support too. In v0.15.0, there is only one bundled skill: `/commit`. More will be added.

+ Turn memory is now reliable across failures. Unlike typical coding agents, Keen doesn't retain tool input/output beyond a single agent turn. When an agent turn finishes, Keen only retains a deterministic "TurnMemory" object that stores what files were changed and what bash commands failed. This conservative approach reduces context size and cost in many software engineering tasks, at the expense of agent memory in longer multi-turn conversations. But what if a turn fails right in the middle? Previously, Keen had no handling for such failures, but now it does: https://lnkd.in/eqW6g7na

I wholeheartedly welcome feedback and suggestions for Keen!
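The turn-memory idea (keep only which files changed and which bash commands failed, and discard raw tool input/output) can be sketched in Python as below. This is an assumption-laden illustration, not Keen's code: only the name "TurnMemory" and its two kinds of content come from the post, while `ToolCall`, `summarize_turn` and the `"edit"`/`"bash"` tool labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One tool invocation within a turn (hypothetical shape)."""
    tool: str        # "edit" or "bash" in this sketch
    target: str      # file path for edits, command line for bash
    output: str      # raw output; deliberately not carried past the turn
    ok: bool = True


@dataclass(frozen=True)
class TurnMemory:
    """The only state kept after a turn ends."""
    files_changed: tuple[str, ...]
    failed_commands: tuple[str, ...]


def summarize_turn(calls: list[ToolCall]) -> TurnMemory:
    # Sorting and de-duplicating makes the result deterministic:
    # the same calls always produce the same memory object.
    files = sorted({c.target for c in calls if c.tool == "edit" and c.ok})
    failed = [c.target for c in calls if c.tool == "bash" and not c.ok]
    return TurnMemory(tuple(files), tuple(failed))
```

The trade-off the post names shows up directly: `output` never reaches `TurnMemory`, which keeps the next turn's context small but means the agent has to re-read a file if it needs the contents again.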