Build Intuition, Not a Tool
As we work with AI models and tooling, we build intuition about what they're good and bad at. I used to think the role of a software developer was to smooth rough edges and to Automate the Boring Stuff with Python. In the age of AI, though, I'm finding a new perspective. In a previous post, Riding the AI Wave, I talked about a number of tools I built to help cope with these edges:
- AI couldn't read files before (I built a file reader)
- When AI got good at reading files, it still didn't produce high-quality code (I built a code quality rater)
- AI wouldn't iterate on project management tooling (I built an agentic project management workflow)
Models develop and then deprecate the need for these tools. I'm still running into limitations, though. For example, I've built a ton of prototypes to demonstrate concepts, and whenever I build UIs through AI I inevitably get light text on a light background. I just figured out the issue today: these AI tools default to making UIs theme-able. They're prepping the UI for a light/dark theme toggle feature (even if I don't ask for it). The catch is that every element needs to register with the theme-able stylings, and that doesn't always happen.
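Here's a minimal sketch of the failure mode as I understand it, assuming Tailwind-style markup (the specific class names are my own illustration, not pulled from any of my actual prototypes):

```html
<div class="bg-white dark:bg-gray-900 p-4">
  <!-- This heading registered with the theme: readable in both modes. -->
  <h1 class="text-gray-900 dark:text-gray-100">Dashboard</h1>
  <!-- This paragraph only got the dark-mode text color. In light mode
       it's light gray text on a white background: nearly invisible. -->
  <p class="text-gray-100">You can barely read this until the theme toggles.</p>
</div>
```

One element missing its light-mode pairing is all it takes, and in a generated UI with hundreds of elements that's easy to miss.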
Now, by habit, I think to myself: maybe I could update my GitHub Copilot prompt files, CLAUDE.md, or Cursor rules. I'd add something like "When adding Tailwind classes, make sure the background will switch with light/dark mode theme toggles," and give it a few-shot example. With this solution, I'd expect the next generation of models to deprecate that system prompt.
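A hedged sketch of what such a rules-file entry might look like (the heading and wording are my own; adapt to whichever tool's rules format you use):

```markdown
## Tailwind theming

When adding Tailwind classes, make sure text and background colors
switch together with light/dark mode theme toggles. Every element that
sets a text color must set both variants.

Bad:  <p class="text-gray-100">status</p>
Good: <p class="text-gray-900 dark:text-gray-100">status</p>
```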
Maintaining such a rudimentary band-aid is an expensive piece of technical debt. The "proper" way I know to maintain a fix like this would be an evaluation framework that checks whether the clause is actually helping. In my head I'm still going back and forth on whether I should bite the bullet and start working on that evaluation framework.
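For the theming clause, even a crude version of that framework could be small. This is a sketch under my own assumptions, not a real harness: `generate` stands in for whatever model call you'd plug in, and the "pass" check is a deliberately naive regex heuristic.

```python
import re

THEME_CLAUSE = (
    "When adding Tailwind classes, make sure text and background colors "
    "switch together with light/dark mode theme toggles."
)

def has_paired_theme_classes(markup: str) -> bool:
    """Crude heuristic: any element that sets a text color should also
    set a dark-mode text color, and vice versa."""
    for classes in re.findall(r'class="([^"]*)"', markup):
        tokens = classes.split()
        has_text = any(t.startswith("text-") for t in tokens)
        has_dark_text = any(t.startswith("dark:text-") for t in tokens)
        if has_text != has_dark_text:
            return False
    return True

def evaluate(generate, tasks, clause=None):
    """Return the fraction of generated UIs that pass the theme check.
    Run once with clause=None and once with clause=THEME_CLAUSE to see
    whether the clause is earning its keep."""
    passed = 0
    for task in tasks:
        prompt = task if clause is None else f"{clause}\n\n{task}"
        if has_paired_theme_classes(generate(prompt)):
            passed += 1
    return passed / len(tasks)
```

Comparing the pass rate with and without the clause, across model versions, is exactly the signal that would tell me when the band-aid can come off.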
There are other consistent problems out there like:
- Dependency Assumption: AI was trained on one version of a library and assumes that's the version you're using. Band-aid: add a call-out in the system prompt for particular libraries that keep messing up. (Claude used to have a heck of a time setting up the Azure OpenAI SDK.)
- Component Assumption: AI reads a component name and assumes it knows its inner workings. Band-aid: build the component library's documentation into the system prompt.
- Documentation Pollution: AI puts more value in a code comment than in reading the actual functionality of the code. Band-aid: add to the system prompt that every AI action should keep code comments up to date.
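Each of those band-aids also ends up as a rules-file entry. A hedged sketch of what they might look like (the version numbers and file paths are hypothetical placeholders, not from my actual setup):

```markdown
## Known failure modes

- Dependencies: this project uses openai >= 1.0; do not emit the
  pre-1.0 `openai.ChatCompletion` API.
- Components: read docs/components.md before modifying anything in
  src/components/; do not guess a component's props from its name.
- Comments: whenever you change a function, update its comment in the
  same edit so comments never drift from the code.
```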
A big lesson I've learned recently is that without proper resourcing, your tool isn't going to be widely valuable. Looking at the industry landscape, if these prompt snippets were valuable I'd expect to see semver-managed prompt packages, with a team updating them and evaluation metrics on the package pages. But these moving targets are hard to plan for and hard to solve. And this work sits in a known problem space that I'm pretty sure the model makers are actively working on.
So, I'll join the chorus of folks out there, keep my message to "get good at prompting," and keep building. That's what I'm doing for the time being, at least. As long as folks coding with AI are checking the answers and understand the system, you'll notice where AI is commonly failing. Don't get discouraged that you sometimes have to set the AI aside and manually code some stuff, or really dig deep into some code.
I would like to hear about folks' experiences supporting these little band-aids. What challenges are you running into? Have you seen consistent issues between models? Do you think they'll be fixed in future models? Or do you think our workflows will just eventually take these limitations into account?
Can't believe this article feels old after a single day. I'm just now seeing an eval framework for GitHub Copilot: https://github.com/bepuca/copilot-chat-eval