Scaling AI Coding Agents: Lessons from a Seasoned Engineer

Shariq Hashmi

AI Fluens · 5K followers

A couple of months ago, I went from leading engineering teams to going hands-on, which meant working daily with AI coding agents — Claude Code, Copilot, Lovable, Replit — on real projects.

As someone who's built and scaled teams at tech giants, I thought I'd pick these up fast. I was wrong. The gap between using these tools and using them well is enormous.

Four things changed my output quality dramatically:

𝟭. CLAUDE.md / AGENTS.md
A markdown file at your project root describing your architecture, conventions, and constraints. Without it, the agent guesses. With it, it follows. The agent stopped inventing its own naming conventions overnight. Highest-ROI setup I've found.

𝟮. Task decomposition
Early on I'd describe an entire feature and get something that looked done in 10 minutes. Fascinating — until the codebase grew and bugs in the generated code increased exponentially. Breaking features into small, testable chunks improved agent effectiveness significantly.

𝟯. Model selection
I burned two days on iteration cycles before realizing I was using a fast model for a task that needed deep reasoning. The fix: reasoning models for architecture, fast models for implementation. Auto-routing isn't there yet.

𝟰. Prompt quality
The most underestimated lever. Five minutes structuring a prompt saves 30–45 minutes of back-and-forth. Every single time.

The tooling is here. The craft of using it well is still emerging.

#SoftwareEngineering #CodingAgents #EngineeringLeadership #DeveloperProductivity
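To make point 𝟭 concrete, here is a minimal sketch of what such a file might contain. The project layout, stack, and rules below are invented for illustration, not taken from the post:

```markdown
# CLAUDE.md: project context for the coding agent

## Architecture
- Monorepo: `api/` (Go service), `web/` (TypeScript/React front end)
- All database access goes through `api/internal/store`; never query directly from handlers

## Conventions
- Go: table-driven tests, error wrapping with `fmt.Errorf("...: %w", err)`
- TypeScript: named exports only, no default exports

## Constraints
- Do not modify the database schema without asking first
- Keep each change small enough to review in one sitting
```

The value is less in any single rule and more in removing the agent's need to guess: every convention written down is one it no longer has to invent.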

Shariq Hashmi, about your last point: one issue I see with prompting right now is that the model gives no indication of whether it has understood exactly what you were trying to tell it. Prompting today feels a lot like coding in the 1980s without IDEs; you were coding blind. Only once your code compiled could you be sure it was at least syntactically right, and if there was an error you could usually figure out where you went wrong in a reasonable time frame. With prompts, for large features, it can be very hard to reason about the correctness of the generated code, i.e. whether it does exactly what it was supposed to, neither more nor less. And once you figure out what went wrong in the generated code comes the trickier part: figuring out what went wrong with the prompt.

All four of these compound around the same root problem: agents don't carry state between sessions. CLAUDE.md helps, but it's static: it describes the project, not what you've actually tried and why. We've been logging Claude Code sessions to SQLite for exactly this reason. The prompt-quality point especially: you write a great prompt, it works, and two weeks later you're reconstructing it from memory. Capturing that reasoning as it happens changes the iteration loop significantly.
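The session-logging idea above can be sketched in a few lines of Python with the standard library's `sqlite3`. The table name and fields here are assumptions for illustration, not the commenter's actual schema:

```python
import datetime
import sqlite3

# Hypothetical schema: one row per agent session, recording the prompt,
# the outcome, and the reasoning you'd otherwise reconstruct from memory.
def init_db(path="sessions.db"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS sessions (
            id      INTEGER PRIMARY KEY AUTOINCREMENT,
            ts      TEXT NOT NULL,
            prompt  TEXT NOT NULL,
            outcome TEXT,   -- e.g. "worked", "failed", "partial"
            notes   TEXT    -- why it worked / what was tried
        )""")
    conn.commit()
    return conn

def log_session(conn, prompt, outcome, notes=""):
    conn.execute(
        "INSERT INTO sessions (ts, prompt, outcome, notes) VALUES (?, ?, ?, ?)",
        (datetime.datetime.now(datetime.timezone.utc).isoformat(),
         prompt, outcome, notes),
    )
    conn.commit()

# In-memory database for the example; use a file path to persist across sessions.
conn = init_db(":memory:")
log_session(conn, "Refactor auth module; only touch src/auth/", "worked",
            "scoping to one directory avoided schema drift")
rows = conn.execute("SELECT prompt, outcome FROM sessions").fetchall()
print(rows)
```

Querying this log before writing a new prompt is what closes the loop: prompts that worked become reusable artifacts instead of memories.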

How deep are you going on task decomposition — is this just breaking features into user stories, or are you actively constraining what the agent can change in each prompt (like "only modify this file, don't touch database schema")? Curious if you're finding the agent gets more reliable as you tighten the scope, or if it's more about you being able to validate each piece faster.

Dipesh Bhakat

Sonetel · 8K followers

4w

Wonderful post. Another important point worth adding is the need to develop skills for different types of work and apply them in parallel, working collaboratively as a team within the Claude CLI. This multi-skill, parallel workflow model is the foundational building block of Claude CoWork, which was recently commercialized.

Trivikram Potluri

Willspired Solutions · 8K followers

3w

I think the whole exercise improves our clarity of purpose and organization of work. It's much like the coordination challenges teams face.

Shon Shah

Microsoft · 1K followers

4w

Good observations, Shariq Hashmi. Thank you for sharing! Token consumption would make a good addition to your next set of observations. 😊

Charanjeet Kaur

Senior Engineering Manager at Microsoft

4w

Useful post. Thank you for sharing.

Mohd Anwar Jamal Faiz

Expedia Group · 736 followers

3w

Good to read your insights. And, yeah! The last line is worth taking note of.
