When working with multiple LLM providers, managing prompts, and handling complex data flows — structure isn't a luxury, it's a necessity.

A well-organized architecture enables:
→ Collaboration between ML engineers and developers
→ Rapid experimentation with reproducibility
→ Consistent error handling, rate limiting, and logging
→ Clear separation of configuration (YAML) and logic (code)

𝗞𝗲𝘆 𝗖𝗼𝗺𝗽𝗼𝗻𝗲𝗻𝘁𝘀 𝗧𝗵𝗮𝘁 𝗗𝗿𝗶𝘃𝗲 𝗦𝘂𝗰𝗰𝗲𝘀𝘀
It’s not just about folder layout — it’s how components interact and scale together:
→ Centralized configuration using YAML files
→ A dedicated prompt engineering module with templates and few-shot examples
→ Properly sandboxed model clients with standardized interfaces
→ Utilities for caching, observability, and structured logging
→ Modular handlers for managing API calls and workflows

This setup can save teams countless hours in debugging, onboarding, and scaling real-world GenAI systems — whether you're building RAG pipelines, fine-tuning models, or developing agent-based architectures.

→ What’s your go-to project structure when working with LLMs or Generative AI systems? Let’s share ideas and learn from each other.
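The configuration/logic split described above can be sketched in a few lines. This is a minimal illustration, not a prescribed layout: the file names (`config/models.yaml`), the model name, and the few-shot template are all assumptions of mine.

```python
# Minimal sketch: config data separated from prompt-building logic.
# In a real project, CONFIG would come from yaml.safe_load(open("config/models.yaml")).
from string import Template

CONFIG = {
    "provider": "openai",      # which client module to instantiate (illustrative)
    "model": "gpt-4o-mini",    # hypothetical model name from config
    "temperature": 0.0,
}

# Prompt template with few-shot examples, kept out of application code.
PROMPT = Template(
    "Classify the sentiment of the text.\n"
    "$examples\n"
    "Text: $text\nSentiment:"
)

FEW_SHOT = [
    ("I love this!", "positive"),
    ("Terrible service.", "negative"),
]

def build_prompt(text: str) -> str:
    """Render the template with few-shot examples and the user's text."""
    examples = "\n".join(f"Text: {t}\nSentiment: {s}" for t, s in FEW_SHOT)
    return PROMPT.substitute(examples=examples, text=text)

print(build_prompt("The demo went fine."))
```

Because the templates and model settings live outside the code, swapping providers or iterating on prompts doesn't require touching the handler logic.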
Engineering Workflow Management Systems
-
Automation, AI workflow, or AI agent? To always 𝘬𝘯𝘰𝘸 𝘸𝘩𝘪𝘤𝘩 𝘰𝘯𝘦 𝘵𝘰 𝘣𝘶𝘪𝘭𝘥, follow this 𝘧𝘳𝘢𝘮𝘦𝘸𝘰𝘳𝘬:

Remember when I explained why many "𝘈𝘐 𝘢𝘨𝘦𝘯𝘵𝘴" shared on LinkedIn are actually 𝘈𝘐 𝘸𝘰𝘳𝘬𝘧𝘭𝘰𝘸𝘴 or 𝘢𝘶𝘵𝘰𝘮𝘢𝘵𝘪𝘰𝘯𝘴 in disguise? Turns out: understanding the difference is only partially helpful. The real challenge is knowing 𝘸𝘩𝘪𝘤𝘩 𝘴𝘰𝘭𝘶𝘵𝘪𝘰𝘯 𝘵𝘰 𝘣𝘶𝘪𝘭𝘥 𝘧𝘰𝘳 𝘺𝘰𝘶𝘳 𝘶𝘴𝘦 𝘤𝘢𝘴𝘦. So I built this framework to help you decide.

There are 6 key dimensions to consider, working in pairs:

𝐏𝐚𝐢𝐫 #1: 𝐃𝐞𝐜𝐢𝐬𝐢𝐨𝐧-𝐌𝐚𝐤𝐢𝐧𝐠 ↔️ 𝐇𝐮𝐦𝐚𝐧 𝐈𝐧𝐯𝐨𝐥𝐯𝐞𝐦𝐞𝐧𝐭
a.k.a. how decisions are made - and how much human intervention is required:
→ 𝘈𝘶𝘵𝘰𝘮𝘢𝘵𝘪𝘰𝘯: You make ALL decisions upfront when designing your automation, which means that no human intervention is needed afterwards.
→ 𝘈𝘐 𝘸𝘰𝘳𝘬𝘧𝘭𝘰𝘸: You set boundaries for the AI to operate within; humans occasionally review outputs or intervene when the system encounters edge cases.
→ 𝘈𝘐 𝘢𝘨𝘦𝘯𝘵: You set high-level goals, and AI determines its own path; this means humans need to provide ongoing feedback to ensure it makes the right decisions.

𝐏𝐚𝐢𝐫 #2: 𝐃𝐚𝐭𝐚 𝐒𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞 ↔️ 𝐀𝐝𝐚𝐩𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲
a.k.a. which type of data the system should process - and how adaptable it has to be:
→ 𝘈𝘶𝘵𝘰𝘮𝘢𝘵𝘪𝘰𝘯: Requires strictly predefined data formats with no deviation; breaks when encountering unexpected inputs and needs to be re-engineered when processes change.
→ 𝘈𝘐 𝘸𝘰𝘳𝘬𝘧𝘭𝘰𝘸: Handles mostly structured data with some variability allowed; can adjust to variations within defined parameters but needs guidance for significant changes.
→ 𝘈𝘐 𝘢𝘨𝘦𝘯𝘵: Processes diverse unstructured data across multiple sources with varying formats; independently adapts to different inputs and shifting environments without reprogramming.

𝐏𝐚𝐢𝐫 #3: 𝐑𝐞𝐥𝐢𝐚𝐛𝐢𝐥𝐢𝐭𝐲 ↔️ 𝐑𝐢𝐬𝐤 𝐓𝐨𝐥𝐞𝐫𝐚𝐧𝐜𝐞
a.k.a. how predictable the outcomes must be - and what level of risk is acceptable:
→ 𝘈𝘶𝘵𝘰𝘮𝘢𝘵𝘪𝘰𝘯: Delivers highly consistent, predictable results every time; ideal for mission-critical processes where errors cannot be tolerated and predictability is essential.
→ 𝘈𝘐 𝘸𝘰𝘳𝘬𝘧𝘭𝘰𝘸: Produces mostly reliable outcomes with occasional variations in edge cases; balances flexibility with guardrails to prevent major errors while allowing some adaptability.
→ 𝘈𝘐 𝘢𝘨𝘦𝘯𝘵: Creates outcomes that can vary significantly between iterations; optimized for scenarios where discovering novel approaches and adaptability outweigh the need for consistent results.

How to use this framework: Always 𝘴𝘵𝘢𝘳𝘵 𝘧𝘳𝘰𝘮 𝘵𝘩𝘦 𝘭𝘦𝘧𝘵 and move right only when necessary.
1. Start with automation
2. Move to AI workflows when you need more flexibility within guardrails
3. Only move to agents when you need high adaptability

Don’t fall for the AI agent hype - most processes can be automated without agents.
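The "start from the left, move right only when necessary" rule above can be written as a toy decision helper. The three boolean inputs are my own simplification of the three pairs, not part of the original framework:

```python
# Toy encoding of the framework's three dimension pairs.
# Inputs are simplified yes/no proxies for each pair (my own assumption).
def choose_solution(decisions_fixed_upfront: bool,
                    data_is_structured: bool,
                    needs_predictable_output: bool) -> str:
    """Start from the left (automation) and move right only when necessary."""
    if decisions_fixed_upfront and data_is_structured and needs_predictable_output:
        return "automation"
    if data_is_structured or needs_predictable_output:
        return "AI workflow"   # flexibility within guardrails
    return "AI agent"          # high adaptability, variable outcomes

print(choose_solution(True, True, True))    # automation
print(choose_solution(False, False, False)) # AI agent
```

The ordering of the checks mirrors the rule: agents are only the answer once both predictability and structure are off the table.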
-
If I had a system design interview tomorrow at Google, this is exactly how I’d approach it. (This framework helped me crack 3 FAANG+ companies in the past, including Amazon.)

Most engineers fail system design interviews not because they lack knowledge but because they lack structure. You don’t need to memorize 100 architectures. You need a structured and easy-to-apply approach that works every time. Here’s how I break it down:

1/ Clarify the Problem Before Writing Anything
- System design interviews aren’t about throwing buzzwords—they’re about trade-offs.
- Start with scoping → Are we designing just one feature or the entire system?
- Ask constraints upfront → How many users? Read/write ratio? Latency requirements?
- Define success criteria → What matters most? Scalability? Cost? Low latency?
Most candidates assume things and jump into solutions. I make sure I know what we’re solving before I even start.

2/ Define Functional & Non-Functional Requirements Clearly
- Functional: What features does the system need?
- Non-functional: What are the performance expectations?
- What’s the biggest technical challenge? (This helps guide the discussion.)
Example: If we’re designing YouTube, is the focus on video uploads, recommendations, or live streaming? Each has a different set of constraints.

3/ Estimate the Scale & Plan Capacity Like an Engineer
- Users per second? Requests per second?
- Storage needs? If we store 10MB per user and have 100M users, what does that mean?
- Throughput? Can a single database handle the load, or do we need sharding?
Most candidates throw random numbers. I do quick, back-of-the-envelope calculations to validate my assumptions.

4/ Break the System into Core Components (High-Level Design)
- Define the major building blocks → API Gateway, Load Balancer, Service Layers, Databases.
- Don’t overcomplicate. Simple and scalable always wins.
- Clearly define the interactions between services.

If I’m designing a messaging app, I break it down into:
— User Service (auth, profiles)
— Messaging Service (storing chats)
— Notification Service (real-time updates)
— Media Storage (for images, videos)
Each has different constraints, so I build around what’s most important.

Continued Here: https://lnkd.in/eiHQs-qT

P.S. If you’re preparing for tech interviews or appearing soon for one as a SWE, check out my book Awesome Tech Interviews. It will help you:
— Learn techniques to win behavioral interviews
— Learn DSA with a detailed 6-month roadmap
— Build your foundations of System Design
all in one place, along with 300+ free online resources.

Digital copy: https://lnkd.in/efc7u85w
Paperback (Available on Amazon internationally): https://lnkd.in/ePWCr74g
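The back-of-the-envelope numbers from step 3 take only a few lines to check. The storage figures (10MB per user, 100M users) are from the example above; the request-rate figures are my own illustrative assumptions:

```python
# Back-of-the-envelope capacity check for the 10MB/user, 100M-user example.
users = 100_000_000
storage_per_user_mb = 10
total_storage_tb = users * storage_per_user_mb / 1_000_000  # MB -> TB (decimal units)
print(f"storage: {total_storage_tb:.0f} TB (~1 PB)")

# Request rate: 10 requests/user/day is an assumed figure for illustration.
requests_per_user_per_day = 10
avg_rps = users * requests_per_user_per_day / 86_400  # seconds in a day
peak_rps = avg_rps * 3  # common rule of thumb: peak traffic is ~2-3x the average
print(f"avg RPS: {avg_rps:,.0f}, peak RPS: {peak_rps:,.0f}")
```

A petabyte of user data and tens of thousands of peak RPS immediately tells you a single database won't do, which is exactly the kind of conclusion the estimate is meant to surface before you draw any boxes.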
-
Power Automate Work Queues are not built for scale! That's a fact.

When you think about scalability in Power Automate, one thing that will definitely come to mind at some point is queues and workload management. You might be able to survive without them in some event-based transactional flows that only process a single item at a time, but whenever you process tasks in batches, or when RPA gets involved, you'll need queues.

Power Automate comes with Work Queues out of the box. And you would think that's your go-to queueing mechanism for scaling. After all, it's at scale that you really need those queues - to de-couple your flows and make them easier to maintain, support, and debug, as well as more robust and efficient. Queues are a must even at medium scale. Heck, we use them even in small-scale implementations.

But the surprising thing about Power Automate Work Queues is that they are not fit for high-scale implementations. And that is by design! The docs themselves (link in the comments) explicitly state that if you have high volumes or if you dequeue (pick up work items from the queue for processing) concurrently, you should either do it within moderate levels or use something else.

If you try to use Power Automate Work Queues for high-scale implementations (more than 5 concurrent dequeue operations, or hundreds/thousands of operations of any type involving the queues), you'll get in trouble. All sorts of issues can happen - your data may get duplicated, you may accidentally dequeue the same work item in multiple concurrent instances, or your flows might simply get throttled or even crash. This is because of the way they're built and the way they utilize Dataverse tables for storing work items and work queue metadata.

So, if you do want to scale, it's best to use an alternative. And, obviously, Microsoft wouldn't be Microsoft if they didn't have an alternative tool for that.

The docs themselves recommend Azure Service Bus Queues as a high-throughput queueing mechanism. Another alternative could be Azure Storage Queues, but that only makes sense if the individual work items in your queue can get large (lots of data or even documents) or when you expect your queue to grow beyond 80GB (which is possible in very large-scale implementations). Otherwise, Azure Service Bus Queues are absolutely perfect for very large volumes of small transactions. On top of that, they have some very advanced features for managing, tracking, auditing and otherwise handling your work items. And, of course, there's an existing connector in Power Automate to use them.

So, while I do love Power Automate Work Queues, I'll only use them in relatively small-scale implementations. For everything else - my queues will go to Azure. And so should yours.
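To make the duplicate-dequeue failure mode concrete, here is an in-memory sketch of why concurrent dequeue needs an atomic claim step — the kind of guarantee a broker like Azure Service Bus is designed around (e.g. its peek-lock receive mode). This is purely illustrative Python; it is not the Work Queues or Service Bus API.

```python
# Illustrative only: an in-memory queue where "claiming" a work item
# happens atomically under a lock, so concurrent workers can never
# pick up the same item twice. Without atomic claims, the duplicate
# processing described above becomes possible.
import threading

class WorkQueue:
    def __init__(self, items):
        self._items = list(items)
        self._lock = threading.Lock()

    def dequeue(self):
        # Check-and-remove must be one atomic operation; if a worker
        # read an item and removed it in two separate steps, another
        # worker could dequeue the same item in between.
        with self._lock:
            return self._items.pop(0) if self._items else None

q = WorkQueue(range(1000))
seen = []
seen_lock = threading.Lock()

def worker():
    while (item := q.dequeue()) is not None:
        with seen_lock:
            seen.append(item)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(seen), len(set(seen)))  # prints: 1000 1000 (no duplicates, no losses)
```

Eight concurrent workers drain the queue and every item is processed exactly once; the point of moving to Service Bus is getting this exactly-once-claim behavior from a managed broker at high volume instead of relying on Dataverse tables.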
-
Still, many of us get confused about choosing between LangChain and LlamaIndex.

LangChain specializes in workflow orchestration, making it ideal for complex multi-step processes that chain together multiple LLM operations. It excels in applications requiring tool/API integrations, agent-based systems with reasoning capabilities, and scenarios needing extensive prompt engineering. LangChain also provides frameworks for evaluation and comparison of different approaches.

LlamaIndex, on the other hand, focuses on document processing and data retrieval. Its strengths lie in handling complex document ingestion, advanced indexing of knowledge bases, and providing structured data access for LLMs. LlamaIndex is particularly valuable for customizing retrieval strategies, processing diverse document formats, and implementing query transformations and routing.

When deciding between them, consider your primary focus: choose LangChain if your project involves complex workflows requiring multiple integrated steps and tools working together in sequence. Select LlamaIndex if your application centers on document processing, knowledge base creation, and sophisticated data retrieval strategies. You can, in fact, use both, but that adds overhead and burden for your engineers. For many RAG projects, the choice depends on whether workflow orchestration or document processing capabilities are more critical to your specific implementation.

Build Your First RAG Application Using LlamaIndex: https://lnkd.in/g6iN7dmz
Here is my LangChain RAG tutorial for beginners: https://lnkd.in/gYYDdXwH
Here is my video on creating powerful Agentic RAG applications using LlamaIndex: https://lnkd.in/gAUmmaju
Here is my complete article on different LLM frameworks: https://lnkd.in/eZdxPGiR
-
The biggest myth in AI today? That tools like LLMs, CoPilots, MCPs, and Agents will do the engineering for you. They won’t — because AI is engineering. LLMs. MCP. Agents. They’re all just that — tools. Yet many organizations are spending an extraordinary amount of time comparing, evaluating, and switching between tools — while missing the real essence of AI transformation. The real differentiator isn’t the toolchain. It’s the engineering mindset behind how those tools are used. Most organizations miss that AI is an engineering discipline — not a collection of experiments. It demands the same rigor as any mature system: design, development, testing, validation, rollout, and continuous optimization. Don’t go by leaderboards — they’re tested to work in controlled benchmarks, not in real-world, multi-system environments where context, latency, data, and cost all collide. And don’t fall for the misconception that AI will replace engineers. That’s a narrative being set — but having worked with top LLMs and chatbots, one thing is clear: they often fail when confronted with real engineering. Their code lacks depth, structure, and holistic system thinking. Tools never replace real engineering. They amplify those who understand it. Invest in the core. Invest in robust engineering practices. Upskill your teams. This will be your foundation in building scalable, responsible, and future-ready AI systems. Because tools will change. Frameworks will evolve. But engineering excellence — that’s what endures #aiengineering #ai #leanagenticai
-
Enterprise AI teams struggle with agents that cannot reliably access and act on core enterprise systems and data, lack deep understanding of domain‑specific language and context, and sit on top of weak retrieval and grounding over internal knowledge, which leads to brittle answers and hallucinations. At the same time, organizations have poor observability, control and evaluation of AI behaviour, with limited guardrails, benchmarking and testing, and they find it hard to compose multiple agents and tools into robust workflows, so initiatives stall in “pilot purgatory” instead of scaling into production. Part of the solution to these challenges is integrating domain‑aware models with enterprise systems and adding strong retrieval, monitoring, guardrails and orchestration for multi‑agent workflows.
-
You must know these 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗦𝘆𝘀𝘁𝗲𝗺 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄 𝗣𝗮𝘁𝘁𝗲𝗿𝗻𝘀 as an 𝗔𝗜 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿.

If you are building Agentic Systems in an Enterprise setting, you will soon discover that the simplest workflow patterns work the best and bring the most business value. At the end of last year Anthropic did a great job summarising the top patterns for these workflows, and they still hold strong. Let’s explore what they are and where each can be useful:

𝟭. 𝗣𝗿𝗼𝗺𝗽𝘁 𝗖𝗵𝗮𝗶𝗻𝗶𝗻𝗴: This pattern decomposes a complex task and tries to solve it in manageable pieces by chaining them together. The output of one LLM call becomes the input to another.
✅ In most cases such decomposition results in higher accuracy at the cost of latency.
ℹ️ In heavy production use cases, Prompt Chaining would be combined with the following patterns: a pattern replaces an LLM call node in the Prompt Chaining pattern.

𝟮. 𝗥𝗼𝘂𝘁𝗶𝗻𝗴: In this pattern, the input is classified into one of multiple potential paths and the appropriate one is taken.
✅ Useful when the workflow is complex and specific topology paths could be more efficiently solved by a specialized workflow.
ℹ️ Example: Agentic Chatbot - should I answer the question with RAG, or should I perform some actions that the user has prompted for?

𝟯. 𝗣𝗮𝗿𝗮𝗹𝗹𝗲𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻: The initial input is split into multiple queries to be passed to the LLM, then the answers are aggregated to produce the final answer.
✅ Useful when speed is important and multiple inputs can be processed in parallel without needing to wait for other outputs. Also useful when additional accuracy is required.
ℹ️ Example 1: Query rewrite in Agentic RAG to produce multiple different queries for majority voting. Improves accuracy.
ℹ️ Example 2: Multiple items are extracted from an invoice; all of them can be processed further in parallel for better speed.

𝟰. 𝗢𝗿𝗰𝗵𝗲𝘀𝘁𝗿𝗮𝘁𝗼𝗿: An orchestrator LLM dynamically breaks down tasks and delegates to other LLMs or sub-workflows.
✅ Useful when the system is complex and there is no clear hardcoded topology path to achieve the final result.
ℹ️ Example: Choice of datasets to be used in Agentic RAG.

𝟱. 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗼𝗿-𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗲𝗿: A generator LLM produces a result, then an evaluator LLM evaluates it and provides feedback for further improvement if necessary.
✅ Useful for tasks that require continuous refinement.
ℹ️ Example: A Deep Research Agent workflow where refinement of a report paragraph via continuous web search is required.

𝗧𝗶𝗽𝘀:
❗️ Before going for full-fledged Agents you should always try to solve a problem with the simpler Workflows described in the article.

What are the most complex workflows you have deployed to production? Let me know in the comments 👇

#LLM #AI #MachineLearning
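Patterns 1 (Prompt Chaining) and 2 (Routing) can be sketched in a few lines with a stubbed model call. `llm` here is a placeholder function standing in for a real provider client; its routing rule is a deliberately trivial assumption for the demo.

```python
# Stub standing in for a real LLM call; the "?" heuristic is purely
# for demonstration - a real router would classify with a model.
def llm(prompt: str) -> str:
    if prompt.startswith("route:"):
        return "rag" if "?" in prompt else "action"
    return f"answer({prompt})"

# 1. Prompt Chaining: the output of one call becomes the input to the next.
def chain(text: str) -> str:
    summary = llm(f"summarize: {text}")
    return llm(f"answer using: {summary}")

# 2. Routing: classify the input first, then take the appropriate path.
def route(user_input: str) -> str:
    path = llm(f"route: {user_input}")
    if path == "rag":
        return llm(f"rag: {user_input}")       # answer with retrieval
    return llm(f"act: {user_input}")           # perform the requested action

print(route("What is our refund policy?"))  # takes the RAG path
print(route("Cancel my order"))             # takes the action path
```

In a production system a chained step like `summarize` could itself be replaced by a routing or parallelization sub-workflow, which is exactly the composition noted under pattern 1 above.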
-
It is the next source of competitive edge. In the future, companies won’t win because of their tech stack, but because of how well they can orchestrate agents.

The tech stack used to be the differentiator when infrastructure and architecture weren’t standardised, leading to large capability gaps. Today, everyone has access to the same tools: APIs, cloud, shared infrastructure. Companies are looking for the next source of real advantage - and agents are emerging as that layer:
• Differentiation no longer comes from the technology a company owns, but from how effectively that technology is used to deliver outcomes.
• Enterprise agents navigate the stack, interpret what needs to happen, connect the necessary systems, and complete the work end-to-end.
• As a result, two companies with the same infrastructure can operate at completely different levels of speed, accuracy, and outcome - purely because one has agents that execute better.

That’s where the competitive gap reopens: not in the tools themselves, but in the quality of the execution built on top of them. Many companies already understand this, and the frontrunners are working out how to build, deploy, and scale agents across their organisation.

𝗕𝘂𝘁 𝘁𝗵𝗲𝗿𝗲 𝗮𝗿𝗲 𝗮 𝗳𝗲𝘄 𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗮𝗹 𝗯𝗮𝗿𝗿𝗶𝗲𝗿𝘀:
• Data sits in too many places and lacks consistent standards.
• APIs are uneven, outdated in parts, and not designed for the kind of end-to-end coordination agents require.
• Older platforms and newer services still don’t fit together cleanly, creating integration gaps.
• Ownership of core workflows is often unclear, making it hard to decide where agents should operate and who should be accountable for their outputs.
• Despite high oversight requirements (transparency, auditability, control), most organisations cannot demonstrate how agents will behave under real conditions.

Here are some 𝗯𝗲𝘀𝘁-𝗽𝗿𝗮𝗰𝘁𝗶𝗰𝗲 𝘄𝗮𝘆𝘀 to get started:
• Pick a workflow that has real business impact and a process that can be described end-to-end.
• Map how the workflow actually runs today, including exceptions.
• Stabilise the specific data sources and API calls the agent will rely on, ensuring they are consistent enough to support reliable execution.
• Assign clear ownership: someone responsible for the workflow, someone for the agent’s behaviour, and someone for the operational guardrails.
• Put basic controls in place early - logging, visibility into decisions, and clear escalation paths.
• Test the agent under real conditions and real volume.
• Expand the agent’s remit gradually.

Tomorrow’s winners are already building these capabilities today.

Opinions: my own, graphic source: BCG
𝐒𝐮𝐛𝐬𝐜𝐫𝐢𝐛𝐞 𝐭𝐨 𝐦𝐲 𝐧𝐞𝐰𝐬𝐥𝐞𝐭𝐭𝐞𝐫: https://lnkd.in/dkqhnxdg
-
The deeper our understanding of systems, the more wisely and skillfully we can effect sustainable change and improvement.

Way back in the 1940s, General Systems Theory showed us that systems could NOT be fully understood by breaking them apart and analyzing the pieces. Instead, systems had to be observed as wholes, seen in context, with attention to how the parts interacted, evolved, and influenced each other over time. This shift in thinking (from analysis to synthesis) changed everything. It taught us that organizations, supply chains, customer experiences, and even simple production lines are not collections of isolated parts. They are dynamic, interconnected living systems. And THIS perspective is what's needed to guide Lean thinking and Lean practices.

Lean is not just about cutting waste or speeding up production. At its core, Lean is about seeing the system - how value flows (or fails to flow) across people, processes, and technology. It’s about understanding that the performance of a system depends far more on the interactions between the parts than on the performance of any single part. When Lean asks us to "go to the Gemba", to the real place where work happens, it is inviting us to observe with curiosity, to understand and not judge or measure. And when Lean guides us to improve processes, it teaches us to create flow and pull systems instead of pushing work downstream blindly... and it teaches us to seek out the communication and collaboration practices that create or prevent flow and pull.

When Lean practitioners don't 'get' systems thinking, three major things happen:
1️⃣ They focus too much on local improvements. They optimize one department, one process, or one step but unknowingly hurt the system as a whole.
2️⃣ They treat symptoms, not causes. Without a systems view, people often chase the obvious issues (like bottlenecks or rework) without seeing the underlying system conditions that are creating those issues.
3️⃣ They miss the bigger opportunity. Lean isn't just about making tasks quicker, it's about redesigning how value flows across the organization. Without systems thinking, efforts stay tactical, fragmented, and superficial, and real transformation never happens.

Systems thinking reminds us:
👉 Optimizing one piece without regard to the whole can cause greater problems elsewhere.
👉 True improvement happens when we see the relationships and dependencies, not just the activities.
👉 To create sustainable change, we must first understand how the system behaves, not just how it is designed.

Why is it so hard for many organizations to think in systems, not silos? Is it anything to do with the people/leader traits highlighted below? Leave your thoughts in the comments and let's chat! 🙏