🚀 Day 2 of Learning & Implementing RAG: Bringing the Data to Life!

Yesterday, I built the base data ingestion pipeline. Today was all about bridging the gap between retrieved vectors and real-time generation. Day 2 is officially complete, and our RAG system is fully alive! 🧠✨

Here is the progress report:

1️⃣ The Generator (Groq + Llama 3.1): Integrated the Groq API running Llama 3.1-8B-Instant. By passing our retrieved context chunks from Qdrant directly into the LLM prompt, we're getting hyper-accurate, context-grounded answers.

2️⃣ Server-Sent Events (SSE) Streaming: Static, slow REST responses are out. I built SSE streaming on the backend! It works in two phases:
👉 Phase A: Instantly stream the context citation references to the client so the user sees source documents immediately.
👉 Phase B: Stream the synthesized answer token-by-token for that ultra-smooth, real-time typing interface.

3️⃣ Production-Grade Validation: Added Zod schema validation to secure and type-check the request payloads.

4️⃣ Client-Side Security (Flutter): Integrated secure device storage (`secure_storage.dart`) to store API credentials safely on the app.

💡 What I learned today: Streaming responses isn't just a UI detail. It completely changes the user experience by reducing perceived latency. Designing SSE connections between a TypeScript backend and a Flutter client takes some parsing care, but the real-time UX is incredibly satisfying!

Day 3: Parsing the live stream on the frontend and building an interactive dashboard.

How do you handle real-time streaming connections in your AI apps? Let's discuss in the comments! 👇

#RAG #ArtificialIntelligence #TypeScript #NodeJS #Flutter #SSE #Streaming #LLM #Groq #Llama3 #BuildInPublic #SoftwareEngineering
sathish G’s Post
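The two-phase SSE flow in the post above could be sketched roughly as follows. This is a minimal illustration, not the author's code: the `Citation` shape, the event names (`citations`, `token`, `done`) and both function names are assumptions, and the wiring to an HTTP response and to the Groq stream is left out.

```typescript
// Hypothetical payload shape: the post does not show its actual types.
interface Citation {
  id: string;
  source: string;
}

// Format one Server-Sent Events frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Phase A: a single frame carrying all citations, emitted before any answer text.
// Phase B: one frame per generated token, followed by a "done" marker.
function twoPhaseFrames(citations: Citation[], tokens: string[]): string[] {
  const frames: string[] = [sseFrame("citations", citations)];
  for (const token of tokens) {
    frames.push(sseFrame("token", { text: token }));
  }
  frames.push(sseFrame("done", {}));
  return frames;
}
```

In a real server each frame would be written to the response as soon as it is available (with a `Content-Type: text/event-stream` header) rather than collected into an array; the array form just makes the ordering easy to inspect.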
More Relevant Posts
-
𝗔𝗴𝗻𝗼𝘀𝘁𝗶𝗰𝗨𝗜 𝗻𝗼𝘄 𝗮𝗿𝗺𝘀 𝘆𝗼𝘂𝗿 𝗳𝗮𝘃𝗼𝗿𝗶𝘁𝗲 𝗟𝗟𝗠 𝘄𝗶𝘁𝗵 𝘆𝗼𝘂𝗿 𝗲𝗻𝘁𝗶𝗿𝗲 𝗨𝗜 𝘀𝘁𝗮𝗰𝗸 Run ag context once and your AI agent of choice knows every component in your project: its prop types, import paths, and layout blueprints. No hallucinated APIs. No re-teaching your stack each session. ⎈ 𝘁𝗵𝗿𝗲𝗲 𝗰𝗼𝗺𝗺𝗮𝗻𝗱𝘀, 𝘇𝗲𝗿𝗼 𝗵𝗮𝗹𝗹𝘂𝗰𝗶𝗻𝗮𝘁𝗶𝗼𝗻𝘀 npx agnosticui-cli init copies the component library into your project as readable source. npx agnosticui-cli add gets you the specific components you need. npx agnosticui-cli context generates an Agentic Intent context file and any AI coding assistant (Claude, Copilot, Cursor, Windsurf, Gemini, Codex) now has exact knowledge of your stack. Three commands. Done. ⎔ 𝘁𝗵𝗲 𝗔𝗚 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 𝗶𝘀 𝗮𝗴𝗲𝗻𝘁-𝗮𝗴𝗻𝗼𝘀𝘁𝗶𝗰 It auto-detects whichever AI tool you use and writes to the right file: CLAUDE.md for Claude Code, .cursor/rules/ for Cursor, .windsurfrules for Windsurf, AGENTS.md for Codex, and more. Same command, every agent, no configuration required. ◈ 𝗽𝗹𝗮𝘆𝗯𝗼𝗼𝗸 𝗿𝗲𝗰𝗶𝗽𝗲𝘀 𝗮𝗿𝗲 𝗶𝗻𝗷𝗲𝗰𝘁𝗲𝗱 𝗮𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗰𝗮𝗹𝗹𝘆 Install an AgnosticUI Playbook (Login, Dashboard, Grid, and more) and ag context detects it and injects its structural Intent Schema into the context file alongside your component types. Prompt "build me a dashboard" and your agent follows a deterministic blueprint instead of guessing. ▸ 𝗿𝗲𝗮𝗰𝘁, 𝘃𝘂𝗲, 𝗮𝗻𝗱 𝗹𝗶𝘁: 𝗮𝗻𝘆 𝗮𝗴𝗲𝗻𝘁 The AG Context Generator works across all three supported frameworks. The CLI is now at alpha.23 with the ag context command and Agentic Intent pipeline stable and ready to use today. AgnosticUI Docs: https://www.agnosticui.com AgnosticUI GitHub: https://lnkd.in/gRDG9eMe #AgnosticUI #AIAgent #LLM #WebComponents #React #Vue #Lit #OpenSource #WebDevelopment
-
Shipped a small Cycles docs update today. Main idea: teams won’t always read 20 docs pages before integrating new infra. They’ll ask Claude, Codex, Cursor, or Windsurf to wire it in. So we added a canonical AI-coder path: one prompt, clear do/don’t rules, SDK examples, and a test that proves the model/tool call does not run when Cycles denies. The rule is simple: Cycles must run before the costly or risky action. Otherwise it’s just logging. https://lnkd.in/efqwXTtP
-
Documenting a 10-package React library is harder than building one. Tour Kit spans 150+ exported APIs across 10 packages, each with its own hooks, components, and TypeScript types. The core alone has 12 hooks and 30+ type exports. We benchmarked the getting-started path: 7 minutes in Vite, 9 minutes in Next.js. That "time to first working tour" metric has been the most useful quality signal for our documentation. Three patterns that made monorepo docs navigable: 1. Unified search across all packages (Orama, client-side, zero API calls) 2. 200+ cross-package links connecting 60+ doc pages 3. Package-scoped navigation mirroring the repo structure We also generate /llms.txt files so AI tools give accurate answers about our API. 88% of companies now use AI in documentation workflows (McKinsey Q4 2025). Making docs machine-readable isn't optional anymore. Full documentation hub: https://lnkd.in/gzKtkQmw #react #typescript #opensource #webdevelopment #documentation
-
Following my agentic AI course by Ed Donner, I have created another fun project: a local MVP project-management web app with authentication, a persistent Kanban board, and an AI assistant sidebar.

Users can sign in, manage one board, rename/add/delete columns, and create/edit/move/delete cards with drag-and-drop support. Board data is persisted per user in SQLite, and AI-assisted commands can be applied to the board through backend action handling. The AI path supports OpenRouter connectivity and a deterministic parser fallback mode when external AI is unavailable.

It’s always fascinating to see a project come to life from A to Z right before our eyes. I ask the AI a lot of questions to understand specific solutions and technical choices. Here, I move step by step after defining the project's architecture and the agent's role.

In the coming months, I’ll be adding data science projects, my former mini 3D game, and likely the outcome of the GenAI hackathon.

Frontend Tech
- Next.js
- React
- TypeScript
- Tailwind CSS
- @dnd-kit (drag and drop)
- Vitest + Testing Library

Backend Tech
- Python
- FastAPI
- Uvicorn
- SQLite
- httpx
- python-dotenv
- Pytest

GitHub link: https://lnkd.in/dAuvGXUf

#agenticAI #AI
-
🧠 What happens when AI-assisted development moves beyond simple CRUD applications? In this article, we explore the process of building a full-stack React planning application with Lovable AI and DHTMLX React Gantt. The project includes complex UI behavior such as: • drag-and-drop scheduling, • task dependencies, • multi-project timelines, • and backend synchronization with Supabase. The article also highlights an important reality of AI-assisted engineering: once applications become stateful and interaction-heavy, reliable delivery depends less on prompts and more on context management, architectural constraints, and validation workflows. For teams experimenting with AI-driven development, this provides a practical look at both the opportunities and the limitations of current tooling. 👉 https://lnkd.in/diDDA4Zn #ReactJS #AIassistedDevelopment #FrontendEngineering #GanttChart #Supabase #SoftwareEngineering
-
𝗗𝗶𝘁𝗰𝗵𝗶𝗻𝗴 𝘁𝗵𝗲 𝗰𝗵𝗮𝘁𝗯𝗼𝘁 𝘄𝗿𝗮𝗽𝗽𝗲𝗿. 𝗞𝗼𝘇𝗮𝗸𝗘𝘆𝗲 𝗢𝗦 𝗶𝘀 𝗻𝗼𝘄 𝗮 𝗱𝗶𝘀𝘁𝗿𝗶𝗯𝘂𝘁𝗲𝗱, 𝗺𝘂𝗹𝘁𝗶-𝗮𝗴𝗲𝗻𝘁 𝘄𝗼𝗿𝗸𝘀𝗽𝗮𝗰𝗲 (𝗣𝗵𝗮𝘀𝗲 𝟭𝟱.𝟮). The exact engineering debt I cleared in this sprint to make concurrent AI streams stable: ⚡ 𝗦𝗦𝗘 𝗗𝗲𝗺𝘂𝗹𝘁𝗶𝗽𝗹𝗲𝘅𝗶𝗻𝗴: To prevent three concurrent agent streams (Chef, Auditor, Architect) from locking the Vue Virtual DOM, I wrote a custom client that demultiplexes chunks directly into isolated shallowRef buffers. High-frequency updates route directly to hardware-accelerated CSS animations. 🧠 𝗧𝗿𝗶-𝗦𝘁𝗼𝗿𝗲 𝗠𝗲𝗺𝗼𝗿𝘆 & `/𝗸𝗶𝗻𝗲𝗰`: Session context is no longer a flat string. Triggering the `/kinec` wrap-up command kicks off FastAPI BackgroundTasks to extract behavioral traits and index them into ChromaDB and an associative NetworkX graph. 📍 𝗨𝗜 𝗖𝗼𝗼𝗿𝗱𝗶𝗻𝗮𝘁𝗲 𝗣𝗲𝗿𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝗲: Fixed layout amnesia. Widget states, absolute positions (useDraggable.js), and FSM engine logic sync with PostgreSQL in real-time and hydrate strictly from localStorage on initialization. The underlying repository architecture has scaled to 388 nodes and 558 edges, validated via an automated CLI codebase audit. 🔗 Full architecture and multi-agent HUD logic are live on GitHub (link in bio). #KozakEyeOS #SystemArchitecture #AppliedAI #MultiAgentSystems #Python #FastAPI #Vue3 #GraphRAG #SSEStreaming #WebDevelopment #BuildWithGemini
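The demultiplexing idea can be illustrated with a small sketch. It assumes each chunk arrives tagged with the agent that produced it; the `AgentChunk` shape and the `demultiplex` name are hypothetical, and plain strings stand in for the Vue `shallowRef` buffers mentioned in the post.

```typescript
// Hypothetical chunk shape: the post does not show its wire format.
interface AgentChunk {
  agent: string; // e.g. "Chef", "Auditor", "Architect"
  text: string;
}

// Route each chunk to the buffer owned by its agent, so an update for one
// agent never touches the text accumulated for another.
function demultiplex(chunks: AgentChunk[]): Record<string, string> {
  const buffers: Record<string, string> = {};
  for (const chunk of chunks) {
    buffers[chunk.agent] = (buffers[chunk.agent] ?? "") + chunk.text;
  }
  return buffers;
}
```

Keeping one buffer per agent means a UI layer can subscribe to a single agent's text and ignore updates from the others, which is presumably the point of the isolated buffers described above.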
-
I had a problem. I love Claude Code. The vast plugin ecosystem, hooks, the `/usage` panel, the spinner that says "Canoodling…" while it greps your repo. The whole UX is dialed in. I also love OpenCode. It's the open-source coding agent that talks to 75+ providers via the Vercel AI SDK - Anthropic, OpenAI, Google, Bedrock, OpenRouter, local llama.cpp, anything OpenAI-shaped. No vendor lock-in. MIT license. But every time I jumped between them, I missed half the experience. Claude Code's polish OR OpenCode's freedom. Never both. So I created the love child of both. Introducing OpenCode X. What it does on top of upstream OpenCode: → Reads your existing ~/.claude/settings.json hooks and ~/.claude/plugins/installed_plugins.json. Same events, same env vars. Zero migration. Run your favorite claude code plugins (caveman, rtk, context-mode etc) natively. → Spinner verbs, /usage cost panel, push-to-background chord - the Claude Code UX bits that make the TUI feel alive. → Tokens consumed for every tool call made transparent on the TUI. → Goal system: /goal <objective> and the agent loops autonomously, calls goal_complete with evidence when done. Configurable turn + token cap. → Tool output compression: a cheap model pre-compresses big tool outputs (3 templates: extract / summarize / filter) before they hit your expensive cloud model. 30-60% token savings on bash-heavy work. → Persistent + session memory. Cache stability via stable-prefix prompt split. 3-tier context safety net. → Doom loop detection, safe parallel tool calls within single LLM round-trip. → MIT, no telemetry, bring your own keys. See the full comparison here: https://lnkd.in/dBiKf35F Would love feedback from anyone running multi-provider setups, especially if you've hit the "Claude Code is great but Anthropic-only" wall. #opensource #developertools #opencode #claudecode #codingagents
-
Building a production-grade LLM service from scratch: Days 2 & 3 🧵

This week I moved past AI demos and started treating LLM integration as real infrastructure. Two endpoints, and a lot of decisions about what "production-ready" actually requires.

Day 2 - A properly structured FastAPI service

Wrapped the Gemini API in a /chat endpoint, with the focus on doing it correctly rather than just getting output:
→ Async-first design (async def + await) so the server handles concurrent load without blocking
→ Pydantic models validating every request and response, so invalid input never reaches business logic
→ pydantic-settings for configuration - secrets in environment variables, never hardcoded
→ A clean, maintainable layout: app/, config.py, routes/, services/, schemas/

The endpoint returning a response was the easy part. The structure that makes it maintainable and testable was the real work.

Day 3 - Streaming, and handling failure gracefully

An 8-second wait for a full response feels broken; streaming the same response feels responsive - even though total latency is identical. That UX gap is worth engineering for.

Built /chat/stream using Server-Sent Events and FastAPI's StreamingResponse, with three event types:
→ delta - text chunks as they're generated
→ usage - token counts on completion
→ error - upstream failures surfaced cleanly within the stream

I also tested the failure path most demos ignore: what happens when a client disconnects mid-stream? Left unhandled, the server keeps generating billable tokens for a client that's no longer listening. The fix - if await request.is_disconnected(): break - propagates cancellation down to the SDK's connection to Google, so ungenerated tokens are never billed. At scale (100k requests/day, ~10% disconnect rate, 500-token responses), that's a meaningful cost saving.

Finally, I built a lightweight JS demo to experience the result from the user's side - a useful reminder that latency is as much about perception as raw speed.

Two days, two endpoints, and a clearer sense of the gap between a working prototype and a production service.

#AI #MachineLearning #Python #FastAPI #LLM #SoftwareEngineering #BackendDevelopment
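On the client side, the three event types described above (delta, usage, error) could be consumed along these lines. This is a simplified sketch, not the author's demo: the payload fields `text`, `total_tokens` and `message` are assumptions, and the parser handles only `event:` and `data:` lines on a complete, already-buffered string rather than a live stream.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Split raw SSE text into events. Frames are separated by a blank line;
// multiple "data:" lines inside one frame are joined with "\n".
function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split("\n\n")) {
    let event = "message"; // default event type when no "event:" line is present
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.indexOf("event:") === 0) {
        event = line.slice(6).trim();
      } else if (line.indexOf("data:") === 0) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return events;
}

interface StreamState {
  text: string;
  totalTokens: number | null;
  error: string | null;
}

// Fold events into UI state. Payload field names here are hypothetical.
function applyEvents(events: SseEvent[]): StreamState {
  const state: StreamState = { text: "", totalTokens: null, error: null };
  for (const e of events) {
    if (e.event === "delta") {
      state.text += JSON.parse(e.data).text;
    } else if (e.event === "usage") {
      state.totalTokens = JSON.parse(e.data).total_tokens;
    } else if (e.event === "error") {
      state.error = JSON.parse(e.data).message;
      break; // stop at the first upstream failure
    }
  }
  return state;
}
```

Surfacing errors as an in-stream event, as the post describes, lets the client keep whatever text already arrived and show the failure next to it instead of discarding the partial answer.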
-
🚀 Built an AI-Powered Quiz Generator using Bolt.new, Supabase, Dify AI, and React! Over the past few days, I designed and developed a smart quiz generation platform that can automatically create interactive quizzes for students. 🔹 Features: • AI-generated MCQs with explanations • Instant answer validation and scoring • Personalized feedback system • Dynamic quiz rendering • PDF-ready quiz export • Responsive UI for desktop and mobile 🔹 Tech Stack: • React + TypeScript • Tailwind CSS • Supabase Edge Functions • Dify AI Workflow API • Ngrok for local API tunneling • Bolt.new for rapid development 🔹 Challenges Solved: • Fixed JSON parsing/rendering issues • Corrected answer validation mismatches • Handled API 500 and ngrok connectivity errors • Improved UI responsiveness and quiz logic • Added detailed feedback and answer review sections This project helped me gain hands-on experience in: AI integration, frontend architecture, API debugging, edge functions, and full-stack workflow design. #AI #React #TypeScript #Supabase #Dify #BoltNew #WebDevelopment #EdTech #FrontendDevelopment #FullStackDevelopment
-
Everyone talks about RAG, so I decided to build the entire pipeline myself to understand what actually happens under the hood.

I built Kortex AI, a full-stack app where users can upload a PDF and chat with it intelligently.

⚙️ What I implemented:
• page-aware PDF extraction
• semantic recursive chunking with overlap
• embedding generation using HuggingFace
• vector similarity retrieval using Pinecone
• grounded response generation using Groq
• token budget handling for cost control
• separate modes for document Q&A and general AI chat

🛠️ Tech stack:
• Next.js 16
• TypeScript
• Node.js + Express
• Pinecone
• HuggingFace
• Groq

💡 Biggest learnings:
• chunk size heavily affects retrieval quality
• metadata makes RAG systems debuggable
• similarity scores should not be passed directly into prompts
• building primitives first makes frameworks much easier to understand

Building this manually helped me understand RAG systems far more deeply than just calling abstractions.

💻 GitHub: https://lnkd.in/dtUsEa4T

Here’s the demo 🚀

#GenAI #RAG #LLM #AIEngineering #NextJS #TypeScript #Pinecone #BuildInPublic
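To make the "chunking with overlap" step concrete, here is a deliberately simpler stand-in: a fixed-size sliding window over characters. It is not the semantic recursive chunker the post describes, and the function name and parameters are hypothetical; it only shows how overlap makes neighbouring chunks share context.

```typescript
// Sliding-window chunker: each chunk holds up to `size` characters and
// repeats the last `overlap` characters of the previous chunk.
function chunkWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window already reached the end
  }
  return chunks;
}
```

A larger `size` gives each chunk more context but makes retrieval coarser, which is one way to read the post's note that chunk size heavily affects retrieval quality.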