High-accuracy PDF-to-Markdown OCR API using LLMs with vision capabilities. Features parallel processing, batching, and auto-retry logic for scalable extraction.
Updated Nov 29, 2025 - Python
Demystify RAG by building it from scratch. Local LLMs, no black boxes - real understanding of embeddings, vector search, retrieval, and context-augmented generation.
A minimal Agentic RAG built with LangGraph. Learn Retrieval-Augmented Agents in minutes.
A Python CLI to test, benchmark, and find the best RAG chunking strategy for your Markdown documents.
We believe that every SOTA result is only valid on its own dataset. RAGView provides a unified evaluation platform to benchmark different RAG methods on your specific data.
RAG boilerplate with semantic/propositional chunking, hybrid search (BM25 + dense), LLM reranking, query enhancement agents, CrewAI orchestration, Qdrant vector search, Redis/Mongo sessioning, Celery ingestion pipeline, Gradio UI, and an evaluation suite (Hit-Rate, MRR, hybrid configs).
Generate & Ship UI with minimal effort - Open Source Generative UI with natural language
Agent Fusion is a local RAG semantic search engine that gives AI agents instant access to your code and documentation (Markdown, Word, PDF). Query your codebase from code agents without hallucinations. Runs 100% locally, includes a lightweight embedding model, and offers optional multi-agent task orchestration. Deploy with a single JAR.
Modular full-stack ML project leveraging the Groq API, Streamlit, Supabase, JSON, SciPy, scikit-learn, Plotly & EmailJS, alongside the libraries NumPy, Pandas, Utils, OS, Base64, Re, Pillow & DateTime.
Quickest way to a production-grade RAG UI.
AI-powered mock interview platform using Next.js, Gemini AI, JSON, Drizzle, NeonDB, API routes and Clerk for dynamic questions, feedback & session recording, plus Dockerized & deployed microservices.
A Python library that simplifies RAG through abstraction.
An implementation of the Test Time Diffusion paper by Google, with some tweaks to run on a 24 GB GPU.
The Audited Context Generation (ACG) Protocol prevents AI hallucinations with a dual-layer system. The UGVP layer links every fact to a precise source for verification. The RSVP layer audits the AI's logical reasoning when combining facts. This creates a fully transparent, machine-auditable trail for both source and logical integrity.
A multi-provider AI coding agent with the persona of a Tech-Priest
Build a RAG preprocessing pipeline
Vectrain is a high-performance, modular, plug-and-play RAG pipeline that ingests data, generates vector embeddings, and stores them in vector databases for semantic search, recommendations, and analytics.
Search for a holiday and get destination advice from an LLM. Observability by Dynatrace.
An intelligent customer support system powered by LangGraph and LangChain that uses Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses to customer queries. Built with FastAPI, FAISS, and multi-stage validation for production-ready deployment.
Production-ready Chainlit RAG application with a Pinecone pipeline, offering all Groq and OpenAI models for chatting with your documents.