# intelligent-document-processing

Here are 18 public repositories matching this topic...

yigitkonur / llm-based-ocr

High-accuracy PDF-to-Markdown OCR API using LLMs with vision capabilities. Features parallel processing, batching, and auto-retry logic for scalable extraction.

table-extraction pymupdf document-extraction azure-openai intelligent-document-processing gpt4-vision rag-pipeline vision-ocr complex-layout-analysis batch-ocr text-digitization

Updated Nov 29, 2025
Python

formkiq / formkiq-core

Open-source document management platform leveraging AWS managed services. RESTful API for document storage, processing, full-text search, and metadata management. Multi-tenant serverless architecture with auto-scaling... deployed entirely in your AWS account.

aws ocr serverless headless cloud-storage document-database amazon-web-services dms document-management optical-character-recognition document-processing document-management-system document-api document-apis intelligent-document-processing document-layer

Updated Dec 1, 2025
Java

awslabs / rhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

multi-modal document-processing generative-ai intelligent-document-processing amazon-bedrock

Updated Nov 18, 2025
Python

Addepto / graph_builder

Open-source toolkit to extract structured knowledge graphs from documents and tables — power analytics, digital twins, and AI-driven assistants.

cad graph-database graph-visualization graph-api semantic-search enterprise-knowledge-graph document-processing digital-twin knowledge-graph-construction fastapi pdf-table-extraction knowledge-graphs graph-extraction intelligent-document-processing intelligent-document-recognition rag-chatbot intelligent-document-processor

Updated Sep 15, 2025
Python

aws-samples / sample-document-processing-with-amazon-bedrock-data-automation

This repository contains examples for customers to get started using Amazon Bedrock Data Automation. The samples focus mainly on document processing use cases.

bedrock idp document-processing bda generative-ai intelligent-document-processing amazon-bedrock amazon-bedrock-agents amazon-bedrock-data-automation

Updated Nov 5, 2025
Jupyter Notebook

BABIN-JOE / NeuroDoc

NeuroDoc is a powerful AI-based offline document summarization tool that leverages OCR and NLP to intelligently analyze PDFs and generate structured summaries. Built using Flask, this tool is designed to run completely offline and supports both text-based and scanned/image-based documents.

python nlp flask ocr ai tesseract text-summarization document-classification document-processing huggingface paddleocr easyocr pdf-summarization intelligent-document-processing bart-model ocr-tool

Updated Sep 2, 2025
Python

aws-samples / sample-serverless-bedrock-idp

This open-source project provides a serverless solution for automated identity document processing (IDP) using Amazon Bedrock's Claude-3 model. The solution creates an end-to-end pipeline that processes identity documents, particularly optimized for birth certificates, by automatically extracting relevant information.

aws lambda amazon terraform sam bedrock sample-code llm intelligent-document-processing aws-bedrock

Updated Oct 28, 2025
HCL

GrooperGuru / GrooperCSS

Boilerplate CSS that can be used with any Grooper DataModel

css intelligent-document-processing

Updated Mar 12, 2025
CSS

paulsamuel-w-e / Multi-Modal-Government-ID-Classification

AI-powered Gov. ID classifier using OCR, BERT, ResNet, and LayoutLMv3 for Aadhaar, PAN, Passport, and other scanned IDs.

ocr ai computer-vision deep-learning pytorch document-classification resnet bert multimodal fastapi streamlit windows-executable paddleocr layoutlmv3 intelligent-document-processing id-recognition

Updated Sep 10, 2025
Python

rahuldongre-us / idp-bedrock

An end-to-end serverless pipeline for Intelligent Document Processing (IDP) using Amazon Bedrock and Anthropic Claude 3 Sonnet. This project extracts structured data from scanned documents (e.g., PDFs, forms, invoices) using GenAI models, and stores results in a scalable cloud-native architecture.

aws-lambda terraform python3 artificial-intelligence serverless-framework cloud-architecture prompt-engineering generative-ai intelligent-document-processing macine-learning amazon-bedrock claude-3

Updated May 22, 2025
Python

SimplePDF / pdf-ai-analyzer-with-robocorp

Leveraging the Robocorp integration to analyse customer feedback

automation ai pdf-document-processor pdf-editor intelligent-document-processing

Updated Aug 27, 2023
Python

tanyajain1207 / AdobeHackathon_1b

Project Repository for Problem 1(b) of Adobe Hackathon 2025

multilingual python ml language-model intelligent-document-processing persona-analysis persona-driven-document-extraction

Updated Jul 28, 2025
Python

krishachikka / Intelligent_Document_Processing

SealSure is an AI-powered tool for real-time document validation and forgery detection. Built with MERN, FastAPI, and OCR/NLP models, it helps extract, analyze, and verify data from scanned or image-based documents efficiently.

python nlp ocr ai deep-learning mern flask-api forgery-detection document-validation intelligent-document-processing sealsure

Updated Jun 10, 2025
JavaScript

aws-samples / sample-aws-idp-pipeline

End-to-end Intelligent Document Processing (IDP) pipeline using Amazon Bedrock, OpenSearch, Lambda, LangGraph Agents, and Step Functions. Supports multimodal document analysis for PDFs, images, videos, and audio.

bedrock strands idp opensearch bda intelligent-document-processing

Updated Nov 28, 2025
Python

M-Husnain-Ali / Cognivia-AI

Cognivia AI is a powerful AI-powered PDF search and question-answering system built with LangChain, Pinecone Vector Store, OpenAI, and Supabase. Upload PDFs, ask questions, and get intelligent answers with persistent conversation memory.

embedded-systems openai question-answering semantic-search text-embedding pinecone streamli rag pdf-search vector-database ai-chatbot supbase langchain intelligent-document-processing chat-with-pdf pdf-rag

Updated Sep 8, 2025
Python

HK-Transfield / terraform-aws-gen-ai-idp

Intelligent document processing (IDP) with AWS generative AI services to automate information extraction from documents of different types and formats, without the need for machine learning skills.

aws machine-learning artificial-intelligence intelligent-document-processing

Updated Mar 6, 2025
HCL

windson / bda-usecases

Amazon Bedrock Data Automation use cases

bedrock idp amazon-s3 resume-parsing intelligent-document-processing amazon-bedrock amazon-bedrock-data-automation bedrock-data-automation amazon-bda

Updated Sep 25, 2025
Shell

harshman7 / insight-agent-idp

AI-powered Intelligent Document Processing (IDP) system with RAG, anomaly detection, and natural language insights. Local, zero-cost alternative to AWS Textract + Bedrock.

python ocr postgresql idp anomaly-detection faiss rag expense-tracking fastapi streamlit ai-agent llm intelligent-document-processing ollama document-analytics

Updated Nov 26, 2025
Python
