Skip to content

Automate document processing with an advanced AI-driven document analyzer for Paperless-ngx, leveraging OpenAI API, Ollama, DeepSeek-R1, Azure AI, and any OpenAI-compatible API. This tool intelligently analyzes, categorizes, and tags your documents—enhancing searchability, organization, and workflow efficiency.

License

Notifications You must be signed in to change notification settings

insionCEO/RAG-based-document-analyzer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

58 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Paperless-AI is an AI-powered extension for Paperless-ngx that brings automatic document classification, smart tagging, and semantic search using OpenAI-compatible APIs and Ollama.

It enables fully automated document workflows, contextual chat, and powerful customization — all via an intuitive web interface.

💡 Just ask:
“When did I sign my rental agreement?”
“What was the amount of the last electricity bill?”
“Which documents mention my health insurance?”

Powered by Retrieval-Augmented Generation (RAG), you can now search semantically across your full archive and get precise, natural language answers.


✨ Features

🔄 Automated Document Processing

  • Detects new documents in Paperless-ngx automatically
  • Analyzes content using OpenAI API, Ollama, and other compatible backends
  • Assigns title, tags, document type, and correspondent
  • Built-in support for:
    • Ollama (Mistral, Llama, Phi-3, Gemma-2)
    • OpenAI
    • DeepSeek.ai
    • OpenRouter.ai
    • Perplexity.ai
    • Together.ai
    • LiteLLM
    • VLLM
    • Fastchat
    • Gemini (Google)
    • ...and more!

🧠 RAG-Based AI Chat

  • Natural language document search and Q&A
  • Understands full document context (not just keywords)
  • Semantic memory powered by your own data
  • Fast, intelligent, privacy-friendly document queries
    RAG_CHAT_DEMO

⚙️ Manual Processing

  • Web interface for manual AI tagging
  • Useful when reviewing sensitive documents
  • Accessible via /manual

🧩 Smart Tagging & Rules

  • Define rules to limit which documents are processed
  • Disable prompts and apply tags automatically
  • Set custom output tags for tracked classification
    PPAI_SHOWCASE3

🚀 Installation

⚠️ First-time install: Restart the container after completing setup (API keys, preferences) to build RAG index.
🔁 Not required for updates.

📘 Installation Wiki


🐳 Docker Support

  • Health monitoring and auto-restart
  • Persistent volumes and graceful shutdown
  • Works out of the box with minimal setup

🔧 Local Development

# Install dependencies
npm install

# Start development/test mode
npm run test

🧭 Roadmap Highlights

  • ✅ Multi-AI model support
  • ✅ Multilingual document analysis
  • ✅ Tag rules and filters
  • ✅ Integrated document chat with RAG
  • ✅ Responsive web interface

🤝 Contributing

We welcome PRs and contributions!

# Fork, clone, then:
git checkout -b feature/YourFeature
# After changes:
git commit -m "Add YourFeature"
git push origin feature/YourFeature

Then open a Pull Request via GitHub.


🆘 Support & Community


📄 License

This project is licensed under the MIT License. See LICENSE for details.


About

Automate document processing with an advanced AI-driven document analyzer for Paperless-ngx, leveraging OpenAI API, Ollama, DeepSeek-R1, Azure AI, and any OpenAI-compatible API. This tool intelligently analyzes, categorizes, and tags your documents—enhancing searchability, organization, and workflow efficiency.

Topics

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •