@rklasen commented Dec 30, 2025

This commit addresses the docling installation issues reported in #12187
by making docling a core dependency installed during image build rather than at runtime.

I realize we want to keep the images small, but without docling none of the CUDA/torch/ONNX packages get installed either. That means DeepDoc and MinerU can't use the GPU at all, even though docling is the only thing that is actually missing.

Changes:

  • Add docling>=2.58.0 to pyproject.toml dependencies
  • Update Dockerfile to use Python 3.12 consistently (was 3.11)
  • Remove ensure_docling() runtime installation from entrypoint.sh
  • Remove USE_DOCLING environment variable from .env

Benefits:

  • Fixes Python 3.12 pip/ensurepip missing module errors
  • Ensures torch is available for GPU-accelerated deepdoc processing
  • Simplifies deployment by eliminating conditional runtime installation
  • Maintains consistency with uv-based dependency management
  • Aligns with project's requires-python = ">=3.12,<3.15" specification
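
For reviewers who want to confirm these points inside a freshly built image, here is a minimal sketch of a sanity check. It is not part of this PR. The helper names are hypothetical, and `onnxruntime` is only my assumption for the importable name behind the "onnx" packages mentioned above. The version bounds are taken from the `requires-python` line quoted in this description.

```python
import importlib.util
import sys


def check_modules(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def version_supported(version=None):
    """Check a (major, minor) tuple against requires-python = ">=3.12,<3.15"."""
    if version is None:
        version = sys.version_info[:2]
    return (3, 12) <= tuple(version) < (3, 15)


def cuda_visible():
    """Return torch.cuda.is_available(), or None when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} "
          f"supported: {version_supported()}")
    # "onnxruntime" is an assumed module name, see the note above.
    for name, present in check_modules(["docling", "torch", "onnxruntime"]).items():
        print(f"{name}: {'installed' if present else 'missing'}")
    print(f"CUDA visible to torch: {cuda_visible()}")
```

Running this with the image's Python interpreter should show whether docling and torch were installed at build time. A `False` for CUDA can also just mean the container was started without GPU access, so it does not by itself indicate a packaging problem.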

Trade-offs:

  • Image size increases by ~2-3GB due to torch dependencies
  • Docling is now always included (no opt-out via USE_DOCLING)

Fixes #12187
Related to #12317

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Dec 30, 2025
Add build configuration to ragflow-cpu and ragflow-gpu services
to enable local image building with the updated dependencies.

This allows developers to build the image locally with the new
docling dependency and Python 3.12 changes.
@KevinHuSh KevinHuSh requested a review from buua436 December 31, 2025 02:24