Best performance settings #2516
Hi @piotrkandziora! I'm Dosu and I’m helping the docling team. For best performance with Docling on AWS, focus on these settings and strategies:
If you want to leverage AWS Bedrock for VLM, Docling can route PDFs and images to VLM pipelines, but VLMs do not replace OCR for text recognition; OCR must be explicitly enabled. Timeouts and slow inference can result from inefficient batching or model loading, so tune the batch size and worker count for best results. Let me know if you want a sample config or have specific bottlenecks in your pipeline!
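One way to reason about worker count for a long PDF is to split its pages into contiguous ranges, one per worker. The sketch below is only an illustration of that idea: `split_page_ranges` is a hypothetical helper, not part of Docling, and the figures (140 pages, 8 workers) are taken from the question. How each range is handed to a converter, and whether running several converters in parallel helps on your instance, are assumptions you would need to verify, since every worker process loads its own models and competes for the same CPU and memory.

```python
from typing import List, Tuple


def split_page_ranges(total_pages: int, workers: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into contiguous, near-equal ranges.

    Returns at most `workers` tuples of (first_page, last_page),
    1-indexed and inclusive. Hypothetical helper for illustration.
    """
    if total_pages < 1 or workers < 1:
        raise ValueError("total_pages and workers must be positive")

    # Never create more chunks than there are pages.
    chunks = min(workers, total_pages)
    base, extra = divmod(total_pages, chunks)

    ranges: List[Tuple[int, int]] = []
    start = 1
    for i in range(chunks):
        # The first `extra` chunks take one additional page each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    # Figures from the question: a ~140-page PDF on an 8-vCPU instance.
    for first, last in split_page_ranges(140, 8):
        print(f"pages {first}-{last}")
```

Fewer, larger ranges reduce the per-worker model-loading overhead, while more ranges improve parallelism; timing a few settings on a representative document is the safest way to choose.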
Hello,
What performance settings do you recommend for analyzing documents?
I am currently experimenting on a single AWS EC2 instance (m5.2xlarge, 8 vCPUs, 32 GB RAM, dockerized Docling with num_threads=8 set in the acceleration options; another instance with GPU support is also possible for me). Currently I am parsing a PDF file (~140 pages) and it takes quite a long time. In the future I would like to analyze PDFs and Confluence pages.
My current experimental pipeline settings:
Since it's hosted on AWS, I am also thinking about leveraging Bedrock for the VLM.