Feat: Add image (JPG/PNG) OCR support to document parsing and extraction #1110

The problem

Maxun has a feature where users can upload a document and get back its content as text, HTML, or a list of links (soon also a summary and AI-based structured extraction). Right now this only works for PDFs. If you try to upload an image, like a scanned receipt saved as a JPG or PNG, it gets rejected outright because the upload check only lets PDFs through.

A lot of the documents people actually want text from are just photos or scans, not PDFs.

What needs to change

Add JPG and PNG as supported upload types and run OCR on them to pull out their text, the same way scanned PDFs are already handled. Maxun already has OCR built in for scanned PDFs (tesseract.js and a PaddleOCR-based tool), so this is mostly about reusing that pipeline for images directly.
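
One way the upload check could be widened is sketched below. This is only an illustration, not Maxun's actual upload code: the names `DocumentKind`, `detectDocumentKind`, and `needsOcrAlways` are hypothetical, and it assumes the server can read the first few bytes of the upload. It identifies PDF, JPEG, and PNG files by their leading signature bytes rather than by file extension, so a renamed file is still classified correctly.

```typescript
// Hypothetical names throughout; this is not Maxun's existing code.
type DocumentKind = "pdf" | "jpeg" | "png";

// Leading signature ("magic") bytes for each accepted format.
const SIGNATURES: ReadonlyArray<{ kind: DocumentKind; bytes: number[] }> = [
  { kind: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { kind: "jpeg", bytes: [0xff, 0xd8, 0xff] },
  { kind: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
];

// Returns the detected kind, or null if the upload should be rejected.
function detectDocumentKind(data: Uint8Array): DocumentKind | null {
  for (const { kind, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return kind;
    }
  }
  return null;
}

// Images have no text layer, so they would always go through OCR.
// Whether a PDF needs OCR is assumed to be decided by the existing pipeline.
function needsOcrAlways(kind: DocumentKind): boolean {
  return kind === "jpeg" || kind === "png";
}
```

With something like this in place, a JPG or PNG upload could be handed straight to the existing OCR step, while PDFs keep following their current path.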

What "done" looks like

A user uploads a .jpg or .png file instead of a PDF.

The system runs OCR on it and returns the extracted text in the same output formats already supported for PDFs (markdown/html/links, plus extraction).
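
As a rough sketch of what "same output formats" could mean for an image, the snippet below turns plain OCR text into markdown, HTML, and a list of links. Everything here is an assumption for illustration: the `ParsedOutput` shape and the `formatOcrText` helper are hypothetical, and the real pipeline presumably has its own formatters that should be reused instead. Since an image has no hyperlinks, "links" here just means URLs that appear in the recognized text.

```typescript
// Hypothetical output shape; field names are assumptions, not Maxun's API.
interface ParsedOutput {
  markdown: string;
  html: string;
  links: string[];
}

// Escape the characters that would otherwise be read as HTML markup.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function formatOcrText(text: string): ParsedOutput {
  // Treat blank lines in the OCR text as paragraph breaks.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Collect unique http(s) URLs that appear in the recognized text.
  const links = Array.from(new Set(text.match(/https?:\/\/[^\s<>"')]+/g) ?? []));

  return {
    markdown: paragraphs.join("\n\n"),
    html: paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join("\n"),
    links,
  };
}
```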


