@Ludy87 Ludy87 commented Oct 29, 2025

Description of Changes

What was changed

  • Introduced a shared PDFJS_DEFAULT_OPTIONS object and applied it across frontend modules using PDF.js:
    • Sets cMapUrl, cMapPacked, and standardFontDataUrl so PDF.js can correctly load CMaps and standard fonts.
    • Switches all GlobalWorkerOptions.workerSrc usages to the dynamic pdfjsPath + 'pdf.worker.mjs'.
  • Exposed pdfjsPath globally in navbar.html to support deployments under subpaths/reverse proxies.
  • Updated multiple pages and utilities to use the new defaults:
    • DecryptFiles.js, downloader.js, merge.js, Multi-Tool (PdfContainer.js), and feature pages (add-image.js, adjust-contrast.js, change-metadata.js, crop.js, pdf-to-csv.js, sign.js, rotate-pdf.html, convert/pdf-to-pdfa.html, merge-pdfs.html).
  • Comparison tool hardening:
    • Added robust worker protocol (type: 'COMPARE' | 'SET_*') and safer logs.
    • Improved text tokenization, adaptive batch diffing with overlap de-duplication, and color fallbacks.
    • Early validation for empty/oversized/invalid PDFs with clearer user messages.
    • Disabled PDF.js worker in specific templates where legacy CMap handling caused issues (disableWorker: true) to prevent rendering failures.
    • UI/UX tweaks: processing state on the compare button, progress hints during text extraction, and more resilient error handling.
  • Fixed relative path to popularity data (./files/popularity.txt) to respect base paths.
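The shared defaults described above could be derived from the base path roughly as follows. This is a minimal sketch, not the PR's actual code: the helper name `buildPdfjsDefaults`, the `cmaps/` and `standard_fonts/` subfolder names (the usual pdfjs-dist layout), and the example base path are assumptions.

```javascript
// Hypothetical helper: derives the worker URL and shared getDocument options
// from a base path such as the global pdfjsPath exposed in navbar.html.
// The 'cmaps/' and 'standard_fonts/' subfolders are assumed, not confirmed by the PR.
function buildPdfjsDefaults(pdfjsPath) {
  // Normalize to exactly one trailing slash so concatenation is safe.
  const base = pdfjsPath.endsWith('/') ? pdfjsPath : pdfjsPath + '/';
  return {
    workerSrc: base + 'pdf.worker.mjs',
    options: {
      cMapUrl: base + 'cmaps/',
      cMapPacked: true,
      standardFontDataUrl: base + 'standard_fonts/',
    },
  };
}

// Usage in a page script (pdfjsLib and pdfjsPath are assumed to be in scope):
// const { workerSrc, options } = buildPdfjsDefaults(pdfjsPath);
// pdfjsLib.GlobalWorkerOptions.workerSrc = workerSrc;
// const pdf = await pdfjsLib.getDocument({ ...options, data: bytes }).promise;
```

Deriving every URL from one base path is what keeps subpath and reverse-proxy deployments working, since no module hardcodes an absolute location.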

Why the change was made

  • PDFs using CID fonts (e.g., CJK and other complex scripts) were rendering with missing glyphs or falling back incorrectly because CMaps and standard font data were not being provided to PDF.js. Providing proper CMap and font resources resolves CID character visibility issues and related console warnings.
  • Some environments (subpath deployments, reverse proxies) broke PDF.js worker/static asset resolution; centralizing pdfjsPath and using it consistently fixes this.
  • The comparison feature struggled with large/complex documents and lacked robust validation; improvements reduce timeouts, improve accuracy, and provide clearer feedback.
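The early validation for empty, oversized, or invalid PDFs might look something like the sketch below. The function name, the size limit, and the returned reason codes are hypothetical; the PR does not state the actual threshold or messages.

```javascript
// Hypothetical limit; the real threshold used by the comparison tool is not given in the PR.
const MAX_PDF_BYTES = 100 * 1024 * 1024;

// Checks a Uint8Array before handing it to PDF.js.
// Returns { ok, reason } so the caller can map the reason to a user-facing message.
function validatePdfBytes(bytes, maxBytes = MAX_PDF_BYTES) {
  if (!bytes || bytes.length === 0) return { ok: false, reason: 'empty' };
  if (bytes.length > maxBytes) return { ok: false, reason: 'oversized' };
  // A well-formed PDF begins with the ASCII marker "%PDF-".
  // This strict check rejects files that have leading junk before the header.
  const header = String.fromCharCode(...bytes.slice(0, 5));
  if (header !== '%PDF-') return { ok: false, reason: 'invalid' };
  return { ok: true, reason: null };
}
```

Running this check before text extraction avoids starting a long comparison on input that can never succeed.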

Closes #4391


Checklist

General

Documentation

UI Changes (if applicable)

  • Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)

Testing (if applicable)

  • I have tested my changes locally. Refer to the Testing Guide for more details.
Standardizes PDF.js configuration across all JS modules and templates by introducing a PDFJS_DEFAULT_OPTIONS object and using a dynamic pdfjsPath for worker and font resources. Updates all PDF.js usages to the new options, improving compatibility and font loading. Enhances the PDF comparison tool with better error handling, robust file validation, improved worker communication, and more user-friendly feedback for large or invalid files. Also fixes minor issues, such as the relative file path in homecard.js and button event handling in compare.html.
Copilot AI review requested due to automatic review settings October 29, 2025 17:48
@dosubot dosubot bot added size:XL This PR changes 500-999 lines ignoring generated files. Bugfix Pull requests that fix bugs labels Oct 29, 2025
@stirlingbot stirlingbot bot added the Front End Issues or pull requests related to front-end development label Oct 29, 2025

Copilot AI left a comment

Pull Request Overview

This PR refactors PDF.js usage across the application to improve reliability and maintainability. The changes introduce centralized PDF.js configuration with CMap and font options, update worker path references to use a global pdfjsPath variable, and enhance error handling in the PDF comparison feature.

  • Centralized PDF.js configuration with PDFJS_DEFAULT_OPTIONS across multiple files
  • Updated PDF.js worker paths from hardcoded strings to use pdfjsPath variable
  • Improved error handling and validation in PDF comparison feature
  • Refactored PDF comparison worker to use diff.js library instead of custom LCS algorithm
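The typed worker protocol from the PR description (type: 'COMPARE' | 'SET_*') could be dispatched along these lines. This is only a sketch: the specific SET_* message names, the settings fields, and the response shape are assumptions, and the `compare` function is injected rather than taken from pdfWorker.js.

```javascript
// Hypothetical message names and settings; the PR only names the
// 'COMPARE' | 'SET_*' pattern, not the concrete messages.
function createWorkerHandler(compare) {
  const settings = { timeLimit: 0, complexThreshold: 0 };

  return function handle(message) {
    // Reject anything that does not carry a string type.
    if (!message || typeof message.type !== 'string') {
      return { status: 'error', error: 'Malformed message' };
    }
    switch (message.type) {
      case 'SET_TIME_LIMIT':
        settings.timeLimit = Number(message.value) || 0;
        return { status: 'ok' };
      case 'SET_COMPLEX_THRESHOLD':
        settings.complexThreshold = Number(message.value) || 0;
        return { status: 'ok' };
      case 'COMPARE':
        if (typeof message.text1 !== 'string' || typeof message.text2 !== 'string') {
          return { status: 'error', error: 'COMPARE requires text1 and text2' };
        }
        return { status: 'done', result: compare(message.text1, message.text2, settings) };
      default:
        return { status: 'error', error: 'Unknown message type: ' + message.type };
    }
  };
}

// Inside a worker, this would be wired up roughly as:
// const handle = createWorkerHandler(myCompare);
// self.onmessage = (e) => self.postMessage(handle(e.data));
```

Keeping the dispatcher free of `self` makes it straightforward to unit-test outside a worker context.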

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| rotate-pdf.html | Added module import for PDF.js and PDFJS_DEFAULT_OPTIONS configuration |
| misc/compare.html | Major refactor: added error handling, PDF validation, worker timeout, and improved structure |
| merge-pdfs.html | Updated worker path to use pdfjsPath variable |
| fragments/navbar.html | Defined global pdfjsPath constant for consistent path resolution |
| convert/pdf-to-pdfa.html | Updated worker path to use pdfjsPath variable |
| js/pages/*.js | Added PDFJS_DEFAULT_OPTIONS and updated worker paths in sign, pdf-to-csv, crop, change-metadata, adjust-contrast, add-image files |
| js/multitool/PdfContainer.js | Added PDFJS_DEFAULT_OPTIONS and updated PDF loading |
| js/merge.js | Added PDFJS_DEFAULT_OPTIONS and worker path update |
| js/homecard.js | Fixed fetch path from absolute to relative |
| js/downloader.js | Added PDFJS_DEFAULT_OPTIONS and updated worker paths |
| js/compare/pdfWorker.js | Refactored to use diff.js library, improved message handling, and enhanced batching logic |
| js/DecryptFiles.js | Added PDFJS_DEFAULT_OPTIONS to decrypt operations |

Refines the batching logic in pdfWorker.js to handle overlaps and processed word tracking more accurately, reducing duplicate processing and improving diff granularity. Also exposes pdfjsLib as a global variable in rotate-pdf.html for easier access in other scripts.
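One way to batch with overlap while tracking processed words is sketched below. It is an illustration of the general idea, not the logic in pdfWorker.js: the generator name and the `context`/`fresh` fields are hypothetical.

```javascript
// Yields overlapping batches of words. Each batch carries:
//   context: the full window, including words shared with the previous batch
//   fresh:   only the words not yet emitted, so results are never duplicated
//   freshStart: index of the first fresh word in the original array
function* overlappingBatches(words, batchSize, overlap) {
  if (batchSize <= overlap) {
    throw new RangeError('batchSize must be greater than overlap');
  }
  let processedUpTo = 0; // index of the first word that has not been emitted yet
  for (let start = 0; start < words.length; start += batchSize - overlap) {
    const end = Math.min(start + batchSize, words.length);
    yield {
      context: words.slice(start, end),
      fresh: words.slice(processedUpTo, end),
      freshStart: processedUpTo,
    };
    processedUpTo = end;
    if (end === words.length) break; // the last window reached the end of the input
  }
}
```

The overlap gives the diff some shared context at batch boundaries, while the `processedUpTo` marker ensures each word contributes to the output only once.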
Pass the status element to the extractText function to update progress dynamically for each file, simplifying the code and improving maintainability.
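Passing the status element into the extraction routine might be sketched as follows. The signature, the `fileLabel` parameter, and the progress wording are assumptions; only `numPages`, `getPage`, and `getTextContent` are PDF.js document and page APIs.

```javascript
// Hypothetical message format for the progress hint.
function formatProgress(fileLabel, pageNumber, totalPages) {
  return `Extracting text from ${fileLabel}: page ${pageNumber} of ${totalPages}`;
}

// Extracts the text of every page from a loaded PDF.js document and
// updates statusElement (any object with a textContent property) per page.
async function extractText(pdf, statusElement, fileLabel) {
  const parts = [];
  for (let n = 1; n <= pdf.numPages; n++) {
    if (statusElement) {
      statusElement.textContent = formatProgress(fileLabel, n, pdf.numPages);
    }
    const page = await pdf.getPage(n);
    const content = await page.getTextContent();
    // Items without a str field (if any) contribute an empty string.
    parts.push(content.items.map((item) => item.str ?? '').join(' '));
  }
  return parts.join(' ');
}

// Usage (pdf1 is assumed to be a loaded PDF.js document, statusEl a DOM node):
// const text1 = await extractText(pdf1, statusEl, 'file 1');
```

Because the function owns the progress updates, the caller no longer needs separate status-handling code for each file.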