Collaborator

@Ludy87 Ludy87 commented Oct 26, 2025

This pull request introduces an improved workflow for validating and synchronizing localization (translation) JSON files in pull requests. The main changes include replacing the old check_language_json.py script with a new, more robust sync_translations.py script, updating configuration to recognize translation files, and adding a dedicated GitHub Actions workflow to automatically check translation files on PRs. This significantly enhances automation and reliability for translation consistency checks.

Localization Automation Improvements:

  • Added a new GitHub Actions workflow (.github/workflows/check_locales.yml) that automatically checks modified translation files in PRs for consistency, posts a summary as a PR comment, and fails the job if issues are found. This includes secure file handling, reference file selection, and cleanup steps.
  • Updated the labeler configuration (.github/labeler-config-srvaroa.yml) to recognize sync_translations.py and translation JSON files for appropriate labeling.

Script and Workflow Updates:

  • Replaced the use of the old check_language_json.py script with the new sync_translations.py script in the translation sync workflow (.github/workflows/sync_files_v2.yml).
  • Removed the obsolete check_language_json.py script, consolidating translation validation logic in the new script.

These changes streamline and automate translation file validation, reduce manual review effort, and help ensure the integrity of localization files in the codebase.

Introduces a Python script for checking and synchronizing JSON translation files against a reference, ensuring consistency across locales. Adds a GitHub Actions workflow to automatically verify and comment on translation file changes in pull requests.
@stirlingbot stirlingbot bot added the v2 (Issues or pull requests related to the v2 branch) and Github labels Oct 26, 2025
Changed the regular expressions in the workflow to match locale directories using hyphens (e.g., en-US) instead of underscores. This ensures the workflow correctly identifies translation.json files in the updated directory structure.
Refactors sync_translations.py to better handle reference and target file path resolution, including fallback logic and improved safety checks. Also adds support for splitting space-separated file lists and auto-discovery of locale files when --files is omitted. Removes unnecessary debug output from check_locales.yml.
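The file-list handling described in this commit could look roughly like the sketch below. It is an illustration only, not the code in sync_translations.py: the function names and the locales root argument are hypothetical, and the one-directory-per-locale layout is assumed from the paths shown elsewhere in this PR.

```python
from __future__ import annotations

from pathlib import Path


def expand_file_args(raw_files: list[str]) -> list[str]:
    """Split entries such as "a.json b.json" into separate paths."""
    expanded: list[str] = []
    for entry in raw_files:
        # str.split() without arguments splits on any run of whitespace.
        expanded.extend(entry.split())
    return expanded


def discover_locale_files(locales_root: Path) -> list[Path]:
    """Find every <locale>/translation.json below the assumed locales root."""
    return sorted(locales_root.glob("*/translation.json"))


def resolve_targets(raw_files: list[str] | None, locales_root: Path) -> list[Path]:
    """Use the explicit file list if given, otherwise fall back to discovery."""
    if raw_files:
        return [Path(p) for p in expand_file_args(raw_files)]
    return discover_locale_files(locales_root)
```

One consequence of splitting on whitespace is that paths containing spaces cannot be passed this way; that trade-off is an assumption of the sketch, not a statement about the real script.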
Contributor

stirlingbot bot commented Oct 26, 2025

🚀 Translation Verification Summary

🔄 Reference File: pr-branch-translation.json (branch root: /home/runner/work/Stirling-PDF/Stirling-PDF/pr-branch)

📄 File: frontend/public/locales/de-DE/translation.json

💬 Translated: 83.68%
Passed: All keys in sync.

  • Untranslated values: 436 / 2672 (16.32%)

📄 File: frontend/public/locales/en-GB/translation.json

Passed: All keys in sync.


📄 File: frontend/public/locales/en-US/translation.json

Passed: All keys in sync.


✅ Overall Status: Success

Thanks @Ludy87 for keeping translations in sync! 🎉

Ludy87 added 14 commits October 26, 2025 12:04
Refactored and updated the de-DE translation.json file to remove unused keys, add missing tags, and improve consistency in tool descriptions and options. Also added the --check flag to the check_locales.yml workflow for enhanced locale validation.
Improves .github/scripts/sync_translations.py with clearer docstrings, better reporting, and more robust handling of missing/extra translation keys. Adds scripts/ignore_locales.toml to specify keys/paths to ignore during locale synchronization checks.
Refactors and extends .github/scripts/sync_translations.py to support reading, updating, and cleaning up ignored translation keys via scripts/ignore_locales.toml. Now, when a previously ignored key is translated, it is automatically removed from the ignore list. Also updates the de-DE translation.json to move the 'mobile' section and adjusts formatting in ignore_locales.toml.
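As a rough illustration of the ignore-list cleanup described in this commit, the sketch below drops an ignored key once its value in the target locale differs from the reference. The function name, the flat key-to-string dictionaries, and the rule that a key missing from the target counts as untranslated are all assumptions; the actual layout of scripts/ignore_locales.toml is not shown in this conversation.

```python
from __future__ import annotations


def prune_translated_ignores(
    ignored: set[str], reference: dict[str, str], target: dict[str, str]
) -> set[str]:
    """Return the ignore entries that should remain for one locale.

    An entry stays while the key is missing from the target or its value is
    still identical to the reference value (i.e. it is still untranslated).
    """
    return {
        key
        for key in ignored
        if key not in target or target[key] == reference.get(key)
    }


reference = {"home.title": "Home", "ok": "OK"}
target = {"home.title": "Startseite", "ok": "OK"}
remaining = prune_translated_ignores({"home.title", "ok"}, reference, target)
```

Here "home.title" has been translated, so only "ok" would remain on the ignore list.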
@Ludy87 Ludy87 marked this pull request as ready for review October 26, 2025 12:58
Copilot AI review requested due to automatic review settings October 26, 2025 12:58
@Ludy87 Ludy87 changed the title Add translation sync script and CI workflow Oct 26, 2025
@stirlingbot stirlingbot bot removed the enhancement (New feature or request) label Oct 26, 2025
@stirlingbot stirlingbot bot added the ci (Changes to CI configuration files and scripts) label Oct 26, 2025
Refines the regex for matching locale translation.json files in both the labeler config and the check_locales workflow by removing the underscore from the language code pattern. This ensures consistent and accurate detection of locale files.
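For illustration, a hyphen-only locale pattern might be written as below. The exact expression used in the labeler config and workflow is not quoted in this conversation, so the pattern and the helper name are assumptions; only the frontend/public/locales/<locale>/translation.json layout comes from the bot summary above.

```python
import re

# Assumed pattern: 2-3 lowercase letters, a hyphen, then a 2-4 letter region
# or script code. Underscore-separated names such as en_US do not match.
LOCALE_FILE_RE = re.compile(
    r"^frontend/public/locales/[a-z]{2,3}-[A-Za-z]{2,4}/translation\.json$"
)


def is_locale_file(path: str) -> bool:
    """Return True if the path looks like a locale translation file."""
    return LOCALE_FILE_RE.match(path) is not None
```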
Updated sync_translations.py to detect and optionally remove ignore entries referencing non-existent keys in the reference translation. Cleaned up scripts/ignore_locales.toml by removing many obsolete ignore entries, reducing maintenance overhead and improving accuracy.
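Detecting ignore entries that point at keys no longer present in the reference reduces to a set difference. The sketch below assumes flattened key names and uses a hypothetical prune switch to model the "optionally remove" behaviour; it is not the script's actual interface.

```python
from __future__ import annotations


def find_stale_ignores(
    ignored: set[str], reference_keys: set[str], prune: bool = False
) -> tuple[set[str], list[str]]:
    """Return (entries to keep, stale entries missing from the reference)."""
    stale = sorted(ignored - reference_keys)
    # Without prune the ignore list is left untouched and stale keys are only reported.
    kept = ignored & reference_keys if prune else set(ignored)
    return kept, stale
```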
@stirlingbot stirlingbot bot removed the Translation label Oct 26, 2025
Eliminates the addition of report messages when ignore entries reference missing reference keys, streamlining the reporting output in the translation sync script.
Improved the translation sync script to better handle dotted reference keys and type mismatches, ensuring more robust merging and reporting of missing translations. Updated German and English translation files to use flattened keys for certain options, expanded tool descriptions, and added or reorganized many UI strings for new features and improved clarity.
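One common way to compare nested translation files with files that use flattened, dotted keys is to flatten both sides first. The sketch below shows that idea under stated assumptions: the function names are hypothetical, and it does not claim to reproduce how sync_translations.py resolves dotted keys or type mismatches.

```python
from __future__ import annotations


def flatten(tree: dict, prefix: str = "") -> dict[str, object]:
    """Flatten nested dicts into dotted paths, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat: dict[str, object] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_keys(reference: dict, target: dict) -> tuple[list[str], list[str]]:
    """Return (keys missing from target, extra keys not in the reference)."""
    ref_keys = set(flatten(reference))
    tgt_keys = set(flatten(target))
    return sorted(ref_keys - tgt_keys), sorted(tgt_keys - ref_keys)


reference = {"home": {"title": "Home", "desc": "Welcome"}, "options.mode": "Mode"}
target = {"home": {"title": "Startseite"}, "options": {"mode": "Modus"}, "legacy": "x"}
missing, extra = diff_keys(reference, target)
```

After flattening, the literal key "options.mode" and the nested options/mode entry map to the same path, so only "home.desc" is reported as missing and "legacy" as extra. Empty nested objects disappear in this sketch, which a real checker might want to treat differently.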
Removed .github/scripts/check_language_json.py and updated the workflow to use .github/scripts/sync_translations.py for translation checks and syncing. Updated the usage documentation in sync_translations.py. Refactored scripts/counter_translation_v2.py to use sync_translations.py for translation progress calculation, simplifying logic and removing TOML-based ignore handling.

@jbrunton96 jbrunton96 left a comment

Thanks for this PR, Ludy, looks in a good state.

I'm in two minds about whether it's a good idea to merge this now considering the comments I've made. I'd be happy enough for it to go in like it is, but I suspect it will cause quite a lot of churn in the translation files with work that is likely coming up very soon to trim down and potentially change format for the translations.

If other people are happy for this to go in now and we accept the churn, I'm happy for it to go in. Otherwise, I'm happy to either cherrypick the changes from this branch and update the code for Toml if we go down that route, or update this branch after we come back to it once the extra keys are deleted etc 😄

continue
break

parsed = tomllib.loads(text)

Interesting, I didn't realise we had a Toml file for ignores. This has got me thinking though: Toml is probably a better format for storing the translation files than JSON, for several reasons (it's more similar to the V1 translations, needs no indentation, and doesn't have JSON's trailing-comma issues).

There's already some separate work I was planning to do in the next few days to trim out loads of dead translations in the files, so I think I'll look into changing the translation files into Toml at the same time assuming other people (hopefully including you!) also think it's a good idea 😄

Collaborator Author

I have to be honest, I really don't care whether it's .json, .properties, or .toml. 🤪

Comment on lines 561 to 565
parser.add_argument(
    "--procent-translations",
    action="store_true",
    help="Report percentage of translated values (not same as English).",
)

I'm not sure I understand what this commandline option is for. Is it just for reporting percentages of translated values, but in another language? Is there a separate flag for reporting percentages in English?

Collaborator Author

It returns the percentage of values that have been translated, i.e., are no longer identical to the English version.
See: https://github.com/Stirling-Tools/Stirling-PDF/blob/427c52e0cc9e5c09d8775e16ac1a797a8cc26ae6/scripts/counter_translation_v2.py

def percent_done_for_file(file_path: Path) -> int:
    """
    Calls sync_translations.py --procent-translations for a single locale file.
    Returns an int 0..100.
    """
    # en-GB / en-US are always 100% by definition
    norm = str(file_path).replace("\\", "/")
    if norm.endswith("en-GB/translation.json") or norm.endswith(
        "en-US/translation.json"
    ):
        return 100
    cmd = [
        "python",
        str(SYNC_SCRIPT),
        "--reference-file",
        str(REF_FILE),
        "--files",
        str(file_path),
        "--check",
        "--procent-translations",
    ]
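The percentage itself, as described here (values that are no longer identical to the English reference), can be sketched as follows. This is a minimal illustration over flattened key-to-string dictionaries; the function name is hypothetical, and treating a key that is missing from the target as untranslated is an assumption.

```python
from __future__ import annotations


def translation_stats(
    reference: dict[str, str], target: dict[str, str]
) -> tuple[int, int, float]:
    """Return (untranslated count, total count, translated percentage)."""
    total = len(reference)
    # A value counts as untranslated while it still equals the reference value.
    untranslated = sum(
        1
        for key, ref_value in reference.items()
        if target.get(key, ref_value) == ref_value
    )
    if total == 0:
        return 0, 0, 100.0
    return untranslated, total, round(100 * (total - untranslated) / total, 2)


reference = {"a": "Save", "b": "Cancel", "c": "Open", "d": "PDF"}
target = {"a": "Speichern", "b": "Abbrechen", "c": "Öffnen", "d": "PDF"}
stats = translation_stats(reference, target)
```

The same arithmetic matches the bot summary for de-DE above: 436 untranslated values out of 2672 gives 16.32% untranslated and 83.68% translated.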

Ah right, I understand; that's what I hoped it would do. I'd suggest renaming the flag into English to match the others: --report-percentages or something.

@Ludy87 Ludy87 marked this pull request as draft October 27, 2025 15:33
Ludy87 and others added 9 commits October 28, 2025 08:12
Co-authored-by: James Brunton <jbrunton96@gmail.com>
Changed the GitHub Actions workflow to use the 'pull_request' event instead of 'pull_request_target'. Updated the event check and trigger to improve security and ensure the workflow runs only for pull requests.
Updated both sync_translations.py and counter_translation_v2.py to use the more descriptive --report-percentages flag instead of --procent-translations for reporting translation percentages.
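With the rename, the argparse definition would presumably look like the sketch below. Only the flag names and the --report-percentages help text come from this conversation; the parser description and the --check help text are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; not the full CLI of sync_translations.py."""
    parser = argparse.ArgumentParser(description="Translation sync sketch")
    parser.add_argument(
        "--check",
        action="store_true",
        help="Only report problems (assumed wording).",
    )
    parser.add_argument(
        "--report-percentages",
        action="store_true",
        help="Report percentage of translated values (not same as English).",
    )
    return parser


# argparse exposes --report-percentages as the attribute report_percentages.
args = build_parser().parse_args(["--check", "--report-percentages"])
```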
#4739 (comment)
Contributor

stirlingbot bot commented Oct 29, 2025

🚀 V2 Auto-Deployment Complete!

Your V2 PR with the new frontend/backend split architecture has been deployed!

🔗 Direct Test URL (non-SSL): http://185.252.234.121:4739

🔐 Secure HTTPS URL: https://4739.ssl.stirlingpdf.cloud

This deployment will be automatically cleaned up when the PR is closed.

🔄 Auto-deployed because PR title or branch name contains V2/version2/React keywords.

Labels

  • ci (Changes to CI configuration files and scripts)
  • Github
  • size:XXL (This PR changes 1000+ lines ignoring generated files.)
  • v2 (Issues or pull requests related to the v2 branch)

3 participants