ci(i18n): add locale check workflow & translation sync tool #4739
base: V2
Introduces a Python script for checking and synchronizing JSON translation files against a reference, ensuring consistency across locales. Adds a GitHub Actions workflow to automatically verify and comment on translation file changes in pull requests.
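A minimal sketch of what checking a locale file against a reference could look like, assuming nested JSON objects are compared by dotted key path. The helper names `flatten` and `compare_locale` are hypothetical and are not taken from sync_translations.py.

```python
import json

def flatten(data, prefix=""):
    """Flatten nested dicts into a {dotted.path: value} mapping."""
    flat = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

def compare_locale(reference, target):
    """Return (missing, extra) key paths of target relative to reference."""
    ref_keys = set(flatten(reference))
    tgt_keys = set(flatten(target))
    return sorted(ref_keys - tgt_keys), sorted(tgt_keys - ref_keys)

# Illustrative data only; real translation files are much larger.
reference = json.loads('{"home": {"title": "Home", "desc": "Start here"}, "ok": "OK"}')
target = json.loads('{"home": {"title": "Startseite"}, "ok": "OK", "old": "Alt"}')

missing, extra = compare_locale(reference, target)
print(missing)  # ['home.desc']
print(extra)    # ['old']
```

A sync step would then copy the missing keys from the reference and drop the extra ones; a check-only run would just report them.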
Changed the regular expressions in the workflow to match locale directories using hyphens (e.g., en-US) instead of underscores. This ensures the workflow correctly identifies translation.json files in the updated directory structure.
Refactors sync_translations.py to better handle reference and target file path resolution, including fallback logic and improved safety checks. Also adds support for splitting space-separated file lists and auto-discovery of locale files when --files is omitted. Removes unnecessary debug output from check_locales.yml.
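The file-list handling described here might look roughly like the following. This is a sketch under assumptions: the function names and the `<locale>/translation.json` layout are illustrative, and the real script's fallback logic is not reproduced.

```python
import tempfile
from pathlib import Path

def normalize_files(values):
    """Split entries such as "a.json b.json" into individual paths."""
    files = []
    for value in values:
        files.extend(value.split())
    return files

def discover_locale_files(locales_root, reference_file):
    """Find <locale>/translation.json files, skipping the reference itself."""
    reference = Path(reference_file).resolve()
    return sorted(
        p for p in Path(locales_root).glob("*/translation.json")
        if p.resolve() != reference
    )

def resolve_targets(files_arg, locales_root, reference_file):
    """Use the explicit list when given, otherwise auto-discover."""
    if files_arg:
        return [Path(f) for f in normalize_files(files_arg)]
    return discover_locale_files(locales_root, reference_file)

# Demo against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for locale in ("en-GB", "de-DE", "fr-FR"):
        (root / locale).mkdir()
        (root / locale / "translation.json").write_text("{}", encoding="utf-8")
    ref = root / "en-GB" / "translation.json"

    discovered = [p.parent.name for p in resolve_targets(None, root, ref)]
    explicit = [
        p.as_posix()
        for p in resolve_targets(
            ["a/de-DE/translation.json b/fr-FR/translation.json"], root, ref
        )
    ]

print(discovered)
print(explicit)
```

Splitting on whitespace matters in CI, where a list of changed files is often passed as a single space-separated string rather than as separate arguments.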
Refactored and updated the de-DE translation.json file to remove unused keys, add missing tags, and improve consistency in tool descriptions and options. Also added the --check flag to the check_locales.yml workflow for enhanced locale validation.
Improves .github/scripts/sync_translations.py with clearer docstrings, better reporting, and more robust handling of missing/extra translation keys. Adds scripts/ignore_locales.toml to specify keys/paths to ignore during locale synchronization checks.
Refactors and extends .github/scripts/sync_translations.py to support reading, updating, and cleaning up ignored translation keys via scripts/ignore_locales.toml. Now, when a previously ignored key is translated, it is automatically removed from the ignore list. Also updates the de-DE translation.json to move the 'mobile' section and adjusts formatting in ignore_locales.toml.
Refines the regex for matching locale translation.json files in both the labeler config and the check_locales workflow by removing the underscore from the language code pattern. This ensures consistent and accurate detection of locale files.
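For illustration, a hyphen-only locale pattern could be written as below. The exact regex used in the workflow and labeler config is not shown in this conversation, so the pattern and the example paths are assumptions.

```python
import re

# Hypothetical pattern: a locale directory such as en-US or de-DE
# (hyphen only, no underscore) containing translation.json.
LOCALE_FILE_RE = re.compile(
    r"(?:^|/)locales/([A-Za-z]{2,3}-[A-Za-z]{2,4})/translation\.json$"
)

def locale_of(path):
    """Return the locale code for a translation file path, or None."""
    match = LOCALE_FILE_RE.search(path)
    return match.group(1) if match else None

print(locale_of("frontend/public/locales/en-US/translation.json"))  # en-US
print(locale_of("frontend/public/locales/en_US/translation.json"))  # None
```

Dropping the underscore from the character set means a stale `en_US`-style directory no longer triggers the check.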
Updated sync_translations.py to detect and optionally remove ignore entries referencing non-existent keys in the reference translation. Cleaned up scripts/ignore_locales.toml by removing many obsolete ignore entries, reducing maintenance overhead and improving accuracy.
Eliminates the addition of report messages when ignore entries reference missing reference keys, streamlining the reporting output in the translation sync script.
Improved the translation sync script to better handle dotted reference keys and type mismatches, ensuring more robust merging and reporting of missing translations. Updated German and English translation files to use flattened keys for certain options, expanded tool descriptions, and added or reorganized many UI strings for new features and improved clarity.
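One way to merge missing keys while coping with type mismatches is sketched below. It is an assumption-laden illustration: `merge_missing` is a hypothetical helper, and a key containing dots (such as `options.quality`) is treated here as an opaque literal key, which may differ from how the real script resolves dotted keys.

```python
import copy

def merge_missing(reference, target, path="", report=None):
    """Copy keys missing from target; on a dict/non-dict mismatch,
    replace the target value with the reference value."""
    if report is None:
        report = []
    for key, ref_val in reference.items():
        here = f"{path}.{key}" if path else key
        if key not in target:
            target[key] = copy.deepcopy(ref_val)
            report.append(("missing", here))
        elif isinstance(ref_val, dict) != isinstance(target[key], dict):
            target[key] = copy.deepcopy(ref_val)
            report.append(("type-mismatch", here))
        elif isinstance(ref_val, dict):
            merge_missing(ref_val, target[key], here, report)
    return report

reference = {
    "options.quality": "Quality",            # flattened, dotted key
    "tool": {"name": "Tool", "desc": "Does things"},
    "menu": {"open": "Open", "close": "Close"},
}
target = {
    "tool": "Werkzeug",                       # string where a dict is expected
    "menu": {"open": "Öffnen"},
}

report = merge_missing(reference, target)
print(report)
```

After the call, `target` has the same shape as `reference`, existing translations such as `menu.open` are preserved, and `report` lists what had to be filled in.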
Removed .github/scripts/check_language_json.py and updated the workflow to use .github/scripts/sync_translations.py for translation checks and syncing. Updated the usage documentation in sync_translations.py. Refactored scripts/counter_translation_v2.py to use sync_translations.py for translation progress calculation, simplifying logic and removing TOML-based ignore handling.
Thanks for this PR, Ludy, it looks to be in a good state.
I'm in two minds about whether it's a good idea to merge this now considering the comments I've made. I'd be happy enough for it to go in like it is, but I suspect it will cause quite a lot of churn in the translation files with work that is likely coming up very soon to trim down and potentially change format for the translations.
If other people are happy for this to go in now and we accept the churn, I'm happy for it to go in. Otherwise, I'm happy to either cherry-pick the changes from this branch and update the code for TOML if we go down that route, or update this branch once we come back to it after the extra keys are deleted etc 😄
    continue
    break

    parsed = tomllib.loads(text)
Interesting, I didn't realise we had a TOML file for ignores. This has got me thinking though: TOML is probably a better format for storing the translation files than JSON for several reasons (more similar to V1 translations, no indentation needed, none of JSON's trailing-comma problems).
There's already some separate work I was planning to do in the next few days to trim out loads of dead translations in the files, so I think I'll look into changing the translation files to TOML at the same time, assuming other people (hopefully including you!) also think it's a good idea 😄
I have to be honest, I really don't care whether it's .json, .properties, or .toml. 🤪
    parser.add_argument(
        "--procent-translations",
        action="store_true",
        help="Report percentage of translated values (not same as English).",
    )
I'm not sure I understand what this command-line option is for. Is it just for reporting percentages of translated values, but in another language? Is there a separate flag for reporting percentages in English?
It returns the percentage of values that have been translated, i.e., are no longer identical to the English version.
See: https://github.com/Stirling-Tools/Stirling-PDF/blob/427c52e0cc9e5c09d8775e16ac1a797a8cc26ae6/scripts/counter_translation_v2.py
Stirling-PDF/scripts/counter_translation_v2.py
Lines 33 to 54 in 427c52e
    def percent_done_for_file(file_path: Path) -> int:
        """
        Calls sync_translations.py --procent-translations for a single locale file.
        Returns an int 0..100.
        """
        # en-GB / en-US are always 100% by definition
        norm = str(file_path).replace("\\", "/")
        if norm.endswith("en-GB/translation.json") or norm.endswith(
            "en-US/translation.json"
        ):
            return 100

        cmd = [
            "python",
            str(SYNC_SCRIPT),
            "--reference-file",
            str(REF_FILE),
            "--files",
            str(file_path),
            "--check",
            "--procent-translations",
        ]
Ah right, I understand; that's what I hoped it would do. I'd suggest renaming the flag to English to match the others: --report-percentages or something.
Co-authored-by: James Brunton <jbrunton96@gmail.com>
Changed the GitHub Actions workflow to use the 'pull_request' event instead of 'pull_request_target'. Updated the event check and trigger to improve security and ensure the workflow runs only for pull requests.
Updated both sync_translations.py and counter_translation_v2.py to use the more descriptive --report-percentages flag instead of --procent-translations for reporting translation percentages. #4739 (comment)
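As a rough illustration of the metric discussed in the review thread (the share of values that are no longer identical to the English reference), assuming flat key/value dictionaries and integer rounding down. `percent_translated` is a hypothetical helper; how the real script treats ignored keys or nested values is not shown here.

```python
def percent_translated(reference, target):
    """Percentage (0..100) of reference keys whose target value exists
    and differs from the reference value."""
    total = len(reference)
    if total == 0:
        return 100  # nothing to translate
    translated = sum(
        1 for key, ref_val in reference.items()
        if key in target and target[key] != ref_val
    )
    return translated * 100 // total

reference = {"open": "Open", "close": "Close", "save": "Save", "ok": "OK"}
target = {"open": "Öffnen", "close": "Schließen", "save": "Speichern", "ok": "OK"}

print(percent_translated(reference, target))  # 75
```

Note that a value which is legitimately identical in both languages (such as "OK" above) counts as untranslated under this definition, which is one reason an ignore list is useful.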
🚀 V2 Auto-Deployment Complete!
Your V2 PR with the new frontend/backend split architecture has been deployed!
🔗 Direct Test URL (non-SSL): http://185.252.234.121:4739
🔐 Secure HTTPS URL: https://4739.ssl.stirlingpdf.cloud
This deployment will be automatically cleaned up when the PR is closed.
🔄 Auto-deployed because PR title or branch name contains V2/version2/React keywords.
This pull request introduces an improved workflow for validating and synchronizing localization (translation) JSON files in pull requests. The main changes include replacing the old check_language_json.py script with a new, more robust sync_translations.py script, updating configuration to recognize translation files, and adding a dedicated GitHub Actions workflow to automatically check translation files on PRs. This significantly enhances automation and reliability for translation consistency checks.

Localization Automation Improvements:
- Added a GitHub Actions workflow (.github/workflows/check_locales.yml) that automatically checks modified translation files in PRs for consistency, posts a summary as a PR comment, and fails the job if issues are found. This includes secure file handling, reference file selection, and cleanup steps.
- Updated the labeler configuration (.github/labeler-config-srvaroa.yml) to recognize sync_translations.py and translation JSON files for appropriate labeling.

Script and Workflow Updates:
- Replaced the check_language_json.py script with the new sync_translations.py script in the translation sync workflow (.github/workflows/sync_files_v2.yml).
- Removed the check_language_json.py script, consolidating translation validation logic in the new script.

These changes streamline and automate translation file validation, reduce manual review effort, and help ensure the integrity of localization files in the codebase.