Skip to content

Conversation

@antonarhipov
Copy link

  • Grid layout and random positioning of watermarks per page
  • Random orientation
  • Per watermark font randomization of color and size
  • Text watermark per-letter randomization of font, color, size, and orientation
  • Sharing
  • Random mirroring of image watermarks
  • Add WatermarkRandomizer utility and enhance watermark request handling with new validations and randomization features

Description of Changes


Checklist

General

Documentation

UI Changes (if applicable)

  • Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)
Screenshot 2025-10-29 at 13 13 17 Screenshot 2025-10-29 at 13 12 29

Testing (if applicable)

  • I have tested my changes locally. Refer to the Testing Guide for more details.
- Grid layout and random positioning of watermarks per page
- Random orientation
- Per watermark font randomization of color and size
- Text watermark per-letter randomization of font, color, size, and orientation
- Sharing
- Random mirroring of image watermarks
- Add `WatermarkRandomizer` utility and enhance watermark request handling with new validations and randomization features
Copilot AI review requested due to automatic review settings October 29, 2025 11:15
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines ignoring generated files. enhancement New feature or request labels Oct 29, 2025
@stirlingbot stirlingbot bot added Java Pull requests that update Java code Front End Issues or pull requests related to front-end development Back End Issues related to back-end development API API-related issues or pull requests Test Testing-related issues or pull requests and removed enhancement New feature or request labels Oct 29, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive watermark validation and randomization capabilities to the PDF watermarking feature. The changes introduce extensive input validation, advanced watermark customization options (random positioning, rotation ranges, per-letter variations, mirroring, etc.), and deterministic randomness for testing purposes.

Key Changes:

  • Added request validation with detailed error messages for watermark parameters
  • Introduced WatermarkRandomizer utility class for deterministic random watermark attributes
  • Extended watermark request model with 20+ new configuration fields for advanced customization
  • Enhanced HTML template with collapsible advanced options and client-side validation

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
WatermarkController.java Added comprehensive validation logic and refactored watermark rendering to support advanced features
AddWatermarkRequest.java Extended model with validation annotations and new fields for enhanced watermark control
WatermarkRandomizer.java New utility class providing deterministic randomization for watermark attributes
add-watermark.html Enhanced UI with advanced options panel and client-side validation
WatermarkValidationTest.java New test suite validating watermark parameter constraints
WatermarkControllerIntegrationTest.java New integration tests covering various watermark scenarios
WatermarkRandomizerTest.java Comprehensive unit tests for randomizer utility

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +264 to +265
<input type="number" id="perLetterColorCount" name="perLetterColorCount" class="form-control" value="4" min="1" max="12">
<small class="form-text text-muted">Number of colors to randomly select from (1-12, default: 4)</small>
Copy link

Copilot AI Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maximum value for perLetterColorCount in the HTML form is set to 12, but the validation annotation in AddWatermarkRequest.java (line 164) sets @max(value = 20). This inconsistency could cause confusion for users who set values between 12 and 20 through the API directly. Consider aligning these values or documenting the reason for the difference.

Suggested change
<input type="number" id="perLetterColorCount" name="perLetterColorCount" class="form-control" value="4" min="1" max="12">
<small class="form-text text-muted">Number of colors to randomly select from (1-12, default: 4)</small>
<input type="number" id="perLetterColorCount" name="perLetterColorCount" class="form-control" value="4" min="1" max="20">
<small class="form-text text-muted">Number of colors to randomly select from (1-20, default: 4)</small>
Copilot uses AI. Check for mistakes.
…ecurity/WatermarkController.java


remove unused variable

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@balazs-szucs balazs-szucs added the Priority: Medium Issues or pull requests with medium priority label Oct 29, 2025
antonarhipov and others added 5 commits October 29, 2025 13:32
…ecurity/WatermarkController.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ecurity/WatermarkController.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ecurity/WatermarkController.java


save handling of bound parameters

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ecurity/WatermarkController.java


safe handling of bounds

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Ludy87
Copy link
Collaborator

Ludy87 commented Oct 29, 2025

/prdeploy

@Ludy87 Ludy87 linked an issue Oct 29, 2025 that may be closed by this pull request
1 task
@stirlingbot stirlingbot bot added the pr-deployed Pull request has been deployed to a test environment label Oct 29, 2025
@stirlingbot
Copy link
Contributor

stirlingbot bot commented Oct 29, 2025

🚀 PR Test Deployment

Your PR has been deployed for testing!

🔗 Test URL: http://185.252.234.121:4758
Security Disabled

This deployment will be automatically cleaned up when the PR is closed.

Copy link
Collaborator

@balazs-szucs balazs-szucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I took quick look, made some suggestions.

<!-- Positioning Mode -->
<div class="mb-3">
<div class="form-check">
<input type="checkbox" id="randomPosition" name="randomPosition" class="form-check-input">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image

Checkboxes are a bit buggy on hover. Copy the checkbox template from Convert PDF to PDF-Image checkbox on this html and use that. (it is done via custom CSS that is overriding bootstrap hence, it is not your "standard bootstrap checkbox" as basically everybody assumes)

"Unsupported image file extension in: '%s'. Supported extensions: .png, .jpg, .jpeg, .gif, .bmp",
originalFilename);
log.warn("Validation failed: {}", errorMsg);
throw new IllegalArgumentException(errorMsg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use ExceptionUtils to throw custom Exceptions

throw ExceptionUtils.createIllegalArgumentException

Although, I have pending PR about exceptions because it is currently not very consistent across the codebase.

Boolean randomColor = request.getRandomColor();
if (customColor != null && !Boolean.TRUE.equals(randomColor)) {
// Check if color is valid hex format (#RRGGBB or #RRGGBBAA)
if (!customColor.matches("^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{8})$")) {
Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can convert this to a pre-compiled Pattern, which is more performant because it is compiled when the class is initialized. We also have a dedicated class for regex patterns at RegexPatternUtils. In that case the comment also becomes redundant because you can name your regex pattern with something like this:

private static final Pattern COLOR_PATTERN = Pattern.compile("^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{8})$");

// Then use:
if (!COLOR_PATTERN.matcher(customColor).matches()) {
String charStr = String.valueOf(c);

// Determine per-letter attributes
float letterSize =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this might be a nitpick (feel free to ignore), but is it wise to do this per-character thing? I mean, performance-wise, I am scared to think what would happen when using a larger PDF 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was in the original feature request (#3421), so I thought I'd give it a try.
I'll do some testing with large PDFs and see if I can optimize this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, thanks for taking a look. I think there's probably a clever way to quasi-optimize this with some trickery by randomly calculating certain characters and using those randomized characters as anchors for others. In that case, it would still give the "look" of being random but would be more efficient. This might also be a non-issue in case your testing reveals that even large PDFs get processed without significant CPU pressure. Anyway, in the grand scheme of things, it's not that big of a deal, so no need to lose sleep over this.

Copy link
Author

@antonarhipov antonarhipov Nov 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented caching using the PDF Form XObject containing all watermarks once for the first page. It then draws that cached form on each subsequent page with a single operation. I tested the algorithm with a 1000-page PDF book, with 30 watermarks per page and per-letter variations for size and rotation, and it processed the document in under 1 second (on M3 mac pro).


@Nested
@DisplayName("Annotation-based Validation Tests")
class AnnotationBasedValidationTests {
Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this doesn't actually test anything, in order to test you have to do:

The big "A"'s of testing:

  1. Arrange
  2. Act
  3. Assert

There are no assertions in this whole class, or am I missing something? 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. That's an oversight. I'll make sure to correct this.

assertEquals(200, response2.getStatusCode().value(), "Second request should succeed");

// Note: Exact byte comparison may fail due to PDF metadata (timestamps, etc.)
// But both should produce valid PDFs of similar size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't comparing like the content stream be more reliable? 🤔

Copy link
Collaborator

@balazs-szucs balazs-szucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more suggestions.

<div class="mb-3">
<div class="form-check">
<input type="checkbox" id="randomPosition" name="randomPosition" class="form-check-input">
<label for="randomPosition" class="form-check-label">Random Positioning</label>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be on by default

<div class="mb-3">
<label for="count">Number of Watermarks</label>
<input type="number" id="count" name="count" class="form-control" value="1" min="1" max="1000">
<small class="form-text text-muted">Number of watermark instances per page (1-1000)</small>
Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default should be around ~30 or so. Not 1. It looks a bit weird. For context here are two examples with my suggestion vs default:

EDIT: This might also be better capped at a more reasonable maximum. 100 watermarks is already so much that the original content is barely visible. At 1000, I would be very surprised if any PDF would still be usable.

Image vs Image
<div id="advancedOptions" class="collapse">
<!-- Watermark Count -->
<div class="mb-3">
<label for="count">Number of Watermarks</label>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use internationalization with properties files. You'll need to add the strings on front-end to src/main/resources/messages_en_GB.properties and reference them in your Thymeleaf templates using the #{key} syntax (or just copy the structure of the other items on this PDF)

* @param count Maximum number of watermarks (0 for unlimited grid)
* @return List of [x, y] coordinate arrays
*/
public List<float[]> generateGridPositions(
Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not respect margins, which means basically the custom margin (e.g., the Margin from Edges (points)) only works when the random position is on, because that takes the margin but the grid positioning does not:

Random positioning:



public float[] generateRandomPosition(
    float pageWidth, float pageHeight,
    float watermarkWidth, float watermarkHeight,
    float margin, // <- Has margin parameter
    Float boundsX, Float boundsY,
    Float boundsWidth, Float boundsHeight)

float effectiveX = (boundsX != null) ? boundsX : margin;
float effectiveY = (boundsY != null) ? boundsY : margin;

vs

Grid positioning:



public List<float[]> generateGridPositions(
    float pageWidth, float pageHeight,
    float watermarkWidth, float watermarkHeight,
    int widthSpacer, int heightSpacer,
    int count) // <- NO margin parameter
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to fix this for some cases, but now I gave it a second thought, and looks like the margin parameter does not make sense to implement. There are too many edge cases where it wouldn't work: font type, font size, length of the text, rotations - in many of those cases, the watermark would shift out of the boundaries. Probably it isn't even something to optimize for - it's fine that the text partially shifts out of the visible area. While it could work for some cases, it adds a lot of complexity to the code and will be harder to maintain, so I better remove this parameter for good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I better remove this parameter for good.

Sure, no worries. I think, I do agree with you on this. Thanks for looking into this.


// Calculate how many rows and columns can fit on the page based on spacers
int maxRows = (int) (pageHeight / (watermarkHeight + heightSpacer) + 1);
int maxCols = (int) (pageWidth / (watermarkWidth + widthSpacer) + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be problematic on edge cases because of Off-By-One Error, e.g., it might exceed page boundaries because of the +1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for margin: exceeding the page boundaries for the watermarks is actually OK.
However, I see the point here - we need to ensure that the number of rendered watermarks appearing on the page should match the number the user configured in the settings. I'll give this a deeper thought.

Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somewhat disagree. Users tend not to care about specifics, but the bigger problem is the overhead this may create because it compounds, which in turn makes the feature somewhat unreliable. The +1 isn't that much on its own; however, in this code, it determines how many times the loop can run, which means it can compound to a large difference. What makes it even worse is that it also compounds per page, so on a large poster-size PDF or a PDF with many pages, the difference would be quite significant. (as in performance degradation)

For a specific example, a 100-page document with A4 size (595x842 pixels), watermark 40*40, and spacer 10. With the bug, maxRows equals 17 and maxCols equals 12, so the loop runs 18 * 13 = 234 watermarks per page, totaling 23,400 watermarks. Without the bug, maxRows equals 16 and maxCols equals 11, so the loop runs 17 * 12 = 204 watermarks per page, totaling 20,400 watermarks. TLDR: 3,000 extra out-of-bounds watermark operations, which is approximately 14.7% overhead. On larger documents or with smaller watermarks relative to page size, this compounding effect could easily reach 40-50% overhead imho.

But I don't particularly think this would be a blocker or something, but still you can go down a very deep rabbit hole with these Off-By-One Errors.

* @param maxRotation Maximum rotation angle in degrees
* @return Random rotation angle in degrees
*/
public float generatePerLetterRotationInRange(float minRotation, float maxRotation) {
Copy link
Collaborator

@balazs-szucs balazs-szucs Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep this, or generatePerLetterRotation. Having generatePerLetterRotationInRange AND generatePerLetterRotation seem to me like a bit redundant.

public Color generateRandomColor(boolean usePalette) {
if (usePalette) {
// Predefined palette of common colors
Color[] palette = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can extract this to the class like:

private static final Color[] COLOR_PALETTE = {
    Color.BLACK, Color.DARK_GRAY, Color.GRAY, Color.LIGHT_GRAY,
    Color.RED, Color.BLUE, Color.GREEN, Color.ORANGE,
    Color.MAGENTA, Color.CYAN, Color.PINK, Color.YELLOW
};

and call that here and also here:

    public Color generateRandomColorFromPalette(int colorCount) {
        // Predefined palette of common colors
        Color[] fullPalette = {
            Color.BLACK,
            Color.DARK_GRAY,
            Color.GRAY,
            Color.LIGHT_GRAY,
            Color.RED,
            Color.BLUE,
            Color.GREEN,
            Color.ORANGE,
            Color.MAGENTA,
            Color.CYAN,
            Color.PINK,
            Color.YELLOW
        };
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API API-related issues or pull requests Back End Issues related to back-end development Front End Issues or pull requests related to front-end development Java Pull requests that update Java code pr-deployed Pull request has been deployed to a test environment Priority: Medium Issues or pull requests with medium priority size:XXL This PR changes 1000+ lines ignoring generated files. Test Testing-related issues or pull requests

3 participants