
Conversation

@bharathv (Contributor) commented Oct 30, 2025

Split into two parts

  • 61a04a8 - downgrades the assert to an exception so the operation can be retried later
  • The rest of the commits reduce the likelihood of this happening.

Fixes: CORE-14308

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.2.x
  • v25.1.x
  • v24.3.x

Release Notes

  • none

This was originally an assert so we could catch things quickly during
development. A single partition running into this doesn't seem like a
reason to crash the broker. This commit changes it to a runtime
exception so the replicator can recover on a new leader and continue.

Do note that we still log an ERROR, so we are not losing any test
coverage: what was originally an assert will now be caught as a
BadLogLine.
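
For illustration, a minimal sketch of the change described above; the exception name matches the PR's file summary (it is defined in src/v/cluster_link/replication/deps.h), but the function and its signature here are assumptions, not the actual Redpanda code:

    #include <cstdint>
    #include <fmt/format.h>
    #include <stdexcept>

    // Assumed shape of the exception; the real one lives in
    // src/v/cluster_link/replication/deps.h.
    struct monotonicity_violation_exception : std::runtime_error {
        using std::runtime_error::runtime_error;
    };

    // Before: a vassert here crashed the whole broker on violation.
    // After: throw a retryable exception so the replicator can step down
    // and retry on a new leader (the real code also logs at ERROR first).
    void check_monotonic(int64_t next_offset, int64_t last_replicated) {
        if (next_offset <= last_replicated) {
            throw monotonicity_violation_exception(fmt::format(
              "non-monotonic replicate request: offset {} <= last replicated {}",
              next_offset,
              last_replicated));
        }
    }
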
Returns if the commit index was updated to the snapshot index

Currently it is possible that once the commit idx is updated to the
snapshot idx, the callers are notified right away and may refer to
untruncated offset translation state. In this case the write_at_offset
stm was seeding a wrong last_offset based on the untruncated log's
offset translation state.

This commit defers the commit index notification until after the log
truncation has happened.
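
A minimal sketch of the reordering described above, with assumed names; the real update_offsets_from_snapshot is part of raft consensus (src/v/raft/consensus.cc) and runs inside a seastar coroutine:

    #include <cstdint>
    #include <functional>
    #include <vector>

    struct consensus_sketch {
        int64_t commit_index{-1};
        std::vector<std::function<void()>> commit_index_waiters;

        // Now returns whether the commit index advanced instead of
        // notifying waiters from inside the update.
        bool update_offsets_from_snapshot(int64_t snapshot_last_idx) {
            if (snapshot_last_idx <= commit_index) {
                return false;
            }
            commit_index = snapshot_last_idx;
            return true;
        }

        void truncate_log_prefix(int64_t) { /* physical truncation elided */ }

        void hydrate_snapshot(int64_t snapshot_last_idx) {
            bool commit_idx_updated
              = update_offsets_from_snapshot(snapshot_last_idx);
            // Truncate first so the stale offset translation state is gone...
            truncate_log_prefix(snapshot_last_idx);
            // ...and only then wake the waiters, so nothing like the
            // write_at_offset stm can read pre-truncation state.
            if (commit_idx_updated) {
                for (auto& notify : commit_index_waiters) {
                    notify();
                }
                commit_index_waiters.clear();
            }
        }
    };
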
@bharathv marked this pull request as ready for review October 30, 2025 23:04
Copilot AI review requested due to automatic review settings October 30, 2025 23:04
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR improves handling of non-monotonic replicate requests in cluster link replication. The key change is downgrading an assertion to a recoverable exception, along with enhancements to reduce the likelihood of monotonicity violations occurring.

Key Changes:

  • Converts a vassert for monotonic offset checks into a throwable monotonicity_violation_exception that can be caught and retried
  • Refactors update_offset_from_snapshot to return a boolean indicating whether the commit index was updated, enabling proper notification propagation
  • Adds exception handling in the replicator to gracefully recover from monotonicity violations by stepping down the partition (see the sketch after this list)
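
A hedged sketch of that recovery path; partition_stub and step_down here are placeholders, not the actual partition_replicator API:

    #include <cstdio>
    #include <stdexcept>

    // Placeholder stand-ins; the real types live under
    // src/v/cluster_link/replication/.
    struct monotonicity_violation_exception : std::runtime_error {
        using std::runtime_error::runtime_error;
    };

    struct partition_stub {
        void step_down() { std::puts("stepping down; retry on new leader"); }
    };

    // The replicator catches the now-recoverable violation instead of
    // letting an assert crash the broker, then steps the partition down
    // and resets from the start offset it recorded when replication began.
    void replicate_once(partition_stub& partition) {
        try {
            // Stand-in for the real replicate call that performs the check.
            throw monotonicity_violation_exception(
              "non-monotonic replicate request");
        } catch (const monotonicity_violation_exception& e) {
            std::fprintf(stderr, "ERROR: %s; recovering\n", e.what());
            partition.step_down();
        }
    }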

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file

  • src/v/raft/consensus.h: Changed method signature to return bool instead of void
  • src/v/raft/consensus.cc: Refactored commit index update notifications to be triggered based on the return value rather than inside the update method
  • src/v/kafka/server/write_at_offset_stm.cc: Added trace logging when applying raft snapshots
  • src/v/cluster_link/service.cc: Replaced vassert with an exception throw for monotonicity violations
  • src/v/cluster_link/replication/partition_replicator.h: Added a field to track the original start offset
  • src/v/cluster_link/replication/partition_replicator.cc: Enhanced exception handling and reset logic to use the stored start offset
  • src/v/cluster_link/replication/deps.h: Defined a new exception class for monotonicity violations
@michael-redpanda (Contributor) left a comment

Changes in cluster_link/replication make sense to me - leaving the raft stuff for @mmaslankaprv

@vbotbuildovich (Collaborator) commented

CI test results

test results on build#75373
  • DataMigrationsApiTest.test_creating_and_listing_migrations (integration, args: null)
    Status: FLAKY, passed 16/21. Upstream reliability is 96.35666347075743, current run reliability is 76.19047619047619; drift is 20.16619 and the allowed drift is set to 50. The test should PASS.
    Job: https://buildkite.com/redpanda/redpanda/builds/75373#019a3798-3528-4ead-b54e-566af0d514eb
    History: https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DataMigrationsApiTest&test_method=test_creating_and_listing_migrations

  • MountUnmountIcebergTest.test_simple_remount (integration, args: {"cloud_storage_type": 1})
    Status: FLAKY, passed 17/21. Upstream reliability is 77.25225225225225, current run reliability is 80.95238095238095; drift is -3.70013 and the allowed drift is set to 50. The test should PASS.
    Job: https://buildkite.com/redpanda/redpanda/builds/75373#019a3798-3528-4ead-b54e-566af0d514eb
    History: https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=MountUnmountIcebergTest&test_method=test_simple_remount

  • NodesDecommissioningTest.test_decommissioning_finishes_after_manual_cancellation (integration, args: {"cloud_topic": true, "delete_topic": true})
    Status: FLAKY, passed 20/21. Upstream reliability is 98.81656804733728, current run reliability is 95.23809523809523; drift is 3.57847 and the allowed drift is set to 50. The test should PASS.
    Job: https://buildkite.com/redpanda/redpanda/builds/75373#019a3798-3529-4434-9972-6952f1a1a5ca
    History: https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_decommissioning_finishes_after_manual_cancellation

  • RedpandaNodeOperationsSmokeTest.test_node_ops_smoke_test (integration, args: {"cloud_storage_type": 1, "mixed_versions": false})
    Status: FLAKY, passed 20/21. Upstream reliability is 100.0, current run reliability is 95.23809523809523; drift is 4.7619 and the allowed drift is set to 50. The test should PASS.
    Job: https://buildkite.com/redpanda/redpanda/builds/75373#019a3798-3529-4434-9972-6952f1a1a5ca
    History: https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=RedpandaNodeOperationsSmokeTest&test_method=test_node_ops_smoke_test
          co_return;
      }
-     update_offsets_from_snapshot(metadata.value());
+     auto commit_idx_updated = update_offsets_from_snapshot(metadata.value());
A Member left a comment

I am wondering if it would be possible to do this as a last step in hydrate snapshot?
Currently it seems that the race may still happen: a check in the stm manager that verifies whether an offset is ready to be applied does not need the notification, so if it executes before the co_await _configuration_manager.add scheduling point it can still observe the untruncated state.

@bharathv (Contributor, Author) commented Oct 31, 2025

Yes, correct, the race is still theoretically possible.

    I am wondering if it would be possible to do this as a last step in hydrate snapshot?

This was my first thought, but the thing I was unsure about: if we do this, the raft logical start offset would be updated after the log's physical offset (which moves in truncate_to_latest_snapshot()). Does that have any other repercussions? There could be a reader that sees the logical offset and creates a reader in the truncated portion of the log; any implications of that?

@bharathv requested a review from mmaslankaprv October 31, 2025 16:23
@dotnwat changed the title cl: better handing of non monotonic replicate requests Nov 1, 2025