[CORE-14308] cl: better handling of non-monotonic replicate requests #28294
base: dev
Conversation
Force-pushed da19e57 to e7494c0
This was originally an assert so we could catch things quickly during development, but a single partition running into this doesn't seem like a reason to crash the broker. This commit changes it to a runtime exception so the replicator can recover on a new leader and continue. Note that we still log an ERROR, so we are not losing any test coverage: what was originally an assert will now be caught as a BadLogLine.
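As a rough illustration of the assert-to-exception change (a minimal sketch; only the `monotonicity_violation_exception` name comes from this PR's diff, everything else is a hypothetical stand-in):

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>

// Thrown instead of crashing the broker via vassert; the replicator can
// catch this, step down, and retry on a new leader.
struct monotonicity_violation_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

void check_monotonic(int64_t last_offset, int64_t first_offset) {
    if (first_offset <= last_offset) {
        std::ostringstream msg;
        msg << "non-monotonic replicate request: first offset "
            << first_offset << " <= last offset " << last_offset;
        // The real change still logs this at ERROR level before throwing,
        // so test coverage via bad-log-line detection is preserved.
        throw monotonicity_violation_exception(msg.str());
    }
}
```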
Returns whether the commit index was updated to the snapshot index
Currently it is possible that, once the commit index is updated to the snapshot index, callers are notified right away and may refer to untruncated offset translation state. In this case the write_at_offset stm was seeding a wrong last_offset based on the untruncated log's offset translation state. This commit defers the commit index notification until after the log truncation has happened.
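A minimal, self-contained sketch of this reordering, using plain standard-library primitives in place of the actual raft internals (only `update_offsets_from_snapshot` is a real name from the diff; the other members are assumed for illustration):

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

struct consensus_sketch {
    int64_t commit_index{0};
    std::condition_variable commit_index_cv;

    // Now returns whether the commit index advanced to the snapshot index,
    // instead of notifying waiters from inside the update itself.
    bool update_offsets_from_snapshot(int64_t snapshot_index) {
        if (snapshot_index <= commit_index) {
            return false;
        }
        commit_index = snapshot_index;
        return true;
    }

    void truncate_to_latest_snapshot() {
        // prefix-truncate the log and its offset translation state
    }

    void hydrate_snapshot(int64_t snapshot_index) {
        bool updated = update_offsets_from_snapshot(snapshot_index);
        // Truncate first: any waiter woken below only ever observes
        // post-truncation offset translation state.
        truncate_to_latest_snapshot();
        if (updated) {
            commit_index_cv.notify_all();
        }
    }
};
```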
Force-pushed e7494c0 to 1579da4
Pull Request Overview
This PR improves handling of non-monotonic replicate requests in cluster link replication. The key change is downgrading an assertion to a recoverable exception, along with enhancements to reduce the likelihood of monotonicity violations occurring.
Key Changes:
- Converts a `vassert` for monotonic offset checks into a throwable `monotonicity_violation_exception` that can be caught and retried
- Refactors `update_offsets_from_snapshot` to return a boolean indicating whether the commit index was updated, enabling proper notification propagation
- Adds exception handling in the replicator to gracefully recover from monotonicity violations by stepping down the partition (sketched below)
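Putting the pieces together, the recovery path might look roughly like this (a sketch under assumptions: the exception type is the one from this PR, but the replicator members and reset logic are invented stand-ins based on the file summary below):

```cpp
#include <cstdint>
#include <iostream>
#include <stdexcept>

struct monotonicity_violation_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct partition_replicator_sketch {
    int64_t _start_offset{0}; // original start offset, stored up front
    int64_t _next_offset{0};  // where the next replicate request begins

    void replicate_next_batch() {
        // ... may throw monotonicity_violation_exception on a
        // non-monotonic replicate request ...
    }

    void run_once() {
        try {
            replicate_next_batch();
        } catch (const monotonicity_violation_exception& e) {
            // Log, step down the partition, and reset to the stored start
            // offset so replication can resume cleanly on a new leader.
            std::cerr << "ERROR: " << e.what() << '\n';
            _next_offset = _start_offset;
        }
    }
};
```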
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/v/raft/consensus.h | Changed method signature to return bool instead of void |
| src/v/raft/consensus.cc | Refactored commit index update notifications to be triggered based on return value rather than inside the update method |
| src/v/kafka/server/write_at_offset_stm.cc | Added trace logging when applying raft snapshots |
| src/v/cluster_link/service.cc | Replaced vassert with exception throw for monotonicity violations |
| src/v/cluster_link/replication/partition_replicator.h | Added field to track original start offset |
| src/v/cluster_link/replication/partition_replicator.cc | Enhanced exception handling and reset logic to use stored start offset |
| src/v/cluster_link/replication/deps.h | Defined new exception class for monotonicity violations |
Changes in cluster_link/replication make sense to me - leaving the raft stuff for @mmaslankaprv
CI test results: test results on build #75373
```diff
         co_return;
     }
-    update_offsets_from_snapshot(metadata.value());
+    auto commit_idx_updated = update_offsets_from_snapshot(metadata.value());
```
I am wondering if it would be possible to do this as a last step in hydrate_snapshot?
Currently it seems that the race may still happen: a check in the stm manager verifying whether an offset is ready to be applied can execute before the `co_await _configuration_manager.add` scheduling point, and that path does not require the notification.
Yes, correct - the race is still theoretically possible.

> I am wondering if it would be possible to do this as a last step in hydrate_snapshot?

This was my first thought, but what I was unsure about: if we do this, we move the raft logical start offset update to after the physical log truncation (in truncate_to_latest_snapshot()) - does that have any other repercussions? There could be a reader that sees a stale logical start offset and creates a reader over the truncated portion of the log; any implications of that?
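For illustration, a hypothetical timeline of the window being discussed (all names except `truncate_to_latest_snapshot` and `update_offsets_from_snapshot` are invented for the sketch):

```cpp
#include <cstdint>

struct log_sketch {
    int64_t logical_start_offset{0};  // what readers consult first
    int64_t physical_start_offset{0}; // what the on-disk log actually holds

    void truncate_to_latest_snapshot(int64_t snapshot_index) {
        physical_start_offset = snapshot_index + 1;
    }

    void update_offsets_from_snapshot(int64_t snapshot_index) {
        logical_start_offset = snapshot_index + 1;
    }
};

int main() {
    log_sketch log;
    int64_t snapshot_index = 100;

    // "Last step of hydrate_snapshot" ordering from the review thread:
    log.truncate_to_latest_snapshot(snapshot_index);
    // <-- window: a reader consulting logical_start_offset (still 0) could
    //     create a reader over offsets the truncation just removed.
    log.update_offsets_from_snapshot(snapshot_index);
}
```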
Split into two parts
Fixes: CORE-14308
Backports Required
Release Notes