
Conversation

Member

@King-Dylan King-Dylan commented Oct 31, 2025

What problem does this PR solve?

Issue Number: close #64200

Problem Summary:
When decorrelating a correlated subquery, if the join key forms a unique key, the LIMIT operator in the subquery is redundant because the unique join key already guarantees at most one matching row. This PR removes such redundant LIMIT operators, along with redundant MaxOneRow wrappers, to improve query performance.
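
To make the idea concrete, here is a minimal sketch of the rewrite on a toy plan tree. `Node`, `stripRedundant`, and `describe` are hypothetical names for illustration and are not TiDB's planner types. The sketch only drops a Limit with zero offset and a count of at least one, which is an assumption of this example rather than something the description states.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for a logical plan operator.
// It is not TiDB's base.LogicalPlan.
type Node struct {
	Op     string // e.g. "MaxOneRow", "Limit", "Selection", "DataSource"
	Count  uint64 // row count, only meaningful when Op == "Limit"
	Offset uint64 // offset, only meaningful when Op == "Limit"
	Child  *Node
}

// stripRedundant drops leading MaxOneRow nodes and no-op Limit nodes from the
// inner plan, but only when the caller has already established that the join
// key is unique, so the inner side yields at most one row per outer row.
func stripRedundant(inner *Node, joinKeyIsUnique bool) *Node {
	if !joinKeyIsUnique {
		return inner
	}
	for inner != nil {
		switch {
		case inner.Op == "MaxOneRow":
			inner = inner.Child
		case inner.Op == "Limit" && inner.Offset == 0 && inner.Count >= 1:
			inner = inner.Child
		default:
			return inner
		}
	}
	return inner
}

// describe renders the operator chain from top to bottom.
func describe(n *Node) string {
	var ops []string
	for ; n != nil; n = n.Child {
		ops = append(ops, n.Op)
	}
	return strings.Join(ops, " -> ")
}

func main() {
	inner := &Node{Op: "MaxOneRow", Child: &Node{Op: "Limit", Count: 1,
		Child: &Node{Op: "Selection", Child: &Node{Op: "DataSource"}}}}
	fmt.Println(describe(stripRedundant(inner, true)))
	fmt.Println(describe(stripRedundant(inner, false)))
}
```

With a unique join key the toy plan collapses to the Selection over the DataSource; without that guarantee it is returned unchanged.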

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 31, 2025

ti-chi-bot bot commented Oct 31, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign 0xpoe for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


codecov bot commented Oct 31, 2025

Codecov Report

❌ Patch coverage is 88.65248% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.2572%. Comparing base (bdd2b6f) to head (9fa092a).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #64214        +/-   ##
================================================
+ Coverage   72.7334%   73.2572%   +0.5237%     
================================================
  Files          1859       1859                
  Lines        503870     505340      +1470     
================================================
+ Hits         366482     370198      +3716     
+ Misses       115127     112955      -2172     
+ Partials      22261      22187        -74     
Flag Coverage Δ
integration 41.8497% <83.6879%> (?)
unit 72.3040% <83.6879%> (+0.0049%) ⬆️


Components Coverage Δ
dumpling 52.8700% <ø> (ø)
parser ∅ <ø> (∅)
br 46.3776% <ø> (-0.0076%) ⬇️
			canRemove = true
		}
	}
} else if proj, ok := mChild.(*logicalop.LogicalProjection); ok {
Contributor

When does this happen? Can you provide some cases for this situation?

Member Author

Done

	apply.SetChildren(outerPlan, innerPlan)
	return s.optimize(ctx, p, groupByColumn)
} else if m, ok := innerPlan.(*logicalop.LogicalMaxOneRow); ok {
	// Check if MaxOneRow's child is Limit or TopN, and if we can remove it for LeftOuterJoin
Contributor

It seems this PR doesn't handle the TopN case?

Member Author

At this stage, a TopN is still represented as a Limit, so it doesn't matter.

Comment on lines 549 to 560
if decExpr := apply.DeCorColFromEqExpr(cond); decExpr != nil {
	if sf, ok := decExpr.(*expression.ScalarFunction); ok && sf.FuncName.L == ast.EQ {
		args := sf.GetArgs()
		if len(args) == 2 {
			if innerCol, ok := args[1].(*expression.Column); ok {
				if sel.Schema().Contains(innerCol) {
					innerJoinKeys = append(innerJoinKeys, innerCol)
				}
			}
		}
	}
}
Contributor

Can we extract this into a function? It is used several times.

Member Author

Done
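
For readers following the thread, a rough sketch of what such an extracted helper could look like is shown below. `Expression`, `Column`, and `ScalarFunction` are simplified stand-ins defined for this example, not the types from TiDB's expression package, and `innerColFromEq` is a hypothetical name; the helper actually added in the PR may differ. The sketch takes the already decorrelated expression as input and flattens the nested checks into early returns.

```go
package main

import "fmt"

// Expression, Column and ScalarFunction are simplified stand-ins for this
// sketch; they are assumptions, not TiDB's expression package.
type Expression interface{}

type Column struct{ ID int64 }

type ScalarFunction struct {
	FuncName string
	Args     []Expression
}

// innerColFromEq is a hypothetical helper. Given a decorrelated condition, it
// returns the second argument when the condition is a two-argument equality,
// that argument is a column, and the column belongs to the given schema.
func innerColFromEq(expr Expression, schema map[int64]bool) (*Column, bool) {
	sf, ok := expr.(*ScalarFunction)
	if !ok || sf.FuncName != "eq" || len(sf.Args) != 2 {
		return nil, false
	}
	col, ok := sf.Args[1].(*Column)
	if !ok || !schema[col.ID] {
		return nil, false
	}
	return col, true
}

func main() {
	schema := map[int64]bool{2: true} // the inner schema contains column 2 only
	conds := []Expression{
		&ScalarFunction{FuncName: "eq", Args: []Expression{&Column{ID: 1}, &Column{ID: 2}}},
		&ScalarFunction{FuncName: "lt", Args: []Expression{&Column{ID: 1}, &Column{ID: 2}}},
	}
	var innerJoinKeys []*Column
	for _, cond := range conds {
		if col, ok := innerColFromEq(cond, schema); ok {
			innerJoinKeys = append(innerJoinKeys, col)
		}
	}
	fmt.Println(len(innerJoinKeys))
}
```

Each call site then reduces to a single call plus an append, which keeps the argument-order assumption in one place.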

if sf, ok := decExpr.(*expression.ScalarFunction); ok && sf.FuncName.L == ast.EQ {
	args := sf.GetArgs()
	if len(args) == 2 {
		if innerCol, ok := args[1].(*expression.Column); ok {
Contributor

@Reminiscent Reminiscent Oct 31, 2025


Can we guarantee that args[1], rather than args[0], is the column from the inner side?

Member Author

DeCorColFromEqExpr preserves the argument order, so args[1] is the column on the inner side.

			break
		}
	}
	if allMatch && len(keyInfo) == len(innerJoinKeys) && len(keyInfo) > 0 {
Contributor

It seems there is no need to check len(keyInfo) == len(innerJoinKeys). For example, if the unique key is (a, b) and the filter columns are (a, b, c), then (a, b) alone guarantees uniqueness.

Member Author

Yes, good idea~
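
The relaxed condition suggested above amounts to a subset test: the join columns only need to cover some unique key, not match it exactly. A minimal sketch follows, using plain int64 column IDs and a hypothetical `uniqueKeyCovered` function; neither is taken from the PR.

```go
package main

import "fmt"

// uniqueKeyCovered reports whether every column of at least one non-empty
// unique key appears among the join-key columns. Extra join columns are fine.
func uniqueKeyCovered(uniqueKeys [][]int64, joinCols []int64) bool {
	joined := make(map[int64]struct{}, len(joinCols))
	for _, c := range joinCols {
		joined[c] = struct{}{}
	}
	for _, key := range uniqueKeys {
		if len(key) == 0 {
			continue // an empty key proves nothing
		}
		covered := true
		for _, c := range key {
			if _, ok := joined[c]; !ok {
				covered = false
				break
			}
		}
		if covered {
			return true
		}
	}
	return false
}

func main() {
	const a, b, c = 1, 2, 3
	keys := [][]int64{{a, b}} // unique key (a, b)
	fmt.Println(uniqueKeyCovered(keys, []int64{a, b, c}))
	fmt.Println(uniqueKeyCovered(keys, []int64{a}))
}
```

Here the join columns (a, b, c) cover the unique key (a, b), while (a) alone does not.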

// isJoinKeyUniqueKey checks if join key is unique key.
// Returns true if the join key forms a unique key constraint.
func isJoinKeyUniqueKey(apply *logicalop.LogicalApply, plan base.LogicalPlan) bool {
	var hasMultiRowOperator func(base.LogicalPlan) bool
Contributor

This function must cover every case that can generate more rows. If any case is missing, the query will produce the wrong answer. For example, please add some cases involving the unnest function: will it generate more rows? This needs more careful consideration.

Member Author

Yes, some cases may be missed, either now or later if the list is not maintained when new functions are introduced, and that could lead to an incorrect removal. However, the NoDecorrelate hint lets users sidestep the issue in those cases.
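
One conservative way to limit the risk discussed in this thread, offered only as a sketch and not as what the PR does, is to invert the check into an allow-list: any operator not explicitly known to be safe is treated as one that may produce extra rows. `Plan`, `knownSafe`, and `mayMultiplyRows` are hypothetical names, and the operator names in the list are purely illustrative.

```go
package main

import "fmt"

// Plan is a toy plan node for this sketch, not TiDB's base.LogicalPlan.
type Plan struct {
	Op       string
	Children []*Plan
}

// knownSafe is an illustrative allow-list of operators assumed not to emit
// more rows than they receive. Anything absent from it is treated as unsafe.
var knownSafe = map[string]bool{
	"DataSource": true,
	"Selection":  true,
	"Projection": true,
	"Limit":      true,
}

// mayMultiplyRows returns true if the plan contains any operator that is not
// on the allow-list, so newly introduced operators default to "unsafe".
func mayMultiplyRows(p *Plan) bool {
	if p == nil {
		return false
	}
	if !knownSafe[p.Op] {
		return true
	}
	for _, child := range p.Children {
		if mayMultiplyRows(child) {
			return true
		}
	}
	return false
}

func main() {
	safe := &Plan{Op: "Selection", Children: []*Plan{{Op: "DataSource"}}}
	unknown := &Plan{Op: "Selection", Children: []*Plan{{Op: "Unnest", Children: []*Plan{{Op: "DataSource"}}}}}
	fmt.Println(mayMultiplyRows(safe))
	fmt.Println(mayMultiplyRows(unknown))
}
```

The trade-off is that an allow-list can miss optimization opportunities when it is out of date, whereas a deny-list can produce wrong results.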

@King-Dylan
Member Author

/retest

@King-Dylan King-Dylan requested review from AilinKid, D3Hunter and qw4990 and removed request for AilinKid, D3Hunter and qw4990 October 31, 2025 20:13

Labels

release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

2 participants