-
Notifications
You must be signed in to change notification settings - Fork 6k
planner: optimize decorrelate by removing redundant LIMIT when join key is unique #64214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #64214 +/- ##
================================================
+ Coverage 72.7334% 73.2572% +0.5237%
================================================
Files 1859 1859
Lines 503870 505340 +1470
================================================
+ Hits 366482 370198 +3716
+ Misses 115127 112955 -2172
+ Partials 22261 22187 -74
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| canRemove = true | ||
| } | ||
| } | ||
| } else if proj, ok := mChild.(*logicalop.LogicalProjection); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When will this happen? Can you provide some cases for this situations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| apply.SetChildren(outerPlan, innerPlan) | ||
| return s.optimize(ctx, p, groupByColumn) | ||
| } else if m, ok := innerPlan.(*logicalop.LogicalMaxOneRow); ok { | ||
| // Check if MaxOneRow's child is Limit or TopN, and if we can remove it for LeftOuterJoin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this PR doesn't handle the TopN case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this stage, topn is still just a LIMIT, so it doesn’t matter.
pkg/planner/core/rule_decorrelate.go
Outdated
| if decExpr := apply.DeCorColFromEqExpr(cond); decExpr != nil { | ||
| if sf, ok := decExpr.(*expression.ScalarFunction); ok && sf.FuncName.L == ast.EQ { | ||
| args := sf.GetArgs() | ||
| if len(args) == 2 { | ||
| if innerCol, ok := args[1].(*expression.Column); ok { | ||
| if sel.Schema().Contains(innerCol) { | ||
| innerJoinKeys = append(innerJoinKeys, innerCol) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a function? It has been used several times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/planner/core/rule_decorrelate.go
Outdated
| if sf, ok := decExpr.(*expression.ScalarFunction); ok && sf.FuncName.L == ast.EQ { | ||
| args := sf.GetArgs() | ||
| if len(args) == 2 { | ||
| if innerCol, ok := args[1].(*expression.Column); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we guarantee the args[1] rather than args[0] is the column from the inner side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DeCorColFromEqExpr preserves the order of args[1], which is on the inner side.
pkg/planner/core/rule_decorrelate.go
Outdated
| break | ||
| } | ||
| } | ||
| if allMatch && len(keyInfo) == len(innerJoinKeys) && len(keyInfo) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems no need to check len(keyInfo) == len(innerJoinKeys). For example, the unique key is (a, b). And the filter columns contain (a, b, c). The (a, b) can guarantee the uniqueness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good idea~
| // isJoinKeyUniqueKey checks if join key is unique key. | ||
| // Returns true if the join key forms a unique key constraint. | ||
| func isJoinKeyUniqueKey(apply *logicalop.LogicalApply, plan base.LogicalPlan) bool { | ||
| var hasMultiRowOperator func(base.LogicalPlan) bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function needs to guarantee contains all the cases which will generate more rows. If there lacks some cases, it will generate the wring answer. For example, please add some cases related to the unnest function, it will generate more rows? So here should be considered more seriously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there may be mis-deletions down the road or in certain cases. But the NoDecorrelate hint lets us sidestep the issue, even if we miss maintaining the list when new funcs are introduced.
|
/retest |
What problem does this PR solve?
Issue Number: close #64200
Problem Summary:
When decorrelating correlated subqueries, if the join key forms a unique key constraint, the LIMIT operator in the subquery becomes redundant since the join key guarantees at most one row. This optimization removes such redundant LIMIT operators and also eliminates redundant MaxOneRow wrappers to improve query performance.
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.