C++: share TOperand across IR stages#4825
Conversation
|
I've started a CPP-Differences job: internal link |
|
The tuple count parser I was using didn't handle query stages properly - there's visible effects there when I look at the logs directly, and the ones I've seen so far involve joins that now use |
This fixes a performance issue where the whole MemoryOperand table was scanned in some predicates that used only NonPhiMemoryOperand
MathiasVP
left a comment
There was a problem hiding this comment.
I think this looks good to me! I'll let more eyes look at this before approving, though. Do we want to run a fresh CPP-difference?
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
|
Are there fresh CPP-Differences results? I'm fine with the code changes as long as the diffs aren't surprising. |
|
New CPP-Differences: internal link |
|
It looks like the disappearing operands are mostly related to fallthrough returns for functions with an |
|
I've started caching the charpred of the per-stage |
|
I don't see anything to explain the kamailio and wireshark slowdowns in local evaluation or tuple counts, but they're far too big to dismiss as cloud variance without a proper rerun. I've started a rebuild with just those two projects. |
|
No result changes from those runs, and the rerun has sub-1% changes in performance, so I think I'm OK with the performance aspect now. I've just pushed an autoformat fix. |
|
There's another autoformat error: |
dbartol
left a comment
There was a problem hiding this comment.
LGTM modulo autoformat test failures. Feel free to merge once those are fixed.
| */ | ||
| class Operand extends TStageOperand { | ||
| Operand() { | ||
| cached Operand() { |
There was a problem hiding this comment.
I'm curious about whether caching at every level of the Operand hierarchy was necessary, or whether caching only the leaf classes would be sufficient. Performance numbers look great as-is, so no reason to change anything now. Just food for thought.


This follows on to #3612 by merging
TOperandacross IR stages. We create a singleTOperandIPA type, with individual branches of this type for operands created directly from the AST (TRegisterOperandandTNonSsaMemoryOperand), phi nodes created by unaliased SSA (TUnaliasedPhiOperand), and phi and chi nodes created by aliased SSA (TAliasedPhiOperandandTAliasedChiOperand), as well as an empty placeholder branch to use in parameterized modules (TNoOperand).Each stage's
Operand.qllcreates a private union type,TStageOperand, and the existingOperandclass now extends it.I've opened this PR in draft mode to run the differences job for performance testing - in my testing it looked like it reduced tuples in the IR stage by about 11%, but wall-clock time was too noisy to draw any conclusions. There weren't any tuple-count effects on later stages of the query I was running.