The Wayback Machine - https://web.archive.org/web/20260529143443/https://github.com/github/codeql/pull/4825
Skip to content

C++: share TOperand across IR stages#4825

Merged
MathiasVP merged 15 commits into
github:mainfrom
rdmarsh2:rdmarsh2/cpp/operand-reuse
Mar 3, 2021
Merged

C++: share TOperand across IR stages#4825
MathiasVP merged 15 commits into
github:mainfrom
rdmarsh2:rdmarsh2/cpp/operand-reuse

Conversation

@rdmarsh2
Copy link
Copy Markdown
Contributor

This follows on to #3612 by merging TOperand across IR stages. We create a single TOperand IPA type, with individual branches of this type for operands created directly from the AST (TRegisterOperand and TNonSsaMemoryOperand), phi nodes created by unaliased SSA (TUnaliasedPhiOperand), and phi and chi nodes created by aliased SSA (TAliasedPhiOperand and TAliasedChiOperand), as well as an empty placeholder branch to use in parameterized modules (TNoOperand).

Each stage's Operand.qll creates a private union type, TStageOperand, and the existing Operand class now extends it.

I've opened this PR in draft mode to run the differences job for performance testing - in my testing it looked like it reduced tuples in the IR stage by about 11%, but wall-clock time was too noisy to draw any conclusions. There weren't any tuple-count effects on later stages of the query I was running.

@rdmarsh2 rdmarsh2 requested a review from dbartol December 15, 2020 00:00
@dbartol dbartol self-assigned this Dec 15, 2020
@rdmarsh2
Copy link
Copy Markdown
Contributor Author

I've started a CPP-Differences job: internal link

@rdmarsh2
Copy link
Copy Markdown
Contributor Author

The tuple count parser I was using didn't handle query stages properly - there's visible effects there when I look at the logs directly, and the ones I've seen so far involve joins that now use Operand predicates but previously used predicates from a specific subtype of Operand

This fixes a performance issue where the whole MemoryOperand table was
scanned in some predicates that used only NonPhiMemoryOperand
@rdmarsh2 rdmarsh2 marked this pull request as ready for review January 26, 2021 01:04
@rdmarsh2 rdmarsh2 requested review from a team as code owners January 26, 2021 01:04
Copy link
Copy Markdown
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me! I'll let more eyes look at this before approving, though. Do we want to run a fresh CPP-difference?

Comment thread cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Operand.qll Outdated
Comment thread cpp/ql/src/semmle/code/cpp/ir/implementation/internal/TOperand.qll Outdated
@dbartol
Copy link
Copy Markdown

dbartol commented Feb 10, 2021

Are there fresh CPP-Differences results? I'm fine with the code changes as long as the diffs aren't surprising.

@rdmarsh2
Copy link
Copy Markdown
Contributor Author

New CPP-Differences: internal link

@rdmarsh2
Copy link
Copy Markdown
Contributor Author

It looks like the disappearing operands are mostly related to fallthrough returns for functions with an int return type. I'm not sure what semantics we actually want for those. I'll think about it, but it might end up being a question for @dbartol when he's back.

@rdmarsh2
Copy link
Copy Markdown
Contributor Author

I've started caching the charpred of the per-stage Operand class, and started a new CPP-Differences job: internal link

@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Feb 26, 2021
@rdmarsh2
Copy link
Copy Markdown
Contributor Author

I don't see anything to explain the kamailio and wireshark slowdowns in local evaluation or tuple counts, but they're far too big to dismiss as cloud variance without a proper rerun. I've started a rebuild with just those two projects.

@rdmarsh2
Copy link
Copy Markdown
Contributor Author

rdmarsh2 commented Mar 1, 2021

No result changes from those runs, and the rerun has sub-1% changes in performance, so I think I'm OK with the performance aspect now. I've just pushed an autoformat fix.

@jbj
Copy link
Copy Markdown
Contributor

jbj commented Mar 2, 2021

There's another autoformat error: ql/cpp/ql/src/semmle/code/cpp/ir/implementation/internal/TOperand.qll would change by autoformatting

dbartol
dbartol previously approved these changes Mar 2, 2021
Copy link
Copy Markdown

@dbartol dbartol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo autoformat test failures. Feel free to merge once those are fixed.

*/
class Operand extends TStageOperand {
Operand() {
cached Operand() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about whether caching at every level of the Operand hierarchy was necessary, or whether caching only the leaf classes would be sufficient. Performance numbers look great as-is, so no reason to change anything now. Just food for thought.

@MathiasVP MathiasVP merged commit 721ba5e into github:main Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C# C++ no-change-note-required This PR does not need a change note

4 participants