-
Notifications
You must be signed in to change notification settings - Fork 786
Fuzzer: Add an option to preserve imports and exports #7300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1749,6 +1748,40 @@ def can_run_on_wasm(self, wasm): | |||
return not CLOSED_WORLD and all_disallowed(['shared-everything']) and not NANS | |||
|
|||
|
|||
# Test --fuzz-preserve-imports-exports, which never modifies imports or exports. | |||
class PreserveImportsExports(TestCaseHandler): | |||
frequency = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems wasteful to run this even 10% of the time, since it's only testing the fuzzer itself. Can we test this option some other way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do similar things for ClusterFuzz. Fuzzing our fuzzing tools is useful I think!
How else would we do it? Doing some deterministic workload will not find more bugs over time. Doing it in some fuzzer on the side will require us to remember to run it, unlike the main fuzzer that I constantly run. But I'm open to more ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've actually noticed significant slowdowns in the fuzzer due to the fuzzing of the ClusterFuzz fuzzing, so I do think it would be nice to have a general solution here. Maybe an opt-in or opt-out option to fuzz_opt.py like --[no-]fuzz-self
or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to check, is that the initial setup? The first time the ClusterFuzz handler runs, it creates the bundle, which takes multiple seconds, unfortunately. But later runs are super-fast. I agree this is annoying when you just want to fuzz 100 iterations, of course.
I'd be fine with an option to skip such handlers. Or some CLI command that lets you pick the fuzzer handlers in a generic way? Right now they are a list in Python...
Note though that these do not just test the fuzzer itself. The ClusterFuzz handler and this new one do generate new shapes of wasm that we don't otherwise see. ClusterFuzz will emit stuff in the wasm that we can't compare to our interpreter, for example, like multiple builds and runs of the wasm. And this new handler will tack on changes to an existing testcase. These things can find actual bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to check, is that the initial setup? The first time the ClusterFuzz handler runs, it creates the bundle, which takes multiple seconds, unfortunately. But later runs are super-fast.
Yes, it's likely that this is what I saw. It would have been near the beginning of the fuzzing run when I am actively watching the logging to see if it fails quickly.
Note though that these do not just test the fuzzer itself...
Ok, I don't think I fully appreciated that. Adding an option to control this doesn't seem urgent, then.
scripts/fuzz_opt.py
Outdated
# Imports and exports are relevant. | ||
lines = [line for line in wat.splitlines() if '(export ' in line or '(import ' in line] | ||
|
||
# Ignore type names, which may vary from $5 to $17 in uninteresting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it $5 to $17
because of the particular contents of preserve_input.dat
? This comment seems very likely to become stale at some point. Can we generalize it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that is just an example. I'll clarified it now.
src/tools/fuzzing/fuzzing.cpp
Outdated
// Ensure the initial memory can fit the segment (so we don't just trap), | ||
// but only do so when the segment is at a reasonable offset (to avoid | ||
// validation errors on the initial size >= 4GB in wasm32, but also to | ||
// avoid OOM errors on trying to allocate too much initial memory). | ||
Address ONE_GB = 1024 * 1024 * 1024; | ||
if (maxOffset <= ONE_GB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this related to the rest of the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I found this issue while fuzzing with this PR, and should have split it out... Removed.
|
||
;; And, without the flag, we do generate both imports and exports. | ||
|
||
;; RUN: wasm-opt %s.ttf --initial-fuzz=%s -all -ttf --metrics -S -o - | filecheck %s --check-prefix=NORMAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the large space here hurts readability more than it helps. Another way to achieve the desired effect would be to put the | filecheck ...
on a second RUN:
line, using a backslash at the end of the first RUN:
line to join them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
scripts/fuzz_opt.py
Outdated
# We cannot run if the module has (ref exn) in globals (because we have | ||
# no way to generate an exn in a non-function context). The fuzzer is | ||
# careful not to emit that in testcases, but after the optimizer runs, | ||
# we may end up with struct fields getting refined to that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the connection between struct fields having type (ref exn)
and globals having type (ref exn)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a struct has a field (ref exn)
and we pick that type for a global, we will end up doing struct.new
with a list of initial values of the fields, one of whom will be (ref exn)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And do we always construct global structs by first constructing separate globals for each of their fields? Otherwise the (ref exn)
type would not actually appear in the list of global definitions.
This all seems rather brittle :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we do not create separate globals for each field. The problem is not in a global having this type, the problem is trying to create such a value. E.g.
(type (struct $struct_with_exn (field i32) (field (ref exn)) (field f64)))
(global $struct_with_exn (struct.new $struct_with_exn
(i32.const 42)
.. what do we put here for (ref exn)? ..
(f64.const 3.14159)
))
We have no way to create such a value in the global scope. In a function, we use the trick of (try (throw))
i.e. we throw and catch, giving us a (ref exn)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, that example won't be caught by the code below, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point! 😆 That should say "struct", not "global"... I'll fix it.
It is really hard to test this stuff, unfortunately. This is in response to a random fuzzer input, and if we added such inputs to the test suite, they'd get out of date quickly, with no good way to find a replacement that happens to hit the same issue...
scripts/fuzz_opt.py
Outdated
# no way to generate an exn in a non-function context). The fuzzer is | ||
# We leave if the module has (ref exn) in struct fields (because we have | ||
# no way to generate an exn in a non-function context, and if we picked | ||
# that struct for a global, we'd end up needing a (ref enx) in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# that struct for a global, we'd end up needing a (ref enx) in the | |
# that struct for a global, we'd end up needing a (ref exn) in the |
Do we also need to check for globals that have type (ref exn)
like the previous version of this code did, or do we think that that should never occur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it can be a problem. The fuzzer doesn't create such things originally, and no optimization pass can either AFAICT. (Well, a pass can refine (global $g (ref null exn) (..))
to a non-nullable type, but the value already existed there - so the value must still be valid to use.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Struct types are a hazard because (1) we can refine their fields to (ref exn)
and also (2) we may decide to create new globals with them. A single existing global, in comparison, is fine.
Another option for structs could be to avoid creating globals for structs with such fields, but we'd need to recurse to the types of fields etc.; the current hack seems simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do already have some logic for detecting uninhabitable heap types, e.g. those with cycles of non-nullable references. Perhaps we could extend it to detect types that are uninhabitable in constant expressions without adding new imports.
But the hack seems fine for now.
Normally the fuzzer will add new imports (for the JS stuff we can
call), and new exports as it adds functions. This adds an option to
avoid that, and instead keep the imports and exports fixed. This
is useful when we are used to modify an existing fuzzer's testcase,
as its connection to the JS side should be considered fixed (and it
will run in that fuzzer's JS, not ours).
This is added as
for wasm-opt.
Diff without whitespace is smaller, as much of the diff in
fuzzing.cpp is to put stuff behind a flag.