Fuzzer: Add an option to preserve imports and exports #7300

kripken · 2025-02-19T00:32:25Z

Normally the fuzzer adds new imports (for the JS functions we can
call) and new exports as it adds functions. This PR adds an option to
avoid that and keep the imports and exports fixed instead. That
is useful when the fuzzer is used to modify an existing testcase from
another fuzzer, as the testcase's connection to the JS side should be
considered fixed (it will run in that fuzzer's JS, not ours).

This is added as

--fuzz-preserve-imports-exports

for wasm-opt.
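As a rough illustration of how the new flag sits next to the other fuzzing options, here is a small Python sketch that builds a wasm-opt command line. The surrounding flags are copied from the RUN line of this PR's lit test (quoted in the review below); the helper name `build_fuzz_command` and the file names are hypothetical and are not part of the PR.

```python
def build_fuzz_command(random_input, initial_wasm, preserve=True):
    """Build a wasm-opt invocation that fuzzes on top of an existing testcase.

    Hypothetical helper: only the flags themselves come from the PR.
    """
    cmd = [
        'wasm-opt',
        random_input,                      # random bytes that drive the fuzzer
        '-ttf',                            # translate the input to a fuzz module
        '--initial-fuzz=' + initial_wasm,  # existing testcase to start from
        '-all',                            # enable all features
    ]
    if preserve:
        # Keep the testcase's imports and exports exactly as they are.
        cmd.append('--fuzz-preserve-imports-exports')
    # Print metrics and emit text to stdout, as the lit test does.
    cmd += ['--metrics', '-S', '-o', '-']
    return cmd


print(' '.join(build_fuzz_command('random.dat', 'testcase.wasm')))
```

Passing `preserve=False` gives the normal mode, in which the fuzzer is free to add imports and exports.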

Diff without whitespace is smaller, as much of the diff in
fuzzing.cpp is to put stuff behind a flag.

tlively · 2025-02-19T21:37:10Z

scripts/fuzz_opt.py

@@ -1749,6 +1748,40 @@ def can_run_on_wasm(self, wasm):
        return not CLOSED_WORLD and all_disallowed(['shared-everything']) and not NANS


+# Test --fuzz-preserve-imports-exports, which never modifies imports or exports.
+class PreserveImportsExports(TestCaseHandler):
+    frequency = 0.1


It seems wasteful to run this even 10% of the time, since it's only testing the fuzzer itself. Can we test this option some other way?

We do similar things for ClusterFuzz. Fuzzing our fuzzing tools is useful I think!

How else would we do it? Doing some deterministic workload will not find more bugs over time. Doing it in some fuzzer on the side will require us to remember to run it, unlike the main fuzzer that I constantly run. But I'm open to more ideas?

I've actually noticed significant slowdowns in the fuzzer due to the fuzzing of the ClusterFuzz fuzzing, so I do think it would be nice to have a general solution here. Maybe an opt-in or opt-out option to fuzz_opt.py like --[no-]fuzz-self or something like that?

Just to check, is that the initial setup? The first time the ClusterFuzz handler runs, it creates the bundle, which takes multiple seconds, unfortunately. But later runs are super-fast. I agree this is annoying when you just want to fuzz 100 iterations, of course.

I'd be fine with an option to skip such handlers. Or some CLI command that lets you pick the fuzzer handlers in a generic way? Right now they are a list in Python...

Note though that these do not just test the fuzzer itself. The ClusterFuzz handler and this new one do generate new shapes of wasm that we don't otherwise see. ClusterFuzz will emit stuff in the wasm that we can't compare to our interpreter, for example, like multiple builds and runs of the wasm. And this new handler will tack on changes to an existing testcase. These things can find actual bugs.

Just to check, is that the initial setup? The first time the ClusterFuzz handler runs, it creates the bundle, which takes multiple seconds, unfortunately. But later runs are super-fast.

Yes, it's likely that this is what I saw. It would have been near the beginning of the fuzzing run when I am actively watching the logging to see if it fails quickly.

Note though that these do not just test the fuzzer itself...

Ok, I don't think I fully appreciated that. Adding an option to control this doesn't seem urgent, then.
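For reference, the opt-in/opt-out idea floated in this thread could look roughly like the sketch below. Nothing here is in the PR: the `--fuzz-self` / `--no-fuzz-self` option, the `fuzzes_self` attribute, `OrdinaryHandler`, and `pick_handlers` are all hypothetical names. Only the `frequency = 0.1` attribute on `PreserveImportsExports` comes from the diff above.

```python
import argparse
import random


class TestCaseHandler:
    # Fraction of iterations on which the handler runs.
    frequency = 1.0
    # Hypothetical marker for handlers that mostly exercise the fuzzing tools.
    fuzzes_self = False


class OrdinaryHandler(TestCaseHandler):
    pass


class PreserveImportsExports(TestCaseHandler):
    frequency = 0.1
    fuzzes_self = True


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction (Python 3.9+) provides both --fuzz-self and
    # --no-fuzz-self from a single declaration.
    parser.add_argument('--fuzz-self', action=argparse.BooleanOptionalAction,
                        default=True)
    return parser.parse_args(argv)


def pick_handlers(handlers, fuzz_self, rng):
    """Choose the handlers to run on one iteration."""
    chosen = []
    for handler in handlers:
        if handler.fuzzes_self and not fuzz_self:
            continue  # the user opted out of self-fuzzing handlers
        if rng.random() < handler.frequency:
            chosen.append(handler)
    return chosen


args = parse_args(['--no-fuzz-self'])
handlers = pick_handlers([OrdinaryHandler, PreserveImportsExports],
                         args.fuzz_self, random.Random(0))
print([h.__name__ for h in handlers])
```

As noted at the end of the thread, such handlers also produce new shapes of wasm, so opting out would trade some coverage for speed.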

tlively · 2025-02-19T21:38:41Z

scripts/fuzz_opt.py

+            # Imports and exports are relevant.
+            lines = [line for line in wat.splitlines() if '(export ' in line or '(import ' in line]
+
+            # Ignore type names, which may vary from $5 to $17 in uninteresting


Is it $5 to $17 because of the particular contents of preserve_input.dat? This comment seems very likely to become stale at some point. Can we generalize it?

Ah, that is just an example. I've clarified it now.
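The comparison discussed in this thread can be sketched as follows. This is a minimal approximation, not the code in the PR: the list comprehension follows the quoted diff line, but the function name `import_export_lines`, the whitespace stripping, and the exact regex used to ignore type names are assumptions.

```python
import re


def import_export_lines(wat):
    """Return the import and export lines of a wat text, normalized."""
    # Imports and exports are the only lines relevant to the comparison.
    lines = [line.strip() for line in wat.splitlines()
             if '(export ' in line or '(import ' in line]
    # Numeric type names depend on how many types the module happens to
    # contain, so replace them with a fixed placeholder (assumed pattern).
    return [re.sub(r'\(type \$\d+\)', '(type $N)', line) for line in lines]


before = '''(module
 (import "fuzzing-support" "log" (func $log (type $5)))
 (export "run" (func $run))
)'''

after = '''(module
 (import "fuzzing-support" "log" (func $log (type $17)))
 (func $new_internal_function)
 (export "run" (func $run))
)'''

# Adding internal functions and renumbering types must not count as a change.
assert import_export_lines(before) == import_export_lines(after)
```

A new `(export ...)` or `(import ...)` line in the output would make the two lists differ, which is what the handler is meant to catch.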

tlively · 2025-02-19T21:40:59Z

src/tools/fuzzing/fuzzing.cpp

+    // Ensure the initial memory can fit the segment (so we don't just trap),
+    // but only do so when the segment is at a reasonable offset (to avoid
+    // validation errors on the initial size >= 4GB in wasm32, but also to
+    // avoid OOM errors on trying to allocate too much initial memory).
+    Address ONE_GB = 1024 * 1024 * 1024;
+    if (maxOffset <= ONE_GB) {


How is this related to the rest of the PR?

Sorry, I found this issue while fuzzing with this PR, and should have split it out... Removed.

tlively · 2025-02-19T21:46:58Z

test/lit/fuzz-preserve-imports-exports.wast

+
+;; And, without the flag, we do generate both imports and exports.
+
+;; RUN: wasm-opt %s.ttf --initial-fuzz=%s -all -ttf                                 --metrics -S -o - | filecheck %s --check-prefix=NORMAL


I think the large space here hurts readability more than it helps. Another way to achieve the desired effect would be to put the | filecheck ... on a second RUN: line, using a backslash at the end of the first RUN: line to join them.

tlively · 2025-02-20T19:15:05Z

scripts/fuzz_opt.py

+        # We cannot run if the module has (ref exn) in globals (because we have
+        # no way to generate an exn in a non-function context). The fuzzer is
+        # careful not to emit that in testcases, but after the optimizer runs,
+        # we may end up with struct fields getting refined to that.


What's the connection between struct fields having type (ref exn) and globals having type (ref exn)?

If a struct has a field (ref exn) and we pick that type for a global, we will end up doing struct.new with a list of initial values of the fields, one of which will be (ref exn).

And do we always construct global structs by first constructing separate globals for each of their fields? Otherwise the (ref exn) type would not actually appear in the list of global definitions.

This all seems rather brittle :/

No, we do not create separate globals for each field. The problem is not in a global having this type, the problem is trying to create such a value. E.g.

    (type (struct $struct_with_exn (field i32) (field (ref exn)) (field f64)))
    (global $struct_with_exn
      (struct.new $struct_with_exn
        (i32.const 42)
        .. what do we put here for (ref exn)? ..
        (f64.const 3.14159)
      )
    )

We have no way to create such a value in the global scope. In a function, we use the trick of (try (throw)) i.e. we throw and catch, giving us a (ref exn).

Unless I'm missing something, that example won't be caught by the code below, though.

Oh, good point! 😆 That should say "struct", not "global"... I'll fix it.

It is really hard to test this stuff, unfortunately. This is in response to a random fuzzer input, and if we added such inputs to the test suite, they'd get out of date quickly, with no good way to find a replacement that happens to hit the same issue...

tlively · 2025-02-20T20:03:29Z

scripts/fuzz_opt.py

-        # no way to generate an exn in a non-function context). The fuzzer is
+        # We leave if the module has (ref exn) in struct fields (because we have
+        # no way to generate an exn in a non-function context, and if we picked
+        # that struct for a global, we'd end up needing a (ref enx) in the


Suggested change

# that struct for a global, we'd end up needing a (ref enx) in the

# that struct for a global, we'd end up needing a (ref exn) in the

Do we also need to check for globals that have type (ref exn) like the previous version of this code did, or do we think that that should never occur?

I don't think it can be a problem. The fuzzer doesn't create such things originally, and no optimization pass can either AFAICT. (Well, a pass can refine (global $g (ref null exn) (..)) to a non-nullable type, but the value already existed there - so the value must still be valid to use.)

Struct types are a hazard because (1) we can refine their fields to (ref exn) and also (2) we may decide to create new globals with them. A single existing global, in comparison, is fine.

Another option for structs could be to avoid creating globals for structs with such fields, but we'd need to recurse to the types of fields etc.; the current hack seems simpler.

We do already have some logic for detecting uninhabitable heap types, e.g. those with cycles of non-nullable references. Perhaps we could extend it to detect types that are uninhabitable in constant expressions without adding new imports.

But the hack seems fine for now.
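To make the "hack" concrete, here is a rough Python sketch of a text-based check for struct types with a non-nullable exn field. It is not the code from the PR: the function name `has_ref_exn_struct_field` is hypothetical, and it assumes each struct type definition is printed on a single line of the wat.

```python
def has_ref_exn_struct_field(wat):
    """Return True if some struct type in the wat mentions (ref exn).

    Assumes one type definition per line. '(struct ' with a trailing space
    avoids matching instructions such as struct.new.
    """
    for line in wat.splitlines():
        if '(struct ' in line and '(ref exn)' in line:
            return True
    return False


risky = '(type $s (struct (field i32) (field (ref exn)) (field f64)))'
nullable = '(type $s (struct (field i32) (field (ref null exn))))'
only_global = '(global $g (ref exn) (global.get $imported))'

print(has_ref_exn_struct_field(risky))        # True
print(has_ref_exn_struct_field(nullable))     # False
print(has_ref_exn_struct_field(only_global))  # False
```

The last case reflects the point made above: an existing global of type (ref exn) already has a valid value, so only struct types are treated as a hazard.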

kripken added 30 commits February 5, 2025 11:18

start

febd47b

exports

288e043

imports

1031b02

more

c940881

build

64afa44

test

c6e7d6f

test

f85d0c9

test

0c87886

sad

fe355b9

test

bdfed08

test

c43e50e

format

c9c515e

format

c3d3f13

fuzz

aa52170

sad

920d6a1

fix

bc022ee

fix

d7fab8d

fix

ee4c8e0

fix

1fbc119

wat

3afe863

fix

43d1e9f

fuzz.tag

2a07050

fix

53d123f

test

93b68ae

Merge remote-tracking branch 'myself/fuzz.imported.tag' into fuzz.pie

966490b

fix

d17bf82

work

acbee3e

work

10323cf

work

4adf65d

undo

34272e2

kripken added 7 commits February 14, 2025 13:01

Merge remote-tracking branch 'origin/main' into fuzz.pie

70d7a84

Merge remote-tracking branch 'origin/main' into fuzz.pie

7d8f903

finish

398d7e8

finish

ca0b3a4

fix

4a58485

done

844fab9

note

f00da09

kripken requested a review from tlively February 19, 2025 00:32

update help

0e1646e

tlively reviewed Feb 19, 2025

kripken added 3 commits February 19, 2025 14:59

work around (ref exn) issue

f9e7b2a

clarify comment

263959f

undo

616cb0d

tlively reviewed Feb 20, 2025

kripken added 4 commits February 20, 2025 11:17

split RUN lines

e088e18

fix.struct.check

a92aecc

typo

8d803b5

typo

70fed2b

tlively reviewed Feb 20, 2025

tlively approved these changes Feb 20, 2025

kripken merged commit 379e5ec into WebAssembly:main Feb 20, 2025
14 checks passed

kripken deleted the fuzz.pie branch February 20, 2025 22:14

kripken mentioned this pull request Apr 1, 2025

use Owi to check for equivalence of the fuzzer-generated programs #7420

Draft

