[Strings] Add a string-builtins feature, and lift/lower automatically when enabled #7601

Open
wants to merge 26 commits into
base: main
Merge remote-tracking branch 'origin/main' into sb
kripken committed May 15, 2025
commit 8ad61b59409cb0c315ab772df886b876877c6bb3
29 changes: 12 additions & 17 deletions README.md
@@ -184,29 +184,24 @@ There are a few differences between Binaryen IR and the WebAssembly language:
  all function reference types will be emitted as `funcref`.

  * `br_if` output types are more refined in Binaryen IR: they have the type of
- the value, when a value flows in. In the wasm spec the type is that of the
- branch target, which may be less refined. Using the more refined type here
- ensures that we optimize in the best way possible, using all the type
- information, but it does mean that some roundtripping operations may look a
- little different. In particular, when we emit a `br_if` whose type is more
- refined in Binaryen IR then we emit a cast right after it, so that the
- output has the right type in the wasm spec. That may cause a few bytes of
- extra size in rare cases (we avoid this overhead in the common case where
- the `br_if` value is unused).
+ the sent value operand, when it exists. In the Wasm spec the type is that
+ of the branch target, which may be less refined. Using the more refined
+ type here ensures that we optimize in the best way possible, using all the
+ type information, but it does mean that some roundtripping operations may
+ look a little different. In particular, when we emit a `br_if` whose type
+ is more refined in Binaryen IR, then we emit a cast right after it to
+ recover the more refined type. That may cause a few bytes of extra size in
+ rare cases (we avoid this overhead in the common case where the `br_if`
+ value is unused).

  * Strings
- * When the string builtins feature is enabled (`--enable-string=builtins`),

+ * When the string builtins feature is enabled (`--enable-string-builtins`),
  string operations are optimized. First, string imports are lifted into
  stringref operations, before any default optimization passes. Those
  stringref operations can then be optimized (e.g., a concat of constants
  turns into a concatenated constant). When we are about to finish running
  default optimizations, we lower stringref back into string builtins.
- * Binaryen allows string views (`stringview_wtf16` etc.) to be cast using
- `ref.cast`. This simplifies the IR, as it allows `ref.cast` to always be
- used in all places (and it is lowered to `ref.as_non_null` where possible
- in the optimizer). The stringref spec does not seem to allow this though,
- and to fix that the binary writer will replace `ref.cast` that casts a
- string view to a non-nullable type to `ref.as_non_null`. A `ref.cast` of a
- string view that is a no-op is skipped entirely.

  As a result, you might notice that round-trip conversions (wasm => Binaryen IR
  => wasm) change code a little in some corner cases.
You are viewing a condensed version of this merge commit. You can view the full changes here.
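
For readers of this diff, the lift / optimize / lower flow described in the Strings bullet can be pictured with a small toy model. This is only an illustrative sketch in Python, not Binaryen's implementation: the classes `Const`, `Local`, `ImportCall`, and `StringOp`, and the helper `run_pipeline`, are hypothetical names invented here, and the only optimization modeled is the "concat of constants" case the README text mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    """A string constant."""
    value: str


@dataclass(frozen=True)
class Local:
    """A value that is not known at compile time."""
    name: str


@dataclass(frozen=True)
class ImportCall:
    """Hypothetical stand-in for a call to an imported string builtin."""
    name: str
    args: tuple


@dataclass(frozen=True)
class StringOp:
    """Hypothetical stand-in for a stringref-style IR operation."""
    name: str
    args: tuple


def rewrite(expr, fn):
    """Rebuild expr bottom-up, applying fn to each node after its children."""
    if isinstance(expr, (ImportCall, StringOp)):
        new_args = tuple(rewrite(arg, fn) for arg in expr.args)
        expr = type(expr)(expr.name, new_args)
    return fn(expr)


def lift(node):
    """Turn a builtin import call into the richer IR operation."""
    return StringOp(node.name, node.args) if isinstance(node, ImportCall) else node


def fold_concat(node):
    """Fold a concat whose operands are all constants into one constant."""
    if (isinstance(node, StringOp) and node.name == "concat"
            and all(isinstance(arg, Const) for arg in node.args)):
        return Const("".join(arg.value for arg in node.args))
    return node


def lower(node):
    """Turn any remaining IR operation back into a builtin import call."""
    return ImportCall(node.name, node.args) if isinstance(node, StringOp) else node


def run_pipeline(expr):
    """Lift first, optimize in the middle, lower at the end."""
    lifted = rewrite(expr, lift)
    optimized = rewrite(lifted, fold_concat)
    return rewrite(optimized, lower)


# A concat of two constants collapses into a single constant.
folded = run_pipeline(ImportCall("concat", (Const("foo"), Const("bar"))))

# A concat with an unknown operand survives and is lowered back to an import call.
partial = run_pipeline(
    ImportCall("concat", (Local("x"), ImportCall("concat", (Const("a"), Const("b")))))
)
```

In this sketch `folded` ends up as a single `Const`, while `partial` is still an `ImportCall` whose inner constant concat has been folded. The ordering is the point: the folding step only ever sees `StringOp` nodes, which mirrors why the README says imports are lifted before the default optimization passes and lowered once they are about to finish.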