Skip to content

Made FuncCastEmulation make and export thunks for exports #3474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

joemarshall
Copy link

This is a fix for some problems I had when building pyodide with upstream wasm, which exposed that the EMULATE_FUNCTION_POINTER_CASTS function doesn't properly work between SIDE_MODULE and MAIN_MODULE yet.

In particular it fixes:

  1. Using dlsym to call a function in a SIDE_MODULE is entirely broken with this setting on.
  2. If a SIDE_MODULE passes a function pointer to the MAIN_MODULE and the main module calls it, the code crashes.

This is because currently the code only outputs thunks for things in the wasm table.
This breaks dynamic linking emscripten SIDE_MODULES because they don't have a table,
they instead load things into the main module table on import.

This means dlopen and dlsym don't work if EMULATE_FUNCTION_POINTER_CASTS==1, because EMULATE_FUNCTION_POINTER_CASTS in the calling module makes all indirect calls have the fixed call signature, whereas the functions returned from dlsym are still the original call signature. In combination with a version of emscripten which knows about this change (separate pull request headed their way), function pointers can work through dlsym, as also does passing a function pointer out of a module, which currently breaks as the function pointer in the wasm table is still hooked up to the old function on import, whereas with this fix emscripten can hook it up to the fixed call signature version.

This is my first PR to binaryen, so please feel free to be super critical and/or hack it apart by the way.

Currently it only outputs thunks for things in the wasm table.
This breaks dynamic linking emscripten SIDE_MODULES because they don't have a table.
This means dlopen and dlsym don't work if EMULATE_FUNCTION_POINTER_CASTS==1,
because EMULATE_FUNCTION_POINTER_CASTS in the calling module makes all calls
have the fixed call signature, whereas the returns from dlsym are still the original
call signature. In combination with a version of emscripten which knows about this
change (separate pull request headed their way), function pointers work through
dlsym, as does passing a function pointer out of a module, which currently breaks
as the function pointer in the wasm table is still to the old function, whereas
with this fix emscripten can hook it up to the fixed call signature version..
@joemarshall
Copy link
Author

I need to fix the tests on this, sorry. Will update once that is done.

@kripken
Copy link
Member

kripken commented Jan 11, 2021

whereas the functions returned from dlsym are still the original call signature

Is the issue that dlsym looks at the exports, and the exports have the normal signature, without a thunk that converts them to the fixed signature for function cast emulation? If so then I think this makes sense, but I'm not sure why this PR is needed - can't it all be done on the emscripten side? That is, emscripten's addFunctionWasm should be adding a thunk for the fixed signature.

Btw, how did pyodide ever work, before these fixes?

Also maybe @sbc100 will have other thoughts.

Base automatically changed from master to main January 19, 2021 21:59
@joemarshall
Copy link
Author

Hey,

I did more on this and I have got it mostly working in pyodide, but it is a massive pig, because with the way it requires rewriting of the wasm table so that taking function pointers works breaks things that call invoke functions via JavaScript (which expect unmodified signatures). My current hack in pyodide maps dyncalls back to the unmodified version, but I've put a suggestion in issues to flip the whole thing round and just do the remapping on taking function pointers??..

@joemarshall
Copy link
Author

I think back in the fastcomp backend, pyodide worked because asm.js is very much more cool about mismatched function signatures. So going via asm.js fixed all problems.

No one else ever got pyodide to work in upstream wasm, I'm the first person to do it, which is why this stuff hasn't been fully tested by pyodide.

So, the details that mean it isn't as simple as just wrapping in emscripten are:

  1. Firstly, in binaryen as is, there's no wrapping anything in side modules. It does rewrite the code in side modules, but if you take a function pointer in a side module, you get the bare function, whereas if you call a function pointer in a side module it tries to call with the fpcast signature. It only wraps the main module as that owns the wasm table. So if you make a function pointer in a side module bad things happen.
  2. Secondly, it is hard to do javascript things with fpcast unless you use bigint, because you can't call int64 functions from js without legalization stuff.
  3. Basically Everything breaks if you pass function pointers around because function pointers can point to fpcast versions or not depending on how you get them and there's no way to check what you have.

I think to me it would make most sense if any time you got a function pointer, it pointed to something with fpcast signature. That would be consistent which would make me happy.

@sbc100
Copy link
Member

sbc100 commented Jan 29, 2021

@joemarshall .. I wonder if its worth revisiting pyodide's (i.e. python's) dependency on EMULATE_FUNCTION_POINTER_CASTS. I wonder if upstream python has changed in the past years? Do you know exactly what part of codebase is depending on this feature? iirc its undefined behaviour so there could be an argument for changing upstream rather than trying fight with binaryen/emscripten?

@joemarshall
Copy link
Author

Funnily enough I just posted a comment to suggest that we should try that. Python itself I think will build with strict type checking as of 3.8, packages are what makes it hard

I've already fixed scipy, numpy, and pandas which were the big things that used random signatures. Python itself I think can probably build with strict signature checking, it's the packages that use old code, a lot of which does old school c things like use implicit declarations and /or just use void and int interchangeably in externs and packages that use Fortran which is compiled with f2c because there is as yet no non hacky, working wasm Fortran compiler, along with f2py wrappers which use inconsistent translations from the fortran calls. A lot of the old code in scipy and numpy just didn't use headers, so everything is brought in through ad hoc externs in c or Fortran code.

@sbc100
Copy link
Member

sbc100 commented Jan 29, 2021

That is interesting. I didn't know such code was widespread. Until now python was the only project that I know of that actually needed this feature.

@joemarshall
Copy link
Author

More thoughts on this - I think the long term aim has to be to avoid needing this in any projects. But the pain is that it is really hard to debug things that make these fpcast errors. The main reason I see for that is because you get an error and best case, boom, you're deep inside wasm calls, and the browser debugger stops on an exception, and you can at least see what function index is being called and the direct arguments to it. In the more common case it seems that the browser doesn't catch that exception even if you say stop on all exceptions, and the first you see of it is in a javascript exception handler higher up, where you get a call stack but no idea of what was being call-indirected at the time.

One way I can see of doing it is rather than by thunking at all, instead making javascript wrappers for all call_indirect calls, like the invoke_xxx calls that emscripten currently uses. In fact it could probably just use those. So invoke_TTTT[number,args] is called by any call indirect instructions. Then in the invoke javascript call, have it catch the exception that is generated by wrong argument calls, which is easy to debug because it happens in javascript with all the debugging info and arguments handy. That call could even do the fpcast fix as well, either by using the new reflection stuff, or reading the signatures from function table on module load (or even compiling the function sigs in as json).

Basically that would give a guaranteed rock solid but low performance way of running fpcast emulation, along with logging of when it was required, which would allow project maintainers of things like pyodide to easily debug what needs fixing in their C codebases.

@sbc100
Copy link
Member

sbc100 commented Jan 31, 2021

Hmm, I kind of like that idea,

@kripken
Copy link
Member

kripken commented Feb 1, 2021

@joemarshall Good idea! I also like it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants