grame-cncm / faust

Functional programming language for signal processing and sound synthesis

Home Page: http://faust.grame.fr

[wasm] Appetite for adding alternate faust-to-wasm via c++ and the wasi-sdk?

nuchi opened this issue

What I noticed: Calls from wasm to JS are slow*, so a wasm module that makes many calls to JS's Math functions can incur a lot of overhead. This is true even in Firefox, which has relatively fast wasm-to-JS calls, and it is much worse in Chrome.

For example, I have a module which calls out to Math.pow 800 times per sample. On Firefox this module takes ~28% of a core, i.e. a second's worth of audio takes 0.28 seconds to generate. On Chrome a second's worth of audio takes 1.44 seconds to generate, meaning it can't actually run in realtime. (I'm on an M1 Mac.)

* I noticed this especially in a module that has a lot of calls out to Math.pow, which can happen even when doing something as simple as _ <: _,_ : *, but this problem exists any time you want to use sin/cos/tan/exp/log/etc.

What I did: The wasi-sdk is a C/C++-to-wasm toolchain that includes a C and a C++ standard library. I used it along with faust's C++ backend to generate a wasm file which:

  1. Is a drop-in replacement for the output of faust2wasm -worklet, i.e. can be used with the existing generated .js file.
  2. Links wasm-native versions of sin/cos/tan/exp/log/pow/etc, removing the need for calling out to js.
  3. Additionally allows fvariable, ffunction to be used with wasm.
  4. Potentially allows the use of wasm SIMD, though I didn't actually see any improvements when I tried this. I don't know if that's because there are no benefits to be found or because Firefox's and Chrome's wasm SIMD implementations are tuned for Intel CPUs (I'm on ARM).
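
To illustrate point 2, here is a minimal sketch of the kind of per-sample loop involved. It is not the code from this issue: `compute_pow` and its parameters are hypothetical stand-ins for a generated compute function. The idea is that when a loop like this is compiled by a C++-to-wasm toolchain that ships a libc, `std::pow` can be linked into the module itself rather than imported from JS.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a Faust-generated compute loop.
// Every output sample calls std::pow. With a libc available at link time,
// that call can stay inside the wasm module instead of crossing into JS.
extern "C" void compute_pow(const float* input, float* output,
                            std::size_t count, float exponent) {
    assert(input != nullptr && output != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        output[i] = std::pow(input[i], exponent);
    }
}
```

Whether the call is actually resolved inside the module depends on the toolchain and link settings, so it is worth checking the module's import list after building.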

Results: The same module I used earlier took 0.07 seconds on Firefox and 0.05 seconds on Chrome to generate 1 second of audio (~4x and ~28x speedups, respectively). As a side benefit the binary size went from ~180k (240k before wasm-opt) to ~110k.

Question: Are you interested in a PR for this, maybe as an "experimental wasi sdk" option to pass to faust2wasm? Downsides:

  • The user would have to have the wasi-sdk installed locally
  • The user would have to have binaryen installed locally (for wasm-opt and wasm-ctor-eval)
  • Compilation is slower (my example went from ~0.5s to ~7s).

If so, then I'd take a bit of time to clean it up; I'm thinking I'd add a subdirectory in architecture/webaudio for the necessary C++ files. At the moment I've only checked that it's compatible with faust2wasm -worklet — monophonic, no effect, no emcc — so an initial PR would only add functionality for that.

Having an alternate C/C++ based path to produce an optimised wasm file is certainly interesting. We had something in the past using emcc but it was not really maintained. Several questions here:

  • do you think using optimised versions of the sin/cos/tan/exp/log/etc. functions could even be possible in the dynamic compilation path? I mean having a precompiled wasm file containing all of them, linked to the one generated by libfaust (and so removing the WASM/JS calling cost)?
  • we are still in the process of rewriting the C++/WASM glue code. This is available in the wasm2 and wasm2-merge-master-dev branches, so it would be better to wait for the merge of the wasm2 branch before adding this new functionality
  • is your code visible on some GitHub ?
  • we can possibly continue the discussion on Faust Slack if this is needed

do you think using optimised versions of the sin/cos/tan/exp/log/etc. functions could even be possible in the dynamic compilation path? I mean having a precompiled wasm file containing all of them, linked to the one generated by libfaust (and so removing the WASM/JS calling cost)?

I'm not sure what you mean by "dynamic compilation path". Do you just mean faust's wasm backend? If so, I don't know how a precompiled wasm module could be linked in place of importing JS's Math. I did look into that, so that I could use the wasm backend instead of cpp. The difficulty is that faust would have to generate a position-independent wasm object file, as opposed to a wasm executable, and the linking would still have to be done afterwards. I wouldn't want to write my own linker. log and exp refer to data (they do various interpolations with hard-coded data), so that would need to be linked in as well. All of this makes it harder to write the JSON description to the start of the file, since that assumes nothing important is going to be overwritten.

In my approach I allocate a big empty block and make sure it gets address 0; then I can write the JSON description (in the format that the js modules expect) to address 0 and then log/exp's data goes after. All that is much easier when I can just rely on llvm's wasm linker. I wouldn't want to have to implement that in code (as opposed to via the command line clang/wasm-ld tools) but maybe it's not that hard in libllvm, I don't know.
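
The layout idea can be sketched in portable C++ as follows. This is only a simulation under stated assumptions, not the code from this issue: `reserved_block`, `kReservedBytes` and `write_json_description` are hypothetical names, the block size is arbitrary, and an ordinary buffer stands in for the region that the linker would place at wasm address 0. It also assumes the reader of the description expects a NUL-terminated string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical size of the block reserved at the start of linear memory.
constexpr std::size_t kReservedBytes = 4096;

// Stand-in for the reserved region. In the approach described above this
// block would end up at address 0; here it is just a static buffer.
static char reserved_block[kReservedBytes];

// Copies the NUL-terminated JSON description to the start of the block.
// Returns the number of bytes written, excluding the terminator.
std::size_t write_json_description(const char* json) {
    assert(json != nullptr);
    const std::size_t length = std::strlen(json);
    // The description must fit, so it cannot spill into data placed after it.
    assert(length + 1 <= kReservedBytes);
    std::memcpy(reserved_block, json, length + 1);
    return length;
}

// Returns a pointer to the start of the reserved block.
const char* json_description() { return reserved_block; }
```

The size check is the important part of the sketch: anything the linker places after the block (such as the tables used by log and exp) stays intact only if the description never exceeds the reserved space.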

we are still in the process of rewriting the C++/WASM glue code. This is available in the wasm2 and wasm2-merge-master-dev branches, so it would be better to wait for the merge of the wasm2 branch before adding this new functionality

Thanks for alerting me to these. I notice they have commits going back years. Is there a timeline for getting those merged into master-dev?

is your code visible on some GitHub ?

I will upload it shortly and ping you here when I do.

Thanks. Several comments then:

  • I see that having virtual xxxx() kind of methods is not the most appropriate in your use-case. This could be the time to add a special -lang cpp1 option to generate C++ code without using any virtual.
  • I suggest keeping the faust2cpp2wasm tool separate, possibly adding some more information to the README (like benchmarks...), then adding a reference to it here and here, so as to ease the move to the new wasm2 branch later on

What do you think?

That works for me! I'm not attached to any particular integration — I've got it working to my own satisfaction and for my own purpose, and wanted to post here mainly to ensure that others could benefit if they wanted. Even this discussion would be enough for people to find by searching if they run into similar issues that I ran into, and adding a link or two in the manual would also be useful. Thanks!

Re: virtual methods: in this case virtual was a liability for me because it increased code size (via vtables) and added indirect calls. Indirect calls couldn't always be easily optimized out, and for webassembly in particular that sometimes meant that the linker would try to import wasi system calls like file read/write/seek, which would prevent the whole thing from working.

I don't know how important it would be for other contexts. For others working in embedded environments where time and space are at a premium, I can imagine it would be useful to not have the extra weight of vtables, but I don't concretely know how much of an issue that would really be. I was able to easily get around it with the #define virtual trick; I don't know if it's worth adding another compilation option. (As an aside, it need not be a whole other language backend; could be as simple as a -no-virtual flag to be used alongside -lang cpp, just like -vec, -clang, etc.)
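
For readers unfamiliar with the #define virtual trick mentioned above, here is a minimal sketch of how it can look. The `mydsp` class below is a hypothetical stand-in, not real generated code, and `total_channels` is an illustrative helper. Redefining a keyword with a macro is not portable C++, although mainstream compilers accept it; the sketch includes all standard headers before the macro is defined and undefines it right after the class.

```cpp
#include <cassert>
#include <type_traits>

// Standard headers come first, so the macro only affects the code
// between the #define and the #undef.
#define virtual

// Hypothetical stand-in for a generated DSP class. With the macro active,
// every "virtual" below expands to nothing.
class mydsp {
public:
    virtual ~mydsp() {}
    virtual int getNumInputs() { return 1; }
    virtual int getNumOutputs() { return 2; }
};

#undef virtual

// No virtual members remain, so the class has no vtable.
static_assert(!std::is_polymorphic<mydsp>::value,
              "mydsp should not be polymorphic");

// Calls resolve directly to mydsp's methods, with no indirect dispatch.
int total_channels(mydsp& dsp) {
    const int total = dsp.getNumInputs() + dsp.getNumOutputs();
    assert(total >= 0);
    return total;
}
```

This only works when nothing relies on dynamic dispatch through a base class: a subclass that marks a method `override` would no longer compile, for example.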

OK I'll add a -no-virtual option then.

Done in: 6a98a8e and adapted documentation here and here.