Adventures in porting SpiderMonkey to the WASI WebAssembly platform

Hello hackers, today we are about to go on a trip into the world of C++ and WebAssembly. Recently I was lucky enough to take part in porting SpiderMonkey, a big and complex codebase, to the WASI platform, and I want to share some of that experience with you.

If you aren’t familiar with WebAssembly, please read this documentation about it first.

Introduction

SpiderMonkey is a JavaScript engine written in C++. It compiles and executes JavaScript source code in the Firefox browser and other products. One of the first mentions of SpiderMonkey dates from 1996, so it is a really old, solid codebase. It is so old that you can even find C function declarations in Kernighan & Ritchie style.

You may ask yourself: why would someone want such a big, industrial-strength JavaScript engine as a .wasm module? You can read about that in an article by Lin Clark on the Bytecode Alliance blog. In this post, we will mostly discuss the technical aspects of getting SpiderMonkey to compile and run on this platform. By the way, all bugs and patches are public, so don’t hesitate to inspect this bug.

This work was a collaboration between Igalia, Fastly and Mozilla; Fastly’s Chris Fallin produced the first prototype and Igalia engineers helped in getting it upstream.

Instruments and restrictions

SpiderMonkey can JIT-compile JavaScript code, but with WebAssembly we can’t just generate fresh code at run-time and then run it, at least not without generating a separate WebAssembly module and instantiating it. Therefore our first port of SpiderMonkey just includes the JavaScript interpreter written in C++.

There are two options for compiling C++ to WASI: using emscripten or using clang directly. Emscripten is the best choice when you are targeting a web browser, but that’s not the case here -- we are just targeting WASI, which defines its own standard library. Therefore we can just use upstream clang as the compiler, and then link the result to the WASI C library. An advantage of this approach is that we minimize the size of the resulting .wasm file, while still allowing the result to run in a browser.

Though you can just use upstream clang, invoking the compiler with --target=wasm32-unknown-wasi, it’s easier to use the compiler from the WASI SDK. We added a new CPU target to SpiderMonkey’s build system, cpu = wasm32, passed the additional parameters --target=wasm32-unknown-wasi --sysroot=wasi-sysroot to clang, and ran the build. Of course, with only that change, nothing worked. We got many compilation errors about missing header files such as <thread> and <signal.h>, and about functions missing from <unistd.h>. Our journey had just begun...

Missing syscall functionality

WASI is at an early stage and some functionality is still missing. For example, SpiderMonkey uses getpid(), which WASI does not provide, and this was our first compilation error. getpid() returns the identifier of the currently running process, and since there is only one process in a wasm module, we can simply replace this function with a stub - https://phabricator.services.mozilla.com/D110070.
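To make the idea of a stub concrete, here is a minimal sketch. The function name stubGetPid and the returned value are assumptions made for illustration; the linked patch may name and shape things differently.

```cpp
// Hypothetical stand-in for getpid() on a platform without processes.
// Both the name and the constant are illustrative assumptions.
inline int stubGetPid() {
  // A wasm module runs as a single "process", so callers only need
  // a value that stays the same for the lifetime of the module.
  return 1;
}
```

Any fixed value would do here, as long as every caller sees the same one.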

Threads and signals

We ran the build again and worked through the new error messages. This time the problem was missing support for <thread> and <signal.h>. Unfortunately WASI doesn’t support threads yet, although there is work under way to add them. To handle this error we configured SpiderMonkey to use only one thread, and wrote simple stub implementations of std::thread and std::atomic to get past compilation: https://bugzilla.mozilla.org/show_bug.cgi?id=1701613, https://phabricator.services.mozilla.com/D110215.
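As a rough illustration of running with only one thread, the sketch below picks the number of extra worker threads at compile time. The helper maxHelperThreads is hypothetical and is not SpiderMonkey’s actual configuration code; the only real name used is __wasi__, the macro clang predefines when targeting WASI.

```cpp
#include <cstddef>

// Hypothetical helper: how many extra worker threads may be started.
// Not SpiderMonkey's real API, just a sketch of the idea.
inline std::size_t maxHelperThreads(std::size_t hardwareCores) {
#ifdef __wasi__
  // No thread support on WASI: everything runs on the main thread.
  (void)hardwareCores;
  return 0;
#else
  // Elsewhere, leave one core for the main thread.
  return hardwareCores > 1 ? hardwareCores - 1 : 0;
#endif
}
```

With a guard like this, the rest of the code never tries to spawn a thread on WASI, so a stubbed std::thread only has to compile, not to work.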

With signal handling we can use the same approach: we just cut out all functionality that won’t work without signals - https://phabricator.services.mozilla.com/D110216. Luckily, SpiderMonkey in interpreter-only mode can work without any signal support. Funnily enough, the WebAssembly execution engine in SpiderMonkey uses signals, so it had to be disabled. So, for now at least, the SpiderMonkey built on top of WebAssembly won’t be able to run WebAssembly itself.

Memory stuff

We ran the build again and got an error about mmap being missing in WASI. Well, if mmap isn’t supported yet, we can use malloc, which WASI does support. That works, but the chunk then has to be aligned by hand each time, because mmap returns a pointer aligned to the page size and malloc makes no such promise. Actually, even without alignment everything worked, but there might be code in SpiderMonkey that asserts alignment, and we may just have been lucky not to run into it yet. So malloc was replaced with posix_memalign, which suits our purposes ideally - https://phabricator.services.mozilla.com/D110075. The dlmalloc allocator used by WASI has optimized support for posix_memalign, so this is a fine solution.
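A minimal sketch of the posix_memalign approach follows. The helper names mapAlignedChunk, unmapAlignedChunk and isAligned are hypothetical, and the caller chooses the alignment; the real patch may differ in both respects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Hypothetical replacement for an mmap-based chunk allocator.
// Returns nullptr on failure.
inline void* mapAlignedChunk(std::size_t size, std::size_t alignment) {
  // posix_memalign needs a power of two that is a multiple of sizeof(void*).
  assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
  void* region = nullptr;
  if (posix_memalign(&region, alignment, size) != 0) {
    return nullptr;
  }
  // Note: unlike anonymous mmap, this memory is not zero-filled.
  return region;
}

// Counterpart of munmap for chunks obtained above.
inline void unmapAlignedChunk(void* region) { free(region); }

// True if p is a multiple of alignment.
inline bool isAligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A caller would request, say, a 64 KiB chunk with 64 KiB alignment and release it with unmapAlignedChunk when done.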

Rust

With all our hacks in place, compilation passed. Yay! But the linker found some errors, of course. It turns out that SpiderMonkey uses components implemented in Rust, and one of these components exports the symbol exit; the libc from the WASI SDK exports this symbol too, so the link failed. I tried to cut out all the Rust components, but that is really hard because they are used in a lot of places. Luckily, in the newer WASI SDK 12 the exit symbol was renamed, so to fix this error I just updated the SDK. Now both compilation and linking pass.

Over-recursion: emulated C++ stack

If we run SpiderMonkey’s test suite via wasmtime, we find that the tests which check for excessive recursion fail. Let’s find out why. When we execute the final WebAssembly module we have two stacks instead of the single one we have in native C++. The first is the stack of WebAssembly itself, which stores WebAssembly’s own local variables and function frames. The second is an emulated C++ stack, whose stack pointer is just an address in WebAssembly.Memory. The host should handle overflow of the WebAssembly stack, and we should handle overflows of the emulated C++ stack ourselves. The memory layout inside WebAssembly.Memory looks like the layout of a typical native program:

[Figure: WASI memory layout]

You can read about why it looks like this in Everything Old is New Again: Binary Security of WebAssembly, Figure 4.

Let’s deal with the emulated stack in linear memory first. We want to check both limits of the stack, the upper one and the lower one. To achieve this we change the memory layout so that the stack is placed at the very beginning of linear memory. A stack that grows down past address zero then makes an out-of-bounds access, so the host can check for overflows.

[Figure: final memory layout]

SpiderMonkey in turn checks for underflows from the other side via the Stack Base, which is equal to the size of the stack.
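The check against the stack base can be sketched as below. This assumes the layout described above (the stack occupies the start of linear memory and grows downward from the stack base); the function names and the notion of a quota are hypothetical and only illustrate the general technique of comparing a stack address against precomputed bounds.

```cpp
#include <cstddef>
#include <cstdint>

// Lowest address the emulated stack pointer may reach, given the stack base
// and the number of bytes the engine allows itself to use (an assumed quota).
inline std::uintptr_t stackLimit(std::uintptr_t stackBase, std::size_t quota) {
  return quota >= stackBase ? 0 : stackBase - quota;
}

// True if sp lies inside the allowed window: above the limit (no overflow)
// and not past the stack base (no underflow).
inline bool stackPointerIsValid(std::uintptr_t sp, std::uintptr_t limit,
                                std::uintptr_t stackBase) {
  return sp > limit && sp <= stackBase;
}
```

For a 1 MiB stack with a 512 KiB quota, an address of 600000 passes the check, while 500000 (too deep) and 2000000 (past the base) do not.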

Over-recursion: handle WebAssembly stack pointer

Some C++ functions that implement the JavaScript API may not touch linear memory at all, yet can nest very deeply. For example, this trivial C++ function doesn’t use the emulated stack at all:

int foo(int a, int b) {
   return a + b;
}

Compiled to WebAssembly, it uses only locals and the WebAssembly operand stack:

foo(int, int):
       local.get       1
       local.get       0
       i32.add
       end_function

So some of SpiderMonkey’s functions that implement the JavaScript API, for example RegExp, may use the emulated stack pointer in linear memory only partially or not at all, but we want to handle over-recursion for this kind of function too. Thus a manual RAII-based automatic recursion limiter was added - https://phabricator.services.mozilla.com/D111813. Objects of this class count how many times we have entered a particular C++ function and report over-recursion when the predefined limit is reached.
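The general shape of such an RAII counter might look like the sketch below. The class name RecursionGuard, the function descend and the way the counter is passed in are assumptions for illustration, not the code from the linked patch.

```cpp
#include <cstddef>

// Hypothetical RAII guard: bumps a depth counter on entry, restores it on exit.
class RecursionGuard {
 public:
  RecursionGuard(std::size_t& depth, std::size_t limit)
      : depth_(depth), ok_(depth < limit) {
    ++depth_;
  }
  ~RecursionGuard() { --depth_; }

  RecursionGuard(const RecursionGuard&) = delete;
  RecursionGuard& operator=(const RecursionGuard&) = delete;

  // False once the nesting depth has reached the limit.
  bool ok() const { return ok_; }

 private:
  std::size_t& depth_;
  bool ok_;
};

// Hypothetical recursive function protected by the guard.
// Returns false when over-recursion is detected.
inline bool descend(std::size_t levels, std::size_t& depth, std::size_t limit) {
  RecursionGuard guard(depth, limit);
  if (!guard.ok()) {
    return false;  // report over-recursion instead of going deeper
  }
  if (levels == 0) {
    return true;
  }
  return descend(levels - 1, depth, limit);
}
```

Because the destructor runs on every exit path, the counter returns to its starting value whether the recursion finishes normally or bails out early.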

CI Build

After all our exercises and workarounds, SpiderMonkey.wasm passed all tests, and the kind people at Mozilla agreed to set up a CI build to catch regressions - https://bugzilla.mozilla.org/show_bug.cgi?id=1710358. At the time of writing the build is green and the WASI target is considered to be a Tier 2 platform for upstream SpiderMonkey.

Online demo

Just for fun I used SpiderMonkey.wasm in the browser with WASI polyfill provided by wasmer: online demo. It is a personal and extremely experimental version that allows you to run JavaScript scripts in the browser. Of course, there has never been such a thing...

Conclusion

WebAssembly is becoming a way to bring native apps to the web, and I think that is very good. Tools are improving, and the SpiderMonkey example shows that even very large, complex software can run via WebAssembly. Here is just a small nostalgic selection of what WebAssembly already lets you run: Doom3, Pokemon Blue, Baldur's Gate 2, Spy Fox and Diablo 1. No need to install or configure anything - just click the link and have fun. Yes, the porting process can be annoying, but WASI evolves with time and the process becomes easier. As for me, I think that bringing tools to the web that simplify work, need no installation or updates, save time on deployment, and run on anything that has a browser is a good thing.

Special thanks

I want to say thanks to my colleague Andy Wingo, who helped me with this post and with all this stuff. Thanks Andy! Also, check out his amazing blog.

Also, I want to say thanks to Jan de Mooij, who patiently reviewed all our patches. Thanks!

Thanks for reading.