dlqs / webscripten

Home Page: https://dlqs.github.io/webscripten/demo/dist/index.html

Webscripten

Webscripten compiles LLVM IR to WebAssembly. The compiler itself runs as WebAssembly and is built with tools from the LLVM Project, so LLVM IR can be compiled and run entirely within the browser.

Try it in the browser here.

Usage

The Javascript parts of Webscripten can be installed via npm and must be deployed using a Javascript bundler. The non-Javascript parts, i.e. the static files, must be served separately.

The following npm commands can also be replaced with their yarn equivalents.

Installation

  1. Install the package.
    npm install webscripten
    
  2. Copy the static files into a static assets folder, where they will be accessible alongside the main site. The static files will be downloaded at run time.
    For an example of how this can be done with webpack, check the demo.
    # from the root of the project, copy the files out
    cp -r ./node_modules/webscripten/dist/static <static folder>
    
  3. Import via require (or whatever import syntax your bundler supports)
    const webscripten = require('webscripten')
    
    ...
    
    webscripten.compile(code, 'static/')
    

Hint: if the static files do not load, open the Network tab of the browser's developer tools and check that the URL being requested is correct.
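As a rough aid for the hint above, the sketch below computes the URLs a browser would request for a given static path. It assumes the static path is resolved relative to the page URL and that the files are the ones listed under src/static in the Layout section; staticFileUrls is a hypothetical helper, not part of the webscripten API.

```javascript
// Assumption: these are the files served from the static folder
// (names taken from the Layout section of this README).
const STATIC_FILES = ['llc.wasm', 'lld.wasm', 'sysroot.tar']

// Hypothetical helper: resolve each static file against the page URL,
// the same way a browser resolves a relative fetch() path.
function staticFileUrls(staticPath, pageUrl) {
  const dir = staticPath.endsWith('/') ? staticPath : staticPath + '/'
  const base = new URL(dir, pageUrl)
  return STATIC_FILES.map((file) => new URL(file, base).href)
}

console.log(staticFileUrls('static/', 'https://example.com/app/index.html'))
```

Comparing this list with the requests shown in the Network tab should make a wrong static path easy to spot.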

Warning: the static files are large (~60MB) and can cause significant lag in the browser. They are downloaded only when required, which slows compilation down significantly. Please ensure a minimum of 1GB of available RAM. Several runs in a row may cause your browser tab to freeze or crash.
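One way to reduce repeated runs is to reuse earlier results for identical input. The following is a minimal sketch of that idea, not something webscripten provides: memoizeAsync is a hypothetical wrapper that caches the promise returned for each distinct first argument.

```javascript
// Hypothetical helper: cache the result of an async function by its first
// argument (here, the LLVM IR string) so the same input is processed once.
function memoizeAsync(fn) {
  const cache = new Map()
  return (key, ...rest) => {
    if (!cache.has(key)) {
      const pending = fn(key, ...rest).catch((err) => {
        cache.delete(key) // do not keep failed attempts
        throw err
      })
      cache.set(key, pending)
    }
    return cache.get(key)
  }
}

// Possible usage (assumes the package is installed):
// const compileOnce = memoizeAsync(require('webscripten').compile)
```

The cache lives only as long as the page does, so it trades some memory for fewer compiler runs.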

An example is provided in the demo folder.

API

compile(code: string, staticPath: string): Promise<string>

Returns a promise that resolves to the compiled LLVM IR (the object file) as a hex string, so that it can easily be passed around or stored, for example in localStorage. Object files are not executable until they are linked.
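If the raw bytes of the object file are needed, a hex string can be converted back and forth as sketched below. This assumes the string uses two hex digits per byte with no prefix or separators, which this README does not state explicitly; hexToBytes and bytesToHex are hypothetical helpers.

```javascript
// Assumption: two hex digits per byte, no "0x" prefix, no separators.
function hexToBytes(hex) {
  if (hex.length % 2 !== 0) {
    throw new Error('hex string must have an even length')
  }
  const bytes = new Uint8Array(hex.length / 2)
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16)
  }
  return bytes
}

function bytesToHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
}

// The first four bytes of any WebAssembly module are "\0asm".
console.log(bytesToHex(new Uint8Array([0x00, 0x61, 0x73, 0x6d])))
```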

link(obj: string, staticPath: string): Promise<string>

Returns a promise with the linked object file (the runnable WebAssembly module) as a hex string.

run(wasm: string, staticPath: string): Promise<string>

Returns a promise with the stdout from running the WebAssembly module.

compileLinkRun(code: string, staticPath: string): Promise<string>

Returns a promise with the stdout from compiling, linking and running the LLVM IR. This is a composition of the compile, link and run APIs described above.

Example:

const webscripten = require('webscripten')

const ir = `; ModuleID = 'hello_world.c'
source_filename = "hello_world.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [14 x i8] c"Hello, World!\\00", align 1
...
< cut for brevity >
`

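async function main() {
  // 'static/' is the folder the static files were copied to (see Installation)
  const obj = await webscripten.compile(ir, 'static/')
  const wasm = await webscripten.link(obj, 'static/')
  const out = await webscripten.run(wasm, 'static/')

  const out2 = await webscripten.compileLinkRun(ir, 'static/')
  // out and out2 are the same
}

main()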
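The composition described for compileLinkRun can be pictured roughly as follows. This is a sketch written against the documented signatures only, not the library's actual source; makeCompileLinkRun is a hypothetical factory that takes the three stages as parameters so the chaining can be tried with stand-in functions.

```javascript
// Hypothetical factory: chain compile -> link -> run, passing the same
// static path to every stage, as the documented signatures suggest.
function makeCompileLinkRun({ compile, link, run }) {
  return async function compileLinkRun(code, staticPath) {
    const obj = await compile(code, staticPath)
    const wasm = await link(obj, staticPath)
    return run(wasm, staticPath)
  }
}

// Possible usage (assumes the package is installed):
// const compileLinkRun = makeCompileLinkRun(require('webscripten'))
```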

Development

Layout

.
├── demo                // example of using the webscripten npm package
├── dist                // distribution for npm (gitignored)
├── node_modules
├── package.json
├── prettier.config.js
├── README.md
├── src                 // source code folder
│   ├── lib/            // Javascript library code for the final WebAssembly executable
│   ├── static/         // static files that need to be distributed too
│   │   ├── llc.wasm    // llc in WebAssembly
│   │   ├── lld.wasm    // lld in WebAssembly
│   │   ├── sysroot.tar // tar'ed headers and libraries
│   │   └── webscripten.d.ts
│   ├── llc.js          // loading code for llc.wasm
│   ├── lld.js          // loading code for lld.wasm
│   ├── main.js         // main entrypoint for webscripten
│   ├── run_llc.js
│   ├── run_lld.js
│   ├── run_wasm.js
│   ├── util.js
│   ├── browserBindings.js
│   ├── wasi.index.esm.js
│   └── wasmfs.index.esm.js
└── webpack.config.js   // creates distribution

Building

The build process is rather long and involves compiling the LLVM tools and generating the Javascript glue code.

Requirements

  • GNU/Linux. Ubuntu preferred. You can use an Ubuntu >18 EC2 instance.
  • Minimum 8GB RAM and 20GB disk space
  • The entire process takes 1-2 hours, probably more for troubleshooting.

Steps

  1. Install LLVM from source. Refer to the Getting Started section of the LLVM docs.
    • This step is non-trivial and will probably take an hour.
    • Other build tools will be installed and configured in this step, such as CMake, gcc, python, and Make.
  2. Install emscripten
  3. Run the following to compile LLVM, then Webscripten:
# from home directory
git clone https://github.com/llvm/llvm-project

# this is our llvm build folder (outside of the llvm-project tree)
mkdir build && cd build

# generate the build files
cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=WebAssembly -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi;lld" -DCMAKE_INSTALL_PREFIX=~/build ~/llvm-project/llvm 

# kick off the llvm build (~30 minutes)
cmake --build .

# new build folder for webscripten
mkdir ../webscripten_build && cd ../webscripten_build

# generate the build files (ensure your emsdk environment variables are set)
emcmake cmake -G "Ninja" -DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_ENABLE_EXPENSIVE_CHECKS=OFF -DLLVM_ENABLE_BACKTRACES=OFF -DLLVM_ENABLE_THREADS=OFF -DLLVM_BUILD_LLVM_DYLIB=OFF -DLLVM_INCLUDE_TESTS=OFF -DLLVM_INCLUDE_EXAMPLES=OFF -DCMAKE_CXX_FLAGS="-s EXPORT_ALL=1" -DCMAKE_INSTALL_PREFIX=$HOME/webscripten_build -DCMAKE_BUILD_TYPE=Release  -DLLVM_DEFAULT_TARGET_TRIPLE=wasm32-unknown-unknown -DLLVM_TARGETS_TO_BUILD="WebAssembly" -DCMAKE_CROSSCOMPILING=True -DLLVM_ENABLE_PROJECTS=lld -DLLVM_TABLEGEN=$HOME/build/bin/llvm-tblgen -DCLANG_TABLEGEN=$HOME/build/bin/clang-tblgen $HOME/llvm-project/llvm

# kick off the webscripten build (~30 minutes)
cmake --build .

Future work

1. Integration with llvm-sauce

This will be a relatively straightforward task, but there is a dependency on llvm-sauce being able to run standalone in the browser. Since Webscripten can already run standalone in the browser (refer to the demo page), eventually these two can be integrated into Source Academy.

2. Linking other libraries

Libraries can either be statically linked via LLD, or dynamically loaded by providing Javascript functions as WebAssembly imports.
In the demo:

  • display is statically linked via LLD
  • math_sin is dynamically loaded via WebAssembly import

Static Linking Using LLD

Static linking is done by fetching the library archive sysroot.tar located in the static folder, and copying its contents into the filesystem that lld.wasm accesses. Extra flags also have to be added to LLD's arguments to tell LLD where the libraries are located. The untarring code is taken from wasm-clang.
To add extra libraries, modify the code in run_lld.js to do the same for other .tar library files.
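To illustrate what unpacking a .tar archive in the browser involves, here is a minimal sketch of a tar reader. It is not the wasm-clang code used by run_lld.js; readTarEntries is a hypothetical function that handles only regular files and ignores the ustar name prefix field and checksums.

```javascript
const decoder = new TextDecoder()

// Read a NUL-terminated ASCII field from a tar header block.
function readField(block, start, length) {
  let end = start
  while (end < start + length && block[end] !== 0) end++
  return decoder.decode(block.subarray(start, end))
}

// Hypothetical helper: list the regular files in a tar archive held in memory.
// Each entry is a 512-byte header followed by the data, padded to 512 bytes.
function readTarEntries(bytes) {
  const entries = []
  let offset = 0
  while (offset + 512 <= bytes.length) {
    const header = bytes.subarray(offset, offset + 512)
    if (header[0] === 0) break // an empty header marks the end of the archive
    const name = readField(header, 0, 100)
    const size = parseInt(readField(header, 124, 12).trim(), 8) || 0
    const type = header[156] // ASCII '0' or NUL means a regular file
    const dataStart = offset + 512
    if (type === 0x30 || type === 0) {
      entries.push({ name, data: bytes.subarray(dataStart, dataStart + size) })
    }
    offset = dataStart + Math.ceil(size / 512) * 512
  }
  return entries
}
```

Each returned entry would then be written into the in-memory filesystem under its name; the filesystem API to use for that is not shown here.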

Dynamic loading via WebAssembly imports

Inside the file run_wasm.js, import the Javascript module via require and add the module to the env field of importObject before instantiating the WebAssembly module.

Example (math library):

const math = require('./lib/math.js')
... < other code >

    const importObject = {
      ...wasi.getImports(module),
      env: {
        ...math,
      },
    }

    let instance = await WebAssembly.instantiate(module, importObject)

3. Passing Higher Order Functions into library code

Library functions are implemented as Javascript imports. However, passing higher order functions between Javascript and WebAssembly is not trivial to implement. Consider compilation of the following Source code:

// Source
math_sin(0.5)
map((x) => x + 1, list(1, 2, 3))

// LLVM IR (declare: declares an *externally* defined function)
declare double @math_sin(double)  // OK
declare ? @map((? -> ?), ?)       // What is the type of map?

One solution is for map library code to accept a function pointer. A function pointer is compiled to an integer in WebAssembly, which is the index of the function in the program's function table. For more information on WebAssembly function tables, click here.

If we provide the definition of map inside the IR itself, then there would be no problem at all. However, if we use a Javascript implementation of map, the function would read its parameters as two numbers, which is not what we want.

Potential Solutions

The following are some possible solutions to the problems posed by passing around higher order functions.

Passing Arrays Between WebAssembly and Javascript

This article contains a section with an implementation of passing arrays between Javascript and WebAssembly by managing the memory of the WebAssembly instance from Javascript.
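The general idea can be sketched without any compiled module: Javascript writes numbers into the linear memory at some offset and hands the offset and length to WebAssembly, or reads them back the same way. The helpers below are hypothetical and are not taken from the article; a real implementation would also need an allocator to choose the offset.

```javascript
// Hypothetical helpers: copy f64 values in and out of a WebAssembly.Memory.
// byteOffset must be a multiple of 8 for a Float64Array view.
function writeF64Array(memory, byteOffset, values) {
  new Float64Array(memory.buffer, byteOffset, values.length).set(values)
  return { ptr: byteOffset, length: values.length }
}

function readF64Array(memory, ptr, length) {
  return Array.from(new Float64Array(memory.buffer, ptr, length))
}

// One page (64 KiB) of memory, as a module might import or export it.
const memory = new WebAssembly.Memory({ initial: 1 })
const { ptr, length } = writeF64Array(memory, 16, [1, 2, 3])
console.log(readF64Array(memory, ptr, length))
```

Views must be recreated after the memory grows, because growing replaces memory.buffer with a new ArrayBuffer.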

Passing and Adding Functions
  • We can use the table index that was passed to obtain the Exported WebAssembly Function from the function table. This, however, requires the table to be imported into the WebAssembly module first. Click here for more information.
  • It is also possible to convert a Javascript function into an Exported WebAssembly Function and add it to the table. Click here for more information.
