Run llama2.c in the Browser

✨✨ Demo ✨✨

(Demo recording: llama2.c output running in the browser)

Minimal repo that builds llama2.c to WebAssembly and runs it in the Browser

  • No build system. Just a bash script.
  • No Emscripten.
  • llama2.c repo is a submodule and code builds without modification.
  • Repo is stripped down to the bare minimum. Every line of code has a purpose.
  • It runs llama2.c in a Web Worker so it doesn't block the main thread (see the sketch after this list).
  • On my M1 MacBook Air, native inference runs at ~100 tokens/s and the Browser at ~80 tokens/s. That's about 20% overhead, but I haven't spent any time optimizing yet.
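
A minimal sketch of that main-thread / worker split is below. The file names, message shapes, and the runLlama2 helper are illustrative placeholders, not the actual code in this repo.

```js
// --- main.js (UI thread) -----------------------------------------------------
const worker = new Worker("worker.js");

worker.onmessage = (event) => {
  // Append each token to the page as the worker streams it back.
  if (event.data.type === "token") {
    document.getElementById("output").textContent += event.data.token;
  }
};

// Kick off generation; the heavy work happens in the worker, so the UI stays responsive.
worker.postMessage({ type: "generate", prompt: "Once upon a time" });

// --- worker.js ---------------------------------------------------------------
self.onmessage = async (event) => {
  if (event.data.type === "generate") {
    // runLlama2 is a placeholder for whatever drives the compiled llama2.c
    // WASM module; the point is that it runs here, off the main thread.
    await runLlama2(event.data.prompt, (token) => {
      self.postMessage({ type: "token", token });
    });
  }
};
```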

This repo is Mac-only for now due to the compiler/linker binaries. It could easily be adapted to other systems, but I don't have the bandwidth at the moment.

This builds on a minimal C++/C + libc to WASM template that I recommend looking at if you want to learn how the bare WASM stack works without the additional complexity. Additional considerations for building llama2.c:

  • mmap is not part of the WASI standard, but the wasi-sdk ships an emulation: pass the -D_WASI_EMULATED_MMAN flag to the compiler and link against the wasi-emulated-mman library. For this we also need to link against clang_rt.builtins-wasm32, which the wasi-sdk distributes separately.

Usage

The command below downloads a wasi-sdk release bundle that contains the WASI headers, libraries, compiler (clang) and linker (wasm-ld). It also downloads a model locally for dev and testing purposes.

./setup.sh

Compiles and links C++ code to WASM

./build.sh

Starts a local web server so you can run the code. Open http://localhost:8080 in your browser.

./run.sh

Deploy example

Deploys the demo to gh-pages

./deploy.sh

Prior work

https://github.com/michaelfranzl/clang-wasm-browser-starterpack/tree/dev/examples/11

https://medium.com/@michaelyuan_88928/running-llama2-c-in-wasmedge-15291795c470

https://stackoverflow.com/a/29694977/717508

https://github.com/taybenlor/runno

Notes

/vendor/wasi.js is built from the runno WASI JS runtime by running npm run build. There are two small modifications on 9b9dc1f3142c that I might submit upstream (sketched below):

  1. The ability to pass a WebAssembly.Memory object to the runtime.
  2. The ability to pass an object with JS-defined functions that can be invoked from native code.
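
Conceptually, the two changes let the caller do something like the sketch below. This is an illustration of the idea only, not runno's actual API; runWASI, memory, and hostFunctions are made-up names.

```js
// A caller-owned memory, handed to the WASI runtime instead of letting the
// runtime allocate its own.
const memory = new WebAssembly.Memory({ initial: 256, maximum: 4096 });

// JS-defined functions the native side can invoke, e.g. to stream each
// generated token out of the running program.
const hostFunctions = {
  on_token: (ptr, len) => {
    const bytes = new Uint8Array(memory.buffer, ptr, len);
    console.log(new TextDecoder().decode(bytes));
  },
};

// Hypothetical entry point: the two modifications amount to accepting
// `memory` and `hostFunctions` and merging them into the import object
// passed to WebAssembly.instantiate.
await runWASI("llama2.wasm", { memory, imports: hostFunctions });
```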

I included a wasi.js.original file as a reference for the differences (git diff wasi.js wasi.js.original).

https://github.com/taybenlor/runno/commit/9b9dc1f3142c

License

MIT License

