davidglassborow / fsharp-wasm

A brief introduction to WebAssembly in .NET and F#

This is my post in the annual F# Advent. I've decided to investigate something I've not worked with before, WebAssembly (WASM), and in particular how it works in the .NET and F# worlds.

The questions I've looked at are:

  1. What is WASM and why was it created ?
  2. Where can I use it ?
  3. How do you use it from C# ?
  4. How do you use it from F# ?
  5. How fast is it ?

1. What is WASM and why was it created ?

WebAssembly was created back in 2015 by a group including Mozilla, Microsoft, Apple and Google, as a way of having portable, high-performance apps written for a common 'assembly' target. I think of it like the CLR and JVM: a machine-portable target for running low-level code. It's the web's first binary (non-text) language to run in browsers.

It's designed so that multiple programming languages can be compiled into WASM:

web-compile

Image from https://arghya.xyz/articles/webassembly-wasm-wasi/

2. Where can I use it ?

Originally just in web browsers, but it's starting to be seen server side now.

Cloudflare supports running WASM on Cloudflare Workers. There are various projects to run WASM on Kubernetes.

A key part of this is the WebAssembly System Interface (WASI) initiative. When running in a browser, WASM has no access to the outside world: it's in a secure sandbox and can only call back to the JavaScript that is hosting it. WASI is an attempt to define operating-system-like APIs to standardise how server-side WASM can talk to files, the network, etc. WASM and WASI are in flux at the moment; it's early days, with none of the APIs set in stone yet.

overview

Image from https://files.speakerdeck.com/presentations/fbfddfe5eccb4700a3ae600b814a9ce9/slide_19.jpg

There are various WASM runtimes available already that implement WASI. The most popular seems to be Wasmtime, but there are others like Wasmer.

Steve Sanderson, the author of Blazor, has published an experimental WASI SDK for .NET Core as a NuGet package that allows both C# and F# to be written to run inside WASI containers, and gave an excellent talk at NDC Porto 2022 where he goes into more detail on WASI and .NET.

3. How do you use it from C# ?

It's built into .NET these days, allowing C# to run inside the browser on a .NET runtime compiled to WASM. It's just a case of using BlazorWebAssembly as the Project SDK:

<Project Sdk="Microsoft.NET.Sdk.BlazorWebAssembly">

blazor

The .NET build chain uses Mono, compiled via Emscripten, and can even add native dependencies from C or C++.

Once compiled, on startup the project uses WebAssemblyHost, along with its builders and interfaces, to bootstrap the ASP.NET pipeline with DI etc.

Built-in routing components provide the runtime WASM files for the browser to load as it starts up. You can see below the loading of the various DLLs to run in the browser; in release mode these are all bundled together.

network-loading

4. How do you use it from F# ?

The F# Blazor equivalents use the same underlying BlazorWebAssembly project SDK, and WebAssemblyHost. There are two projects I'm aware of in the space:

  • Bolero
  • Fun Blazor

I've only had a quick play with both. They both seem good, with various Computation Expressions, Elmish MVU mechanisms, hot-reloading, etc. that make it nice to write functional-style rich clients. Below are a couple of example code snippets I took from their homepages:

// Bolero
let loginForm =
  form {
    attr.id "login-form"
    input { attr.placeholder "First name" }
    input { attr.placeholder "Last name" }
    button {
      on.click (fun _ -> printfn "Welcome!")
      "Log In"
    }
  }

// Fun Blazor:
let entry =
    adaptiview () {
        let! count1, setCount1 = cval(1).WithSetter()
        div {
            h6 { $"Count1={count1}" }
            button {
                onclick (fun _ -> setCount1 (count1 + 1))
                "Increase count 1"
            }
        }
    }

5. How fast is it ?

One of the aims of WASM is to be fast, faster than JavaScript: it's compiled rather than interpreted.

To run some quick perf tests I've chosen an implementation of the sieve of Eratosthenes as a test of pure computation throughput (rather than IO etc). I found this crazy fast implementation on Stack Overflow.

No, I don't know how it works, but it is incredibly fast, it can work out the first 10 million primes in under 2 seconds in native code.

    // https://stackoverflow.com/a/17820204/131701
    let private primesAPF32() =
      let rec oddprimes() =
        let BUFSZ = 1<<<17 in let buf = Array.zeroCreate (BUFSZ>>>5) in let BUFRNG = uint32 BUFSZ<<<1
        let inline testbit i = (buf.[i >>> 5] &&& (1u <<< (i &&& 0x1F))) = 0u
        let inline cullbit i = let w = i >>> 5 in buf.[w] <- buf.[w] ||| (1u <<< (i &&& 0x1F))
        let inline cullp p s low = let rec cull' i = if i < BUFSZ then cullbit i; cull' (i + int p)
                                   cull' (if s >= low then int((s - low) >>> 1)
                                          else let r = ((low - s) >>> 1) % p in if r = 0u then 0 else int(p - r))
        let inline cullpg low = //cull composites from whole buffer page for efficiency
          let max = low + BUFRNG - 1u in let max = if max < low then uint32(-1) else max
          let sqrtlm = uint32(sqrt(float max)) in let sqrtlmndx = int((sqrtlm - 3u) >>> 1)
          if low <= 3u then for i = 0 to sqrtlmndx do if testbit i then let p = uint32(i + i + 3) in cullp p (p * p) 3u
          else baseprimes |> Seq.skipWhile (fun p -> //force side effect of culling to limit of buffer
              let s = p * p in if p > 0xFFFFu || s > max then false else cullp p s low; true) |> Seq.item 0 |> ignore
        let rec mkpi i low =
          if i >= BUFSZ then let nlow = low + BUFRNG in Array.fill buf 0 buf.Length 0u; cullpg nlow; mkpi 0 nlow
          else (if testbit i then i,low else mkpi (i + 1) low)
        cullpg 3u; Seq.unfold (fun (i,lw) -> //force cull the first buffer page then doit
            let ni,nlw = mkpi i lw in let p = nlw + (uint32 ni <<< 1)
            if p < lw then None else Some(p,(ni+1,nlw))) (0,3u)
      and baseprimes = oddprimes() |> Seq.cache
      seq { yield 2u; yield! oddprimes() }

    let calculatePrime nth =
        primesAPF32() |> Seq.item nth
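
For readers who want the basic idea before tackling the optimised version, here is a minimal sketch of a plain sieve of Eratosthenes. It is not the paged, bit-packed algorithm above, and it will be far slower; the name `simpleSieve` is made up for this illustration. It marks every multiple of each prime as composite and keeps whatever is left unmarked.

```fsharp
// A plain sieve of Eratosthenes, for illustration only.
// `simpleSieve` is a hypothetical helper, not part of the benchmark code.
let simpleSieve (limit: int) : int list =
    if limit < 2 then []
    else
        // isComposite.[n] becomes true once n is known to have a smaller prime factor
        let isComposite = Array.zeroCreate<bool> (limit + 1)
        let mutable p = 2
        // only primes up to sqrt(limit) need to be used for culling
        // (p * p would overflow for limits close to Int32.MaxValue; fine for a sketch)
        while p * p <= limit do
            if not isComposite.[p] then
                // start at p * p: smaller multiples were already marked by smaller primes
                let mutable m = p * p
                while m <= limit do
                    isComposite.[m] <- true
                    m <- m + p
            p <- p + 1
        [ for n in 2 .. limit do
            if not isComposite.[n] then yield n ]

printfn "%A" (simpleSieve 30)
// prints [2; 3; 5; 7; 11; 13; 17; 19; 23; 29]
```

The fast version follows the same principle, but works through the number line one buffer-sized page at a time, tracks odd numbers only, and packs 32 flags into each `uint32`.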

I played around, and found that the time to calculate the first 10 million primes seemed to give indicative results. Fewer than 10 million was too quick, and getting up into the half-billion range started to take a minute or so.

I compared Blazor with Bolero and Fun Blazor, but they all gave the same sort of performance, which makes sense given they are all running on the same underlying code.

I quickly noticed that the browser made a huge difference to performance, so I recorded different times for Safari (v16.1) and Chrome (v108). I also added native code and JavaScript (compiled via Fable) to compare to the WASM code. Times are in seconds:
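
A timing of this kind could be taken with something like the sketch below. This is an assumption about the approach, not the harness actually used for the numbers in this post: `timeIt` and `countPrimesBelow` are hypothetical names, and the trial-division workload is only a stand-in so the snippet runs by itself.

```fsharp
open System.Diagnostics

// `timeIt` is a hypothetical helper: it runs `work` once and reports wall-clock seconds.
let timeIt (label: string) (work: unit -> 'a) : 'a * float =
    let sw = Stopwatch.StartNew()
    let result = work ()
    sw.Stop()
    let seconds = sw.Elapsed.TotalSeconds
    printfn "%s: %.3f s" label seconds
    result, seconds

// Stand-in workload: count the primes below n by trial division.
let countPrimesBelow (n: int) : int =
    let isPrime k =
        k >= 2 && Seq.forall (fun d -> k % d <> 0) (seq { 2 .. int (sqrt (float k)) })
    Seq.init n id |> Seq.filter isPrime |> Seq.length

let count, elapsed = timeIt "primes below 100000" (fun () -> countPrimesBelow 100000)
printfn "found %d primes" count
```

With the sieve from this post, the call would instead look like `timeIt "10 millionth prime" (fun () -> calculatePrime 10000000)`. A single run like this includes JIT warm-up, which is one reason short tests can be misleading.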

results

Caveats for these tests:

  • They were done quickly, not rigorously.
  • I didn't always run the tests in release mode.
  • All tests were run on my 2015 iMac, a quad-core 4GHz i7.

I was very surprised by the differences between Safari and Chrome, with Safari running WASM 2.5x faster than Chrome, but Chrome running JavaScript 3x faster than Safari. I did compare .NET 6 and .NET 7, but they were too close to tell whether any difference was more than noise.

The fastest code in the end was Chrome running JavaScript via Fable, which is shocking, but I guess larger tests might help native .NET, where the JIT gets time to kick in.

I also tried running the tests compiled with the experimental WASI NuGet package mentioned above, and run via Wasmtime. I was able to get it working with F# and running general code, but found it crashed on the primes problem, and it generally seemed to have problems with recursive functions in F#.

My overall takeaway from all of this is just how good Fable is, and how fast it is in the browsers. Whereas WASM helps C# run in the browser, we in the F# community are very lucky to have the other option of running transpiled JavaScript directly.

Happy Xmas to all, Cheers, Dave

Update AOT

Following a suggestion from Paul Biggar (of DarkLang fame), I've updated the Blazor WASM test to include AOT. I've also explicitly shown the difference between Blazor in (non-AOT) debug vs release mode builds.

You can see Release builds help a fair bit, and AOT a little bit more, but Fable is still king by a mile 😎

perf-aot

Further Notes

docker-wasm

December 2023 - dotnet 8 update

I've re-run the tests using dotnet 8.0.100, on the same hardware as last time, with the latest versions of Safari, Chrome and Firefox (for my colleague Miles). Quick summary: everything got faster, WASM got a lot faster, and Safari now has the JavaScript crown:

  • WASM on Chrome is 3x faster than a year ago.
  • Safari is 3.5x faster at native JavaScript than a year ago.
  • WASI is now supported by .NET 8, so I've added it to the test, running on the Wasmtime runtime. See the Microsoft blog about WASM.
  • I finally ran native F# in release mode, and it smoked everything (AOT didn't make any difference). I don't have .NET 7 on my Mac anymore, so I don't have the figures for native .NET 7 in release mode.

perf-8

Appendix

I've run out of time but other things I'd like to investigate further when I get a chance:

  • Running F# in WASM on Cloudflare

Misc references:

Code:
