zotero / citeproc-rs

CSL processor in Rust.

Home Page:https://cormacrelf.github.io/citeproc-wasm-demo/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Performance vs. citeproc-js

dstillman opened this issue · comments

https://forums.zotero.org/discussion/93803/performance-hit-with-citeproc-rs

When I was testing citeproc-rs in Zotero, performance seemed like a mixed bag: CSL Preview (which generates output in all installed citation styles, and maybe involves an engine instantiation for each one) was dramatically faster than citeproc-js — about 2x — but generating a large bibliography was about the same or a little slower. adamsmith is reporting it taking over twice as long to generate 250 items in APA 7 as HTML and text:

citeproc-js:

(3)(+0001972): Copied bibliography to clipboard in 6156 ms}

citeproc-rs:

(3)(+0000018): Copied bibliography to clipboard in 13198 ms}

In terms of the debug output:

(5)(+167588974): CiteprocRs: new Driver

(3)(+0000013): CiteprocRstoJs: updateItems [forwarding to updateUncitedItems()]

(5)(+0000008): CiteprocRs: insertReference {"id":"33938","type":"article-journal", ,"container-title":"Bib(La)TeX Journal" […]

[lots of those]

(5)(+0000000): CiteprocRs: bibliographyMeta

(5)(+0000001): CiteprocRs: makeBibliography

(5)(+0000001): CiteprocRs: makeBibliography

(3)(+0005118): DATE: algorithms failed sanity check

[…]

(5)(+0000016): CiteprocRs: free Driver

(5)(+0000014): CiteprocRs: new Driver

(3)(+0000004): CiteprocRstoJs: updateItems [forwarding to updateUncitedItems()]

(5)(+0000000): CiteprocRs: insertReference {"id":"33938","type":"article-journal", ,"container-title":"Bib(La)TeX Journal" […]

[all the same entries again]

(5)(+0000000): CiteprocRs: bibliographyMeta

(5)(+0000000): CiteprocRs: makeBibliography

(5)(+0007589): CiteprocRs: free Driver

(3)(+0000018): Copied bibliography to clipboard in 13198 ms}

So the first makeBibliography from -rs takes 5 seconds. The second takes 7.5. -js takes 6 seconds total.

@cormacrelf, how does this compare to the performance you were seeing natively? How much of this do we think this is due to Firefox 60 and/or wasm, as opposed to differences in optimizations between the two engines?

Overall I have not been expecting WASM performance to be amazing. Natively, my M1 Mac can do 250 references with APA 7 from my own library in 238ms 193ms. Roughly double that, add overhead and you get ~800ms for the makeBibliography benchmark of the same size. On the same machine, JS manages the same set of 250 in 3592ms.

$ hyperfine '/Users/cormac/git/citeproc-rs/target/release/bench --style apa --library 250.json'
Benchmark #1: /Users/cormac/git/citeproc-rs/target/release/bench --style apa --library 250.json
  Time (mean ± σ):     193.4 ms ±   4.4 ms    [User: 178.6 ms, System: 12.3 ms]
  Range (min … max):   190.1 ms … 208.1 ms    14 runs

Edit: forgot to measure citeproc-rs in Zotero/FF60; it manages 1513ms total on my machine. There is approximately 400ms of overhead, I think (total without logging - the two makeBibliography calls), plus 200ms extra for the logging in rs mode. There is only 10ms total of overhead in the figures on native.

Code Task Time - overhead (ms) / baseline
citeproc-js in Zotero/FF60 html + plain 3192 8.6
@citeproc-rs/wasm in Zotero/FF60 html + plain 934 2.5
citeproc-rs native html + plain 373 1.0

That leads me to think there's either something odd about the specific 250 references in use for Sebastian's benchmark, or the machine. It could literally be that M1 is too good at WASM (this is running under Rosetta). I'll have to try this on another computer too.

Given the perf of native, it would be nice, in theory, if we could bundle a native binary and communicate via stdio with a subprocess. That's not something I think you'll get in repackaged Firefox for a while (maybe not even ESR 78), but it is available via the WebExtension APIs if you can get those going in a newer ESR. Not sure where you're at thinking about Electron or its ilk, but subprocess APIs are standard there. But a new stdio bridge would be a fair bit of effort.

I'll do a benchmark of some newer WASM engines than ESR 60 as well and post them here. That will be more interesting, because as you suggest, FF 60 is not exactly the gold standard of WASM performance but more of an MVP, and any upgrade you and the team can manage will hopefully come with perf improvements without extra work from my end.

So we've been rebuilding Zotero on Firefox 102 ESR, and we do actually have a subprocess call with stdout support now. Would you still expect a native binary to be substantially faster than wasm in a modern browser?

(It's also possible that some of the performance issues noticed here were due to #151, and it's also possible that, even if a native binary is a lot faster, in practice it won't matter once any remaining pathological cases are addressed.)

This should be re-tested in a Zotero 7 dev build, though it would have to be with a pre-1.0.2 style, since those throw errors (at least with the citeproc-rs build we have in Zotero).