meilisearch / heed

A fully typed LMDB wrapper with minimum overhead 🐦

Home Page: https://docs.rs/heed

Web support

GregoryConrad opened this issue

CC @Kerollmops, as we talked briefly about LMDB alternatives over on the Meili discord.

This issue stems from the fact that LMDB won't compile to WASM/WASI because of its complicated concurrency model (and its reliance on some other libc features). It is highly unlikely that LMDB will compile for the web for many years to come.

To circumvent this issue, I was looking to provide WASM/WASI support via a shim in heed to a different (Rust) library, like sanakirja or redb. However, both rely on libc's mmap under the hood, which is poorly emulated in WASI (i.e., read-only). Further, file locking is not yet implemented in WASI libc, which is also problematic.

Thus, the only option left for web support, as far as I can see, is to use a web-native technology directly. For this, I propose using IndexedDB through the web-sys crate, which plays well with wasm-bindgen. wasm-bindgen seems to be the go-to way to write Rust that interacts with JS and the web. (There is also emscripten, but that does not seem like the "correct" way to write new code, only to port over existing code.)

I can look a little more into it, but I am thinking the appropriate path forward would be to make a new crate, alongside lmdb-master-sys and the other heed crates, that provides a shim mimicking the LMDB FFI bindings so that heed's preexisting code around LMDB can be reused. For the implementation, the shim would call web-sys's IndexedDB API. Does this sound appropriate? Would a PR for this be accepted?
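To make the shim idea a bit more concrete, here is a minimal sketch of what a swappable backend interface could look like. Everything in it is an assumption for illustration: `KvBackend` and `MemoryBackend` are hypothetical names, heed exposes no such trait, and the in-memory `BTreeMap` merely stands in for a real store such as IndexedDB or redb. Transactions, which any real LMDB-compatible shim would need, are left out.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical backend interface (not part of heed): the few ordered
/// key-value operations a shim would have to provide at minimum.
pub trait KvBackend {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<&[u8]>;
    /// Returns true if the key existed and was removed.
    fn delete(&mut self, key: &[u8]) -> bool;
    /// First entry whose key is >= `key`, similar in spirit to
    /// LMDB's MDB_SET_RANGE cursor operation.
    fn seek(&self, key: &[u8]) -> Option<(&[u8], &[u8])>;
}

/// In-memory stand-in for a real store. A BTreeMap keeps keys in
/// lexicographic byte order.
#[derive(Default)]
pub struct MemoryBackend {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvBackend for MemoryBackend {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.entries.get(key).map(Vec::as_slice)
    }

    fn delete(&mut self, key: &[u8]) -> bool {
        self.entries.remove(key).is_some()
    }

    fn seek(&self, key: &[u8]) -> Option<(&[u8], &[u8])> {
        // Range from `key` (inclusive) to the end of the map.
        let bounds: (Bound<&[u8]>, Bound<&[u8]>) = (Bound::Included(key), Bound::Unbounded);
        self.entries
            .range::<[u8], _>(bounds)
            .next()
            .map(|(k, v)| (k.as_slice(), v.as_slice()))
    }
}
```

With an interface along these lines, the typed layer on top would only talk to the trait, and the LMDB, redb, or IndexedDB specifics would live in separate implementations.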

This might actually work better as a thin shim over SQLite Wasm backed by the origin private file system (OPFS): https://developer.chrome.com/blog/sqlite-wasm-in-the-browser-backed-by-the-origin-private-file-system/
That will certainly be more performant than IndexedDB, based on most browsers' implementations.

This would require entirely reimplementing LMDB but using OPFS. I do not think I would be able to do that correctly, or at least in any reasonable amount of time.

Since redb is looking like it'll support WASI, I am actually just going to try to add an alternative/experimental backend to heed that uses a redb shim instead of LMDB. That will also solve an issue I'm having on macOS, where LMDB's concurrency model extravaganza violates the Mac App Sandbox, which is required for submission to the Mac App Store.

Hi @GregoryConrad, could you share a bit more about how LMDB's concurrency model extravaganza violates the Mac App Sandbox? I'm also integrating LMDB into my app, and I'm having a problem where the app runs fine locally but crashes when released to TestFlight. I'm not sure whether this can be worked around in any way. Thank you!

@thomasdao Oh, yea that is really hard to do 😅

For macOS, you need to enable POSIX semaphores (a heed option). The issue is that for POSIX semaphores to work under the App Sandbox (which is required for submission to the App Store), the semaphore names must follow a certain format that is really frustrating/limiting. I can't remember which Apple developer link I looked at before, but this might point you in the right direction: https://developer.apple.com/forums/thread/674207
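To illustrate why that name format is limiting, here is a minimal sketch under two assumptions that are not stated in this thread and should be checked against Apple's documentation: that a sandboxed app must name POSIX semaphores as `<app group ID>/<name>`, and that macOS caps semaphore names at 31 bytes. The function `sandboxed_sem_name` and the IDs used with it are hypothetical.

```rust
/// Assumed macOS limit on POSIX semaphore name length, in bytes.
/// Verify against your SDK headers before relying on it.
const MAX_SEM_NAME_LEN: usize = 31;

#[derive(Debug, PartialEq)]
pub enum SemNameError {
    /// The group ID or the suffix was empty.
    EmptyPart,
    /// The combined name exceeds the assumed length limit.
    TooLong { len: usize },
}

/// Builds a semaphore name in the assumed sandbox format
/// `<app group ID>/<suffix>` and rejects names that are too long.
pub fn sandboxed_sem_name(group_id: &str, suffix: &str) -> Result<String, SemNameError> {
    if group_id.is_empty() || suffix.is_empty() {
        return Err(SemNameError::EmptyPart);
    }
    let name = format!("{group_id}/{suffix}");
    if name.len() > MAX_SEM_NAME_LEN {
        return Err(SemNameError::TooLong { len: name.len() });
    }
    Ok(name)
}
```

Under these assumptions, a short group ID such as `ABCDE12345.kv` leaves room for a suffix like `lock`, while a more typical reverse-DNS group ID such as `ABCDE12345.com.example.myapp` already uses 28 of the 31 bytes, so almost any suffix pushes the name over the limit.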

If you really want to use LMDB, you're going to have to fork heed to be able to compile LMDB with your own custom POSIX semaphore prefix. I never went through with this since it sounded like too much of a nightmare/PITA. If you can, I'd recommend a fully Rust-based solution (that also doesn't use mmap, at least if you ever want to target iOS; see footnote 1). The only viable solution to all of these problems that I'm aware of is redb, which, after having to deal with LMDB as a C dependency, I'd highly recommend over LMDB (at least for Rust applications).

Edit: now that I think about it, there are probably some LSM tree implementations in Rust that you could try out too instead of redb. Not sure how mature they are though.

Footnotes

  1. This is a whole different issue that I'll leave out of this comment for brevity. Let me know if you want the full mmap-on-iOS rundown, but TL;DR: if there is even a slim chance of you targeting iOS, don't use anything that uses mmap.

@GregoryConrad I'm super grateful for your answer, thank you!