maxmind / libmaxminddb

C library for the MaxMind DB file format

Home Page: https://maxmind.github.io/libmaxminddb/

Support memory open mode (was: More DB open methods possible?)

ezelkow1 opened this issue

It would be good if there were other ways to open a database besides mmap'ing a file. We have integrated this library into Apache Traffic Server (and I know other projects use it), but since the only way to load a db file is via mmap, automated tooling can't update databases on disk without synchronizing with the rest of the system. That forces downtime for applications that depend on mmdb, since they must be stopped and made to close their dbs before the file is touched; otherwise the update may cause a crash because of the mmap'd file.

If there were a way to load via a buffer+size, or any other method that doesn't directly reference the on-disk file in the library, that could help alleviate this.

You can update the database on disk without synchronizing the rest of the system. The file just needs to be replaced atomically. This is what geoipupdate does, for instance.
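As a rough illustration of what "replaced atomically" usually means on a POSIX system, here is a minimal sketch: write the new contents to a temporary file in the same directory, flush it, then rename() it over the target. The helper name replace_file_atomically is hypothetical and is not part of libmaxminddb or geoipupdate; it only shows the general pattern.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not a libmaxminddb API): replace `path` with `len`
 * bytes from `data` so that readers see either the old or the new file,
 * never a partially written one. Returns 0 on success, -1 on failure. */
int replace_file_atomically(const char *path, const void *data, size_t len) {
    char tmp[4096];
    /* The temp file must live on the same filesystem as the target,
     * otherwise rename() cannot be atomic. */
    int n = snprintf(tmp, sizeof tmp, "%s.tmp.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;

    int fd = mkstemp(tmp); /* creates the file with mode 0600 */
    if (fd < 0) return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) { close(fd); unlink(tmp); return -1; }
        p += w;
        left -= (size_t)w;
    }

    /* Flush the data to disk before the new name becomes visible. */
    if (fsync(fd) != 0) { close(fd); unlink(tmp); return -1; }
    if (close(fd) != 0) { unlink(tmp); return -1; }

    /* rename() swaps the directory entry in one step; processes that
     * already have the old file open or mmap'd keep the old inode. */
    if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
    return 0;
}
```

Writing directly into the existing file (open with O_TRUNC, then write) is the pattern to avoid, since a reader with the file mmap'd would see it shrink and change underneath it.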

I suppose, but then you have to make sure all of your automated tooling actually does this correctly. Whereas if the C version supported the same MEMORY mode that the Java version of the library does, all of this could be avoided by letting the program using the library handle the file/memory management.

Just seems like it would open up the library to more flexible implementations

Thanks. I'll leave this issue open as a feature request. There was some work on a shared memory mode 8+ years ago in the https://github.com/maxmind/libmaxminddb/tree/dave/shared-memory branch. There didn't seem to be that much interest in it at the time though, and so work was abandoned.

Note that even if the reader is loading the database into memory, it is strongly recommended that the file be replaced atomically unless you know that there is no chance that the database could be reopened by the application while it is being updated.

Is there more to it than skipping the mmap and setting the pointers in the MMDB_s structure?
MMDB_open doesn't seem to perform any initialization besides copying the filename and mmap'ing, and then the lookup functions work off file_content.

I haven't examined the code, but assuming we are just being passed in a buffer and size, I don't suspect that many changes would be necessary.

> You can update the database on disk without synchronizing the rest of the system. The file just needs to be replaced atomically. This is what geoipupdate does, for instance.

Hi @oschwald - I was trying to achieve this behavior, but it seems mmdb->file_content doesn't get updated. mmdb->file_size would also keep its old value, so reads of the new file would probably be corrupted. Am I mistaken here?
I tried to update my db via https://github.com/google/renameio/ so it should be atomic.

If you are updating the file atomically, the reader will continue to have the old file mmap'd. You would need to reopen the database to get the new file.
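To make that behavior concrete, here is a small sketch using plain POSIX calls rather than libmaxminddb itself. The mapped_file struct and the map_file, unmap_file, and write_and_rename helpers are hypothetical stand-ins; mapped_file only mirrors the idea of the file_content and file_size fields mentioned above. A mapping taken before the rename keeps the old bytes and old size, and only a fresh open plus mmap sees the new file.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for a reader's view of the database file. */
typedef struct {
    const char *content;
    size_t size;
} mapped_file;

/* Map `path` read-only. Returns 0 on success, -1 on failure. */
int map_file(const char *path, mapped_file *out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return -1; }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;
    out->content = p;
    out->size = (size_t)st.st_size;
    return 0;
}

void unmap_file(mapped_file *m) {
    munmap((void *)m->content, m->size);
    m->content = NULL;
    m->size = 0;
}

/* Write `data` to "<path>.new", then rename it over `path`. */
int write_and_rename(const char *path, const char *data) {
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    size_t len = strlen(data);
    if (write(fd, data, len) != (ssize_t)len) { close(fd); unlink(tmp); return -1; }
    if (close(fd) != 0) { unlink(tmp); return -1; }
    if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
    return 0;
}
```

With libmaxminddb, the equivalent of "mapping again" would be calling MMDB_open on the path to get a new MMDB_s and then calling MMDB_close on the old one once no lookups are still using it. How and when to swap the two handles is up to the application.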

If you have follow-up questions, it might be best to create a new issue. This is pretty far removed from a memory-open mode.