google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library

Home Page: https://flatbuffers.dev/

HashTable in Flatbuffer?

linchuan4028 opened this issue

We are going to use FlatBuffers as the data format for our search index (not for RPC message passing).
Since FlatBuffers only supports arrays (vectors), getting hash-table lookups means reading the records one by one and inserting them into a C++ or Go hash table.
I am aware of the key attribute; however, it is only supported in the C++ SDK, and binary search can't meet our performance requirements.
So my questions are:

  1. Do you think FlatBuffers is suitable for the search index scenario?
  2. Is there a plan to support hash tables in it?
  3. Is there a plan to support the key attribute in the Go SDK?

We don't have plans to support hash tables within FlatBuffers itself. You're welcome to open an issue and start a design, but it will take a lot of effort to get it approved and then implemented everywhere, since all the maintainers are part-time. There are surprisingly many design choices around hash tables.

That said, I think this is a fairly popular topic and we would invest in it if there's enough support.

You should be able to emulate a hash table on top of FlatBuffers. We can try to design that for your use case here so it can act as an example for future users. In the best case, it could become the basis of a FlatBuffers-wide design.

I'd recommend:

  • Use Robin Hood hashing
    • this can lead to a good fill rate (Rust uses ~90%)
    • you need to cache the hash value for performance
  • Construction requires mutation, so it should be done in native data structures before serialization
  • Users must provide functions for hashing and equality

table MyElement {
  // For use in an rh_hash_table: cache the hash value to avoid recomputing it during linear probing.
  // If this field equals 0 (i.e. not present), the hash value must be recomputed.
  hash: uint64 (hash);
  // Other properties
}

table MyTable {
  // rh_hash_table is a proposed attribute, not one FlatBuffers supports today.
  hash_table: [MyElement] (rh_hash_table);
}
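
To make the proposal above more concrete, here is a minimal Go sketch of Robin Hood hashing over a flat slice, which is roughly the layout a [MyElement] vector would have after serialization. Everything in it is an assumption for illustration: the Element struct, the Build and Lookup names, the FNV-1a hash, and the choice to reserve hash 0 for empty slots are hypothetical and not part of FlatBuffers. In real use, Build would run on native data before serialization and Lookup would read slots through the generated accessors.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Element stands in for MyElement: a cached hash plus payload.
// Assumption: Hash == 0 marks an empty slot.
type Element struct {
	Hash  uint64
	Key   string
	Value int
}

// hashKey is the user-supplied hash function; 0 is reserved, so remap it.
func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	if v := h.Sum64(); v != 0 {
		return v
	}
	return 1
}

// probeDistance is how far slot idx lies from the home slot of hash h.
func probeDistance(h uint64, idx, n int) int {
	home := int(h % uint64(n))
	return (idx - home + n) % n
}

// Build lays out the entries at roughly a 90% fill rate. On a collision,
// the element that is closer to its home slot gives up its place.
func Build(entries map[string]int) []Element {
	n := len(entries)*10/9 + 1 // always leaves at least one empty slot
	slots := make([]Element, n)
	for k, v := range entries {
		cur := Element{Hash: hashKey(k), Key: k, Value: v}
		idx := int(cur.Hash % uint64(n))
		for dist := 0; ; dist++ {
			if slots[idx].Hash == 0 {
				slots[idx] = cur
				break
			}
			if d := probeDistance(slots[idx].Hash, idx, n); d < dist {
				slots[idx], cur = cur, slots[idx] // displace the "richer" element
				dist = d
			}
			idx = (idx + 1) % n
		}
	}
	return slots
}

// Lookup probes linearly and stops early once it reaches an empty slot or
// an element closer to its home than the current probe distance.
func Lookup(slots []Element, key string) (int, bool) {
	n := len(slots)
	if n == 0 {
		return 0, false
	}
	h := hashKey(key)
	idx := int(h % uint64(n))
	for dist := 0; dist < n; dist++ {
		e := slots[idx]
		if e.Hash == 0 || probeDistance(e.Hash, idx, n) < dist {
			return 0, false
		}
		if e.Hash == h && e.Key == key { // cached hash check before key equality
			return e.Value, true
		}
		idx = (idx + 1) % n
	}
	return 0, false
}

func main() {
	slots := Build(map[string]int{"apple": 1, "banana": 2, "cherry": 3})
	v, ok := Lookup(slots, "banana")
	fmt.Println(v, ok) // 2 true
	_, ok = Lookup(slots, "durian")
	fmt.Println(ok) // false
}
```

The cached hash is what keeps probing cheap: both the early-exit test and the first equality check work on the stored uint64, so the key itself is compared only when the hashes match.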

Thanks for your quick feedback. What about the third question:
will the key attribute (binary search) be supported in the Go SDK?

No current plans for binary search support in Go but feel free to contribute a PR!
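
In the meantime, a hand-written binary search over a key-sorted vector can be sketched with the standard library's sort.Search. This is only an illustration under assumptions: FindByKey and the keyAt callback are hypothetical names, the vector is assumed to have been written sorted ascending by key, and in real code keyAt would wrap the generated Go accessors for the vector.

```go
package main

import (
	"fmt"
	"sort"
)

// FindByKey returns the index of key in a vector of n elements that is
// sorted ascending by key. keyAt(i) returns the key of element i.
func FindByKey(n int, keyAt func(i int) string, key string) (int, bool) {
	// sort.Search finds the smallest i for which keyAt(i) >= key.
	i := sort.Search(n, func(i int) bool { return keyAt(i) >= key })
	if i < n && keyAt(i) == key {
		return i, true
	}
	return -1, false
}

func main() {
	// Stand-in for a FlatBuffers vector sorted by its key field.
	keys := []string{"apple", "banana", "cherry"}
	idx, ok := FindByKey(len(keys), func(i int) string { return keys[i] }, "banana")
	fmt.Println(idx, ok) // 1 true
}
```

The comparison inside the callback has to match the order used when the vector was sorted; Go compares strings byte by byte, so the writer should sort the same way.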