How does it compare to Flatbuf in benchmarks?

Question

haydenflinner opened this issue a year ago · comments

Hi, first I have to say, excellent library, this space is really ripe for improvement and this looks like the friendliest high performance option yet. The tempo API on top is even cooler and I look forward to seeing it grow!

On the growth note, I found it a little off-putting that Flatbuf is mentioned as an alternative, and the repo is tagged "zero-copy", but no performance claims vs Flatbuf are made. Based on my read of the wire format page, it seems that bebop's format is kind of like Protobuf minus the var-int compression and minus the types-inlined-into-key-bits. It thus seems impossible to be zero-copy (parsing code here); one must read all of the values at this "message level" (worst-case) before finding the value for the key we're interested in. See DuplicateMessageField errors if the same key is present twice during deserialization. This is in contrast to CapnProto, Flatbuf, SBE, where if one is only interested in a single field from a large message, that is all that's parsed, skipping an explicit deserialization step. Encoding could plausibly be made zero-copy in a hotpath, so maybe that's where the claim comes from.
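To make the distinction concrete, here is a minimal sketch of the two access patterns described above. Both layouts are hypothetical toys invented for illustration; neither is the actual Bebop or FlatBuffers wire format, and every function name is an assumption. The first layout stores fields as a tagged sequence, so a reader has to walk past earlier fields to reach a later one. The second puts an offset table up front, so a reader can jump straight to one field.

```python
import struct

# Hypothetical toy layouts for illustration only.
# These are NOT the real Bebop or FlatBuffers wire formats.

def encode_tagged(fields):
    """Record-style: repeated (1-byte index, 4-byte length, payload), ended by a 0 index."""
    out = bytearray()
    for index, payload in fields.items():
        out += struct.pack("<BI", index, len(payload)) + payload
    out.append(0)  # end-of-message marker
    return bytes(out)

def read_tagged(buf, wanted):
    """Walk the fields in order until `wanted` is found.

    Returns (payload, number_of_field_headers_visited).
    """
    pos = 0
    visited = 0
    while buf[pos] != 0:
        index, length = struct.unpack_from("<BI", buf, pos)
        pos += 5  # size of the (index, length) header
        visited += 1
        if index == wanted:
            return bytes(buf[pos:pos + length]), visited
        pos += length  # skip a field we are not interested in
    return None, visited

def encode_offset_table(fields, slots):
    """Table-style: `slots` 4-byte offsets first (0 means absent), then (length, payload) entries."""
    header_size = 4 * slots
    offsets = [0] * slots
    body = bytearray()
    for index, payload in fields.items():
        offsets[index - 1] = header_size + len(body)
        body += struct.pack("<I", len(payload)) + payload
    return struct.pack("<%dI" % slots, *offsets) + bytes(body)

def read_offset_table(buf, wanted):
    """Read one offset, then jump directly to the field; no other field is touched."""
    (offset,) = struct.unpack_from("<I", buf, 4 * (wanted - 1))
    if offset == 0:
        return None
    (length,) = struct.unpack_from("<I", buf, offset)
    return bytes(buf[offset + 4:offset + 4 + length])

fields = {1: b"a" * 1000, 2: b"b" * 1000, 3: b"hello"}
tagged = encode_tagged(fields)
table = encode_offset_table(fields, slots=4)
```

In this toy, `read_tagged(tagged, 3)` has to visit the headers of fields 1 and 2 before it reaches field 3, while `read_offset_table(table, 3)` reads a single offset and jumps. The tagged layout here can at least skip payloads by length; a format that does not length-prefix every value would have to decode the earlier fields to find where they end.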

I'd expect performance to be around the same order of magnitude as Flatbuf if reading all of the declared fields or in typical usage of RPC messages, and varying in cases which torture each format, like very small messages (Bebop win) or being finicky in very large messages with large arrays of messages (Flatbuf win). But the fact that it's mentioned and then left out of comparison makes me wonder if it's not significantly slower in most cases.

From a usability perspective though, Bebop seems to have a sure win on its hands 😃

andrew · Answer 1 · Wed May 03 2023 09:54:29 GMT+0800 (China Standard Time)

We don’t benchmark or compare to Flatbuffers as they produce very different data structures. Bebop is record-oriented, like Protobuf or JSON, whereas Flatbuffers have their own partial data parsing implementation. So you can’t compare them 1:1 fairly.

I’d assume though they’re equal in performance; you can try implementing a Flatbuffer comparison in the existing benchmarks.