morganstanley / hobbes

A language and an embedded JIT compiler

Home Page: http://hobbes.readthedocs.io/

Structured logs of unit payloads can have sequence types collapsed to 'long'

kthielen opened this issue · comments

I was just looking at a pathological structured log that has a ton of unit payloads, and realized that we don't need to store these as a bunch of linked counts (which is essentially all that a batch of unit values amounts to).

Instead, we can just collapse the whole unit sequence into a single number: how many of those unit values were stored. Where we correlate this data in other sequences (e.g. the transaction or log sequences), it doesn't matter either, because we don't really store references to unit values (they encode in 0 bits, and it's always possible to make one trivially).

In some cases, this can be a significant space saving, and also a significant time saving for queries over these values (where we'd otherwise run all over memory finding all of the ints to add together).
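
To make the idea concrete, here is a rough Python sketch of the two representations. It is only an illustration, not the actual hobbes storage layout: `UnitBatch`, `count_linked`, `CollapsedUnitSeq` and `collapse` are all hypothetical names, and Python's empty tuple `()` stands in for the unit value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnitBatch:
    """Hypothetical batch node. A unit carries no data, so a batch is only a count."""
    count: int
    next: Optional["UnitBatch"] = None


def count_linked(head: Optional[UnitBatch]) -> int:
    """Total number of units in the linked form: one pointer hop per batch."""
    total = 0
    node = head
    while node is not None:
        total += node.count
        node = node.next
    return total


@dataclass
class CollapsedUnitSeq:
    """Collapsed form: the whole sequence is a single integer (the 'long')."""
    length: int = 0

    def append(self) -> None:
        # Storing another unit value only bumps the counter.
        self.length += 1

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, i: int) -> tuple:
        # No element is stored; a unit value is rebuilt on demand.
        if not 0 <= i < self.length:
            raise IndexError(i)
        return ()


def collapse(head: Optional[UnitBatch]) -> CollapsedUnitSeq:
    """One-time conversion from the linked batches to the collapsed count."""
    return CollapsedUnitSeq(count_linked(head))
```

With this shape, asking for the length of the collapsed sequence is a single field read, while the linked form has to visit every batch. Indexing still works in the collapsed form because any in-range position can hand back a freshly made unit.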