Variable-length encoding should be optionally enabled for all data blocks
simerplaha opened this issue
Current behaviour
Currently, variable-length encoding is used by default for all data blocks (sortedIndex, hashIndex, bloomFilter, etc.).
Issue
All reads first read the byte[] into heap (cached if configured) and then iterate over that byte[] to decode the primitive value.
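To illustrate the cost described above, here is a minimal sketch of decoding a varint from an on-heap byte[]. It assumes the common 7-bits-per-byte unsigned scheme; the actual on-disk format may differ, and the class and method names are hypothetical.

```java
import java.util.Arrays;

class VarintRead {
    // Encodes an int as an unsigned varint: 7 payload bits per byte,
    // high bit set on every byte except the last. (Assumed format.)
    static byte[] writeUnsignedInt(int value) {
        byte[] tmp = new byte[5]; // an int needs at most 5 varint bytes
        int i = 0;
        while ((value & ~0x7F) != 0) {
            tmp[i++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        tmp[i++] = (byte) value;
        return Arrays.copyOf(tmp, i);
    }

    // Decoding has to walk the array byte by byte until it finds
    // a byte without the continuation bit.
    static int readUnsignedInt(byte[] bytes, int offset) {
        int result = 0;
        int shift = 0;
        while (true) {
            byte b = bytes[offset++];
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    public static void main(String[] args) {
        byte[] encoded = writeUnsignedInt(300);
        System.out.println(encoded.length);              // 2
        System.out.println(readUnsignedInt(encoded, 0)); // 300
    }
}
```

The loop in readUnsignedInt is the per-read iteration this issue refers to, and it only works once the bytes are already on heap.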
Solution
Off-heap ByteBuffer (#284) allocations will allow us to read primitives (Int, Long & Byte) directly from memory. This would reduce heap allocations.
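A rough sketch of what fixed-width reads from an off-heap buffer could look like, using only the standard java.nio.ByteBuffer API. The class name and block layout (an int, then a long, then a byte) are made up for illustration and are not the layout proposed in #284.

```java
import java.nio.ByteBuffer;

class FixedWidthRead {
    // Hypothetical layout: int at offset 0, long at offset 4, byte at offset 12.
    static final int INT_OFFSET = 0;
    static final int LONG_OFFSET = INT_OFFSET + Integer.BYTES;
    static final int BYTE_OFFSET = LONG_OFFSET + Long.BYTES;

    // Writes the three primitives into a direct (off-heap) buffer.
    static ByteBuffer writeBlock(int i, long l, byte b) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(BYTE_OFFSET + Byte.BYTES);
        buffer.putInt(i).putLong(l).put(b);
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer block = writeBlock(300, 1L << 40, (byte) 7);
        // Absolute getters read at a known offset: no intermediate byte[]
        // and no decode loop.
        System.out.println(block.getInt(INT_OFFSET));   // 300
        System.out.println(block.getLong(LONG_OFFSET)); // 1099511627776
        System.out.println(block.get(BYTE_OFFSET));     // 7
    }
}
```

Because every value has a fixed width, offsets are known up front, which is what makes the direct reads possible.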
Drawback
This increases the cost of storage (and the required cache size) since the ByteBuffer and Unsafe APIs do not provide variable-length encoding & decoding. It should therefore be configurable for each data block, so we can choose between performance and storage savings, or strike a balanced tradeoff by enabling varints for some blocks and not others.
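One possible shape for the per-block setting, as a sketch only. The enum, the map and the choice of which blocks use which encoding are all hypothetical. The default falls back to varints to match the current behaviour.

```java
import java.util.Map;

class BlockEncodingConfig {
    // Hypothetical setting: how integers are stored within a block.
    enum IntEncoding {
        VARINT {
            // 7 payload bits per byte, so small values take fewer bytes.
            int sizeOf(int value) {
                int size = 1;
                while ((value >>>= 7) != 0) size++;
                return size;
            }
        },
        FIXED {
            // Always 4 bytes, but readable directly at a known offset.
            int sizeOf(int value) {
                return Integer.BYTES;
            }
        };

        abstract int sizeOf(int value);
    }

    // Example assignment only: keep varints where size matters most.
    static final Map<String, IntEncoding> PER_BLOCK = Map.of(
        "sortedIndex", IntEncoding.VARINT,
        "hashIndex", IntEncoding.FIXED,
        "bloomFilter", IntEncoding.FIXED
    );

    static IntEncoding encodingFor(String block) {
        return PER_BLOCK.getOrDefault(block, IntEncoding.VARINT);
    }

    public static void main(String[] args) {
        System.out.println(encodingFor("sortedIndex").sizeOf(300)); // 2
        System.out.println(encodingFor("hashIndex").sizeOf(300));   // 4
    }
}
```

For a value like 300 the varint takes 2 bytes against 4 for fixed width, which is the storage side of the tradeoff.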