Reduce allocations in `BytesEncode`
nolanderc opened this issue · comments
The `BytesEncode` trait is currently defined as follows (lines 20 to 26 in commit 3406383):
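For readers without the repository at hand, the trait looks roughly like this (a sketch from memory; the exact pinned lines may differ), together with a codec impl that shows the forced allocation:

```rust
use std::borrow::Cow;

// Rough shape of the current trait (sketch; the pinned revision may differ).
pub trait BytesEncode<'a> {
    type EItem: ?Sized + 'a;
    fn bytes_encode(item: &'a Self::EItem) -> Option<Cow<'a, [u8]>>;
}

// A big-endian u32 codec: the byte swap forces an owned, heap-allocated buffer.
pub struct BEU32;

impl<'a> BytesEncode<'a> for BEU32 {
    type EItem = u32;
    fn bytes_encode(item: &'a u32) -> Option<Cow<'a, [u8]>> {
        Some(Cow::Owned(item.to_be_bytes().to_vec())) // one heap allocation per key
    }
}

fn main() {
    let bytes = BEU32::bytes_encode(&1).unwrap();
    assert_eq!(bytes.as_ref(), &[0u8, 0, 0, 1][..]);
}
```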
There are some issues with this definition:
- Unless the encoded type can be trivially transmuted to a slice of bytes (`str`, `[u8]`, `U32<NativeEndian>`, ...), we have to allocate a temporary buffer where the value is written.
- There is no way to get a temporary buffer without going through the global allocator.
- Allocating a temporary buffer incurs an extra copy.
For example, most architectures in use today are little-endian (x86, Arm); however, for lexicographic ordering to sort integers correctly, we have to store them in big-endian. Since this incurs a byte swap on little-endian architectures, we cannot simply return a pointer to the integer itself; instead, we have to allocate a temporary buffer to hold the byte-swapped result. That is one heap allocation per integer, something which could have been done trivially on the stack or in a CPU register.
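A minimal demonstration of why big-endian is required: comparing the raw bytes of two integers lexicographically only agrees with their numeric order when they are stored big-endian.

```rust
fn main() {
    let (a, b) = (1u32, 256u32);

    // Big-endian bytes compare lexicographically in the same order as the integers.
    assert!(a.to_be_bytes() < b.to_be_bytes());

    // Little-endian bytes do not: 1 is [1, 0, 0, 0] but 256 is [0, 1, 0, 0].
    assert!(a.to_le_bytes() > b.to_le_bytes());
}
```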
Optimizations
In most cases we know, a priori, the maximum size of the needed allocation:
- LMDB forces keys to be 511 bytes or less.
- Primitive types and arrays have a compile-time known size.
In some cases it may be preferable to run serialization in two phases:
- First, we serialize to a "null" writer which counts the number of bytes needed for the value.
- With this size in hand, we can have LMDB reserve a memory area in the database which fits the serialized value (using `MDB_RESERVE`).
- We then serialize our value directly into the reserved space.
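The first phase can be a trivial `std::io::Write` implementation that discards the bytes and only tracks the count. A sketch (`CountingWriter` is a hypothetical helper, not part of heed):

```rust
use std::io::{self, Write};

/// A "null" writer: discards the bytes and only counts how many were written.
struct CountingWriter {
    count: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.count += buf.len();
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut counter = CountingWriter { count: 0 };
    counter.write_all(&42u64.to_be_bytes()).unwrap();
    counter.write_all(b"suffix").unwrap();
    assert_eq!(counter.count, 14); // 8 bytes + 6 bytes
}
```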
A new trait
With the above in mind, we can make a few changes to the `BytesEncode` trait (naming TBD):
```rust
/// Set to `true` if the value can be trivially cast to `&[u8]` (e.g., `str`, `[u8]`, ...).
/// This is used as a hint by the library to avoid using intermediate buffers.
const ZERO_COPY: bool = false;

/// A hint to the database about the final size of the encoded value.
/// Can be used to reduce allocations and memory copies.
///
/// Must be exactly equal to the final encoded size,
/// otherwise encoding will result in an error.
fn size_hint(item: &Self::EItem) -> Option<usize> {
    None
}

/// Encode the value into a pre-allocated buffer.
/// The buffer could be allocated on the stack if we are encoding keys (511 bytes),
/// or allocated on the heap (with an initial capacity equal to `size_hint`).
fn encode_writer<W: std::io::Write>(item: &Self::EItem, writer: &mut W) -> Result<()> {
    // the default impl forwards to the old method
    writer.write_all(&Self::bytes_encode(item)?)?;
    Ok(())
}
```
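Under this definition, an integer codec needs no allocation at all. A hypothetical implementation might look like this (a sketch only; the trait name `BytesEncode2` and its exact shape are assumptions based on the proposal, not heed's actual API):

```rust
use std::io::Write;

// Assumed trait shape, following the proposal (not heed's actual API).
trait BytesEncode2 {
    type EItem: ?Sized;
    const ZERO_COPY: bool = false;
    fn size_hint(_item: &Self::EItem) -> Option<usize> {
        None
    }
    fn encode_writer<W: Write>(item: &Self::EItem, writer: &mut W) -> std::io::Result<()>;
}

struct BEU32;

impl BytesEncode2 for BEU32 {
    type EItem = u32;

    // The encoded size of a u32 is always known at compile time.
    fn size_hint(_item: &u32) -> Option<usize> {
        Some(4)
    }

    // The byte swap happens in a register; no heap allocation is needed.
    fn encode_writer<W: Write>(item: &u32, writer: &mut W) -> std::io::Result<()> {
        writer.write_all(&item.to_be_bytes())
    }
}

fn main() {
    let mut buf = [0u8; 4]; // stack buffer, sized from `size_hint`
    let mut slice = &mut buf[..];
    BEU32::encode_writer(&0x0102_0304, &mut slice).unwrap();
    assert_eq!(buf, [1, 2, 3, 4]);
}
```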
Based on the value of `ZERO_COPY`, we can decide between calling `bytes_encode` to get our slice directly, or using a pre-allocated buffer with `encode_writer`.

We could also consider making `encode_writer` the required method and defining `bytes_encode` in terms of `encode_writer` with a `Vec<u8>` as our `Write`r. However, this would be a breaking change.
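The breaking variant could default `bytes_encode` roughly as follows, since `Vec<u8>` implements `std::io::Write` (a sketch on an assumed trait shape, not heed's actual API):

```rust
use std::borrow::Cow;
use std::io::{self, Write};

// Assumed trait shape for illustration; not heed's actual API.
trait BytesEncode2 {
    type EItem: ?Sized;

    // `encode_writer` becomes the required method...
    fn encode_writer<W: Write>(item: &Self::EItem, writer: &mut W) -> io::Result<()>;

    // ...and `bytes_encode` gets a default impl that collects into a `Vec<u8>`.
    fn bytes_encode(item: &Self::EItem) -> io::Result<Cow<'_, [u8]>> {
        let mut buffer = Vec::new();
        Self::encode_writer(item, &mut buffer)?;
        Ok(Cow::Owned(buffer))
    }
}

struct BEU32;

impl BytesEncode2 for BEU32 {
    type EItem = u32;
    fn encode_writer<W: Write>(item: &u32, writer: &mut W) -> io::Result<()> {
        writer.write_all(&item.to_be_bytes())
    }
}

fn main() {
    // The default `bytes_encode` still works, at the cost of a heap allocation.
    assert_eq!(BEU32::bytes_encode(&256).unwrap().as_ref(), &[0u8, 0, 1, 0][..]);
}
```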
Example Usage
Let's look at how we might implement `Database::put` with the above definition (pseudocode):
```rust
fn put(db, key: impl BytesEncode, value: impl BytesEncode) {
    if Key::ZERO_COPY {
        key_bytes = key.bytes_encode()?; // should yield &[u8] without allocation
    } else {
        let mut buffer = [0u8; 511];
        let mut array_writer = std::io::Cursor::new(&mut buffer[..]);
        key.encode_writer(&mut array_writer)?;
        key_bytes = &buffer[..array_writer.position() as usize];
    }

    if !Value::ZERO_COPY {
        if let Some(size) = value.size_hint() {
            // allocate space directly in the DB for the value before writing
            return db.put_reserve(key_bytes, size, |space| value.encode_writer(space));
        }
    }

    // Should yield `&[u8]` directly if ZERO_COPY is true.
    // Otherwise, the implementation likely makes better decisions about allocating a `Vec`.
    // Although, we could possibly pre-allocate a small buffer on the stack, something the implementation cannot.
    value_bytes = value.bytes_encode()?;

    mdb_put(db, key_bytes, value_bytes)?;
}
```
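The key branch above relies on `std::io::Cursor` writing into a fixed stack array, which already works in today's standard library (511 bytes is LMDB's default maximum key size):

```rust
use std::io::{Cursor, Write};

fn main() {
    // 511 bytes is LMDB's default maximum key size, so keys always fit on the stack.
    let mut buffer = [0u8; 511];
    let mut writer = Cursor::new(&mut buffer[..]);

    // Encode a big-endian u32 key without touching the heap.
    writer.write_all(&42u32.to_be_bytes()).unwrap();

    let len = writer.position() as usize;
    let key_bytes = &buffer[..len];
    assert_eq!(key_bytes, &[0u8, 0, 0, 42][..]);
}
```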
Note that we can avoid all temporary allocations and intermediate memory copies in every case except one: when the resulting size of `value` is unknown.
Thank you for this proposal @nolanderc, very appreciated 😃
However, there is one minor issue with the `ZERO_COPY` parameter: it should be possible to determine it dynamically from the given `Self::EItem`. I advise you to look at the Meilisearch use case, which is probably the project that uses heed the most. We use a `CboRoaringBitmap` codec (built on `RoaringBitmap`s) that conditionally stores integers one after the other (when there are not many) or falls back to the default serialization method.
I am pretty sure we can apply this optimization to the Meilisearch codebase and see the performance gains 🤩