tomarrell / rust-elias-fano

Elias-Fano encoding implementation in Rust

Skip a given number of elements

fulmicoton opened this issue · comments

This is similar but different from #5.

There are some use cases for skipping a given number of elements.
Once again, Elias-Fano makes this possible by counting zeros and ones in the unary-encoded sequence.

One use case came up in tantivy when encoding positions. Given a term, the term positions for all of the documents are concatenated and encoded as an increasing sequence.
Accessing the positions in the n-th document is made possible by summing the term frequencies of documents 1 to (n-1) and skipping that many values in the Elias-Fano encoded positions.
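
To make the arithmetic concrete, here is a minimal sketch of the lookup described above. It is not tantivy's code: the function names are hypothetical, documents are indexed from 0 rather than 1, and a plain slice stands in for the decoded Elias-Fano sequence.

```rust
/// Hypothetical helper: how many values precede the positions of document
/// `doc_idx` (0-based), i.e. the sum of the term frequencies before it.
fn positions_to_skip(term_freqs: &[u32], doc_idx: usize) -> usize {
    term_freqs[..doc_idx].iter().map(|&tf| tf as usize).sum()
}

/// Hypothetical helper: the values belonging to document `doc_idx`.
/// `positions` is a plain slice standing in for the encoded sequence.
fn positions_for_doc<'a>(positions: &'a [u64], term_freqs: &[u32], doc_idx: usize) -> &'a [u64] {
    let start = positions_to_skip(term_freqs, doc_idx);
    let len = term_freqs[doc_idx] as usize;
    &positions[start..start + len]
}
```

With term frequencies `[2, 3, 1]`, reaching the second document means skipping 2 values and then reading 3. On an encoded sequence the slice indexing would be replaced by a skip followed by reading `len` values.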

Thanks for this. Correct me if I'm wrong, but is what you're asking for here similar to the following:

By skipping a given number of EF-encoded values, are you not performing a visit at the current internal position plus the number you would like to skip? Something like ef.visit(ef.position() + num_to_skip).
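
A toy sketch of that equivalence follows. `ToyCursor` is a hypothetical stand-in, not this crate's type: it wraps a sorted `Vec` instead of encoded data, and its method signatures are assumptions made for illustration only.

```rust
/// Toy stand-in for an Elias-Fano cursor (hypothetical, not the crate's API).
struct ToyCursor {
    values: Vec<u64>,
    position: usize,
}

impl ToyCursor {
    fn new(values: Vec<u64>) -> Self {
        ToyCursor { values, position: 0 }
    }

    /// Index of the element the cursor currently points at.
    fn position(&self) -> usize {
        self.position
    }

    /// Moves to `position` and returns the value there.
    /// Returns None and leaves the cursor untouched if out of bounds.
    fn visit(&mut self, position: usize) -> Option<u64> {
        let value = *self.values.get(position)?;
        self.position = position;
        Some(value)
    }

    /// Skipping is a visit relative to the current position.
    fn skip(&mut self, num_to_skip: usize) -> Option<u64> {
        let target = self.position() + num_to_skip;
        self.visit(target)
    }
}
```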

If that's not the case, I might simply be misinterpreting what you're after. Let me know if that is it or not. :)

Oh yes you are right. This is the same spec.

The implementation must be very slow: you could use popcounts instead of this loop:

    for _ in 0..skip {
        pos = get_next_set(&self.b, (pos + 1) as usize);
    }
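
For illustration, a popcount-based version could look roughly like the sketch below. It assumes the bit vector is available as a slice of `u64` words with bit 0 as the least significant bit of the first word, which may not match how `self.b` is actually stored; `skip_set_bits` is a hypothetical name. Whole words are consumed with `count_ones`, and only the final word is scanned bit by bit.

```rust
/// Returns the index of the `skip`-th set bit strictly after `pos`,
/// or None if fewer than `skip` set bits remain. `skip == 0` returns `pos`.
/// Assumes LSB-first bit order inside each u64 word.
fn skip_set_bits(words: &[u64], pos: usize, skip: usize) -> Option<usize> {
    if skip == 0 {
        return Some(pos);
    }
    let mut remaining = skip;
    let start = pos + 1;
    let mut word_idx = start / 64;
    if word_idx >= words.len() {
        return None;
    }
    // In the first word, ignore the bits below `start`.
    let mut word = words[word_idx] & (u64::MAX << (start % 64));
    loop {
        let ones = word.count_ones() as usize;
        if ones >= remaining {
            // The target lies in this word: drop the lowest set bit
            // `remaining - 1` times, then the lowest set bit is the answer.
            for _ in 1..remaining {
                word &= word - 1;
            }
            return Some(word_idx * 64 + word.trailing_zeros() as usize);
        }
        // Skip the whole word in one step using its popcount.
        remaining -= ones;
        word_idx += 1;
        if word_idx >= words.len() {
            return None;
        }
        word = words[word_idx];
    }
}
```

Compared with the loop above, the number of steps depends on the number of words crossed plus the set bits scanned in the last word, rather than on one `get_next_set` call per skipped element.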

Fantastic, I'll have a look into maximizing performance and providing a few more helpful methods for common use cases. I will take the performance optimizations into a separate issue. Thanks!