mattgodbolt / zindex

Create an index on a compressed text file

Support "LIKE" queries

mattgodbolt opened this issue

Would tie us forevermore to SQL if not careful, but may be worth it (or else transform some zq-ish expression to SQL).

Thanks for submitting this. The companion on the index-creating side would be to apply some transform to the index before storing it; is that a feature you would be interested in incorporating too?

I'd be happy to support both! The LIKE thing ought to be pretty simple, whereas the transform requires a little more work.

I wonder if a more general approach like "execute this UNIX command for each line and use its output" would be better for all of these? But that may be too slow for indexing. It's certainly more "UNIXy".

Something like:

zindex foo.gz --exec "bash -c 'cut -f1 -d\  | tr a-z A-Z'"

That might also cover the "jq" case referenced in #4 too.
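To make the "run a command and use its output as the key" idea concrete, here is a minimal Python sketch. It is not zindex code: `derive_keys` is a hypothetical helper, and it assumes the command emits exactly one output line per input line. It starts the command once and streams every line through it, rather than spawning a process per line, which is one way to sidestep the speed concern.

```python
import subprocess


def derive_keys(lines, command):
    """Hypothetical helper: feed all lines to one shell command, return one key per line."""
    data = "".join(line.rstrip("\n") + "\n" for line in lines)
    result = subprocess.run(
        command,
        shell=True,           # the command is a shell pipeline, as in the example above
        input=data,
        capture_output=True,
        text=True,
        check=True,           # raise if the command exits non-zero
    )
    keys = result.stdout.splitlines()
    if len(keys) != len(lines):
        # Assumption: the transform is strictly line-for-line.
        raise ValueError("command must emit exactly one output line per input line")
    return keys


lines = ["alpha one", "beta two", "gamma three"]
keys = derive_keys(lines, "cut -f1 -d' ' | tr a-z A-Z")

# Map each derived key to its 1-based line number, standing in for a real index.
index = {key: number for number, key in enumerate(keys, start=1)}
print(index)
```

A real indexer would record byte offsets into the compressed stream rather than line numbers; the dictionary here only illustrates the key derivation step.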

Looking at how LIKE actually runs, it ends up scanning every entry in the index, which is obviously slower than an exact-match lookup. A pre-transform seems to be more performant if you know ahead of time what the transform should be. I've opened #14 to track this separate request.
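The scanning behaviour can be checked with SQLite's `EXPLAIN QUERY PLAN`. The sketch below uses Python's built-in `sqlite3` module against a made-up table; the `entries` table, its columns, and the `entries_key` index are assumptions for illustration and not necessarily the schema zindex writes.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (key TEXT, line INTEGER)")
conn.execute("CREATE INDEX entries_key ON entries (key)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [(f"key{i}", i) for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # detail is the last column


exact_plan = plan("SELECT line FROM entries WHERE key = ?", ("key42",))
like_plan = plan("SELECT line FROM entries WHERE key LIKE ?", ("%42%",))

print("exact:", exact_plan)  # a SEARCH step that uses the index
print("like: ", like_plan)   # a SCAN step that visits every row
```

The exact wording of the plan text varies between SQLite versions, but the equality query should report a `SEARCH` using the index while the `LIKE` query with a leading wildcard reports a `SCAN`.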

In 5aab30b I've started an experiment in which I pipe output to an external command to create the index.

Raw queries (zq --raw) allow this, and many other types of queries, at the cost of exposing the sqlite innards.