FTS: Field `k` is required for HNSW search
iSuslov opened this issue · comments
In a sandbox here https://www.cozodb.org/wasm-demo/ I created a table:
:create my_table {id: String, year: Validity => president: String}
?[id, year, president] <- [['US1', [2001, true], 'Bush'],
['US2', [2005, true], 'Bush'],
['US3', [2009, true], 'Obama'],
['US4', [2013, true], 'Obama'],
['US5', [2017, true], 'Trump'],
['US6', [2021, true], 'Biden']]
:put my_table {id, year => president}
Then created an index:
::fts create my_table:my_fts_index {
extractor: president,
tokenizer: Simple,
filters: [Lowercase, Stemmer('english'), Stopwords('en')]
}
Then I tried an FTS query and got the error "Field `k` is required for HNSW search":
?[id, year, president, score] := ~my_table:my_fts_index {id, year, president | query: $q, bind_score: score }
:order -score
Somehow it thinks that I'm trying to use HNSW instead of FTS.
Full error:
parser::hnsw_query_required
× Field `k` is required for HNSW search
╭─[1:1]
1 │ ?[id, year, president, score] := ~my_table:my_fts_index {id, year, president | query: $q, bind_score: score }
· ────────────────────────────────────────────────────────────────────────────
2 │ :order -score
╰────
The documentation at https://docs.cozodb.org/en/latest/releases/v0.7.html#full-text-search doesn't mention the need for `k`. When I add `k: 10`, everything works.
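For reference, this is roughly what the working query looks like. It is the same query as above with `k: 10` added; the value 10 is just what I tried, and `$q` is assumed to be passed in as a query parameter holding the search text.

```
?[id, year, president, score] := ~my_table:my_fts_index {id, year, president | query: $q, k: 10, bind_score: score }
:order -score
```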
I can see that this part of the docs, https://docs.cozodb.org/en/latest/vector.html#full-text-search-fts, does mention `k`. Probably only the release announcement page mentioned above needs to be fixed.
Update 1: In the doc for tokenizers it says:
Tokenizer is specified in the configuration as a function call such as Ngram(9), or if you omit all arguments, Ngram is also acceptable.
But when I try to use `Ngram` as a tokenizer value, I get the error `Unknown tokenizer: Ngram`.
On the same page, the parameters `tokenizer: Simple` and `n_gram: 3` are used together for the LSH index: https://docs.cozodb.org/en/latest/vector.html#minhash-lsh-for-near-duplicate-indexing-of-strings-and-lists
This needs clarification.
Update 2: It seems that `import_relations` does not build the existing indexes for imported values.