opendistro-for-elasticsearch / k-NN

🆕 A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro.

Home Page: https://opendistro.github.io/

Use with frequent updates

z0mb1ek opened this issue

Hi! Big thanks for this project, it is great!
Please tell me if I understand correctly: if I frequently add or update vectors, I will end up with many segments, and search performance will be slow?

Hi @z0mb1ek ,

Yes, that is correct. How much slower depends on how many segments there are. HNSW search complexity scales as O(log(n)), where n is the number of vectors (1).

Lucene will search the segments sequentially. So, taking logs base 2, searching 5 segments with 100,000 vectors each would have a relative cost of 5 * log2(100,000) ~= 83. Searching 1 segment with 500,000 vectors would have a relative cost of log2(500,000) ~= 19.
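
To make the arithmetic above easy to rerun with your own numbers, here is a minimal sketch. It assumes the simplified model from this comment (segments searched one after another, each costing log2 of its vector count) and ignores constants, so the result is a relative figure, not a latency. The function name `estimated_search_cost` is made up for illustration and is not part of the plugin.

```python
import math


def estimated_search_cost(num_segments: int, vectors_per_segment: int) -> float:
    """Relative cost under the simplified model: sequential segments, O(log2 n) each.

    Hypothetical helper for illustration only; not a k-NN plugin API.
    """
    return num_segments * math.log2(vectors_per_segment)


# Same 500,000 vectors, split across 5 segments vs. held in 1 segment.
many_small = estimated_search_cost(5, 100_000)
one_large = estimated_search_cost(1, 500_000)

print(f"5 segments x 100,000 vectors: ~{many_small:.0f}")
print(f"1 segment  x 500,000 vectors: ~{one_large:.0f}")
```

Under this model, the same data spread over 5 segments costs roughly 4x more to search than a single segment.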

thx @jmazanec15

Are there any options for updating at least once every 2-3 hours, without downtime?

What do you mean by update?

Regardless, you won't get downtime.

Adding a portion of vectors, and deleting them.