opendistro-for-elasticsearch / k-NN

🆕 A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro.

Home Page: https://opendistro.github.io/

KNN search with _msearch

pratap-surya opened this issue

Greetings,

The documentation says the KNN plugin can be used with the _search API, but I cannot find anything that explains how to use it with the _msearch API. I would be grateful if someone could help.

I tried this search request in Postman:

GET embeddings_1/_msearch
{"size": 2,"query": {"knn": {"my_vector2": {"vector": [0.72662646,0.6169524,0.6315499,0.8576119,0.7918579,0.0062808604,0.8855599,0.9140413,0.67190045,0.62971514,0.3923507,0.9477029,0.47421584,0.93582714,0.139513,0.10633592,0.4266537,0.8289768,0.27715486,0.6586612,0.2178715,0.7618145,0.82551104,0.8624665,0.59842443,0.55542684,0.04072092,0.054840904,0.9203533,0.816474,0.47055548,0.85200113],"k": 2}}}}
{"size": 2,"query": {"knn": {"my_vector2": {"vector": [0.039989088,0.4664009,0.75250006,0.4495583,0.1232088,0.57742596,0.8623316,0.041624475,0.63842386,0.5753898,0.729613,0.21783368,0.34890926,0.82330215,0.36830667,0.5995123,0.6419654,0.43775582,0.3754376,0.2845271,0.029266138,0.4790184,0.38583612,0.6411376,0.85759836,0.44029248,0.10275099,0.80581564,0.26424167,0.14465764,0.050962653,0.6367447],"k": 2}}}

I got the following response for the above request:

{ "error": { "root_cause": [ { "type": "illegal_argument_exception", "reason": "key [size] is not supported in the metadata section" } ], "type": "illegal_argument_exception", "reason": "key [size] is not supported in the metadata section" }, "status": 400 }

FYI I created the index using the following command:

curl -X PUT "<IP address>:9200/embeddings_1?pretty" -H 'Content-Type: application/json' -d '{ "settings": { "index": { "knn": true, "knn.space_type": "cosinesimil", "number_of_shards": 2,"number_of_replicas":1 } }, "mappings": { "properties": { "embeddingVector": { "type": "knn_vector", "dimension": 32 } } } }'

Thanks in advance!

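@pratap-surya
The _msearch API does not take a single JSON document as its body. Like the _bulk API, it expects newline-delimited JSON, so you have to add the header "Content-Type: application/x-ndjson". Each search is a pair of lines: a header (metadata) line, which can be an empty {}, followed by the search body. Your request had no header lines, so the first line was parsed as metadata, which is why "size" was rejected.

You can rewrite your query as follows: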

cat requests
{}
{"query": {"knn": {"embeddingVector": {"vector":[0.72662646,0.6169524,0.6315499,0.8576119,0.7918579,0.0062808604,0.8855599,0.9140413,0.67190045,0.62971514,0.3923507,0.9477029,0.47421584,0.93582714,0.139513,0.10633592,0.4266537,0.8289768,0.27715486,0.6586612,0.2178715,0.7618145,0.82551104,0.8624665,0.59842443,0.55542684,0.04072092,0.054840904,0.9203533,0.816474,0.47055548,0.85200113],"k": 2}}}}
{}
{"size": 2,"query": {"knn": {"embeddingVector": {"vector": [0.039989088,0.4664009,0.75250006,0.4495583,0.1232088,0.57742596,0.8623316,0.041624475,0.63842386,0.5753898,0.729613,0.21783368,0.34890926,0.82330215,0.36830667,0.5995123,0.6419654,0.43775582,0.3754376,0.2845271,0.029266138,0.4790184,0.38583612,0.6411376,0.85759836,0.44029248,0.10275099,0.80581564,0.26424167,0.14465764,0.050962653,0.6367447],"k": 2}}}}

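Then send the file with curl, using --data-binary so that the newlines are preserved (the last line of the file must also end with a newline):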

curl -H "Content-Type: application/x-ndjson" -XGET localhost:9200/embeddings_1/_msearch --data-binary "@requests"; echo
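
If the query vectors come from a script rather than being typed by hand, the requests file can be generated. The following is only a minimal sketch: vectors.txt (one comma-separated vector per line) and the build_msearch_body function are hypothetical names introduced for illustration, and the two demo vectors are shortened, so real ones must have the 32 dimensions declared in the mapping.

```shell
#!/usr/bin/env bash
# Sketch: write one header/body pair per query vector, in NDJSON form.
# Assumptions (not from the thread): vectors.txt is a hypothetical input file
# with one comma-separated vector per line; build_msearch_body is a
# hypothetical helper name.

build_msearch_body() {
  local field="$1" k="$2" vec
  while IFS= read -r vec || [ -n "$vec" ]; do
    [ -z "$vec" ] && continue   # skip blank lines
    # Header line: empty, so the index named in the URL is used.
    printf '{}\n'
    # Body line: the knn query itself, terminated by a newline.
    printf '{"size": %s, "query": {"knn": {"%s": {"vector": [%s], "k": %s}}}}\n' \
      "$k" "$field" "$vec" "$k"
  done
}

# Demo input with shortened vectors (illustration only).
printf '0.1,0.2,0.3\n0.4,0.5,0.6\n' > vectors.txt

build_msearch_body embeddingVector 2 < vectors.txt > requests
```

The resulting requests file has the same shape as the one above and can be sent with the same curl command.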

Please let us know if it doesn't work.

Worked like a charm! Thanks so much :)