Ability to specify the distance threshold when calling similarity_search
andreibondarev opened this issue
Description
In Discord it was asked whether we can specify a distance threshold when calling the `Vectorsearch#ask` method. The need is to return ALL records whose relevance score meets the threshold, as opposed to returning a fixed number of `k:` records.
Tasks
- Explore whether vectorsearch DBs support a distance threshold parameter. If they do, we should implement it. If they do not, we should not, because the filtering can then be done on the client side.
- Modify the `vectorsearch#ask()`, `vectorsearch#similarity_search_by_vector()` and `vectorsearch#similarity_search()` methods to accept a `distance_gte:` ("distance greater than or equal") parameter to set this threshold.
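To make the proposed parameter concrete, here is a minimal sketch of how `distance_gte:` might sit next to `k:`. It uses a toy in-memory store, not the library's actual classes: `ToyVectorsearch`, its `add` method, the cosine-distance scoring and the `k: nil` convention for "return everything" are all assumptions made for illustration. Following the issue's naming, the filter keeps records whose distance is greater than or equal to the threshold. Whether a larger value means more or less relevant depends on the engine.

```ruby
# Hypothetical in-memory store used only to illustrate the proposed
# `distance_gte:` parameter; this is not the library's real API.
class ToyVectorsearch
  Record = Struct.new(:content, :vector)

  def initialize
    @records = []
  end

  def add(content, vector)
    @records << Record.new(content, vector)
  end

  # k:            maximum number of results (nil means "no limit")
  # distance_gte: optional threshold; keeps only records whose distance
  #               to the query is greater than or equal to this value
  def similarity_search_by_vector(embedding:, k: 4, distance_gte: nil)
    scored = @records.map do |record|
      { content: record.content, distance: cosine_distance(embedding, record.vector) }
    end

    scored = scored.select { |s| s[:distance] >= distance_gte } unless distance_gte.nil?

    sorted = scored.sort_by { |s| s[:distance] }
    k.nil? ? sorted : sorted.first(k)
  end

  private

  # Cosine distance = 1 - cosine similarity (assumed metric for this sketch).
  def cosine_distance(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
    1.0 - dot / norm
  end
end
```

A call such as `store.similarity_search_by_vector(embedding: query, k: nil, distance_gte: 0.25)` would then return every record that passes the threshold rather than a fixed number of results. If the first task shows that a DB supports a native threshold, the same keyword could be forwarded to the engine instead of being applied in Ruby.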
Note: We might need to normalize/standardize the distance scores that various vectorsearch engines return.
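One possible shape for that normalization is sketched below, under assumptions: each engine is taken to report one of three metrics (cosine similarity in [-1, 1], cosine distance in [0, 2], or Euclidean distance in [0, ∞)), and each is mapped to a distance in [0, 1] where 0 means identical. The `DistanceNormalizer` module, the metric symbols and the chosen mappings are hypothetical. The metrics that real engines return would need to be confirmed first.

```ruby
# Hypothetical helper that maps engine-specific scores onto a common
# 0.0..1.0 distance scale (0.0 = identical). The metric names and the
# formulas are illustrative assumptions, not taken from any engine's docs.
module DistanceNormalizer
  module_function

  def normalize(value, metric:)
    case metric
    when :cosine_similarity then (1.0 - value) / 2.0    # [-1, 1] -> [1, 0]
    when :cosine_distance   then value / 2.0            # [0, 2]  -> [0, 1]
    when :euclidean         then value / (1.0 + value)  # [0, inf) -> [0, 1)
    else raise ArgumentError, "unknown metric: #{metric}"
    end
  end
end
```

With a mapping like this, a single `distance_gte:` value could mean the same thing regardless of which engine is in use. The Euclidean mapping compresses large distances, so it is not directly comparable to the cosine ones.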