patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby

Home Page: https://rubydoc.info/gems/langchainrb

Ability to specify the distance threshold when calling similarity_search

andreibondarev opened this issue

Description

In Discord it was asked whether we can specify a distance threshold when calling the Vectorsearch#ask method. The need is to return ALL records that satisfy a relevance-score threshold, as opposed to returning a fixed number (k:) of records.

Tasks

  • Explore whether vectorsearch DBs support a distance threshold parameter. If they do, we should implement it. If they do not, we should not, because the filtering can then be done on the client side.
  • Modify the vectorsearch#ask(), vectorsearch#similarity_search_by_vector() and vectorsearch#similarity_search() methods to accept a distance_gte: ("distance greater than or equal to") parameter that sets this threshold.
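
For reference, a minimal sketch of what the client-side fallback could look like in plain Ruby. The Result struct and the filter_by_distance helper are hypothetical names used only for illustration; they are not part of langchainrb, and real vectorsearch clients return their own result shapes.

```ruby
# Hypothetical result record (assumption: each hit exposes a numeric distance).
Result = Struct.new(:content, :distance, keyword_init: true)

# Keep every result whose distance is >= the threshold, mirroring the
# proposed distance_gte: parameter. With no threshold, return everything.
def filter_by_distance(results, distance_gte: nil)
  return results if distance_gte.nil?

  results.select { |result| result.distance >= distance_gte }
end

results = [
  Result.new(content: "a", distance: 0.12),
  Result.new(content: "b", distance: 0.47),
  Result.new(content: "c", distance: 0.81)
]

filtered = filter_by_distance(results, distance_gte: 0.4)
puts filtered.map(&:content).inspect
```

Because the number of matches is not known in advance, a client-side filter like this still needs a large enough k: on the underlying query, which is one reason a native threshold parameter is preferable where the DB supports it.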

Note: We might need to normalize/standardize the distance scores that various vectorsearch engines return.
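
One possible way to standardize scores is sketched below. It maps each engine's raw score onto a common 0.0..1.0 relevance scale. The module name, the metric symbols, and the choice of mappings (in particular 1 / (1 + d) for Euclidean distance) are assumptions made for illustration, not conventions that langchainrb or any particular engine defines.

```ruby
# Hypothetical helper: converts a raw score into a relevance value in 0.0..1.0,
# where 1.0 means "most similar". The mappings are illustrative choices only.
module ScoreNormalizer
  module_function

  def normalize(score, metric:)
    case metric
    when :cosine_similarity
      # Cosine similarity lies in [-1, 1]; shift and scale into [0, 1].
      (score + 1.0) / 2.0
    when :cosine_distance
      # Cosine distance (1 - similarity) lies in [0, 2]; invert and scale.
      1.0 - (score / 2.0)
    when :euclidean
      # Euclidean distance is unbounded above; squash it into (0, 1].
      1.0 / (1.0 + score)
    else
      raise ArgumentError, "unknown metric: #{metric.inspect}"
    end
  end
end

puts ScoreNormalizer.normalize(0.5, metric: :cosine_distance)
```

With a shared scale like this, a single threshold value would mean roughly the same thing across engines, although the mappings are only monotone rescalings and do not make different metrics strictly comparable.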