patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby

Home Page: https://rubydoc.info/gems/langchainrb

Help: Example of how to use the Assistant with a vector DB?

pedroresende opened this issue · comments

It would be great to have an example of how to integrate an assistant with RAG.
Even though the documentation says that "Assistants can be configured with an LLM of your choice (currently only OpenAI), any vector search database and easily extended with additional tools." (https://github.com/andreibondarev/langchainrb/blob/main/lib/langchain/assistants/assistant.rb#L5), there is no simple example of how to do it.

@pedroresende I created a tool that wraps any vector search database. Take a glance at the diff here: https://github.com/andreibondarev/langchainrb/compare/add-vectorsearch-wrapper-tool?expand=1.

This is how I tested it out:

# The gem is installed as `langchainrb` but required as `langchain`
require "langchain"

# This could be any LLM. It'll be used to embed documents and queries.
llm = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"])

# Initialize the vectorsearch DB
chroma = Langchain::Vectorsearch::Chroma.new(url: ENV["CHROMA_URL"], index_name: "docs", llm: llm)

# Add documents to it
chroma.create_default_schema
chroma.add_data(paths: [
  # I imported this file: https://www.coned.com/-/media/files/coned/documents/small-medium-large-businesses/gasyellowbook.pdf
  Langchain.root.join("./file1.pdf"),
  Langchain.root.join("./file2.pdf")
])

# Initialize the tool that will be passed to the Assistant
vectorsearch_tool = Langchain::Tool::Vectorsearch.new(vectorsearch: chroma)

# Initialize the Assistant
assistant = Langchain::Assistant.new(
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]),
  thread: Langchain::Thread.new,
  # It's up to you to explain to the Assistant when it should access the vectorsearch DB. You could even tell it to access it every single time before answering a question.
  instructions: "You are a chat bot that helps users find information from the Con Edison Yellow Book that you have stored in your vector search database. Feel free to refer to it when answering questions.",
  tools: [
    vectorsearch_tool
  ]
)

# Ask away!
assistant.add_message_and_run(content: "...", auto_tool_execution: true)
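
For example (the question below is purely illustrative; it assumes the Con Edison PDF mentioned above was ingested):

# Hypothetical question; with auto_tool_execution: true the Assistant
# decides on its own to call the vectorsearch tool before answering.
assistant.add_message_and_run(
  content: "What does the Yellow Book say about requesting new gas service?",
  auto_tool_execution: true
)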

I would really love your feedback on the approach.

@andreibondarev it worked perfectly fine, thanks for the help. The only strange thing is that the following error sometimes occurs:

(screenshot of the error)

I called the tool Langchain::Tool::Vectorsearch.

I know you did. I renamed it to try to debug whether it was clashing for some reason, but I'm getting exactly the same error.

Make sure to modify the NAME constant as well:

module Langchain::Tool
  class Vectorsearchtool < Base
    NAME = "vectorsearchtool"
    # ...
  end
end

Sure thing

Resolved.

This currently does not work.

If I use .ask, it returns a relevant response.

If I use the Vectorsearch tool inside an assistant, it just tells me it can't find the relevant information.

I ended up creating my own tool that directly uses the .ask method, which consistently returned relevant information.
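
A minimal sketch of that kind of tool, for anyone following along (the class name, NAME constant, and method here are hypothetical; it follows the module/NAME conventions shown earlier in the thread and simply delegates to the vectorsearch instance's .ask method):

module Langchain::Tool
  class DocsQa < Base
    NAME = "docs_qa"

    def initialize(vectorsearch:)
      @vectorsearch = vectorsearch
    end

    # Delegate to the full RAG pipeline (retrieve + generate) instead of
    # returning raw similarity-search results.
    def ask(question:)
      @vectorsearch.ask(question: question)
    end
  end
end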

@Jellyfishboy Could you please show me how you were using the Langchain::Tool::Vectorsearch tool?

@andreibondarev Is it possible to indicate which vector DB to search if I have multiple indices and want to switch based on context?

E.g. if the instruction is "You're a helpful chat bot that helps users find relevant information. If the user asks x then search through vector db 1, and if the user asks y then search through vector db 2"

@mengqing You may need to create your own tool that is similar to https://github.com/patterns-ai-core/langchainrb/tree/main/lib/langchain/tool/vectorsearch.

There's no issue with doing something like:

assistant = Langchain::Assistant.new(
  ...
  tools: [
    Langchain::Tool::Vectorsearch.new(vectorsearch: Langchain::Vectorsearch::Chroma.new(index_name: "private_documents")),
    Langchain::Tool::Vectorsearch.new(vectorsearch: Langchain::Vectorsearch::Chroma.new(index_name: "public_documents"))
  ]
)

But you need to override the tool definitions that are being passed to the LLM because they should not be identical.
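
One way to do that, sketched under the same conventions shown earlier in the thread (these subclass names are made up), is to subclass the tool so each instance reports a distinct NAME, pairing each with its own definition describing which index it searches:

module Langchain::Tool
  # Hypothetical subclasses whose only job is to present distinct names
  # (and, via their definitions, distinct descriptions) to the LLM.
  class PrivateDocsSearch < Vectorsearch
    NAME = "private_docs_search"
  end

  class PublicDocsSearch < Vectorsearch
    NAME = "public_docs_search"
  end
end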

Thanks, that's also what I had in mind. Although I'm not sure how the LLM will decide which vector DB to use. Is this determined by the name and description defined in the definition file?

@mengqing Yes, the name and description are sent to the LLM and it uses those as context.
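
For reference, what the LLM ultimately receives is an OpenAI-style tool definition, roughly shaped like the hash below (the name, description, and parameters here are illustrative, not what langchainrb literally sends); this is the context it uses when choosing a tool:

{
  type: "function",
  function: {
    name: "private_docs_search",
    description: "Searches the private documents index for passages relevant to the query.",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "The search query" }
      },
      required: ["query"]
    }
  }
}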