AzureSearch with Retriever is not working

premselvang opened this issue a month ago

Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

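from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import AzureChatOpenAI

def get_prompt_doc_word_html():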
    template = """Use the following pieces of context to answer the question at the end.
    If you don't know the answer, just say that you don't know, don't try to make up an answer.

    {context}

    Question: {question}

    Helpful Answer:"""
    custom_rag_prompt = PromptTemplate.from_template(template)
    return custom_rag_prompt

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

vector_store = get_vector_store(index_name_1)

llm = AzureChatOpenAI(
    openai_api_version=openai_version,
    azure_deployment=openai_model_name,
)

retriever = vector_store.as_retriever(
    search_type="similarity",
    k=1,
    filters="Header2 eq '" + header_tag + "'",
)

custom_rag_prompt = get_prompt_doc_word_html()

# When the retriever is used in an LLM chain as below, the filter condition is not applied.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke(standalone_question)

# In the code below, calling the vector store directly, the filter condition is applied.

docs_retr = vector_store.similarity_search(
    query=standalone_question,
    k=3,
    search_type="similarity",
    filters="Header2 eq '" + header_tag + "'",
)
display(docs_retr)

Error Message and Stack Trace (if applicable)

No response

Description

In the file linked below, "_get_relevant_documents" does not forward the filter condition configured on the retriever when it retrieves documents. Instead, it expects the filter condition to be passed as kwargs at call time. That is not possible when the retriever is used inside an LLM chain that has memory, because the chain calls the retriever with only the query.

https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/azuresearch.py

Existing code:
docs = self.vectorstore.hybrid_search(query, k=self.k, **kwargs)

New code suggested:
docs = self.vectorstore.hybrid_search(query, k=self.k, **self.search_kwargs)

Please apply the same change to all search types.
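
To illustrate the difference between the existing and the suggested line, here is a minimal, self-contained sketch. It does not use LangChain or Azure at all: `StubVectorStore`, `CallTimeKwargsRetriever`, and `StoredKwargsRetriever` are hypothetical stand-ins, and the toy filter parser only understands the `Header2 eq '<value>'` form used above. It is meant only to show why a filter stored on the retriever is lost when the search call forwards call-time `**kwargs`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubVectorStore:
    """Hypothetical stand-in for the vector store; records every search call."""

    docs: List[Dict[str, str]]
    calls: List[Dict[str, Any]] = field(default_factory=list)

    def hybrid_search(
        self, query: str, k: int = 4, filters: Optional[str] = None
    ) -> List[Dict[str, str]]:
        self.calls.append({"query": query, "k": k, "filters": filters})
        hits = self.docs
        if filters is not None:
            # Toy parser: only handles "Header2 eq '<value>'".
            wanted = filters.split("'")[1]
            hits = [d for d in hits if d["Header2"] == wanted]
        return hits[:k]


@dataclass
class CallTimeKwargsRetriever:
    """Mirrors the 'Existing code' line: only call-time kwargs reach the store."""

    vectorstore: StubVectorStore
    k: int = 4
    search_kwargs: Dict[str, Any] = field(default_factory=dict)

    def get_relevant_documents(self, query: str, **kwargs: Any) -> List[Dict[str, str]]:
        return self.vectorstore.hybrid_search(query, k=self.k, **kwargs)


class StoredKwargsRetriever(CallTimeKwargsRetriever):
    """Mirrors the 'New code suggested' line: stored search_kwargs are forwarded."""

    def get_relevant_documents(self, query: str, **kwargs: Any) -> List[Dict[str, str]]:
        return self.vectorstore.hybrid_search(query, k=self.k, **self.search_kwargs)


store = StubVectorStore(
    docs=[
        {"Header2": "Intro", "text": "a"},
        {"Header2": "Setup", "text": "b"},
    ]
)
search_kwargs = {"filters": "Header2 eq 'Setup'"}

old = CallTimeKwargsRetriever(store, k=1, search_kwargs=search_kwargs)
new = StoredKwargsRetriever(store, k=1, search_kwargs=search_kwargs)

# A chain calls the retriever with only the query, so no call-time kwargs exist.
old_hits = old.get_relevant_documents("how do I set this up?")
new_hits = new.get_relevant_documents("how do I set this up?")
```

In this sketch `old_hits` comes back unfiltered, while `new_hits` contains only the document whose `Header2` matches. This is the same behaviour difference described above between the chain and the direct `similarity_search` call.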

System Info

NA