PineconeHybridSearchRetriever does not support search_kwargs
danterran00 opened this issue · comments
Checked other resources
- I added a very descriptive title to this issue.
- I searched the LangChain documentation with the integrated search.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
I am using the PineconeHybridSearchRetriever class, whose _get_relevant_documents method does not yet accept search_kwargs as a parameter. I added it manually, following a sample, but now my question is how to pass this parameter through a PineconeHybridSearchRetriever object.
def _get_relevant_documents(
    self,
    query: str,
    *,
    run_manager: CallbackManagerForRetrieverRun,
    search_kwargs: Optional[Dict] = None,
) -> List[Document]:
    # ... (dense_vec and sparse_vec are computed here, unchanged)
    result = self.index.query(
        vector=dense_vec,
        sparse_vector=sparse_vec,
        top_k=self.top_k,
        include_metadata=True,
        namespace=self.namespace,
        **(search_kwargs if search_kwargs is not None else {}),
    )
On top of that, I am wrapping it in a MultiQueryRetriever as follows:
retriever = MultiQueryRetriever(
    retriever=myPineconeHybridSearchRetriever,
    llm_chain=llm_chain,
    parser_key="lines",
    include_original=True,
)
Many thanks!
Error Message and Stack Trace (if applicable)
No response
Description
I am using the PineconeHybridSearchRetriever class, whose _get_relevant_documents method does not yet accept search_kwargs as a parameter. I added it manually, following a sample, but now my question is how to pass this parameter through a PineconeHybridSearchRetriever object.
System Info
python 3.11.4
langchain 0.1.0
langchain-community 0.0.10
langchain-core 0.1.33
retriever = MultiQueryRetriever(
    retriever=myPineconeHybridSearchRetriever,
    llm_chain=llm_chain,
    parser_key="lines",
    include_original=True,
)
like this
retriever = MultiQueryRetriever(
    retriever=myPineconeHybridSearchRetriever,
    llm_chain=llm_chain,
    parser_key="lines",
    include_original=True,
    sparse_vector=sparse_vec,
    top_k=self.top_k,
)
Let me see
The solution for me was to modify the PineconeHybridSearchRetriever class, adding a search_kwargs field:
search_kwargs: dict = Field(default_factory=lambda: dict(k=100))
"""Keyword arguments to pass to the vectorstore hybrid search."""
and then unpacking it in the index.query call inside _get_relevant_documents:
result = self.index.query(
    vector=dense_vec,
    sparse_vector=sparse_vec,
    top_k=self.top_k,
    include_metadata=True,
    **(self.search_kwargs if self.search_kwargs is not None else {}),
)
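One thing to watch with this kind of unpacking: if search_kwargs contains a key that is also passed explicitly (top_k, for example), Python raises a TypeError for the duplicate keyword. A minimal sketch of merging the two first, so that search_kwargs overrides the defaults. Note that build_query_kwargs and fake_query are hypothetical names used only for illustration; they are not part of LangChain or the Pinecone client, and "filter" is just an example key.

```python
from typing import Any, Dict, Optional


def build_query_kwargs(
    defaults: Dict[str, Any],
    search_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge search_kwargs over the default query arguments.

    Hypothetical helper, not part of LangChain or the Pinecone client.
    """
    merged = dict(defaults)              # copy so the defaults are not mutated
    merged.update(search_kwargs or {})   # user-supplied keys override defaults
    return merged


def fake_query(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for index.query that simply echoes its keyword arguments."""
    return kwargs


defaults = {"top_k": 4, "include_metadata": True}
overrides = {"top_k": 10, "filter": {"source": "docs"}}

# Passing the same key twice fails with a TypeError:
#   fake_query(top_k=4, **overrides)
# Merging first avoids the collision.
result = fake_query(**build_query_kwargs(defaults, overrides))
```

With this approach the explicit arguments act as defaults and anything in search_kwargs wins, which may or may not be the precedence you want.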
You could also write a custom subclass, but I prefer modifying the class directly as a temporary solution.
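For anyone who prefers the custom class route, here is a rough sketch of the idea. As far as I can tell, a wrapper such as MultiQueryRetriever only forwards the query string to the inner retriever, so the extra arguments have to live on the retriever instance rather than be passed per call. Everything below (FakeIndex, BaseHybridRetriever, HybridRetrieverWithKwargs, multi_query) is a simplified stand-in written for this example, not the real LangChain or Pinecone classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


class FakeIndex:
    """Stand-in for a Pinecone index; records the kwargs of the last query."""

    def __init__(self) -> None:
        self.last_call: Dict[str, Any] = {}

    def query(self, **kwargs: Any) -> Dict[str, Any]:
        self.last_call = kwargs
        return {"matches": []}


@dataclass
class BaseHybridRetriever:
    """Simplified stand-in for the stock retriever (no search_kwargs)."""

    index: FakeIndex
    top_k: int = 4

    def query_args(self, query: str) -> Dict[str, Any]:
        # The real class would encode `query` into dense and sparse vectors here.
        return {"top_k": self.top_k, "include_metadata": True}

    def get_relevant_documents(self, query: str) -> List[Any]:
        return self.index.query(**self.query_args(query))["matches"]


@dataclass
class HybridRetrieverWithKwargs(BaseHybridRetriever):
    """Custom subclass that adds instance-level search_kwargs."""

    search_kwargs: Dict[str, Any] = field(default_factory=dict)

    def query_args(self, query: str) -> Dict[str, Any]:
        args = super().query_args(query)
        args.update(self.search_kwargs)  # instance settings override defaults
        return args


def multi_query(retriever: BaseHybridRetriever, queries: List[str]) -> List[Any]:
    """Stand-in for a wrapper that forwards only the query string."""
    docs: List[Any] = []
    for q in queries:
        docs.extend(retriever.get_relevant_documents(q))
    return docs


index = FakeIndex()
inner = HybridRetrieverWithKwargs(index=index, search_kwargs={"filter": {"lang": "en"}})
multi_query(inner, ["first phrasing", "second phrasing"])
```

Because search_kwargs is set when the retriever is constructed, the wrapper never needs to know about it; every query it issues picks up the same settings.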