Support for open-source LLMs
sak12009cb opened this issue
Team,
I would like to write embeddings generated through a non-OpenAI model (a Hugging Face embedding model) into Pinecone, and then query them with an open-source LLM (google/flan-t5). Is this natively supported by the existing LangChain Pinecone integration? If yes, can you please guide me with some examples?
Hello, I'd be happy to assist you. Let me break down the process into three simple steps:
- First, you'll need to create embeddings using an open-source model from Hugging Face. This can be done with the LangChain library's `langchain.embeddings.huggingface` module, which contains the `HuggingFaceEmbeddings` class. For a better understanding, refer to the second cell in this example: https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2-13b-retrievalqa.ipynb
- The second step involves upserting those embeddings into the Pinecone index. See cells 4 through 9 in the example above.
- Lastly, in the third step, you can utilize open-source LLMs through the LangChain library. Within it, you'll find the `langchain.llms` module, from which you can instantiate a `HuggingFacePipeline`. This allows you to use whichever open-source Hugging Face model you prefer. You can refer to the example above starting from the 10th cell: https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2-13b-retrievalqa.ipynb
Additionally, there are other notebooks available in the same directory that you may find helpful. Feel free to explore them as well! :)