Support for open-source LLMs
sak12009cb opened this issue
Team,
I would like to write embeddings generated through a non-OpenAI model (a Hugging Face embedding model) into Pinecone, and then query them with an open-source LLM (google/flan-t5). Is this natively supported by the existing LangChain Pinecone integration? If yes, can you please guide me with some examples?
Hello, I'd be happy to assist you. Let me break down the process into three simple steps:
- First, you'll need to create embeddings using an open-source model from Hugging Face. This can be done with the LangChain library's `langchain.embeddings.huggingface` module, which contains the `HuggingFaceEmbeddings` class. For a better understanding, refer to the second cell in this example: https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2-13b-retrievalqa.ipynb
- The second step involves upserting those embeddings into the Pinecone index. See cells 4 through 9 in the example above.
- Lastly, in the third step, you can utilize open-source LLMs through the LangChain library. Within it, you'll find the `langchain.llms` module, from which you can instantiate a `HuggingFacePipeline`. This allows you to use whichever open-source Hugging Face model you prefer. You can refer to the example above starting from the 10th cell: https://github.com/pinecone-io/examples/blob/master/learn/generation/llm-field-guide/llama-2-13b-retrievalqa.ipynb
Additionally, there are other notebooks available in the same directory that you may find helpful. Feel free to explore them as well! :)