In short, we provide the LLM with external data (for example, PDFs, URLs, etc.), which helps it generate more relevant answers grounded in both that source knowledge and the parametric knowledge gained from training on large amounts of data.
- Load data and split it into chunks
- Configure a suitable vector database, for example FAISS or Pinecone
- Configure the embedding model used to turn text into vectors of numbers the models can understand
- Create the index by passing the chunks through the embedding model and storing the resulting vectors in the vector database (see the indexing sketch after this list)
- Configure the LLM
- Get the model name from Hugging Face
- Load the tokenizer first
- Set the configuration used for that model
- Add quantization settings so the model loads with a smaller memory footprint
- Create a HuggingFace pipeline wrapper for the loaded model to integrate it easily with LangChain (see the model-loading sketch after this list)
- Configure the Retrieval Chain from LangChain (see the chain sketch after this list)
- Enjoy using it!
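
Below is a minimal sketch of the loading, splitting, embedding, and indexing steps. It assumes a local PDF named `example.pdf`, the `sentence-transformers/all-MiniLM-L6-v2` embedding model, and a classic LangChain install (newer releases move these imports into `langchain_community`); adjust the loader and model name to your own sources.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# 1. Load the external data (a PDF here; URL loaders work the same way).
documents = PyPDFLoader("example.pdf").load()  # "example.pdf" is a placeholder path

# 2. Split it into overlapping chunks that fit comfortably in the context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# 3. Configure the embeddings that map text to numeric vectors.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# 4. Create the index: embed the chunks and store them in the vector database.
vectorstore = FAISS.from_documents(chunks, embeddings)
vectorstore.save_local("faiss_index")  # optional: persist the index to disk
```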
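
The LLM setup could look like the sketch below. The model name is only a placeholder (any causal-LM checkpoint on Hugging Face works), and 4-bit quantization via `BitsAndBytesConfig` is one possible choice of quantization settings, not the only one.

```python
import torch
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)
from langchain.llms import HuggingFacePipeline

model_name = "meta-llama/Llama-2-7b-chat-hf"  # placeholder; use your chosen model

# 1. Load the tokenizer first.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# 2. Set the configuration used for that model.
config = AutoConfig.from_pretrained(model_name)

# 3. Add quantization settings so the model loads with a small memory footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    quantization_config=bnb_config,
    device_map="auto",
)

# 4. Wrap the model in a text-generation pipeline so LangChain can drive it.
text_gen = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
)
llm = HuggingFacePipeline(pipeline=text_gen)
```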
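
Finally, the retrieval chain can be wired up with LangChain's `RetrievalQA`, reusing the `vectorstore` and `llm` objects from the sketches above; the query string is just an illustration.

```python
from langchain.chains import RetrievalQA

# Turn the vector store into a retriever that returns the top-3 matching chunks.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",            # stuff the retrieved chunks into one prompt
    retriever=retriever,
    return_source_documents=True,  # also return the chunks used as evidence
)

result = qa_chain({"query": "What does the document say about chunking?"})
print(result["result"])
```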
