sleeplessinva / personal-assistant

Simple personal assistant that can use your local LLM



Personal Assistant: RetrievalQA

Description

Partially based on the privateGPT project. However, the main chain was written from scratch to speed up inference. Not sure why LangChain's implementation is so slow: around 25 seconds of prompt evaluation in the LlamaCpp model (vs. 2-4 seconds when the model is queried directly).

The main retrieval chain logic (see the sketch after the list):

  1. Given the user question, the retrieval agent:
    • gets the most relevant documents from the vector DB,
    • processes them, and
    • answers the question based on the context documents.
  2. Given the retriever's response, the context, and the question, the reviewer agent:
    • reviews the retriever's response with respect to the context and the user question,
    • improves it, and
    • returns it to the user.
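
A minimal sketch of this flow, assuming a LangChain-style vector store (with a similarity_search method) and a llama-cpp-python model as llm; the function names and prompts are illustrative, not the project's actual code:

def retrieval_step(question, vectordb, llm, k=4):
    # Fetch the most relevant documents from the vector DB.
    docs = vectordb.similarity_search(question, k=k)
    context = "\n\n".join(d.page_content for d in docs)
    # Answer the question based on the retrieved context documents.
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    answer = llm(prompt, max_tokens=256)["choices"][0]["text"]
    return answer, context

def review_step(question, context, draft, llm):
    # Review the draft answer against the context and the user question,
    # improve it, and return the result to the user.
    prompt = (
        f"Context:\n{context}\n\nQuestion: {question}\n"
        f"Draft answer: {draft}\n\n"
        "Check the draft against the context and improve it.\nImproved answer:"
    )
    return llm(prompt, max_tokens=256)["choices"][0]["text"]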

Implementation progress

  • Simple retrieval QA functionality
  • What is wrong with LangChain, why so slow?
  • Add extra tools
  • Obsidian integration?
  • Docker
  • ...

Installation

Download the desired model from https://huggingface.co/ to the models folder. For now, the model should be in the ggml format to be compatible with the LlamaCpp model interface.

Examples of tested models:

  • TheBloke/wizard-mega-13B-GGML (https://huggingface.co/TheBloke/wizard-mega-13B-GGML)
  • TheBloke/Manticore-13B-GGML (https://huggingface.co/TheBloke/Manticore-13B-GGML) - an update of WizardLM, making it more versatile and robust

You can use a quantized version with any number of bits, as long as it is in the ggml format and supported by llama-cpp.
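
For example, a quantized model can be loaded and queried directly with llama-cpp-python (the file name below is hypothetical; point it at whichever GGML file you downloaded):

from llama_cpp import Llama

# Hypothetical file name; use the GGML file you actually downloaded.
llm = Llama(model_path="models/wizard-mega-13B.ggmlv3.q4_0.bin", n_ctx=2048)
print(llm("Q: What is a vector database? A:", max_tokens=64)["choices"][0]["text"])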

Create a Python environment with Python 3.10+.

Using pyenv:

pyenv virtualenv 3.10.9 pa
pyenv local pa
pyenv shell pa 

Using conda:

conda create -n pa python=3.10
conda activate pa

Install all the requirements from the requirements.txt file:

pip install -r requirements.txt

Then install llama-cpp-python with cuBLAS:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
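
Once the cuBLAS build is in place, llama-cpp-python can offload part of the model to the GPU via its n_gpu_layers parameter; the layer count below is a placeholder to tune for your model and VRAM:

from llama_cpp import Llama

# Offload 40 transformer layers to the GPU; tune for your model and VRAM.
llm = Llama(model_path="models/wizard-mega-13B.ggmlv3.q4_0.bin", n_gpu_layers=40)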

Then you can install the pa package itself:

pip install .

Prepare the files

Create a files folder to put the source documents in.

Usage

Load the documents from the files folder into the local Chroma vector database:

python inject.py
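
A rough sketch of what an ingestion step like this typically does, assuming LangChain's loaders, embeddings, and Chroma wrapper (chunk sizes and directory names are illustrative, not necessarily the project's settings):

from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Load everything from files/, split into chunks, embed, and persist to disk.
docs = DirectoryLoader("files").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)
db = Chroma.from_documents(chunks, HuggingFaceEmbeddings(), persist_directory="db")
db.persist()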

Run the retrievalQA chain:

python retrievalQA.py
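
This runs the retrieve-and-review flow from the Description. A self-contained sketch of one direct round trip, bypassing LangChain's chain as the speed comparison above suggests (model path, persist directory, and the question are illustrative):

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from llama_cpp import Llama

# Reopen the persisted vector DB and load the local model.
db = Chroma(persist_directory="db", embedding_function=HuggingFaceEmbeddings())
llm = Llama(model_path="models/wizard-mega-13B.ggmlv3.q4_0.bin", n_ctx=2048)

question = "What do the source documents say about X?"
docs = db.similarity_search(question, k=4)
context = "\n\n".join(d.page_content for d in docs)
out = llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:", max_tokens=256)
print(out["choices"][0]["text"])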


License

Apache License 2.0
