viniciusarruda / llama-cpp-chat-completion-wrapper

Wrapper around llama-cpp-python for chat completion with LLaMA v2 models.

LLaMA v2 Chat Completion Wrapper

Streamlit chat example

Handles the chat completion message format for use with llama-cpp-python. The code is essentially the same as Meta's original reference code.

NOTE: The output is still not identical to that of the Meta code; more about that here. Update: I added an option to use the original Meta tokenizer encoder in order to get the correct result. See the example.py file and the USE_META_TOKENIZER_ENCODER flag.
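For reference, Meta's published LLaMA v2 chat format wraps each user/assistant turn in [INST] ... [/INST] markers and folds the system prompt into the first user message. The sketch below illustrates that prompt construction as a plain string (the function name is illustrative and not this wrapper's actual API; as noted above, building the prompt as a string rather than encoding special tokens directly is exactly where results can diverge from Meta's tokenizer):

```python
# Illustrative sketch of Meta's LLaMA v2 chat prompt format.
# Not the wrapper's actual API; names here are hypothetical.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def format_llama2_prompt(messages: list) -> str:
    """Turn OpenAI-style messages into a single LLaMA v2 prompt string."""
    # A leading system message is folded into the first user message.
    if messages[0]["role"] == "system":
        messages = [{
            "role": messages[1]["role"],
            "content": B_SYS + messages[0]["content"] + E_SYS + messages[1]["content"],
        }] + messages[2:]
    # Completed user/assistant pairs become "<s>[INST] u [/INST] a </s>".
    parts = [
        f"<s>{B_INST} {u['content'].strip()} {E_INST} {a['content'].strip()} </s>"
        for u, a in zip(messages[::2], messages[1::2])
    ]
    # The final (unanswered) user message is left open for the model.
    parts.append(f"<s>{B_INST} {messages[-1]['content'].strip()} {E_INST}")
    return "".join(parts)
```

Note that here `<s>` and `</s>` appear as literal text; Meta's code emits them as special BOS/EOS token IDs, which is why the USE_META_TOKENIZER_ENCODER option exists.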

Installation

Developed using Python 3.10 on Windows.

pip install -r requirements.txt

Usage

Check example.py file.

Streamlit

First, install Streamlit:

pip install streamlit

Then run the file streamlit_app.py with:

streamlit run streamlit_app.py
