vwxyzjn / quickchat

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

quickchat

A simple cli to chat with HF models, using TGI, gradio, and slurm. The CLI automatically spins up a tgi instance via slurm and then opens a gradio interface to chat with the model.

Chat with models

Modify tgi_template.slurm to use your own slurm account. Then run the following command to chat with a model:

pip install -e . # or `poetry install`

# use the public tgi instance
python quickchat.py  --endpoint https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1

# spin up your own tgi instance via slurm
python quickchat.py --manage_tgi_instances --model mistralai/Mistral-7B-Instruct-v0.1 --revision main
283507962-9eb6aae7-005a-4d67-80aa-f93d5a0d3cdb.mov

Why

There are already pretty good solutions like FastChat. This repo is a simpler alternative with minimal code allowing researchers to customize and quickly spin up their own tgi instances and chat with them.

About

License:MIT License


Languages

Language:Python 82.7%Language:Shell 17.3%