sujitvasanth/streaming-LLM-chat

transformers based streaming chat for GPTQ models


Streaming-LLM-chat

(sample chat screenshot)

This application uses the Hugging Face transformers library to let you choose a local LLM and run streaming inference on the GPU.

It uses:

  • Python: 3.8.10
  • transformers library: 4.36.2
  • transformers_stream_generator library
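A setup sketch based on the dependency list above (these exact commands are an assumption, not taken from the repo; the `auto-gptq`/`optimum` line reflects the backend that transformers 4.36 generally needs for GPTQ checkpoints):

```shell
# Pin the versions listed in this README
pip install transformers==4.36.2 transformers_stream_generator

# GPTQ checkpoints additionally need a quantization backend
# (assumed here: the AutoGPTQ integration via optimum)
pip install auto-gptq optimum
```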

The models are assumed to be in the oobabooga text-generation-webui models folder.
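A minimal sketch of how the model path could be resolved, assuming a default text-generation-webui install location (`~/text-generation-webui` and the helper `model_path` are hypothetical names, not from this repo):

```python
import os

# Assumed location of the oobabooga text-generation-webui checkout;
# adjust to your own setup.
TEXTGEN_DIR = os.path.expanduser("~/text-generation-webui")

def model_path(name: str) -> str:
    """Build the path to a model folder inside the webui's models directory."""
    return os.path.join(TEXTGEN_DIR, "models", name)

print(model_path("TheBloke_openchat-3.5-0106-GPTQ"))
```

Loading would then be the usual transformers call, e.g. `AutoModelForCausalLM.from_pretrained(model_path("TheBloke_openchat-3.5-0106-GPTQ"), device_map="auto")`.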

The openchat model is available on Hugging Face:

  • https://huggingface.co/TheBloke/openchat-3.5-0106-GPTQ
  • https://huggingface.co/sujitvasanth/TheBloke-openchat-3.5-0106-GPTQ
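Streaming chat with transformers is typically wired as follows: generation runs in a background thread while the main thread consumes tokens as they arrive (with a real model this is `TextIteratorStreamer` passed to `model.generate`). The sketch below shows that producer/consumer pattern with a stand-in streamer so it runs without a GPU or model download; `FakeStreamer` and `fake_generate` are illustrative names, not this repo's code:

```python
from threading import Thread
from queue import Queue

class FakeStreamer:
    """Stand-in for transformers.TextIteratorStreamer (assumed API shape)."""
    def __init__(self):
        self.q = Queue()
    def put(self, text):
        self.q.put(text)        # called from the generation thread
    def end(self):
        self.q.put(None)        # sentinel: generation finished
    def __iter__(self):
        while (tok := self.q.get()) is not None:
            yield tok

def fake_generate(streamer, reply):
    # With a real model this would be model.generate(**inputs, streamer=streamer)
    for token in reply.split():
        streamer.put(token + " ")
    streamer.end()

streamer = FakeStreamer()
Thread(target=fake_generate, args=(streamer, "Hello from the streaming demo")).start()

chunks = []
for chunk in streamer:          # tokens arrive incrementally, as in a live chat
    chunks.append(chunk)
print("".join(chunks))
```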


