PotatoSpudowski / fastLLaMa

fastLLaMa: An experimental high-performance framework for running decoder-only LLMs with 4-bit quantization in Python, using a C/C++ backend.

https://potatospudowski.github.io/fastLLaMa/

Implement the WebSocket Server

PotatoSpudowski opened this issue a year ago

Bahushruth commented a year ago

Implement the WebSocket Server