eltociear / serge

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

Serge - LLaMa made easy 🦙

A chat interface based on llama.cpp for running Alpaca models. Entirely self-hosted, no API keys needed. Fits in 4 GB of RAM and runs on the CPU.

  • SvelteKit frontend
  • MongoDB for storing chat history & parameters
  • FastAPI + beanie for the API, wrapping calls to llama.cpp

Getting started

Setting up Serge is very easy. TLDR for running it with Alpaca 7B:

git clone https://github.com/nsarrazin/serge.git && cd serge

cp .env.sample .env

docker compose up -d
docker compose exec api python3 /usr/src/app/utils/download.py tokenizer 7B

(You can pass several sizes, e.g. 7B 13B 30B, as arguments to download multiple models.)

Then just go to http://localhost:8008/ and you're good to go!
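The containers may need a little time before the web interface answers. As a rough sketch (the function name wait_for_serge and its default retry count and delay are made up for illustration, and it assumes curl is installed on the host), you can poll the URL above until it responds:

```shell
# Hypothetical helper, not part of Serge: poll a URL until it answers.
# Arguments: URL (default http://localhost:8008/), number of tries, delay in seconds.
wait_for_serge() {
  url="${1:-http://localhost:8008/}"
  tries="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failures, -sS: quiet but still show errors
    if curl -fsS -o /dev/null "$url"; then
      echo "Serge is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

Calling wait_for_serge with no arguments right after docker compose up -d returns 0 once the page loads, or 1 if it never does.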

Models

Currently only the 7B, 13B and 30B Alpaca models are supported. The download script described above fetches them inside the container.
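As a small sketch of how the supported sizes map onto the download command from Getting started, the helper below checks its arguments against 7B, 13B and 30B and prints the resulting command instead of running it. The function name serge_download_cmd is hypothetical. The tokenizer argument is kept as in the command above.

```shell
# Hypothetical helper, not part of Serge: build the download command for
# one or more supported model sizes and print it (it does not run docker).
serge_download_cmd() {
  if [ "$#" -eq 0 ]; then
    echo "usage: serge_download_cmd SIZE [SIZE...]" >&2
    return 1
  fi
  for size in "$@"; do
    case "$size" in
      7B|13B|30B) ;;  # the sizes listed as supported
      *) echo "unsupported model size: $size" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "docker compose exec api python3 /usr/src/app/utils/download.py tokenizer $*"
}
```

For example, serge_download_cmd 7B 13B prints the command for both models, which you can then paste into your terminal or pass to sh -c.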

If you have existing weights from another project, you can add them to the serge_weights volume using docker cp.
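A minimal sketch of the docker cp step follows. It only prints the command so you can inspect it first. The function name serge_cp_cmd is hypothetical, and the default destination /usr/src/app/weights is an assumption about where the serge_weights volume is mounted in the api container; check docker-compose.yml for the actual mount point before copying.

```shell
# Hypothetical helper, not part of Serge: print a docker cp command that
# copies a local weights file into a container.
# ASSUMPTION: the serge_weights volume is mounted at /usr/src/app/weights;
# pass a third argument to override it.
serge_cp_cmd() {
  src="$1"
  container="$2"
  dest_dir="${3:-/usr/src/app/weights}"
  if [ -z "$src" ] || [ -z "$container" ]; then
    echo "usage: serge_cp_cmd WEIGHTS_FILE CONTAINER [DEST_DIR]" >&2
    return 1
  fi
  printf 'docker cp %s %s:%s/\n' "$src" "$container" "$dest_dir"
}

# Example (the file name is a placeholder; docker compose ps -q api prints
# the ID of the running api container):
#   serge_cp_cmd ./my-weights.bin "$(docker compose ps -q api)"
```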

Support

Feel free to join the Discord if you need help with the setup: https://discord.gg/62Hc6FEYQH

What's next

  • Front-end to interface with the API
  • Pass model parameters when creating a chat
  • User profiles & authentication
  • Different prompt options
  • LangChain integration with a custom LLM
  • Support for other llama models, quantization, etc.

And a lot more!

About

License: MIT License


Languages

Python 48.4%, Svelte 37.8%, TypeScript 8.7%, JavaScript 3.4%, HTML 1.5%, CSS 0.2%