remixer-dec / botality-ii

telegram bot for self-hosted local inference of stable diffusion, text-to-speech and large language models, such as llama3

Home Page: https://github.com/remixer-dec/botality-ii/wiki/Overview

Botality II

This project is a modular Telegram bot built on aiogram, designed for local ML inference with remote service support. Currently integrated with:

Accelerated LLM inference support: llama.cpp, mlc-llm and llama-mps
Remote LLM inference support: oobabooga/text-generation-webui, LostRuins/koboldcpp and llama.cpp server
Compatibility table is available here

Evolved from its predecessor, Botality I.
Shipped with an easy-to-use webui, where you can run commands and talk with the bot directly.

Documentation

You can find it here (coming soon)

Changelog

Some versions have breaking changes; see the Changelog file for more information.

Features

[Bot]

  • User-based queues and delayed task processing
  • Multiple modes to filter access scopes (WL/BL/Both/Admin-only)
  • Support of accelerated inference on M1 Macs
  • Memory manager, keeps track of models loaded at the same time and loads/unloads them on demand.
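
The access modes listed above can be pictured with a small sketch. This is not the bot's actual code: the function name, the mode strings and the rule that "both" means "whitelisted and not blacklisted" are assumptions made for illustration.

```python
# Hypothetical access check; mode names and the "both" semantics are assumptions.
def is_allowed(user_id, mode, whitelist=(), blacklist=(), admins=()):
    """Return True if user_id may use the bot under the given access mode."""
    if user_id in admins:
        return True  # assumption: admins pass every mode
    if mode == "admin-only":
        return False
    if mode == "whitelist":
        return user_id in whitelist
    if mode == "blacklist":
        return user_id not in blacklist
    if mode == "both":
        # assumption: must be whitelisted and must not be blacklisted
        return user_id in whitelist and user_id not in blacklist
    raise ValueError(f"unknown access mode: {mode}")


print(is_allowed(42, "whitelist", whitelist={42}))
print(is_allowed(7, "admin-only", admins={1}))
```

The real option names and their exact behaviour are described in the webui configuration tips.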

[LLM]

  • Supports dialog mode casually playing a role described in a character file, keeping chat history with all users in group chats or with each user separately
  • Character files can be easily localized to any language for non-English models
  • Assistant mode via /ask command or with direct replies (configurable)
  • Single-reply short-term memory for assistant feedback
  • Supports visual question answering, when multimodal-adapter is available
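
The two ways of keeping dialog history (per chat, shared by all users, or per user) correspond to the user and chat values of llm_history_grouping described in the LLM Setup section. A minimal sketch of the idea, assuming an in-memory store; the names history_key, remember and the store layout are hypothetical and not taken from the bot's code:

```python
from collections import defaultdict

# Hypothetical in-memory history store: key -> list of messages.
history = defaultdict(list)


def history_key(grouping, chat_id, user_id):
    """Pick the history bucket for a message.

    "user" keeps a separate history for each user,
    "chat" shares one history between all users of a chat.
    """
    if grouping == "user":
        return ("user", user_id)
    if grouping == "chat":
        return ("chat", chat_id)
    raise ValueError(f"unknown grouping: {grouping}")


def remember(grouping, chat_id, user_id, text):
    history[history_key(grouping, chat_id, user_id)].append(text)


# Two users in the same group chat share one history in "chat" mode.
remember("chat", chat_id=100, user_id=1, text="hello")
remember("chat", chat_id=100, user_id=2, text="hi there")
print(len(history[("chat", 100)]))
```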

[SD]

  • CLI-like way to pass stable diffusion parameters
  • pre-defined prompt wrappers
  • lora integration with easy syntax: lora_name100 => <lora:lora_name:1.0> and custom lora activators
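
The lora shorthand can be read as "name followed by a weight in percent". A minimal sketch of such a conversion, assuming the trailing digits are divided by 100 to get the weight; the function name lora_tag is hypothetical and the bot's own parser may differ:

```python
import re


def lora_tag(token):
    """Convert shorthand like 'lora_name100' into '<lora:lora_name:1.0>'."""
    match = re.fullmatch(r"(.+?)(\d+)", token)
    if match is None:
        raise ValueError(f"not a lora shorthand: {token}")
    name, digits = match.groups()
    weight = int(digits) / 100  # assumption: digits are a percentage
    return f"<lora:{name}:{weight}>"


print(lora_tag("lora_name100"))
```

Note that in this sketch all trailing digits are treated as the weight, so a lora whose name itself ends in a digit would be ambiguous.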

[TTS]

  • can be run remotely, or on the same machine
  • tts output is sent as voice messages
  • can be used on voice messages (speech and acapella songs) to dub them with a different voice

[STT]

  • can be activated as a speech recognition tool via /stt command replying to voice messages
  • if stt_autoreply_mode parameter is not none, it recognizes voice messages and replies to them with LLM and TTS modules

[TTA]

  • can be used with /sfx and /music commands after adding tta to active_modules

Setup:

  • copy .env.example file and rename the copy to .env, do NOT add the .env file to your commits!
  • set up your telegram bot token and other configuration options in .env file
  • install requirements: pip install -r requirements.txt
  • install optional requirements as needed: pip install -r requirements-tts.txt if you want to use tts and tts_server, pip install -r requirements-llm.txt if you want to use llm (you will probably also need a recent version of pytorch), pip install -r requirements-stt.txt for speech-to-text, and pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft for text-to-audio
  • you can continue configuration in the webui, it has helpful tips about each configuration option
  • for stable diffusion module, make sure that you have webui installed and it is running with --api flag
  • for text-to-speech module download VITS models, put their names in tts_voices configuration option and path to their directory in tts_path
  • for llm module, see the LLM Setup section below
  • if you want to use webui + api, run it with python dashboard.py, otherwise run the bot with python bot.py

Python 3.10+ is recommended due to aiogram compatibility. If you are experiencing problems with whisper or logging, please update numpy.

Supported language models (tested):

Python/Pytorch backend

C++ / TVM backend

Remote api backend

LLM Setup

  • Make sure that you have enough RAM / vRAM to run models.

  • Download the weights (and the code if needed) for any large language model

  • in .env file, make sure that "llm" is in active_modules, then set:
    llm_paths - the path(s) of the model(s) that you downloaded
    llm_backend - select from pytorch, llama.cpp, mlc_pb, remote_ob, remote_lcpp
    llm_python_model_type - if you set pytorch in the previous option, set the model type that you want to use; it can be gpt2, gptj, llama_orig, llama_hf or auto_hf
    llm_character - a character of your choice from the characters directory, for example characters.gptj_6B_default. Character files also contain prompt templates and model configuration options that are optimal for a specific model. Feel free to change the character files, edit their personality and use them with other models.
    llm_assistant_chronicler - an input/output formatter/parser for the assistant task, can be instruct or raw; do not change it if you do not use mlc_pb
    llm_history_grouping - user to store history with each user separately, or chat to store group chat history with all users in that chat
    llm_assistant_use_in_chat_mode - True/False. When False, use the /ask command to ask the model questions without any input history; when True, all messages are treated as questions.

  • For llama.cpp: make sure that you have a C++ compiler, set all the flags necessary to enable GPU support, install it with pip install llama-cpp-python, then download the model weights and set their path in llm_paths.

  • For mlc-llm, follow the installation instructions from the docs, then clone mlc-chatbot, and put 3 paths in llm_paths. Use with llm_assistant_use_in_chat_mode=True and with raw chronicler.

  • For oobabooga webui and kobold.cpp, instead of specifying llm_paths, set llm_host, set llm_active_model_type to remote_ob and set the llm_character to one that has the same prompt format / preset as your model. Run the server with --api flag.

  • For llama.cpp c-server, start the ./server, set its URL in llm_host and set llm_active_model_type to remote_lcpp. For multimodality, please refer to this thread.
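
The option values listed in this section can be checked before starting the bot. The following is a minimal sketch, not part of the project: check_llm_config and the plain dict it takes are hypothetical, and only the option names and allowed values quoted above are used.

```python
# Allowed values as listed in the LLM Setup section.
ALLOWED = {
    "llm_backend": {"pytorch", "llama.cpp", "mlc_pb", "remote_ob", "remote_lcpp"},
    "llm_python_model_type": {"gpt2", "gptj", "llama_orig", "llama_hf", "auto_hf"},
    "llm_assistant_chronicler": {"instruct", "raw"},
    "llm_history_grouping": {"user", "chat"},
}


def check_llm_config(config):
    """Return a list of human-readable problems found in an already parsed config dict."""
    problems = []
    if "llm" not in config.get("active_modules", []):
        problems.append('"llm" is missing from active_modules')
    for key, allowed in ALLOWED.items():
        value = config.get(key)
        # Unset options are skipped; only explicitly set values are checked.
        if value is not None and value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    return problems


example = {
    "active_modules": ["llm"],
    "llm_backend": "llama.cpp",
    "llm_assistant_chronicler": "instruct",
    "llm_history_grouping": "chat",
}
print(check_llm_config(example))
```

How the .env file is actually parsed is left out here; the sketch assumes the values are already available as Python objects.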

Bot commands

Send a message to your bot with the command /tti -h for more info on how to use stable diffusion in the bot, and /tts -h for the tts module. For tts, the bot uses the voice names from the configuration file as commands. Try the /llm command for llm module details. LLM defaults to chat mode for models that support it; the assistant can be called with the /ask command.

License: the code of this project is currently distributed under CC BY-NC-SA 4.0 license, third party libraries might have different licenses.
