🐝 Fork of llm-swarm πŸ¦‹

Manage scalable open LLM inference endpoints in Runai and Slurm clusters

Features

  • 😎 This fork adds support for managing inference endpoints on RunAI clusters as well. 😎
  • Generate synthetic datasets for pretraining or fine-tuning using either local LLMs or Inference Endpoints on the Hugging Face Hub.
  • Integrations with huggingface/text-generation-inference and vLLM to generate text at scale.

What's different here?

  • Support for the RunAI scheduler has been added.

  • The code is scheduler-agnostic; new schedulers can be added by implementing the interface in BaseScheduler.py (see the sketch after this list).

  • Templates have been cleaned up, and an example for running on RunAI is included.

  • __init__.py has been refactored for readability.

  • Helper functions are collected in utils.py.

  • Type hints are used throughout to catch type errors in functions early.
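
For illustration, a new scheduler might look like the skeleton below. This is only a sketch: the actual interface is defined in BaseScheduler.py, and the method names and signatures shown here are hypothetical stand-ins, not this fork's real API.

from abc import ABC, abstractmethod

# Illustrative stand-in for the interface in BaseScheduler.py; the real
# method names and signatures in this repo may differ.
class BaseScheduler(ABC):
    @abstractmethod
    def submit(self, template_path: str) -> str:
        """Submit a rendered job template and return a job ID."""

    @abstractmethod
    def is_running(self, job_id: str) -> bool:
        """Report whether the submitted job is up and serving."""

    @abstractmethod
    def cancel(self, job_id: str) -> None:
        """Tear the job down."""

class MyClusterScheduler(BaseScheduler):
    def submit(self, template_path: str) -> str:
        # e.g. shell out to your cluster's CLI with the rendered template
        raise NotImplementedError

    def is_running(self, job_id: str) -> bool:
        raise NotImplementedError

    def cancel(self, job_id: str) -> None:
        raise NotImplementedError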

Install and prepare

pip install -e .
# or pip install llm_swarm
mkdir -p .cache/
# the docker image cache location above can be customized; if you change it, update the paths in `templates/tgi_h100.template.slurm` and `templates/vllm_h100.template.slurm` accordingly
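
Once installed, a session looks roughly like the following. This is a minimal sketch based on the upstream llm-swarm API; configuration fields may differ slightly in this fork.

import asyncio
from huggingface_hub import AsyncInferenceClient
from llm_swarm import LLMSwarm, LLMSwarmConfig

with LLMSwarm(
    LLMSwarmConfig(
        instances=2,  # number of inference endpoints to spin up
        inference_engine="tgi",  # or "vllm"
        slurm_template_path="templates/tgi_h100.template.slurm",
        load_balancer_template_path="templates/nginx.template.conf",
    )
) as llm_swarm:
    # llm_swarm.endpoint is the load-balanced URL fronting all instances;
    # the context manager tears the endpoints down on exit
    client = AsyncInferenceClient(model=llm_swarm.endpoint)
    print(asyncio.run(client.text_generation("What is 2 + 2?")))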

For everything else, see the official README.md of the upstream repository.

About

License: MIT

