🐝 Fork of llm-swarm πŸ¦‹

Manage scalable open LLM inference endpoints in Runai and Slurm clusters

Features

  • 😎 This fork adds support for managing inference endpoints on RunAI clusters as well. 😎
  • Generate synthetic datasets for pretraining or fine-tuning using either local LLMs or Inference Endpoints on the Hugging Face Hub.
  • Integrations with huggingface/text-generation-inference and vLLM to generate text at scale.

What's different here?

  • Support for the RunAI scheduler has been added.

  • The code is scheduler-agnostic; new schedulers can be added by implementing the interface in BaseScheduler.py (see the sketch after this list).

  • Templates have been cleaned up, and an example for running on RunAI is included.

  • __init__.py has been refactored for readability.

  • Helper functions are collected in utils.py.

  • Type hints are used throughout to catch type errors in functions early.
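
For illustration, a new scheduler might look like the skeleton below. This is only a sketch: the actual interface is defined in BaseScheduler.py, and the method names and signatures shown here are hypothetical stand-ins, not this fork's real API.

from abc import ABC, abstractmethod

# Illustrative stand-in for the interface in BaseScheduler.py; the real
# method names and signatures in this repo may differ.
class BaseScheduler(ABC):
    @abstractmethod
    def submit(self, template_path: str) -> str:
        """Submit a rendered job template and return a job ID."""

    @abstractmethod
    def is_running(self, job_id: str) -> bool:
        """Report whether the submitted job is up and serving."""

    @abstractmethod
    def cancel(self, job_id: str) -> None:
        """Tear the job down."""

class MyClusterScheduler(BaseScheduler):
    def submit(self, template_path: str) -> str:
        # e.g. shell out to your cluster's CLI with the rendered template
        raise NotImplementedError

    def is_running(self, job_id: str) -> bool:
        raise NotImplementedError

    def cancel(self, job_id: str) -> None:
        raise NotImplementedError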

Install and prepare

pip install -e .
# or pip install llm_swarm
mkdir -p .cache/
# the docker image cache location above can be customized; if you change it, update the paths in `templates/tgi_h100.template.slurm` and `templates/vllm_h100.template.slurm` accordingly
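
Once installed, a session looks roughly like the following. This is a minimal sketch based on the upstream llm-swarm API; configuration fields may differ slightly in this fork.

import asyncio
from huggingface_hub import AsyncInferenceClient
from llm_swarm import LLMSwarm, LLMSwarmConfig

with LLMSwarm(
    LLMSwarmConfig(
        instances=2,  # number of inference endpoints to spin up
        inference_engine="tgi",  # or "vllm"
        slurm_template_path="templates/tgi_h100.template.slurm",
        load_balancer_template_path="templates/nginx.template.conf",
    )
) as llm_swarm:
    # llm_swarm.endpoint is the load-balanced URL fronting all instances;
    # the context manager tears the endpoints down on exit
    client = AsyncInferenceClient(model=llm_swarm.endpoint)
    print(asyncio.run(client.text_generation("What is 2 + 2?")))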

For everything else, see the official README.md of the upstream repository.

About

License: MIT

