LLM Utility for the IDA Cluster

Install

pip install --upgrade git+https://github.com/Parry-Parry/idaLLM.git

LLM Serving: Through vLLM and FastAPI you can run an efficient endpoint for LLM inference
Local Inference: Through vLLM you can execute prompts (coupled with LightChain)

To run the API in its simplest form

python -m idallm.api.serve --model <MODEL_ID> --host 0.0.0.0 --port 8080

To run local inference in its simplest form (over a text file )

python -m idallm.local.serve --model <MODEL_ID> --input_file my_prompts.txt --output_file my_outputs.txt

LightChain: Makes life a lot easier in terms of prompt formatting, chain-prompting and chat functionality