Calculate tokens/s & GPU memory requirements for any LLM. Supports llama.cpp/GGML/bnb/QLoRA quantization.
Home Page: https://rahulschand.github.io/gpu_poor/
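As a rough back-of-the-envelope sketch of why quantization matters for GPU memory (this is an illustrative assumption, not the tool's actual formula), weight memory scales with parameter count times bits per weight, plus some fixed runtime overhead:

```python
def estimate_inference_memory_gb(params_billions: float,
                                 bits_per_weight: int,
                                 overhead_gb: float = 1.0) -> float:
    """Rough GPU memory estimate for loading an LLM's weights.

    Assumptions (illustrative only): 1B params at 8 bits ~= 1 GB,
    and a flat overhead term for activations/KV cache/runtime buffers.
    """
    weight_gb = params_billions * bits_per_weight / 8
    return weight_gb + overhead_gb

# A 7B model: 16-bit vs 4-bit quantized
print(estimate_inference_memory_gb(7, 16))  # -> 15.0
print(estimate_inference_memory_gb(7, 4))   # -> 4.5
```

Under this sketch, 4-bit quantization cuts weight memory to a quarter of the 16-bit footprint, which is the core saving that formats like GGML and bitsandbytes exploit.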