What is the minimum graphics memory required when using a single GPU?
Dgreen2017 opened this issue · comments
When running inference on a 4090 graphics card with 24 GB of graphics memory, the process is killed. With a single GPU, what is the minimum amount of graphics memory required?
On a 4090 graphics card with 16 GB of graphics memory, it shows this error message:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 192.00 MiB (GPU 0; 15.99 GiB total capacity; 15.08 GiB already allocated; 0 bytes free; 15.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Hello, how much graphics memory is required at minimum?
Hi @Dgreen2017, @ihongxx
For inference, a model with 7B parameters typically requires approximately 28 GB of graphics memory in full precision (FP32, 4 bytes per parameter). With half precision (FP16, 2 bytes per parameter), the requirement drops to around 14 GB. Note that these figures cover the model weights only: in practice, additional memory is needed to store the model's intermediate states, so actual usage is higher. I hope this helps.
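The figures above follow from multiplying the parameter count by the number of bytes per parameter. Here is a minimal sketch of that arithmetic in plain Python. The function names are hypothetical, the estimate covers weights only, and it assumes 1 GB = 10^9 bytes, whereas PyTorch's error message reports GiB (2^30 bytes).

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Rough memory needed for the model weights alone, in GB (10**9 bytes).

    Hypothetical helper for illustration; it ignores activations, caches,
    and any framework overhead.
    """
    return num_params * bytes_per_param / 1e9


def gb_to_gib(gigabytes: float) -> float:
    """Convert GB (10**9 bytes) to GiB (2**30 bytes), the unit PyTorch reports."""
    return gigabytes * 1e9 / 2**30


PARAMS_7B = 7e9  # 7 billion parameters

fp32_gb = estimate_weight_memory_gb(PARAMS_7B, 4)  # FP32: 4 bytes per parameter
fp16_gb = estimate_weight_memory_gb(PARAMS_7B, 2)  # FP16: 2 bytes per parameter
fp16_gib = gb_to_gib(fp16_gb)

print(f"FP32 weights: {fp32_gb:.1f} GB")
print(f"FP16 weights: {fp16_gb:.1f} GB ({fp16_gib:.2f} GiB)")
```

By this estimate, the half-precision weights alone take about 13 GiB, which leaves under 3 GiB of headroom on a card reporting 15.99 GiB total capacity. That is consistent with the out-of-memory error above once intermediate states are added, though the exact overhead depends on the input length and the framework.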
Best,
Hongbo