With a single GPU, what is the minimum graphics memory required?

Dgreen2017 opened this issue a year ago

Dgreen2017 commented a year ago

When I run inference on a 4090 graphics card with 24 GB of graphics memory, the process is killed. With a single GPU, what is the minimum graphics memory required?

fung077 · Answer 1 · Thu Jun 01 2023 11:09:07 GMT+0800 (China Standard Time)

On a 4090 graphics card with 16 GB of graphics memory, it shows this error message:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 192.00 MiB (GPU 0; 15.99 GiB total capacity; 15.08 GiB already allocated; 0 bytes free; 15.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
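The hint at the end of that message about max_split_size_mb only applies when reserved memory is much larger than allocated memory. A rough way to check is to pull the figures out of the message and compare them. This is only a sketch: the helper name parse_oom_gib is made up for illustration, and the regular expressions assume the message wording quoted above, which may differ between PyTorch versions.

```python
import re

# The error text quoted above (traceback prefix omitted).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 192.00 MiB (GPU 0; 15.99 GiB total capacity; "
    "15.08 GiB already allocated; 0 bytes free; 15.08 GiB reserved in total by PyTorch)"
)


def parse_oom_gib(message: str) -> dict:
    """Extract the GiB figures from a message shaped like the one above.

    Hypothetical helper: it assumes this exact wording and raises
    AttributeError if one of the fields is missing.
    """
    patterns = {
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        name: float(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


stats = parse_oom_gib(OOM_MESSAGE)

# Reserved minus allocated approximates memory held by the caching
# allocator but not used by tensors, which is where fragmentation shows up.
slack_gib = stats["reserved"] - stats["allocated"]
print(f"total={stats['total']} GiB, slack={slack_gib:.2f} GiB")
```

Here reserved and allocated are both 15.08 GiB, so there is no slack to recover and tuning PYTORCH_CUDA_ALLOC_CONF is unlikely to help. The card is simply full.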

Dgreen2017 · Answer 2 · Thu Jun 01 2023 11:09:25 GMT+0800 (China Standard Time)

Your message has been received, thank you!

Kenny · Answer 3 · Thu Jun 01 2023 15:26:25 GMT+0800 (China Standard Time)

Hello, how much graphics memory is needed at minimum?

Hongbo Zhang · Answer 4 · Sun Jun 18 2023 12:29:40 GMT+0800 (China Standard Time)

Hi @Dgreen2017, @ihongxx
For inference, a model with 7B parameters needs approximately 28 GB of graphics memory at full precision (FP32, 4 bytes per parameter). With half precision (FP16, 2 bytes per parameter), the requirement drops to around 14 GB. Note that these figures cover the weights only. In practice, additional memory is needed to store the model's intermediate states, so actual usage is higher. I hope this information is helpful.
Best,
Hongbo
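The arithmetic behind those figures can be sketched as follows. This is a back-of-the-envelope estimate, not a measurement: the function names are made up for illustration, the parameter count is taken as exactly 7e9, and only the weights are counted (no intermediate states, no framework overhead).

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal GB to GiB, the unit PyTorch uses in its error messages."""
    return gigabytes * 1e9 / 2**30


PARAMS_7B = 7e9  # assumption: "7B" taken as exactly 7 billion parameters

fp32_gb = weight_memory_gb(PARAMS_7B, 4)  # full precision, 4 bytes per parameter
fp16_gb = weight_memory_gb(PARAMS_7B, 2)  # half precision, 2 bytes per parameter
fp16_gib = gb_to_gib(fp16_gb)

print(f"FP32 weights: {fp32_gb:.0f} GB")
print(f"FP16 weights: {fp16_gb:.0f} GB ({fp16_gib:.2f} GiB)")
```

Under these assumptions the FP16 weights alone come to roughly 13 GiB, which leaves under 3 GiB for everything else on a card reporting 15.99 GiB of total capacity. That is consistent with the out-of-memory error reported earlier in this thread.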

Dgreen2017 · Answer 5 · Sun Jun 18 2023 12:29:58 GMT+0800 (China Standard Time)

Your message has been received, thank you!

Dgreen2017 · Answer 6 · Wed Dec 06 2023 14:03:06 GMT+0800 (China Standard Time)

Your message has been received, thank you!