FreedomIntelligence / HuatuoGPT

HuatuoGPT, Towards Taming Language Models To Be a Doctor. (An Open Medical GPT)


When using a single GPU, how much graphics memory is required at minimum?

Dgreen2017 opened this issue

When running inference on a 4090 graphics card with 24 GB of graphics memory, the process gets killed. When using a single GPU, how much graphics memory is required at minimum?

A 4090 graphics card with 16 GB of graphics memory shows this error message:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 192.00 MiB (GPU 0; 15.99 GiB total capacity; 15.08 GiB already allocated; 0 bytes free; 15.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
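
As the traceback itself suggests, allocator fragmentation can sometimes be worked around by capping the caching allocator's split size before PyTorch touches the GPU. Below is a minimal sketch; the 128 MiB value is an illustrative starting point, not a setting recommended by this repository.

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before the first CUDA allocation,
# so set it before importing torch. 128 MiB is an illustrative value;
# tune it for your workload.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # the allocator reads the variable on first CUDA use
```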

commented

Hello, may I ask what the minimum amount of graphics memory required is?

Hi @Dgreen2017, @ihongxx
For inference, a model with 7B parameters stored in full (32-bit) precision typically requires approximately 28 GB of graphics memory for the weights alone, since each parameter takes 4 bytes. In half precision, each parameter takes 2 bytes, so the requirement drops to around 14 GB. Note that in practical scenarios, additional memory is needed to store intermediate states of the model, resulting in higher memory usage. I hope this information proves helpful to you.
Best,
Hongbo
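
For reference, here is a minimal half-precision loading sketch using Hugging Face transformers. The model id and the generation parameters are assumptions for illustration, not taken from this thread; even in half precision, a 16 GB card may still run out of memory once activations are counted.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name for illustration; substitute the one you use.
MODEL_ID = "FreedomIntelligence/HuatuoGPT-7B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# torch_dtype=torch.float16 halves weight memory: 7e9 params * 2 bytes ≈ 14 GB.
# device_map="auto" (requires the accelerate package) places layers on the
# available GPU(s) and can offload the remainder to CPU on a smaller card.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer(
    "What are common symptoms of the flu?", return_tensors="pt"
).to(model.device)

# Inference only: no_grad avoids storing gradients, saving further memory.
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```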