OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Two A100 GPUs for fine-tuning, insufficient memory?

hekaijie123 opened this issue · comments

Hello Author, I am using two A100 GPUs (40 GB each) to run full-parameter fine-tuning. Regardless of whether I use the zero2 or zero3 configuration, it always reports that GPU memory is exceeded. However, according to the "Model Fine-tuning Memory Usage Statistics" table you provided, the 2-GPU setting uses 16 GB of memory. How can this be resolved?

Have you set the model's parameters and optimizer to offload to the CPU?

You need to offload the model parameters and optimizer state to the CPU, which further reduces GPU memory usage:

"zero_optimization": {
"stage": 3,
"offload_optimizer": {
"device": "cpu",
"pin_memory": true
},
"offload_param": {
"device": "cpu",
"pin_memory": true
}
}
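For reference, here is one way to wire this in, as a minimal sketch: put the block above inside your DeepSpeed JSON config and point the training script at it via the `--deepspeed` flag (a standard DeepSpeed/Hugging Face Trainer pattern). The file name `ds_config_zero3_offload.json` and the script name `finetune.py` below are illustrative placeholders, not the repo's exact paths; substitute the actual fine-tuning script and config used in your setup.

# Hypothetical launch command; adjust the script name and arguments
# to match the repo's fine-tuning entry point.
deepspeed --num_gpus 2 finetune.py \
    --deepspeed ds_config_zero3_offload.json

Note that CPU offloading trades GPU memory for host RAM and PCIe transfer time, so expect slower steps; `"pin_memory": true` uses pinned host memory to speed up those CPU-GPU transfers.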