kuleshov-group / llmtools

Finetuning Large Language Models on One Consumer GPU in Under 4 Bits

Fine-tuning the 65B model with an A100 goes OOM

ChaoGaoUCR opened this issue · comments

Dear author, thanks for providing such an incredible project.
I am trying to fine-tune the 65B model on a 40GB A100, but it goes OOM. Is this because the batch size is too big, and is there any way I can resolve the issue?
I also have multiple GPUs available; is there a command I can use for parallel fine-tuning?
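(For context: a common way to avoid OOM without changing the effective batch size is gradient accumulation, i.e. running several smaller micro-batches and summing their gradients before one optimizer step. The sketch below is a pure-Python toy, not llmtools' actual API or flags; it just shows that accumulating micro-batch gradients gives the same update as one large batch.)

```python
def sgd_step(w, batches, lr=0.1):
    """One SGD update for the toy model y = w * x (squared-error loss),
    accumulating gradients over a list of micro-batches before stepping."""
    grad = 0.0
    n = sum(len(b) for b in batches)
    for batch in batches:              # process micro-batches one at a time
        for x, y in batch:             # d/dw (w*x - y)^2 = 2*(w*x - y)*x
            grad += 2 * (w * x - y) * x
    return w - lr * grad / n           # single update, averaged over all samples

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_big = sgd_step(0.0, [data])                  # one large batch
w_acc = sgd_step(0.0, [data[:2], data[2:]])    # two micro-batches, accumulated
# w_big == w_acc: accumulation trades peak memory for extra passes.
```

In a real finetuning setup the equivalent knobs are the per-device micro-batch size and the number of accumulation steps; shrinking the former while growing the latter keeps the effective batch size fixed while lowering peak activation memory.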

Thanks

Sorry, I got it solved.