salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Out of memory error. Hardware requirements

Kushalamummigatti opened this issue

I have tried to fine-tune Codegen 2B Mono on a 40GB GPU (single card) with the sequence length set to 256. It gave a CUDA out of memory error. What GPU memory is required to fine-tune the 2B and larger models?
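For reference, below is a minimal sketch of memory-saving settings that are commonly combined when fine-tuning a model of this size with the Hugging Face Trainer (gradient checkpointing, a micro-batch of 1 with gradient accumulation, mixed precision). The output path, hyperparameters, and dataset are placeholders and are not part of this issue.

```python
# Sketch only: common memory-saving settings for fine-tuning CodeGen-2B-mono
# with the Hugging Face Trainer. Paths and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

model_name = "Salesforce/codegen-2B-mono"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Recompute activations in the backward pass instead of storing them;
# trades extra compute for a lower peak memory footprint.
model.gradient_checkpointing_enable()

training_args = TrainingArguments(
    output_dir="./codegen-2b-finetuned",  # placeholder output path
    per_device_train_batch_size=1,        # smallest possible micro-batch
    gradient_accumulation_steps=8,        # keeps the effective batch size at 8
    fp16=True,                            # mixed-precision training
    num_train_epochs=1,
    logging_steps=10,
)
# A Trainer(model=model, args=training_args, train_dataset=...) call would
# follow here with the user's own fine-tuning dataset.
```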

Seems like an issue on your side. A 7-billion-parameter model fits on a 12GB GPU (the 7B 4-bit one on an 8GB GPU).

The 2B model should fit on a 6GB card just fine.
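For context on the memory figures above, here is a rough sketch of loading a CodeGen checkpoint in 4-bit with transformers and bitsandbytes. The quantization setup is an assumption for illustration and is not described in this thread.

```python
# Sketch only: loading a CodeGen checkpoint with 4-bit quantization,
# assuming transformers, accelerate, and bitsandbytes are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "Salesforce/codegen-2B-mono"  # same pattern applies to the 6B/16B checkpoints

# At roughly 0.5 bytes per parameter, 4-bit weights take a fraction of the
# memory required by full- or half-precision weights.
quant_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place the weights on available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```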