Xirider / finetune-gpt2xl

Guide: Finetune GPT2-XL (1.5 billion parameters) and GPT-NEO (2.7B) on a single GPU with Huggingface Transformers using DeepSpeed

Out of memory with RTX3090

PyxAI opened this issue

Hi,
I'm trying to train gpt2-xl but keep getting OOM, even when I set the batch size to 1, gradient_accumulation_steps to 8/16/512, contiguous_gradients to false, and allgather_bucket_size / reduce_bucket_size to 2e2.
I can see in nvidia-smi that I'm only reaching about half the memory capacity, around 12GB.
My system is, as stated, a 3090 with 24GB of memory,
80GB of RAM,
a 5600X CPU if that matters,
running WSL2 on Windows 10.
Thanks.
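For reference, the knobs mentioned above (gradient accumulation, contiguous_gradients, allgather_bucket_size, reduce_bucket_size) live in the DeepSpeed config JSON that the training script is launched with. Below is a minimal sketch of such a ZeRO stage 2 config with CPU offloading, using parameter names from the DeepSpeed documentation; the file name and values are illustrative, not the repo's exact shipped config, and the offload key depends on your DeepSpeed version.

```python
# Hypothetical helper: writes a minimal DeepSpeed ZeRO stage 2 config with CPU
# offloading, roughly the kind of file passed to the trainer via --deepspeed.
# Values are illustrative, not the exact config shipped with finetune-gpt2xl.
import json

ds_config = {
    "fp16": {"enabled": True},                 # half precision to cut weight/activation memory
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "zero_optimization": {
        "stage": 2,                            # partition optimizer states and gradients
        "offload_optimizer": {                 # newer DeepSpeed key; older releases use "cpu_offload": true
            "device": "cpu",
            "pin_memory": True,
        },
        "contiguous_gradients": True,
        "overlap_comm": True,
        "allgather_bucket_size": 2e8,          # smaller buckets lower peak GPU memory
        "reduce_bucket_size": 2e8,
    },
}

with open("ds_config_sketch.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```

Note that with optimizer offloading, host RAM and pinned memory do much of the work, which may be relevant under WSL2 (it caps how much of the host's RAM the Linux guest can use), even when nvidia-smi shows plenty of free VRAM.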

So working with WSL is just a no-go.
I installed dual-boot Ubuntu and now the problem has disappeared.

dual boot only, huh... that sucks. I was really hoping I could use this on win10 or wsl(2)

I was, however, able to run the model under WSL2 on Windows 11.
I didn't check training, but it's worth a shot @ReewassSquared
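If anyone wants to confirm that a WSL2 setup actually exposes the full card before kicking off a training run, a quick sanity check like the following may help (assumes PyTorch with CUDA installed inside WSL2):

```python
# Quick sanity check: does PyTorch inside WSL2 see the GPU and its full VRAM?
import torch

if not torch.cuda.is_available():
    raise SystemExit("CUDA not visible - check the WSL2 GPU driver / CUDA install")

props = torch.cuda.get_device_properties(0)
print(f"Device: {props.name}")
print(f"Total VRAM: {props.total_memory / 1024**3:.1f} GiB")   # should report ~24 GiB on an RTX 3090
print(f"Currently allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")
```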

Hi @PyxAI. Which Ubuntu version did you run this code on?