GPU memory limited

Question

liustu opened this issue 9 months ago · comments

Hi, is there another way to run "Step 3: Finetune Dreambooth model (minimal GPU memory requirement: 2x32G)" with only a single 3090 GPU?

Yangyi Huang · Answer 1 · Tue Oct 31 2023 20:35:55 GMT+0800 (China Standard Time)

We plan to test how TeCH works with more memory-efficient DreamBooth fine-tuning strategies, but you can try it out yourself first by following the examples in diffusers.

liustu · Answer 2 · Tue Oct 31 2023 20:39:05 GMT+0800 (China Standard Time)

OK, thank you.

Xzy765039540 · Answer 3 · Fri Nov 10 2023 11:19:32 GMT+0800 (China Standard Time)

I ran step 3 on two RTX 3090s by using half precision and the apex amp_backend.

Wenhao Shen · Answer 4 · Mon Nov 13 2023 16:58:50 GMT+0800 (China Standard Time)

I ran step 3 on two RTX 3090s by using half precision and the apex amp_backend.

Could you please show how to modify the code to enable half-precision training? I tried using apex amp on two GPUs with 24 GB each, but it still runs out of memory.

Xzy765039540 · Answer 5 · Mon Nov 13 2023 17:16:30 GMT+0800 (China Standard Time)

Add precision: 16 to v1-finetune_unfrozen.yaml, then run step 3. You may get an error like RuntimeError: expected scalar type Float but found Half.
Find where the error is raised and fix it by adding with torch.autocast("cuda"): at the top of that function (note: the error is always in forward or _forward).
I also changed the optimizer to an 8-bit one, following https://github.com/TimDettmers/bitsandbytes#requirements--installation
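
For readers who want to see the shape of these two changes, here is a minimal sketch. It is not the poster's actual patch. ToyBlock and make_optimizer are hypothetical names, the learning rate is a placeholder, and AdamW as the original optimizer is an assumption; in the real code the autocast context would go inside whichever forward or _forward raises the dtype error. The sketch falls back to CPU and a regular optimizer when CUDA or bitsandbytes is unavailable, so that it runs anywhere.

```python
import torch
import torch.nn as nn

try:
    # Optional dependency; the import can fail without a supported GPU setup.
    import bitsandbytes as bnb
except Exception:
    bnb = None


class ToyBlock(nn.Module):
    """Hypothetical stand-in for a module whose forward raised the dtype error."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.proj = nn.Linear(dim, dim)  # weights stay in float32

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        device_type = x.device.type
        # float16 on CUDA, as in the thread; bfloat16 so the sketch also runs on CPU.
        dtype = torch.float16 if device_type == "cuda" else torch.bfloat16
        # Autocast casts the inputs of eligible ops (here the linear layer) to a
        # common low-precision dtype, which avoids the Float/Half mismatch.
        with torch.autocast(device_type, dtype=dtype):
            return self.proj(x)


def make_optimizer(params, lr: float = 1e-6):
    """Return an 8-bit AdamW when available, else the regular AdamW.

    Assumption: the original training code uses AdamW; lr is a placeholder.
    """
    if bnb is not None and torch.cuda.is_available():
        return bnb.optim.AdamW8bit(params, lr=lr)
    return torch.optim.AdamW(params, lr=lr)


device = "cuda" if torch.cuda.is_available() else "cpu"
block = ToyBlock().to(device)
x = torch.randn(2, 8, device=device)
y = block(x)  # output comes back in the autocast dtype
opt = make_optimizer(block.parameters())
print(y.dtype, type(opt).__name__)
```

The 8-bit optimizer keeps its state tensors in 8 bits, which is where the memory saving comes from; the model weights themselves are unchanged.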