CUDA out of memory

Question

Liuhm0710 opened this issue 3 months ago

Traceback (most recent call last):
  File "/root/anaconda3/envs/DiffuMask/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/root/anaconda3/envs/DiffuMask/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self.args, **self.kwargs)
  File "/18054208921/diffumask/Stable_Diffusion/parallel_generate_VOC_Attention_AnyClass.py", line 738, in sub_processor
    image, x_t = run(prompts, controller, latent=None, generator=g_cpu,out_put = os.path.join(image_path,"image{}{}.jpg".format(args.classes,image_cnt)),ldm_stable=ldm_stable)
  File "/18054208921/diffumask/Stable_Diffusion/parallel_generate_VOC_Attention_AnyClass.py", line 438, in run
    images_here, x_t = ptp_utils.text2image_ldm_stable(ldm_stable, prompts, controller, latent=latent, num_inference_steps=NUM_DIFFUSION_STEPS, guidance_scale=7, generator=generator, low_resource=LOW_RESOURCE)
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/18054208921/diffumask/Stable_Diffusion/ptp_utils.py", line 175, in text2image_ldm_stable
    latents = diffusion_step(model, controller, latents, context, t, guidance_scale, low_resource)
  File "/18054208921/diffumask/Stable_Diffusion/ptp_utils.py", line 77, in diffusion_step
    noise_pred = model.unet(latents_input, t, encoder_hidden_states=context)["sample"]
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/diffusers/models/unet_2d_condition.py", line 773, in forward
    sample = upsample_block(
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/diffusers/models/unet_2d_blocks.py", line 1858, in forward
    hidden_states = attn(
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/diffusers/models/attention.py", line 313, in forward
    attn_output = self.attn1(
  File "/18054208921/diffumask/myenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/18054208921/diffumask/Stable_Diffusion/ptp_utils.py", line 221, in forward
    sim = torch.einsum("b i d, b j d -> b i j", q, k) * self.scale

RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 1; 23.50 GiB total capacity; 20.56 GiB already allocated; 627.25 MiB free; 21.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
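
The numbers in that message can be compared directly. The sketch below only parses the text of the error quoted above (it does not touch the GPU); the helper name `parse_oom` and the regular expressions are illustrative, not part of PyTorch.

```python
import re

# Error text copied from the report above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 1; 23.50 GiB total capacity; "
    "20.56 GiB already allocated; 627.25 MiB free; 21.34 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom(message):
    """Return the sizes quoted in the message, converted to MiB (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        value, unit = re.search(pattern, message).groups()
        sizes[name] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes = parse_oom(MESSAGE)
# Memory held by PyTorch's caching allocator but not backing any live tensor.
cached_but_unused = sizes["reserved"] - sizes["allocated"]
# How far the single request overshoots the memory reported as free.
shortfall = sizes["requested"] - sizes["free"]

print(f"reserved but unallocated: {cached_but_unused:.2f} MiB")
print(f"request exceeds free memory by: {shortfall:.2f} MiB")
```

Reserved memory is less than 1 GiB above allocated memory here, so most of the card appears to be held by live tensors rather than by the allocator's cache. That suggests the `max_split_size_mb` hint in the message may help only at the margin, and that lowering the peak usage itself is the more promising direction.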

I ran the data and mask generation step and hit this problem. It seems to happen when I load the model, which suddenly uses a large amount of memory. Could you give me some advice on how to solve it? Thanks a lot. @weijiawu
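
For what it is worth, the traceback ends in the self-attention `einsum` inside `diffusion_step`, not in model loading, so the spike may come from the attention score matrix. A rough size estimate, under assumptions not confirmed anywhere in this thread: 512x512 images, an 8x VAE downscale (64x64 latents), 8 attention heads, float32 tensors, and a batch of 2 from concatenating the unconditional and conditional inputs. The function name below is illustrative only.

```python
def attention_scores_mib(height, width, heads, batch,
                         bytes_per_element=4, vae_downscale=8):
    """Estimate the size in MiB of a (batch*heads, tokens, tokens) score tensor.

    All defaults are assumptions for illustration, not values read from the repo.
    """
    tokens = (height // vae_downscale) * (width // vae_downscale)
    elements = batch * heads * tokens * tokens
    return elements * bytes_per_element / 2**20


# Assumed setup: 512x512 image, 8 heads, batch of 2 (uncond + cond), float32.
full = attention_scores_mib(512, 512, heads=8, batch=2)
# Same setup with float16 tensors (2 bytes per element).
half_precision = attention_scores_mib(512, 512, heads=8, batch=2, bytes_per_element=2)
# Same setup with the two passes run one at a time (batch of 1).
single_pass = attention_scores_mib(512, 512, heads=8, batch=1)

print(full, half_precision, single_pass)
```

Under those assumptions the full-precision batched case comes out at 1024 MiB, the same size as the failed allocation, while half precision or a batch of 1 would halve it. The traceback shows a `low_resource=LOW_RESOURCE` argument and the batched `latents_input` call; if that flag makes the script run the unconditional and conditional passes separately, as it does in the upstream prompt-to-prompt code (an assumption about this repository), enabling it may be worth trying.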

Mello · Answer 1 · Sun Apr 07 2024 15:16:44 GMT+0800 (China Standard Time)

Same question. I only have four V100 16GB GPUs. Have you solved this problem?

Jumponthemoon · Answer 2 · Tue Apr 09 2024 00:05:48 GMT+0800 (China Standard Time)

Same problem here