NJU-PCALab / AddSR

Out of Memory Issue with Tile-Based Inference on Large Images

zxwxz opened this issue

First and foremost, I would like to express my gratitude for the excellent work you have done.

I am currently facing an issue with tile-based inference. While attempting to run inference on a large image, I encounter a CUDA Out of Memory (OOM) error. It appears that the vae.encoder is not properly utilizing the tiled settings.

Could you provide guidance on how to enable tile-based inference so as to avoid this issue? Any assistance or suggestions would be greatly appreciated.
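For reference, diffusers' AutoencoderKL ships with built-in tiling, so I was expecting something along the lines of the sketch below to keep the encode bounded (assuming the pipeline exposes a standard AutoencoderKL as pipeline.vae):

```python
# Minimal sketch, assuming pipeline.vae is a standard diffusers AutoencoderKL:
# enable_tiling() makes vae.encode()/vae.decode() process overlapping tiles
# instead of the full-resolution tensor in one pass.
pipeline.vae.enable_tiling()
```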

Thank you for your time and support.

Thank you for your attention! Could you please provide more detailed error information, as well as the amount of CUDA memory available on your computer?

Due to privacy concerns, I have redacted personal information with "xxxxxx" in my error log.
Additionally, I noticed that the following line has been commented out in the code:

# self.vae.encoder.forward = VAEHook(
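
Presumably re-enabling that hook is the intended path. A rough sketch of what I would expect it to look like (the argument list is an assumption based on the SeeSR/StableSR version of vaehook.py, not verified against AddSR):

```python
# Sketch only: re-enable tiled VAE encoding by wrapping the encoder's forward.
# Parameter names are assumed from the SeeSR/StableSR vaehook and may differ here.
self.vae.encoder.forward = VAEHook(
    self.vae.encoder,   # module whose forward() is replaced by a tiled version
    tile_size=1024,     # encode in 1024x1024-pixel tiles
    is_decoder=False,   # this hook wraps the encoder path
    fast_decoder=False,
    fast_encoder=False,
    color_fix=False,
)
```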

input size: 4320x7680
Traceback (most recent call last):
  File "xxxxxx/AddSR-main/test_addsr.py", line 268, in <module>
    main(args)
  File "xxxxxx/AddSR-main/test_addsr.py", line 217, in main
    image = pipeline(
  File "xxxxxx/AddSR-main/utils/vaehook.py", line 444, in wrapper
    ret = fn(*args, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "xxxxxx/AddSR-main/pipelines/pipeline_addsr.py", line 1005, in __call__
    latents_condition_image = self.vae.encode(image*2-1).latent_dist.sample()
  File "xxxxxx/lib/python3.9/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/diffusers/models/autoencoder_kl.py", line 258, in encode
    h = self.encoder(x)
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/diffusers/models/vae.py", line 141, in forward
    sample = down_block(sample)
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/diffusers/models/unet_2d_blocks.py", line 1247, in forward
    hidden_states = resnet(hidden_states, temb=None, scale=scale)
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/diffusers/models/resnet.py", line 606, in forward
    hidden_states = self.norm1(hidden_states)
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 273, in forward
    return F.group_norm(
  File "xxxxxx/lib/python3.9/site-packages/torch/nn/functional.py", line 2528, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 15.82 GiB (GPU 0; 47.51 GiB total capacity; 31.72 GiB already allocated; 14.11 GiB free; 31.79 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
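
For completeness, the allocator tweak the message itself suggests can be set as below, but it only mitigates fragmentation and will not make a single 15.82 GiB allocation fit:

```python
# Set before torch initializes CUDA, e.g. at the very top of test_addsr.py.
# This only reduces fragmentation; it cannot shrink a single 15.82 GiB allocation.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
```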

I think the out-of-memory error is due to the high resolution of your images. Performing 4x upscaling on a 128x128 image typically requires around 11 GB of CUDA memory, and your 4320x7680 input is significantly larger than that.
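A back-of-the-envelope check, assuming the encoder's activation memory grows roughly linearly with pixel count:

```python
# 128x128 upscaled 4x means a 512x512 encode, which already needs ~11 GB untiled.
small = 512 * 512      # pixels in a 512x512 encode
large = 4320 * 7680    # pixels in the reported input
print(large / small)   # ~126.6x the pixels, i.e. far beyond any single GPU untiled
```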

> It appears that the vae.encoder is not properly utilizing the tiled settings.

Yes, I met the same problem, and I found that the vae.encoder does not use the tiled settings. In SeeSR, the encoder is tiled with a tile size of 1024. By updating the related settings in vaehook, addsr_pipeline, and test_addsr, it can work on larger images; a sketch of the idea follows.
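
For anyone else hitting this, here is a minimal sketch of what the tiled encode boils down to. It is my own simplification rather than AddSR's actual VAEHook, and it assumes a diffusers AutoencoderKL, spatial sizes divisible by 8, and the 1024-pixel tile size used in SeeSR:

```python
import torch

@torch.no_grad()
def tiled_vae_encode(vae, image, tile=1024, overlap=64):
    # Split the image into overlapping tiles, encode each tile separately,
    # and average the latents where tiles overlap. Assumes h, w, tile, and
    # overlap are all multiples of the VAE's 8x downsampling factor.
    f = 8
    b, _, h, w = image.shape
    latents = torch.zeros(b, vae.config.latent_channels, h // f, w // f,
                          device=image.device, dtype=image.dtype)
    weight = torch.zeros_like(latents)
    stride = tile - overlap
    for top in range(0, h, stride):
        for left in range(0, w, stride):
            bottom, right = min(top + tile, h), min(left + tile, w)
            crop = image[:, :, top:bottom, left:right]
            lat = vae.encode(crop).latent_dist.sample()
            latents[:, :, top // f:bottom // f, left // f:right // f] += lat
            weight[:, :, top // f:bottom // f, left // f:right // f] += 1
    return latents / weight
```

Used in place of the failing line (i.e. tiled_vae_encode(self.vae, image*2-1) instead of self.vae.encode(image*2-1).latent_dist.sample()), peak memory stays proportional to one tile rather than the full 4320x7680 image. Plain averaging can leave faint seams in the overlaps; the real VAEHook handles the blending more carefully.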