CoDi: CUDA ran out of memory while trying to do inference tasks

Question

PHOENIXFURY007 opened this issue a year ago · comments

I was trying to run the demo notebook on an Nvidia A100 80 GB. While loading the model from the checkpoint, I got this error:
#######################
Running in eps mode
#######################

making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Load pretrained weight from ['CoDi_encoders.pth', 'CoDi_text_diffuser.pth', 'CoDi_audio_diffuser_m.pth', 'CoDi_video_diffuser_8frames.pth']

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.70 GiB total capacity; 17.10 GiB already allocated; 3.56 MiB free; 17.49 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Can you let me know how to solve this issue?

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
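
The numbers in the error message can be pulled apart to see what the allocator is actually reporting. The snippet below is a minimal sketch using only the standard library; `parse_oom` is a hypothetical helper written for this thread, not part of PyTorch or CoDi, and the message text is copied from the report above. Two things stand out: the reported total capacity is 23.70 GiB, well under 80 GB, which could mean the process is seeing a different device or only a partition of the A100 (an assumption, nothing in the logs confirms it), and the gap between reserved and allocated memory is small, so fragmentation is probably not the main factor here.

```python
import re

# Error text copied from the report above (only the part containing numbers).
MESSAGE = (
    "CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.70 GiB total "
    "capacity; 17.10 GiB already allocated; 3.56 MiB free; 17.49 GiB reserved "
    "in total by PyTorch)"
)

_UNITS_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom(message):
    """Return the sizes named in a PyTorch OOM message, converted to GiB.

    Hypothetical helper for illustration; it assumes the message layout
    shown in this thread.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        sizes[name] = float(match.group(1)) * _UNITS_TO_GIB[match.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
# Memory held by the caching allocator that is not backing live tensors.
cached_unused = sizes["reserved"] - sizes["allocated"]
print(f"total capacity : {sizes['total']:.2f} GiB")
print(f"requested      : {sizes['requested']:.4f} GiB")
print(f"cached, unused : {cached_unused:.2f} GiB")
```

If the cached-but-unused figure were several GiB, the `max_split_size_mb` hint in the message would be more likely to help; with a gap this small, the device is more likely just full.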

GAURAV SAHA · Answer 1 · Sat Jul 08 2023 17:11:33 GMT+0800 (China Standard Time)

I was able to load the checkpoints, but when I tried Text to Video + Audio generation, I got the same error as before:
RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 23.70 GiB total capacity; 21.25 GiB already allocated; 416.56 MiB free; 21.66 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is there any way I can run all of the inference tasks on a single A100 80 GB?
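
Since both error messages report 23.70 GiB of total capacity rather than 80 GB, one thing worth checking before anything else is which device the notebook process actually sees. The sketch below is only a diagnostic, under a few assumptions: `describe_devices` is a hypothetical helper, the `max_split_size_mb:128` value is an arbitrary example of the allocator option named in the error message, and the option is set before `torch` is imported so that it is in place when CUDA is first initialized. It does not touch CoDi itself.

```python
import os

# Assumption: set the allocator option before torch initializes CUDA.
# 128 is an arbitrary example value, not a recommendation from the CoDi authors.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

GIB = 1024 ** 3


def describe_devices():
    """Return one summary line per visible CUDA device (empty list if none).

    Hypothetical helper for illustration; torch is imported lazily so the
    script still runs on a machine without PyTorch installed.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    lines = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        lines.append(
            f"GPU {index}: {props.name}, "
            f"{props.total_memory / GIB:.2f} GiB total"
        )
    return lines


for line in describe_devices():
    print(line)
```

If the printed total is close to 23.70 GiB, the process is not running against the full 80 GB card, and the device selection (for example the `CUDA_VISIBLE_DEVICES` environment variable or a partitioned GPU) would be the first thing to look at.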