[BUG] CUDA out of memory on Linux during inference
shirounanashi opened this issue · comments
Before You Report a Bug
My setup is a GTX 1660 Super and a Ryzen 5600G (16 GB RAM).
Bug Description
I use Applio on both Windows 11 and Linux (Arch). On Linux only, inference fails with a CUDA out of memory error, both in the latest release and at the latest commit.
Steps to Reproduce
Outline the steps to replicate the issue:
- Run inference with the default settings
Expected Behavior
Inference completes without a CUDA out of memory error
Desktop Details:
- Operating System: Linux (Arch Linux, Gnome)
- Browser: Microsoft Edge
Additional Context
I'm not using IAHispano's fairseq.
It could be caused by a lot of things, but I think it is one of these:
- Your GPU driver is outdated: https://docs.nvidia.com/deeplearning/cudnn/latest/reference/support-matrix.html
- Your GPU only has 5 GB of VRAM, and it is a GTX card, which some people have reported issues with
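Both possibilities above can be checked from the Python environment that Applio runs in. The following is a minimal sketch, not part of Applio: the helper name `cuda_report` is hypothetical, and it only reports what PyTorch itself sees (whether CUDA is usable, which cuDNN version is loaded, and how much VRAM is free).

```python
import importlib.util


def cuda_report():
    """Collect basic facts about the CUDA setup that PyTorch sees."""
    report = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not report["torch_installed"]:
        return report

    import torch

    report["torch_version"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        # mem_get_info() returns (free, total) in bytes for the current device.
        free_bytes, total_bytes = torch.cuda.mem_get_info()
        report["device_name"] = torch.cuda.get_device_name(0)
        report["cudnn_version"] = torch.backends.cudnn.version()
        report["free_vram_mib"] = free_bytes // 2**20
        report["total_vram_mib"] = total_bytes // 2**20
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is `False`, the driver or CUDA runtime is likely the problem. If it is `True` but `free_vram_mib` is already low before inference starts, another process may be holding the memory.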
Thank you, it really was a driver problem. It wasn't that the driver was outdated, though: CUDA and cuDNN weren't installed at all. I installed them and that solved the problem.
After further testing, I found that this is a problem with Applio on Linux that does not occur in RVC WebUI. In other words, it has nothing to do with the driver, as I assumed when I closed the issue.
Applio uses the same code for GPU detection and utilization as RVC WebUI. We only changed the Torch version, so I'm sharing this link so you can verify compatibility with our current setup: https://docs.nvidia.com/deeplearning/cudnn/latest/reference/support-matrix.html
@aitronssesin In theory, my GPU should run this smoothly. But even with the latest drivers and cuDNN on a clean Arch installation, I still get the CUDA out of memory error, which doesn't happen with RVC WebUI. Honestly, I don't know why, since it doesn't happen on Windows on the same PC.
It could be an issue with Arch, because on my Ubuntu server it works without any issues.
That may be, but then it doesn't make sense that RVC WebUI works without problems.
Yes, possibly because of the Torch version.
I tried updating torch, torchaudio and torchvision, but the CUDA out of memory error still occurred.
Sorry, I didn't explain myself well. I meant that the newer Torch version we are using is probably broken on Arch. Try this:
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121
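After running the command above, it can help to confirm that the pinned versions are the ones actually installed in the active environment. This is a small sketch using only the standard library. The names `PINNED`, `base_version` and `check_pins` are hypothetical helpers, and the pins are copied from the pip command rather than from Applio's requirements.

```python
from importlib import metadata

# Pins copied from the pip command above (assumption: same environment).
PINNED = {"torch": "2.1.1", "torchvision": "0.16.1", "torchaudio": "2.1.1"}


def base_version(version):
    """Drop a local build tag such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def check_pins(pinned=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    result = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        matches = installed is not None and base_version(installed) == wanted
        result[name] = (installed, matches)
    return result


if __name__ == "__main__":
    for name, (installed, matches) in check_pins().items():
        print(f"{name}: installed={installed} matches_pin={matches}")
```

A mismatch here usually means pip installed into a different environment than the one Applio is launched from.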
I didn't explain myself well either: I had updated to the version I was using with RVC WebUI. I have now tested the version you sent, and the problem persists. I also noticed that Applio doesn't seem to release the VRAM until I close its window in the terminal.
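On the VRAM observation: PyTorch's caching allocator keeps freed GPU memory reserved inside the process, so tools like `nvidia-smi` keep showing it as used until the process exits. That alone may not explain the out of memory error, but it can be tested. The sketch below is not Applio's code, and the function name `release_cached_vram` is hypothetical. It shows one way to hand cached memory back to the driver between inference runs, using `torch.cuda.empty_cache()`.

```python
import gc
import importlib.util


def release_cached_vram():
    """Free unreferenced tensors and return cached CUDA memory to the driver.

    Returns a dict of byte counts, or None when PyTorch or a CUDA device
    is unavailable.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if not torch.cuda.is_available():
        return None

    before = torch.cuda.memory_reserved()
    gc.collect()              # drop Python objects that still hold tensors
    torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
    after = torch.cuda.memory_reserved()
    return {
        "reserved_before": before,
        "reserved_after": after,
        "still_allocated": torch.cuda.memory_allocated(),
    }


if __name__ == "__main__":
    print(release_cached_vram())
```

If `still_allocated` stays high after the call, something is still holding references to tensors (for example a loaded model), and `empty_cache()` cannot free that memory.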