Problem with GPU allocation after updating to CTranslate2 4.0.0
carolinaxxxxx opened this issue · comments
When device_index = 1 is set (GPU 1), GPU 0 is still loaded with a small amount of data (about 263 MB in my case) and shows signs of activity, although it should not. This is clearly caused by CTranslate2 4.0.0: after reverting to ctranslate2 3.24.0 the problem disappears.
The above example is for the Whisper model, but I also tried LLM models and got the same result.
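
For anyone who wants to check this on their own machine, here is a minimal sketch that reads per-GPU memory usage from nvidia-smi. The helper names (parse_gpu_memory, used_memory_per_gpu) are made up for illustration and are not part of CTranslate2; it assumes nvidia-smi is on the PATH. The idea is to call it once before and once after loading a model with device_index = 1 and compare the value reported for GPU 0.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, memory.used' CSV rows into {gpu_index: used_MiB}."""
    usage = {}
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines such as the trailing newline
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def used_memory_per_gpu():
    """Ask nvidia-smi for the memory currently used on every visible GPU.

    Hypothetical helper: assumes the nvidia-smi binary is available.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)
```

If the entry for GPU 0 grows after the model is loaded on GPU 1 only, that matches the behaviour described above. Note that nvidia-smi reports memory for all processes, so other jobs on GPU 0 would need to be ruled out first.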