marella / ctransformers

Python bindings for the Transformer models implemented in C/C++ using the GGML library.

Out of memory exits process

kczimm opened this issue

Thanks for creating this library! We are using ctransformers at PostgresML to support GGUF models from Hugging Face. We need to detect and recover from CUDA out of memory errors. Currently, it appears that ctransformers exits the process whenever a CUDA error is encountered.

Is there any plan to add error handling to ctransformers? Ideally, we would get some sort of Exception in Python that we can handle.
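
Until something like that exists, one possible workaround is to run inference in a child process, so that a hard exit kills only the child and the parent can turn the non-zero exit code into an ordinary Python exception. The sketch below is only an illustration of that idea and is not part of ctransformers: `run_isolated`, `InferenceProcessError`, and the two child scripts are hypothetical names, and the crashing child merely simulates a process exit with `os._exit(1)` rather than triggering a real CUDA out of memory error.

```python
import subprocess
import sys

# Hypothetical child scripts. In real use the child would load the model and
# generate text; here one prints a result and the other simulates a hard exit.
CHILD_OK = "print('hello')"
CHILD_CRASH = "import os; os._exit(1)"


class InferenceProcessError(RuntimeError):
    """Raised in the parent when the child process exits abnormally."""


def run_isolated(script: str, timeout: float = 60.0) -> str:
    """Run `script` in a separate Python interpreter and return its stdout.

    A non-zero exit code in the child is converted into an exception in the
    parent, so the parent process survives and can decide how to recover.
    """
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise InferenceProcessError(
            f"child exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    print(run_isolated(CHILD_OK).strip())
    try:
        run_isolated(CHILD_CRASH)
    except InferenceProcessError as exc:
        print(f"recovered: {exc}")
```

The trade-off is that the model has to be loaded inside the child, so each restart pays the load cost again. It also does not distinguish an out of memory error from any other fatal error unless the child's stderr is inspected.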