OpenNMT / CTranslate2

Fast inference engine for Transformer models

Home Page: https://opennmt.net/CTranslate2

`unload_model` support for `Generator`

NeonBohdan opened this issue · comments

`unload_model` is a unique feature of CTranslate2, but it is currently supported only for `Translator`.

Could it also be supported for `Generator` models? It would greatly improve memory management for them.

+1, could this support `Whisper` models as well?

Hello, we can of course support it for `Generator` and `Whisper`. I will add it when I have time.

In the meantime, hit me up if you want some quick code snippets on how to delete the model object, free VRAM via `torch.cuda`, run garbage collection, etc., which is what I've resorted to...
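The interim workaround described above can be sketched roughly as follows. This is only an illustration, not CTranslate2 API: `ModelHolder` and `unload` are hypothetical names, and the real model object would be e.g. a `ctranslate2.Generator` instance.

```python
# A sketch of the interim workaround: drop the Python reference to the
# model, force garbage collection, and (if PyTorch is installed) release
# cached CUDA memory. `ModelHolder` and `unload` are illustrative names,
# not part of the CTranslate2 API.
import gc


class ModelHolder:
    """Wraps a loaded model (e.g. a ctranslate2.Generator) so it can be dropped."""

    def __init__(self, model=None):
        self.model = model  # would hold the real Generator instance


def unload(holder):
    """Release the model's memory by dropping its last reference."""
    holder.model = None  # drop the reference to the model object
    gc.collect()  # collect any reference cycles keeping it alive
    try:
        import torch

        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached VRAM blocks to the driver
    except ImportError:
        pass  # torch not installed; nothing extra to free
```

Unlike `Translator.unload_model`, this destroys the model entirely, so you pay the full construction cost (e.g. `ctranslate2.Generator(...)`) the next time you need it.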