abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️

Home Page: https://abdeladim-s.github.io/subsai/

Hugging Face pretrained models integration

mkatic007 opened this issue

Could you please explain how to add a Hugging Face pretrained model to work with your solution?

@mkatic007, I've added a Hugging Face implementation to the supported models.
You can use any pretrained model from the hub as long as it is compatible with the Automatic Speech Recognition task.
Please give it a try and let me know if you find any issues.

Thank you! I tried with:

subsai D:/TranSource/03.mp3 --model japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large --model-configs "{"model_type": "large-v3"}" --format srt -tm mbart50 -tsl japanese -ttl english

But it gives the error:

return AVAILABLE_MODELS[model_name]['class']
KeyError: 'japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large'

I did not download the model from HF, so I am not sure if I am missing any steps :)
Please be so kind as to instruct me on what to do.

The command should look like:

subsai D:/TranSource/03.mp3 --model HuggingFaceModel --model-configs "{"model_id": "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large"}" --format srt -tm mbart50 -tsl japanese -ttl english
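
As a rough illustration of why the first attempt raised a KeyError: judging from the traceback, --model is looked up by name in a registry (AVAILABLE_MODELS), so a Hub model id is not a valid key there and has to travel in the model configs instead. The sketch below uses a hypothetical stand-in registry with a placeholder class; it is not subsai's actual code, and the real registry has different contents.

```python
# Hypothetical stand-in for the registry named in the traceback.
# The placeholder "class" is plain dict, used only so the sketch runs.
AVAILABLE_MODELS = {
    "HuggingFaceModel": {"class": dict},
}


def resolve_backend(model_name: str):
    """Return the class registered under model_name (raises KeyError if absent)."""
    return AVAILABLE_MODELS[model_name]["class"]


hub_id = "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large"

# Passing the Hub id as the model name fails: it is not a registry key.
lookup_failed = False
try:
    resolve_backend(hub_id)
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)

# Passing the backend name works; the Hub id goes into the configs.
backend = resolve_backend("HuggingFaceModel")
model = backend(model_id=hub_id)
print(model)
```

In this reading, HuggingFaceModel is the backend name and model_id is one of its configuration keys, which matches the corrected command above.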

Thank you, I tried but now I am getting this error:
"json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)".
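
A likely cause, stated as an assumption rather than a confirmed diagnosis: the shell consumes the inner double quotes of the --model-configs value, so the program receives something like {model_id: japanese-asr/...} instead of valid JSON. The sketch below reproduces that kind of parse failure with Python's json module and then builds a value whose inner quotes are backslash-escaped. Whether \" survives depends on the shell in use, so treat the printed argument as a starting point, not a guaranteed fix.

```python
import json

hub_id = "japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large"

# Assumed shape of the argument after the shell has stripped the inner quotes.
stripped = "{model_id: " + hub_id + "}"

error_message = None
try:
    json.loads(stripped)
except json.JSONDecodeError as err:
    error_message = str(err)
    print(error_message)

# Valid JSON for the same configuration.
configs = json.dumps({"model_id": hub_id})

# Escape the inner quotes and wrap the value in double quotes, which many
# shells accept inside a double-quoted argument (shell-dependent assumption).
escaped = '"' + configs.replace('"', '\\"') + '"'
print("--model-configs", escaped)
```

If the shell in use does not honor \" inside double quotes, wrapping the JSON in single quotes (where the shell supports them) is another common workaround.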