huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Finetuning on which model?

RohitMidha23 opened this issue

As you mentioned, we should fine-tune when the WER is > 20% and the dataset size is < 1,000 hours.

This is my case as well: I have a fine-tuned model with a WER of 48% and a dataset of 100 hours.
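
(For reference, the WER figures above can be measured with the Hugging Face `evaluate` library; a minimal sketch, with placeholder transcripts:)

```python
# Minimal sketch: measuring WER with the Hugging Face `evaluate` library,
# to check against the > 20% fine-tuning threshold mentioned above.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the quick brown fox", "hello world"]       # model transcripts (placeholders)
references = ["the quick brown fox", "hello there world"]  # ground-truth transcripts

# `compute` returns a fraction; multiply by 100 for a percentage.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.1f}%")
```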

My question is: do you fine-tune the model created through create_student_model, or are we better off fine-tuning a tiny / small model?
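
For concreteness, both options load the same way with transformers; a minimal sketch, where `"./student-init"` is a hypothetical path to the output of create_student_model:

```python
# Minimal sketch of the two starting points being compared. The local path
# is hypothetical (wherever create_student_model saved the student); the
# OpenAI checkpoint is the standard small baseline.
from transformers import WhisperForConditionalGeneration

# Option A: fine-tune the student produced by create_student_model.
student = WhisperForConditionalGeneration.from_pretrained("./student-init")

# Option B: fine-tune an off-the-shelf tiny / small Whisper checkpoint.
baseline = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
```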

Thanks for your time @sanchit-gandhi!

Good question! I also want to know. From a preliminary experiment, I found that fine-tuning the model created with create_student_model generated only empty transcripts (my fine-tuning data was < 1 hr), but fine-tuning tiny/small with the same < 1 hr of data yielded better results.
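
For context, here is a sketch of what create_student_model roughly does, assuming the distil-whisper recipe (copy the teacher's encoder wholesale and initialise a shallow decoder from maximally spaced teacher layers). The copied layers were never trained to work together as a 2-layer stack, which may explain why the student emits empty transcripts until it sees substantial fine-tuning data:

```python
# Sketch of the assumed create_student_model initialisation, per the
# distil-whisper recipe: the teacher's encoder is copied unchanged, and a
# shallow student decoder is seeded from maximally spaced teacher layers.
from transformers import WhisperConfig, WhisperForConditionalGeneration

teacher = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

config = WhisperConfig.from_pretrained("openai/whisper-small")
config.decoder_layers = 2  # shallow student decoder
student = WhisperForConditionalGeneration(config)

# Encoder is copied wholesale.
student.model.encoder.load_state_dict(teacher.model.encoder.state_dict())

# Decoder embeddings and final layer norm come from the teacher
# (the output projection is weight-tied to the token embeddings).
student.model.decoder.embed_tokens.load_state_dict(
    teacher.model.decoder.embed_tokens.state_dict()
)
student.model.decoder.embed_positions.load_state_dict(
    teacher.model.decoder.embed_positions.state_dict()
)
student.model.decoder.layer_norm.load_state_dict(
    teacher.model.decoder.layer_norm.state_dict()
)

# Maximally spaced layers: the first and last of the teacher's decoder.
for student_idx, teacher_idx in {0: 0, 1: teacher.config.decoder_layers - 1}.items():
    student.model.decoder.layers[student_idx].load_state_dict(
        teacher.model.decoder.layers[teacher_idx].state_dict()
    )
```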