huggingface / distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
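For reference, the "within 1% word error rate" claim refers to the standard WER metric: the word-level edit distance between a reference transcript and the model's hypothesis, divided by the reference length. A minimal, self-contained sketch (illustrative only; the project's published numbers come from its own evaluation tooling with text normalization):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])  # substitution (or match)
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion over six reference words
```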


Does it support Chinese? (支持中文吗?)

xxm1668 opened this issue · comments

It does not, but you have two options for training a Chinese-compatible model:

  1. Follow the distillation instructions in the training folder and train on the Mandarin split of the Common Voice dataset https://github.com/huggingface/distil-whisper/tree/main/training
  2. Fine-tune the pre-trained checkpoint on Mandarin (instructions for this can also be found under the training folder)

=> Option 1 will get you the best results, but it is a little more involved than option 2.
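For context on option 1: the distillation recipe trains the smaller student model to match the teacher Whisper model, in part by penalizing divergence between their predicted token distributions with a KL term. A toy sketch of that term over one token's vocabulary (plain Python; the training scripts in the repo are the authority on the actual objective and weighting):

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student): how much the student's distribution diverges
    from the teacher's for a single token position."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical logits -> zero penalty; disagreeing logits -> positive penalty.
print(kl_divergence([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))       # 0.0
print(kl_divergence([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)   # True
```

Minimizing this term over the Mandarin training data is what pulls the student toward the teacher's behavior on Chinese speech.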


Thanks for your recommendation.

You can refer to https://huggingface.co/BELLE-2/Belle-distilwhisper-large-v2-zh, which supports Chinese and is based on distil-whisper-large-v2.

Thanks!