mpuels / docker-py-kaldi-asr-and-model

STT Service based on Kaldi ASR

Home Page: http://zamia-speech.org/

This image contains a demo STT service based on Kaldi ASR and py-kaldi-asr. Try it out by following these steps.

To start the STT service on your local machine, execute:

$ docker pull quay.io/mpuels/docker-py-kaldi-asr-and-model:kaldi-generic-en-tdnn_sp-r20180815
$ docker run --rm -p 127.0.0.1:8080:80/tcp quay.io/mpuels/docker-py-kaldi-asr-and-model:kaldi-generic-en-tdnn_sp-r20180815
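Before sending audio, it can help to confirm that the container is accepting connections on the published port. The following is a minimal sketch, not part of this repository: the helper name `service_is_reachable` is hypothetical, and it only checks that a TCP connection to 127.0.0.1:8080 (the mapping used in the `docker run` command above) can be opened. It does not verify that the model has finished loading.

```python
import socket


def service_is_reachable(host="127.0.0.1", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; host and port default to the
    mapping from the `docker run` command (-p 127.0.0.1:8080:80/tcp).
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError
        # or a timeout) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `service_is_reachable()` with no arguments checks the default address; a `False` result usually means the container is not running or the port mapping differs.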

To send an audio file to the service for transcription, execute the following in a second terminal, from a checkout of this repository (which provides environment.yml and asr_client.py):

$ conda env create -f environment.yml
$ source activate py-kaldi-asr-client
$ ./asr_client.py asr.wav

For a list of available Kaldi models packaged in Docker containers, see https://quay.io/repository/mpuels/docker-py-kaldi-asr-and-model?tab=tags .

For a description of the available models, see https://github.com/gooofy/zamia-speech#asr-models .

Docker image tags follow the format

kaldi-generic-<LANG>-tdnn_<SIZE>-<RELEASEDATE>
  1. <LANG>: Models are available for English (en) and German (de).
  2. <SIZE>: Kaldi models come in two sizes: sp (standard size) and 250 (smaller size, suitable for real-time decoding on a Raspberry Pi).
  3. <RELEASEDATE>: The release date of the model, e.g. r20180815. Models released later are usually trained on more data and hence have a lower word error rate.
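The naming scheme can be split into its parts programmatically, for example when picking a model from a list of tags. The sketch below is illustrative only: `ModelTag` and `parse_tag` are hypothetical names, and the pattern rests on assumptions drawn from the single example tag in the pull command, `kaldi-generic-en-tdnn_sp-r20180815`. It accepts either `_` or `-` before the size, and assumes the release date is `r` followed by eight digits.

```python
import re
from typing import NamedTuple


class ModelTag(NamedTuple):
    lang: str     # e.g. "en" or "de"
    size: str     # e.g. "sp" or "250"
    release: str  # e.g. "r20180815"


# Assumed pattern, inferred from one example tag; other tags may differ.
TAG_RE = re.compile(
    r"^kaldi-generic-(?P<lang>[a-z]+)"
    r"-tdnn[-_](?P<size>[A-Za-z0-9]+)"
    r"-(?P<release>r\d{8})$"
)


def parse_tag(tag):
    """Split an image tag into language, size, and release date."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag format: {tag!r}")
    return ModelTag(**match.groupdict())
```

For instance, `parse_tag("kaldi-generic-en-tdnn_sp-r20180815")` yields the fields `en`, `sp`, and `r20180815`. Because the release field has a fixed width under this assumption, comparing release strings orders tags by date.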

The image is part of Zamia Speech.

License: GNU Lesser General Public License v3.0


Languages

Python 80.9%, Dockerfile 10.6%, Shell 8.5%