MainRo / deepspeech-server

A testing server for a speech to text service based on coqui.ai

Using this server as the ASR (Speech to Text) module on Mycroft Mark 1

tachen-cs opened this issue · comments

Hi, I am wondering whether I can use this server as the ASR module on a voice assistant, namely the Mycroft Mark 1 (controlled by a Raspberry Pi).

I am not sure how to combine it with the other services on the Mycroft Mark 1.

Thank you!

Do you want to run it on the device itself (the RPi) or on a remote server controlled via the REST API?
Running it on the RPi may be challenging if you want to run a deepspeech model for a full language: the models are quite big (several GB), and it may not run in real time (I have not tried).
I have not had the opportunity to look at the Mycroft code, but it may be easy to replace the existing ASR with deepspeech.

This project can be used more or less as a drop-in replacement for the default remote Mycroft STT module. I've been running it locally with Mycroft for more than a year.

To use a local deepspeech server apply the following settings to the Mycroft configuration file:

  "stt": {
    "module": "deepspeech_server"
     "deepspeech_server": {
       "uri": "http://localhost:8080/stt"
     }
  }

Replace localhost with whatever IP address or hostname the deepspeech server is running on.
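Before pointing Mycroft at the server, it can help to verify that the endpoint answers on its own. A minimal sketch, assuming the server is listening on port 8080 and accepts a WAV recording POSTed to /stt (`audio.wav` is a placeholder; substitute a 16 kHz, 16-bit mono sample and your server's actual host):

```shell
# POST a WAV sample directly to the deepspeech server and print
# the transcription it returns. Adjust host, port, and filename
# to match your setup.
curl -X POST \
     --header "Content-Type: audio/wav" \
     --data-binary @audio.wav \
     http://localhost:8080/stt
```

If this returns a sensible transcription, any remaining problems are on the Mycroft configuration side rather than the server.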

I can't say whether the deepspeech server itself will work well enough on a Raspberry Pi, but I think it might struggle a bit. The lowest configuration I've used so far is an Intel Core i5-4570 for the deepspeech server and an RPi 2B for Mycroft itself.

Yes, @johanpalmqvist. I successfully replaced the default STT module with the deepspeech server.

But I found the recognition performance is quite poor: some common words are now misrecognized. Did you run into this too?

The default models provided by Mozilla are trained on open datasets (Fisher, LibriSpeech, and Switchboard) that are probably too small to be as effective as commercial solutions. This may improve in the future.