T-vK / Termux-DeepSpeech

Open source offline speech recognition for Android using Mozilla's DeepSpeech in Termux

Unexpected type for ./tmp.wav: 49152

navid-zamani opened this issue

Sometimes, speech2text throws this error. The number varies.

I found no way to reproduce it reliably. So before I go and add logging output lines to the entire script, maybe you know what the likely cause is and can tell me where to put the logging, so that it shows me some context when the error happens.

(Btw, since it’s hard to test right now, giving me a hint on how to make it recognize German would make it much easier for me to test it. I think right now, it may struggle because it can’t find anything English with my accent. ;)

commented

I haven't used DeepSpeech in a while, but I've never had that error. It might be helpful to analyze that tmp.wav file, but for that you'd probably have to change the script so it doesn't delete the file automatically. It would be interesting to know whether it is a valid WAV file at all and, if it is, details like channel count, sample rate, data type, sample size and endianness. Comparing those against a WAV file from a run where recognition did work would be especially useful.
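A minimal sketch of how such a check could look, assuming Python is available in Termux and that a copy of the failing tmp.wav has been kept. The function name `describe_wav` and the key names in the returned dict are made up for this example; only the standard-library `wave` module is used. Note that `wave` reads plain PCM data only, so a compressed or otherwise unusual file shows up here as invalid, with the reason attached.

```python
import sys
import wave


def describe_wav(path):
    """Return basic header info for a WAV file, or the reason it looks invalid."""
    with open(path, "rb") as f:
        magic = f.read(4)
    # "RIFF" marks the usual little-endian container, "RIFX" a big-endian one.
    if magic == b"RIFX":
        return {"valid": False, "reason": "big-endian RIFX container, not readable by the wave module"}
    if magic != b"RIFF":
        return {"valid": False, "reason": f"unexpected magic bytes {magic!r}"}
    try:
        with wave.open(path, "rb") as w:
            return {
                "valid": True,
                "byte_order": "little",
                "channels": w.getnchannels(),
                "sample_rate_hz": w.getframerate(),
                "sample_width_bytes": w.getsampwidth(),
                "frames": w.getnframes(),
            }
    except (wave.Error, EOFError) as exc:
        # Raised when the header is truncated, malformed, or not plain PCM.
        return {"valid": False, "reason": str(exc)}


if __name__ == "__main__":
    # Print one line per file given on the command line.
    for path in sys.argv[1:]:
        print(path, describe_wav(path))
```

Saved under a hypothetical name such as `wavinfo.py`, it could be run as `python wavinfo.py tmp.wav working-example.wav` to put a failing and a working recording side by side. Any difference in channel count, sample rate or sample width between the two would be a good first place to look.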

I'm not sure if DeepSpeech provides models for German speech recognition.

Vosk might be a better alternative. It works quite well in the open source app Dicio. I haven't tried getting it to work in an Android terminal.