keenresearch / KeenASR-Android-PoC

A proof-of-concept app using KeenASR SDK on Android. WE ARE HIRING: https://keenresearch.com/careers.html

Home Page: https://keenresearch.com/keenasr-docs

"WARNING: AUDIO BUFFER OVERFLOW!!!"

partounian opened this issue · comments

WARNING: AUDIO BUFFER OVERFLOW!!! -- decoding speed is slower than real time (this is a problem) CB.size: 17280, CB available: 2560, new bytes: 3200

Taken from logcat.

Also, I seem to get results such as "<SPOKEN_NOISE>" and "Num words"; I'm not sure if this is intentional. How do I get Keen to give me English STT results?

Thank you for the help

We've seen this happen in a few scenarios:

  • older devices where the CPU cannot keep up with the processing (this could be overcome by using the more compact acoustic models that we have); in this scenario you will see the warning very soon after you start listening
  • if your app is listening/recognizing for a long time, CPU throttling may kick in
  • with very low battery (<10%), some devices also reduce CPU frequency
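
To make the numbers in the warning concrete, here is a minimal sketch of the arithmetic behind it. It assumes that "CB available" is the free space left in the circular buffer and that the audio is 16 kHz, 16-bit, mono PCM; both are assumptions for illustration, and the class and method names are hypothetical, not part of the SDK.

```java
// Hypothetical helper, not SDK code: reproduces the arithmetic behind the logcat warning.
class OverflowCheck {
    /** True when an incoming chunk does not fit into the free space of the buffer. */
    static boolean wouldOverflow(int freeBytes, int newBytes) {
        return newBytes > freeBytes;
    }

    /** Duration in milliseconds of a PCM chunk of the given size. */
    static double chunkMillis(int bytes, int sampleRateHz, int bytesPerSample, int channels) {
        return 1000.0 * bytes / (sampleRateHz * bytesPerSample * channels);
    }

    public static void main(String[] args) {
        // Values copied from the logcat line in this issue.
        int cbSize = 17280, cbAvailable = 2560, newBytes = 3200;

        // Assumed audio format: 16 kHz, 16-bit (2 bytes per sample), mono.
        int rate = 16000, bytesPerSample = 2, channels = 1;

        int backlogBytes = cbSize - cbAvailable; // audio captured but not yet decoded
        System.out.println("overflow: " + wouldOverflow(cbAvailable, newBytes));
        System.out.println("new chunk ms: " + chunkMillis(newBytes, rate, bytesPerSample, channels));
        System.out.println("backlog ms: " + chunkMillis(backlogBytes, rate, bytesPerSample, channels));
    }
}
```

Under those assumptions the incoming 3200 bytes (about 100 ms of audio) exceed the 2560 bytes still free, while roughly 460 ms of audio is already waiting to be decoded, which is what "slower than real time" means in practice.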

The <SPOKEN_NOISE> 'word' may occur if you say things that are not in the language model (the PoC is set up to recognize only digits); audio buffer overflows may also be causing this even if you are saying digits. "Num words" is just part of the debug printout -- you can access various attributes in the KASRResult object to get words, confidences, etc.
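
If the placeholder gets in the way of your app logic, one option is to drop it before using the recognized text. The sketch below works on a plain list of strings standing in for the words you would read from a result; how you extract them from KASRResult is not shown here, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not SDK code: filters the noise placeholder out of a word list.
class NoiseFilter {
    static final String SPOKEN_NOISE = "<SPOKEN_NOISE>";

    /** Returns the words in their original order, without noise placeholders. */
    static List<String> dropNoise(List<String> words) {
        return words.stream()
                .filter(w -> !SPOKEN_NOISE.equals(w))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in for words taken from a recognition result.
        List<String> recognized = List.of("ONE", "<SPOKEN_NOISE>", "TWO");
        System.out.println(dropNoise(recognized)); // prints [ONE, TWO]
    }
}
```

A result that consists only of the placeholder ends up empty after filtering, which is a reasonable signal that nothing in the vocabulary was recognized.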

In general, you will need to specify what the SDK is listening for by creating a decoding graph from an array of phrases specific to your app. This approach will not work for general, large-vocabulary dictation -- if that's your goal, we would need to get involved a bit more.

Best to contact me at ogi@keenresearch.com and we can discuss this in more detail.

Thanks for the update. While we have not tested on that specific device/CPU (Snapdragon 625), I believe it should be able to keep up with real-time processing using the ASR Bundle (acoustic models) that comes with the PoC app.

The next release of the SDK will leverage multiple CPU cores (up to two) for heavy computation, which should further alleviate these sorts of problems. But I would like to better understand the context in which you are running into this issue with the current version of the SDK. For example, are you running the PoC app as-is, and how soon after you start listening do you see these warnings in logcat?

We also have some non-public ASR Bundles that are more compact and require less computational resources.

Hi @partounian, the updated version of the ASR SDK has numerous optimizations; we are also releasing an ASR Bundle that's more compact. These two in combination should resolve the issues with buffer overflows.

Looking forward to your feedback.