Very low accuracy

Question

dan960 opened this issue 7 years ago · comments

I have tried to run recognition on a WAV file (16 kHz, mono), but the accuracy is very low (~65%) compared to the pocketsphinx_continuous tool (~95%) with the same models, dictionary and pocketsphinx config options. The whole buffer (an int16 vector) is fed to ps_process_raw in chunks (2048), basically mirroring what the pocketsphinx_continuous tool does.
Before passing the file to the recognizer, the WAV headers are removed (using decodeAudioData) and the audio is then resampled back to 16000 Hz (because AudioContext automatically resamples to the context rate).
The models and dictionary are raw and lazy-loaded.
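
For reference, the preparation steps described above (resample to 16 kHz, convert to int16, feed in chunks) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the original post: the function names are hypothetical, the chunk size of 2048 is assumed to mean samples, and plain linear interpolation stands in for whatever resampler was actually used. Linear interpolation without a low-pass filter can alias when downsampling, which is itself one way audio can end up differing from what the native tool sees.

```javascript
// Hypothetical helpers; none of these names come from pocketsphinx.js.

// Resample a Float32Array by linear interpolation.
// Note: no low-pass filtering is applied, so downsampling may alias.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outLength = Math.floor(input.length * toRate / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}

// Convert floats in [-1, 1] (the range decodeAudioData produces)
// to 16-bit signed integers, clamping out-of-range values.
function floatToInt16(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = Math.round(s < 0 ? s * 32768 : s * 32767);
  }
  return out;
}

// Yield consecutive views of at most chunkSize samples each.
function* chunked(samples, chunkSize) {
  for (let offset = 0; offset < samples.length; offset += chunkSize) {
    yield samples.subarray(offset, offset + chunkSize);
  }
}
```

Each chunk produced by chunked(pcm, 2048) would then be handed to the recognizer in whatever way your build exposes ps_process_raw.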

My theory is that the low accuracy could be caused by the browser's performance. Has anybody run into something similar?

On a separate note, I have also tried the WebAssembly version, because if the problem were insufficient resources, that would presumably result in an increase in accuracy. The compilation runs fine, but during processing it gives a runtime error:

Uncaught RuntimeError: integer result unrepresentable
    at _eval_topn (wasm-function[652]:477)
    at _ptm_mgau_codebook_eval (wasm-function[648]:66)
    at _ptm_mgau_frame_eval (wasm-function[646]:125)
    at _acmod_score (wasm-function[371]:234)
    at _phone_loop_search_step (wasm-function[578]:116)
    at _ps_search_forward (wasm-function[611]:109)
    at _ps_process_raw (wasm-function[609]:152)

Sylvain Chevalier · Answer 1 · Wed Jan 17 2018 05:39:46 GMT+0800 (China Standard Time)

Browser performance should not affect recognition accuracy: the computations are the same as when compiled natively, although the decoder will certainly run slower in the browser.

If I were you, I would first look at the initialization parameters on both versions (for pocketsphinx.js, they are displayed in the JavaScript console) and make sure they are all the same.
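
One way to do that comparison, sketched under assumptions: copy the parameters printed by each version into two plain objects and diff them. The function name diffParams and the object layout are hypothetical, and the values in the usage note are made up for illustration.

```javascript
// Hypothetical helper: report every parameter whose value differs
// between two configurations, or that is present in only one of them.
// Values are compared as strings so that 16000 and "16000" count as equal.
function diffParams(native, browser) {
  const keys = new Set([...Object.keys(native), ...Object.keys(browser)]);
  const diffs = [];
  for (const key of [...keys].sort()) {
    const left = key in native ? String(native[key]) : undefined;
    const right = key in browser ? String(browser[key]) : undefined;
    if (left !== right) diffs.push({ key, native: left, browser: right });
  }
  return diffs;
}
```

For example, diffParams({ "-samprate": "16000" }, { "-samprate": 16000, "-nfft": "512" }) would report only "-nfft", since it is set on one side and missing on the other.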

You should also try to find a way to make sure the audio data that are passed to the decoder are the same. I am not sure I understand what you describe about AudioContext resampling your file, but that could be something that affects recognition rate.
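
A minimal sketch of such a check, assuming you can get both sample streams as Int16Array (for instance the raw samples read from the WAV file and the samples you are about to send to the decoder). The name compareSamples is hypothetical.

```javascript
// Hypothetical helper: compare two int16 sample buffers and summarise
// how they differ. firstMismatch is -1 when the overlapping part is identical.
function compareSamples(a, b) {
  const n = Math.min(a.length, b.length);
  let firstMismatch = -1;
  let maxAbsDiff = 0;
  for (let i = 0; i < n; i++) {
    const d = Math.abs(a[i] - b[i]);
    if (d > 0 && firstMismatch === -1) firstMismatch = i;
    if (d > maxAbsDiff) maxAbsDiff = d;
  }
  return { lengthA: a.length, lengthB: b.length, firstMismatch, maxAbsDiff };
}
```

A length mismatch or a large maxAbsDiff would suggest that the decode and resample round trip changed the audio, which is worth ruling out before looking at the decoder itself.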

Mainak Biswas · Answer 2 · Thu Jun 14 2018 22:50:03 GMT+0800 (China Standard Time)

Same problem... help!