How to just evaluate a pre-trained network on an audio file?

Question

devinbostIL opened this issue 7 years ago · comments

Hi,

I was able to get my environment set up, and I want to try evaluating an existing model (such as the LibriSpeech network) to attempt speech-to-text on an audio file. I just want to perform the transcription.
How do I go about this with your library? From the documentation, I am not sure which steps are necessary or how much extra development work (if any) I will need to do to perform the transcription task.

Sean Naren · Answer 1 · Fri May 19 2017 18:01:34 GMT+0800 (China Standard Time)

Hey, my bad! I should update the docs sometime :) To do this, use the predict script as shown below:

th Predict.lua -modelPath /path/to/model.t7 -audioPath /path/to/audio.wav

There are further parameters if you need them; use the -help argument to see them!

Devin G. Bost · Answer 2 · Fri May 26 2017 04:31:01 GMT+0800 (China Standard Time)

Thanks for the information!

I attempted to run the model, and it blew up with this message:

$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/nameOfAudioFile.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (5107x1x1x1760) and desired view (5107x-1) do not match
stack traceback:
	[C]: in function 'error'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
	/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
	[C]: in function 'xpcall'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	Predict.lua:42: in main chunk
	[C]: in function 'dofile'
	...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

Any ideas?

Devin G. Bost · Answer 3 · Thu Jun 08 2017 07:22:44 GMT+0800 (China Standard Time)

Is it expecting me to pass it a table or a directory with a collection of audio files?

Devin G. Bost · Answer 4 · Thu Jun 08 2017 07:56:14 GMT+0800 (China Standard Time)

I tried changing the file and then also the sampling rate, and these were the error messages that I got:

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav'
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
In 3 module of nn.Sequential:
In 1 module of cudnn.BatchBRNNReLU:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (3951x1x1x1760) and desired view (3951x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

~/src/deepspeech.torch$ th Predict.lua -modelPath libri_deepspeech.t7 -audioPath '/home/devinbost/Downloads/speech_audio_files_sample/4402691.wav' -sampleRate 13000
/home/devinbost/torch/install/bin/luajit: ...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 7 module of nn.Sequential:
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: input view (1x32x26x4864) and desired view (1312x-1) do not match
stack traceback:
[C]: in function 'error'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:47: in function 'batchsize'
/home/devinbost/torch/install/share/lua/5.1/nn/View.lua:79: in function </home/devinbost/torch/install/share/lua/5.1/nn/View.lua:77>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/devinbost/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../devinbost/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Predict.lua:42: in main chunk
[C]: in function 'dofile'
...bost/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

Sean Naren · Answer 5 · Thu Jun 08 2017 15:56:35 GMT+0800 (China Standard Time)

Make sure the file is a 16 kHz WAV file. Is this the case?

I've also added documentation here.

Michael · Answer 6 · Fri Jun 09 2017 05:16:04 GMT+0800 (China Standard Time)

I'm having the same problem. I downloaded the LibriSpeech pre-trained model and am launching it with:

th Predict.lua -modelPath libri_deepspeech.t7 -audioPath amy.out.wav -dictionaryPath ./dictionary -nGPU 1

I'm trying to run this against a WAV file that I downsampled to 16 kHz mono with sox amy.wav amy.out.wav rate 16k channels 1. It is a 16-bit file, if that counts for anything.

I'm getting a very similar error when I try to run the prediction: View.lua:47: input view (241x1x1x1760) and desired view (241x-1) do not match

If I figure out what I'm doing wrong, I'd be happy to contribute some better documentation or strengthen the input file checking in Predict.lua so it throws actionable errors.
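
For reference, here is a rough sketch of the kind of pre-flight check described above, written as a standalone Python script rather than as a change to Predict.lua. It reads only the WAV header using Python's standard wave module and reports problems in plain language. The 16 kHz target comes from this thread; the mono and 16-bit expectations are assumptions based on the sox conversion mentioned above, and the function name check_wav is made up for illustration. Passing this check does not guarantee that the view mismatch error goes away, since the file described above already appears to meet these properties.

```python
import wave

# Expected input format. 16 kHz is stated in this thread; mono and 16-bit
# are assumptions taken from the sox conversion described above.
EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


def check_wav(path):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            channels = wav.getnchannels()
            width = wav.getsampwidth()
            frames = wav.getnframes()
    except (wave.Error, EOFError) as exc:
        # The wave module only understands uncompressed PCM WAV files.
        return ["not a readable PCM WAV file: {}".format(exc)]

    problems = []
    if rate != EXPECTED_RATE:
        problems.append(
            "sample rate is {} Hz, expected {} Hz".format(rate, EXPECTED_RATE)
        )
    if channels != EXPECTED_CHANNELS:
        problems.append(
            "file has {} channels, expected {}".format(channels, EXPECTED_CHANNELS)
        )
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(
            "samples are {}-bit, expected {}-bit".format(
                8 * width, 8 * EXPECTED_SAMPLE_WIDTH
            )
        )
    if frames == 0:
        problems.append("file contains no audio frames")
    return problems
```

Calling check_wav("amy.out.wav") before running th Predict.lua returns an empty list when the header matches, or messages such as "sample rate is 44100 Hz, expected 16000 Hz" when it does not.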