trained a custom model but it did not work and no error message

Question

raymond00000 opened this issue 3 years ago

I tried to train a model on a subset of the Common Voice data.
The training process looked fine.
(P.S. What would be a good or reasonable loss value?)

After 50 epochs, I stopped training so that I could test the model.

KeyboardInterrupt
I FINISHED optimization in 57:12:08.180975
I Loading best validating checkpoint from checkpoint/yue/best_dev-129058
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv
Test epoch | Steps: 1 | Elapsed Time: 0:18:29

But it did not appear to run on the test samples. After 15 minutes of waiting, I killed the process:
Test epoch | Steps: 1 | Elapsed Time: 0:18:29 Killed

I tried again with the command below, but hit the same problem: no output or error message on screen, even after a long wait.

python3 DeepSpeech.py  --alphabet_config_path yue_alphabet.txt \
    --test_files /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv \
    --checkpoint_dir checkpoint/yue

I Loading best validating checkpoint from checkpoint/yue/best_dev-129058
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00

I would be grateful for any advice on what is wrong and how to resolve it. Thanks!

Francis Tyers · Answer 1 · Thu Dec 02 2021 06:12:54 GMT+0800 (China Standard Time)

Check out #3693.
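
Separately from the linked issue, when a test run stalls without any message it can help to rule out problems in the test CSV itself. Below is a minimal sketch of such a check, not something taken from the thread or from DeepSpeech. It assumes the CSV uses the `wav_filename`, `wav_filesize`, `transcript` columns written by the DeepSpeech importers, that relative audio paths are relative to the CSV's directory, and that the alphabet file has one symbol per line with `#` comment lines. The function name `check_test_csv` is hypothetical.

```python
import csv
import os


def check_test_csv(csv_path, alphabet_path):
    """Return a list of (row_number, problem) tuples for a DeepSpeech-style CSV.

    Assumed layout (not verified against any particular DeepSpeech version):
    columns wav_filename, wav_filesize, transcript; alphabet file with one
    symbol per line, '#' starting a comment line and '\\#' meaning a literal '#'.
    """
    alphabet = set()
    with open(alphabet_path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("\\#"):
                alphabet.add("#")
            elif not line.startswith("#"):
                # Strip only the newline so a line holding a single space survives.
                alphabet.add(line.rstrip("\n"))

    problems = []
    base_dir = os.path.dirname(os.path.abspath(csv_path))
    with open(csv_path, encoding="utf-8", newline="") as f:
        # Row numbers start at 2 because line 1 is the header
        # (approximate if any field spans several lines).
        for row_number, row in enumerate(csv.DictReader(f), start=2):
            wav = row["wav_filename"]
            if not os.path.isabs(wav):
                wav = os.path.join(base_dir, wav)
            if not os.path.isfile(wav):
                problems.append((row_number, "missing audio file: " + wav))
            elif os.path.getsize(wav) == 0:
                problems.append((row_number, "empty audio file: " + wav))

            transcript = row["transcript"]
            if not transcript.strip():
                problems.append((row_number, "empty transcript"))
            unknown = sorted(set(transcript) - alphabet)
            if unknown:
                problems.append(
                    (row_number, "characters not in alphabet: " + "".join(unknown))
                )
    return problems
```

For the setup in the question, this would be called with the `test.csv` path and `yue_alphabet.txt`, printing each returned `(row_number, problem)` pair. An empty result only means these particular checks passed. It does not show that the data is the cause of the stall or rule out other causes.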