Trained a custom model, but it did not work and there is no error message
raymond00000 opened this issue
I tried to train a model on a subset of common voice data.
The training process looked good.
(P.S. What would be a good/reasonable loss value?)
Epoch 50 | Validation | Elapsed Time: 0:02:36 | Steps: 1782 | Loss: 86.263758 | Dataset: /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-0
After 50 epochs, I stopped training so that I could test the model.
KeyboardInterrupt
I FINISHED optimization in 57:12:08.180975
I Loading best validating checkpoint from checkpoint/yue/best_dev-129058
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv
Test epoch | Steps: 1 | Elapsed Time: 0:18:29
However, it never ran on the test samples. After waiting for 15 minutes, I killed the process.
Test epoch | Steps: 1 | Elapsed Time: 0:18:29 Killed
I tried again with the following command and hit the same problem: no output or error message on screen, even after a long wait.
python3 DeepSpeech.py --alphabet_config_path yue_alphabet.txt \
--test_files /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv \
--checkpoint_dir checkpoint/yue
I Loading best validating checkpoint from checkpoint/yue/best_dev-129058
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /data/asr/cv-corpus-7.0-2021-07-21-zh-HK/cv-corpus-7.0-2021-07-21/zh-HK/clips/test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00
I would appreciate any advice on what is wrong and how to resolve it. Thanks!
Check out #3693.
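Independently of the linked issue, one thing that can be ruled out without running DeepSpeech at all is a problem in the test data itself. The sketch below is only a hypothetical sanity check, not part of DeepSpeech: it assumes the CSV has a `transcript` column (as in the usual `wav_filename,wav_filesize,transcript` layout) and that the alphabet file lists one symbol per line, with lines starting with `#` treated as comments and `\#` standing for a literal hash. The helper names `load_alphabet` and `check_csv` are made up for illustration. It counts rows, empty transcripts, and characters that are missing from the alphabet. It does not explain a hang by itself; it only narrows down where to look.

```python
import csv
import sys
from collections import Counter


def load_alphabet(path):
    """Return the set of symbols in an alphabet file (one symbol per line).

    Assumption: lines starting with '#' are comments and '\\#' is a literal '#'.
    """
    symbols = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if line.startswith("#"):
                continue  # comment line
            if line.startswith("\\#"):
                line = "#"  # escaped hash symbol
            if line == "":
                continue  # skip completely empty lines
            symbols.add(line)
    return symbols


def check_csv(csv_path, alphabet):
    """Count rows, empty transcripts, and characters not in the alphabet."""
    rows = 0
    empty = 0
    unknown = Counter()
    with open(csv_path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            rows += 1
            transcript = row.get("transcript") or ""
            if not transcript.strip():
                empty += 1
            unknown.update(ch for ch in transcript if ch not in alphabet)
    return rows, empty, unknown


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 check_test_csv.py yue_alphabet.txt test.csv
    alphabet = load_alphabet(sys.argv[1])
    rows, empty, unknown = check_csv(sys.argv[2], alphabet)
    print(f"alphabet size: {len(alphabet)}")
    print(f"rows: {rows}, empty transcripts: {empty}")
    print(f"characters missing from alphabet: {dict(unknown)}")
```

If the row count is zero, or many characters are missing from the alphabet, the data preparation step is worth revisiting before looking further at the test run.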