vineeths96 / Spoken-Keyword-Spotting

In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for the keyword spotting task.

Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40].

himrlawrrence opened this issue · comments

commented

==>
Hi vineeths96,

I tried to use my own hotword, "get", instead of "marvin".
I added some "get" WAV files to the train folder and ran create_model. When the script reached
"
# Obtain the feature embeddings
X_train = feature_extractor.predict(get_data, use_multiprocessing=True)
"
I got this error:
Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40].

Could you please help me out?

Thanks a lot.

JFU

Hi JFU,

Can you post the entire error stack so I can see which statement triggers this error? It's hard to tell from the short snippet you provided.

commented

I'm glad to get your quick reply.
I have another quick question: must the training WAV files all be the same size (e.g. 32K), or must they be shorter than 1 second?

commented

The error stack is shown below:

C:\Users\JFU\anaconda3\envs\env38\python.exe C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py
2021-11-02 13:58:01.704671: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-11-02 13:58:01.704822: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Training model
Dataset statistics
Train files: 51410
Validation files: 6640
Dev test files: 6675
Test files: 2567
2021-11-02 13:58:07.875339: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-11-02 13:58:07.875475: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2021-11-02 13:58:07.881674: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: JFU-LAPTOP
2021-11-02 13:58:07.881881: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: JFU-LAPTOP
2021-11-02 13:58:07.882268: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-11-02 13:58:07.891825: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x19452a35900 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-11-02 13:58:07.891975: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
...
...
..
Total params: 930,403
Trainable params: 927,873
Non-trainable params: 2,530


Epoch 1/25
401/401 [==============================] - 221s 550ms/step - loss: 2.2773 - sparse_categorical_accuracy: 0.3448 - val_loss: 1.2852 - val_sparse_categorical_accuracy: 0.6045 - lr: 0.0010
Epoch 2/25
401/401 [==============================] - 269s 672ms/step - loss: 0.8611 - sparse_categorical_accuracy: 0.7339 - val_loss: 0.5980 - val_sparse_categorical_accuracy: 0.8114 - lr: 0.0010
Epoch 3/25
401/401 [==============================] - 290s 724ms/step - loss: 0.5668 - sparse_categorical_accuracy: 0.8260 - val_loss: 0.3616 - val_sparse_categorical_accuracy: 0.8905 - lr: 0.0010
.........
Epoch 23/25
401/401 [==============================] - 404s 1s/step - loss: 0.1639 - sparse_categorical_accuracy: 0.9494 - val_loss: 0.1798 - val_sparse_categorical_accuracy: 0.9487 - lr: 0.0010
Epoch 24/25
401/401 [==============================] - 408s 1s/step - loss: 0.1644 - sparse_categorical_accuracy: 0.9489 - val_loss: 0.1863 - val_sparse_categorical_accuracy: 0.9490 - lr: 0.0010
Epoch 25/25
401/401 [==============================] - 420s 1s/step - loss: 0.1572 - sparse_categorical_accuracy: 0.9519 - val_loss: 0.1741 - val_sparse_categorical_accuracy: 0.9487 - lr: 0.0010
Saving model
Saving training history
Traceback (most recent call last):
  File "C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py", line 25, in <module>
    main()
  File "C:/WORKSPACE/Spoken-Keyword-Spotting/src/main.py", line 21, in main
    get_kws_model()
  File "C:\WORKSPACE\Spoken-Keyword-Spotting\src\model_train.py", line 141, in get_kws_model
    X_train = feature_extractor.predict(get_data, use_multiprocessing=True)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 88, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1268, in predict
    tmp_batch_outputs = predict_function(iterator)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\def_function.py", line 650, in _call
    return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\function.py", line 593, in call
    outputs = execute.execute(
  File "C:\Users\JFU\anaconda3\envs\env38\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [1445,40] and element 1 had shape [1330,40].
  [[node IteratorGetNext (defined at \WORKSPACE\Spoken-Keyword-Spotting\src\model_train.py:141) ]] [Op:__inference_predict_function_29742]

Function call stack:
predict_function

Process finished with exit code 1
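
The error message reports that two items in the same batch have different numbers of rows (1445 and 1330) while both have 40 columns, so they cannot be stacked into one tensor. One possible workaround, not necessarily what this repository does, is to pad or truncate every feature matrix to a common number of frames before batching. The sketch below uses NumPy only; `fix_length` and `target` are hypothetical names, and the frame count the model actually expects is not shown in this thread.

```python
import numpy as np


def fix_length(features, target_frames):
    """Zero-pad or truncate along axis 0 so the result has target_frames rows."""
    features = np.asarray(features)
    n_frames = features.shape[0]
    if n_frames >= target_frames:
        # Too long (or exactly right): keep only the first target_frames rows.
        return features[:target_frames]
    # Too short: append rows of zeros with the same trailing shape and dtype.
    pad_shape = (target_frames - n_frames,) + features.shape[1:]
    pad = np.zeros(pad_shape, dtype=features.dtype)
    return np.concatenate([features, pad], axis=0)


# Two dummy feature matrices with the shapes from the error message.
a = np.ones((1445, 40), dtype=np.float32)
b = np.ones((1330, 40), dtype=np.float32)

# Hypothetical target length, chosen only for this illustration.
target = 1445
batch = np.stack([fix_length(a, target), fix_length(b, target)])
print(batch.shape)  # (2, 1445, 40)
```

Once every item has the same shape, stacking succeeds. Whether zero padding is appropriate depends on how the feature extractor was trained.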

I believe the sampling rate should be the same as the one used for training.
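
To check whether newly added recordings match the rest of the training data, the sample rate and duration of each WAV file can be read with Python's standard `wave` module. This is a minimal sketch: `wav_info` and `find_outliers` are hypothetical helpers, and the defaults of 16000 Hz and 1 second are assumptions that should be replaced with the values of the original training set.

```python
import wave
from pathlib import Path


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
    return rate, frames / rate


def find_outliers(folder, expected_rate=16000, max_seconds=1.0):
    """List WAV files whose sample rate differs or whose duration is too long.

    expected_rate and max_seconds are assumed defaults, not values
    confirmed by the repository.
    """
    outliers = []
    for path in sorted(Path(folder).rglob("*.wav")):
        rate, seconds = wav_info(path)
        if rate != expected_rate or seconds > max_seconds:
            outliers.append((path.name, rate, round(seconds, 3)))
    return outliers
```

Calling `find_outliers` on the folder that holds the new keyword recordings returns a list of `(file name, sample rate, duration)` tuples for the files that would need resampling or trimming.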

commented

Hi vineeth,
How do I train the model on a different keyword?
Thanks,
JFU