Problems training on jamendo - RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0
mathigatti opened this issue · comments
Hi, thanks for this awesome project! I'm trying to train it with jamendo-moodtheme tags but I'm getting an error.
I'm trying it on a Google Colab VM with a CUDA-enabled GPU.
I downloaded the mel-spectrograms from the jamendo repository, specifying the melspecs data type and the autotagging_moodtheme dataset. Then, in this project, I just replaced the TAGS variable in the code with this and the tsv files with the moodtheme ones from here.
Everything looked fine but for some reason I'm receiving the attached error after running the training code.
The mel spectrograms have 92 bands and different lengths; could that be causing the problem?
Let me know if anyone knows what might be the problem :)
Thanks in advance!
# My code
```
%tensorflow_version 1.x
%cd /content/sota-music-tagging-models/src/
!python -u main.py --data_path /content/data --dataset jamendo-mood
```
# My error message
```
Namespace(batch_size=16, data_path='/content/data', dataset='jamendo-mood', log_step=20, lr=0.0001, model_load_path='.', model_save_path='./../models', model_type='hcnn', n_epochs=200, num_workers=0, use_tensorboard=1)
Traceback (most recent call last):
  File "main.py", line 61, in <module>
    main(config)
  File "main.py", line 39, in main
    solver.train()
  File "/content/sota-music-tagging-models/src/solver.py", line 172, in train
    for x, y in self.data_loader:
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self.dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 80, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 80, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 65, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 20546 and 9168 in dimension 2 at /pytorch/aten/src/TH/generic/THTensor.cpp:689
```
Hi @mathigatti
This error occurs because you have audio clips of different lengths in a single batch. As the error says, the data loader got "20546 and 9168 in dimension 2". You need to crop them to the same length.
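To illustrate the idea (a hypothetical sketch, not code from this repository): one way to make variable-length Mel spectrograms stackable by PyTorch's default collate function is to crop, or zero-pad, every clip to a fixed number of frames before it leaves the dataset:

```python
import numpy as np

def random_crop(spec, length):
    """Crop a (n_mels, time) spectrogram to a fixed number of frames.

    Zero-pads clips shorter than `length`. Illustrative helper only;
    this repository instead crops raw audio before computing spectrograms.
    """
    n_mels, n_frames = spec.shape
    if n_frames < length:
        pad = np.zeros((n_mels, length - n_frames), dtype=spec.dtype)
        return np.concatenate([spec, pad], axis=1)
    start = np.random.randint(0, n_frames - length + 1)
    return spec[:, start:start + length]

# The two mismatched lengths from the traceback (92 bands, per the question)
a = np.random.rand(92, 20546).astype(np.float32)
b = np.random.rand(92, 9168).astype(np.float32)

# After cropping, both clips stack into one (batch, n_mels, time) array
batch = np.stack([random_crop(s, 4096) for s in (a, b)])
print(batch.shape)  # (2, 92, 4096)
```

The crop length (4096 here) is arbitrary for the example; in practice it should match the input length the model was designed for.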
Before that, let me check one thing. It looks like you are trying to use Mel spectrogram inputs. The models implemented in this repository take raw audio inputs and extract Mel spectrograms on the fly, so please use raw audio inputs.
If you want to use Mel spectrogram inputs with your own data loader, you need to modify `model.py`: you can simply remove `self.spec` and `self.to_db` from `model.py`.
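For illustration, here is a minimal hypothetical sketch of what a model looks like once the on-the-fly `self.spec` / `self.to_db` front end is removed and `forward()` consumes precomputed Mel spectrograms directly. The class name, layer sizes, and the 56-tag output (the moodtheme tag count) are assumptions for the example, not the repository's actual architecture:

```python
import torch
import torch.nn as nn

class MelInputModel(nn.Module):
    """Toy model that accepts precomputed Mel spectrograms.

    In the repository's models, forward() would first call
    self.spec(waveform) and self.to_db(...); here those steps are
    gone and the input is already a (batch, n_mels, time) tensor.
    """
    def __init__(self, n_classes=56):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, n_classes)

    def forward(self, mel):
        x = mel.unsqueeze(1)            # (batch, 1, n_mels, time)
        x = torch.relu(self.conv(x))    # convolutional front end
        x = self.pool(x).flatten(1)     # global pooling -> (batch, 16)
        return self.fc(x)               # tag logits

model = MelInputModel()
out = model(torch.randn(2, 92, 4096))   # batch of precomputed spectrograms
print(out.shape)  # torch.Size([2, 56])
```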
It worked perfectly after downloading the mp3 files, thank you very much!!