audioku / cross-accent-maml-asr

Meta-learning model agnostic (MAML) implementation for cross-accented ASR


Unable to run meta-training on the model

vuminhdiep opened this issue · comments

@gentaiscool @SamuelCahyawijaya can you explain how to fix this? When I try to run meta_train.py, the clips are named like ffffd0eda81a6e48d9d3a5cf9f2b0aa5f17db75603fa7e32c5b978f568528e5129f77af9ffca8eb01dd4512c3a93a569df29678ca3e810a1bbfa2a65359704a7.mp3, which differs from the names in the csv files in the manifests folder, such as
./data/CommonVoice2_dataset/clips/common_voice_en_1100186.mp3, so the code fails when training the model. Here is the output when I ran the meta-training command:

RuntimeError: Error loading audio file: failed to open file ./data/CommonVoice2_dataset/clips/f91b898dcfaf8655bdbbed448068c1faae5ebe598b045de2a04a6335c967846990ae8f24070b64b8a3e33ba11aaaec51b7be927ab680eeb652945f974b77e8f3

Error: pop from empty list, fetching new data...
formats: can't open input file `./data/CommonVoice2_dataset/clips/1f68b4dfb437b760d5b2f0c86ebd19a0c425ecb5474778c50adbaee40cd50800b97411ece34a531250857868c048a3da53f4a087464319f1b88c97b16ea6e40d': No such file or directory
Exception in thread Thread-361:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/vud/thesis/cross-accent-maml-asr/trainer/asr/meta_trainer.py", line 125, in fetch_train_batch
    batch_data = train_data_list[manifest_id].sample(k_train, k_valid, manifest_id)
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/data_loader.py", line 274, in sample
    spect = self.parse_audio(audio_path)[:,:self.args.src_max_len]
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/data_loader.py", line 69, in parse_audio
    y = load_audio(audio_path)
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/audio.py", line 8, in load_audio
    sound, _ = torchaudio.load(path, format="mp3") #remove normalization=True
  File "/opt/vud/thesis/cross-accent-maml-asr/venv/lib/python3.6/site-packages/torchaudio/backend/sox_io_backend.py", line 153, in load
    filepath, frame_offset, num_frames, normalize, channels_first, format)
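To confirm the mismatch before retraining, it can help to count how many paths in a manifest actually exist on disk. A minimal sketch, assuming the manifest is a CSV whose first column holds the clip path (adjust the column index if the repo's manifests differ):

```python
import csv
import os

def count_missing(manifest_path):
    """Count manifest entries whose audio file is missing on disk."""
    missing = 0
    total = 0
    with open(manifest_path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue
            total += 1
            audio_path = row[0]  # assumed: first column is the clip path
            if not os.path.isfile(audio_path):
                missing += 1
    return missing, total
```

If `missing` is close to `total`, the manifest and the downloaded dataset are simply out of sync.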

@vuminhdiep, sorry for the late reply. I am not sure why the current version of the dataset has a different naming standard. I suspect that, since the CommonVoice dataset was less mature back then, they did a lot of refactoring on their end and updated the naming standard.

In this case, I would suggest creating new manifest files based on the current dataset. You can refer to our manifest generator notebook to generate the manifest files.
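As a rough sketch of what regenerating a manifest can look like: current Common Voice releases ship a validated.tsv with (among others) "path" and "sentence" columns. The two-column (clip path, transcript) output below is an assumption for illustration; adjust it to whatever columns the repo's data loader and the manifest generator notebook actually expect.

```python
import csv
import os

def build_manifest(tsv_path, clips_dir, out_csv):
    """Rebuild a manifest CSV from Common Voice's validated.tsv.

    Writes one (clip path, transcript) row per entry whose audio file
    is actually present under clips_dir.
    """
    with open(tsv_path, newline="", encoding="utf-8") as fin, \
         open(out_csv, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin, delimiter="\t")
        writer = csv.writer(fout)
        for row in reader:
            clip = os.path.join(clips_dir, row["path"])
            if os.path.isfile(clip):  # skip clips missing from the download
                writer.writerow([clip, row["sentence"]])
```

Filtering on `os.path.isfile` also sidesteps the "failed to open file" crash at training time, since only clips present on disk end up in the manifest.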

Hope it helps!