audioku / cross-accent-maml-asr

Model-agnostic meta-learning (MAML) implementation for cross-accented ASR

Video file name appeared garbled

PengWenChen opened this issue

Hello.

Thanks for sharing the code.
I tried to download the data with the command
cd data && bash download_cv2.sh
but the clip file names in the clips directory are all garbled, like the one below.
ffffd0eda81a6e48d9d3a5cf9f2b0aa5f17db75603fa7e32c5b978f568528e5129f77af9ffca8eb01dd4512c3a93a569df29678ca3e810a1bbfa2a65359704a7.mp3

Is that normal?

I use Ubuntu 18.04.4

I have the same issue.

@gentaiscool @SamuelCahyawijaya, can you explain how to fix this? I tried to run meta_train.py, but the clips are named like ffffd0eda81a6e48d9d3a5cf9f2b0aa5f17db75603fa7e32c5b978f568528e5129f77af9ffca8eb01dd4512c3a93a569df29678ca3e810a1bbfa2a65359704a7.mp3, which differs from the names in the CSV files in the manifests folder, such as
./data/CommonVoice2_dataset/clips/common_voice_en_1100186.mp3. That is why training fails. Here is the output I get when I run the meta_train training command:

RuntimeError: Error loading audio file: failed to open file ./data/CommonVoice2_dataset/clips/f91b898dcfaf8655bdbbed448068c1faae5ebe598b045de2a04a6335c967846990ae8f24070b64b8a3e33ba11aaaec51b7be927ab680eeb652945f974b77e8f3

Error: pop from empty list, fetching new data...
formats: can't open input file `./data/CommonVoice2_dataset/clips/1f68b4dfb437b760d5b2f0c86ebd19a0c425ecb5474778c50adbaee40cd50800b97411ece34a531250857868c048a3da53f4a087464319f1b88c97b16ea6e40d': No such file or directory
Exception in thread Thread-361:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/vud/thesis/cross-accent-maml-asr/trainer/asr/meta_trainer.py", line 125, in fetch_train_batch
    batch_data = train_data_list[manifest_id].sample(k_train, k_valid, manifest_id)
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/data_loader.py", line 274, in sample
    spect = self.parse_audio(audio_path)[:,:self.args.src_max_len]
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/data_loader.py", line 69, in parse_audio
    y = load_audio(audio_path)
  File "/opt/vud/thesis/cross-accent-maml-asr/utils/audio.py", line 8, in load_audio
    sound, _ = torchaudio.load(path, format="mp3") #remove normalization=True
  File "/opt/vud/thesis/cross-accent-maml-asr/venv/lib/python3.6/site-packages/torchaudio/backend/sox_io_backend.py", line 153, in load
    filepath, frame_offset, num_frames, normalize, channels_first, format)
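
For anyone hitting the same error, a quick way to confirm the mismatch is to list the manifest entries whose file is not present in the clips directory. The following is only a rough sketch: it assumes each manifest row is comma-separated with the audio path in the first field, which has not been verified against this repo's manifests, and the names `missing_clips` and `check_manifest` are made up for illustration.

```python
import csv
import os


def missing_clips(manifest_lines, available_names):
    """Return manifest audio paths whose file name is not in available_names.

    Assumption: each row is comma-separated and the first field is the
    audio path. Adjust the column index if the manifest differs.
    """
    missing = []
    for row in csv.reader(manifest_lines):
        if not row:
            continue  # skip blank lines
        audio_path = row[0].strip()
        if os.path.basename(audio_path) not in available_names:
            missing.append(audio_path)
    return missing


def check_manifest(manifest_path, clips_dir):
    """Hypothetical helper: compare one manifest file against a clips directory."""
    available = set(os.listdir(clips_dir))
    with open(manifest_path, newline="") as f:
        return missing_clips(f, available)
```

Calling `check_manifest` with one of the CSV files from the manifests folder and `./data/CommonVoice2_dataset/clips` would return every path the loader will fail to open. If that list covers nearly the whole manifest, the downloaded clips follow a different naming scheme than the one the manifests expect.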

Hello, sorry for the late reply. I think CommonVoice changed their naming convention. Kindly check this issue for more info.