abdeladim-s / easymms

A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project

Home Page: https://abdeladim-s.github.io/easymms/


Module not found: No module named 'examples.speech_recognition'

andergisomon opened this issue

commented

I was trying out the Colab notebook: I selected the l1107 model, changed the language to dtp, and got this error when running the ASR inference:

I have not tried this with the smaller model and lang='eng'.

ModuleNotFoundError                       Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/easymms/models/asr.py in <module>
     28 try:
---> 29     from fairseq.examples.speech_recognition.new.infer import hydra_main
     30 except ImportError:

/content/fairseq/examples/speech_recognition/__init__.py in <module>
----> 1 from . import criterions, models, tasks  # noqa

/content/fairseq/examples/speech_recognition/criterions/__init__.py in <module>
     14         criterion_name = file[: file.find(".py")]
---> 15         importlib.import_module(
     16             "examples.speech_recognition.criterions." + criterion_name

/usr/lib/python3.10/importlib/__init__.py in import_module(name, package)
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127

ModuleNotFoundError: No module named 'examples.speech_recognition'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-161905b8ad81> in <cell line: 1>()
----> 1 from easymms.models.asr import ASRModel
      2
      3 asr = ASRModel(model=f'./models/{model}.pt')
      4
      5 transcriptions = asr.transcribe(files, lang='dtp', align=False)

/usr/local/lib/python3.10/dist-packages/easymms/models/asr.py in <module>
     29     from fairseq.examples.speech_recognition.new.infer import hydra_main
     30 except ImportError:
---> 31     from examples.speech_recognition.new.infer import hydra_main
     32
     33

ModuleNotFoundError: No module named 'examples.speech_recognition'
commented

Error on line 29 of asr.py:

Import "fairseq.examples.speech_recognition.new.infer" could not be resolvedPyright(reportMissingImports)

@andergisomon, did you get this error on Colab, or are you running it locally?

commented

@andergisomon, did you get this error on Colab, or are you running it locally?

I have yet to try it locally, but I ran it on Colab.

Oh, was it on Colab? I tested it recently and didn't get any errors.
I will try to rerun it and fix it if needed.

@andergisomon, I have made some changes to the notebook to handle that error differently.
Could you please give it a try now?
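
For reference, a minimal sketch of the kind of workaround this involves, assuming fairseq was cloned to /content/fairseq as in the traceback above (the exact notebook changes may differ):

```python
import sys

# The fallback import in asr.py, `from examples.speech_recognition.new.infer
# import hydra_main`, only resolves if the fairseq repo root is on sys.path,
# so that `examples` is importable as a top-level package.
sys.path.insert(0, '/content/fairseq')  # path taken from the traceback above

from easymms.models.asr import ASRModel  # should now import without the error
```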

commented

@abdeladim-s Hello. I just tried the Colab notebook and I'm surprised: it took 9 minutes (and 17 GB of RAM) to transcribe a 2-second audio sample of my own voice, but the transcription was 100% accurate. Earlier I tried ASR via Hugging Face Transformers; while it took less than 8 GB of RAM and ran faster, the transcription was completely garbled.

There has to be something I'm missing about running inference on the ASR model, but the current docs just don't go into enough detail.

commented

Is there a way to speed up the inference through EasyMMS, such as using the GPU runtime as opposed to the CPU?

Hi @andergisomon,

  • Yes, you can speed up the inference by using the GPU instead of the CPU. To do this, select the GPU runtime and pass device='cuda' to the transcribe function (see the sketch below).
  • You can also speed up the inference by choosing a smaller model, although unfortunately that depends on the target language you are using.
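
A minimal sketch of the GPU option, based on the notebook cell in the traceback above (the checkpoint path and file name here are placeholders):

```python
from easymms.models.asr import ASRModel

# Placeholder path: use whichever checkpoint you downloaded in the notebook
asr = ASRModel(model='./models/mms1b_l1107.pt')

files = ['sample.wav']  # hypothetical audio file

# On a GPU runtime, device='cuda' runs inference on the GPU instead of the CPU
transcriptions = asr.transcribe(files, lang='dtp', device='cuda', align=False)
```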

That being said, transcribing a 2-second audio clip in 9 minutes seems weird! I tried it with even a 30 s clip and it took just a few minutes.
If you can share the audio, I can test it as well and let you know if I have any ideas.

commented

@abdeladim-s

What's weird is that I tried it again with 16 seconds of audio and it took 11 minutes with the Colab notebook you sent, only 3 minutes more than the 2-second sample. I did use the l1107 model on the free CPU runtime with the language code changed; I didn't change anything else.

@andergisomon
I think the bottleneck is loading the model into memory. The l1107 model is about 14 GB, so you will need at least that amount of memory before doing any operation.
If you run the inference on that 16 s audio right after the 2 s one, the second inference won't take much time because the model is already loaded into memory. That's why I made the files variable a list: so we can run the inference on all the files before releasing the model.
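
A short sketch of that pattern, with hypothetical file names and a placeholder checkpoint path:

```python
from easymms.models.asr import ASRModel

# Loading the ~14 GB l1107 checkpoint dominates the runtime
asr = ASRModel(model='./models/mms1b_l1107.pt')  # placeholder path

# Pass all files in one call so the load cost is paid once,
# before the model is released
files = ['sample_2s.wav', 'sample_16s.wav']  # hypothetical file names
transcriptions = asr.transcribe(files, lang='dtp', align=False)
for t in transcriptions:
    print(t)
```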