Unable to import LaserEncoderPipeline

Question

sumedhan-r opened this issue 3 months ago

While calling LaserEncoderPipeline for the purpose of downstream NLP tasks, the first error that popped up was a ValueError, which stated

mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

I asked ChatGPT about this and got code that slightly modifies the lines where the Config classes are declared.

After changing all the relevant Config classes, I got another error, a ValidationError, which stated

Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None

Below are attached some screenshots related to the errors. Please look into this at the earliest.

sumedhan-r · Answer 1 · Tue Mar 26 2024 12:32:20 GMT+0800 (China Standard Time)

The suggestion given by ChatGPT is as follows :

@dataclass
class FairseqConfig(FairseqDataclass):
    common: CommonConfig = field(default_factory=CommonConfig)
    common_eval: CommonEvalConfig = field(default_factory=CommonEvalConfig)

David Dale · Answer 2 · Tue Apr 16 2024 23:56:52 GMT+0800 (China Standard Time)

@sumedhan-r can you please indicate the versions of fairseq and omegaconf that you are using, and the minimal code required to reproduce the problem?

dpdeb · Answer 3 · Mon Apr 22 2024 22:11:01 GMT+0800 (China Standard Time)

I am getting this error too. I was able to recreate it in Python 3.11 right from the import statement:
from laser_encoders import LaserEncoderPipeline

My sense is that ChatGPT's suggestion is on the right track. Specifically, there are a number of statements toward the end of fairseq/dataclass/configs.py that assign a mutable type as a default. Assigning these using the pattern field(default_factory=x) makes these errors go away. (See https://stackoverflow.com/questions/53632152/why-cant-dataclasses-have-mutable-defaults-in-their-class-attributes-declaratio for an additional explanation)

It appears the dubious pattern is still used through the latest version of fairseq, v0.12.2, which was released in 2022.
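
For context, here is a minimal sketch of the pattern discussed above. CommonConfig, BadConfig and GoodConfig are hypothetical stand-ins, not the real fairseq classes. Starting with Python 3.11, dataclasses reject any unhashable default value (which includes instances of ordinary dataclasses), whereas earlier versions only rejected list, dict and set. That would explain why the same definition is accepted on 3.10 and fails on 3.11.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class CommonConfig:
    # Hypothetical stand-in for fairseq's CommonConfig; the field is illustrative.
    seed: int = 1


def define_with_instance_default():
    # Defining the class inside a function lets us catch the error that
    # would otherwise be raised at import time.
    @dataclass
    class BadConfig:
        common: CommonConfig = CommonConfig()  # shared instance used as default

    return BadConfig


@dataclass
class GoodConfig:
    # default_factory builds a fresh CommonConfig for every GoodConfig instance.
    common: CommonConfig = field(default_factory=CommonConfig)


try:
    define_with_instance_default()
    instance_default_rejected = False
except ValueError as err:
    instance_default_rejected = True
    print(f"Rejected on Python {sys.version.split()[0]}: {err}")

first, second = GoodConfig(), GoodConfig()
print("Separate instances per object:", first.common is not second.common)
```

Note that this only covers the dataclass side of the problem. It says nothing about how omegaconf 2.0.6 handles the patched classes, which is where the later ValidationError in this thread comes from.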

David Dale · Answer 4 · Mon Apr 22 2024 22:28:10 GMT+0800 (China Standard Time)

@dpdeb @sumedhan-r can you please indicate the versions of fairseq and omegaconf that cause the error?

When I am installing all packages from scratch (see this Colab notebook for repro), I get fairseq-0.12.2, omegaconf-2.0.6 and laser_encoders-0.0.1 installed by default, and they are working fine.
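
If it helps with collecting the information requested above, a small sketch like the following prints the interpreter and package versions. The report_versions helper and the package list are illustrative choices, not part of laser_encoders; it relies only on the standard-library importlib.metadata, and a package reported as "not installed" may simply be registered under a differently spelled distribution name.

```python
import sys
from importlib import metadata


def report_versions(packages=("fairseq", "omegaconf", "hydra-core", "laser_encoders")):
    """Return a dict mapping 'python' and each package name to a version string."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


versions = report_versions()
for name, version in versions.items():
    print(f"{name}: {version}")
```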

Artjoms Kučerjavijs · Answer 5 · Thu Apr 25 2024 16:51:46 GMT+0800 (China Standard Time)

So, I have got the same error:

raise ex # set end OC_CAUSE=1 for full backtrace
^^^^^^^^
omegaconf.errors.ValidationError: Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None

I also changed the classes in fairseq\dataclass\configs.py and hydra\conf\__init__.py to use default_factory and then received the error above. Has anyone come up with a fix? :)

David Dale · Answer 6 · Thu Apr 25 2024 17:01:09 GMT+0800 (China Standard Time)

@TheHappyLemon how can I reproduce your error?
Can you please share a Colab notebook or something that reproduces it?

Artjoms Kučerjavijs · Answer 7 · Thu Apr 25 2024 17:15:15 GMT+0800 (China Standard Time)

@avidale
Well, I haven't done anything special. First, I installed laser_encoders in Anaconda with pip install laser_encoders. Then, when I tried to just import the library, I got an error stating that I should use default_factory for some fields in configs.py. So it is reproduced with just

from laser_encoders import LaserEncoderPipeline

Initially, I thought I had to install the newest version of the dependent package fairseq. But I just couldn't install fairseq at all, because I was getting the error: FileNotFoundError: [Errno 2] No such file or directory: 'VERSION.txt'. There are multiple issues for this error, like skrub-data/skrub#476. I also tried to install it from a local clone, but then I was getting the errors described in facebookresearch/demucs#423. So I decided to abandon this idea and fix the error with default_factory. I made the fixes in miniconda3\Lib\site-packages\fairseq\dataclass\configs.py and in miniconda3\Lib\site-packages\hydra\conf\__init__.py, and then finally I received the error I posted above.

I have the following package versions (I ran pip install laser_encoders to get this info):
Requirement already satisfied: laser_encoders in c:\users\artem\miniconda3\lib\site-packages (0.0.1)
Requirement already satisfied: fairseq>=0.12.2 in c:\users\artem\miniconda3\lib\site-packages (from laser_encoders) (0.12.2)
Requirement already satisfied: omegaconf<2.1 in c:\users\artem\miniconda3\lib\site-packages (from fairseq>=0.12.2->laser_encoders) (2.0.6)

So I don't know what would be the best way to reproduce. Maybe do a clean install? Idk :(

Artjoms Kučerjavijs · Answer 8 · Thu Apr 25 2024 17:28:27 GMT+0800 (China Standard Time)

@avidale

I just created a fresh conda environment, ran pip install laser_encoders, and successfully installed the following packages:

Installing collected packages: tbb, sentencepiece, intel-openmp, bitarray, antlr4-python3-runtime, unicategories, portalocker, omegaconf, mkl, cython, torch, sacremoses, sacrebleu, hydra-core, torchaudio, fairseq, laser_encoders
Successfully installed antlr4-python3-runtime-4.8 bitarray-2.9.2 cython-3.0.10 fairseq-0.12.2 hydra-core-1.0.7 intel-openmp-2021.4.0 laser_encoders-0.0.1 mkl-2021.4.0 omegaconf-2.0.6 portalocker-2.8.2 sacrebleu-2.4.2 sacremoses-0.1.0 sentencepiece-0.2.0 tbb-2021.12.0 torch-2.3.0 torchaudio-2.3.0 unicategories-0.1.2

And after running from laser_encoders import LaserEncoderPipeline, I got:

raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

Artjoms Kučerjavijs · Answer 9 · Thu Apr 25 2024 17:44:03 GMT+0800 (China Standard Time)

The error is reproduced on Windows 11 with this environment:
conda create -n test_env python=3.11.8 anaconda

David Dale · Answer 10 · Thu Apr 25 2024 23:04:16 GMT+0800 (China Standard Time)

Apparently, Fairseq does not support Python 3.11 or newer; see e.g. facebookresearch/fairseq#5191.

Thus, there are 3 possible solutions for you:

1. Downgrade your Python to 3.10 or below.
2. Fork Fairseq and fix the error (this pull request, facebookresearch/fairseq#5359, might be what you need).
3. Migrate from the LASER encoder (which uses Fairseq, which has already become pretty stale) to the SONAR encoder (which performs better and is based on Fairseq2, a package that currently enjoys better support).
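
For the first option, a minimal sketch of a guard that fails early with a readable message instead of the dataclass ValueError. The 3.11 cutoff is taken from this thread, and the helper names fairseq_supported and import_laser_pipeline are hypothetical, not part of laser_encoders.

```python
import sys

# First Python version reported in this thread as failing with fairseq 0.12.2.
FIRST_UNSUPPORTED = (3, 11)


def fairseq_supported(version_info=None):
    """Return True if the given (or current) Python is older than the cutoff."""
    major, minor = (version_info or sys.version_info)[:2]
    return (major, minor) < FIRST_UNSUPPORTED


def import_laser_pipeline():
    """Import LaserEncoderPipeline, or raise a clear error on unsupported Pythons."""
    if not fairseq_supported():
        current = ".".join(str(part) for part in sys.version_info[:3])
        raise RuntimeError(
            f"Python {current} detected; fairseq 0.12.2 is reported to fail on "
            "3.11 and newer. Use a Python 3.10 (or older) environment instead."
        )
    from laser_encoders import LaserEncoderPipeline

    return LaserEncoderPipeline


print("fairseq expected to import on this interpreter:", fairseq_supported())
```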