Unable to import LaserEncoderPipeline
sumedhan-r opened this issue
When calling LaserEncoderPipeline for downstream NLP tasks, the first error that popped up was a ValueError, which stated:
mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory
I asked ChatGPT about this and got code that slightly modifies the lines where the Config classes are declared.
After making the same change to all the related Config classes, I got a different error, a ValidationError, which stated:
Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None
Below are attached some screenshots related to the errors. Please look into this at the earliest.
The suggestion given by ChatGPT is as follows :
@dataclass
class FairseqConfig(FairseqDataclass):
    common: CommonConfig = field(default_factory=CommonConfig)
    common_eval: CommonEvalConfig = field(default_factory=CommonEvalConfig)
@sumedhan-r can you please indicate the versions of fairseq and omegaconf that you are using, and the minimal code required to reproduce the problem?
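One way to collect this information in a single step is sketched below. It uses only the standard library's importlib.metadata; the report_versions helper name and the default package list are assumptions made for illustration, chosen to match the packages mentioned in this thread.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("fairseq", "omegaconf", "hydra-core", "laser_encoders")):
    """Return a dict mapping 'python' and each package name to a version string."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver}")
```

Including the Python version matters here, since the same package versions can behave differently across interpreters.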
I am getting this error too. I was able to recreate it in Python 3.11 right from the import statement:
from laser_encoders import LaserEncoderPipeline
My sense is that ChatGPT's suggestion is on the right track. Specifically, there are a number of statements toward the end of fairseq/dataclass/configs.py that assign a mutable type as a default. Assigning these using the pattern field(default_factory=x) makes these errors go away. (See https://stackoverflow.com/questions/53632152/why-cant-dataclasses-have-mutable-defaults-in-their-class-attributes-declaratio for an additional explanation.)
It appears the dubious pattern is still used through the latest version of fairseq, v0.12.2, which was released in 2022.
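For anyone who wants to see the underlying behaviour without installing fairseq, here is a minimal sketch using only the standard library. CommonConfig below is a hypothetical stand-in, not fairseq's actual class. As far as I understand it, Python 3.11 and later reject any dataclass default whose class is unhashable (which includes instances of ordinary dataclasses), while earlier versions only rejected list, dict and set defaults. That would explain why the same code imports on 3.10 and fails on 3.11.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class CommonConfig:
    # Hypothetical stand-in for fairseq's CommonConfig.
    seed: int = 1


def define_with_instance_default():
    """Define a dataclass that uses a dataclass *instance* as a default."""
    @dataclass
    class Config:
        common: CommonConfig = CommonConfig()
    return Config


@dataclass
class FixedConfig:
    # default_factory builds a fresh CommonConfig for every FixedConfig.
    common: CommonConfig = field(default_factory=CommonConfig)


try:
    define_with_instance_default()
    instance_default_accepted = True
except ValueError as err:
    # On Python 3.11+ this is the "mutable default ... use default_factory" error.
    instance_default_accepted = False
    print(err)

print(sys.version_info[:2], "accepts instance default:", instance_default_accepted)
print("fixed config seed:", FixedConfig().common.seed)
```

A side benefit of default_factory is that every FixedConfig gets its own CommonConfig object instead of all instances sharing one.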
@dpdeb @sumedhan-r can you please indicate the versions of fairseq and omegaconf that cause the error?
When I install all packages from scratch (see this Colab notebook for repro), I get fairseq-0.12.2, omegaconf-2.0.6 and laser_encoders-0.0.1 installed by default, and they work fine.
So, I have got the same error:
raise ex # set end OC_CAUSE=1 for full backtrace
^^^^^^^^
omegaconf.errors.ValidationError: Object of unsupported type: '_MISSING_TYPE'
full_key:
reference_type=None
object_type=None
I also changed the classes in fairseq\dataclass\configs.py and hydra\conf\__init__.py to use default_factory and then received the error above. Has anyone come up with a fix? :)
@TheHappyLemon how can I reproduce your error?
Can you please share a Colab notebook or something that reproduces it?
@avidale
Well, I haven't done anything special. First, I installed laser_encoders in Anaconda with pip install laser_encoders. Then, when I just wanted to import the library, I got an error stating that I should use default_factory in some configs.py. So it is reproduced with just
from laser_encoders import LaserEncoderPipeline
Initially, I thought I had to install a newer version of the dependency fairseq. But I couldn't install fairseq at all, because I was getting the error: FileNotFoundError: [Errno 2] No such file or directory: 'VERSION.txt'. There are multiple issues about this error, like skrub-data/skrub#476. I also tried to install it from a local clone, but then I was getting the errors described in facebookresearch/demucs#423. So I decided to abandon this idea and fix the default_factory error instead. I made the fixes in miniconda3\Lib\site-packages\fairseq\dataclass\configs.py and in miniconda3\Lib\site-packages\hydra\conf\__init__.py. And then finally I received the error I posted above.
I have the following package versions (I ran pip install laser_encoders to get this info):
Requirement already satisfied: laser_encoders in c:\users\artem\miniconda3\lib\site-packages (0.0.1)
Requirement already satisfied: fairseq>=0.12.2 in c:\users\artem\miniconda3\lib\site-packages (from laser_encoders) (0.12.2)
Requirement already satisfied: omegaconf<2.1 in c:\users\artem\miniconda3\lib\site-packages (from fairseq>=0.12.2->laser_encoders) (2.0.6)
So I don't know what would be the best way to reproduce it. Maybe do a clean install? I don't know :(
I just created a fresh conda virtual environment, ran pip install laser_encoders, and successfully installed the following packages:
Installing collected packages: tbb, sentencepiece, intel-openmp, bitarray, antlr4-python3-runtime, unicategories, portalocker, omegaconf, mkl, cython, torch, sacremoses, sacrebleu, hydra-core, torchaudio, fairseq, laser_encoders
Successfully installed antlr4-python3-runtime-4.8 bitarray-2.9.2 cython-3.0.10 fairseq-0.12.2 hydra-core-1.0.7 intel-openmp-2021.4.0 laser_encoders-0.0.1 mkl-2021.4.0 omegaconf-2.0.6 portalocker-2.8.2 sacrebleu-2.4.2 sacremoses-0.1.0 sentencepiece-0.2.0 tbb-2021.12.0 torch-2.3.0 torchaudio-2.3.0 unicategories-0.1.2
And after running from laser_encoders import LaserEncoderPipeline, I got:
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory
The error is reproduced on Windows 11 with this environment:
conda create -n test_env python=3.11.8 anaconda
Apparently, Fairseq does not support Python 3.11 or newer; see e.g. facebookresearch/fairseq#5191.
Thus, there are 3 possible solutions for you:
- Downgrade your Python to 3.10 or below
- Fork Fairseq and fix the error (this pull request facebookresearch/fairseq#5359 might be what you need)
- Migrate from the LASER encoder (which uses Fairseq which has already become pretty stale) to the SONAR encoder (which performs better and is based on Fairseq2, a package that currently enjoys better support).
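If you go with the first option, a small guard before the import can turn the confusing dataclass traceback into a clearer message. This is only a sketch: the (3, 11) cutoff is taken from the reports in this thread rather than from an official Fairseq compatibility statement, and fairseq_import_supported and import_laser_pipeline are hypothetical helper names.

```python
import sys

# Assumption based on this thread: imports start failing at Python 3.11.
FIRST_UNSUPPORTED = (3, 11)


def fairseq_import_supported(version_info=sys.version_info):
    """Return True if the interpreter is older than the first failing version."""
    return tuple(version_info[:2]) < FIRST_UNSUPPORTED


def import_laser_pipeline():
    """Import LaserEncoderPipeline, failing early with a readable message."""
    if not fairseq_import_supported():
        current = ".".join(str(part) for part in sys.version_info[:3])
        raise RuntimeError(
            f"Python {current} detected; fairseq 0.12.2 reportedly fails to "
            "import on Python 3.11+. Use Python 3.10 or below, or a patched fork."
        )
    # Deferred import so the check above runs before fairseq is loaded.
    from laser_encoders import LaserEncoderPipeline
    return LaserEncoderPipeline
```

Calling import_laser_pipeline() in place of the plain import statement keeps the check and the import in one place.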