facebookresearch / muss

Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".


AttributeError: module 'faiss' has no attribute 'METRIC_L2_DIST'

NomadXD opened this issue · comments

I'm getting the following error when I execute python scripts/mine_sequences.py .

Traceback (most recent call last):
  File "scripts/mine_sequences.py", line 118, in <module>
    train_sentences, get_index_name(), get_embeddings, faiss.METRIC_L2_DIST, base_index_dir
AttributeError: module 'faiss' has no attribute 'METRIC_L2_DIST'

I get this error with both faiss-gpu and faiss-cpu. However, when I change it to faiss.METRIC_L2 (without DIST), it works fine. Any idea what the issue is?

Hi @NomadXD
That's a good question, maybe it was changed in faiss.
I will rename it to METRIC_L2 then and hope it doesn't bring unforeseen bugs.

@louismartin Thank you. Is there any specific configuration I can use for training on a small test sample? In cc_net, for example, we can provide the config as JSON and tweak it for a test sample. I just want to make a few modifications and test with a sample before training on the actual data.

Hi @NomadXD, I'm reopening the issue for your question. I'm currently working on a simple script to do what you asked; I hope to have some updates for you soon.

@louismartin Thank you very much, really appreciate that! I'm using the project as the baseline for my final-year university project: training the model for text simplification of Sinhala (the language spoken in Sri Lanka), which is a low-resource language. I have a corpus of 1 million non-parallel sentences. I will be getting GPUs from my university for training, but before that I need to make a few modifications to the source and test with some sample data, ideally something I can run on a Google Colab GPU. Is it feasible to run a small sample on Google Colab?

I just made the training script simpler in case you want to train a single model on Google Colab.
Let me know if that works well on your end and we can close the issue.

If you also want to mine the data, you will have to run the full mining pipeline. I am not sure it will work on Google Colab, but you can try.

@louismartin Hi Louis, I tried to execute the scripts/train_model.py script, but I'm getting the following error.

OSError: [E050] Can't find model 'en_core_web_md'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

I executed the script right after running pip install -e . Maybe something else needs to be done first?

Hi @NomadXD, you need to install spaCy's English model with python -m spacy download en_core_web_md
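For anyone hitting the same OSError, the full setup from the two commands mentioned in this thread would be:

```shell
# Install the muss package and its dependencies from the repo root,
# then download the spaCy model that scripts/train_model.py loads.
pip install -e .
python -m spacy download en_core_web_md
```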

Closing but feel free to open a new issue if you have further questions or problems