dropreg / R-Drop

Unable to preprocess data for summarization

samiksome92 opened this issue

I followed these instructions:

git clone https://github.com/dropreg/R-Drop.git
cd R-Drop/fairseq_src/
pip install --editable .

and tried to preprocess the data for summarization by running:

bash script/preprocess.sh

However, I get the following error:

/users/gpu/samiks/anaconda3/envs/rdrop/bin/python: No module named examples.roberta.multiprocessing_bpe_encoder

It seems multiprocessing_bpe_encoder is missing from this repo. Are we supposed to run the preprocessing with a separate fairseq install?

Reinstalling fairseq and apex fixed this problem for me.

commented

The file "multiprocessing_bpe_encoder.py" is one of the fairseq RoBERTa example scripts. You can download it with:
wget https://raw.githubusercontent.com/pytorch/fairseq/main/examples/roberta/multiprocessing_bpe_encoder.py
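Since the error names the module `examples.roberta.multiprocessing_bpe_encoder`, the downloaded file probably needs to sit under `examples/roberta/` relative to the directory the preprocessing is run from. A rough sketch of that, assuming the fairseq source root is `R-Drop/fairseq_src` (the helper name `install_bpe_encoder` is made up for illustration and is not part of the repo):

```shell
# Hypothetical helper: put multiprocessing_bpe_encoder.py under
# <root>/examples/roberta/ so that `python -m examples.roberta.multiprocessing_bpe_encoder`
# can find it when run from <root>. The target layout is an assumption
# inferred from the module path in the error message.
install_bpe_encoder() {
    root="${1:-.}"
    dest="$root/examples/roberta"
    url="https://raw.githubusercontent.com/pytorch/fairseq/main/examples/roberta/multiprocessing_bpe_encoder.py"

    mkdir -p "$dest"

    # Skip the download if the file is already present.
    if [ ! -f "$dest/multiprocessing_bpe_encoder.py" ]; then
        wget -O "$dest/multiprocessing_bpe_encoder.py" "$url"
    fi
}
```

Usage would be something like `install_bpe_encoder R-Drop/fairseq_src`, followed by rerunning `bash script/preprocess.sh`. If the script is launched from a different working directory, the path may need adjusting.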