bitextor / bifixer

Tool to fix bitexts and tag near-duplicates for removal

Bifixer doesn't see input file

Syrkovski opened this issue

Hello, I tried to use bifixer, but I got an error.
The command I used:

python3.7 bifixer/bifixer.py --scol 1 --tcol 2 --ignore_duplicates corpus corpus.bifixed en zh

The error:

bifixer.py: error: argument input: can't open 'corpus': [Errno 2] No such file or directory: 'corpus'

I have corpus.en and corpus.zh, but bifixer doesn't see them.

Hi @Syrkovski .
The input has to be a single tab-separated file, with the source and target sentences in the columns given by --scol and --tcol (paste corpus.en corpus.zh > corpus)
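If paste is not available (for example on Windows), the same merge can be sketched in Python. This is only an illustration: the helper name paste_files is made up for this example and is not part of bifixer, and it assumes both files are UTF-8 and line-aligned. Unlike paste, it stops with an error when the two files have different numbers of lines, since that usually means the corpus is misaligned.

```python
from itertools import zip_longest


def paste_files(src_path, tgt_path, out_path):
    """Join two line-aligned files into one tab-separated file.

    Hypothetical helper (not part of bifixer). Returns the number of
    sentence pairs written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt, \
         open(out_path, "w", encoding="utf-8") as out:
        # zip_longest pads the shorter file with None, which lets us
        # detect a length mismatch instead of silently truncating.
        for s, t in zip_longest(src, tgt):
            if s is None or t is None:
                raise ValueError("files have different numbers of lines")
            # Column 1 = source sentence, column 2 = target sentence.
            out.write(s.rstrip("\n") + "\t" + t.rstrip("\n") + "\n")
            count += 1
    return count
```

Calling paste_files("corpus.en", "corpus.zh", "corpus") would produce the file that the original command expects as its input argument, with English in column 1 and Chinese in column 2.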

Thank you