How to use opuscleaner-clean with stdin?
eu9ene opened this issue
I'm trying something like this, and it doesn't do anything:
paste <(pigz -dc data/train-parts/ELRC-3075-wikipedia_health-v1.en-ru.en.gz) \
<(pigz -dc data/train-parts/ELRC-3075-wikipedia_health-v1.en-ru.ru.gz) \
| opuscleaner-clean --input=- data/train-parts/ELRC-3075-wikipedia_health-v1.en-ru.filters.json en ru
Am I doing it wrong?
Running it without --input works:
opuscleaner-clean data/train-parts/ELRC-3075-wikipedia_health-v1.en-ru.filters.json en ru
It was an indentation error: the code that runs the pipeline was inside the else branch, which only executes when you don't pass --input.
🤦
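For anyone curious what this kind of bug looks like, here is a minimal sketch. It is not the actual opuscleaner source: `run_pipeline`, `clean_buggy`, `clean_fixed`, and the placeholder default row are hypothetical names invented for illustration, and only `--input=-` (stdin) is handled. The one difference between the two functions is the indentation of the `run_pipeline` call.

```python
import argparse
import io


def run_pipeline(lines):
    """Hypothetical stand-in for the filter pipeline: passes rows through unchanged."""
    return [line.rstrip("\n") for line in lines]


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", default=None)
    return parser.parse_args(argv)


def clean_buggy(argv, stdin):
    args = parse_args(argv)
    output = []
    if args.input is not None:
        # Sketch only: treat any --input value as "read from stdin".
        source = stdin
    else:
        # Placeholder for "read the files named in the filters file".
        source = io.StringIO("default row\n")
        # Bug: this call is indented into the else branch,
        # so nothing runs when --input is given.
        output = run_pipeline(source)
    return output


def clean_fixed(argv, stdin):
    args = parse_args(argv)
    if args.input is not None:
        source = stdin
    else:
        source = io.StringIO("default row\n")
    # Fix: dedented, so the pipeline runs for both branches.
    return run_pipeline(source)


if __name__ == "__main__":
    print(clean_buggy(["--input=-"], io.StringIO("hello\tпривет\n")))  # []
    print(clean_fixed(["--input=-"], io.StringIO("hello\tпривет\n")))  # ['hello\tпривет']
```

With the buggy version, passing `--input=-` consumes nothing and produces no output, which matches the "doesn't do anything" symptom above, while the no-`--input` path still works.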