facebookresearch / flores

Facebook Low Resource (FLoRes) MT Benchmark

Can't decompress "commoncrawl.deduped.en.xz".

zdou0830 opened this issue

Hi, I'm having trouble decompressing "commoncrawl.deduped.en.xz".

unxz: commoncrawl.deduped.en.xz: Unexpected end of input

I can decompress other files. Is there anything wrong with the file?

Originally posted by @zdou0830 in #5 (comment)


@zdou0830, I can repro:

  curl http://data.statmt.org/wmt19/parallel-corpus-filtering/commoncrawl.deduped.en.xz > commoncrawl.deduped.en.xz
  unxz commoncrawl.deduped.en.xz
  unxz: commoncrawl.deduped.en.xz: Unexpected end of input

I suggest emailing wmt-tasks@googlegroups.com about this issue.
I'm going to close this issue, since the data is not hosted in this repository.
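
For anyone who lands here later: "Unexpected end of input" from unxz generally means the compressed stream stops before its end-of-stream marker, i.e. the file is truncated, whether on the server or during the download. Below is a minimal sketch for checking a local copy without writing the decompressed data to disk. The function name `xz_is_complete` is hypothetical (not part of this repository), and it only uses Python's standard `lzma` module.

```python
import lzma


def xz_is_complete(path, chunk_size=1 << 20):
    """Return True if the .xz file at `path` decompresses cleanly to its end.

    Hypothetical helper for illustration: it streams through the file and
    discards the output, so it needs little memory even for large corpora.
    """
    try:
        with lzma.open(path, "rb") as f:
            # Read and throw away decompressed chunks until the stream ends.
            while f.read(chunk_size):
                pass
    except (EOFError, lzma.LZMAError):
        # EOFError: the file ended before the end-of-stream marker (truncated).
        # LZMAError: the data is corrupt or not in a supported format.
        return False
    return True


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        status = "ok" if xz_is_complete(name) else "truncated or corrupt"
        print(f"{name}: {status}")
```

If this reports a freshly downloaded file as truncated every time, the copy on the server is probably incomplete, and reporting it to the data maintainers (as suggested above) is the right next step.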