Wikitext-103 URL is down
albertz opened this issue · comments
text/torchtext/datasets/wikitext103.py
Line 11 in 4bf6b30
None of the links under https://s3.amazonaws.com/research.metamind.io work anymore. I get "Access Denied".
For reference, one copy I found is via pardata:
https://github.com/CODAIT/pardata/blob/1d1600ad3eed6894da7dbddc451cd38aa03c770c/tests/schemata/datasets.yaml#L42C21-L42C99
It's not exactly the same file (tar.gz instead of zip), but it looks like it has the same content (the files: LICENSE.txt, README.txt, wiki.test.tokens, wiki.train.tokens, wiki.valid.tokens).
Another copy of the data is on HuggingFace in various forms, for example: https://huggingface.co/datasets/wikitext
Hi Albertz, I'm facing exactly the same issue on torchtext 0.17.2. Have you found a neat solution? I found that datasets from other sources may each need to be adapted one by one.
I did not find the zip files anywhere. Instead, I have been using the tar.gz files linked above, which seem to contain the same content.
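For anyone who wants to go the same route, here is a rough sketch of how a locally downloaded tar.gz copy could be unpacked and read without torchtext. It only assumes the archive contains the three token files listed above. The function names `find_wikitext_splits` and `iter_lines`, the archive path, and the output directory are all made up for illustration, and the directory layout inside the archive is an assumption, so the files are located by name rather than by a fixed path.

```python
import tarfile
from pathlib import Path

# File names as listed earlier in this thread.
SPLIT_FILES = {
    "train": "wiki.train.tokens",
    "valid": "wiki.valid.tokens",
    "test": "wiki.test.tokens",
}


def find_wikitext_splits(archive_path, extract_dir):
    """Extract a tar.gz archive and return {split: Path} for the token files.

    Hypothetical helper, not part of torchtext. Only use it on archives
    you trust, since it extracts every member of the archive.
    """
    extract_dir = Path(extract_dir)
    extract_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(extract_dir)

    splits = {}
    for split, filename in SPLIT_FILES.items():
        # The top-level folder name inside the archive is not assumed,
        # so search recursively for each file name.
        matches = sorted(extract_dir.rglob(filename))
        if not matches:
            raise FileNotFoundError(f"{filename} not found in {archive_path}")
        splits[split] = matches[0]
    return splits


def iter_lines(path):
    """Yield the non-empty, whitespace-stripped lines of one token file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line
```

Usage would look something like `splits = find_wikitext_splits("wikitext-103.tar.gz", "data/")` followed by `for line in iter_lines(splits["train"]): ...`, where the archive name is whatever you saved the download as. Whether the result matches the output of the original torchtext loader line for line is something I have not verified.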