mlfoundations / open_lm

A repository for research on medium sized language models.

Remote Sync FSSPEC cannot upload large checkpoints

Skylion007 opened this issue

Large checkpoints cannot be uploaded with the mapper methods that the fsspec remote-sync backend uses here. Those methods assume that every file is under 5 GB, so uploading a larger file fails. To fix this, either change the following config value for the s3fs file system: https://github.com/fsspec/s3fs/blob/a28863f084a91ee78d9cd65bd6767b2f44e81b33/s3fs/core.py#L213 or use methods that let fsspec know the size of the file in advance.

For files this large, you cannot use methods that rely on the pipe abstraction: https://github.com/fsspec/filesystem_spec/blob/0bb3f26c412d7ad9b2d52a5c32265014709d1c1f/fsspec/mapping.py#L128
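
One possible workaround, sketched here under assumptions rather than taken from the open_lm code: instead of handing the whole checkpoint to the mapper as a single bytes object, stream it through a file object opened with `fs.open(..., "wb")`. With s3fs, a buffered write like this is expected to go through a multipart upload rather than a single PUT, though that depends on the installed s3fs version. The helper name `upload_checkpoint`, its parameters, and the 64 MiB default block size are hypothetical choices for illustration.

```python
import os
import shutil


def upload_checkpoint(fs, local_path, remote_path, block_size=64 * 2**20):
    """Stream a local file to `remote_path` in fixed-size chunks.

    `fs` is assumed to be an fsspec-style filesystem object, i.e. anything
    whose `open(path, mode, block_size=...)` returns a writable file object.
    The whole checkpoint is never held in memory at once.
    """
    with open(local_path, "rb") as src:
        # Writing through a buffered file object avoids the mapper/pipe
        # code path, which passes the whole file as one bytes payload.
        with fs.open(remote_path, "wb", block_size=block_size) as dst:
            shutil.copyfileobj(src, dst, length=block_size)
    # Return the number of bytes that were read from the local file.
    return os.path.getsize(local_path)
```

With a real bucket this could be called as `upload_checkpoint(fsspec.filesystem("s3"), "epoch_1.pt", "my-bucket/checkpoints/epoch_1.pt")`, where the file and bucket names are placeholders. fsspec filesystems also expose `fs.put(local_path, remote_path)`, which starts from a local file and so can read its size before uploading; that may be the simpler option when the checkpoint is already on disk.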