opendatacube / datacube-alchemist

Dataset to Dataset Transformations

Upload injects extra subdirectory when destination folder exists on s3 already

Kirill888 opened this issue

This happens when the process is killed halfway through an upload.

    def upload_now(self):
        fs = fsspec.filesystem('s3')
        fs.put(self._location, self.s3location, recursive=True)
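
The symptom is easy to state with a hypothetical dataset (bucket and paths are made up here, and this assumes an fsspec version with the put() implementation quoted below):

    import fsspec

    fs = fsspec.filesystem('s3')

    # First run: "bucket/output/ds-001" does not exist yet, so files land
    # directly under it, e.g. bucket/output/ds-001/band1.tif.
    fs.put('/tmp/ds-001', 'bucket/output/ds-001', recursive=True)

    # If that run is killed part-way, the destination prefix now exists.
    # A retry then copies the local directory *inside* it, producing
    # bucket/output/ds-001/ds-001/band1.tif instead.
    fs.put('/tmp/ds-001', 'bucket/output/ds-001', recursive=True)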

The docs for fs.put(...) don't say much, but the code does this:

    def put(self, lpath, rpath, recursive=False, **kwargs):
        """ Upload file from local """
        if recursive:
            lpaths = []
            for dirname, subdirlist, filelist in os.walk(lpath):
                lpaths += [os.path.join(dirname, filename) for filename in filelist]
            rootdir = os.path.basename(lpath.rstrip("/"))
            if self.exists(rpath):
                # copy lpath inside rpath directory
                rpath2 = os.path.join(rpath, rootdir)
            else:
                # copy lpath as rpath directory
                rpath2 = rpath
            rpaths = [
                os.path.join(rpath2, path[len(lpath) :].lstrip("/")) for path in lpaths
            ]
        else:
            lpaths = [lpath]
            rpaths = [rpath]
        for lpath, rpath in zip(lpaths, rpaths):
            with open(lpath, "rb") as f1:
                with self.open(rpath, "wb", **kwargs) as f2:
                    data = True
                    while data:
                        data = f1.read(self.blocksize)
                        f2.write(data)
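
Tracing the recursive branch makes the extra level visible. A standalone sketch of just the path computation (file names and paths are hypothetical):

    import os

    def put_paths(lpath, rpath, dest_exists, filenames):
        """Mirror the recursive path logic of put() above (simplified)."""
        rootdir = os.path.basename(lpath.rstrip('/'))
        # The problematic branch: when the destination already exists,
        # the source directory name is appended a second time.
        rpath2 = os.path.join(rpath, rootdir) if dest_exists else rpath
        return [os.path.join(rpath2, name) for name in filenames]

    files = ['band1.tif', 'metadata.yaml']

    # First attempt: destination absent, keys come out as intended.
    print(put_paths('/tmp/ds-001', 'bucket/output/ds-001', False, files))
    # ['bucket/output/ds-001/band1.tif', 'bucket/output/ds-001/metadata.yaml']

    # Retry after a killed upload: destination exists, an extra ds-001/ appears.
    print(put_paths('/tmp/ds-001', 'bucket/output/ds-001', True, files))
    # ['bucket/output/ds-001/ds-001/band1.tif',
    #  'bucket/output/ds-001/ds-001/metadata.yaml']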

A different upload method is used now.
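
One way to make the upload idempotent is to compute each destination key from the source layout alone and put files individually; this is only a sketch of that approach, not necessarily the change that was actually made:

    import os

    import fsspec

    def upload_now(self):
        """Upload each file to a deterministic key so retries never nest."""
        fs = fsspec.filesystem('s3')
        for dirname, _, filenames in os.walk(self._location):
            for filename in filenames:
                lpath = os.path.join(dirname, filename)
                # The key depends only on the source layout, never on whether
                # the destination prefix already exists on S3.
                relpath = os.path.relpath(lpath, self._location).replace(os.sep, '/')
                fs.put(lpath, f'{self.s3location}/{relpath}')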